Ansible Playbooks Documentation

This document describes the available playbooks, what each one does, and how to run them in day-to-day operation.

Table of Contents

  1. Deployment Workflow
  2. Initial Setup Playbooks
  3. Operational Playbooks
  4. Monitoring and Inspection
  5. Common Operations

Deployment Workflow

The typical deployment workflow follows these steps:

  1. Initial Setup: Configure machines, deploy proxies, install dependencies
  2. Master Setup: Configure Redis, MinIO, and other infrastructure services
  3. Worker Setup: Configure auth generators and download simulators
  4. Operational Management: Start/stop processes, monitor status, cleanup profiles

Initial Setup Playbooks

For a complete, automated installation on fresh nodes, you can use the main full-install playbook:

# Run the complete installation process on all nodes
ansible-playbook ansible/playbook-full-install.yml -i ansible/inventory.green.ini

The steps below are for running each part of the installation manually.

Base Installation

# Deploy base system requirements to all nodes
ansible-playbook ansible/playbook-base-system.yml -i ansible/inventory.green.ini

Proxy Deployment

# Deploy Shadowsocks proxies to all nodes (as defined in the cluster config)
ansible-playbook ansible/playbook-proxies.yml -i ansible/inventory.green.ini

Code, Environment, and Dependencies

# Sync code to all nodes
ansible-playbook ansible/playbook-stress-sync-code.yml -i ansible/inventory.green.ini

# Generate .env files for all nodes
ansible-playbook ansible/playbook-stress-generate-env.yml -i ansible/inventory.green.ini

# Install dependencies on all nodes
ansible-playbook ansible/playbook-stress-install-deps.yml -i ansible/inventory.green.ini
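
The manual steps above can be chained into a single run that stops at the first failure. The sketch below assumes that the order in which the playbooks are listed in this section is the intended order. INVENTORY, DRY_RUN, and run_manual_install are illustrative names, not part of the repository.

```shell
# Sketch only: INVENTORY, DRY_RUN and run_manual_install are illustrative
# names; the playbooks themselves do not read them.
INVENTORY="${INVENTORY:-ansible/inventory.green.ini}"

run_manual_install() {
  # Playbook names as listed above, in the order they appear in this document.
  for step in base-system proxies stress-sync-code \
              stress-generate-env stress-install-deps; do
    playbook="ansible/playbook-${step}.yml"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Print the command instead of touching the cluster.
      echo "ansible-playbook ${playbook} -i ${INVENTORY}"
    else
      # Abort the sequence as soon as one playbook fails.
      ansible-playbook "${playbook}" -i "${INVENTORY}" || return 1
    fi
  done
}
```

With DRY_RUN=1 set, run_manual_install only prints the five commands. Without it, the function runs them and returns non-zero as soon as one playbook fails.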

Operational Playbooks

Master Node Services

Redis and MinIO are deployed as Docker containers during the initial setup (playbook-full-install.yml).

To start the policy enforcer and monitoring tmux sessions on the master node:

# Start policy enforcer and monitoring on master
ansible-playbook ansible/playbook-stress-manage-processes.yml -i ansible/inventory.green.ini -e "start_enforcer=true start_monitor=true"

To stop processes on the master node:

# Stop ONLY the policy enforcer
ansible-playbook ansible/playbook-stress-manage-processes.yml -i ansible/inventory.green.ini -e "stop_enforcer=true"

# Stop ONLY the monitor
ansible-playbook ansible/playbook-stress-manage-processes.yml -i ansible/inventory.green.ini -e "stop_monitor=true"

# Stop BOTH the enforcer and monitor
ansible-playbook ansible/playbook-stress-manage-processes.yml -i ansible/inventory.green.ini -e "stop_sessions=true"

Worker Node Processes

# Start auth generators and download simulators on all workers
ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=start"

# Start ONLY auth generators on workers
ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=start-auth"

# Start ONLY download simulators on workers
ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=start-download"

# Stop all processes on workers
ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=stop"

# Stop ONLY auth generators on workers
ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=stop-auth"

# Stop ONLY download simulators on workers
ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=stop-download"

# Check status of all worker processes
ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=status"
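
The action values used above form a small fixed set, so a wrapper can reject typos before Ansible is invoked. This is a minimal sketch: lifecycle_cmd is a hypothetical helper, and its list of actions is copied only from the commands in this section, so the playbook itself may accept others.

```shell
# Hypothetical helper: print the lifecycle command for a known action, or fail.
# The accepted actions are taken from the examples in this section only.
lifecycle_cmd() {
  case "$1" in
    start|start-auth|start-download|stop|stop-auth|stop-download|status)
      echo "ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e \"action=$1\""
      ;;
    *)
      echo "unknown lifecycle action: '$1'" >&2
      return 1
      ;;
  esac
}
```

Calling lifecycle_cmd stop-auth prints the command to run. Calling lifecycle_cmd restart prints an error and returns a non-zero status, because no restart action appears in this document.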

Profile Management

# Clean up all profiles
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=cleanup-profiles"

# Clean up only the profiles with a specific prefix (here, user1)
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=cleanup-profiles" -e "profile_prefix=user1"

Monitoring and Inspection

Status Checks

# Check status of all processes on all nodes
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=status"

# Check enforcer status on master
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=check-enforcer"

# Check profile status
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=profile-status"

Log Inspection

# View tmux session output on a specific node
ssh user@hostname
tmux attach -t stress-auth-user1,user2  # For auth generator
tmux attach -t stress-download-user1,user2  # For download simulator
tmux attach -t stress-enforcer  # For policy enforcer on master
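
The session names above appear to follow the pattern stress-<role>-<comma-separated profile prefixes>, with the enforcer session carrying no suffix. That pattern is inferred from the three examples only, so treat the helper below (stress_session_name, a hypothetical name) as a sketch. Running tmux ls on the node lists the sessions that actually exist.

```shell
# Build a tmux session name from a role and optional profile prefixes.
# Assumption: names follow stress-<role>[-<prefix1>,<prefix2>,...] as in the
# examples above; confirm with `tmux ls` on the node.
stress_session_name() {
  role="$1"
  shift
  if [ "$#" -eq 0 ]; then
    echo "stress-${role}"
  else
    # "$*" joins the remaining arguments with the first character of IFS.
    echo "stress-${role}-$(IFS=,; echo "$*")"
  fi
}
```

For example, tmux attach -t "$(stress_session_name auth user1 user2)" attaches to stress-auth-user1,user2. Detaching with Ctrl-b d (the default tmux prefix) leaves the process running in its session.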

Common Operations

Code and Policy Updates

First, sync your local changes to the jump host from your development machine:

# Sync project to jump host
./tools/sync-to-jump.sh

Then, from the jump host, you can sync code or policies to the cluster nodes:

# Sync all application code (Python sources, scripts, etc.)
ansible-playbook ansible/playbook-stress-sync-code.yml -i ansible/inventory.green.ini

# Sync only policies and CLI configs
ansible-playbook ansible/playbook-stress-sync-configs.yml -i ansible/inventory.green.ini

Adding a New Worker

  1. Update cluster.green.yml with the new worker definition:

    workers:
      new-worker:
        ip: x.x.x.x
        port: 22
        profile_prefixes:
          - "user4"
        proxies:
          - "sslocal-rust-1090"
    
  2. Regenerate inventory:

    ./tools/generate-inventory.py cluster.green.yml
    
  3. Run the full installation playbook, limiting it to the new worker:

    ansible-playbook ansible/playbook-full-install.yml -i ansible/inventory.green.ini --limit new-worker
    
  4. Start processes on the new worker:

    ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=start" --limit new-worker
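
Once cluster.green.yml has been edited, steps 2 to 4 can be chained. A minimal sketch follows. The names add_worker, run, and DRY_RUN are illustrative, and the host name passed in is assumed to match the key added under workers.

```shell
# Sketch only: add_worker, run and DRY_RUN are illustrative, not project tooling.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"      # show the command without executing it
  else
    "$@"
  fi
}

add_worker() {
  worker="$1"
  inventory="ansible/inventory.green.ini"
  # Step 2: regenerate the inventory from the edited cluster file.
  run ./tools/generate-inventory.py cluster.green.yml || return 1
  # Step 3: full install, restricted to the new host.
  run ansible-playbook ansible/playbook-full-install.yml -i "$inventory" --limit "$worker" || return 1
  # Step 4: start auth generators and download simulators on that host only.
  run ansible-playbook ansible/playbook-stress-lifecycle.yml -i "$inventory" -e "action=start" --limit "$worker"
}
```

Setting DRY_RUN=1 before calling add_worker new-worker prints the three commands for review. Each step runs only if the previous one succeeded, so a failed install does not lead to processes being started on a half-configured host.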
    

Removing a Worker

  1. Stop all processes on the worker:

    ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=stop" --limit worker-to-remove
    
  2. Remove the worker from cluster.green.yml

  3. Regenerate inventory:

    ./tools/generate-inventory.py cluster.green.yml
    

Emergency Stop All

# Stop all processes on all nodes
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=stop-all"

Stopping Specific Nodes

To stop processes on a specific worker or group of workers, use the stop-nodes action and restrict the playbook run with --limit.

# Stop all processes on a single worker (e.g., dl003)
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=stop-nodes" --limit dl003

# Stop all processes on all workers
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=stop-nodes" --limit workers
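
Because --limit decides which hosts are stopped, it can help to preview the match first. ansible-playbook has a --list-hosts option that prints the matching hosts without executing any tasks, and --limit accepts several comma-separated hosts or groups. The sketch below uses a hypothetical wrapper, stop_nodes, and a DRY_RUN switch. Neither exists in the repository.

```shell
# Sketch only: stop_nodes and DRY_RUN are illustrative names.
stop_nodes() {
  pattern="$1"   # e.g. dl003, workers, or "dl003,dl004" (dl004 is a made-up host)
  set -- ansible-playbook ansible/playbook-stress-control.yml \
         -i ansible/inventory.green.ini -e "action=stop-nodes" --limit "$pattern"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print both commands without contacting any host.
    echo "$* --list-hosts"
    echo "$*"
    return 0
  fi
  # Preview the hosts matched by the pattern; no tasks run at this point.
  "$@" --list-hosts || return 1
  printf 'Stop processes on the hosts listed above? [y/N] '
  read -r answer
  [ "$answer" = "y" ] || return 1
  # Perform the actual stop on the same pattern.
  "$@"
}
```

The confirmation prompt is a design choice for interactive use. For unattended runs, calling the playbook directly as shown above is simpler.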

Restart Enforcer and Monitoring

# Restart monitoring and enforcer on master
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=restart-monitoring"