# Ansible Playbooks Documentation
This document describes the available playbooks, what each one does, and how to run them in day-to-day operation.
## Table of Contents
1. [Deployment Workflow](#deployment-workflow)
2. [Initial Setup Playbooks](#initial-setup-playbooks)
3. [Operational Playbooks](#operational-playbooks)
4. [Monitoring and Inspection](#monitoring-and-inspection)
5. [Common Operations](#common-operations)
## Deployment Workflow
The typical deployment workflow follows these steps:
1. **Initial Setup**: Configure machines, deploy proxies, install dependencies
2. **Master Setup**: Configure Redis, MinIO, and other infrastructure services
3. **Worker Setup**: Configure auth generators and download simulators
4. **Operational Management**: Start/stop processes, monitor status, cleanup profiles
## Initial Setup Playbooks
For a complete, automated installation on fresh nodes, you can use the main `full-install` playbook:
```bash
# Run the complete installation process on all nodes
ansible-playbook ansible/playbook-full-install.yml -i ansible/inventory.green.ini
```
The steps below are for running each part of the installation manually.
### Base Installation
```bash
# Deploy base system requirements to all nodes
ansible-playbook ansible/playbook-base-system.yml -i ansible/inventory.green.ini
```
### Proxy Deployment
```bash
# Deploy shadowsocks proxies to all nodes (as defined in cluster config)
ansible-playbook ansible/playbook-proxies.yml -i ansible/inventory.green.ini
```
### Code, Environment, and Dependencies
```bash
# Sync code to all nodes
ansible-playbook ansible/playbook-stress-sync-code.yml -i ansible/inventory.green.ini
# Generate .env files for all nodes
ansible-playbook ansible/playbook-stress-generate-env.yml -i ansible/inventory.green.ini
# Install dependencies on all nodes
ansible-playbook ansible/playbook-stress-install-deps.yml -i ansible/inventory.green.ini
```
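The manual steps above can be chained so that a failure stops the sequence early. The following is a minimal sketch, assuming the order in which the playbooks are listed in this section is the intended order. The function name `manual_install` and the `ANSIBLE_PLAYBOOK` and `INVENTORY` override variables are hypothetical helpers for illustration and are not read by the playbooks themselves.
```bash
# Sketch only: manual_install, ANSIBLE_PLAYBOOK and INVENTORY are
# hypothetical names introduced for this example.
manual_install() {
    local runner="${ANSIBLE_PLAYBOOK:-ansible-playbook}"
    local inventory="${INVENTORY:-ansible/inventory.green.ini}"
    local playbook
    # Order follows this section (assumed to be the required order).
    for playbook in \
        playbook-base-system.yml \
        playbook-proxies.yml \
        playbook-stress-sync-code.yml \
        playbook-stress-generate-env.yml \
        playbook-stress-install-deps.yml
    do
        echo "==> ${playbook}" >&2
        # "$@" forwards extra arguments (for example --limit) to every run.
        "$runner" "ansible/${playbook}" -i "$inventory" "$@" || {
            echo "FAILED: ${playbook}" >&2
            return 1
        }
    done
}
```
For example, `manual_install --limit dl003` would run all five playbooks against a single host and stop at the first one that fails.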
## Operational Playbooks
### Master Node Services
Redis and MinIO are deployed as Docker containers during the initial setup (`playbook-full-install.yml`).
To start the policy enforcer and monitoring tmux sessions on the master node:
```bash
# Start policy enforcer and monitoring on master
ansible-playbook ansible/playbook-stress-manage-processes.yml -i ansible/inventory.green.ini -e "start_enforcer=true start_monitor=true"
```
To stop processes on the master node:
```bash
# Stop ONLY the policy enforcer
ansible-playbook ansible/playbook-stress-manage-processes.yml -i ansible/inventory.green.ini -e "stop_enforcer=true"
# Stop ONLY the monitor
ansible-playbook ansible/playbook-stress-manage-processes.yml -i ansible/inventory.green.ini -e "stop_monitor=true"
# Stop BOTH the enforcer and monitor
ansible-playbook ansible/playbook-stress-manage-processes.yml -i ansible/inventory.green.ini -e "stop_sessions=true"
```
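The start and stop variables above map naturally onto a small wrapper. This is a sketch under assumptions: `master_processes` and the `ANSIBLE_PLAYBOOK` and `INVENTORY` override variables are hypothetical names, and it assumes `start_enforcer` and `start_monitor` can each be passed on their own, which the commands above do not show explicitly.
```bash
# Sketch only: master_processes, ANSIBLE_PLAYBOOK and INVENTORY are
# hypothetical names introduced for this example.
master_processes() {
    local verb="${1:-}" target="${2:-both}"
    local runner="${ANSIBLE_PLAYBOOK:-ansible-playbook}"
    local inventory="${INVENTORY:-ansible/inventory.green.ini}"
    local extra_vars
    case "${verb}:${target}" in
        start:enforcer) extra_vars="start_enforcer=true" ;;  # assumed to work alone
        start:monitor)  extra_vars="start_monitor=true" ;;   # assumed to work alone
        start:both)     extra_vars="start_enforcer=true start_monitor=true" ;;
        stop:enforcer)  extra_vars="stop_enforcer=true" ;;
        stop:monitor)   extra_vars="stop_monitor=true" ;;
        stop:both)      extra_vars="stop_sessions=true" ;;
        *)
            echo "usage: master_processes start|stop [enforcer|monitor|both]" >&2
            return 2
            ;;
    esac
    "$runner" ansible/playbook-stress-manage-processes.yml -i "$inventory" -e "$extra_vars"
}
```
With this in place, `master_processes stop monitor` would be equivalent to the "Stop ONLY the monitor" command above.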
### Worker Node Processes
```bash
# Start auth generators and download simulators on all workers
ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=start"
# Start ONLY auth generators on workers
ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=start-auth"
# Start ONLY download simulators on workers
ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=start-download"
# Stop all processes on workers
ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=stop"
# Stop ONLY auth generators on workers
ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=stop-auth"
# Stop ONLY download simulators on workers
ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=stop-download"
# Check status of all worker processes
ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=status"
```
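Because every worker command differs only in its `action` value, a thin wrapper can catch typos before Ansible is invoked. The sketch below accepts only the seven actions listed above. `stress_lifecycle` and the `ANSIBLE_PLAYBOOK` and `INVENTORY` override variables are hypothetical names used for illustration.
```bash
# Sketch only: stress_lifecycle, ANSIBLE_PLAYBOOK and INVENTORY are
# hypothetical names introduced for this example.
stress_lifecycle() {
    local action="${1:-}"
    local runner="${ANSIBLE_PLAYBOOK:-ansible-playbook}"
    local inventory="${INVENTORY:-ansible/inventory.green.ini}"
    # Accept only the actions documented in this section.
    case "$action" in
        start|start-auth|start-download|stop|stop-auth|stop-download|status) ;;
        *)
            echo "unknown action: '${action}'" >&2
            return 2
            ;;
    esac
    shift
    # Remaining arguments (for example --limit dl003) are passed through.
    "$runner" ansible/playbook-stress-lifecycle.yml -i "$inventory" -e "action=${action}" "$@"
}
```
For example, `stress_lifecycle stop-auth --limit dl003` would stop only the auth generators on one worker, while `stress_lifecycle restart` is rejected without contacting any node.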
### Profile Management
The `cleanup-profiles` action removes profiles from Redis. By default, it removes only "ungrouped" profiles.
```bash
# Perform a dry run of cleaning up ungrouped profiles
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=cleanup-profiles" -e "dry_run=true"
# Clean up ungrouped profiles
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=cleanup-profiles"
# To clean up ALL profiles (destructive), set cleanup_mode=full
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=cleanup-profiles" -e "cleanup_mode=full"
# You can specify a custom setup policy file for cleanup operations
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=cleanup-profiles" -e "setup_policy=policies/my_custom_setup_policy.yaml"
```
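Since `cleanup_mode=full` is destructive, it can help to always preview first and require explicit confirmation before the real run. The following is a sketch under assumptions: `cleanup_profiles` and the `ANSIBLE_PLAYBOOK` and `INVENTORY` override variables are hypothetical names, and it assumes `dry_run=true` can be combined with `cleanup_mode=full`, which the commands above do not confirm.
```bash
# Sketch only: cleanup_profiles, ANSIBLE_PLAYBOOK and INVENTORY are
# hypothetical names introduced for this example.
cleanup_profiles() {
    local mode="${1:-ungrouped}" confirm="${2:-}"
    local runner="${ANSIBLE_PLAYBOOK:-ansible-playbook}"
    local inventory="${INVENTORY:-ansible/inventory.green.ini}"
    local -a cmd=("$runner" ansible/playbook-stress-control.yml
                  -i "$inventory" -e "action=cleanup-profiles")
    case "$mode" in
        ungrouped) ;;                            # playbook default, no extra variable
        full)      cmd+=(-e "cleanup_mode=full") ;;
        *)
            echo "usage: cleanup_profiles [ungrouped|full] [--yes]" >&2
            return 2
            ;;
    esac
    # Always preview first (assumes dry_run=true is honored in both modes).
    "${cmd[@]}" -e "dry_run=true" || return 1
    if [ "$confirm" != "--yes" ]; then
        echo "Dry run only. Re-run with --yes to delete profiles." >&2
        return 0
    fi
    "${cmd[@]}"
}
```
Calling `cleanup_profiles full` would only run the preview, and `cleanup_profiles full --yes` would run the preview followed by the actual cleanup.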
## Monitoring and Inspection
### Status Checks
```bash
# Check status of all processes on all nodes
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=status"
# Check enforcer status on master
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=check-enforcer"
# Check profile status
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=profile-status"
```
### Log Inspection
```bash
# View tmux session output on a specific node
ssh user@hostname
tmux attach -t stress-auth-user1,user2 # For auth generator
tmux attach -t stress-download-user1,user2 # For download simulator
tmux attach -t stress-enforcer # For policy enforcer on master
```
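Attaching is interactive. For a quick look at recent output without attaching, `tmux capture-pane -p` can print a pane's contents to stdout. This is a minimal sketch: `tail_session` and the `SSH_BIN` override variable are hypothetical names, and it assumes each session has a single pane of interest and that the remote login shell is POSIX-compatible.
```bash
# Sketch only: tail_session and SSH_BIN are hypothetical names
# introduced for this example.
tail_session() {
    local host="${1:-}" session="${2:-}" lines="${3:-200}"
    local ssh_bin="${SSH_BIN:-ssh}"
    if [ -z "$host" ] || [ -z "$session" ]; then
        echo "usage: tail_session <user@host> <tmux-session> [lines]" >&2
        return 2
    fi
    # Quote the session name for the remote shell (names may contain commas).
    local quoted
    quoted="$(printf '%q' "$session")"
    # -p prints the pane to stdout; -S -N starts N lines back in the history.
    "$ssh_bin" "$host" "tmux capture-pane -p -S -${lines} -t ${quoted}"
}
```
For example, `tail_session user@hostname stress-enforcer 50` would print roughly the last 50 lines of history plus the visible pane of the enforcer session.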
## Common Operations
### Code and Policy Updates
First, sync your local changes to the jump host from your development machine:
```bash
# Sync project to jump host
./tools/sync-to-jump.sh
```
Then, from the jump host, you can sync code or policies to the cluster nodes:
```bash
# Sync all application code (Python sources, scripts, etc.)
ansible-playbook ansible/playbook-stress-sync-code.yml -i ansible/inventory.green.ini
# Sync only policies and CLI configs
ansible-playbook ansible/playbook-stress-sync-policies.yml -i ansible/inventory.green.ini
# To sync files from a custom source directory on the Ansible controller, use the 'source_base_dir' extra variable:
ansible-playbook ansible/playbook-stress-sync-policies.yml -i ansible/inventory.green.ini -e "source_base_dir=/path/to/my-custom-source"
```
### Docker Image Updates
To update the `yt-dlp` Docker image used by the download simulators, run the following playbook. It builds the image locally on each worker node.
```bash
# Build the yt-dlp docker image locally on each worker node
ansible-playbook ansible/playbook-update-yt-dlp-docker.yml -i ansible/inventory.green.ini
```
### Adding a New Worker
1. Update `cluster.green.yml` with the new worker definition:
```yaml
workers:
  new-worker:
    ip: x.x.x.x
    port: 22
    profile_prefixes:
      - "user4"
    proxies:
      - "sslocal-rust-1090"
```
2. Regenerate inventory:
```bash
./tools/generate-inventory.py cluster.green.yml
```
3. Run the full installation playbook, limiting it to the new worker:
```bash
ansible-playbook ansible/playbook-full-install.yml -i ansible/inventory.green.ini --limit new-worker
```
4. Start processes on the new worker:
```bash
ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=start" --limit new-worker
```
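Once `cluster.green.yml` has been edited, steps 2 to 4 above can be chained so that a failure in one step prevents the next from running. This is a sketch: `add_worker` and the `ANSIBLE_PLAYBOOK`, `GENERATE_INVENTORY`, `INVENTORY`, and `CLUSTER_FILE` override variables are hypothetical names introduced for illustration.
```bash
# Sketch only: add_worker and the override variables below are
# hypothetical names introduced for this example.
add_worker() {
    local worker="${1:-}"
    local runner="${ANSIBLE_PLAYBOOK:-ansible-playbook}"
    local generate="${GENERATE_INVENTORY:-./tools/generate-inventory.py}"
    local inventory="${INVENTORY:-ansible/inventory.green.ini}"
    local cluster="${CLUSTER_FILE:-cluster.green.yml}"
    if [ -z "$worker" ]; then
        echo "usage: add_worker <worker-name>" >&2
        return 2
    fi
    # Step 2: regenerate the inventory from the edited cluster file.
    "$generate" "$cluster" &&
    # Step 3: full installation, limited to the new worker.
    "$runner" ansible/playbook-full-install.yml -i "$inventory" --limit "$worker" &&
    # Step 4: start processes on the new worker only.
    "$runner" ansible/playbook-stress-lifecycle.yml -i "$inventory" \
        -e "action=start" --limit "$worker"
}
```
Running `add_worker new-worker` after editing the cluster file would then perform the remaining steps in order.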
### Removing a Worker
1. Stop all processes on the worker:
```bash
ansible-playbook ansible/playbook-stress-lifecycle.yml -i ansible/inventory.green.ini -e "action=stop" --limit worker-to-remove
```
2. Remove the worker's definition from `cluster.green.yml`.
3. Regenerate inventory:
```bash
./tools/generate-inventory.py cluster.green.yml
```
### Emergency Stop All
```bash
# Stop all processes on all nodes
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=stop-all"
```
### Stopping Specific Nodes
To stop processes on a specific worker or group of workers, use the `stop-nodes` action and restrict the playbook run with `--limit`.
```bash
# Stop all processes on a single worker (e.g., dl003)
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=stop-nodes" --limit dl003
# Stop all processes on all workers
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=stop-nodes" --limit workers
```
### Restart Enforcer and Monitoring
```bash
# Restart monitoring and enforcer on master using default policies
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=restart-monitoring"
# Restart using a custom enforcer policy
ansible-playbook ansible/playbook-stress-control.yml -i ansible/inventory.green.ini -e "action=restart-monitoring" -e "enforcer_policy=policies/my_other_enforcer.yaml"
```