Ansible Deployment for YT-DLP Cluster

This document provides an overview of the Ansible playbooks used to deploy and manage the YT-DLP Airflow cluster.

Main Playbooks

These are the primary entry points for cluster management.

  • playbook-full-with-proxies.yml: (Recommended entry point) Deploys the Shadowsocks proxies first, then the entire application stack.
  • playbook-full.yml: Deploys the entire application stack (master and workers) without touching proxies.
  • playbook-master.yml: Deploys/updates only the Airflow master node.
  • playbook-worker.yml: Deploys/updates all Airflow worker nodes.
  • playbook-proxies.yml: Deploys/updates only the Shadowsocks proxy services on all nodes.
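
The entry points above are ordinary playbooks, so they can be run with ansible-playbook. The following is a minimal sketch, not the project's documented procedure: the inventory path (inventory.ini) and the run_playbook helper are assumptions introduced for illustration, and the README does not state whether every task supports check mode. The helper prints each command by default; set DRY_RUN=0 to execute.

```shell
#!/bin/sh
# Sketch only: the inventory path is an assumption, not taken from the repository.
INVENTORY="${INVENTORY:-inventory.ini}"
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: $1 is the playbook file, remaining arguments are
# passed through to ansible-playbook unchanged.
run_playbook() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "ansible-playbook -i $INVENTORY $*"
    else
        ansible-playbook -i "$INVENTORY" "$@"
    fi
}

# Full deployment, proxies first (the recommended entry point).
run_playbook playbook-full-with-proxies.yml

# Update only the master, previewing changes without applying them.
run_playbook playbook-master.yml --check --diff
```

Run from the directory that contains the playbooks. With the defaults, the script only prints the two ansible-playbook command lines, which makes it easy to review them before a real run.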

Component & Utility Playbooks

These playbooks and role task files handle more specific tasks or are called by the main playbooks.

Core Deployment Logic

  • roles/airflow-master/tasks/main.yml: Contains all tasks for setting up the Airflow master services.
  • roles/airflow-worker/tasks/main.yml: Contains all tasks for setting up the Airflow worker services.
  • roles/ytdlp-master/tasks/main.yml: Contains tasks for setting up the YT-DLP management services on the master.
  • roles/ytdlp-worker/tasks/main.yml: Contains tasks for setting up YT-DLP, Camoufox, and other worker-specific services.

Utility & Maintenance

  • playbook-dags.yml: Quickly syncs only the dags/ and config/ directories to all nodes.
  • playbook-hook.yml: Syncs Airflow custom hooks and restarts relevant services.
  • playbook-sync-local.yml: Syncs local development files (e.g., ytops_client, pangramia) to workers.
  • playbooks/pause_worker.yml: Pauses a worker by creating a lock file, preventing it from taking new tasks.
  • playbooks/resume_worker.yml: Resumes a paused worker by removing the lock file.
  • playbooks/playbook-bgutils-start.yml: Starts the bgutil-provider container.
  • playbooks/playbook-bgutils-stop.yml: Stops the bgutil-provider container.
  • playbook-update-s3-vars.yml: Updates the s3_delivery_connection in Airflow.
  • playbook-update-regression-script.yml: Updates the regression.py script on the master.
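
As an illustration of the pause/resume pair above, a single worker could be taken out of rotation and brought back with something like the sketch below. The inventory path, the host name worker-01, and the set_worker_state helper are hypothetical, and the sketch assumes the playbooks honor ansible-playbook's --limit option for host selection, which the README does not confirm. Commands are printed by default; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch only: inventory path and host name are hypothetical.
INVENTORY="${INVENTORY:-inventory.ini}"
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: $1 = pause | resume, $2 = inventory host name.
set_worker_state() {
    case "$1" in
        pause|resume) ;;
        *) echo "usage: set_worker_state pause|resume HOST" >&2; return 2 ;;
    esac
    # Rebuild the positional parameters as the full command line.
    set -- ansible-playbook -i "$INVENTORY" "playbooks/${1}_worker.yml" --limit "$2"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stop worker-01 from taking new tasks (creates the lock file).
set_worker_state pause worker-01

# Let it accept tasks again (removes the lock file).
set_worker_state resume worker-01
```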

Deprecated

  • playbook-dl.yml: Older worker deployment logic. Superseded by playbook-worker.yml.
  • playbook-depricated.dl.yml: Older worker deployment logic. Superseded by playbook-worker.yml.