# Ansible-driven YT-DLP / Airflow Cluster – Quick-Start & Cheat-Sheet

One playbook = one command to deploy, update, restart, or re-configure the entire cluster.

## 0. Prerequisites (run once on the tower server)

---
## 1. Ansible Vault Setup (run once on your **local machine**)
This project uses Ansible Vault to encrypt sensitive data like passwords and API keys. To run the playbooks, you need to provide the vault password. The recommended way is to create a file named `.vault_pass` in the root of the project directory.
1. **Create the Vault Password File:**
From the project's root directory (e.g., `/opt/yt-ops-services`), create the file. The file should contain only your vault password on a single line.
```bash
# Replace 'your_secret_password_here' with your actual vault password
echo "your_secret_password_here" > .vault_pass
```
2. **Secure the File:**
It's good practice to restrict permissions on this file so only you can read it.
```bash
chmod 600 .vault_pass
```
The `ansible.cfg` file is configured to automatically look for this `.vault_pass` file in the project root.
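
As a variation on the two steps above, the file can be created with restrictive permissions from the start, so the password is not typed as part of a command line (where it may end up in shell history). This is only a sketch, assuming bash; the function name `write_vault_pass` is hypothetical and not part of the project.

```bash
# Hypothetical helper (not part of the project): write the vault password
# file with owner-only permissions from the moment it is created.
# Usage: write_vault_pass [path]    (path defaults to .vault_pass)
write_vault_pass() {
  local target="${1:-.vault_pass}"
  local password
  if [ -t 0 ]; then
    # Interactive terminal: prompt without echoing the typed characters.
    read -r -s -p "Vault password: " password
    echo
  else
    # Piped input: read a single line from stdin.
    read -r password
  fi
  # The subshell keeps the restrictive umask from leaking into the caller.
  (
    umask 077
    printf '%s\n' "$password" > "$target"
  )
  # Covers the case where the file already existed with looser permissions.
  chmod 600 "$target"
}
```

Run `write_vault_pass` from the project root and type the password at the prompt; the resulting file contains the password on a single line, as required above.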
---
## 1.5. Cluster & Inventory Management
The Ansible inventory (`ansible/inventory.ini`), host-specific variables (`ansible/host_vars/`), and the master `docker-compose.yaml` are dynamically generated from a central cluster definition file (e.g., `cluster.yml`).
**Whenever you add, remove, or change the IP of a node in your `cluster.yml`, you must re-run the generator script.**
1. **Install Script Dependencies (run once):**
The generator script requires `PyYAML` and `Jinja2`. Install them using pip:
```bash
pip3 install PyYAML Jinja2
```
2. **Edit Your Cluster Definition:**
Modify your `cluster.yml` file (located in the project root) to define your master and worker nodes.
3. **Run the Generator Script:**
From the project's root directory, run the following command to update all generated files:
```bash
# Make sure the script is executable first: chmod +x tools/generate-inventory.py
./tools/generate-inventory.py cluster.yml
```
This ensures that Ansible has the correct host information and that the master node's Docker Compose configuration includes the correct `extra_hosts` for log fetching from workers.
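
Because a forgotten generator run leaves Ansible working from outdated host data, a small guard can compare modification times before a deploy. The sketch below assumes it is run from the project root and that the generator rewrites `ansible/inventory.ini` on every run; the function name `inventory_is_stale` is hypothetical.

```bash
# Hypothetical guard (not part of the project): succeeds (exit status 0) when
# the generated inventory is missing or older than the cluster definition.
# Usage: inventory_is_stale [cluster_file] [inventory_file]
inventory_is_stale() {
  local cluster_file="${1:-cluster.yml}"
  local inventory="${2:-ansible/inventory.ini}"
  [ ! -e "$inventory" ] || [ "$cluster_file" -nt "$inventory" ]
}

# Example (run from the project root): regenerate only when needed.
# if inventory_is_stale; then ./tools/generate-inventory.py cluster.yml; fi
```

The check looks only at file timestamps, so it cannot tell whether an edit to `cluster.yml` actually changed any node; re-running the generator in that case is harmless.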
---
## 2. Setup and Basic Usage
### Running Ansible Commands
**IMPORTANT:** All `ansible-playbook` commands should be run from within the `ansible/` directory. This allows Ansible to automatically find the `ansible.cfg` and `inventory.ini` files.
```bash
cd ansible
ansible-playbook <playbook_name>.yml
```

The `ansible.cfg` file is configured to automatically use the `.vault_pass` file located in the project root (one level above `ansible/`), so you do not need to pass `--vault-password-file ../.vault_pass` in your commands.

If you run `ansible-playbook` from the project root instead of the `ansible/` directory, you will see warnings about the inventory not being parsed, because Ansible does not automatically find `ansible/ansible.cfg`.

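
The working-directory requirement can be wrapped in a small helper so the command behaves the same wherever it is invoked. This is a sketch only: the function name `run_playbook` and the `YT_OPS_ROOT` variable are hypothetical, and the default path simply mirrors the example project root mentioned earlier.

```bash
# Hypothetical wrapper (not part of the project): always run ansible-playbook
# from inside <project root>/ansible so ansible.cfg and inventory.ini are found.
# YT_OPS_ROOT is an assumed variable pointing at the project root.
run_playbook() {
  local root="${YT_OPS_ROOT:-/opt/yt-ops-services}"
  # Subshell: the cd does not change the caller's working directory.
  (
    cd "$root/ansible" || exit 1
    ansible-playbook "$@"
  )
}
```

With this defined in your shell, `run_playbook playbook-dags.yml` works from any directory and passes all extra arguments through unchanged.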
---

## 3. Deployment Scenarios

### Full Cluster Deployment

To deploy or update the entire cluster (master and all workers), run the main playbook. This will build/pull images and restart all services.

```bash
# Run from inside the ansible/ directory
ansible-playbook playbook-full.yml
```

### Targeted & Fast Deployments

For faster development cycles, you can deploy changes to specific parts of the cluster without rebuilding or re-pulling Docker images.

#### Updating Only the Master Node (Fast Deploy)

To sync configuration and code and restart services on the master node without rebuilding the Airflow image or pulling the `ytdlp-ops-server` image, use the `fast_deploy` flag with the master playbook. This is ideal for pushing changes to DAGs, Python code, or config files.

```bash
# Run from inside the ansible/ directory
ansible-playbook playbook-master.yml --extra-vars "fast_deploy=true"
```

#### Updating Only a Specific Worker Node (Fast Deploy)

Similarly, you can update a single worker node. Replace `dl001` with the hostname of the worker you want to target from your `inventory.ini`.

```bash
# Run from inside the ansible/ directory
ansible-playbook playbook-worker.yml --limit dl001 --extra-vars "fast_deploy=true"
```

#### Updating Only DAGs and Configs

If you have only changed DAGs or configuration files and don't need to restart any services, you can run a much faster playbook that only syncs the `dags/` and `config/` directories.

```bash
# Run from inside the ansible/ directory
ansible-playbook playbook-dags.yml
```
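
The four scenarios above can be condensed into a single lookup. The sketch below only prints the matching command so it can be reviewed before running; the function name `deploy_command` and its scenario keywords are hypothetical, while the playbook names and flags are the ones listed in this section.

```bash
# Hypothetical cheat-sheet helper (not part of the project): print the
# ansible-playbook command for a given deployment scenario.
# Usage: deploy_command {full|master|worker <hostname>|dags}
deploy_command() {
  case "${1:-}" in
    full)
      echo 'ansible-playbook playbook-full.yml' ;;
    master)
      echo 'ansible-playbook playbook-master.yml --extra-vars "fast_deploy=true"' ;;
    worker)
      # A worker deploy needs the inventory hostname to pass to --limit.
      if [ -z "${2:-}" ]; then
        echo 'usage: deploy_command worker <hostname>' >&2
        return 2
      fi
      echo "ansible-playbook playbook-worker.yml --limit $2 --extra-vars \"fast_deploy=true\"" ;;
    dags)
      echo 'ansible-playbook playbook-dags.yml' ;;
    *)
      echo 'usage: deploy_command {full|master|worker <hostname>|dags}' >&2
      return 2 ;;
  esac
}
```

For example, `deploy_command worker dl001` prints the worker fast-deploy command shown above; run the printed command from inside the `ansible/` directory.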