Compare commits


No commits in common. "9fd06b4a7bbead011817cef744879b2a0677fdad" and "2be7716bed6693c5cd08eb57460524633d472f9d" have entirely different histories.

217 changed files with 33001 additions and 2819 deletions

4731
.aider.chat.history.md Normal file

File diff suppressed because it is too large

180
.aider.input.history Normal file
View File

@ -0,0 +1,180 @@
# 2025-08-19 17:52:06.094899
+Y
# 2025-08-19 17:55:33.616531
+D
# 2025-08-19 17:55:35.382770
+No
# 2025-08-19 17:55:39.050939
+ls
# 2025-08-19 17:56:18.910148
+/read-only ../yt-dlp-deployment/ansible/airflow_worker
# 2025-08-26 19:27:21.057865
+/model openrouter/qwen/qwen3-coder
# 2025-08-26 19:29:07.866685
+ok. I have copied major files from the other repo.. You need to analyze what is necessary, fix paths, and feel free to remove or restructure ansible and create a README, but don't change the code much, it is workable. So now you need to recheck whether we need the top-level folders as they are, or change them (duplicates thrift_model and pangramia). take a look at ansible/
# 2025-08-26 19:30:47.790538
+D
# 2025-08-26 19:30:56.392322
+No
# 2025-08-26 19:32:26.827936
+Sorry .. this thrift and pangramia are only needed to 1) allow clients to address the services 2) be copied into the airflow dockerfiles so the dags can work with them... So examine ansible/ first /add ansible/
# 2025-08-26 19:32:47.626964
+D
# 2025-08-26 19:32:51.274712
+No
# 2025-08-26 19:32:53.860479
+D
# 2025-08-26 19:33:07.362019
+No
# 2025-08-26 19:33:20.996991
+/undo setup.py
# 2025-08-26 19:33:49.644236
+/add ansible
# 2025-08-26 19:34:17.769816
+Please examine which files are addressed, but probably missed... from ansible?
# 2025-08-26 19:34:59.954086
+/model openrouter/qwen/qwen3-coder
# 2025-08-26 19:35:21.760002
+/ask please analyze which files are addressed from ansible, but probably missed from this repo
# 2025-08-26 19:48:30.224955
+D
# 2025-08-26 19:58:41.483184
+please create those folders if missing, but also say if we might remove some top-level folders /run ls
# 2025-08-26 19:59:25.580454
+D
# 2025-08-26 19:59:31.175830
+S
# 2025-08-26 20:04:17.221108
+/run ls
# 2025-08-26 20:09:50.479402
+/add airflow/Dockerfile
# 2025-08-26 20:10:42.985326
+Sorry, can you analyze whether all those top-level folders are needed and referenced properly
# 2025-08-26 20:11:08.062598
+D
# 2025-08-26 20:15:58.400588
+Sorry ... it seems to me that thrift_model, yt_ops and ytdlp_ops_auth are all for the thrift dependencies used in the airflow/Dockerfile build. But we need to check whether they need to stay top-level or be moved into a dedicated folder, and update the ansible references, since ansible copies them into both master and worker for the build.. same for setup.py: it's only there to install that package locally (e.g. pangramia) so it can be used by clients (two files alike ...) ... So we can probably move them, but you need to recheck the ansible refs
# 2025-08-26 20:17:25.069042
+/run ls
# 2025-08-26 20:18:34.524722
+we probably may move both top-level py files as well as setup.py to that yt_ops_package/ while adding inside pangramia => ln -s to thrift_model/gen_py/pangramia).
# 2025-08-26 20:19:52.199608
+/model
# 2025-08-26 20:20:05.979177
+/model openrouter/qwen/qwen3-coder
# 2025-08-26 20:20:19.321643
+sorry, you probably failed with the previous one
# 2025-08-26 20:21:40.616011
+may I move top level python files to package, ok? then you will update ansible?
# 2025-08-26 20:22:51.544085
+/model openrouter/moonshotai/kimi-k2
# 2025-08-26 20:23:00.944724
+exit
# 2025-08-26 20:25:10.944346
+/add ansible
# 2025-08-26 20:36:07.449988
+/add README.md
# 2025-08-26 20:36:15.896777
+/run ls
# 2025-08-26 20:36:58.783499
+I haven't finished, please move thrift_model and setup.py and all py top level files to new folder yt_ops_package, and update references
# 2025-08-26 20:37:10.904873
+D
# 2025-08-26 20:42:07.375209
+No
# 2025-08-26 20:42:14.699143
+D
# 2025-08-26 20:42:23.537622
+S
# 2025-08-26 20:43:16.492226
+No.. you don't need to update those files, you only configure ansible to copy them to the proper folder
# 2025-08-26 20:47:19.166850
+D
# 2025-08-26 21:03:13.329925
+/clear
# 2025-08-26 21:03:48.279889
+Please recheck that ansible addresses the proper folders and files, change only the ansible scripts, also add README.md
# 2025-08-26 21:13:26.425838
+D
# 2025-08-26 21:13:28.330054
+S
# 2025-08-27 02:03:38.074031
+/clear
# 2025-08-27 02:04:48.031688
+do we need airflow/docker-compose.camoufox.yaml, airflow/docker-compose-master.yaml and docker-compose-dl.yaml for ansible... or do we utilize only the j2 versions... Also do we need airflow/init-airflow.sh and airflow/init-yt-service.sh and init-yt-service.sh ?
# 2025-08-27 02:04:53.172717
+D
# 2025-08-27 02:06:03.759537
+No
# 2025-08-27 02:13:28.788492
+exit
# 2025-08-27 02:15:18.658843
+can you recheck that the ansible/ folder has proper links to the files and folders from this repo, so it's ready for deployment ... Also recheck that it's ok from a logging perspective, i.e. workers and master have proper s3 logging configured
# 2025-08-27 02:15:49.033642
+/add ansible/
# 2025-08-27 02:15:51.656556
+added
# 2025-08-27 02:16:44.736374
+D
# 2025-08-27 02:17:22.140783
+S

Binary file not shown.

14
.airflowignore Normal file
View File

@ -0,0 +1,14 @@
# Ignore test files
*_test.py
# Ignore development files
dev_*.py
# Ignore temporary files
*.tmp
*.temp
# Ignore version control
.git/
.gitignore
# Ignore Python cache files
__pycache__/
*.py[cod]
*$py.class

72
.dockerignore Normal file
View File

@ -0,0 +1,72 @@
# Git files
.git
.gitignore
.gitattributes
.github/
# Node.js
node_modules
npm-debug.log
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
venv/
.env
.venv
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg
# Media and temporary files
*.mp4
*.part
*.info.json
*.webm
*.m4a
*.mp3
# Specific files to exclude
generate_tokens_parallel.mjs
generate_tokens_playwright.mjs
# OS specific files
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db
# IDE files
.idea/
.vscode/
*.swp
*.swo
# Build artifacts
target/
# yt_ops_package build artifacts
yt_ops_package/__pycache__/
yt_ops_package/*.py[cod]
yt_ops_package/*$py.class
yt_ops_package/build/
yt_ops_package/dist/
yt_ops_package/*.egg-info/

1
.gitignore vendored Normal file
View File

@ -0,0 +1 @@
.aider*

1
.vault_pass Normal file
View File

@ -0,0 +1 @@
ytdlp-ops

4
README.md Normal file
View File

@ -0,0 +1,4 @@
# YT Ops Services

View File

@ -1 +1 @@
-3.6.1
+3.6.0

View File

@ -1,6 +1,2 @@
redis-data
minio-data
-logs
-downloadfiles
-addfiles
-inputfiles

11
airflow/.env.master Normal file
View File

@ -0,0 +1,11 @@
HOSTNAME="af-green"
REDIS_PASSWORD="rOhTAIlTFFylXsjhqwxnYxDChFc"
POSTGRES_PASSWORD="pgdb_pwd_A7bC2xY9zE1wV5uP"
AIRFLOW_UID=1003
AIRFLOW_ADMIN_PASSWORD="2r234sdfrt3q454arq45q355"
YTDLP_BASE_PORT=9090
SERVER_IDENTITY="ytdlp-ops-service-mgmt"
SERVICE_ROLE=management
AIRFLOW_GID=0
MINIO_ROOT_USER=admin
MINIO_ROOT_PASSWORD=0153093693-0009

23
airflow/.env.old Normal file
View File

@ -0,0 +1,23 @@
AIRFLOW_IMAGE_NAME=apache/airflow:2.10.4
_AIRFLOW_WWW_USER_USERNAME=airflow
_AIRFLOW_WWW_USER_PASSWORD=airflow-password-ytld
AIRFLOW_UID=50000
AIRFLOW_PROJ_DIR=.
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow-new-super-pass@89.253.221.173:52919/airflow
AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:airflow-new-super-pass@89.253.221.173:52919/airflow
AIRFLOW__CELERY__BROKER_URL=redis://:rOhTAIlTFFylXsjhqwxnYxDChFc@89.253.221.173:52909/0
AIRFLOW_QUEUE=holisticlegs-download
AIRFLOW_QUEUE_CHECK=holisticlegs-check
AIRFLOW_QUEUE_UPLOAD=holisticlegs-upload
AIRFLOW__WEBSERVER__SECRET_KEY=8DJ6XbtIICassrVxM9jWV3eTlt5N3XtyEdyW
HOSTNAME=85.192.30.55
AIRFLOW_WORKER_DOWNLOAD_MEM_LIMIT=768M
AIRFLOW_WORKER_DOWNLOAD_MEM_RESERV=522M
AIRFLOW_WORKER_DOWNLOAD_CONCURRENCY=2
AIRFLOW_SMALL_WORKERS_MEM_LIMIT=1024M
AIRFLOW_SMALL_WORKERS_MEM_RESERV=512M
~

71
airflow/.env.worker Normal file
View File

@ -0,0 +1,71 @@
HOSTNAME="dl001"
MASTER_HOST_IP=89.253.221.173
REDIS_PASSWORD="rOhTAIlTFFylXsjhqwxnYxDChFc"
POSTGRES_PASSWORD="pgdb_pwd_A7bC2xY9zE1wV5uP"
AIRFLOW_UID=1003
REDIS_HOST=89.253.221.173
REDIS_PORT=52909
SERVER_IDENTITY=ytdlp-ops-service-worker-dl001
# The role of the ytdlp-ops-server instance.
# 'management': Runs only state management tasks (proxy/account status). Use for the master deployment.
# 'worker' or 'all-in-one': Runs token generation tasks. Use for dedicated worker deployments.
SERVICE_ROLE=worker
# --- Envoy & Worker Configuration ---
# The public-facing port for the Envoy load balancer that fronts the WORKERS.
ENVOY_PORT=9080
# The port for Envoy's admin/stats interface.
ENVOY_ADMIN_PORT=9901
# The public-facing port for the standalone MANAGEMENT service.
MANAGEMENT_SERVICE_PORT=9091
# The number of Python server workers to run.
# Set to 1 to simplify debugging. Multi-worker mode is experimental.
YTDLP_WORKERS=1
# The starting port for the Python workers. They will use sequential ports (e.g., 9090, 9091, ...).
YTDLP_BASE_PORT=9090
# --- Camoufox (Browser) Configuration ---
# Comma-separated list of SOCKS5 proxies to be used by Camoufox instances.
# Each proxy will get its own dedicated browser instance (1:1 mapping).
# Example: CAMOUFOX_PROXIES="socks5://user:pass@p.webshare.io:1081,socks5://user:pass@p.webshare.io:1082"
CAMOUFOX_PROXIES="socks5://sslocal-rust-1087:1087"
# Password for VNC access to the Camoufox browser instances.
VNC_PASSWORD="vnc_pwd_Z5xW8cV2bN4mP7lK"
# The starting port for VNC access. Ports will be assigned sequentially (e.g., 5901, 5902, ...).
CAMOUFOX_BASE_VNC_PORT=5901
# The internal port used by Camoufox for its WebSocket server. Usually does not need to be changed.
CAMOUFOX_PORT=12345
# Legacy mode: Use single camoufox instance for all proxies
# CAMOUFOX_LEGACY_MODE=false
# Resource monitoring configuration
CAMOUFOX_MAX_MEMORY_MB=2048
CAMOUFOX_MAX_CPU_PERCENT=80
CAMOUFOX_MAX_CONCURRENT_CONTEXTS=8
CAMOUFOX_HEALTH_CHECK_INTERVAL=30
# Mapping configuration (proxy port → camoufox instance)
# socks5://proxy:1081 → camoufox-1:12345
# socks5://proxy:1082 → camoufox-2:12345
# socks5://proxy:1083 → camoufox-3:12345
# socks5://proxy:1084 → camoufox-4:12345
# --- General Proxy Configuration ---
# A general-purpose SOCKS5 proxy that can be used alongside Camoufox proxies.
# This should be the IP address of the proxy server accessible from within the Docker network.
# '172.17.0.1' is often the host IP from within a container.
SOCKS5_SOCK_SERVER_IP=172.17.0.1
# --- Account Manager Configuration ---
# Account cooldown parameters (values are in minutes).
ACCOUNT_ACTIVE_DURATION_MIN=7
ACCOUNT_COOLDOWN_DURATION_MIN=30
MINIO_ROOT_USER=admin
MINIO_ROOT_PASSWORD=0153093693-0009
AIRFLOW_GID=0

View File

@ -1,5 +1,5 @@
-FROM apache/airflow:2.10.3
-ENV AIRFLOW_VERSION=2.10.3
+FROM apache/airflow:2.10.5
+ENV AIRFLOW_VERSION=2.10.5
WORKDIR /app
@ -33,32 +33,17 @@ RUN FFMPEG_URL="https://github.com/yt-dlp/FFmpeg-Builds/releases/download/latest
rm -rf /tmp/ffmpeg.tar.xz && \
ffmpeg -version
-# Check if airflow group exists, create it if it doesn't, then ensure proper setup
-RUN if ! getent group airflow > /dev/null 2>&1; then \
-groupadd -g 1001 airflow; \
-fi && \
-# Check if airflow user exists and is in the airflow group
-if id -u airflow > /dev/null 2>&1; then \
-usermod -a -G airflow airflow; \
-else \
-useradd -u 1003 -g 1001 -m -s /bin/bash airflow; \
-fi && \
-chown -R airflow:airflow /app && \
-chmod g+w /app
+# Ensure proper permissions, aligning GID with docker-compose.yaml (1001)
+RUN groupadd -g 1001 airflow && \
+usermod -a -G airflow airflow && \
+chown -R airflow:1001 /app
# Switch to airflow user for package installation
USER airflow
# Install base Airflow dependencies
-# [FIX] Explicitly install a version of botocore compatible with Python 3.12
-# to fix a RecursionError when handling S3 remote logs.
RUN pip install --no-cache-dir \
-"apache-airflow==${AIRFLOW_VERSION}" \
-apache-airflow-providers-docker \
-apache-airflow-providers-http \
-apache-airflow-providers-amazon \
-"botocore>=1.34.118" \
-psycopg2-binary "gunicorn==20.1.0"
+"apache-airflow==${AIRFLOW_VERSION}" apache-airflow-providers-docker apache-airflow-providers-http
# --- Install the custom yt_ops_services package ---
# Copy all the necessary source code for the package.

View File

@ -1,34 +0,0 @@
# Stage 1: Extract static assets from the Airflow image
FROM pangramia/ytdlp-ops-airflow:latest AS asset-extractor
# Switch to root to create and write to the /assets directory
USER root
# Create a temporary directory for extracted assets
WORKDIR /assets
# Copy static assets from the Airflow image.
# This dynamically finds the paths to flask_appbuilder and airflow static assets
# to be resilient to version changes.
RUN cp -R $(python -c 'import os, flask_appbuilder; print(os.path.join(os.path.dirname(flask_appbuilder.__file__), "static"))') ./appbuilder && \
cp -R $(python -c 'import os, airflow; print(os.path.join(os.path.dirname(airflow.__file__), "www/static/dist"))') ./dist
# Pre-compress the static assets using gzip
# This improves performance by allowing Caddy to serve compressed files directly.
RUN find ./appbuilder -type f -print0 | xargs -0 gzip -k -9 && \
find ./dist -type f -print0 | xargs -0 gzip -k -9
# Stage 2: Build the final Caddy image
FROM caddy:2-alpine
# Copy the pre-compressed static assets from the first stage
COPY --from=asset-extractor /assets/appbuilder /usr/share/caddy/static/appbuilder
COPY --from=asset-extractor /assets/dist /usr/share/caddy/static/dist
# Copy the Caddyfile configuration. The build context is the project root,
# so the path is relative to that.
COPY configs/Caddyfile /etc/caddy/Caddyfile
# Expose the port Caddy listens on
EXPOSE 8080

View File

@ -1,249 +0,0 @@
# Proxy and Account Management Strategy
This document describes the intelligent resource-management strategy (for proxies and accounts) used by `ytdlp-ops-server`. The goal of this system is to maximize the success rate of operations, minimize bans, and provide fault tolerance.
The server can run in different roles to support a distributed architecture, separating management tasks from token-generation tasks.
---
## Service Roles and Architecture
The server is designed to run in one of three roles, selected with the `--service-role` flag:
- **`management`**: A single lightweight service instance responsible for all management API calls.
  - **Purpose**: Provides a centralized entry point for monitoring and managing the state of all proxies and accounts in the system.
  - **Behavior**: Exposes only the management functions (`getProxyStatus`, `banAccount`, etc.). Calls to token-generation functions will fail.
  - **Deployment**: Runs as a single container (`ytdlp-ops-management`) and exposes its port directly on the host (e.g., port `9091`), bypassing Envoy.
- **`worker`**: The main workhorse for generating tokens and `info.json`.
  - **Purpose**: Handles all token-generation requests.
  - **Behavior**: Implements the full API, but its management functions are limited to its own `server_identity`.
  - **Deployment**: Runs as a scalable service (`ytdlp-ops-worker`) behind the Envoy load balancer (e.g., port `9080`).
- **`all-in-one`** (default): A single instance that performs both management and worker functions. Ideal for local development or small deployments.
This architecture enables a reliable, federated system in which workers manage their resources locally while a central service provides a global view for management and monitoring.
---
## 1. Account Lifecycle Management (Cooldown / Resting)
**Goal:** Prevent overuse and subsequent banning of accounts by giving them "rest" periods after intensive work.
### How it works:
An account's lifecycle consists of three states (a sketch of this cycle follows at the end of this section):
- **`ACTIVE`**: The account is active and used for tasks. Its activity timer starts on the first successful use.
- **`RESTING`**: If an account has been `ACTIVE` for longer than the configured limit, the `AccountManager` automatically moves it into a "resting" state. In this state the Airflow worker will not pick it for new tasks.
- **Return to `ACTIVE`**: Once the rest period is over, the `AccountManager` automatically returns the account to `ACTIVE`, making it available again.
### Configuration:
These parameters are set when starting `ytdlp-ops-server`.
- `--account-active-duration-min`: The "work time" in **minutes** an account may stay continuously active before moving to `RESTING`.
  - **Default:** `30` (minutes).
- `--account-cooldown-duration-min`: The "rest time" in **minutes** an account must remain in the `RESTING` state.
  - **Default:** `60` (minutes).
**Where to configure:**
The parameters are passed as command-line arguments when the server starts. With Docker Compose this is done in `airflow/docker-compose-ytdlp-ops.yaml`:
```yaml
command:
  # ... other parameters
  - "--account-active-duration-min"
  - "${ACCOUNT_ACTIVE_DURATION_MIN:-30}"
  - "--account-cooldown-duration-min"
  - "${ACCOUNT_COOLDOWN_DURATION_MIN:-60}"
```
You can change the defaults by setting the `ACCOUNT_ACTIVE_DURATION_MIN` and `ACCOUNT_COOLDOWN_DURATION_MIN` environment variables in your `.env` file.
**Related files:**
- `server_fix/account_manager.py`: Contains the core state-switching logic.
- `ytdlp_ops_server_fix.py`: Handles the command-line arguments.
- `airflow/docker-compose-ytdlp-ops.yaml`: Passes the arguments to the server container.
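As an editorial illustration of the `ACTIVE` / `RESTING` cycle described above, here is a minimal, self-contained sketch. It is not the code from `server_fix/account_manager.py`; only the state names and the documented default durations come from this document, everything else is assumed.

```python
import time

ACTIVE, RESTING = "ACTIVE", "RESTING"


class CooldownTracker:
    """Illustrative only: ACTIVE -> RESTING -> ACTIVE cycle for one account."""

    def __init__(self, active_duration_min=30, cooldown_duration_min=60):
        self.active_s = active_duration_min * 60
        self.cooldown_s = cooldown_duration_min * 60
        self.state = ACTIVE
        self.active_since = None   # set on the first successful use
        self.resting_since = None

    def mark_success(self):
        # The activity timer starts on the first successful use.
        if self.state == ACTIVE and self.active_since is None:
            self.active_since = time.time()

    def refresh(self):
        now = time.time()
        if self.state == ACTIVE and self.active_since and now - self.active_since >= self.active_s:
            self.state, self.resting_since = RESTING, now   # send the account to rest
        elif self.state == RESTING and now - self.resting_since >= self.cooldown_s:
            self.state, self.active_since = ACTIVE, None    # back in rotation
        return self.state
```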
---
## 2. Smart Ban Strategy
**Goal:** Avoid banning good proxies without cause. The problem is often the account, not the proxy it works through.
### How it works:
#### Stage 1: Ban the Account First
- When a serious, ban-worthy error occurs (e.g., `BOT_DETECTED` or `SOCKS5_CONNECTION_FAILED`), the system applies sanctions **only to the account** that triggered the error.
- For the proxy, the error is merely recorded as a single failure; the proxy itself is **not banned** and stays in rotation.
#### Stage 2: Ban the Proxy via a Sliding Window
- A proxy is banned automatically only if it shows **systematic failures with DIFFERENT accounts** within a short period of time.
- This is a reliable indicator that the proxy itself is the problem. The server-side `ProxyManager` tracks this and bans such a proxy automatically (see the sketch at the end of this section).
### Configuration:
These parameters are **hard-coded** as constants; changing them requires editing the file.
**Where to configure:**
- **File:** `server_fix/proxy_manager.py`
- **Constants** in the `ProxyManager` class:
  - `FAILURE_WINDOW_SECONDS`: The time window, in seconds, used when analyzing failures.
    - **Default:** `3600` (1 hour).
  - `FAILURE_THRESHOLD_COUNT`: The minimum total number of failures needed to trigger the check.
    - **Default:** `3`.
  - `FAILURE_THRESHOLD_UNIQUE_ACCOUNTS`: The minimum number of **unique accounts** that failed on the proxy before it is banned.
    - **Default:** `3`.
**Related files:**
- `server_fix/proxy_manager.py`: Contains the sliding-window logic and the constants.
- `airflow/dags/ytdlp_ops_worker_per_url.py`: The `handle_bannable_error_callable` function implements the "account-only" ban policy.
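A minimal sketch of the sliding-window check described in Stage 2, assuming an in-memory event list per proxy (the real `ProxyManager` keeps its state in Redis); the three constants are the documented defaults, everything else is illustrative.

```python
import time
from collections import defaultdict

# Documented defaults; the data structures and function below are illustrative.
FAILURE_WINDOW_SECONDS = 3600
FAILURE_THRESHOLD_COUNT = 3
FAILURE_THRESHOLD_UNIQUE_ACCOUNTS = 3

# proxy_url -> list of (timestamp, account_id) failure events
_failures = defaultdict(list)


def record_failure(proxy_url: str, account_id: str) -> bool:
    """Record a failure and return True if the proxy should now be banned."""
    now = time.time()
    events = _failures[proxy_url]
    events.append((now, account_id))
    # Keep only events inside the sliding window.
    events[:] = [(ts, acc) for ts, acc in events if now - ts <= FAILURE_WINDOW_SECONDS]
    unique_accounts = {acc for _, acc in events}
    # Ban only when enough failures came from several *different* accounts.
    return (len(events) >= FAILURE_THRESHOLD_COUNT
            and len(unique_accounts) >= FAILURE_THRESHOLD_UNIQUE_ACCOUNTS)
```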
---
### Account Status Reference
You can view the status of all accounts with the `ytdlp_mgmt_proxy_account` DAG. The statuses mean the following:
- **`ACTIVE`**: The account is healthy and available for use. By default an account is considered `ACTIVE` if no explicit status is set for it.
- **`BANNED`**: The account is temporarily disabled because of repeated failures (e.g., `BOT_DETECTED` errors) or has been banned manually. The status shows the time remaining until it automatically returns to `ACTIVE` (e.g., `BANNED (active in 55m)`).
- **`RESTING`**: The account was used for an extended period and is in a mandatory "rest" period to prevent burnout. The status shows the time remaining until it returns to `ACTIVE` (e.g., `RESTING (active in 25m)`).
- **(Empty status)**: In older versions an account that had only failures (and no successes) could show an empty status. This has been fixed; such accounts are now correctly displayed as `ACTIVE`.
---
## 3. End-to-End Rotation: How Everything Works Together
This section describes, step by step, how a worker obtains an account and a proxy for a single task, tying together all the management strategies described above.
1. **Worker initialization (`ytdlp_ops_worker_per_url`)**
   - The DAG starts, triggered either by the orchestrator or by its own previous successful run.
   - The `pull_url_from_redis` task pulls a URL from the `_inbox` queue in Redis.
2. **Account selection (Airflow worker)**
   - The `assign_account` task runs.
   - It generates the full list of potential account IDs from the `account_pool` parameter (e.g., `my_prefix_01` through `my_prefix_50`).
   - It connects to Redis and checks the status of every account in that list.
   - It builds a new temporary list containing only the accounts that are **not** in the `BANNED` or `RESTING` state.
   - If the resulting list of active accounts is empty, the worker fails (unless auto-creation is enabled).
   - One account is then picked from the filtered list of active accounts using **`random.choice()`** (see the sketch after this section).
   - The selected `account_id` is passed to the next task.
3. **Proxy selection (`ytdlp-ops-server`)**
   - The `get_token` task runs and sends the randomly selected `account_id` in a Thrift RPC call to `ytdlp-ops-server`.
   - On the server, the `ProxyManager` is asked for a proxy.
   - The `ProxyManager`:
     a. Refreshes its internal state by loading the statuses of all proxies from Redis.
     b. Filters the list down to proxies with the `ACTIVE` status.
     c. Applies the sliding-window ban policy, potentially banning proxies that have failed too often recently.
     d. Picks the next available proxy from the active list using a **round-robin** index.
     e. Returns the selected `proxy_url`.
4. **Execution and reporting**
   - The server now has both the `account_id` (from Airflow) and the `proxy_url` (from its `ProxyManager`).
   - It proceeds with the token-generation process using these resources.
   - On completion (success or failure) it reports the result to Redis, updating the statuses of the specific account and proxy that were used. This affects their failure counters, rest timers, etc. for the next run.
This separation of responsibilities is key:
- **The Airflow worker (the `assign_account` task)** is responsible for **randomly selecting an active account** while keeping "affinity" (reusing the same account after a success).
- **The `ytdlp-ops-server`** is responsible for **round-robin selection of an active proxy**.
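A hedged sketch of the account-selection step performed by `assign_account`, assuming a Redis key layout of `account_status:<id>` and a client created with `decode_responses=True`; the actual DAG code may differ.

```python
import random


def pick_active_account(redis_client, prefix, pool_size, sticky=None):
    """Illustrative version of the assign_account selection step (key layout assumed)."""
    candidates = [f"{prefix}_{i:02d}" for i in range(1, pool_size + 1)]
    active = []
    for account_id in candidates:
        # Assumes a Redis client created with decode_responses=True.
        status = redis_client.hget(f"account_status:{account_id}", "status") or "ACTIVE"
        if status not in ("BANNED", "RESTING"):
            active.append(account_id)
    if not active:
        # The real DAG raises AirflowException here unless auto-creation is enabled.
        raise RuntimeError("No active accounts available")
    # Affinity: reuse the account from the previous successful run if it is still active.
    if sticky and sticky in active:
        return sticky
    return random.choice(active)
```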
---
## 4. Automatic Account Bans Based on Failure Count
**Goal:** Automatically rotate out accounts that keep producing errors unrelated to bans (e.g., wrong password, authorization problems).
### How it works:
- The `AccountManager` tracks the number of **consecutive** failures for each account.
- A successful operation resets the counter.
- If the number of consecutive failures reaches the configured threshold, the account is automatically banned for a set period (see the sketch below).
### Configuration:
These parameters are set in the `AccountManager` constructor.
**Where to configure:**
- **File:** `server_fix/account_manager.py`
- **Parameters** of `AccountManager.__init__`:
  - `failure_threshold`: Number of consecutive failures before a ban.
    - **Default:** `5`.
  - `ban_duration_s`: Ban duration in seconds.
    - **Default:** `3600` (1 hour).
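A small illustrative counter for the consecutive-failure rule above; the thresholds are the documented defaults, and the class itself is not the repository's `AccountManager`.

```python
class FailureCounter:
    """Illustrative consecutive-failure ban rule (documented defaults)."""

    def __init__(self, failure_threshold=5, ban_duration_s=3600):
        self.failure_threshold = failure_threshold
        self.ban_duration_s = ban_duration_s
        self.consecutive_failures = 0

    def on_success(self):
        self.consecutive_failures = 0  # any success resets the streak

    def on_failure(self):
        """Return the ban duration in seconds once the threshold is reached, else None."""
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            return self.ban_duration_s
        return None
```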
---
## 5. Monitoring and Recovery
### How to Check Statuses
The **`ytdlp_mgmt_proxy_account`** DAG is the main tool for monitoring the health of your resources. It connects directly to the **management service** to perform actions.
- **DAG ID:** `ytdlp_mgmt_proxy_account`
- **How to use:** Trigger the DAG from the Airflow UI. Make sure the `management_host` and `management_port` parameters point to your `ytdlp-ops-management` service instance. For a full overview set the parameters:
  - `entity`: `all`
  - `action`: `list`
- **Result:** The DAG log shows tables with the current status of all accounts and proxies. For accounts in the `BANNED` or `RESTING` state it shows the time remaining until they become active again (e.g., `RESTING (active in 45m)`). For proxies it highlights which one is `(next)` in rotation for a given worker.
### What Happens if All Accounts Are Banned or Resting?
If the entire account pool becomes unavailable (every account `BANNED` or `RESTING`), the system pauses by default.
- The `ytdlp_ops_worker_per_url` DAG fails with an `AirflowException` at the `assign_account` step, because the pool of active accounts is empty.
- This stops the processing loops. The system stays paused until accounts are unbanned manually or their ban/rest timers expire. After that you can restart the processing loops with the `ytdlp_ops_orchestrator` DAG.
- The `ytdlp_ops_worker_per_url` DAG graph now explicitly shows tasks such as `assign_account`, `get_token`, `ban_account`, `retry_get_token`, etc., which makes the execution flow and failure points easier to see.
The system can be configured to create new accounts automatically so that processing never stops completely.
#### Automatic Account Creation on Exhaustion
- **Goal**: Keep the processing pipeline running even when every account in the main pool is temporarily banned or resting.
- **How it works**: If `auto_create_new_accounts_on_exhaustion` is set to `True` and the account pool is defined by a prefix (rather than an explicit list), the system generates a new unique account ID when it finds the active pool empty (see the sketch below).
- **Naming of new accounts**: New accounts are created in the `{prefix}-auto-{unique_id}` format.
- **Configuration**:
  - **Parameter**: `auto_create_new_accounts_on_exhaustion`
  - **Where to configure**: In the `ytdlp_ops_orchestrator` DAG run configuration.
  - **Default**: `True`.
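A tiny, hedged helper showing the documented `{prefix}-auto-{unique_id}` naming; the choice of a short uuid4 hex string as the unique part is an assumption.

```python
import uuid


def make_fallback_account_id(prefix: str) -> str:
    """Generate a new account ID in the documented {prefix}-auto-{unique_id} format."""
    return f"{prefix}-auto-{uuid.uuid4().hex[:8]}"


# Example: make_fallback_account_id("my_prefix") -> "my_prefix-auto-3f2a9c1d"
```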
---
## 6. Failure Handling and Retry Policy
**Goal:** Provide flexible control over how the system behaves when a worker hits a ban-worthy error (e.g., `BOT_DETECTED`).
### How it works
When the worker's `get_token` task fails with a ban-worthy error, the behavior is governed by the `on_bannable_failure` policy, which can be set when triggering `ytdlp_ops_orchestrator`.
### Configuration
- **Parameter**: `on_bannable_failure`
- **Where to configure**: In the `ytdlp_ops_orchestrator` DAG configuration.
- **Options** (summarized in the sketch after this section):
  - `stop_loop` (the strictest):
    - The account that was used is banned.
    - The URL is marked as failed in the `_fail` hash in Redis.
    - The worker's processing loop **stops**. That processing "line" becomes inactive.
  - `retry_with_new_account` (default, the most fault-tolerant):
    - The account that caused the failure is banned.
    - The worker immediately retries **the same URL** with a new, unused account from the pool.
    - If the retry succeeds, the worker continues its loop and processes the next URL.
    - If the retry also fails, the second account **and the proxy that was used** are banned as well, and the worker's loop stops.
  - `retry_and_ban_account_only`:
    - Similar to `retry_with_new_account`, but on the second failure **only the second account** is banned, not the proxy.
    - Useful when you trust your proxies but want to cycle through failing accounts aggressively.
  - `retry_without_ban` (the most lenient):
    - The worker retries with a new account, but **neither accounts nor proxies are ever banned**.
    - Useful for debugging, or when you are confident the failures are transient and not caused by resource problems.
This policy lets the system tolerate failures of individual accounts without losing URLs, while providing granular control over when to ban accounts and/or proxies if the problem persists.
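The four policies can be summarized as a lookup table. The sketch below is illustrative only; the action names are shorthand, not task IDs from the DAG.

```python
# Illustrative mapping from the on_bannable_failure policy to the actions taken
# after the FIRST and SECOND failed attempt; the real logic lives in the worker DAG.
POLICY_ACTIONS = {
    "stop_loop": {
        "first": ["ban_account", "mark_url_failed", "stop"],
        "second": [],
    },
    "retry_with_new_account": {
        "first": ["ban_account", "retry"],
        "second": ["ban_account", "ban_proxy", "stop"],
    },
    "retry_and_ban_account_only": {
        "first": ["ban_account", "retry"],
        "second": ["ban_account", "stop"],
    },
    "retry_without_ban": {
        "first": ["retry"],
        "second": ["stop"],
    },
}


def actions_for(policy: str, attempt: int):
    """Return the shorthand actions for attempt 1 or 2 under the given policy."""
    return POLICY_ACTIONS[policy]["first" if attempt == 1 else "second"]
```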
---
## 7. Worker DAG Logic (`ytdlp_ops_worker_per_url`)
This DAG is the workhorse of the system. It is designed as a self-perpetuating loop that processes one URL per run. The failure-handling and retry logic is now explicitly visible in the DAG's task graph.
### Tasks and their purpose:
- **`pull_url_from_redis`**: Pulls a single URL from the `_inbox` queue in Redis. If the queue is empty, the DAG finishes with the `skipped` status, stopping this processing "line".
- **`assign_account`**: Picks an account for the task. It supports **account affinity**, reusing the account from the previous successful run in its "line". If this is the first run, or the previous run failed, it picks a random active account.
- **`get_token`**: The main attempt to obtain tokens and the `info.json` by calling `ytdlp-ops-server`.
- **`handle_bannable_error_branch`**: A branching task that runs when `get_token` fails. It inspects the error and decides the next step based on the `on_bannable_failure` policy.
- **`ban_account_and_prepare_for_retry`**: If a retry is allowed, this task bans the failed account and selects a new one.
- **`retry_get_token`**: The second attempt to obtain a token, using the new account.
- **`ban_second_account_and_proxy`**: If the retry also fails, this task bans the second account and the proxy that was used.
- **`download_and_probe`**: If `get_token` or `retry_get_token` succeeds, this task uses `yt-dlp` to download the media and `ffmpeg` to verify the file's integrity.
- **`mark_url_as_success`**: If `download_and_probe` succeeds, this task records the successful result in the `_result` hash in Redis.
- **`handle_generic_failure`**: If any task fails with an unrecoverable error, this task records detailed error information in the `_fail` hash in Redis.
- **`decide_what_to_do_next`**: The final branching task that decides whether to continue the loop (`trigger_self_run`), stop it gracefully (`stop_loop`), or mark it as failed (`fail_loop`).
- **`trigger_self_run`**: The task that actually triggers the next DAG run, creating a continuous loop. A sketch of how such a self-triggering task graph can be wired is shown below.
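Below is a minimal, illustrative Airflow skeleton showing how a self-triggering loop with the task names listed above can be wired. The callables are stubs, the trigger rules are simplifying assumptions, `handle_generic_failure` and `fail_loop` are omitted for brevity, and none of this is the repository's actual DAG.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator
from airflow.operators.python import BranchPythonOperator, PythonOperator
from airflow.operators.trigger_dagrun import TriggerDagRunOperator


def _stub(**_):
    # Placeholder for the real task logic.
    pass


def _route_after_get_token(**_):
    # In the real DAG this inspects the error and the on_bannable_failure policy.
    return "ban_account_and_prepare_for_retry"


with DAG(
    dag_id="ytdlp_ops_worker_per_url_sketch",
    start_date=datetime(2025, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    pull_url = PythonOperator(task_id="pull_url_from_redis", python_callable=_stub)
    assign_account = PythonOperator(task_id="assign_account", python_callable=_stub)
    get_token = PythonOperator(task_id="get_token", python_callable=_stub)
    handle_error = BranchPythonOperator(
        task_id="handle_bannable_error_branch",
        python_callable=_route_after_get_token,
        trigger_rule="one_failed",          # only runs when get_token fails (assumed)
    )
    ban_and_retry = PythonOperator(task_id="ban_account_and_prepare_for_retry", python_callable=_stub)
    retry_get_token = PythonOperator(task_id="retry_get_token", python_callable=_stub)
    ban_second = PythonOperator(task_id="ban_second_account_and_proxy", python_callable=_stub)
    download = PythonOperator(
        task_id="download_and_probe",
        python_callable=_stub,
        trigger_rule="none_failed_min_one_success",  # reached from either token attempt (assumed)
    )
    mark_success = PythonOperator(task_id="mark_url_as_success", python_callable=_stub)
    decide = BranchPythonOperator(
        task_id="decide_what_to_do_next",
        python_callable=lambda **_: "trigger_self_run",
    )
    stop_loop = EmptyOperator(task_id="stop_loop")
    trigger_self = TriggerDagRunOperator(
        task_id="trigger_self_run",
        trigger_dag_id="ytdlp_ops_worker_per_url_sketch",  # re-trigger itself to form the loop
    )

    pull_url >> assign_account >> get_token
    get_token >> [download, handle_error]
    handle_error >> ban_and_retry >> retry_get_token >> [download, ban_second]
    download >> mark_success >> decide >> [trigger_self, stop_loop]
```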

View File

@ -4,7 +4,7 @@
#
# Variable: AIRFLOW__CORE__DAGS_FOLDER
#
-dags_folder = /opt/airflow/dags
+dags_folder = /home/ubuntu/airflow/dags
# Hostname by providing a path to a callable, which will resolve the hostname.
# The format is "package.function".
@ -49,7 +49,7 @@ default_timezone = utc
#
# Variable: AIRFLOW__CORE__EXECUTOR
#
-executor = CeleryExecutor
+executor = SequentialExecutor
# The auth manager class that airflow should use. Full import path to the auth manager class.
#
@ -127,7 +127,7 @@ load_examples = False
#
# Variable: AIRFLOW__CORE__PLUGINS_FOLDER
#
-plugins_folder = /opt/airflow/plugins
+plugins_folder = /home/ubuntu/airflow/plugins
# Should tasks be executed via forking of the parent process
#
@ -261,7 +261,7 @@ dag_ignore_file_syntax = regexp
#
# Variable: AIRFLOW__CORE__DEFAULT_TASK_RETRIES
#
-default_task_retries = 3
+default_task_retries = 3 # Default retries
# The number of seconds each task is going to wait by default between retries. Can be overridden at
# dag or task level.
@ -296,7 +296,7 @@ task_success_overtime = 20
#
# Variable: AIRFLOW__CORE__DEFAULT_TASK_EXECUTION_TIMEOUT
#
-default_task_execution_timeout = 3600
+default_task_execution_timeout = 3600 # 1 hour timeout
# Updating serialized DAG can not be faster than a minimum interval to reduce database write rate.
#
@ -369,7 +369,7 @@ lazy_discover_providers = True
#
# Variable: AIRFLOW__CORE__HIDE_SENSITIVE_VAR_CONN_FIELDS
#
-hide_sensitive_var_conn_fields = False
+hide_sensitive_var_conn_fields = True
# A comma-separated list of extra sensitive keywords to look for in variables names or connection's
# extra JSON.
@ -403,7 +403,7 @@ max_map_length = 1024
#
# Variable: AIRFLOW__CORE__DAEMON_UMASK
#
-daemon_umask = 0o002
+daemon_umask = 0o077
# Class to use as dataset manager.
#
@ -478,8 +478,6 @@ test_connection = Disabled
#
max_templated_field_length = 4096
-host_docker_socket = /var/run/docker.sock
[database]
# Path to the ``alembic.ini`` file. You can either provide the file path relative
# to the Airflow home directory or the absolute path if it is located elsewhere.
@ -496,10 +494,8 @@ alembic_ini_file_path = alembic.ini
#
# Variable: AIRFLOW__DATABASE__SQL_ALCHEMY_CONN
#
-# This is configured via the AIRFLOW__DATABASE__SQL_ALCHEMY_CONN environment variable
-# in the docker-compose files, as it differs between master and workers.
-# A dummy value is set here to ensure the env var override is picked up.
-sql_alchemy_conn = postgresql://dummy:dummy@dummy/dummy
+sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@localhost:5432/airflow
+#sqlite:////home/ubuntu/airflow/airflow.db
# Extra engine specific keyword args passed to SQLAlchemy's create_engine, as a JSON-encoded value
#
@ -538,7 +534,7 @@ sql_alchemy_pool_enabled = True
#
# Variable: AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_SIZE
#
-sql_alchemy_pool_size = 20
+sql_alchemy_pool_size = 10 # Increase pool size
# The maximum overflow size of the pool.
# When the number of checked-out connections reaches the size set in pool_size,
@ -552,7 +548,7 @@ sql_alchemy_pool_size = 20
#
# Variable: AIRFLOW__DATABASE__SQL_ALCHEMY_MAX_OVERFLOW
#
-sql_alchemy_max_overflow = 30
+sql_alchemy_max_overflow = 20 # Increase max overflow
# The SQLAlchemy pool recycle is the number of seconds a connection
# can be idle in the pool before it is invalidated. This config does
@ -640,14 +636,14 @@ check_migrations = True
#
# Variable: AIRFLOW__LOGGING__BASE_LOG_FOLDER
#
-base_log_folder = /opt/airflow/logs
+base_log_folder = /home/ubuntu/airflow/logs
# Airflow can store logs remotely in AWS S3, Google Cloud Storage or Elastic Search.
# Set this to ``True`` if you want to enable remote logging.
#
# Variable: AIRFLOW__LOGGING__REMOTE_LOGGING
#
-remote_logging = True
+remote_logging = False
# Users must supply an Airflow connection id that provides access to the storage
# location. Depending on your remote logging service, this may only be used for
@ -655,7 +651,7 @@ remote_logging = True
#
# Variable: AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID
#
-remote_log_conn_id = minio_default
+remote_log_conn_id =
# Whether the local log files for GCS, S3, WASB and OSS remote logging should be deleted after
# they are uploaded to the remote location.
@ -682,7 +678,7 @@ google_key_path =
#
# Variable: AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER
#
-remote_base_log_folder = s3://airflow-logs/
+remote_base_log_folder =
# The remote_task_handler_kwargs param is loaded into a dictionary and passed to the ``__init__``
# of remote task handler and it overrides the values provided by Airflow config. For example if you set
@ -733,7 +729,7 @@ fab_logging_level = WARNING
#
# Variable: AIRFLOW__LOGGING__LOGGING_CONFIG_CLASS
#
-logging_config_class = airflow_local_settings.LOGGING_CONFIG
+logging_config_class =
# Flag to enable/disable Colored logs in Console
# Colour the logs when the controlling terminal is a TTY.
@ -798,8 +794,6 @@ log_formatter_class = airflow.utils.log.timezone_aware.TimezoneAware
#
secret_mask_adapter =
-secret_mask_exception_args = False
# Specify prefix pattern like mentioned below with stream handler ``TaskHandlerWithCustomFormatter``
#
# Example: task_log_prefix_template = {{ti.dag_id}}-{{ti.task_id}}-{{execution_date}}-{{ti.try_number}}
@ -824,7 +818,7 @@ log_processor_filename_template = {{ filename }}.log
#
# Variable: AIRFLOW__LOGGING__DAG_PROCESSOR_MANAGER_LOG_LOCATION
#
-dag_processor_manager_log_location = /opt/airflow/logs/dag_processor_manager/dag_processor_manager.log
+dag_processor_manager_log_location = /home/ubuntu/airflow/logs/dag_processor_manager/dag_processor_manager.log
# Whether DAG processor manager will write logs to stdout
#
@ -1394,7 +1388,7 @@ access_denied_message = Access is Denied
#
# Variable: AIRFLOW__WEBSERVER__CONFIG_FILE
#
-config_file = /opt/airflow/webserver_config.py
+config_file = /home/ubuntu/airflow/webserver_config.py
# The base url of your website: Airflow cannot guess what domain or CNAME you are using.
# This is used to create links in the Log Url column in the Browse - Task Instances menu,
@ -1510,7 +1504,7 @@ secret_key = tCnTbEabdFBDLHWoT/LxLw==
#
# Variable: AIRFLOW__WEBSERVER__WORKERS
#
-workers = 1
+workers = 4
# The worker class gunicorn should use. Choices include
# ``sync`` (default), ``eventlet``, ``gevent``.
@ -1531,7 +1525,7 @@ workers = 1
#
# Variable: AIRFLOW__WEBSERVER__WORKER_CLASS
#
-worker_class = gevent
+worker_class = sync
# Log files for the gunicorn webserver. '-' means log to stderr.
#
@ -1598,13 +1592,13 @@ grid_view_sorting_order = topological
#
# Variable: AIRFLOW__WEBSERVER__LOG_FETCH_TIMEOUT_SEC
#
-log_fetch_timeout_sec = 10
+log_fetch_timeout_sec = 10 # Increase timeout
# Time interval (in secs) to wait before next log fetching.
#
# Variable: AIRFLOW__WEBSERVER__LOG_FETCH_DELAY_SEC
#
-log_fetch_delay_sec = 5
+log_fetch_delay_sec = 5 # Increase delay
# Distance away from page bottom to enable auto tailing.
#
@ -1671,7 +1665,7 @@ default_dag_run_display_number = 25
#
# Variable: AIRFLOW__WEBSERVER__ENABLE_PROXY_FIX
#
-enable_proxy_fix = True
+enable_proxy_fix = False
# Number of values to trust for ``X-Forwarded-For``.
# See `Werkzeug: X-Forwarded-For Proxy Fix
@ -2104,7 +2098,7 @@ scheduler_idle_sleep_time = 1
#
# Variable: AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL
#
-min_file_process_interval = 60
+min_file_process_interval = 60 # Increase to 60 seconds
# How often (in seconds) to check for stale DAGs (DAGs which are no longer present in
# the expected files) which should be deactivated, as well as datasets that are no longer
@ -2129,7 +2123,7 @@ stale_dag_threshold = 50
#
# Variable: AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL
#
-dag_dir_list_interval = 600
+dag_dir_list_interval = 600 # Increase to 600 seconds (10 minutes)
# How often should stats be printed to the logs. Setting to 0 will disable printing stats
#
@ -2183,7 +2177,7 @@ orphaned_tasks_check_interval = 300.0
#
# Variable: AIRFLOW__SCHEDULER__CHILD_PROCESS_LOG_DIRECTORY
#
-child_process_log_directory = /opt/airflow/logs/scheduler
+child_process_log_directory = /home/ubuntu/airflow/logs/scheduler
# Local task jobs periodically heartbeat to the DB. If the job has
# not heartbeat in this many seconds, the scheduler will mark the
@ -2335,7 +2329,7 @@ trigger_timeout_check_interval = 15
#
# Variable: AIRFLOW__SCHEDULER__TASK_QUEUED_TIMEOUT
#
-task_queued_timeout = 300.0
+task_queued_timeout = 300.0 # Reduce to 5 minutes
# How often to check for tasks that have been in the queued state for
# longer than ``[scheduler] task_queued_timeout``.
@ -2527,7 +2521,7 @@ celery_app_name = airflow.providers.celery.executors.celery_executor
#
# Variable: AIRFLOW__CELERY__WORKER_CONCURRENCY
#
-worker_concurrency = 32
+worker_concurrency = 32 # Increase worker concurrency
# The maximum and minimum number of pool processes that will be used to dynamically resize
# the pool based on load.Enable autoscaling by providing max_concurrency,min_concurrency
@ -2553,7 +2547,7 @@ worker_concurrency = 32
#
# Variable: AIRFLOW__CELERY__WORKER_PREFETCH_MULTIPLIER
#
-worker_prefetch_multiplier = 2
+worker_prefetch_multiplier = 2 # Increase prefetch multiplier
# Specify if remote control of the workers is enabled.
# In some cases when the broker does not support remote control, Celery creates lots of
@ -2570,8 +2564,7 @@ worker_enable_remote_control = true
#
# Variable: AIRFLOW__CELERY__BROKER_URL
#
-# This will be configured via environment variables, as it differs between master and workers.
-# broker_url =
+broker_url = redis://redis:6379/0
# The Celery result_backend. When a job finishes, it needs to update the
# metadata of the job. Therefore it will post a message on a message bus,
@ -2585,10 +2578,9 @@ worker_enable_remote_control = true
#
# Variable: AIRFLOW__CELERY__RESULT_BACKEND
#
-# The result_backend is intentionally left blank.
-# When blank, Airflow's CeleryExecutor defaults to using the value from
-# `sql_alchemy_conn` as the result backend, which is the recommended setup.
-result_backend =
+result_backend = redis://redis:6379/0
+#redis://:@localhost:6379/0
+# postgresql+psycopg2://airflow:airflow@localhost:5432/airflow
# Optional configuration dictionary to pass to the Celery result backend SQLAlchemy engine.
#
@ -2969,7 +2961,86 @@ xcom_objectstorage_threshold = -1
#
xcom_objectstorage_compression =
+[elasticsearch]
+# Elasticsearch host
+#
+# Variable: AIRFLOW__ELASTICSEARCH__HOST
+#
+host =
+# Format of the log_id, which is used to query for a given tasks logs
+#
+# Variable: AIRFLOW__ELASTICSEARCH__LOG_ID_TEMPLATE
+#
+log_id_template = {dag_id}-{task_id}-{run_id}-{map_index}-{try_number}
+# Used to mark the end of a log stream for a task
+#
+# Variable: AIRFLOW__ELASTICSEARCH__END_OF_LOG_MARK
+#
+end_of_log_mark = end_of_log
+# Qualified URL for an elasticsearch frontend (like Kibana) with a template argument for log_id
+# Code will construct log_id using the log_id template from the argument above.
+# NOTE: scheme will default to https if one is not provided
+#
+# Example: frontend = http://localhost:5601/app/kibana#/discover?_a=(columns:!(message),query:(language:kuery,query:'log_id: "{log_id}"'),sort:!(log.offset,asc))
+#
+# Variable: AIRFLOW__ELASTICSEARCH__FRONTEND
+#
+frontend =
+# Write the task logs to the stdout of the worker, rather than the default files
+#
+# Variable: AIRFLOW__ELASTICSEARCH__WRITE_STDOUT
+#
+write_stdout = False
+# Instead of the default log formatter, write the log lines as JSON
+#
+# Variable: AIRFLOW__ELASTICSEARCH__JSON_FORMAT
+#
+json_format = False
+# Log fields to also attach to the json output, if enabled
+#
+# Variable: AIRFLOW__ELASTICSEARCH__JSON_FIELDS
+#
+json_fields = asctime, filename, lineno, levelname, message
+# The field where host name is stored (normally either `host` or `host.name`)
+#
+# Variable: AIRFLOW__ELASTICSEARCH__HOST_FIELD
+#
+host_field = host
+# The field where offset is stored (normally either `offset` or `log.offset`)
+#
+# Variable: AIRFLOW__ELASTICSEARCH__OFFSET_FIELD
+#
+offset_field = offset
+# Comma separated list of index patterns to use when searching for logs (default: `_all`).
+# The index_patterns_callable takes precedence over this.
+#
+# Example: index_patterns = something-*
+#
+# Variable: AIRFLOW__ELASTICSEARCH__INDEX_PATTERNS
+#
+index_patterns = _all
+index_patterns_callable =
+[elasticsearch_configs]
+#
+# Variable: AIRFLOW__ELASTICSEARCH_CONFIGS__HTTP_COMPRESS
+#
+http_compress = False
+#
+# Variable: AIRFLOW__ELASTICSEARCH_CONFIGS__VERIFY_CERTS
+#
+verify_certs = True
[fab]
# This section contains configs specific to FAB provider.
@ -3163,5 +3234,8 @@ spark_inject_parent_job_info = False
#
# templated_html_content_path =
+[core]
+host_docker_socket = /var/run/docker.sock
[docker]
docker_url = unix://var/run/docker.sock
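
The hunks above switch the logging settings (`remote_logging`, `remote_log_conn_id`, `remote_base_log_folder`, `logging_config_class`) between an S3/MinIO setup and plain local files. Airflow can also take these values from `AIRFLOW__<SECTION>__<KEY>` environment variables instead of `airflow.cfg`; the sketch below only restates the S3-side values visible in the diff, and how they are injected per host is left as an assumption.

```python
# Environment-variable overrides equivalent to the S3/MinIO logging side of the diff.
# Airflow maps AIRFLOW__<SECTION>__<KEY> onto the corresponding airflow.cfg option.
S3_LOGGING_ENV = {
    "AIRFLOW__LOGGING__REMOTE_LOGGING": "True",
    "AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID": "minio_default",
    "AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER": "s3://airflow-logs/",
    # Custom logging class referenced by the diff; assumes airflow_local_settings.py is importable.
    "AIRFLOW__LOGGING__LOGGING_CONFIG_CLASS": "airflow_local_settings.LOGGING_CONFIG",
}

if __name__ == "__main__":
    for key, value in S3_LOGGING_ENV.items():
        print(f"{key}={value}")
```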

BIN
airflow/config/.DS_Store vendored Normal file

Binary file not shown.

View File

@ -1,7 +0,0 @@
import logging
from copy import deepcopy
from airflow.config_templates.airflow_local_settings import DEFAULT_LOGGING_CONFIG
logger = logging.getLogger(__name__)
LOGGING_CONFIG = deepcopy(DEFAULT_LOGGING_CONFIG)

View File

@ -1,16 +1,17 @@
{
-"minio_default": {
+"minio_default":
+{
"conn_type": "aws",
-"host": "{{ hostvars[groups['airflow_master'][0]].ansible_host }}",
+"host": "{% raw %}{{ hostvars[groups['airflow_master'][0]].ansible_host }}{% endraw %}",
"login": "admin",
"password": "0153093693-0009",
"port": 9000,
-"extra": {
-"endpoint_url": "http://{{ hostvars[groups['airflow_master'][0]].ansible_host }}:9000",
-"region_name": "us-east-1",
+"extra":
+{
+"endpoint_url": "http://{% raw %}{{ hostvars[groups['airflow_master'][0]].ansible_host }}{% endraw %}:9000",
"aws_access_key_id": "admin",
"aws_secret_access_key": "0153093693-0009",
-"verify": false
+"region_name": "us-east-1"
}
}
}
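
This Jinja-templated connections file is rendered by Ansible and then loaded into Airflow, typically with the stock `airflow connections import <file>` CLI. A hedged Python equivalent (the file name and key layout are assumed from the diff) would look roughly like this:

```python
import json

from airflow.models import Connection
from airflow.utils.session import create_session


def import_connections(path="connections.json"):
    """Create Airflow connections from a rendered JSON file like the one in this diff."""
    with open(path) as fh:
        payload = json.load(fh)
    with create_session() as session:
        for conn_id, cfg in payload.items():
            if session.query(Connection).filter_by(conn_id=conn_id).first():
                continue  # keep existing connections untouched
            session.add(Connection(
                conn_id=conn_id,
                conn_type=cfg["conn_type"],
                host=cfg.get("host"),
                login=cfg.get("login"),
                password=cfg.get("password"),
                port=cfg.get("port"),
                extra=json.dumps(cfg.get("extra", {})),
            ))
```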

View File

@ -0,0 +1,13 @@
{
"redis_default":
{
"conn_type": "redis",
"host": "89.253.221.173",
"password": "rOhTAIlTFFylXsjhqwxnYxDChFc",
"port": 52909,
"extra":
{
"db": 0
}
}
}

View File

@ -2,8 +2,8 @@
"redis_default": "redis_default":
{ {
"conn_type": "redis", "conn_type": "redis",
"host": "{{ hostvars[groups['airflow_master'][0]].ansible_host }}", "host": "redis",
"port": 52909, "port": 6379,
"password": "{{ vault_redis_password }}", "password": "{{ vault_redis_password }}",
"extra": "{\"db\": 0}" "extra": "{\"db\": 0}"
} }

View File

@ -1,33 +0,0 @@
:8080 {
# Serve pre-compressed static assets and enable on-the-fly compression for other assets.
encode gzip
# Define routes for static assets.
# Caddy will automatically look for pre-gzipped files (.gz) if available.
route /static/appbuilder* {
uri strip_prefix /static/appbuilder
root * /usr/share/caddy/static/appbuilder
file_server {
precompressed gzip
}
}
route /static/dist* {
uri strip_prefix /static/dist
root * /usr/share/caddy/static/dist
file_server {
precompressed gzip
}
}
# Reverse proxy all other requests to the Airflow webserver.
route {
reverse_proxy airflow-webserver:8080 {
# Set headers to ensure correct proxy behavior
header_up Host {http.request.host}
header_up X-Real-IP {http.request.remote.ip}
header_up X-Forwarded-For {http.request.remote.ip}
header_up X-Forwarded-Proto {http.request.scheme}
}
}
}

View File

@ -1,82 +0,0 @@
# THIS FILE IS AUTO-GENERATED BY generate_envoy_config.py
# DO NOT EDIT MANUALLY.
#
# It contains the service definitions for the camoufox instances
# and adds the necessary dependencies to the main services.
services:
{% for proxy in camoufox_proxies %}
{% set container_base_port = camoufox_port + loop.index0 * worker_count %}
{% set host_base_port = container_base_port %}
camoufox-{{ loop.index }}:
build:
context: ../camoufox
dockerfile: Dockerfile
args:
VNC_PASSWORD: "{{ vnc_password }}"
image: camoufox:latest
container_name: ytdlp-ops-camoufox-{{ loop.index }}-1
restart: unless-stopped
shm_size: '2gb' # Mitigates browser crashes due to shared memory limitations
ports:
- "{{ host_base_port }}-{{ host_base_port + worker_count - 1 }}:{{ container_base_port }}-{{ container_base_port + worker_count - 1 }}"
environment:
- DISPLAY=:99
- MOZ_HEADLESS_STACKSIZE=2097152
- CAMOUFOX_MAX_MEMORY_MB=2048
- CAMOUFOX_MAX_CONCURRENT_CONTEXTS=8
- CAMOUFOX_RESTART_THRESHOLD_MB=1500
volumes:
- /tmp/.X11-unix:/tmp/.X11-unix:rw
- camoufox-data-{{ loop.index }}:/app/context-data
- camoufox-browser-cache:/root/.cache/ms-playwright # Persist browser binaries
command: [
"--ws-host", "0.0.0.0",
"--port", "{{ container_base_port }}",
"--num-instances", "{{ worker_count }}",
"--ws-path", "mypath",
"--proxy-url", "{{ proxy.url }}",
"--headless",
"--monitor-resources",
"--memory-restart-threshold", "1800",
"--preferences", "layers.acceleration.disabled=true,dom.ipc.processCount=2,media.memory_cache_max_size=102400,browser.cache.memory.capacity=102400"
]
deploy:
resources:
limits:
memory: 2.5G
logging:
driver: "json-file"
options:
max-size: "100m"
max-file: "3"
networks:
- proxynet
{% endfor %}
{% if camoufox_proxies %}
# This service is a dependency anchor. The main services depend on it,
# and it in turn depends on all camoufox instances.
camoufox-group:
image: alpine:latest
command: ["echo", "Camoufox group ready."]
restart: "no"
depends_on:
{% for proxy in camoufox_proxies %}
- camoufox-{{ loop.index }}
{% endfor %}
networks:
- proxynet
{% endif %}
volumes:
{% for proxy in camoufox_proxies %}
camoufox-data-{{ loop.index }}:
{% endfor %}
{% if camoufox_proxies %}
camoufox-browser-cache:
{% endif %}
networks:
proxynet:
name: airflow_proxynet
external: true

View File

@ -1,23 +0,0 @@
import socket
import logging
logger = logging.getLogger(__name__)
def get_ip_address():
"""
Get the primary IP address of the host.
This is used by Airflow workers to advertise their IP for log serving,
ensuring the webserver can reach them in a multi-host environment.
"""
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
try:
# This doesn't even have to be reachable
s.connect(('10.255.255.255', 1))
ip_address = s.getsockname()[0]
logger.info(f"Determined host IP address as: {ip_address}")
except Exception as e:
logger.warning(f"Could not determine IP address, falling back to 127.0.0.1. Error: {e}")
ip_address = '127.0.0.1'
finally:
s.close()
return ip_address

View File

@ -45,9 +45,6 @@ DEFAULT_MANAGEMENT_SERVICE_IP = Variable.get("MANAGEMENT_SERVICE_HOST", default_
DEFAULT_MANAGEMENT_SERVICE_PORT = Variable.get("MANAGEMENT_SERVICE_PORT", default_var=9080)
DEFAULT_REDIS_CONN_ID = "redis_default"
-# Version tracking for debugging
-DAG_VERSION = "1.7.1" # Updated to handle Redis configuration errors
# Helper function to connect to Redis, similar to other DAGs
def _get_redis_client(redis_conn_id: str):
@ -67,23 +64,7 @@ def _get_redis_client(redis_conn_id: str):
def _list_proxy_statuses(client, server_identity):
"""Lists the status of proxies."""
logger.info(f"Listing proxy statuses for server: {server_identity or 'ALL'}")
-logger.info("NOTE: Proxy statuses are read from server's internal state via Thrift service")
-try:
-statuses = client.getProxyStatus(server_identity)
-except PBServiceException as e:
-if "Redis is not configured for this server" in e.message:
-logger.error(f"Redis not configured on server: {e.message}")
-print(f"\nERROR: Server configuration issue - {e.message}\n")
-print("This server does not have Redis configured for proxy management.\n")
-return
-else:
-# Re-raise if it's a different PBServiceException
-raise
-except Exception as e:
-logger.error(f"Unexpected error getting proxy statuses: {e}", exc_info=True)
-print(f"\nERROR: Unexpected error getting proxy statuses: {e}\n")
-return
+statuses = client.getProxyStatus(server_identity)
if not statuses:
logger.info("No proxy statuses found.")
return
@ -126,7 +107,6 @@ def _list_proxy_statuses(client, server_identity):
def _list_account_statuses(client, account_id, redis_conn_id):
"""Lists the status of accounts, enriching with live data from Redis."""
logger.info(f"Listing account statuses for account: {account_id or 'ALL'}")
-logger.info("NOTE: Account statuses are read from the Thrift service and enriched with live data from Redis.")
redis_client = None
try:
@ -210,9 +190,6 @@ def _list_account_statuses(client, account_id, redis_conn_id):
def manage_system_callable(**context):
"""Main callable to interact with the system management endpoints."""
-# Log version for debugging
-logger.info(f"Running ytdlp_mgmt_proxy_account DAG version {DAG_VERSION}")
params = context["params"]
entity = params["entity"]
action = params["action"]
@ -289,85 +266,122 @@ def manage_system_callable(**context):
print(f"\nSuccessfully deleted {deleted_count} DagRun(s) for DAG '{dag_id}'.\n") print(f"\nSuccessfully deleted {deleted_count} DagRun(s) for DAG '{dag_id}'.\n")
return # End execution return # End execution
# Handle Thrift-based deletion actions # Handle direct Redis actions separately to avoid creating an unnecessary Thrift connection.
if action == "delete_from_redis": if action == "delete_from_redis":
client, transport = None, None redis_conn_id = params["redis_conn_id"]
try: redis_client = _get_redis_client(redis_conn_id)
client, transport = get_thrift_client(host, port)
if entity == "proxy": if entity == "accounts_and_proxies":
proxy_url = params.get("proxy_url") # --- Delete Proxy ---
server_identity = params.get("server_identity") proxy_url = params.get("proxy_url")
server_identity = params.get("server_identity")
if proxy_url and server_identity: if proxy_url and server_identity:
logger.info(f"Deleting proxy '{proxy_url}' for server '{server_identity}' from Redis via Thrift service...") proxy_state_key = f"proxy_status:{server_identity}"
result = client.deleteProxyFromRedis(proxy_url, server_identity)
if result: logger.warning(f"Deleting proxy '{proxy_url}' state from hash '{proxy_state_key}' from Redis.")
print(f"\nSuccessfully deleted proxy '{proxy_url}' for server '{server_identity}' from Redis.\n")
with redis_client.pipeline() as pipe:
pipe.hdel(proxy_state_key, proxy_url)
results = pipe.execute()
hdel_result = results[0]
print(f"\nSuccessfully removed proxy '{proxy_url}' from state hash (result: {hdel_result}).")
else:
logger.warning("No 'proxy_url' or 'server_identity' provided. Deleting ALL proxy state keys from Redis.")
patterns = ["proxy_status:*"]
keys_to_delete = []
for pattern in patterns:
found_keys = [key for key in redis_client.scan_iter(pattern)]
if found_keys:
logger.info(f"Found {len(found_keys)} keys for pattern '{pattern}'.")
keys_to_delete.extend(found_keys)
else: else:
print(f"\nFailed to delete proxy '{proxy_url}' for server '{server_identity}' from Redis.\n") logger.info(f"No keys found for pattern '{pattern}'.")
if not keys_to_delete:
print("\nNo proxy keys found to delete.\n")
else: else:
logger.info("Deleting all proxies from Redis via Thrift service...") print(f"\nWARNING: Found {len(keys_to_delete)} proxy-related keys to remove from Redis.")
# If server_identity is provided, delete all proxies for that server deleted_count = redis_client.delete(*keys_to_delete)
# If server_identity is None, delete all proxies for ALL servers print(f"\nSuccessfully removed {deleted_count} proxy-related keys from Redis.\n")
result = client.deleteAllProxiesFromRedis(server_identity)
if server_identity: # --- Delete Account ---
print(f"\nSuccessfully deleted all proxies for server '{server_identity}' from Redis. Count: {result}\n") account_prefix = params.get("account_id")
pattern = f"account_status:{account_prefix}*" if account_prefix else "account_status:*"
logger.warning(f"Searching for account status keys in Redis with pattern: '{pattern}'")
keys_to_delete = [key for key in redis_client.scan_iter(pattern)]
if not keys_to_delete:
print(f"\nNo accounts found matching pattern '{pattern}'.\n")
else:
print(f"\nWARNING: Found {len(keys_to_delete)} accounts to remove from Redis.")
for key in keys_to_delete[:10]:
print(f" - {key.decode('utf-8')}")
if len(keys_to_delete) > 10:
print(f" ... and {len(keys_to_delete) - 10} more.")
deleted_count = redis_client.delete(*keys_to_delete)
print(f"\nSuccessfully removed {deleted_count} accounts from Redis.\n")
return # End execution for this action
if entity == "account":
account_prefix = params.get("account_id") # Repurpose account_id param as an optional prefix
pattern = f"account_status:{account_prefix}*" if account_prefix else "account_status:*"
logger.warning(f"Searching for account status keys in Redis with pattern: '{pattern}'")
keys_to_delete = [key for key in redis_client.scan_iter(pattern)]
if not keys_to_delete:
print(f"\nNo accounts found matching pattern '{pattern}'.\n")
return
print(f"\nWARNING: Found {len(keys_to_delete)} accounts to remove from Redis.")
for key in keys_to_delete[:10]:
print(f" - {key.decode('utf-8')}")
if len(keys_to_delete) > 10:
print(f" ... and {len(keys_to_delete) - 10} more.")
deleted_count = redis_client.delete(*keys_to_delete)
print(f"\nSuccessfully removed {deleted_count} accounts from Redis.\n")
elif entity == "proxy":
proxy_url = params.get("proxy_url")
server_identity = params.get("server_identity")
if proxy_url and server_identity:
proxy_state_key = f"proxy_status:{server_identity}"
logger.warning(f"Deleting proxy '{proxy_url}' state from hash '{proxy_state_key}' from Redis.")
with redis_client.pipeline() as pipe:
pipe.hdel(proxy_state_key, proxy_url)
results = pipe.execute()
hdel_result = results[0]
print(f"\nSuccessfully removed proxy '{proxy_url}' from state hash (result: {hdel_result}).\n")
else:
logger.warning("No 'proxy_url' or 'server_identity' provided. Deleting ALL proxy state keys from Redis.")
patterns = ["proxy_status:*"]
keys_to_delete = []
for pattern in patterns:
found_keys = [key for key in redis_client.scan_iter(pattern)]
if found_keys:
logger.info(f"Found {len(found_keys)} keys for pattern '{pattern}'.")
keys_to_delete.extend(found_keys)
else: else:
print(f"\nSuccessfully deleted all proxies from Redis across ALL servers. Count: {result}\n") logger.info(f"No keys found for pattern '{pattern}'.")
elif entity == "account": if not keys_to_delete:
account_id = params.get("account_id") print("\nNo proxy keys found to delete.\n")
return
if account_id: print(f"\nWARNING: Found {len(keys_to_delete)} proxy-related keys to remove from Redis.")
logger.info(f"Deleting account '{account_id}' from Redis via Thrift service...") deleted_count = redis_client.delete(*keys_to_delete)
result = client.deleteAccountFromRedis(account_id) print(f"\nSuccessfully removed {deleted_count} proxy-related keys from Redis.\n")
if result:
print(f"\nSuccessfully deleted account '{account_id}' from Redis.\n")
else:
print(f"\nFailed to delete account '{account_id}' from Redis.\n")
else:
logger.info("Deleting all accounts from Redis via Thrift service...")
# If account_id is provided as prefix, delete all accounts with that prefix
# If account_id is None, delete all accounts
account_prefix = params.get("account_id")
result = client.deleteAllAccountsFromRedis(account_prefix)
if account_prefix:
print(f"\nSuccessfully deleted all accounts with prefix '{account_prefix}' from Redis. Count: {result}\n")
else:
print(f"\nSuccessfully deleted all accounts from Redis. Count: {result}\n")
elif entity == "accounts_and_proxies":
# Delete accounts
account_prefix = params.get("account_id") # Repurpose account_id param as an optional prefix
logger.info("Deleting accounts from Redis via Thrift service...")
account_result = client.deleteAllAccountsFromRedis(account_prefix)
if account_prefix:
print(f"\nSuccessfully deleted {account_result} account keys with prefix '{account_prefix}' from Redis.\n")
else:
print(f"\nSuccessfully deleted {account_result} account keys from Redis.\n")
# Delete proxies
server_identity = params.get("server_identity")
logger.info("Deleting proxies from Redis via Thrift service...")
proxy_result = client.deleteAllProxiesFromRedis(server_identity)
if server_identity:
print(f"\nSuccessfully deleted {proxy_result} proxy keys for server '{server_identity}' from Redis.\n")
else:
print(f"\nSuccessfully deleted {proxy_result} proxy keys from Redis across ALL servers.\n")
except (PBServiceException, PBUserException) as e:
logger.error(f"Thrift error performing delete action: {e.message}", exc_info=True)
print(f"\nERROR: Thrift service error: {e.message}\n")
raise
except Exception as e:
logger.error(f"Error performing delete action: {e}", exc_info=True)
print(f"\nERROR: An unexpected error occurred: {e}\n")
raise
finally:
if transport and transport.isOpen():
transport.close()
logger.info("Thrift connection closed.")
return # End execution for this action return # End execution for this action
client, transport = None, None client, transport = None, None
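
The new branch above bypasses the Thrift service and edits Redis directly. A minimal redis-py sketch of the key layout it assumes, with placeholder host, password, and identifiers (not values from this repo):

import redis

r = redis.Redis(host="redis", port=6379, password="<REDIS_PASSWORD>", db=0)

# Proxy state: one hash per server identity, with one field per proxy URL.
r.hdel("proxy_status:my-server", "http://203.0.113.10:3128")

# Account state: one key per account, named account_status:<account_id>.
stale = list(r.scan_iter("account_status:someprefix*"))
if stale:
    r.delete(*stale)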
@ -658,15 +672,6 @@ with DAG(
### YT-DLP Proxy and Account Manager DAG ### YT-DLP Proxy and Account Manager DAG
This DAG provides tools to manage the state of proxies and accounts used by the `ytdlp-ops-server`. This DAG provides tools to manage the state of proxies and accounts used by the `ytdlp-ops-server`.
Select an `entity` and an `action` to perform. Select an `entity` and an `action` to perform.
**IMPORTANT NOTE ABOUT DATA SOURCES:**
- **Proxy Statuses**: Read from the server's internal state via Thrift service calls.
- **Account Statuses**: Read from the Thrift service, and then enriched with live cooldown data directly from Redis.
**IMPORTANT NOTE ABOUT PROXY MANAGEMENT:**
- Proxies are managed by the server's internal state through Thrift methods
- There is NO direct Redis manipulation for proxies - they are managed entirely by the server
- To properly manage proxies, use the Thrift service methods (ban, unban, etc.)
""", """,
params={ params={
"management_host": Param(DEFAULT_MANAGEMENT_SERVICE_IP, type="string", title="Management Service Host", description="The hostname or IP of the management service. Can be a Docker container name (e.g., 'envoy-thrift-lb') if on the same network."), "management_host": Param(DEFAULT_MANAGEMENT_SERVICE_IP, type="string", title="Management Service Host", description="The hostname or IP of the management service. Can be a Docker container name (e.g., 'envoy-thrift-lb') if on the same network."),
@ -689,14 +694,14 @@ with DAG(
- `unban`: Un-ban a specific proxy. Requires `proxy_url`. - `unban`: Un-ban a specific proxy. Requires `proxy_url`.
- `ban_all`: Sets the status of all proxies for a given `server_identity` (or all servers) to `BANNED`. - `ban_all`: Sets the status of all proxies for a given `server_identity` (or all servers) to `BANNED`.
- `unban_all`: Resets the status of all proxies for a given `server_identity` (or all servers) to `ACTIVE`. - `unban_all`: Resets the status of all proxies for a given `server_identity` (or all servers) to `ACTIVE`.
- `delete_from_redis`: **(Destructive)** Deletes proxy status from Redis via Thrift service. This permanently removes the proxy from being tracked by the system. If `proxy_url` and `server_identity` are provided, it deletes a single proxy. If only `server_identity` is provided, it deletes all proxies for that server. If neither is provided, it deletes ALL proxies across all servers. - `delete_from_redis`: **(Destructive)** Deletes proxy **state** from Redis. This action does not remove the proxy from the service's configuration, but rather resets its status (ban/active, success/failure counts) to the default. The service will continue to manage the proxy. If `proxy_url` and `server_identity` are provided, it deletes a single proxy's state. If they are omitted, it deletes **ALL** proxy state keys (`proxy_status:*`).
#### Actions for `entity: account` #### Actions for `entity: account`
- `list_with_status`: View status of all accounts, optionally filtered by `account_id` (as a prefix). - `list_with_status`: View status of all accounts, optionally filtered by `account_id` (as a prefix).
- `ban`: Ban a specific account. Requires `account_id`. - `ban`: Ban a specific account. Requires `account_id`.
- `unban`: Un-ban a specific account. Requires `account_id`. - `unban`: Un-ban a specific account. Requires `account_id`.
- `unban_all`: Sets the status of all accounts (or those matching a prefix in `account_id`) to `ACTIVE`. - `unban_all`: Sets the status of all accounts (or those matching a prefix in `account_id`) to `ACTIVE`.
- `delete_from_redis`: **(Destructive)** Deletes account status from Redis via Thrift service. This permanently removes the account from being tracked by the system. If `account_id` is provided, it deletes that specific account. If `account_id` is provided as a prefix, it deletes all accounts matching that prefix. If `account_id` is empty, it deletes ALL accounts. - `delete_from_redis`: **(Destructive)** Deletes account status keys from Redis. This permanently removes the account from being tracked by the system. This is different from `unban`. Use with caution.
#### Actions for `entity: accounts_and_proxies` #### Actions for `entity: accounts_and_proxies`
- This entity performs the selected action on **both** proxies and accounts where applicable. - This entity performs the selected action on **both** proxies and accounts where applicable.
@ -705,7 +710,7 @@ with DAG(
- `unban`: Un-ban a specific proxy AND a specific account. Requires `proxy_url`, `server_identity`, and `account_id`. - `unban`: Un-ban a specific proxy AND a specific account. Requires `proxy_url`, `server_identity`, and `account_id`.
- `ban_all`: Ban all proxies for a `server_identity` (or all servers). Does not affect accounts. - `ban_all`: Ban all proxies for a `server_identity` (or all servers). Does not affect accounts.
- `unban_all`: Un-ban all proxies for a `server_identity` (or all servers) AND all accounts (optionally filtered by `account_id` as a prefix). - `unban_all`: Un-ban all proxies for a `server_identity` (or all servers) AND all accounts (optionally filtered by `account_id` as a prefix).
- `delete_from_redis`: Deletes both account and proxy status from Redis via Thrift service. For accounts, if `account_id` is provided as a prefix, it deletes all accounts matching that prefix. If `account_id` is empty, it deletes ALL accounts. For proxies, if `server_identity` is provided, it deletes all proxies for that server. If `server_identity` is empty, it deletes ALL proxies across all servers. - `delete_from_redis`: Deletes proxy and account **state** from Redis. For proxies, this resets their status but they remain managed by the service. For accounts, this permanently removes them from the system's tracking. If `proxy_url` and `server_identity` are provided, it deletes a single proxy's state. If they are omitted, it deletes **ALL** proxy state (keys matching `proxy_status:*`). It will also delete all accounts matching the `account_id` prefix (or all accounts if `account_id` is empty).
#### Actions for `entity: airflow_meta` #### Actions for `entity: airflow_meta`
- `clear_dag_runs`: **(Destructive)** Deletes DAG run history and associated task instances from the database, removing them from the UI. This allows the runs to be re-created if backfilling is enabled. - `clear_dag_runs`: **(Destructive)** Deletes DAG run history and associated task instances from the database, removing them from the UI. This allows the runs to be re-created if backfilling is enabled.
@ -716,7 +721,7 @@ with DAG(
"server_identity": Param( "server_identity": Param(
None, None,
type=["null", "string"], type=["null", "string"],
description="The identity of the server instance (for proxy management). Leave blank to list all or delete all proxies.", description="The identity of the server instance (for proxy management). Leave blank to list all.",
), ),
"proxy_url": Param( "proxy_url": Param(
None, None,
@ -726,7 +731,7 @@ with DAG(
"account_id": Param( "account_id": Param(
None, None,
type=["null", "string"], type=["null", "string"],
description="The account ID to act upon. For `unban_all` or `delete_from_redis` on accounts, this can be an optional prefix. Leave blank to delete all accounts.", description="The account ID to act upon. For `unban_all` or `delete_from_redis` on accounts, this can be an optional prefix.",
), ),
"redis_conn_id": Param( "redis_conn_id": Param(
DEFAULT_REDIS_CONN_ID, DEFAULT_REDIS_CONN_ID,
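
For completeness, a hedged sketch of triggering this DAG with explicit parameters through the Airflow stable REST API; the webserver URL and credentials are placeholders, and the dag_id is taken from the version log line removed above:

import requests

resp = requests.post(
    "http://localhost:8080/api/v1/dags/ytdlp_mgmt_proxy_account/dagRuns",
    auth=("admin", "admin"),  # placeholder basic-auth credentials
    json={"conf": {"entity": "proxy", "action": "list_with_status"}},
)
resp.raise_for_status()
print(resp.json()["dag_run_id"])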

airflow/deploy-dl.sh Executable file
View File

@ -0,0 +1,89 @@
#!/bin/bash
set -euo pipefail
# --- Environment Setup ---
ENV=""
# Parse command-line arguments
if [[ "$#" -gt 0 && "$1" == "--env" ]]; then
if [[ -n "$2" && ("$2" == "prod" || "$2" == "test") ]]; then
ENV="$2"
else
echo "Error: Invalid environment specified for deploy-dl.sh. Use 'prod' or 'test'." >&2
exit 1
fi
else
echo "Usage: $0 --env [prod|test]" >&2
exit 1
fi
# --- Configuration ---
SSH_USER="alex_p"
if [[ "$ENV" == "prod" ]]; then
WORKER_SERVERS=("dl003")
elif [[ "$ENV" == "test" ]]; then
WORKER_SERVERS=("dl001")
fi
REMOTE_DEST_PATH="/srv/airflow_dl_worker/"
# List of files and directories to sync from the project root.
# This script assumes it is run from the project root via deploy_all.sh
ROOT_FILES_TO_SYNC=(
"Dockerfile"
"get_info_json_client.py"
"proxy_manager_client.py"
"setup.py"
"VERSION"
"generate_tokens_direct.mjs"
)
AIRFLOW_FILES_TO_SYNC=(
"docker-compose-ytdlp-ops.yaml"
"init-airflow.sh"
)
DIRS_TO_SYNC=(
"airflow/camoufox/"
"airflow/inputfiles/"
"server_fix/"
"token_generator/"
"utils/"
"yt_ops_services/"
)
RSYNC_OPTS="-avz --progress --delete --exclude='__pycache__/' --exclude='*.pyc' --exclude='*.pyo' --exclude='node_modules/'"
echo ">>> Deploying to DL WORKER(S) for environment: $ENV"
# --- Deployment ---
for worker in "${WORKER_SERVERS[@]}"; do
WORKER_HOST="${SSH_USER}@${worker}"
echo "--------------------------------------------------"
echo ">>> Deploying to WORKER: $WORKER_HOST"
echo "--------------------------------------------------"
echo ">>> Creating remote directory on WORKER: $WORKER_HOST"
ssh "$WORKER_HOST" "mkdir -p $REMOTE_DEST_PATH"
echo ">>> Syncing individual files to WORKER..."
for f in "${ROOT_FILES_TO_SYNC[@]}"; do
echo " - Syncing $f"
rsync $RSYNC_OPTS "$f" "$WORKER_HOST:$REMOTE_DEST_PATH"
done
for f in "${AIRFLOW_FILES_TO_SYNC[@]}"; do
echo " - Syncing airflow/$f"
rsync $RSYNC_OPTS "airflow/$f" "$WORKER_HOST:$REMOTE_DEST_PATH"
done
echo ">>> Syncing directories to WORKER..."
for d in "${DIRS_TO_SYNC[@]}"; do
echo " - Syncing $d"
rsync $RSYNC_OPTS "$d" "$WORKER_HOST:$REMOTE_DEST_PATH"
done
echo ">>> Renaming worker compose file on remote..."
ssh "$WORK_HOST" "cd $REMOTE_DEST_PATH && ln -sf docker-compose-ytdlp-ops.yaml docker-compose.yaml"
done
echo ">>> DL WORKER(S) deployment sync complete."
exit 0

airflow/deploy-master.sh Executable file
View File

@ -0,0 +1,77 @@
#!/bin/bash
set -euo pipefail
# --- Environment Setup ---
ENV=""
# Parse command-line arguments
if [[ "$#" -gt 0 && "$1" == "--env" ]]; then
if [[ -n "$2" && ("$2" == "prod" || "$2" == "test") ]]; then
ENV="$2"
else
echo "Error: Invalid environment specified for deploy-master.sh. Use 'prod' or 'test'." >&2
exit 1
fi
else
echo "Usage: $0 --env [prod|test]" >&2
exit 1
fi
# --- Configuration ---
SSH_USER="alex_p"
if [[ "$ENV" == "prod" ]]; then
MASTER_SERVER="af-green"
elif [[ "$ENV" == "test" ]]; then
MASTER_SERVER="af-test"
fi
REMOTE_DEST_PATH="/srv/airflow_master/"
MASTER_HOST="${SSH_USER}@${MASTER_SERVER}"
# List of files and directories to sync from the project root.
# This script assumes it is run from the project root via deploy_all.sh
ROOT_FILES_TO_SYNC=(
"Dockerfile"
"get_info_json_client.py"
"proxy_manager_client.py"
"setup.py"
"VERSION"
)
AIRFLOW_FILES_TO_SYNC=(
"docker-compose-master.yaml"
"init-airflow.sh"
"nginx.conf"
)
DIRS_TO_SYNC=(
"airflow/inputfiles/"
"server_fix/"
"yt_ops_services/"
)
RSYNC_OPTS="-avz --progress --delete --exclude='__pycache__/' --exclude='*.pyc' --exclude='*.pyo' --exclude='node_modules/'"
echo ">>> Deploying to MASTER for environment: $ENV"
# --- Deployment ---
echo ">>> Creating remote directory on MASTER: $MASTER_HOST"
ssh "$MASTER_HOST" "mkdir -p $REMOTE_DEST_PATH"
echo ">>> Syncing individual files to MASTER..."
for f in "${ROOT_FILES_TO_SYNC[@]}"; do
rsync $RSYNC_OPTS "$f" "$MASTER_HOST:$REMOTE_DEST_PATH"
done
for f in "${AIRFLOW_FILES_TO_SYNC[@]}"; do
rsync $RSYNC_OPTS "airflow/$f" "$MASTER_HOST:$REMOTE_DEST_PATH"
done
echo ">>> Syncing directories to MASTER..."
for d in "${DIRS_TO_SYNC[@]}"; do
rsync $RSYNC_OPTS "$d" "$MASTER_HOST:$REMOTE_DEST_PATH"
done
echo ">>> Renaming master compose file on remote..."
ssh "$MASTER_HOST" "cd $REMOTE_DEST_PATH && ln -sf docker-compose-master.yaml docker-compose.yaml"
echo ">>> MASTER deployment sync complete."
exit 0

View File

@ -22,25 +22,9 @@ x-airflow-common:
- "{{ hostvars[host]['inventory_hostname'] }}:{{ hostvars[host]['ansible_host'] }}" - "{{ hostvars[host]['inventory_hostname'] }}:{{ hostvars[host]['ansible_host'] }}"
{% endfor %} {% endfor %}
env_file: env_file:
# The .env file is located in the project root (e.g., /srv/airflow_dl_worker), - .env
# so we provide an absolute path to it.
- "{{ airflow_worker_dir }}/.env"
environment: environment:
&airflow-common-env &airflow-common-env
AIRFLOW__CORE__PARALLELISM: 64
AIRFLOW__CORE__MAX_ACTIVE_TASKS_PER_DAG: 32
AIRFLOW__SCHEDULER__PARSING_PROCESSES: 4
AIRFLOW__WEBSERVER__WORKERS: 5
AIRFLOW__WEBSERVER__WORKER_CLASS: "gevent"
AIRFLOW__LOGGING__SECRET_MASK_EXCEPTION_ARGS: False
# Prevent slow webserver when low memory?
GUNICORN_CMD_ARGS: --max-requests 20 --max-requests-jitter 3 --worker-tmp-dir /dev/shm
# Airflow Core # Airflow Core
AIRFLOW__CORE__EXECUTOR: CeleryExecutor AIRFLOW__CORE__EXECUTOR: CeleryExecutor
AIRFLOW__CORE__LOAD_EXAMPLES: 'false' AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
@ -48,17 +32,15 @@ x-airflow-common:
# Backend connections - These should point to the master node # Backend connections - These should point to the master node
# Set MASTER_HOST_IP, POSTGRES_PASSWORD, and REDIS_PASSWORD in your .env file # Set MASTER_HOST_IP, POSTGRES_PASSWORD, and REDIS_PASSWORD in your .env file
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:${{ '{' }}POSTGRES_PASSWORD{{ '}' }}@${{ '{' }}MASTER_HOST_IP{{ '}' }}:{{ postgres_port }}/airflow AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:${POSTGRES_PASSWORD:-pgdb_pwd_A7bC2xY9zE1wV5uP}@${MASTER_HOST_IP}:5432/airflow
AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql+psycopg2://airflow:${{ '{' }}POSTGRES_PASSWORD{{ '}' }}@${{ '{' }}MASTER_HOST_IP{{ '}' }}:{{ postgres_port }}/airflow AIRFLOW__CELERY__BROKER_URL: redis://:${REDIS_PASSWORD:-redis_pwd_K3fG8hJ1mN5pQ2sT}@${MASTER_HOST_IP}:52909/0
AIRFLOW__CELERY__BROKER_URL: redis://:${REDIS_PASSWORD}@${MASTER_HOST_IP}:52909/0 AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:${POSTGRES_PASSWORD:-pgdb_pwd_A7bC2xY9zE1wV5uP}@${MASTER_HOST_IP}:5432/airflow
# Remote Logging - connection is configured directly via environment variables # Remote Logging - connection is fetched from DB, which is on master
#_PIP_ADDITIONAL_REQUIREMENTS: ${{ '{' }}_PIP_ADDITIONAL_REQUIREMENTS:- apache-airflow-providers-docker apache-airflow-providers-http thrift>=0.16.0,<=0.20.0 backoff>=2.2.1 python-dotenv==1.0.1 psutil>=5.9.0 apache-airflow-providers-amazon{{ '}' }}
AIRFLOW__LOGGING__REMOTE_LOGGING: "True" AIRFLOW__LOGGING__REMOTE_LOGGING: "True"
AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER: "s3://airflow-logs" AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER: "s3://airflow-logs"
AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID: minio_default AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID: minio_default
AIRFLOW__LOGGING__ENCRYPT_S3_LOGS: "False" AIRFLOW__LOGGING__ENCRYPT_S3_LOGS: "False"
#AIRFLOW__LOGGING__LOG_ID_TEMPLATE: "{dag_id}-{task_id}-{run_id}-{try_number}"
AIRFLOW__WEBSERVER__SECRET_KEY: 'qmALu5JCAW0518WGAqkVZQ==' AIRFLOW__WEBSERVER__SECRET_KEY: 'qmALu5JCAW0518WGAqkVZQ=='
AIRFLOW__CORE__INTERNAL_API_SECRET_KEY: 'qmALu5JCAW0518WGAqkVZQ==' AIRFLOW__CORE__INTERNAL_API_SECRET_KEY: 'qmALu5JCAW0518WGAqkVZQ=='
AIRFLOW__CORE__LOCAL_SETTINGS_PATH: "/opt/airflow/config/custom_task_hooks.py" AIRFLOW__CORE__LOCAL_SETTINGS_PATH: "/opt/airflow/config/custom_task_hooks.py"
@ -70,13 +52,12 @@ x-airflow-common:
- ${AIRFLOW_PROJ_DIR:-.}/logs:/opt/airflow/logs - ${AIRFLOW_PROJ_DIR:-.}/logs:/opt/airflow/logs
# Mount config for local settings and other configurations # Mount config for local settings and other configurations
- ${AIRFLOW_PROJ_DIR:-.}/config:/opt/airflow/config - ${AIRFLOW_PROJ_DIR:-.}/config:/opt/airflow/config
- ${AIRFLOW_PROJ_DIR:-.}/config/airflow.cfg:/opt/airflow/airflow.cfg
# Mount download directories # Mount download directories
- ${AIRFLOW_PROJ_DIR:-.}/downloadfiles:/opt/airflow/downloadfiles - ${AIRFLOW_PROJ_DIR:-.}/downloadfiles:/opt/airflow/downloadfiles
- ${AIRFLOW_PROJ_DIR:-.}/addfiles:/opt/airflow/addfiles - ${AIRFLOW_PROJ_DIR:-.}/addfiles:/opt/airflow/addfiles
- ${AIRFLOW_PROJ_DIR:-.}/inputfiles:/opt/airflow/inputfiles - ${AIRFLOW_PROJ_DIR:-.}/inputfiles:/opt/airflow/inputfiles
# Use AIRFLOW_UID from .env file to fix permission issues. # Use AIRFLOW_UID and AIRFLOW_GID from .env file to fix permission issues.
user: "${AIRFLOW_UID:-50000}" user: "${AIRFLOW_UID:-50000}:${AIRFLOW_GID:-0}"
services: services:
airflow-worker: airflow-worker:
@ -109,8 +90,6 @@ services:
AIRFLOW__CELERY__WORKER_TAGS: "dl" AIRFLOW__CELERY__WORKER_TAGS: "dl"
AIRFLOW__CELERY__WORKER_PREFETCH_MULTIPLIER: "1" AIRFLOW__CELERY__WORKER_PREFETCH_MULTIPLIER: "1"
AIRFLOW__CELERY__WORKER_CONCURRENCY: ${AIRFLOW_WORKER_DOWNLOAD_CONCURRENCY:-16} AIRFLOW__CELERY__WORKER_CONCURRENCY: ${AIRFLOW_WORKER_DOWNLOAD_CONCURRENCY:-16}
# Use prefork pool for better compatibility with blocking libraries.
AIRFLOW__CELERY__POOL: "prefork"
AIRFLOW__CELERY__TASK_ACKS_LATE: "False" AIRFLOW__CELERY__TASK_ACKS_LATE: "False"
AIRFLOW__CELERY__OPERATION_TIMEOUT: "2.0" AIRFLOW__CELERY__OPERATION_TIMEOUT: "2.0"
AIRFLOW__CELERY__WORKER_NAME: "worker-dl@%h" AIRFLOW__CELERY__WORKER_NAME: "worker-dl@%h"
@ -128,6 +107,23 @@ services:
- proxynet - proxynet
restart: always restart: always
airflow-triggerer:
<<: *airflow-common
container_name: airflow-dl-triggerer-1
hostname: ${HOSTNAME}
command: triggerer
healthcheck:
test: ["CMD-SHELL", 'airflow jobs check --job-type TriggererJob --hostname "$${HOSTNAME}"']
interval: 30s
timeout: 30s
retries: 5
start_period: 60s
environment:
<<: *airflow-common-env
PYTHONASYNCIODEBUG: 1
DUMB_INIT_SETSID: 0
restart: always
docker-socket-proxy: docker-socket-proxy:
profiles: profiles:
- disabled - disabled
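
A note on the escaping used throughout these Ansible-rendered templates: ${{ '{' }}VAR{{ '}' }} survives the Ansible Jinja pass and reaches docker-compose as the literal ${VAR}, while plain {{ var }} is substituted at render time. A minimal sketch showing the difference (requires only jinja2):

from jinja2 import Template

line = "redis://:${{ '{' }}REDIS_PASSWORD{{ '}' }}@${{ '{' }}MASTER_HOST_IP{{ '}' }}:{{ redis_port }}/0"
print(Template(line).render(redis_port=52909))
# -> redis://:${REDIS_PASSWORD}@${MASTER_HOST_IP}:52909/0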

View File

@ -45,6 +45,26 @@
# Feel free to modify this file to suit your needs. # Feel free to modify this file to suit your needs.
--- ---
name: airflow-master name: airflow-master
x-minio-common: &minio-common
image: quay.io/minio/minio:RELEASE.2025-07-23T15-54-02Z
command: server --console-address ":9001" http://minio{1...3}/data{1...2}
expose:
- "9000"
- "9001"
networks:
- proxynet
env_file:
- .env
environment:
MINIO_ROOT_USER: ${{ '{' }}MINIO_ROOT_USER:-admin{{ '}' }}
MINIO_ROOT_PASSWORD: ${{ '{' }}MINIO_ROOT_PASSWORD:-0153093693-0009{{ '}' }}
healthcheck:
test: ["CMD", "mc", "ready", "local"]
interval: 5s
timeout: 5s
retries: 5
restart: always
x-airflow-common: x-airflow-common:
&airflow-common &airflow-common
# In order to add custom dependencies or upgrade provider packages you can use your extended image. # In order to add custom dependencies or upgrade provider packages you can use your extended image.
@ -54,14 +74,13 @@ x-airflow-common:
# Add extra hosts here to allow the master services (webserver, scheduler) to resolve # Add extra hosts here to allow the master services (webserver, scheduler) to resolve
# the hostnames of your remote DL workers. This is crucial for fetching logs. # the hostnames of your remote DL workers. This is crucial for fetching logs.
# Format: - "hostname:ip_address" # Format: - "hostname:ip_address"
# This section is auto-generated by Ansible from the inventory. # IMPORTANT: This section is auto-generated from cluster.yml
extra_hosts: extra_hosts:
{% for host in groups['all'] %} {% for host_name, host_ip in all_hosts.items() %}
- "{{ hostvars[host]['inventory_hostname'] }}:{{ hostvars[host]['ansible_host'] }}" - "{{ host_name }}:{{ host_ip }}"
{% endfor %} {% endfor %}
env_file: env_file:
# The .env file is located in the project root, one level above the 'configs' directory. - .env
- ".env"
networks: networks:
- proxynet - proxynet
environment: environment:
@ -69,58 +88,54 @@ x-airflow-common:
AIRFLOW__CORE__PARALLELISM: 64 AIRFLOW__CORE__PARALLELISM: 64
AIRFLOW__CORE__MAX_ACTIVE_TASKS_PER_DAG: 32 AIRFLOW__CORE__MAX_ACTIVE_TASKS_PER_DAG: 32
AIRFLOW__SCHEDULER__PARSING_PROCESSES: 4 AIRFLOW__SCHEDULER__PARSING_PROCESSES: 4
AIRFLOW__WEBSERVER__WORKER_CLASS: gevent
AIRFLOW__WEBSERVER__WORKERS: 8
AIRFLOW__LOGGING__SECRET_MASK_EXCEPTION_ARGS: 'false'
# Prevent slow webserver when low memory?
GUNICORN_CMD_ARGS: --worker-tmp-dir /dev/shm
AIRFLOW__CORE__EXECUTOR: CeleryExecutor AIRFLOW__CORE__EXECUTOR: CeleryExecutor
# For master services, connect to Postgres and Redis using internal Docker service names. AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:${{ '{' }}POSTGRES_PASSWORD:-pgdb_pwd_A7bC2xY9zE1wV5uP{{ '}' }}@postgres/airflow
# Passwords are sourced from the .env file. AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:${{ '{' }}POSTGRES_PASSWORD:-pgdb_pwd_A7bC2xY9zE1wV5uP{{ '}' }}@postgres/airflow
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:${{ '{' }}POSTGRES_PASSWORD{{ '}' }}@postgres:5432/airflow AIRFLOW__CELERY__BROKER_URL: redis://:${{ '{' }}REDIS_PASSWORD:-redis_pwd_K3fG8hJ1mN5pQ2sT{{ '}' }}@redis:6379/0
AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql+psycopg2://airflow:${{ '{' }}POSTGRES_PASSWORD{{ '}' }}@postgres:5432/airflow
AIRFLOW__CELERY__BROKER_URL: redis://:${{ '{' }}REDIS_PASSWORD{{ '}' }}@redis:6379/0
AIRFLOW__CORE__FERNET_KEY: '' AIRFLOW__CORE__FERNET_KEY: ''
AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true' AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
AIRFLOW__CORE__LOAD_EXAMPLES: 'false' AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
AIRFLOW__API__AUTH_BACKENDS: 'airflow.api.auth.backend.basic_auth,airflow.api.auth.backend.session' AIRFLOW__API__AUTH_BACKENDS: 'airflow.api.auth.backend.basic_auth,airflow.api.auth.backend.session'
AIRFLOW_CONFIG: '/opt/airflow/config/airflow.cfg'
AIRFLOW__WEBSERVER__SECRET_KEY: 'qmALu5JCAW0518WGAqkVZQ==' AIRFLOW__WEBSERVER__SECRET_KEY: 'qmALu5JCAW0518WGAqkVZQ=='
AIRFLOW__WEBSERVER__WORKER_TIMEOUT: '120' AIRFLOW__CORE__INTERNAL_API_SECRET_KEY: 'qmALu5JCAW0518WGAqkVZQ=='
AIRFLOW__CORE__INTERNAL_API_SECRET_KEY: 'qmALu5JCAW0518WGAqkVZZQ=='
# yamllint disable rule:line-length # yamllint disable rule:line-length
# Use simple http server on scheduler for health checks # Use simple http server on scheduler for health checks
# See https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitoring/check-health.html#scheduler-health-check-server # See https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitoring/check-health.html#scheduler-health-check-server
# yamllint enable rule:line-length # yamllint enable rule:line-length
AIRFLOW__SCHEDULER__ENABLE_HEALTH_CHECK: 'true' AIRFLOW__SCHEDULER__ENABLE_HEALTH_CHECK: 'true'
AIRFLOW__DATABASE__LOAD_DEFAULT_CONNECTIONS: 'false' # WARNING: Use _PIP_ADDITIONAL_REQUIREMENTS option ONLY for a quick checks
AIRFLOW__LOGGING__REMOTE_LOGGING: 'true' # for other purpose (development, test and especially production usage) build/extend Airflow image.
#_PIP_ADDITIONAL_REQUIREMENTS: ${{ '{' }}_PIP_ADDITIONAL_REQUIREMENTS:- apache-airflow-providers-docker apache-airflow-providers-http thrift>=0.16.0,<=0.20.0 backoff>=2.2.1 python-dotenv==1.0.1 psutil>=5.9.0{{ '}' }} # The following line can be used to set a custom config file, stored in the local config folder
# If you want to use it, outcomment it and replace airflow.cfg with the name of your config file
AIRFLOW__LOGGING__REMOTE_LOGGING: "True"
AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER: "s3://airflow-logs" AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER: "s3://airflow-logs"
AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID: minio_default AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID: minio_default
AIRFLOW__LOGGING__ENCRYPT_S3_LOGS: 'false' AIRFLOW__LOGGING__ENCRYPT_S3_LOGS: "False"
{% raw %}
AIRFLOW__LOGGING__REMOTE_LOG_FORMAT: "[%%(asctime)s] {%%(filename)s:%%(lineno)d} %%(levelname)s - %%(message)s"
AIRFLOW__LOGGING__LOG_LEVEL: "INFO"
AIRFLOW__LOGGING__LOG_FILENAME_TEMPLATE: "{{ ti.dag_id }}/{{ ti.run_id }}/{{ ti.task_id }}/attempt={{ try_number }}.log"
{% endraw %}
AIRFLOW__CORE__LOCAL_SETTINGS_PATH: "/opt/airflow/config/custom_task_hooks.py" AIRFLOW__CORE__LOCAL_SETTINGS_PATH: "/opt/airflow/config/custom_task_hooks.py"
volumes: volumes:
- ${{ '{' }}AIRFLOW_PROJ_DIR:-.{{ '}' }}/dags:/opt/airflow/dags - ${{ '{' }}AIRFLOW_PROJ_DIR:-.{{ '}' }}/dags:/opt/airflow/dags
- ${{ '{' }}AIRFLOW_PROJ_DIR:-.{{ '}' }}/logs:/opt/airflow/logs - ${{ '{' }}AIRFLOW_PROJ_DIR:-.{{ '}' }}/logs:/opt/airflow/logs
- ${{ '{' }}AIRFLOW_PROJ_DIR:-.{{ '}' }}/config:/opt/airflow/config - ${{ '{' }}AIRFLOW_PROJ_DIR:-.{{ '}' }}/config:/opt/airflow/config
- ${{ '{' }}AIRFLOW_PROJ_DIR:-.{{ '}' }}/config/airflow.cfg:/opt/airflow/airflow.cfg
- ${{ '{' }}AIRFLOW_PROJ_DIR:-.{{ '}' }}/plugins:/opt/airflow/plugins - ${{ '{' }}AIRFLOW_PROJ_DIR:-.{{ '}' }}/plugins:/opt/airflow/plugins
- ${{ '{' }}AIRFLOW_PROJ_DIR:-.{{ '}' }}/downloadfiles:/opt/airflow/downloadfiles - ${{ '{' }}AIRFLOW_PROJ_DIR:-.{{ '}' }}/downloadfiles:/opt/airflow/downloadfiles
- ${{ '{' }}AIRFLOW_PROJ_DIR:-.{{ '}' }}/addfiles:/opt/airflow/addfiles - ${{ '{' }}AIRFLOW_PROJ_DIR:-.{{ '}' }}/addfiles:/opt/airflow/addfiles
- ${{ '{' }}AIRFLOW_PROJ_DIR:-.{{ '}' }}/inputfiles:/opt/airflow/inputfiles - ${{ '{' }}AIRFLOW_PROJ_DIR:-.{{ '}' }}/inputfiles:/opt/airflow/inputfiles
user: "${{ '{' }}AIRFLOW_UID:-50000{{ '}' }}:0" user: "${{ '{' }}AIRFLOW_UID:-50000{{ '}' }}:${{ '{' }}AIRFLOW_GID:-0{{ '}' }}"
depends_on: depends_on:
&airflow-common-depends-on &airflow-common-depends-on
redis: redis:
condition: service_healthy condition: service_healthy
postgres: postgres:
condition: service_healthy condition: service_healthy
minio-init: nginx-minio-lb:
condition: service_completed_successfully condition: service_healthy
services: services:
postgres: postgres:
@ -133,14 +148,8 @@ services:
POSTGRES_USER: airflow POSTGRES_USER: airflow
POSTGRES_PASSWORD: ${{ '{' }}POSTGRES_PASSWORD:-pgdb_pwd_A7bC2xY9zE1wV5uP{{ '}' }} POSTGRES_PASSWORD: ${{ '{' }}POSTGRES_PASSWORD:-pgdb_pwd_A7bC2xY9zE1wV5uP{{ '}' }}
POSTGRES_DB: airflow POSTGRES_DB: airflow
command:
- "postgres"
- "-c"
- "shared_buffers=512MB"
- "-c"
- "effective_cache_size=1536MB"
volumes: volumes:
- ./postgres-data:/var/lib/postgresql/data - postgres-db-volume:/var/lib/postgresql/data
ports: ports:
- "{{ postgres_port }}:5432" - "{{ postgres_port }}:5432"
healthcheck: healthcheck:
@ -161,7 +170,7 @@ services:
command: command:
- "redis-server" - "redis-server"
- "--requirepass" - "--requirepass"
- "${{ '{' }}REDIS_PASSWORD:-rOhTAIlTFFylXsjhqwxnYxDChFc{{ '}' }}" - "${{ '{' }}REDIS_PASSWORD:-redis_pwd_K3fG8hJ1mN5pQ2sT{{ '}' }}"
- "--bind" - "--bind"
- "*" - "*"
- "--protected-mode" - "--protected-mode"
@ -174,22 +183,18 @@ services:
- "--appendonly" - "--appendonly"
- "yes" - "yes"
volumes: volumes:
- redis-data:/data - ./redis-data:/data
expose: expose:
- 6379 - 6379
ports: ports:
- "{{ redis_port }}:6379" - "{{ redis_port }}:6379"
healthcheck: healthcheck:
test: ["CMD", "redis-cli", "-a", "${{ '{' }}REDIS_PASSWORD:-rOhTAIlTFFylXsjhqwxnYxDChFc{{ '}' }}", "ping"] test: ["CMD", "redis-cli", "-a", "${{ '{' }}REDIS_PASSWORD:-redis_pwd_K3fG8hJ1mN5pQ2sT{{ '}' }}", "ping"]
interval: 10s interval: 10s
timeout: 30s timeout: 30s
retries: 50 retries: 50
start_period: 30s start_period: 30s
restart: always restart: always
sysctls:
- net.core.somaxconn=1024
ulimits:
memlock: -1
redis-proxy-account-clear: redis-proxy-account-clear:
image: redis:7.2-bookworm image: redis:7.2-bookworm
@ -201,52 +206,65 @@ services:
command: > command: >
sh -c " sh -c "
echo 'Clearing proxy and account statuses from Redis...'; echo 'Clearing proxy and account statuses from Redis...';
redis-cli -h redis -a $${{ '{' }}REDIS_PASSWORD:-rOhTAIlTFFylXsjhqwxnYxDChFc{{ '}' }} --scan --pattern 'proxy_status:*' | xargs -r redis-cli -h redis -a $${{ '{' }}REDIS_PASSWORD:-rOhTAIlTFFylXsjhqwxnYxDChFc{{ '}' }} DEL; redis-cli -h redis -a $${{ '{' }}REDIS_PASSWORD:-redis_pwd_K3fG8hJ1mN5pQ2sT{{ '}' }} --scan --pattern 'proxy_status:*' | xargs -r redis-cli -h redis -a $${{ '{' }}REDIS_PASSWORD:-redis_pwd_K3fG8hJ1mN5pQ2sT{{ '}' }} DEL;
redis-cli -h redis -a $${{ '{' }}REDIS_PASSWORD:-rOhTAIlTFFylXsjhqwxnYxDChFc{{ '}' }} --scan --pattern 'account_status:*' | xargs -r redis-cli -h redis -a $${{ '{' }}REDIS_PASSWORD:-rOhTAIlTFFylXsjhqwxnYxDChFc{{ '}' }} DEL; redis-cli -h redis -a $${{ '{' }}REDIS_PASSWORD:-redis_pwd_K3fG8hJ1mN5pQ2sT{{ '}' }} --scan --pattern 'account_status:*' | xargs -r redis-cli -h redis -a $${{ '{' }}REDIS_PASSWORD:-redis_pwd_K3fG8hJ1mN5pQ2sT{{ '}' }} DEL;
echo 'Redis cleanup complete.' echo 'Redis cleanup complete.'
" "
depends_on: depends_on:
redis: redis:
condition: service_healthy condition: service_healthy
minio: minio1:
image: minio/minio:latest <<: *minio-common
container_name: minio hostname: minio1
networks:
- proxynet
volumes: volumes:
- ./minio-data:/data - ./minio-data/1/1:/data1
ports: - ./minio-data/1/2:/data2
- "9001:9000"
- "9002:9001" minio2:
environment: <<: *minio-common
MINIO_ROOT_USER: ${{ '{' }}MINIO_ROOT_USER:-admin{{ '}' }} hostname: minio2
MINIO_ROOT_PASSWORD: ${{ '{' }}MINIO_ROOT_PASSWORD:-0153093693-0009{{ '}' }} volumes:
command: server /data --console-address ":9001" - ./minio-data/2/1:/data1
healthcheck: - ./minio-data/2/2:/data2
test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"] depends_on:
interval: 30s minio1:
timeout: 20s condition: service_started
retries: 3
restart: always minio3:
<<: *minio-common
hostname: minio3
volumes:
- ./minio-data/3/1:/data1
- ./minio-data/3/2:/data2
depends_on:
minio2:
condition: service_started
nginx-minio-lb: nginx-minio-lb:
image: nginx:alpine image: nginx:1.19.2-alpine
container_name: nginx-minio-lb hostname: nginx-minio-lb
networks: networks:
- proxynet - proxynet
command: sh -c "apk add --no-cache curl >/dev/null 2>&1 && exec nginx -g 'daemon off;'"
volumes:
- ./nginx.conf:/etc/nginx/nginx.conf:ro
ports: ports:
- "9000:9000" - "9000:9000"
volumes: - "9001:9001"
- ./configs/nginx.conf:/etc/nginx/nginx.conf:ro
depends_on:
minio:
condition: service_healthy
healthcheck: healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"] test: ["CMD", "curl", "-f", "http://localhost:9001/minio/health/live"]
interval: 30s interval: 10s
timeout: 10s timeout: 5s
retries: 5 retries: 5
start_period: 10s
depends_on:
minio1:
condition: service_healthy
minio2:
condition: service_healthy
minio3:
condition: service_healthy
restart: always restart: always
minio-init: minio-init:
@ -299,32 +317,24 @@ services:
MINIO_ROOT_PASSWORD: ${{ '{' }}MINIO_ROOT_PASSWORD:-0153093693-0009{{ '}' }} MINIO_ROOT_PASSWORD: ${{ '{' }}MINIO_ROOT_PASSWORD:-0153093693-0009{{ '}' }}
restart: on-failure restart: on-failure
caddy: nginx-healthcheck:
build: image: nginx:alpine
context: . container_name: nginx-healthcheck
dockerfile: Dockerfile.caddy
image: pangramia/ytdlp-ops-caddy:latest
container_name: caddy
networks: networks:
- proxynet - proxynet
ports: ports:
- "8080:8080" - "8888:80"
depends_on:
- airflow-webserver
restart: always restart: always
airflow-webserver: airflow-webserver:
<<: *airflow-common <<: *airflow-common
command: webserver command: webserver
environment: ports:
<<: *airflow-common-env - "8080:8080"
# Trigger gevent monkeypatching for webserver.
# See: https://github.com/apache/airflow/pull/28283
_AIRFLOW_PATCH_GEVENT: "1"
healthcheck: healthcheck:
test: ["CMD", "curl", "--fail", "http://localhost:8080/health"] test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
interval: 30s interval: 30s
timeout: 30s timeout: 10s
retries: 5 retries: 5
start_period: 30s start_period: 30s
restart: always restart: always
@ -348,6 +358,8 @@ services:
airflow-init: airflow-init:
condition: service_completed_successfully condition: service_completed_successfully
airflow-master-worker: airflow-master-worker:
<<: *airflow-common <<: *airflow-common
command: airflow celery worker -q main,default command: airflow celery worker -q main,default
@ -369,7 +381,7 @@ services:
AIRFLOW__CELERY__WORKER_TAGS: "master" AIRFLOW__CELERY__WORKER_TAGS: "master"
AIRFLOW__CELERY__WORKER_CONCURRENCY: "16" AIRFLOW__CELERY__WORKER_CONCURRENCY: "16"
AIRFLOW__CELERY__WORKER_PREFETCH_MULTIPLIER: "1" AIRFLOW__CELERY__WORKER_PREFETCH_MULTIPLIER: "1"
AIRFLOW__CELERY__TASK_ACKS_LATE: "True" AIRFLOW__CELERY__TASK_ACKS_LATE: "False"
AIRFLOW__CELERY__OPERATION_TIMEOUT: "2.0" AIRFLOW__CELERY__OPERATION_TIMEOUT: "2.0"
AIRFLOW__CELERY__WORKER_NAME: "worker-master@%h" AIRFLOW__CELERY__WORKER_NAME: "worker-master@%h"
AIRFLOW__CELERY__WORKER_MAX_TASKS_PER_CHILD: "100" AIRFLOW__CELERY__WORKER_MAX_TASKS_PER_CHILD: "100"
@ -387,10 +399,6 @@ services:
airflow-triggerer: airflow-triggerer:
<<: *airflow-common <<: *airflow-common
command: triggerer command: triggerer
hostname: ${{ '{' }}HOSTNAME{{ '}' }}
environment:
<<: *airflow-common-env
PYTHONASYNCIODEBUG: "1"
healthcheck: healthcheck:
test: ["CMD-SHELL", 'airflow jobs check --job-type TriggererJob --hostname "$${{ '{' }}HOSTNAME{{ '}' }}"'] test: ["CMD-SHELL", 'airflow jobs check --job-type TriggererJob --hostname "$${{ '{' }}HOSTNAME{{ '}' }}"']
interval: 30s interval: 30s
@ -407,6 +415,8 @@ services:
<<: *airflow-common <<: *airflow-common
depends_on: depends_on:
<<: *airflow-common-depends-on <<: *airflow-common-depends-on
minio-init:
condition: service_completed_successfully
redis-proxy-account-clear: redis-proxy-account-clear:
condition: service_completed_successfully condition: service_completed_successfully
entrypoint: /bin/bash entrypoint: /bin/bash
@ -417,20 +427,9 @@ services:
# This container runs as root and is responsible for initializing the environment. # This container runs as root and is responsible for initializing the environment.
# It sets permissions on mounted directories to ensure the 'airflow' user (running with AIRFLOW_UID) # It sets permissions on mounted directories to ensure the 'airflow' user (running with AIRFLOW_UID)
# can write to them. This is crucial for logs, dags, and plugins. # can write to them. This is crucial for logs, dags, and plugins.
echo "Creating scheduler & dag processor log directories..."
mkdir -p /opt/airflow/logs/scheduler /opt/airflow/logs/dag_processor_manager
echo "Initializing permissions for Airflow directories..." echo "Initializing permissions for Airflow directories..."
chown -R "${{ '{' }}AIRFLOW_UID{{ '}' }}:0" /opt/airflow/dags /opt/airflow/logs /opt/airflow/plugins /opt/airflow/config /opt/airflow/downloadfiles /opt/airflow/addfiles /opt/airflow/inputfiles chown -R "${{ '{' }}AIRFLOW_UID{{ '}' }}:${{ '{' }}AIRFLOW_GID{{ '}' }}" /opt/airflow/dags /opt/airflow/logs /opt/airflow/plugins /opt/airflow/config /opt/airflow/downloadfiles /opt/airflow/addfiles /opt/airflow/inputfiles
echo "Setting group-writable and setgid permissions on logs directory..."
find /opt/airflow/logs -type d -exec chmod g+rws {} +
find /opt/airflow/logs -type f -exec chmod g+rw {} +
echo "Permissions set." echo "Permissions set."
# Install curl and setup MinIO connection
echo "Installing curl and setting up MinIO connection..."
apt-get update -yqq && apt-get install -yqq curl
echo "MinIO connection setup complete."
if [[ -z "${{ '{' }}AIRFLOW_UID{{ '}' }}" ]]; then if [[ -z "${{ '{' }}AIRFLOW_UID{{ '}' }}" ]]; then
echo echo
echo -e "\033[1;33mWARNING!!!: AIRFLOW_UID not set!\e[0m" echo -e "\033[1;33mWARNING!!!: AIRFLOW_UID not set!\e[0m"
@ -444,11 +443,6 @@ services:
# Wait for db to be ready. # Wait for db to be ready.
airflow db check --retry 30 --retry-delay 5 airflow db check --retry 30 --retry-delay 5
# Initialize the database if needed
echo "Initializing Airflow database..."
airflow db init
echo "Database initialization complete."
# Run database migrations. # Run database migrations.
echo "Running database migrations..." echo "Running database migrations..."
airflow db upgrade airflow db upgrade
@ -466,13 +460,6 @@ services:
--email admin@example.com || true --email admin@example.com || true
echo "Admin user check/creation complete." echo "Admin user check/creation complete."
# Create/update the redis_default connection to ensure password is correct
echo "Creating/updating redis_default connection..."
airflow connections add 'redis_default' \
--conn-uri "redis://:${{ '{' }}REDIS_PASSWORD{{ '}' }}@redis:6379/0" \
|| echo "Failed to add redis_default connection, but continuing."
echo "Redis connection setup complete."
# Import connections from any .json file in the config directory. # Import connections from any .json file in the config directory.
echo "Searching for connection files in /opt/airflow/config..." echo "Searching for connection files in /opt/airflow/config..."
if [ -d "/opt/airflow/config" ] && [ -n "$(ls -A /opt/airflow/config/*.json 2>/dev/null)" ]; then if [ -d "/opt/airflow/config" ] && [ -n "$(ls -A /opt/airflow/config/*.json 2>/dev/null)" ]; then
@ -496,6 +483,7 @@ services:
<<: *airflow-common-env <<: *airflow-common-env
_AIRFLOW_DB_MIGRATE: 'true' _AIRFLOW_DB_MIGRATE: 'true'
_AIRFLOW_WWW_USER_CREATE: 'false' # Set to false as we handle it manually _AIRFLOW_WWW_USER_CREATE: 'false' # Set to false as we handle it manually
_PIP_ADDITIONAL_REQUIREMENTS: ''
user: "0:0" user: "0:0"
airflow-cli: airflow-cli:
@ -530,8 +518,8 @@ services:
<<: *airflow-common-depends-on <<: *airflow-common-depends-on
airflow-init: airflow-init:
condition: service_completed_successfully condition: service_completed_successfully
profiles:
- flower
docker-socket-proxy: docker-socket-proxy:
profiles: profiles:
@ -550,7 +538,7 @@ services:
restart: always restart: always
volumes: volumes:
redis-data: postgres-db-volume:
networks: networks:
proxynet: proxynet:
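
With remote logging pointed at s3://airflow-logs through the minio_default connection, a quick hedged way to confirm the bucket is reachable through the nginx load balancer; the endpoint and credentials below are the compose defaults shown above, not production values:

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9000",     # nginx-minio-lb published port
    aws_access_key_id="admin",                # MINIO_ROOT_USER default
    aws_secret_access_key="0153093693-0009",  # MINIO_ROOT_PASSWORD default
)
for obj in s3.list_objects_v2(Bucket="airflow-logs", MaxKeys=5).get("Contents", []):
    print(obj["Key"])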

View File

@ -1,5 +1,5 @@
name: ytdlp-ops name: ytdlp-ops
{% if service_role is defined and service_role != 'management' %} {% if service_role != 'management' %}
include: include:
# This automatically includes the generated camoufox service definitions and dependencies. # This automatically includes the generated camoufox service definitions and dependencies.
# It simplifies the docker-compose command, as you no longer need to specify both files with -f. # It simplifies the docker-compose command, as you no longer need to specify both files with -f.
@ -31,19 +31,17 @@ services:
# container_name is omitted; Docker will use the service name for DNS. # container_name is omitted; Docker will use the service name for DNS.
# This service depends on the camoufox-group service, which ensures all camoufox # This service depends on the camoufox-group service, which ensures all camoufox
# instances are started before this service. # instances are started before this service.
{% if service_role is defined and service_role != 'management' %} {% if service_role != 'management' %}
depends_on: depends_on:
- camoufox-group - camoufox-group
{% endif %} {% endif %}
# Ports are no longer exposed directly. Envoy will connect to them on the internal network. # Ports are no longer exposed directly. Envoy will connect to them on the internal network.
env_file: env_file:
- ./.env # Path is relative to the compose file location (configs directory) - ./.env # Path is relative to the compose file
volumes: volumes:
- context-data:/app/context-data - context-data:/app/context-data
{% if service_role != 'management' %}
# Mount the generated endpoints file to make it available to the server # Mount the generated endpoints file to make it available to the server
- ../camoufox/camoufox_endpoints.json:/app/config/camoufox_endpoints.json:ro - ./camoufox/camoufox_endpoints.json:/app/config/camoufox_endpoints.json:ro
{% endif %}
# Mount the plugin source code for live updates without rebuilding the image. # Mount the plugin source code for live updates without rebuilding the image.
# Assumes the plugin source is in a 'bgutil-ytdlp-pot-provider' directory # Assumes the plugin source is in a 'bgutil-ytdlp-pot-provider' directory
# next to your docker-compose.yaml file. # next to your docker-compose.yaml file.
@ -62,9 +60,9 @@ services:
- "--server-identity" - "--server-identity"
- "${SERVER_IDENTITY:-ytdlp-ops-airflow-service}" - "${SERVER_IDENTITY:-ytdlp-ops-airflow-service}"
- "--redis-host" - "--redis-host"
- "${MASTER_HOST_IP:-redis}" - "${REDIS_HOST:-redis}"
- "--redis-port" - "--redis-port"
- "${REDIS_PORT:-52909}" - "${REDIS_PORT:-6379}"
- "--redis-password" - "--redis-password"
- "${REDIS_PASSWORD}" - "${REDIS_PASSWORD}"
- "--account-active-duration-min" - "--account-active-duration-min"
@ -84,7 +82,7 @@ services:
- "--clients" - "--clients"
- "${YT_CLIENTS:-web,mweb,ios,android}" - "${YT_CLIENTS:-web,mweb,ios,android}"
- "--proxies" - "--proxies"
- "${CAMOUFOX_PROXIES}" - "{{ combined_proxies_str }}"
- "--camoufox-endpoints-file" - "--camoufox-endpoints-file"
- "/app/config/camoufox_endpoints.json" - "/app/config/camoufox_endpoints.json"
- "--print-tokens" - "--print-tokens"
@ -96,11 +94,8 @@ services:
volumes: volumes:
context-data: context-data:
name: context-data name: context-data
external: true
{% if service_role == 'management' or not camoufox_proxies %}
networks: networks:
proxynet: proxynet:
name: airflow_proxynet name: airflow_proxynet
external: true external: true
{% endif %}
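
The server container above receives the generated endpoint map via --camoufox-endpoints-file, mounted at /app/config/camoufox_endpoints.json. A minimal sketch of consuming that port-keyed map as produced by generate_envoy_config.py further below (the printed values are placeholders):

import json

with open("camoufox_endpoints.json") as f:
    endpoints = json.load(f)["endpoints"]

for proxy_port, ws_endpoint in endpoints.items():
    print(f"proxy port {proxy_port} -> browser endpoint {ws_endpoint}")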

View File

@ -0,0 +1,57 @@
# THIS FILE IS AUTO-GENERATED BY generate_envoy_config.py
# DO NOT EDIT MANUALLY.
#
# It contains the service definitions for the camoufox instances
# and adds the necessary dependencies to the main services.
services:
{% for proxy in proxies %}
camoufox-{{ loop.index }}:
build:
context: ./camoufox
dockerfile: Dockerfile
args:
VNC_PASSWORD: "{{ vnc_password }}"
shm_size: 2gb # Increase shared memory for browser stability
volumes:
- camoufox-data-{{ loop.index }}:/app/persistent-data
ports:
- "{{ base_vnc_port + loop.index - 1 }}:5900"
networks:
- proxynet
command: [
"--ws-host", "0.0.0.0",
"--port", "12345",
"--ws-path", "mypath",
"--proxy-url", "{{ proxy.url }}",
"--locale", "en-US",
"--geoip",
"--extensions", "/app/extensions/google_sign_in_popup_blocker-1.0.2.xpi,/app/extensions/spoof_timezone-0.3.4.xpi,/app/extensions/youtube_ad_auto_skipper-0.6.0.xpi",
"--persistent-context",
"--user-data-dir", "/app/persistent-data",
"--preferences", "security.sandbox.content.level=0,layers.acceleration.disabled=true,cookiebanners.service.mode=2,cookiebanners.service.mode.privateBrowsing=2,network.cookie.lifetimePolicy=0,network.cookie.thirdparty.sessionOnly=false,network.cookie.cookieBehavior=0,network.cookie.alwaysAcceptSessionCookies=true",
"--num-instances", "{{ num_instances | default(4) }}",
"--monitor-resources"
]
restart: unless-stopped
{% endfor %}
{% if proxies %}
# This service is a dependency anchor. The main services depend on it,
# and it in turn depends on all camoufox instances.
camoufox-group:
image: alpine:3.19
command: ["echo", "Camoufox dependency group ready."]
restart: "no"
networks:
- proxynet
depends_on:
{% for proxy in proxies %}
camoufox-{{ loop.index }}:
condition: service_started
{% endfor %}
{% endif %}
volumes:
{% for proxy in proxies %}
camoufox-data-{{ loop.index }}:
{% endfor %}
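
A minimal sketch of rendering this template outside Ansible, useful for sanity-checking the generated compose file locally; the template file name and all variable values below are assumptions matching the placeholders used above:

from jinja2 import Environment, FileSystemLoader

env = Environment(loader=FileSystemLoader("."), trim_blocks=True, lstrip_blocks=True)
template = env.get_template("docker-compose.camoufox.yaml.j2")  # assumed file name
rendered = template.render(
    proxies=[{"url": "socks5://user:pass@proxy-1:1080"}],  # placeholder proxy
    vnc_password="changeme",
    base_vnc_port=5901,
    num_instances=4,
)
with open("docker-compose.camoufox.yaml", "w") as f:
    f.write(rendered)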

View File

@ -8,7 +8,7 @@ services:
env_file: env_file:
- ./.env - ./.env
volumes: volumes:
# Mount the entire project directory to access scripts and write output files # Mount the entire airflow directory to access scripts and write output files
- ./:/app - ./:/app
command: > command: >
sh -c "pip install jinja2 && python3 /app/generate_envoy_config.py" sh -c "pip install jinja2 && python3 generate_envoy_config.py"

View File

@ -89,13 +89,9 @@ def generate_configs():
from Jinja2 templates and environment variables. from Jinja2 templates and environment variables.
""" """
try: try:
# --- Setup Paths --- # --- Load .env file ---
# The script runs from /app. Configs and templates are in /app/configs. script_dir = os.path.dirname(os.path.abspath(__file__))
project_root = os.path.dirname(os.path.abspath(__file__)) # This will be /app dotenv_path = os.path.join(script_dir, '.env')
configs_dir = os.path.join(project_root, 'configs')
# Load .env from the 'configs' directory.
dotenv_path = os.path.join(configs_dir, '.env')
load_dotenv(dotenv_path) load_dotenv(dotenv_path)
# --- Common Configuration --- # --- Common Configuration ---
@ -111,8 +107,10 @@ def generate_configs():
worker_count = os.cpu_count() or 1 worker_count = os.cpu_count() or 1
logging.info(f"YTDLP_WORKERS is 0, auto-detected {worker_count} CPU cores for worker and camoufox config.") logging.info(f"YTDLP_WORKERS is 0, auto-detected {worker_count} CPU cores for worker and camoufox config.")
# The templates are in the 'configs' directory. config_dir = os.path.join(script_dir, 'config')
env = Environment(loader=FileSystemLoader(configs_dir), trim_blocks=True, lstrip_blocks=True) os.makedirs(config_dir, exist_ok=True)
env = Environment(loader=FileSystemLoader(script_dir), trim_blocks=True, lstrip_blocks=True)
# Get service role from environment to determine what to generate # Get service role from environment to determine what to generate
service_role = os.getenv('SERVICE_ROLE', 'all-in-one') service_role = os.getenv('SERVICE_ROLE', 'all-in-one')
@ -142,19 +140,94 @@ def generate_configs():
camoufox_backend_prefix = os.getenv('CAMOUFOX_BACKEND_PREFIX', 'camoufox-') camoufox_backend_prefix = os.getenv('CAMOUFOX_BACKEND_PREFIX', 'camoufox-')
# --- Generate docker-compose.camoufox.yaml --- # --- Generate docker-compose.camoufox.yaml ---
compose_template = env.get_template('docker-compose.camoufox.yaml.j2') compose_output_file = os.path.join(script_dir, 'docker-compose.camoufox.yaml')
compose_output_file = os.path.join(configs_dir, 'docker-compose.camoufox.yaml')
camoufox_config_data = { # Generate the compose file directly without template
'camoufox_proxies': camoufox_proxies,
'vnc_password': vnc_password,
'camoufox_port': camoufox_port,
'worker_count': worker_count,
}
rendered_compose_config = compose_template.render(camoufox_config_data)
with open(compose_output_file, 'w') as f: with open(compose_output_file, 'w') as f:
f.write(rendered_compose_config) f.write("# THIS FILE IS AUTO-GENERATED BY generate_envoy_config.py\n")
f.write("# DO NOT EDIT MANUALLY.\n")
f.write("#\n")
f.write("# It contains the service definitions for the camoufox instances\n")
f.write("# and adds the necessary dependencies to the main services.\n")
f.write("services:\n\n")
# Generate services for each proxy
for i, proxy in enumerate(camoufox_proxies):
service_name = f"camoufox-{i+1}"
# Each container gets its own unique range of ports to avoid conflicts
container_base_port = camoufox_port + i * worker_count
host_base_port = container_base_port
f.write(f" {service_name}:\n")
f.write(f" build:\n")
f.write(f" context: ./camoufox\n")
f.write(f" dockerfile: Dockerfile\n")
f.write(f" args:\n")
f.write(f" VNC_PASSWORD: {vnc_password}\n")
f.write(f" image: camoufox:latest\n")
f.write(f" container_name: ytdlp-ops-{service_name}-1\n")
f.write(f" restart: unless-stopped\n")
f.write(f" shm_size: '2gb' # Mitigates browser crashes due to shared memory limitations\n")
f.write(f" ports:\n")
f.write(f" - \"{host_base_port}-{host_base_port + worker_count - 1}:{container_base_port}-{container_base_port + worker_count - 1}\"\n")
f.write(f" environment:\n")
f.write(f" - DISPLAY=:99\n")
f.write(f" - MOZ_HEADLESS_STACKSIZE=2097152\n")
f.write(f" - CAMOUFOX_MAX_MEMORY_MB=2048\n")
f.write(f" - CAMOUFOX_MAX_CONCURRENT_CONTEXTS=8\n")
f.write(f" - CAMOUFOX_RESTART_THRESHOLD_MB=1500\n")
f.write(f" volumes:\n")
f.write(f" - /tmp/.X11-unix:/tmp/.X11-unix:rw\n")
f.write(f" - camoufox-data-{i+1}:/app/context-data\n")
f.write(f" - camoufox-browser-cache:/root/.cache/ms-playwright # Persist browser binaries\n")
f.write(f" command: [\n")
f.write(f" \"--ws-host\", \"0.0.0.0\",\n")
f.write(f" \"--port\", \"{container_base_port}\",\n")
f.write(f" \"--num-instances\", \"{worker_count}\",\n")
f.write(f" \"--ws-path\", \"mypath\",\n")
f.write(f" \"--proxy-url\", \"{proxy['url']}\",\n")
f.write(f" \"--headless\",\n")
f.write(f" \"--monitor-resources\",\n")
f.write(f" \"--memory-restart-threshold\", \"1800\",\n")
f.write(f" \"--preferences\", \"layers.acceleration.disabled=true,dom.ipc.processCount=2,media.memory_cache_max_size=102400,browser.cache.memory.capacity=102400\"\n")
f.write(f" ]\n")
f.write(f" deploy:\n")
f.write(f" resources:\n")
f.write(f" limits:\n")
f.write(f" memory: 2.5G\n")
f.write(f" logging:\n")
f.write(f" driver: \"json-file\"\n")
f.write(f" options:\n")
f.write(f" max-size: \"100m\"\n")
f.write(f" max-file: \"3\"\n")
f.write(f" networks:\n")
f.write(f" - proxynet\n\n")
# Add camoufox-group service that depends on all camoufox instances
if camoufox_proxies:
f.write(" camoufox-group:\n")
f.write(" image: alpine:latest\n")
f.write(" command: [\"echo\", \"Camoufox group ready.\"]\n")
f.write(" restart: \"no\"\n")
f.write(" depends_on:\n")
for i in range(len(camoufox_proxies)):
f.write(f" - camoufox-{i+1}\n")
f.write(" networks:\n")
f.write(" - proxynet\n\n")
# Write volumes section
f.write("volumes:\n")
for i in range(len(camoufox_proxies)):
f.write(f" camoufox-data-{i+1}:\n")
if camoufox_proxies:
f.write(" camoufox-browser-cache:\n")
f.write("\n")
# Write networks section
f.write("networks:\n")
f.write(" proxynet:\n")
f.write(" name: airflow_proxynet\n")
f.write(" external: true\n")
logging.info(f"Successfully generated {compose_output_file} with {len(camoufox_proxies)} camoufox service(s).") logging.info(f"Successfully generated {compose_output_file} with {len(camoufox_proxies)} camoufox service(s).")
logging.info("This docker-compose file defines the remote browser services, one for each proxy.") logging.info("This docker-compose file defines the remote browser services, one for each proxy.")
@ -178,10 +251,8 @@ def generate_configs():
logging.warning(f"Could not extract port from proxy URL: {proxy['url']}. Skipping for endpoint map.") logging.warning(f"Could not extract port from proxy URL: {proxy['url']}. Skipping for endpoint map.")
endpoints_data = {"endpoints": endpoints_map} endpoints_data = {"endpoints": endpoints_map}
# The camoufox directory is at the root of the project context, not under 'airflow'. camoufox_dir = os.path.join(script_dir, 'camoufox')
# camoufox_dir = os.path.join(project_root, 'camoufox') endpoints_output_file = os.path.join(camoufox_dir, 'camoufox_endpoints.json')
# os.makedirs(camoufox_dir, exist_ok=True)
endpoints_output_file = os.path.join(configs_dir, 'camoufox_endpoints.json')
with open(endpoints_output_file, 'w') as f:
json.dump(endpoints_data, f, indent=2)
logging.info(f"Successfully generated {endpoints_output_file} with {len(endpoints_map)} port-keyed endpoint(s).")
@ -192,22 +263,29 @@ def generate_configs():
# --- Generate docker-compose-ytdlp-ops.yaml ---
ytdlp_ops_template = env.get_template('docker-compose-ytdlp-ops.yaml.j2')
ytdlp_ops_output_file = os.path.join(configs_dir, 'docker-compose-ytdlp-ops.yaml')
ytdlp_ops_output_file = os.path.join(script_dir, 'docker-compose-ytdlp-ops.yaml')
# Combine all proxies (camoufox and general) into a single string for the server.
all_proxies = []
# Track if we have any explicit proxy configuration
has_explicit_proxies = False
# Add camoufox proxies if they exist
if expanded_camoufox_proxies_str:
camoufox_proxy_list = [p.strip() for p in expanded_camoufox_proxies_str.split(',') if p.strip()]
all_proxies.extend([p.strip() for p in expanded_camoufox_proxies_str.split(',') if p.strip()])
all_proxies.extend(camoufox_proxy_list)
if camoufox_proxy_list:
has_explicit_proxies = True
logging.info(f"Added {len(camoufox_proxy_list)} camoufox proxies: {camoufox_proxy_list}")
general_proxies_str = os.getenv('GENERAL_PROXIES')
if general_proxies_str:
expanded_general_proxies_str = expand_env_vars(general_proxies_str)
logging.info(f"Expanded GENERAL_PROXIES from '{general_proxies_str}' to '{expanded_general_proxies_str}'")
general_proxies = [p.strip() for p in expanded_general_proxies_str.split(',') if p.strip()]
all_proxies.extend(general_proxies)
logging.info(f"Adding {len(general_proxies)} general purpose proxy/proxies.")
# Also check for the SOCKS5_SOCK_SERVER_IP for backward compatibility with docs
socks_server_ip = os.getenv('SOCKS5_SOCK_SERVER_IP', '172.17.0.1')
if socks_server_ip:
socks_server_port = os.getenv('SOCKS5_SOCK_SERVER_PORT', '1087')
general_proxy_url = f"socks5://{socks_server_ip}:{socks_server_port}"
if general_proxy_url not in all_proxies:
all_proxies.append(general_proxy_url)
logging.info(f"Adding general purpose proxy from SOCKS5_SOCK_SERVER_IP: {general_proxy_url}")
combined_proxies_str = ",".join(all_proxies)
logging.info(f"Combined proxy string for ytdlp-ops-service: '{combined_proxies_str}'")
@ -215,7 +293,6 @@ def generate_configs():
ytdlp_ops_config_data = {
'combined_proxies_str': combined_proxies_str,
'service_role': service_role,
'camoufox_proxies': camoufox_proxies,
}
rendered_ytdlp_ops_config = ytdlp_ops_template.render(ytdlp_ops_config_data)
with open(ytdlp_ops_output_file, 'w') as f:
@ -233,8 +310,7 @@ def generate_configs():
# --- Generate envoy.yaml ---
envoy_template = env.get_template('envoy.yaml.j2')
# Output envoy.yaml to the configs directory, where other generated files are.
envoy_output_file = os.path.join(configs_dir, 'envoy.yaml')
envoy_output_file = os.path.join(script_dir, 'envoy.yaml')
logging.info("--- Generating Envoy Configuration ---")
logging.info(f"Envoy will listen on public port: {envoy_port}")

View File

@ -0,0 +1,130 @@
[
"https://www.youtube.com/watch?v=EH81MQiDyFs",
"https://www.youtube.com/watch?v=YwC2VtRFBPs",
"https://www.youtube.com/watch?v=keSo7x42Xis",
"https://www.youtube.com/watch?v=K6OlxDi1cws",
"https://www.youtube.com/watch?v=eIYjjvR_k6w",
"https://www.youtube.com/watch?v=CprKmvtw-TE",
"https://www.youtube.com/watch?v=4vB1bDJ8dvA",
"https://www.youtube.com/watch?v=kJcvr693bjI",
"https://www.youtube.com/watch?v=NPQz5Hn6XKM",
"https://www.youtube.com/watch?v=DCo-7dCw2OY",
"https://www.youtube.com/watch?v=Q0996ndUMxU",
"https://www.youtube.com/watch?v=IxbFckR3yIc",
"https://www.youtube.com/watch?v=xt5QQgEqVzs",
"https://www.youtube.com/watch?v=L9pzC26i3BU",
"https://www.youtube.com/watch?v=YlkzSAqV0jE",
"https://www.youtube.com/watch?v=v9ZxQw3NQA8",
"https://www.youtube.com/watch?v=EB_eBvRsGqM",
"https://www.youtube.com/watch?v=xJ4PHYU3oY4",
"https://www.youtube.com/watch?v=kHf-eCb7q2I",
"https://www.youtube.com/watch?v=q3hNcqo5qdY",
"https://www.youtube.com/watch?v=097ujVv38LU",
"https://www.youtube.com/watch?v=VYnzo8xa_dw",
"https://www.youtube.com/watch?v=2y690c69yb4",
"https://www.youtube.com/watch?v=R_JiPanFbEs",
"https://www.youtube.com/watch?v=_VF9sk-IjOE",
"https://www.youtube.com/watch?v=01yS1dPQsZc",
"https://www.youtube.com/watch?v=0xW7slvHwiU",
"https://www.youtube.com/watch?v=qeeC7i5HTpU",
"https://www.youtube.com/watch?v=McvQBwZ_MfY",
"https://www.youtube.com/watch?v=ssQ456jGiKs",
"https://www.youtube.com/watch?v=Xz84juOdgVY",
"https://www.youtube.com/watch?v=6jw_rFi75YA",
"https://www.youtube.com/watch?v=XVtwjyQESLI",
"https://www.youtube.com/watch?v=GCuRuMZG2CU",
"https://www.youtube.com/watch?v=SLGT3nSHjKY",
"https://www.youtube.com/watch?v=KfXZckcDnwc",
"https://www.youtube.com/watch?v=krlijOR_314",
"https://www.youtube.com/watch?v=c5TIIXZTWYU",
"https://www.youtube.com/watch?v=xbFlak2wDPU",
"https://www.youtube.com/watch?v=ESiCVT43y4M",
"https://www.youtube.com/watch?v=9K-8HK9NGPo",
"https://www.youtube.com/watch?v=AXfq7U9EHHY",
"https://www.youtube.com/watch?v=oWGeLLFTwhk",
"https://www.youtube.com/watch?v=dGTid_QDq3M",
"https://www.youtube.com/watch?v=s2GdkHY7e74",
"https://www.youtube.com/watch?v=EYRnywNSHfM",
"https://www.youtube.com/watch?v=8QcanJptlFs",
"https://www.youtube.com/watch?v=8_B0MrjTDqw",
"https://www.youtube.com/watch?v=2LealZ7TTlY",
"https://www.youtube.com/watch?v=dtBosQzUqDs",
"https://www.youtube.com/watch?v=PuQwOWigWVA",
"https://www.youtube.com/watch?v=LOlVXM27ap8",
"https://www.youtube.com/watch?v=JtgKbx6nm7I",
"https://www.youtube.com/watch?v=owFxod3Pe70",
"https://www.youtube.com/watch?v=dmBpn2ZjNW4",
"https://www.youtube.com/watch?v=7Do8GAKRFsw",
"https://www.youtube.com/watch?v=7oysSz1unf0",
"https://www.youtube.com/watch?v=Z4Wn7qrR0nU",
"https://www.youtube.com/watch?v=wvgwnY0x6wo",
"https://www.youtube.com/watch?v=qUGZg985hqA",
"https://www.youtube.com/watch?v=pWvyocl7dhI",
"https://www.youtube.com/watch?v=BMzSz3aiBFU",
"https://www.youtube.com/watch?v=mgOGXUctR8U",
"https://www.youtube.com/watch?v=1rIhg0Z-Ylo",
"https://www.youtube.com/watch?v=K4hj2aQ8vCM",
"https://www.youtube.com/watch?v=jzMt0J7eohg",
"https://www.youtube.com/watch?v=LeYfSHB1zZw",
"https://www.youtube.com/watch?v=hBS3QbVFHQk",
"https://www.youtube.com/watch?v=2mBdZZm8Syo",
"https://www.youtube.com/watch?v=zaZE_AHeRIc",
"https://www.youtube.com/watch?v=DBod4x5OZsM",
"https://www.youtube.com/watch?v=lNYnMLhMMNc",
"https://www.youtube.com/watch?v=Feo_5sWRjY0",
"https://www.youtube.com/watch?v=tYWLm75nibA",
"https://www.youtube.com/watch?v=xx1HYybZDH0",
"https://www.youtube.com/watch?v=EyIY0BKYIrA",
"https://www.youtube.com/watch?v=BfAoe4GbKt4",
"https://www.youtube.com/watch?v=qmizxZdHB7A",
"https://www.youtube.com/watch?v=7K73KytWJR4",
"https://www.youtube.com/watch?v=hPyi-EnO_Dw",
"https://www.youtube.com/watch?v=M4Gp7eMj2IQ",
"https://www.youtube.com/watch?v=rPOOnshXEOk",
"https://www.youtube.com/watch?v=fmOB4FNj4MM",
"https://www.youtube.com/watch?v=UgwjPBJ-iyA",
"https://www.youtube.com/watch?v=tInqj66fkxc",
"https://www.youtube.com/watch?v=tok-jMC1V0E",
"https://www.youtube.com/watch?v=2IuaROF1pMs",
"https://www.youtube.com/watch?v=Ak5JpqBA5No",
"https://www.youtube.com/watch?v=A_yH2vzq7CY",
"https://www.youtube.com/watch?v=4nzsI5fxdlA",
"https://www.youtube.com/watch?v=1FfwsJInFOM",
"https://www.youtube.com/watch?v=uRjJbkgf_3I",
"https://www.youtube.com/watch?v=HMjduefTG4E",
"https://www.youtube.com/watch?v=Cw9hUSFppnw",
"https://www.youtube.com/watch?v=vrobF1L3BJ8",
"https://www.youtube.com/watch?v=tIiVUsKPCEY",
"https://www.youtube.com/watch?v=7qprIRCTX6A",
"https://www.youtube.com/watch?v=HREKaNF7TT8",
"https://www.youtube.com/watch?v=xlIgqZ1sW5A",
"https://www.youtube.com/watch?v=6_uA0osze4w",
"https://www.youtube.com/watch?v=jarbK6tvflw",
"https://www.youtube.com/watch?v=RWmeSE312FA",
"https://www.youtube.com/watch?v=hhI7lAonIrU",
"https://www.youtube.com/watch?v=4k23-uYPObU",
"https://www.youtube.com/watch?v=rIxiOD0dA3w",
"https://www.youtube.com/watch?v=Ry-_mpn3Pe8",
"https://www.youtube.com/watch?v=m-H4fOb1o2Q",
"https://www.youtube.com/watch?v=NhGxI_tgSwI",
"https://www.youtube.com/watch?v=VTslivtVfAI",
"https://www.youtube.com/watch?v=huSCDYe04Fk",
"https://www.youtube.com/watch?v=LF82qA5a05E",
"https://www.youtube.com/watch?v=kHaHsbFg28M",
"https://www.youtube.com/watch?v=NKDFri_kL94",
"https://www.youtube.com/watch?v=BPIlpDQwWqA",
"https://www.youtube.com/watch?v=UTCAshkc8qk",
"https://www.youtube.com/watch?v=EkUtGGKaX_I",
"https://www.youtube.com/watch?v=tuLyfqdpYxU",
"https://www.youtube.com/watch?v=snxBL-8IGCA",
"https://www.youtube.com/watch?v=Mo9m8EdR8_Y",
"https://www.youtube.com/watch?v=5nBipdnGAbU",
"https://www.youtube.com/watch?v=sLs6vp5TH_w",
"https://www.youtube.com/watch?v=OYM5PrQtT34",
"https://www.youtube.com/watch?v=FX3wjgGWn1s",
"https://www.youtube.com/watch?v=1FfwsJInFOM",
"https://www.youtube.com/watch?v=osWMBc6h5Rs",
"https://www.youtube.com/watch?v=aojc0sLBm5Y",
"https://www.youtube.com/watch?v=akf_6pAx024",
"https://www.youtube.com/watch?v=SgSkvKpAxMQ"
]

View File

@ -0,0 +1,101 @@
[
"https://www.youtube.com/watch?v=Y0WQdA4srb0",
"https://www.youtube.com/watch?v=uFyraEVj848",
"https://www.youtube.com/watch?v=VxPx0Qjgbos",
"https://www.youtube.com/watch?v=FuKOn-_rfeE",
"https://www.youtube.com/watch?v=mn9t5eOs30c",
"https://www.youtube.com/watch?v=7YOE0GEUrVo",
"https://www.youtube.com/watch?v=4L8kv6qVTfY",
"https://www.youtube.com/watch?v=7WSEWOft4Y4",
"https://www.youtube.com/watch?v=bmDsn0_1-f0",
"https://www.youtube.com/watch?v=IILtHOqYndA",
"https://www.youtube.com/watch?v=tyGqbWBjSWE",
"https://www.youtube.com/watch?v=3tgZTpkZQkQ",
"https://www.youtube.com/watch?v=JJH-CkjiQWI",
"https://www.youtube.com/watch?v=4hLWn4hHKNM",
"https://www.youtube.com/watch?v=IFwr6QGxoJo",
"https://www.youtube.com/watch?v=Fj-NKUoMbmI",
"https://www.youtube.com/watch?v=zvoxV3wLjFE",
"https://www.youtube.com/watch?v=EcC4CIyUI2Q",
"https://www.youtube.com/watch?v=jtjiTuTKCT4",
"https://www.youtube.com/watch?v=am28qDtXLLU",
"https://www.youtube.com/watch?v=WNVW86YBkMg",
"https://www.youtube.com/watch?v=kG51upknRCw",
"https://www.youtube.com/watch?v=E-HpdWghf2U",
"https://www.youtube.com/watch?v=GuaAOc9ZssE",
"https://www.youtube.com/watch?v=r1JkW0zfPOA",
"https://www.youtube.com/watch?v=OBYmpN8uAag",
"https://www.youtube.com/watch?v=0HuGAMKHXD4",
"https://www.youtube.com/watch?v=eDmdalDaPdU",
"https://www.youtube.com/watch?v=ZjDR1XMd904",
"https://www.youtube.com/watch?v=HGrsrP4idE8",
"https://www.youtube.com/watch?v=l-J_J7YFDYY",
"https://www.youtube.com/watch?v=Kr5rl0935K4",
"https://www.youtube.com/watch?v=KgK4bu9O384",
"https://www.youtube.com/watch?v=BDq3_y4mXYo",
"https://www.youtube.com/watch?v=slRiaDz12m8",
"https://www.youtube.com/watch?v=iX1oWEsHh0A",
"https://www.youtube.com/watch?v=0zJcsxB6-UU",
"https://www.youtube.com/watch?v=NTOokrCHzJA",
"https://www.youtube.com/watch?v=CXYXqQ-VuYo",
"https://www.youtube.com/watch?v=xaxZtPTEraU",
"https://www.youtube.com/watch?v=wX1wNCPZdE8",
"https://www.youtube.com/watch?v=DOt7ckIGN4Y",
"https://www.youtube.com/watch?v=bncasw-Z4Ow",
"https://www.youtube.com/watch?v=nbVWfXlo7kQ",
"https://www.youtube.com/watch?v=Uu6DmhonkEE",
"https://www.youtube.com/watch?v=HGWigeoSMvA",
"https://www.youtube.com/watch?v=rjbLCaC9yFE",
"https://www.youtube.com/watch?v=Uew7f09gW4o",
"https://www.youtube.com/watch?v=uzc-jLt65mY",
"https://www.youtube.com/watch?v=ZX7qnLuAsMU",
"https://www.youtube.com/watch?v=ZlSgDvCP5UI",
"https://www.youtube.com/watch?v=RmGIid7Yctw",
"https://www.youtube.com/watch?v=u9g0_eR5gEk",
"https://www.youtube.com/watch?v=wu9Cw905NUU",
"https://www.youtube.com/watch?v=cNhQVoY5V5Q",
"https://www.youtube.com/watch?v=I63iJNKOb8I",
"https://www.youtube.com/watch?v=3G5ceoSK6jg",
"https://www.youtube.com/watch?v=JF4TbV940PM",
"https://www.youtube.com/watch?v=0yGaVHfmGa0",
"https://www.youtube.com/watch?v=r8cgtI_ZQIY",
"https://www.youtube.com/watch?v=OcG3-r98XEM",
"https://www.youtube.com/watch?v=w7hooOUEMQI",
"https://www.youtube.com/watch?v=yipW8SF5Gxk",
"https://www.youtube.com/watch?v=LH4PqRiuxts",
"https://www.youtube.com/watch?v=IfAsA3ezUqQ",
"https://www.youtube.com/watch?v=5cUg8I0yps4",
"https://www.youtube.com/watch?v=lCea6bQj3eg",
"https://www.youtube.com/watch?v=5Ie0MAv4XCY",
"https://www.youtube.com/watch?v=57eomGPy1PU",
"https://www.youtube.com/watch?v=TEnk3OfU8Gc",
"https://www.youtube.com/watch?v=1uA4xXlDhvE",
"https://www.youtube.com/watch?v=aXF8ijpn4bM",
"https://www.youtube.com/watch?v=3vKmCDomyJ8",
"https://www.youtube.com/watch?v=z7jLEWJ59uY",
"https://www.youtube.com/watch?v=0TTsKnyH6EY",
"https://www.youtube.com/watch?v=PcqA6Y1RfVQ",
"https://www.youtube.com/watch?v=f1Ar3ydryqc",
"https://www.youtube.com/watch?v=N2nLayOIjxM",
"https://www.youtube.com/watch?v=Cziyx9qaYVM",
"https://www.youtube.com/watch?v=RTJCbIJ294w",
"https://www.youtube.com/watch?v=GC1FB-bZTvA",
"https://www.youtube.com/watch?v=kKYv5uLBSFk",
"https://www.youtube.com/watch?v=jfQHlnNeKzw",
"https://www.youtube.com/watch?v=J7e8PRu9kSU",
"https://www.youtube.com/watch?v=UoHf6pdy0oE",
"https://www.youtube.com/watch?v=JOwNcwSupXs",
"https://www.youtube.com/watch?v=gxwk-bb78-U",
"https://www.youtube.com/watch?v=_lrDwiK544A",
"https://www.youtube.com/watch?v=6i8BVQ9GE1g",
"https://www.youtube.com/watch?v=8c_l9D1qyKY",
"https://www.youtube.com/watch?v=KFCr5BdjFB8",
"https://www.youtube.com/watch?v=orEvHn7lL4A",
"https://www.youtube.com/watch?v=6BhGJxrp8P4",
"https://www.youtube.com/watch?v=n2t8beFnhyA",
"https://www.youtube.com/watch?v=GJzZ2-f_k30",
"https://www.youtube.com/watch?v=oId850O591s",
"https://www.youtube.com/watch?v=f2XmdQdwppw",
"https://www.youtube.com/watch?v=iWM_oe-JY_k",
"https://www.youtube.com/watch?v=GHEDWE9LjRY"
]

View File

@ -0,0 +1,30 @@
[
"https://www.youtube.com/watch?v=lKrVuufVMXA",
"https://www.youtube.com/watch?v=ISqDcqGdow0",
"https://www.youtube.com/watch?v=srG-WnQdZq8",
"https://www.youtube.com/watch?v=HP-KB6XFqgs",
"https://www.youtube.com/watch?v=1e13SIh51wk",
"https://www.youtube.com/watch?v=VTKG48FjSxs",
"https://www.youtube.com/watch?v=onEWAyPRm6E",
"https://www.youtube.com/watch?v=7RdrGwpZzMo",
"https://www.youtube.com/watch?v=M5uu93_AhXg",
"https://www.youtube.com/watch?v=xnkvCBfTfok",
"https://www.youtube.com/watch?v=oE9hGZyFN8E",
"https://www.youtube.com/watch?v=7LofBMRP6U4",
"https://www.youtube.com/watch?v=EDE8tyroJEE",
"https://www.youtube.com/watch?v=oLwsWGi0sUc",
"https://www.youtube.com/watch?v=a6dvhHPyFIw",
"https://www.youtube.com/watch?v=4jds773UlWE",
"https://www.youtube.com/watch?v=B6dXxqiSBSM",
"https://www.youtube.com/watch?v=9EbS6w3RSG0",
"https://www.youtube.com/watch?v=LyKONGzUANU",
"https://www.youtube.com/watch?v=sGW5kfpR6Wo",
"https://www.youtube.com/watch?v=pa4-JninkUQ",
"https://www.youtube.com/watch?v=DxXMFBWarjY",
"https://www.youtube.com/watch?v=PYQjfpCEWvc",
"https://www.youtube.com/watch?v=_jlNCjI9jiQ",
"https://www.youtube.com/watch?v=BxEC11QS3sQ",
"https://www.youtube.com/watch?v=6-qbWRzVbGA",
"https://www.youtube.com/watch?v=p3lCQvZBv_k",
"https://www.youtube.com/watch?v=67YA1CHpGrM"
]

View File

@ -0,0 +1,5 @@
[
"https://www.youtube.com/watch?v=uxiLE2Kv7wc",
"https://www.youtube.com/watch?v=Q7R0epGFnRI",
"https://www.youtube.com/watch?v=4mEmsJXKroE"
]

View File

@ -0,0 +1,48 @@
[
"https://www.youtube.com/watch?v=l700b4BpFAA",
"https://www.youtube.com/watch?v=G_JAVwwWyUM",
"https://www.youtube.com/watch?v=2LGz9nUw-XI",
"https://www.youtube.com/watch?v=7dK6a8LWAWw",
"https://www.youtube.com/watch?v=lKSZnZggcto",
"https://www.youtube.com/watch?v=Zy0ZFAMqm7U",
"https://www.youtube.com/watch?v=7UunWMHBrEE",
"https://www.youtube.com/watch?v=LPdbLCX3N-4",
"https://www.youtube.com/watch?v=-lJ5DVbkVw4",
"https://www.youtube.com/watch?v=QrRRS0RzELs",
"https://www.youtube.com/watch?v=XSty74mE1iE",
"https://www.youtube.com/watch?v=orijdeDOk5g",
"https://www.youtube.com/watch?v=27YVRo9VUE8",
"https://www.youtube.com/watch?v=p-JNgLI_8nA",
"https://www.youtube.com/watch?v=gkekjIJB_Nw",
"https://www.youtube.com/watch?v=V8QFCgOfkgw",
"https://www.youtube.com/watch?v=_GVVEsxZ_Mo",
"https://www.youtube.com/watch?v=7_zMqxK4gZE",
"https://www.youtube.com/watch?v=cwuJCb316yQ",
"https://www.youtube.com/watch?v=TIGxtvVVHak",
"https://www.youtube.com/watch?v=KhcicW2keWY",
"https://www.youtube.com/watch?v=miUJ85pFCPE",
"https://www.youtube.com/watch?v=97L4qVfSwv4",
"https://www.youtube.com/watch?v=Wk38hWQfz24",
"https://www.youtube.com/watch?v=iIU-NVWkTDE",
"https://www.youtube.com/watch?v=l89VaRof8ug",
"https://www.youtube.com/watch?v=IIkjS5MpQVM",
"https://www.youtube.com/watch?v=9XxPGKkOs0o",
"https://www.youtube.com/watch?v=_dlpve9GPZM",
"https://www.youtube.com/watch?v=He_3MjAuZNQ",
"https://www.youtube.com/watch?v=FnPEHn2NHT4",
"https://www.youtube.com/watch?v=HuSjI7HFkzo",
"https://www.youtube.com/watch?v=pBZSgVJHacs",
"https://www.youtube.com/watch?v=OgsG082zDGo",
"https://www.youtube.com/watch?v=_4sxhmPsryY",
"https://www.youtube.com/watch?v=kqU6B5rIEnI",
"https://www.youtube.com/watch?v=BEYn_ILHmBE",
"https://www.youtube.com/watch?v=qy9Zr3HV9V4",
"https://www.youtube.com/watch?v=7I1VvJZbG-M",
"https://www.youtube.com/watch?v=WOa-HA3MoVQ",
"https://www.youtube.com/watch?v=uaHI-WHwivc",
"https://www.youtube.com/watch?v=9ku8r8uZ9EQ",
"https://www.youtube.com/watch?v=XAyaDcLxwHQ",
"https://www.youtube.com/watch?v=zpc-hJGSNBc",
"https://www.youtube.com/watch?v=AGbG62y1DyE",
"https://www.youtube.com/watch?v=7rmyabL60oA"
]

View File

@ -4,11 +4,15 @@ events {
stream {
upstream minio_servers {
server minio:9000;
server minio1:9000;
server minio2:9000;
server minio3:9000;
}
upstream minio_console_servers {
server minio:9001;
server minio1:9001;
server minio2:9001;
server minio3:9001;
}
server {

View File

@ -1,18 +0,0 @@
"""
Airflow plugins initialization.
"""
import os
import logging
# Set the custom secrets masker
os.environ['AIRFLOW__LOGGING__SECRETS_MASKER_CLASS'] = 'custom_secrets_masker.CustomSecretsMasker'
# Apply Thrift patches
try:
from patch_thrift_exceptions import patch_thrift_exceptions
patch_thrift_exceptions()
except Exception as e:
logging.error(f"Error applying Thrift exception patches: {e}")
logger = logging.getLogger(__name__)
logger.info("Airflow custom configuration applied")

View File

@ -1,56 +0,0 @@
from airflow.plugins_manager import AirflowPlugin
from airflow.hooks.base import BaseHook
from airflow.configuration import conf
import uuid
import backoff
class YTDLPHook(BaseHook):
def __init__(self, conn_id='ytdlp_default'):
super().__init__()
self.conn_id = conn_id
self.connection = self.get_connection(conn_id)
self.timeout = conf.getint('ytdlp', 'timeout', fallback=120)
self.max_retries = conf.getint('ytdlp', 'max_retries', fallback=3)
@backoff.on_exception(backoff.expo,
Exception,
max_tries=3,
max_time=300)
def start_service(self, host, port, service_id, work_dir):
"""Start token service as a long-running process"""
import subprocess
import os
from pathlib import Path
# Get script path relative to Airflow home
airflow_home = os.getenv('AIRFLOW_HOME', '')
script_path = Path(airflow_home).parent / 'ytdlp_ops_server.py'
# Ensure work directory exists
os.makedirs(work_dir, exist_ok=True)
# Start service process
cmd = [
'python', str(script_path),
'--port', str(port),
'--host', host,
'--service-id', service_id,
'--context-dir', work_dir,
'--script-dir', str(Path(airflow_home) / 'dags' / 'scripts')
]
self.log.info(f"Starting token service: {' '.join(cmd)}")
# Start process detached
docker_cmd = [
'docker-compose', '-f', 'docker-compose.yaml',
'up', '-d', '--build', 'ytdlp-service'
]
subprocess.run(docker_cmd, check=True)
self.log.info(f"Token service started on {host}:{port}")
return True
class YTDLPPlugin(AirflowPlugin):
name = 'ytdlp_plugin'
hooks = [YTDLPHook]

View File

@ -1 +0,0 @@
../../thrift_model/gen_py/pangramia

View File

@ -29,13 +29,9 @@ if len(sys.argv) <= 1 or sys.argv[1] == '--help':
print(' bool unbanProxy(string proxyUrl, string serverIdentity)')
print(' bool resetAllProxyStatuses(string serverIdentity)')
print(' bool banAllProxies(string serverIdentity)')
print(' bool deleteProxyFromRedis(string proxyUrl, string serverIdentity)')
print(' i32 deleteAllProxiesFromRedis(string serverIdentity)')
print(' getAccountStatus(string accountId, string accountPrefix)')
print(' bool banAccount(string accountId, string reason)')
print(' bool unbanAccount(string accountId, string reason)')
print(' bool deleteAccountFromRedis(string accountId)')
print(' i32 deleteAllAccountsFromRedis(string accountPrefix)')
print(' bool ping()')
print(' bool reportError(string message, details)')
print(' void shutdown()')
@ -148,18 +144,6 @@ elif cmd == 'banAllProxies':
sys.exit(1)
pp.pprint(client.banAllProxies(args[0],))
elif cmd == 'deleteProxyFromRedis':
if len(args) != 2:
print('deleteProxyFromRedis requires 2 args')
sys.exit(1)
pp.pprint(client.deleteProxyFromRedis(args[0], args[1],))
elif cmd == 'deleteAllProxiesFromRedis':
if len(args) != 1:
print('deleteAllProxiesFromRedis requires 1 args')
sys.exit(1)
pp.pprint(client.deleteAllProxiesFromRedis(args[0],))
elif cmd == 'getAccountStatus':
if len(args) != 2:
print('getAccountStatus requires 2 args')
@ -178,18 +162,6 @@ elif cmd == 'unbanAccount':
sys.exit(1)
pp.pprint(client.unbanAccount(args[0], args[1],))
elif cmd == 'deleteAccountFromRedis':
if len(args) != 1:
print('deleteAccountFromRedis requires 1 args')
sys.exit(1)
pp.pprint(client.deleteAccountFromRedis(args[0],))
elif cmd == 'deleteAllAccountsFromRedis':
if len(args) != 1:
print('deleteAllAccountsFromRedis requires 1 args')
sys.exit(1)
pp.pprint(client.deleteAllAccountsFromRedis(args[0],))
elif cmd == 'ping':
if len(args) != 0:
print('ping requires 0 args')

View File

@ -62,23 +62,6 @@ class Iface(pangramia.base_service.BaseService.Iface):
""" """
pass pass
def deleteProxyFromRedis(self, proxyUrl, serverIdentity):
"""
Parameters:
- proxyUrl
- serverIdentity
"""
pass
def deleteAllProxiesFromRedis(self, serverIdentity):
"""
Parameters:
- serverIdentity
"""
pass
def getAccountStatus(self, accountId, accountPrefix):
"""
Parameters:
@ -106,22 +89,6 @@ class Iface(pangramia.base_service.BaseService.Iface):
""" """
pass pass
def deleteAccountFromRedis(self, accountId):
"""
Parameters:
- accountId
"""
pass
def deleteAllAccountsFromRedis(self, accountPrefix):
"""
Parameters:
- accountPrefix
"""
pass
class Client(pangramia.base_service.BaseService.Client, Iface):
def __init__(self, iprot, oprot=None):
@ -311,80 +278,6 @@ class Client(pangramia.base_service.BaseService.Client, Iface):
raise result.userExp
raise TApplicationException(TApplicationException.MISSING_RESULT, "banAllProxies failed: unknown result")
def deleteProxyFromRedis(self, proxyUrl, serverIdentity):
"""
Parameters:
- proxyUrl
- serverIdentity
"""
self.send_deleteProxyFromRedis(proxyUrl, serverIdentity)
return self.recv_deleteProxyFromRedis()
def send_deleteProxyFromRedis(self, proxyUrl, serverIdentity):
self._oprot.writeMessageBegin('deleteProxyFromRedis', TMessageType.CALL, self._seqid)
args = deleteProxyFromRedis_args()
args.proxyUrl = proxyUrl
args.serverIdentity = serverIdentity
args.write(self._oprot)
self._oprot.writeMessageEnd()
self._oprot.trans.flush()
def recv_deleteProxyFromRedis(self):
iprot = self._iprot
(fname, mtype, rseqid) = iprot.readMessageBegin()
if mtype == TMessageType.EXCEPTION:
x = TApplicationException()
x.read(iprot)
iprot.readMessageEnd()
raise x
result = deleteProxyFromRedis_result()
result.read(iprot)
iprot.readMessageEnd()
if result.success is not None:
return result.success
if result.serviceExp is not None:
raise result.serviceExp
if result.userExp is not None:
raise result.userExp
raise TApplicationException(TApplicationException.MISSING_RESULT, "deleteProxyFromRedis failed: unknown result")
def deleteAllProxiesFromRedis(self, serverIdentity):
"""
Parameters:
- serverIdentity
"""
self.send_deleteAllProxiesFromRedis(serverIdentity)
return self.recv_deleteAllProxiesFromRedis()
def send_deleteAllProxiesFromRedis(self, serverIdentity):
self._oprot.writeMessageBegin('deleteAllProxiesFromRedis', TMessageType.CALL, self._seqid)
args = deleteAllProxiesFromRedis_args()
args.serverIdentity = serverIdentity
args.write(self._oprot)
self._oprot.writeMessageEnd()
self._oprot.trans.flush()
def recv_deleteAllProxiesFromRedis(self):
iprot = self._iprot
(fname, mtype, rseqid) = iprot.readMessageBegin()
if mtype == TMessageType.EXCEPTION:
x = TApplicationException()
x.read(iprot)
iprot.readMessageEnd()
raise x
result = deleteAllProxiesFromRedis_result()
result.read(iprot)
iprot.readMessageEnd()
if result.success is not None:
return result.success
if result.serviceExp is not None:
raise result.serviceExp
if result.userExp is not None:
raise result.userExp
raise TApplicationException(TApplicationException.MISSING_RESULT, "deleteAllProxiesFromRedis failed: unknown result")
def getAccountStatus(self, accountId, accountPrefix):
"""
Parameters:
@ -499,78 +392,6 @@ class Client(pangramia.base_service.BaseService.Client, Iface):
raise result.userExp
raise TApplicationException(TApplicationException.MISSING_RESULT, "unbanAccount failed: unknown result")
def deleteAccountFromRedis(self, accountId):
"""
Parameters:
- accountId
"""
self.send_deleteAccountFromRedis(accountId)
return self.recv_deleteAccountFromRedis()
def send_deleteAccountFromRedis(self, accountId):
self._oprot.writeMessageBegin('deleteAccountFromRedis', TMessageType.CALL, self._seqid)
args = deleteAccountFromRedis_args()
args.accountId = accountId
args.write(self._oprot)
self._oprot.writeMessageEnd()
self._oprot.trans.flush()
def recv_deleteAccountFromRedis(self):
iprot = self._iprot
(fname, mtype, rseqid) = iprot.readMessageBegin()
if mtype == TMessageType.EXCEPTION:
x = TApplicationException()
x.read(iprot)
iprot.readMessageEnd()
raise x
result = deleteAccountFromRedis_result()
result.read(iprot)
iprot.readMessageEnd()
if result.success is not None:
return result.success
if result.serviceExp is not None:
raise result.serviceExp
if result.userExp is not None:
raise result.userExp
raise TApplicationException(TApplicationException.MISSING_RESULT, "deleteAccountFromRedis failed: unknown result")
def deleteAllAccountsFromRedis(self, accountPrefix):
"""
Parameters:
- accountPrefix
"""
self.send_deleteAllAccountsFromRedis(accountPrefix)
return self.recv_deleteAllAccountsFromRedis()
def send_deleteAllAccountsFromRedis(self, accountPrefix):
self._oprot.writeMessageBegin('deleteAllAccountsFromRedis', TMessageType.CALL, self._seqid)
args = deleteAllAccountsFromRedis_args()
args.accountPrefix = accountPrefix
args.write(self._oprot)
self._oprot.writeMessageEnd()
self._oprot.trans.flush()
def recv_deleteAllAccountsFromRedis(self):
iprot = self._iprot
(fname, mtype, rseqid) = iprot.readMessageBegin()
if mtype == TMessageType.EXCEPTION:
x = TApplicationException()
x.read(iprot)
iprot.readMessageEnd()
raise x
result = deleteAllAccountsFromRedis_result()
result.read(iprot)
iprot.readMessageEnd()
if result.success is not None:
return result.success
if result.serviceExp is not None:
raise result.serviceExp
if result.userExp is not None:
raise result.userExp
raise TApplicationException(TApplicationException.MISSING_RESULT, "deleteAllAccountsFromRedis failed: unknown result")
class Processor(pangramia.base_service.BaseService.Processor, Iface, TProcessor):
def __init__(self, handler):
@ -580,13 +401,9 @@ class Processor(pangramia.base_service.BaseService.Processor, Iface, TProcessor)
self._processMap["unbanProxy"] = Processor.process_unbanProxy self._processMap["unbanProxy"] = Processor.process_unbanProxy
self._processMap["resetAllProxyStatuses"] = Processor.process_resetAllProxyStatuses self._processMap["resetAllProxyStatuses"] = Processor.process_resetAllProxyStatuses
self._processMap["banAllProxies"] = Processor.process_banAllProxies self._processMap["banAllProxies"] = Processor.process_banAllProxies
self._processMap["deleteProxyFromRedis"] = Processor.process_deleteProxyFromRedis
self._processMap["deleteAllProxiesFromRedis"] = Processor.process_deleteAllProxiesFromRedis
self._processMap["getAccountStatus"] = Processor.process_getAccountStatus self._processMap["getAccountStatus"] = Processor.process_getAccountStatus
self._processMap["banAccount"] = Processor.process_banAccount self._processMap["banAccount"] = Processor.process_banAccount
self._processMap["unbanAccount"] = Processor.process_unbanAccount self._processMap["unbanAccount"] = Processor.process_unbanAccount
self._processMap["deleteAccountFromRedis"] = Processor.process_deleteAccountFromRedis
self._processMap["deleteAllAccountsFromRedis"] = Processor.process_deleteAllAccountsFromRedis
self._on_message_begin = None self._on_message_begin = None
def on_message_begin(self, func): def on_message_begin(self, func):
@ -754,64 +571,6 @@ class Processor(pangramia.base_service.BaseService.Processor, Iface, TProcessor)
oprot.writeMessageEnd()
oprot.trans.flush()
def process_deleteProxyFromRedis(self, seqid, iprot, oprot):
args = deleteProxyFromRedis_args()
args.read(iprot)
iprot.readMessageEnd()
result = deleteProxyFromRedis_result()
try:
result.success = self._handler.deleteProxyFromRedis(args.proxyUrl, args.serverIdentity)
msg_type = TMessageType.REPLY
except TTransport.TTransportException:
raise
except pangramia.yt.exceptions.ttypes.PBServiceException as serviceExp:
msg_type = TMessageType.REPLY
result.serviceExp = serviceExp
except pangramia.yt.exceptions.ttypes.PBUserException as userExp:
msg_type = TMessageType.REPLY
result.userExp = userExp
except TApplicationException as ex:
logging.exception('TApplication exception in handler')
msg_type = TMessageType.EXCEPTION
result = ex
except Exception:
logging.exception('Unexpected exception in handler')
msg_type = TMessageType.EXCEPTION
result = TApplicationException(TApplicationException.INTERNAL_ERROR, 'Internal error')
oprot.writeMessageBegin("deleteProxyFromRedis", msg_type, seqid)
result.write(oprot)
oprot.writeMessageEnd()
oprot.trans.flush()
def process_deleteAllProxiesFromRedis(self, seqid, iprot, oprot):
args = deleteAllProxiesFromRedis_args()
args.read(iprot)
iprot.readMessageEnd()
result = deleteAllProxiesFromRedis_result()
try:
result.success = self._handler.deleteAllProxiesFromRedis(args.serverIdentity)
msg_type = TMessageType.REPLY
except TTransport.TTransportException:
raise
except pangramia.yt.exceptions.ttypes.PBServiceException as serviceExp:
msg_type = TMessageType.REPLY
result.serviceExp = serviceExp
except pangramia.yt.exceptions.ttypes.PBUserException as userExp:
msg_type = TMessageType.REPLY
result.userExp = userExp
except TApplicationException as ex:
logging.exception('TApplication exception in handler')
msg_type = TMessageType.EXCEPTION
result = ex
except Exception:
logging.exception('Unexpected exception in handler')
msg_type = TMessageType.EXCEPTION
result = TApplicationException(TApplicationException.INTERNAL_ERROR, 'Internal error')
oprot.writeMessageBegin("deleteAllProxiesFromRedis", msg_type, seqid)
result.write(oprot)
oprot.writeMessageEnd()
oprot.trans.flush()
def process_getAccountStatus(self, seqid, iprot, oprot):
args = getAccountStatus_args()
args.read(iprot)
@ -899,64 +658,6 @@ class Processor(pangramia.base_service.BaseService.Processor, Iface, TProcessor)
oprot.writeMessageEnd()
oprot.trans.flush()
def process_deleteAccountFromRedis(self, seqid, iprot, oprot):
args = deleteAccountFromRedis_args()
args.read(iprot)
iprot.readMessageEnd()
result = deleteAccountFromRedis_result()
try:
result.success = self._handler.deleteAccountFromRedis(args.accountId)
msg_type = TMessageType.REPLY
except TTransport.TTransportException:
raise
except pangramia.yt.exceptions.ttypes.PBServiceException as serviceExp:
msg_type = TMessageType.REPLY
result.serviceExp = serviceExp
except pangramia.yt.exceptions.ttypes.PBUserException as userExp:
msg_type = TMessageType.REPLY
result.userExp = userExp
except TApplicationException as ex:
logging.exception('TApplication exception in handler')
msg_type = TMessageType.EXCEPTION
result = ex
except Exception:
logging.exception('Unexpected exception in handler')
msg_type = TMessageType.EXCEPTION
result = TApplicationException(TApplicationException.INTERNAL_ERROR, 'Internal error')
oprot.writeMessageBegin("deleteAccountFromRedis", msg_type, seqid)
result.write(oprot)
oprot.writeMessageEnd()
oprot.trans.flush()
def process_deleteAllAccountsFromRedis(self, seqid, iprot, oprot):
args = deleteAllAccountsFromRedis_args()
args.read(iprot)
iprot.readMessageEnd()
result = deleteAllAccountsFromRedis_result()
try:
result.success = self._handler.deleteAllAccountsFromRedis(args.accountPrefix)
msg_type = TMessageType.REPLY
except TTransport.TTransportException:
raise
except pangramia.yt.exceptions.ttypes.PBServiceException as serviceExp:
msg_type = TMessageType.REPLY
result.serviceExp = serviceExp
except pangramia.yt.exceptions.ttypes.PBUserException as userExp:
msg_type = TMessageType.REPLY
result.userExp = userExp
except TApplicationException as ex:
logging.exception('TApplication exception in handler')
msg_type = TMessageType.EXCEPTION
result = ex
except Exception:
logging.exception('Unexpected exception in handler')
msg_type = TMessageType.EXCEPTION
result = TApplicationException(TApplicationException.INTERNAL_ERROR, 'Internal error')
oprot.writeMessageBegin("deleteAllAccountsFromRedis", msg_type, seqid)
result.write(oprot)
oprot.writeMessageEnd()
oprot.trans.flush()
# HELPER FUNCTIONS AND STRUCTURES
@ -1728,312 +1429,6 @@ banAllProxies_result.thrift_spec = (
)
class deleteProxyFromRedis_args(object):
"""
Attributes:
- proxyUrl
- serverIdentity
"""
def __init__(self, proxyUrl=None, serverIdentity=None,):
self.proxyUrl = proxyUrl
self.serverIdentity = serverIdentity
def read(self, iprot):
if iprot._fast_decode is not None and isinstance(iprot.trans, TTransport.CReadableTransport) and self.thrift_spec is not None:
iprot._fast_decode(self, iprot, [self.__class__, self.thrift_spec])
return
iprot.readStructBegin()
while True:
(fname, ftype, fid) = iprot.readFieldBegin()
if ftype == TType.STOP:
break
if fid == 1:
if ftype == TType.STRING:
self.proxyUrl = iprot.readString().decode('utf-8', errors='replace') if sys.version_info[0] == 2 else iprot.readString()
else:
iprot.skip(ftype)
elif fid == 2:
if ftype == TType.STRING:
self.serverIdentity = iprot.readString().decode('utf-8', errors='replace') if sys.version_info[0] == 2 else iprot.readString()
else:
iprot.skip(ftype)
else:
iprot.skip(ftype)
iprot.readFieldEnd()
iprot.readStructEnd()
def write(self, oprot):
if oprot._fast_encode is not None and self.thrift_spec is not None:
oprot.trans.write(oprot._fast_encode(self, [self.__class__, self.thrift_spec]))
return
oprot.writeStructBegin('deleteProxyFromRedis_args')
if self.proxyUrl is not None:
oprot.writeFieldBegin('proxyUrl', TType.STRING, 1)
oprot.writeString(self.proxyUrl.encode('utf-8') if sys.version_info[0] == 2 else self.proxyUrl)
oprot.writeFieldEnd()
if self.serverIdentity is not None:
oprot.writeFieldBegin('serverIdentity', TType.STRING, 2)
oprot.writeString(self.serverIdentity.encode('utf-8') if sys.version_info[0] == 2 else self.serverIdentity)
oprot.writeFieldEnd()
oprot.writeFieldStop()
oprot.writeStructEnd()
def validate(self):
return
def __repr__(self):
L = ['%s=%r' % (key, value)
for key, value in self.__dict__.items()]
return '%s(%s)' % (self.__class__.__name__, ', '.join(L))
def __eq__(self, other):
return isinstance(other, self.__class__) and self.__dict__ == other.__dict__
def __ne__(self, other):
return not (self == other)
all_structs.append(deleteProxyFromRedis_args)
deleteProxyFromRedis_args.thrift_spec = (
None, # 0
(1, TType.STRING, 'proxyUrl', 'UTF8', None, ), # 1
(2, TType.STRING, 'serverIdentity', 'UTF8', None, ), # 2
)
class deleteProxyFromRedis_result(object):
"""
Attributes:
- success
- serviceExp
- userExp
"""
def __init__(self, success=None, serviceExp=None, userExp=None,):
self.success = success
self.serviceExp = serviceExp
self.userExp = userExp
def read(self, iprot):
if iprot._fast_decode is not None and isinstance(iprot.trans, TTransport.CReadableTransport) and self.thrift_spec is not None:
iprot._fast_decode(self, iprot, [self.__class__, self.thrift_spec])
return
iprot.readStructBegin()
while True:
(fname, ftype, fid) = iprot.readFieldBegin()
if ftype == TType.STOP:
break
if fid == 0:
if ftype == TType.BOOL:
self.success = iprot.readBool()
else:
iprot.skip(ftype)
elif fid == 1:
if ftype == TType.STRUCT:
self.serviceExp = pangramia.yt.exceptions.ttypes.PBServiceException.read(iprot)
else:
iprot.skip(ftype)
elif fid == 2:
if ftype == TType.STRUCT:
self.userExp = pangramia.yt.exceptions.ttypes.PBUserException.read(iprot)
else:
iprot.skip(ftype)
else:
iprot.skip(ftype)
iprot.readFieldEnd()
iprot.readStructEnd()
def write(self, oprot):
if oprot._fast_encode is not None and self.thrift_spec is not None:
oprot.trans.write(oprot._fast_encode(self, [self.__class__, self.thrift_spec]))
return
oprot.writeStructBegin('deleteProxyFromRedis_result')
if self.success is not None:
oprot.writeFieldBegin('success', TType.BOOL, 0)
oprot.writeBool(self.success)
oprot.writeFieldEnd()
if self.serviceExp is not None:
oprot.writeFieldBegin('serviceExp', TType.STRUCT, 1)
self.serviceExp.write(oprot)
oprot.writeFieldEnd()
if self.userExp is not None:
oprot.writeFieldBegin('userExp', TType.STRUCT, 2)
self.userExp.write(oprot)
oprot.writeFieldEnd()
oprot.writeFieldStop()
oprot.writeStructEnd()
def validate(self):
return
def __repr__(self):
L = ['%s=%r' % (key, value)
for key, value in self.__dict__.items()]
return '%s(%s)' % (self.__class__.__name__, ', '.join(L))
def __eq__(self, other):
return isinstance(other, self.__class__) and self.__dict__ == other.__dict__
def __ne__(self, other):
return not (self == other)
all_structs.append(deleteProxyFromRedis_result)
deleteProxyFromRedis_result.thrift_spec = (
(0, TType.BOOL, 'success', None, None, ), # 0
(1, TType.STRUCT, 'serviceExp', [pangramia.yt.exceptions.ttypes.PBServiceException, None], None, ), # 1
(2, TType.STRUCT, 'userExp', [pangramia.yt.exceptions.ttypes.PBUserException, None], None, ), # 2
)
class deleteAllProxiesFromRedis_args(object):
"""
Attributes:
- serverIdentity
"""
def __init__(self, serverIdentity=None,):
self.serverIdentity = serverIdentity
def read(self, iprot):
if iprot._fast_decode is not None and isinstance(iprot.trans, TTransport.CReadableTransport) and self.thrift_spec is not None:
iprot._fast_decode(self, iprot, [self.__class__, self.thrift_spec])
return
iprot.readStructBegin()
while True:
(fname, ftype, fid) = iprot.readFieldBegin()
if ftype == TType.STOP:
break
if fid == 1:
if ftype == TType.STRING:
self.serverIdentity = iprot.readString().decode('utf-8', errors='replace') if sys.version_info[0] == 2 else iprot.readString()
else:
iprot.skip(ftype)
else:
iprot.skip(ftype)
iprot.readFieldEnd()
iprot.readStructEnd()
def write(self, oprot):
if oprot._fast_encode is not None and self.thrift_spec is not None:
oprot.trans.write(oprot._fast_encode(self, [self.__class__, self.thrift_spec]))
return
oprot.writeStructBegin('deleteAllProxiesFromRedis_args')
if self.serverIdentity is not None:
oprot.writeFieldBegin('serverIdentity', TType.STRING, 1)
oprot.writeString(self.serverIdentity.encode('utf-8') if sys.version_info[0] == 2 else self.serverIdentity)
oprot.writeFieldEnd()
oprot.writeFieldStop()
oprot.writeStructEnd()
def validate(self):
return
def __repr__(self):
L = ['%s=%r' % (key, value)
for key, value in self.__dict__.items()]
return '%s(%s)' % (self.__class__.__name__, ', '.join(L))
def __eq__(self, other):
return isinstance(other, self.__class__) and self.__dict__ == other.__dict__
def __ne__(self, other):
return not (self == other)
all_structs.append(deleteAllProxiesFromRedis_args)
deleteAllProxiesFromRedis_args.thrift_spec = (
None, # 0
(1, TType.STRING, 'serverIdentity', 'UTF8', None, ), # 1
)
class deleteAllProxiesFromRedis_result(object):
"""
Attributes:
- success
- serviceExp
- userExp
"""
def __init__(self, success=None, serviceExp=None, userExp=None,):
self.success = success
self.serviceExp = serviceExp
self.userExp = userExp
def read(self, iprot):
if iprot._fast_decode is not None and isinstance(iprot.trans, TTransport.CReadableTransport) and self.thrift_spec is not None:
iprot._fast_decode(self, iprot, [self.__class__, self.thrift_spec])
return
iprot.readStructBegin()
while True:
(fname, ftype, fid) = iprot.readFieldBegin()
if ftype == TType.STOP:
break
if fid == 0:
if ftype == TType.I32:
self.success = iprot.readI32()
else:
iprot.skip(ftype)
elif fid == 1:
if ftype == TType.STRUCT:
self.serviceExp = pangramia.yt.exceptions.ttypes.PBServiceException.read(iprot)
else:
iprot.skip(ftype)
elif fid == 2:
if ftype == TType.STRUCT:
self.userExp = pangramia.yt.exceptions.ttypes.PBUserException.read(iprot)
else:
iprot.skip(ftype)
else:
iprot.skip(ftype)
iprot.readFieldEnd()
iprot.readStructEnd()
def write(self, oprot):
if oprot._fast_encode is not None and self.thrift_spec is not None:
oprot.trans.write(oprot._fast_encode(self, [self.__class__, self.thrift_spec]))
return
oprot.writeStructBegin('deleteAllProxiesFromRedis_result')
if self.success is not None:
oprot.writeFieldBegin('success', TType.I32, 0)
oprot.writeI32(self.success)
oprot.writeFieldEnd()
if self.serviceExp is not None:
oprot.writeFieldBegin('serviceExp', TType.STRUCT, 1)
self.serviceExp.write(oprot)
oprot.writeFieldEnd()
if self.userExp is not None:
oprot.writeFieldBegin('userExp', TType.STRUCT, 2)
self.userExp.write(oprot)
oprot.writeFieldEnd()
oprot.writeFieldStop()
oprot.writeStructEnd()
def validate(self):
return
def __repr__(self):
L = ['%s=%r' % (key, value)
for key, value in self.__dict__.items()]
return '%s(%s)' % (self.__class__.__name__, ', '.join(L))
def __eq__(self, other):
return isinstance(other, self.__class__) and self.__dict__ == other.__dict__
def __ne__(self, other):
return not (self == other)
all_structs.append(deleteAllProxiesFromRedis_result)
deleteAllProxiesFromRedis_result.thrift_spec = (
(0, TType.I32, 'success', None, None, ), # 0
(1, TType.STRUCT, 'serviceExp', [pangramia.yt.exceptions.ttypes.PBServiceException, None], None, ), # 1
(2, TType.STRUCT, 'userExp', [pangramia.yt.exceptions.ttypes.PBUserException, None], None, ), # 2
)
class getAccountStatus_args(object):
"""
Attributes:
@ -2518,299 +1913,5 @@ unbanAccount_result.thrift_spec = (
(1, TType.STRUCT, 'serviceExp', [pangramia.yt.exceptions.ttypes.PBServiceException, None], None, ), # 1
(2, TType.STRUCT, 'userExp', [pangramia.yt.exceptions.ttypes.PBUserException, None], None, ), # 2
)
class deleteAccountFromRedis_args(object):
"""
Attributes:
- accountId
"""
def __init__(self, accountId=None,):
self.accountId = accountId
def read(self, iprot):
if iprot._fast_decode is not None and isinstance(iprot.trans, TTransport.CReadableTransport) and self.thrift_spec is not None:
iprot._fast_decode(self, iprot, [self.__class__, self.thrift_spec])
return
iprot.readStructBegin()
while True:
(fname, ftype, fid) = iprot.readFieldBegin()
if ftype == TType.STOP:
break
if fid == 1:
if ftype == TType.STRING:
self.accountId = iprot.readString().decode('utf-8', errors='replace') if sys.version_info[0] == 2 else iprot.readString()
else:
iprot.skip(ftype)
else:
iprot.skip(ftype)
iprot.readFieldEnd()
iprot.readStructEnd()
def write(self, oprot):
if oprot._fast_encode is not None and self.thrift_spec is not None:
oprot.trans.write(oprot._fast_encode(self, [self.__class__, self.thrift_spec]))
return
oprot.writeStructBegin('deleteAccountFromRedis_args')
if self.accountId is not None:
oprot.writeFieldBegin('accountId', TType.STRING, 1)
oprot.writeString(self.accountId.encode('utf-8') if sys.version_info[0] == 2 else self.accountId)
oprot.writeFieldEnd()
oprot.writeFieldStop()
oprot.writeStructEnd()
def validate(self):
return
def __repr__(self):
L = ['%s=%r' % (key, value)
for key, value in self.__dict__.items()]
return '%s(%s)' % (self.__class__.__name__, ', '.join(L))
def __eq__(self, other):
return isinstance(other, self.__class__) and self.__dict__ == other.__dict__
def __ne__(self, other):
return not (self == other)
all_structs.append(deleteAccountFromRedis_args)
deleteAccountFromRedis_args.thrift_spec = (
None, # 0
(1, TType.STRING, 'accountId', 'UTF8', None, ), # 1
)
class deleteAccountFromRedis_result(object):
"""
Attributes:
- success
- serviceExp
- userExp
"""
def __init__(self, success=None, serviceExp=None, userExp=None,):
self.success = success
self.serviceExp = serviceExp
self.userExp = userExp
def read(self, iprot):
if iprot._fast_decode is not None and isinstance(iprot.trans, TTransport.CReadableTransport) and self.thrift_spec is not None:
iprot._fast_decode(self, iprot, [self.__class__, self.thrift_spec])
return
iprot.readStructBegin()
while True:
(fname, ftype, fid) = iprot.readFieldBegin()
if ftype == TType.STOP:
break
if fid == 0:
if ftype == TType.BOOL:
self.success = iprot.readBool()
else:
iprot.skip(ftype)
elif fid == 1:
if ftype == TType.STRUCT:
self.serviceExp = pangramia.yt.exceptions.ttypes.PBServiceException.read(iprot)
else:
iprot.skip(ftype)
elif fid == 2:
if ftype == TType.STRUCT:
self.userExp = pangramia.yt.exceptions.ttypes.PBUserException.read(iprot)
else:
iprot.skip(ftype)
else:
iprot.skip(ftype)
iprot.readFieldEnd()
iprot.readStructEnd()
def write(self, oprot):
if oprot._fast_encode is not None and self.thrift_spec is not None:
oprot.trans.write(oprot._fast_encode(self, [self.__class__, self.thrift_spec]))
return
oprot.writeStructBegin('deleteAccountFromRedis_result')
if self.success is not None:
oprot.writeFieldBegin('success', TType.BOOL, 0)
oprot.writeBool(self.success)
oprot.writeFieldEnd()
if self.serviceExp is not None:
oprot.writeFieldBegin('serviceExp', TType.STRUCT, 1)
self.serviceExp.write(oprot)
oprot.writeFieldEnd()
if self.userExp is not None:
oprot.writeFieldBegin('userExp', TType.STRUCT, 2)
self.userExp.write(oprot)
oprot.writeFieldEnd()
oprot.writeFieldStop()
oprot.writeStructEnd()
def validate(self):
return
def __repr__(self):
L = ['%s=%r' % (key, value)
for key, value in self.__dict__.items()]
return '%s(%s)' % (self.__class__.__name__, ', '.join(L))
def __eq__(self, other):
return isinstance(other, self.__class__) and self.__dict__ == other.__dict__
def __ne__(self, other):
return not (self == other)
all_structs.append(deleteAccountFromRedis_result)
deleteAccountFromRedis_result.thrift_spec = (
(0, TType.BOOL, 'success', None, None, ), # 0
(1, TType.STRUCT, 'serviceExp', [pangramia.yt.exceptions.ttypes.PBServiceException, None], None, ), # 1
(2, TType.STRUCT, 'userExp', [pangramia.yt.exceptions.ttypes.PBUserException, None], None, ), # 2
)
class deleteAllAccountsFromRedis_args(object):
"""
Attributes:
- accountPrefix
"""
def __init__(self, accountPrefix=None,):
self.accountPrefix = accountPrefix
def read(self, iprot):
if iprot._fast_decode is not None and isinstance(iprot.trans, TTransport.CReadableTransport) and self.thrift_spec is not None:
iprot._fast_decode(self, iprot, [self.__class__, self.thrift_spec])
return
iprot.readStructBegin()
while True:
(fname, ftype, fid) = iprot.readFieldBegin()
if ftype == TType.STOP:
break
if fid == 1:
if ftype == TType.STRING:
self.accountPrefix = iprot.readString().decode('utf-8', errors='replace') if sys.version_info[0] == 2 else iprot.readString()
else:
iprot.skip(ftype)
else:
iprot.skip(ftype)
iprot.readFieldEnd()
iprot.readStructEnd()
def write(self, oprot):
if oprot._fast_encode is not None and self.thrift_spec is not None:
oprot.trans.write(oprot._fast_encode(self, [self.__class__, self.thrift_spec]))
return
oprot.writeStructBegin('deleteAllAccountsFromRedis_args')
if self.accountPrefix is not None:
oprot.writeFieldBegin('accountPrefix', TType.STRING, 1)
oprot.writeString(self.accountPrefix.encode('utf-8') if sys.version_info[0] == 2 else self.accountPrefix)
oprot.writeFieldEnd()
oprot.writeFieldStop()
oprot.writeStructEnd()
def validate(self):
return
def __repr__(self):
L = ['%s=%r' % (key, value)
for key, value in self.__dict__.items()]
return '%s(%s)' % (self.__class__.__name__, ', '.join(L))
def __eq__(self, other):
return isinstance(other, self.__class__) and self.__dict__ == other.__dict__
def __ne__(self, other):
return not (self == other)
all_structs.append(deleteAllAccountsFromRedis_args)
deleteAllAccountsFromRedis_args.thrift_spec = (
None, # 0
(1, TType.STRING, 'accountPrefix', 'UTF8', None, ), # 1
)
class deleteAllAccountsFromRedis_result(object):
"""
Attributes:
- success
- serviceExp
- userExp
"""
def __init__(self, success=None, serviceExp=None, userExp=None,):
self.success = success
self.serviceExp = serviceExp
self.userExp = userExp
def read(self, iprot):
if iprot._fast_decode is not None and isinstance(iprot.trans, TTransport.CReadableTransport) and self.thrift_spec is not None:
iprot._fast_decode(self, iprot, [self.__class__, self.thrift_spec])
return
iprot.readStructBegin()
while True:
(fname, ftype, fid) = iprot.readFieldBegin()
if ftype == TType.STOP:
break
if fid == 0:
if ftype == TType.I32:
self.success = iprot.readI32()
else:
iprot.skip(ftype)
elif fid == 1:
if ftype == TType.STRUCT:
self.serviceExp = pangramia.yt.exceptions.ttypes.PBServiceException.read(iprot)
else:
iprot.skip(ftype)
elif fid == 2:
if ftype == TType.STRUCT:
self.userExp = pangramia.yt.exceptions.ttypes.PBUserException.read(iprot)
else:
iprot.skip(ftype)
else:
iprot.skip(ftype)
iprot.readFieldEnd()
iprot.readStructEnd()
def write(self, oprot):
if oprot._fast_encode is not None and self.thrift_spec is not None:
oprot.trans.write(oprot._fast_encode(self, [self.__class__, self.thrift_spec]))
return
oprot.writeStructBegin('deleteAllAccountsFromRedis_result')
if self.success is not None:
oprot.writeFieldBegin('success', TType.I32, 0)
oprot.writeI32(self.success)
oprot.writeFieldEnd()
if self.serviceExp is not None:
oprot.writeFieldBegin('serviceExp', TType.STRUCT, 1)
self.serviceExp.write(oprot)
oprot.writeFieldEnd()
if self.userExp is not None:
oprot.writeFieldBegin('userExp', TType.STRUCT, 2)
self.userExp.write(oprot)
oprot.writeFieldEnd()
oprot.writeFieldStop()
oprot.writeStructEnd()
def validate(self):
return
def __repr__(self):
L = ['%s=%r' % (key, value)
for key, value in self.__dict__.items()]
return '%s(%s)' % (self.__class__.__name__, ', '.join(L))
def __eq__(self, other):
return isinstance(other, self.__class__) and self.__dict__ == other.__dict__
def __ne__(self, other):
return not (self == other)
all_structs.append(deleteAllAccountsFromRedis_result)
deleteAllAccountsFromRedis_result.thrift_spec = (
(0, TType.I32, 'success', None, None, ), # 0
(1, TType.STRUCT, 'serviceExp', [pangramia.yt.exceptions.ttypes.PBServiceException, None], None, ), # 1
(2, TType.STRUCT, 'userExp', [pangramia.yt.exceptions.ttypes.PBUserException, None], None, ), # 2
)
fix_spec(all_structs)
del all_structs

View File

@ -34,13 +34,9 @@ if len(sys.argv) <= 1 or sys.argv[1] == '--help':
print(' bool unbanProxy(string proxyUrl, string serverIdentity)') print(' bool unbanProxy(string proxyUrl, string serverIdentity)')
print(' bool resetAllProxyStatuses(string serverIdentity)') print(' bool resetAllProxyStatuses(string serverIdentity)')
print(' bool banAllProxies(string serverIdentity)') print(' bool banAllProxies(string serverIdentity)')
print(' bool deleteProxyFromRedis(string proxyUrl, string serverIdentity)')
print(' i32 deleteAllProxiesFromRedis(string serverIdentity)')
print(' getAccountStatus(string accountId, string accountPrefix)')
print(' bool banAccount(string accountId, string reason)')
print(' bool unbanAccount(string accountId, string reason)')
print(' bool deleteAccountFromRedis(string accountId)')
print(' i32 deleteAllAccountsFromRedis(string accountPrefix)')
print(' bool ping()')
print(' bool reportError(string message, details)')
print(' void shutdown()')
@ -183,18 +179,6 @@ elif cmd == 'banAllProxies':
sys.exit(1)
pp.pprint(client.banAllProxies(args[0],))
elif cmd == 'deleteProxyFromRedis':
if len(args) != 2:
print('deleteProxyFromRedis requires 2 args')
sys.exit(1)
pp.pprint(client.deleteProxyFromRedis(args[0], args[1],))
elif cmd == 'deleteAllProxiesFromRedis':
if len(args) != 1:
print('deleteAllProxiesFromRedis requires 1 args')
sys.exit(1)
pp.pprint(client.deleteAllProxiesFromRedis(args[0],))
elif cmd == 'getAccountStatus':
if len(args) != 2:
print('getAccountStatus requires 2 args')
@ -213,18 +197,6 @@ elif cmd == 'unbanAccount':
sys.exit(1)
pp.pprint(client.unbanAccount(args[0], args[1],))
elif cmd == 'deleteAccountFromRedis':
if len(args) != 1:
print('deleteAccountFromRedis requires 1 args')
sys.exit(1)
pp.pprint(client.deleteAccountFromRedis(args[0],))
elif cmd == 'deleteAllAccountsFromRedis':
if len(args) != 1:
print('deleteAllAccountsFromRedis requires 1 args')
sys.exit(1)
pp.pprint(client.deleteAllAccountsFromRedis(args[0],))
elif cmd == 'ping':
if len(args) != 0:
print('ping requires 0 args')

View File

@ -0,0 +1,58 @@
"""
Patch for Thrift-generated exception classes to make them compatible with Airflow's secret masking.
"""
import logging
import sys
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple, Union
# --- Python Path Setup ---
project_root = Path(__file__).parent.absolute()
# Add project root to sys.path (needed for the 'pangramia' symlink)
if str(project_root) not in sys.path: sys.path.insert(0, str(project_root))
# --- End Python Path Setup ---
logger = logging.getLogger(__name__)
def patch_thrift_exceptions():
"""
Patch Thrift-generated exception classes to make them compatible with Airflow's secret masking.
"""
try:
from pangramia.yt.exceptions.ttypes import PBServiceException, PBUserException
# Save original __setattr__ methods
original_service_setattr = PBServiceException.__setattr__
original_user_setattr = PBUserException.__setattr__
# Define a new __setattr__ method that allows modifying any attribute
def new_service_setattr(self, name, value):
logger.debug(f"Setting attribute {name} on PBServiceException")
object.__setattr__(self, name, value)
def new_user_setattr(self, name, value):
logger.debug(f"Setting attribute {name} on PBUserException")
object.__setattr__(self, name, value)
# Apply the patch to both exception classes
PBServiceException.__setattr__ = new_service_setattr
PBUserException.__setattr__ = new_user_setattr
logger.info("Successfully patched Thrift exception classes for Airflow compatibility")
# Verify the patch
try:
test_exception = PBServiceException(message="Test")
test_exception.args = ("Test",) # Try to modify an attribute
logger.info("Verified Thrift exception patch is working correctly")
except Exception as e:
logger.error(f"Thrift exception patch verification failed: {e}")
except ImportError as e:
logger.warning(f"Could not import Thrift exception classes: {e}")
logger.warning("Airflow error handling may not work properly with Thrift exceptions")
except Exception as e:
logger.error(f"Error patching Thrift exception classes: {e}")
# Apply the patch when this module is imported
patch_thrift_exceptions()
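The patch above takes effect simply by importing the module. A minimal usage sketch follows; the module's on-disk filename is not shown in this diff, so `thrift_exception_patch` below is a hypothetical placeholder for whatever name the repo actually uses:
```python
# Illustrative only: "thrift_exception_patch" is a placeholder for the patch
# module's real filename, which is not visible in this diff.
import logging

logging.basicConfig(level=logging.INFO)

# Importing the module runs patch_thrift_exceptions() as a side effect, so it
# should be imported before any Thrift exception is raised or handled.
import thrift_exception_patch  # noqa: F401  (hypothetical module name)

from pangramia.yt.exceptions.ttypes import PBServiceException

try:
    raise PBServiceException(message="demo failure")
except PBServiceException as exc:
    # With the patch applied, attributes such as `args` can be reassigned,
    # which is what Airflow-style secret masking needs to do on exceptions.
    exc.args = ("demo failure",)
    print(exc.args)
```
In practice the import would go somewhere that loads early in the Airflow process (for example a DAG module or local settings file), so the patch is in place before any Thrift call can fail.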

View File

@ -1,120 +0,0 @@
# Ansible-driven YT-DLP / Airflow Cluster Quick-Start & Cheat-Sheet
> One playbook = one command to **deploy**, **update**, **restart**, or **re-configure** the entire cluster.
---
## 0. Prerequisites (run once on the **tower** server)
```
---
## 1. Ansible Vault Setup (run once on your **local machine**)
This project uses Ansible Vault to encrypt sensitive data like passwords and API keys. To run the playbooks, you need to provide the vault password. The recommended way is to create a file named `.vault_pass` in the root of the project directory.
1. **Create the Vault Password File:**
From the project's root directory (e.g., `/opt/yt-ops-services`), create the file. The file should contain only your vault password on a single line.
```bash
# Replace 'your_secret_password_here' with your actual vault password
echo "your_secret_password_here" > .vault_pass
```
2. **Secure the File:**
It's good practice to restrict permissions on this file so only you can read it.
```bash
chmod 600 .vault_pass
```
The `ansible.cfg` file is configured to automatically look for this `.vault_pass` file in the project root.
---
## 1.5. Cluster & Inventory Management
The Ansible inventory (`ansible/inventory.ini`), host-specific variables (`ansible/host_vars/`), and the master `docker-compose.yaml` are dynamically generated from a central cluster definition file (e.g., `cluster.yml`).
**Whenever you add, remove, or change the IP of a node in your `cluster.yml`, you must re-run the generator script.**
1. **Install Script Dependencies (run once):**
The generator script requires `PyYAML` and `Jinja2`. Install them using pip:
```bash
pip3 install PyYAML Jinja2
```
2. **Edit Your Cluster Definition:**
Modify your `cluster.yml` file (located in the project root) to define your master and worker nodes.
3. **Run the Generator Script:**
From the project's root directory, run the following command to update all generated files:
```bash
# Make sure the script is executable first: chmod +x tools/generate-inventory.py
./tools/generate-inventory.py cluster.yml
```
This ensures that Ansible has the correct host information and that the master node's Docker Compose configuration includes the correct `extra_hosts` for log fetching from workers.
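For reference, here is a minimal sketch (not the actual `tools/generate-inventory.py`) of how a `cluster.yml` with this repo's layout can be loaded with PyYAML and turned into inventory-style host lines. It assumes the `master` / `workers` / `ip` / `proxies` structure shown in the cluster.yml elsewhere in this diff:
```python
# Illustrative sketch only: loads a cluster.yml shaped like the one in this
# repo and prints the hosts that would end up in inventory.ini.
import sys
import yaml  # PyYAML, already required by the generator script


def load_cluster(path: str) -> dict:
    with open(path, "r", encoding="utf-8") as fh:
        return yaml.safe_load(fh)


def summarize(cluster: dict) -> None:
    # cluster.yml maps one master name to its IP, and each worker name to a
    # dict with "ip" and an optional "proxies" list.
    for name, ip in cluster.get("master", {}).items():
        print(f"[airflow_master] {name} ansible_host={ip}")
    for name, spec in cluster.get("workers", {}).items():
        print(f"[airflow_workers] {name} ansible_host={spec['ip']} "
              f"proxies={spec.get('proxies', [])}")


if __name__ == "__main__":
    summarize(load_cluster(sys.argv[1] if len(sys.argv) > 1 else "cluster.yml"))
```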
---
## 2. Setup and Basic Usage
### Running Ansible Commands
**IMPORTANT:** All `ansible-playbook` commands should be run from within the `ansible/` directory. This allows Ansible to automatically find the `ansible.cfg` and `inventory.ini` files.
```bash
cd ansible
ansible-playbook <playbook_name>.yml
```
The `ansible.cfg` file is configured to automatically use the `.vault_pass` file located in the project root (one level above `ansible/`). This means you **do not** need to manually specify `--vault-password-file ../.vault_pass` in your commands. Ensure your `.vault_pass` file is located in the project root.
If you run `ansible-playbook` from the project root instead of the `ansible/` directory, you will see warnings about the inventory not being parsed, because Ansible does not automatically find `ansible/ansible.cfg`.
---
## 3. Deployment Scenarios
### Full Cluster Deployment
To deploy or update the entire cluster (master and all workers), run the main playbook. This will build/pull images and restart all services.
```bash
# Run from inside the ansible/ directory
ansible-playbook playbook-full.yml
```
### Targeted & Fast Deployments
For faster development cycles, you can deploy changes to specific parts of the cluster without rebuilding or re-pulling Docker images.
#### Updating Only the Master Node (Fast Deploy)
To sync configuration, code, and restart services on the master node *without* rebuilding the Airflow image or pulling the `ytdlp-ops-service` image, use the `fast_deploy` flag with the master playbook. This is ideal for pushing changes to DAGs, Python code, or config files.
```bash
# Run from inside the ansible/ directory
ansible-playbook playbook-master.yml --extra-vars "fast_deploy=true"
```
#### Updating Only a Specific Worker Node (Fast Deploy)
Similarly, you can update a single worker node. Replace `dl001` with the hostname of the worker you want to target from your `inventory.ini`.
```bash
# Run from inside the ansible/ directory
ansible-playbook playbook-worker.yml --limit dl001 --extra-vars "fast_deploy=true"
```
#### Updating Only DAGs and Configs
If you have only changed DAGs or configuration files and don't need to restart any services, you can run a much faster playbook that only syncs the `dags/` and `config/` directories.
```bash
# Run from inside the ansible/ directory
ansible-playbook playbook-dags.yml
```

View File

@ -1,2 +0,0 @@
# Enable memory overcommit for Redis to prevent background save failures
vm.overcommit_memory = 1

View File

@ -28,9 +28,6 @@ docker_network_name: "airflow_proxynet"
ssh_user: "alex_p"
ansible_user: "alex_p"
# Default group
deploy_group: "ytdl"
# Default file permissions
dir_permissions: "0755"
file_permissions: "0644"

View File

@ -1,7 +0,0 @@
---
# This file is auto-generated by tools/generate-inventory.py
# Do not edit; your changes will be overwritten.
master_host_ip: 89.253.221.173
redis_port: 52909
external_access_ips:
[]

View File

@ -1,4 +1,7 @@
---
vault_redis_password: "rOhTAIlTFFylXsjhqwxnYxDChFc"
vault_postgres_password: "pgdb_pwd_A7bC2xY9zE1wV5uP"
vault_airflow_admin_password: "admin_pwd_X9yZ3aB1cE5dF7gH" vault_airflow_admin_password: "2r234sdfrt3q454arq45q355"
vault_minio_root_password: "0153093693-0009"
vault_vnc_password: "vnc_pwd_Z5xW8cV2bN4mP7lK"
vault_dockerhub_token: "dckr_pat_Fbg-Q-ysA7aUKHroTZQIrd-VbIE"

View File

@ -1,4 +0,0 @@
---
# Variables for af-green
master_host_ip: 89.253.221.173
redis_port: 52909

View File

@ -0,0 +1 @@
master_host_ip: 89.253.223.97

View File

@ -0,0 +1,4 @@
---
# Variables for dl001
worker_proxies:
- "socks5://sslocal-rust-1087:1087"

View File

@ -1,6 +0,0 @@
---
# Variables for dl003
master_host_ip: 89.253.221.173
redis_port: 52909
worker_proxies:
- "socks5://sslocal-rust-1087:1087"

View File

@ -3,7 +3,7 @@
# Edit cluster.yml and re-run the generator instead.
[airflow_master]
af-green ansible_host=89.253.221.173 af-test ansible_host=89.253.223.97
[airflow_workers]
dl003 ansible_host=62.60.245.103 dl001 ansible_host=109.107.189.106

View File

@ -19,16 +19,14 @@
- name: Sync Config to MASTER server
ansible.posix.synchronize:
src: "../airflow/config/{{ item }}" src: "../airflow/config/"
dest: /srv/airflow_master/config/
archive: yes
delete: yes
rsync_path: "sudo rsync"
rsync_opts:
- "--exclude=__pycache__/"
- "--exclude=*.pyc"
loop:
- "airflow.cfg"
- "custom_task_hooks.py"
- name: Deploy Airflow DAGs to DL Workers
hosts: airflow_workers

View File

@ -4,12 +4,6 @@
vars_files:
- group_vars/all.yml
- group_vars/all/vault.yml
pre_tasks:
- name: Announce fast deploy mode if enabled
debug:
msg: "🚀 FAST DEPLOY MODE ENABLED: Skipping Docker image builds and pulls. 🚀"
when: fast_deploy | default(false)
run_once: true
tasks:
- name: Ensure worker directory exists
@ -25,49 +19,37 @@
dest: "{{ airflow_worker_dir }}/.env" dest: "{{ airflow_worker_dir }}/.env"
mode: '0600' mode: '0600'
- name: Template docker-compose file for Airflow worker - name: Copy docker-compose-dl.yaml
template: copy:
src: ../airflow/configs/docker-compose-dl.yaml.j2 src: airflow/docker-compose-dl.yaml
dest: "{{ airflow_worker_dir }}/configs/docker-compose-dl.yaml" dest: "{{ airflow_worker_dir }}/docker-compose.yaml"
mode: '0644' remote_src: yes
- name: Build Airflow worker image from local Dockerfile - name: Symlink compose file
community.docker.docker_image: file:
name: "{{ airflow_image_name }}" src: "{{ airflow_worker_dir }}/docker-compose.yaml"
build: dest: "{{ airflow_worker_dir }}/docker-compose-dl.yaml"
path: "{{ airflow_worker_dir }}" state: link
dockerfile: "Dockerfile"
source: build
force_source: true
when: not fast_deploy | default(false)
- name: Build Camoufox image from local Dockerfile
community.docker.docker_image:
name: "camoufox:latest"
build:
path: "{{ airflow_worker_dir }}/camoufox"
source: build
force_source: true
when: not fast_deploy | default(false)
- name: Pull ytdlp-ops-service image only
community.docker.docker_image:
name: "{{ ytdlp_ops_image }}"
source: pull
when: not fast_deploy | default(false)
- name: Generate dynamic configs (camoufox + envoy) - name: Generate dynamic configs (camoufox + envoy)
shell: community.docker.docker_compose:
cmd: "docker compose -f configs/docker-compose.config-generate.yaml run --rm config-generator"
chdir: "{{ airflow_worker_dir }}"
- name: Start worker services
community.docker.docker_compose_v2:
project_src: "{{ airflow_worker_dir }}"
files:
- configs/docker-compose-dl.yaml - docker-compose.config-generate.yaml
- configs/docker-compose-ytdlp-ops.yaml services:
- configs/docker-compose.camoufox.yaml - config-generator
state: present
- name: Pull latest images
community.docker.docker_compose:
project_src: "{{ airflow_worker_dir }}"
files:
- docker-compose.yaml
pull: yes
- name: Start worker services
community.docker.docker_compose:
project_src: "{{ airflow_worker_dir }}"
files:
- docker-compose.yaml
state: present state: present
remove_orphans: true
pull: "{{ 'never' if fast_deploy | default(false) else 'missing' }}"

View File

@ -5,20 +5,7 @@
vars_files:
- group_vars/all.yml
- group_vars/all/vault.yml
pre_tasks:
- name: Announce fast deploy mode if enabled
debug:
msg: "🚀 FAST DEPLOY MODE ENABLED: Skipping Docker image builds and pulls. 🚀"
when: fast_deploy | default(false)
run_once: true
tasks:
- name: Ensure python3-docker is installed
ansible.builtin.apt:
name: python3-docker
state: present
update_cache: yes
become: yes
- name: Ensure shared Docker network exists
community.docker.docker_network:
name: airflow_proxynet

View File

@ -9,92 +9,6 @@
- name: Announce master deployment
debug:
msg: "Starting deployment for Airflow Master: {{ inventory_hostname }} ({{ ansible_host }})"
- name: Set deploy_group to a valid single group name
set_fact:
deploy_group: "ytdl"
- name: Ensure deploy group exists
group:
name: "{{ deploy_group }}"
state: present
become: yes
- name: Ensure deploy user exists
user:
name: "{{ ssh_user }}"
group: "{{ deploy_group }}"
state: present
become: yes
- name: Validate deploy_group variable
ansible.builtin.assert:
that:
- deploy_group is defined
- deploy_group is string
- "',' not in deploy_group"
- "' ' not in deploy_group"
fail_msg: "The 'deploy_group' variable ('{{ deploy_group }}') must be a single, valid group name. It should not contain commas or spaces."
- name: Check for swapfile
stat:
path: /swapfile
register: swap_file
become: yes
- name: Create 8GB swapfile
command: fallocate -l 8G /swapfile
when: not swap_file.stat.exists
become: yes
- name: Set swapfile permissions
file:
path: /swapfile
mode: '0600'
when: not swap_file.stat.exists
become: yes
- name: Make swap
command: mkswap /swapfile
when: not swap_file.stat.exists
become: yes
- name: Check current swap status
command: swapon --show
register: swap_status
changed_when: false
become: yes
- name: Enable swap
command: swapon /swapfile
when: "'/swapfile' not in swap_status.stdout"
become: yes
- name: Add swapfile to fstab
lineinfile:
path: /etc/fstab
regexp: '^/swapfile'
line: '/swapfile none swap sw 0 0'
state: present
become: yes
- name: Get GID of the deploy group
getent:
database: group
key: "{{ deploy_group }}"
register: deploy_group_info
become: yes
- name: Set deploy_group_gid fact
set_fact:
deploy_group_gid: "{{ deploy_group_info.ansible_facts.getent_group[deploy_group][1] }}"
when: deploy_group_info.ansible_facts.getent_group is defined and deploy_group in deploy_group_info.ansible_facts.getent_group
- name: Ensure deploy_group_gid is set to a valid value
set_fact:
deploy_group_gid: "0"
when: deploy_group_gid is not defined or deploy_group_gid == ""
roles:
- ytdlp-master
- airflow-master
- ytdlp-master

View File

@ -9,92 +9,6 @@
- name: Announce worker deployment
debug:
msg: "Starting deployment for Airflow Worker: {{ inventory_hostname }} ({{ ansible_host }})"
- name: Set deploy_group to a valid single group name
set_fact:
deploy_group: "ytdl"
- name: Ensure deploy group exists
group:
name: "{{ deploy_group }}"
state: present
become: yes
- name: Ensure deploy user exists
user:
name: "{{ ssh_user }}"
group: "{{ deploy_group }}"
state: present
become: yes
- name: Validate deploy_group variable
ansible.builtin.assert:
that:
- deploy_group is defined
- deploy_group is string
- "',' not in deploy_group"
- "' ' not in deploy_group"
fail_msg: "The 'deploy_group' variable ('{{ deploy_group }}') must be a single, valid group name. It should not contain commas or spaces."
- name: Check for swapfile
stat:
path: /swapfile
register: swap_file
become: yes
- name: Create 8GB swapfile
command: fallocate -l 8G /swapfile
when: not swap_file.stat.exists
become: yes
- name: Set swapfile permissions
file:
path: /swapfile
mode: '0600'
when: not swap_file.stat.exists
become: yes
- name: Make swap
command: mkswap /swapfile
when: not swap_file.stat.exists
become: yes
- name: Check current swap status
command: swapon --show
register: swap_status
changed_when: false
become: yes
- name: Enable swap
command: swapon /swapfile
when: "'/swapfile' not in swap_status.stdout"
become: yes
- name: Add swapfile to fstab
lineinfile:
path: /etc/fstab
regexp: '^/swapfile'
line: '/swapfile none swap sw 0 0'
state: present
become: yes
- name: Get GID of the deploy group
getent:
database: group
key: "{{ deploy_group }}"
register: deploy_group_info
become: yes
- name: Set deploy_group_gid fact
set_fact:
deploy_group_gid: "{{ deploy_group_info.ansible_facts.getent_group[deploy_group][1] }}"
when: deploy_group_info.ansible_facts.getent_group is defined and deploy_group in deploy_group_info.ansible_facts.getent_group
- name: Ensure deploy_group_gid is set to a valid value
set_fact:
deploy_group_gid: "0"
when: deploy_group_gid is not defined or deploy_group_gid == ""
roles:
- airflow-worker
- ytdlp-worker

View File

@ -9,34 +9,17 @@
path: "{{ airflow_master_dir }}" path: "{{ airflow_master_dir }}"
state: directory state: directory
owner: "{{ ssh_user }}" owner: "{{ ssh_user }}"
group: "{{ deploy_group }}" group: ytdl
mode: '0755' mode: '0755'
become: yes become: yes
when: not master_dir_stat.stat.exists when: not master_dir_stat.stat.exists
- name: Ensure Airflow master configs directory exists
file:
path: "{{ airflow_master_dir }}/configs"
state: directory
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
mode: '0755'
become: yes
- name: Ensure Airflow master config directory exists
file:
path: "{{ airflow_master_dir }}/config"
state: directory
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
mode: '0755'
become: yes
- name: Check if source directories exist
stat:
path: "../{{ item }}"
register: source_dirs
loop:
- "airflow/inputfiles"
- "airflow/plugins"
- "airflow/addfiles"
- "airflow/bgutil-ytdlp-pot-provider"
@ -55,53 +38,23 @@
rsync_opts: "{{ rsync_default_opts }}" rsync_opts: "{{ rsync_default_opts }}"
loop: loop:
- "airflow/Dockerfile" - "airflow/Dockerfile"
- "airflow/Dockerfile.caddy"
- "airflow/.dockerignore" - "airflow/.dockerignore"
- "airflow/docker-compose-master.yaml"
- "airflow/dags" - "airflow/dags"
- "airflow/inputfiles" - "airflow/config"
- "setup.py" - "yt_ops_package/setup.py"
- "yt_ops_services" - "yt_ops_package/yt_ops_services"
- "thrift_model" - "yt_ops_package/thrift_model"
- "VERSION" - "yt_ops_package/VERSION"
- "yt_ops_package/pangramia"
- "airflow/init-airflow.sh"
- "airflow/update-yt-dlp.sh" - "airflow/update-yt-dlp.sh"
- "get_info_json_client.py" - "airflow/nginx.conf"
- "proxy_manager_client.py" - "yt_ops_package/get_info_json_client.py"
- "yt_ops_package/proxy_manager_client.py"
- "token_generator" - "token_generator"
- "utils" - "utils"
- name: Copy custom Python config files to master
copy:
src: "../airflow/config/{{ item }}"
dest: "{{ airflow_master_dir }}/config/{{ item }}"
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
mode: '0644'
become: yes
loop:
- "custom_task_hooks.py"
- "airflow_local_settings.py"
- name: Copy airflow.cfg to master
copy:
src: "../airflow/airflow.cfg"
dest: "{{ airflow_master_dir }}/config/airflow.cfg"
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
mode: '0644'
become: yes
- name: Sync Airflow master config files
synchronize:
src: "../airflow/configs/{{ item }}"
dest: "{{ airflow_master_dir }}/configs/"
archive: yes
recursive: yes
rsync_path: "sudo rsync"
rsync_opts: "{{ rsync_default_opts }}"
loop:
- "nginx.conf"
- "Caddyfile"
- name: Sync optional directories if they exist - name: Sync optional directories if they exist
synchronize: synchronize:
src: "../{{ item.item }}/" src: "../{{ item.item }}/"
@ -116,7 +69,7 @@
- name: Sync pangramia thrift files - name: Sync pangramia thrift files
synchronize: synchronize:
src: "../thrift_model/gen_py/pangramia/" src: "../yt_ops_package/thrift_model/gen_py/pangramia/"
dest: "{{ airflow_master_dir }}/pangramia/" dest: "{{ airflow_master_dir }}/pangramia/"
archive: yes archive: yes
recursive: yes recursive: yes
@ -124,58 +77,42 @@
rsync_path: "sudo rsync" rsync_path: "sudo rsync"
rsync_opts: "{{ rsync_default_opts }}" rsync_opts: "{{ rsync_default_opts }}"
- name: Template docker-compose file for master - name: Create .env file for Airflow master service
template: template:
src: "{{ playbook_dir }}/../airflow/configs/docker-compose-master.yaml.j2" src: "../../templates/.env.master.j2"
dest: "{{ airflow_master_dir }}/configs/docker-compose-master.yaml" dest: "{{ airflow_master_dir }}/.env"
mode: "{{ file_permissions }}" mode: "{{ file_permissions }}"
owner: "{{ ssh_user }}" owner: "{{ ssh_user }}"
group: "{{ deploy_group }}" group: ytdl
become: yes become: yes
vars:
service_role: "master"
- name: Template Redis connection file - name: Template Minio connection file
template: template:
src: "../airflow/config/redis_default_conn.json.j2" src: "../templates/minio_default_conn.json.j2"
dest: "{{ airflow_master_dir }}/config/redis_default_conn.json"
mode: "{{ file_permissions }}"
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
become: yes
- name: Template Minio connection file for master
template:
src: "../airflow/config/minio_default_conn.json.j2"
dest: "{{ airflow_master_dir }}/config/minio_default_conn.json" dest: "{{ airflow_master_dir }}/config/minio_default_conn.json"
mode: "{{ file_permissions }}" mode: "{{ file_permissions }}"
owner: "{{ ssh_user }}" owner: "{{ ssh_user }}"
group: "{{ deploy_group }}" group: ytdl
become: yes become: yes
- name: Ensure config directory is group-writable for Airflow initialization - name: Template YT-DLP Redis connection file
file: template:
path: "{{ airflow_master_dir }}/config" src: "../templates/ytdlp_redis_conn.json.j2"
state: directory dest: "{{ airflow_master_dir }}/config/ytdlp_redis_conn.json"
mode: '0775' mode: "{{ file_permissions }}"
owner: "{{ ssh_user }}" owner: "{{ ssh_user }}"
group: "{{ deploy_group }}" group: ytdl
become: yes
- name: Ensure airflow.cfg is group-writable for Airflow initialization
file:
path: "{{ airflow_master_dir }}/config/airflow.cfg"
state: file
mode: '0664'
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
become: yes become: yes
- name: Create symlink for docker-compose.yaml
file:
src: "{{ airflow_master_dir }}/configs/docker-compose-master.yaml" src: "{{ airflow_master_dir }}/docker-compose-master.yaml"
dest: "{{ airflow_master_dir }}/docker-compose.yaml"
state: link
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}" group: ytdl
force: yes
follow: no
@ -184,34 +121,10 @@
path: "{{ airflow_master_dir }}" path: "{{ airflow_master_dir }}"
state: directory state: directory
owner: "{{ ssh_user }}" owner: "{{ ssh_user }}"
group: "{{ deploy_group }}" group: ytdl
recurse: yes recurse: yes
become: yes become: yes
- name: Ensure logs directory exists on master
file:
path: "{{ airflow_master_dir }}/logs"
state: directory
owner: "{{ airflow_uid }}"
group: "{{ deploy_group }}"
mode: '0775'
become: yes
- name: Ensure postgres-data directory exists on master and has correct permissions
file:
path: "{{ airflow_master_dir }}/postgres-data"
state: directory
owner: "{{ airflow_uid }}"
group: "{{ deploy_group }}"
mode: '0775'
become: yes
- name: Set group-writable and setgid permissions on master logs directory contents
shell: |
find {{ airflow_master_dir }}/logs -type d -exec chmod g+rws {} +
find {{ airflow_master_dir }}/logs -type f -exec chmod g+rw {} +
become: yes
- name: Verify Dockerfile exists in build directory
stat:
path: "{{ airflow_master_dir }}/Dockerfile"
@ -234,21 +147,19 @@
dockerfile: "Dockerfile" # Explicitly specify the Dockerfile name
source: build
force_source: true
when: not fast_deploy | default(false)
- name: "Log: Building Caddy reverse proxy image" - name: Make Airflow init script executable
debug: file:
msg: "Building the Caddy image (pangramia/ytdlp-ops-caddy:latest) to serve static assets." path: "{{ airflow_master_dir }}/init-airflow.sh"
mode: "0755"
become: yes
- name: Build Caddy image - name: Run Airflow init script
community.docker.docker_image: shell:
name: "pangramia/ytdlp-ops-caddy:latest" cmd: "./init-airflow.sh"
build: chdir: "{{ airflow_master_dir }}"
path: "{{ airflow_master_dir }}" become: yes
dockerfile: "Dockerfile.caddy" become_user: "{{ ssh_user }}"
source: build
force_source: true
when: not fast_deploy | default(false)
- name: "Log: Starting Airflow services" - name: "Log: Starting Airflow services"
debug: debug:
@ -258,7 +169,6 @@
community.docker.docker_compose_v2: community.docker.docker_compose_v2:
project_src: "{{ airflow_master_dir }}" project_src: "{{ airflow_master_dir }}"
files: files:
- "configs/docker-compose-master.yaml" - "docker-compose-master.yaml"
state: present state: present
remove_orphans: true remove_orphans: true
pull: "{{ 'never' if fast_deploy | default(false) else 'missing' }}"

View File

@ -9,29 +9,11 @@
path: "{{ airflow_worker_dir }}" path: "{{ airflow_worker_dir }}"
state: directory state: directory
owner: "{{ ssh_user }}" owner: "{{ ssh_user }}"
group: "{{ deploy_group }}" group: ytdl
mode: '0755' mode: '0755'
become: yes become: yes
when: not worker_dir_stat.stat.exists when: not worker_dir_stat.stat.exists
- name: Ensure Airflow worker configs directory exists
file:
path: "{{ airflow_worker_dir }}/configs"
state: directory
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
mode: '0755'
become: yes
- name: Ensure Airflow worker config directory exists
file:
path: "{{ airflow_worker_dir }}/config"
state: directory
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
mode: '0755'
become: yes
- name: "Log: Syncing Airflow core files" - name: "Log: Syncing Airflow core files"
debug: debug:
msg: "Syncing DAGs, configs, and Python source code to the worker node." msg: "Syncing DAGs, configs, and Python source code to the worker node."
@ -48,50 +30,25 @@
- "airflow/Dockerfile" - "airflow/Dockerfile"
- "airflow/.dockerignore" - "airflow/.dockerignore"
- "airflow/dags" - "airflow/dags"
- "airflow/inputfiles" - "airflow/config"
- "setup.py" - "yt_ops_package/setup.py"
- "yt_ops_services" - "yt_ops_package/yt_ops_services"
- "thrift_model" - "yt_ops_package/thrift_model"
- "VERSION" - "yt_ops_package/VERSION"
- "yt_ops_package/pangramia"
- "airflow/init-airflow.sh"
- "airflow/update-yt-dlp.sh" - "airflow/update-yt-dlp.sh"
- "get_info_json_client.py" - "yt_ops_package/get_info_json_client.py"
- "proxy_manager_client.py" - "yt_ops_package/proxy_manager_client.py"
- "token_generator" - "token_generator"
- "utils" - "utils"
- name: Copy custom Python config files to worker
copy:
src: "../airflow/config/{{ item }}"
dest: "{{ airflow_worker_dir }}/config/{{ item }}"
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
mode: '0644'
become: yes
loop:
- "custom_task_hooks.py"
- "airflow_local_settings.py"
- name: Ensure any existing airflow.cfg directory is removed
file:
path: "{{ airflow_worker_dir }}/config/airflow.cfg"
state: absent
become: yes
ignore_errors: yes
- name: Copy airflow.cfg to worker
copy:
src: "../airflow/airflow.cfg"
dest: "{{ airflow_worker_dir }}/config/airflow.cfg"
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
mode: '0644'
become: yes
- name: Check if source directories exist
stat:
path: "../{{ item }}"
register: source_dirs
loop:
- "airflow/inputfiles"
- "airflow/plugins"
- "airflow/addfiles"
- "airflow/bgutil-ytdlp-pot-provider"
@ -110,7 +67,7 @@
- name: Sync pangramia thrift files - name: Sync pangramia thrift files
synchronize: synchronize:
src: "../thrift_model/gen_py/pangramia/" src: "../yt_ops_package/thrift_model/gen_py/pangramia/"
dest: "{{ airflow_worker_dir }}/pangramia/" dest: "{{ airflow_worker_dir }}/pangramia/"
archive: yes archive: yes
recursive: yes recursive: yes
@ -118,61 +75,33 @@
rsync_path: "sudo rsync" rsync_path: "sudo rsync"
rsync_opts: "{{ rsync_default_opts }}" rsync_opts: "{{ rsync_default_opts }}"
- name: Ensure config directory is group-writable for Airflow initialization
file:
path: "{{ airflow_worker_dir }}/config"
state: directory
mode: '0775'
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
become: yes
- name: Ensure airflow.cfg is group-writable for Airflow initialization
file:
path: "{{ airflow_worker_dir }}/config/airflow.cfg"
state: file
mode: '0664'
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
become: yes
- name: Template docker-compose file for worker - name: Template docker-compose file for worker
template: template:
src: "{{ playbook_dir }}/../airflow/configs/docker-compose-dl.yaml.j2" src: "{{ playbook_dir }}/../airflow/docker-compose-dl.yaml.j2"
dest: "{{ airflow_worker_dir }}/configs/docker-compose-dl.yaml" dest: "{{ airflow_worker_dir }}/docker-compose-dl.yaml"
mode: "{{ file_permissions }}" mode: "{{ file_permissions }}"
owner: "{{ ssh_user }}" owner: "{{ ssh_user }}"
group: "{{ deploy_group }}" group: ytdl
become: yes become: yes
- name: Create .env file for Airflow worker service - name: Create .env file for Airflow worker service
template: template:
src: "../../templates/.env.j2" src: "../../templates/.env.worker.j2"
dest: "{{ airflow_worker_dir }}/.env" dest: "{{ airflow_worker_dir }}/.env"
mode: "{{ file_permissions }}" mode: "{{ file_permissions }}"
owner: "{{ ssh_user }}" owner: "{{ ssh_user }}"
group: "{{ deploy_group }}" group: ytdl
become: yes become: yes
vars: vars:
service_role: "worker" service_role: "worker"
server_identity: "ytdlp-ops-service-worker-{{ inventory_hostname }}"
- name: Template Minio connection file for worker
template:
src: "../airflow/config/minio_default_conn.json.j2"
dest: "{{ airflow_worker_dir }}/config/minio_default_conn.json"
mode: "{{ file_permissions }}"
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
become: yes
- name: Create symlink for docker-compose.yaml - name: Create symlink for docker-compose.yaml
file: file:
src: "{{ airflow_worker_dir }}/configs/docker-compose-dl.yaml" src: "{{ airflow_worker_dir }}/docker-compose-dl.yaml"
dest: "{{ airflow_worker_dir }}/docker-compose.yaml" dest: "{{ airflow_worker_dir }}/docker-compose.yaml"
state: link state: link
owner: "{{ ssh_user }}" owner: "{{ ssh_user }}"
group: "{{ deploy_group }}" group: ytdl
follow: no follow: no
- name: Ensure correct permissions for build context - name: Ensure correct permissions for build context
@ -180,25 +109,10 @@
path: "{{ airflow_worker_dir }}" path: "{{ airflow_worker_dir }}"
state: directory state: directory
owner: "{{ ssh_user }}" owner: "{{ ssh_user }}"
group: "{{ deploy_group }}" group: ytdl
recurse: yes recurse: yes
become: yes become: yes
- name: Ensure logs directory exists on worker
file:
path: "{{ airflow_worker_dir }}/logs"
state: directory
owner: "{{ airflow_uid }}"
group: "{{ deploy_group }}"
mode: '0775'
become: yes
- name: Set group-writable and setgid permissions on worker logs directory contents
shell: |
find {{ airflow_worker_dir }}/logs -type d -exec chmod g+rws {} +
find {{ airflow_worker_dir }}/logs -type f -exec chmod g+rw {} +
become: yes
- name: Verify Dockerfile exists in build directory
stat:
path: "{{ airflow_worker_dir }}/Dockerfile"
@ -221,7 +135,19 @@
dockerfile: "Dockerfile"
source: build
force_source: true
when: not fast_deploy | default(false)
- name: Make Airflow init script executable
file:
path: "{{ airflow_worker_dir }}/init-airflow.sh"
mode: "0755"
become: yes
- name: Run Airflow init script
shell:
cmd: "./init-airflow.sh"
chdir: "{{ airflow_worker_dir }}"
become: yes
become_user: "{{ ssh_user }}"
- name: "Log: Starting Airflow services" - name: "Log: Starting Airflow services"
debug: debug:
@ -231,7 +157,6 @@
community.docker.docker_compose_v2: community.docker.docker_compose_v2:
project_src: "{{ airflow_worker_dir }}" project_src: "{{ airflow_worker_dir }}"
files: files:
- "configs/docker-compose-dl.yaml" - "docker-compose-dl.yaml"
state: present state: present
remove_orphans: true remove_orphans: true
pull: "{{ 'never' if fast_deploy | default(false) else 'missing' }}"

View File

@ -9,81 +9,62 @@
path: "{{ airflow_master_dir }}" path: "{{ airflow_master_dir }}"
state: directory state: directory
owner: "{{ ssh_user }}" owner: "{{ ssh_user }}"
group: "{{ deploy_group }}" group: ytdl
mode: '0755' mode: '0755'
become: yes become: yes
when: not master_dir_stat.stat.exists when: not master_dir_stat.stat.exists
- name: Ensure YT-DLP master configs directory exists
file:
path: "{{ airflow_master_dir }}/configs"
state: directory
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
mode: '0755'
become: yes
- name: "Log: Syncing YT-DLP service files" - name: "Log: Syncing YT-DLP service files"
debug: debug:
msg: "Syncing YT-DLP service components (config generator, envoy/camoufox templates) to the master node." msg: "Syncing YT-DLP service components (config generator, envoy/camoufox templates) to the master node."
- name: Sync YT-DLP config generator to master - name: Sync YT-DLP service files to master
synchronize: synchronize:
src: "../airflow/generate_envoy_config.py" src: "../{{ item }}"
dest: "{{ airflow_master_dir }}/" dest: "{{ airflow_master_dir }}/"
archive: yes archive: yes
rsync_path: "sudo rsync"
rsync_opts: "{{ rsync_default_opts }}"
- name: Sync YT-DLP config files to master
synchronize:
src: "../airflow/configs/{{ item }}"
dest: "{{ airflow_master_dir }}/configs/"
archive: yes
recursive: yes recursive: yes
rsync_path: "sudo rsync" rsync_path: "sudo rsync"
rsync_opts: "{{ rsync_default_opts }}" rsync_opts: "{{ rsync_default_opts }}"
loop: loop:
- "docker-compose-ytdlp-ops.yaml.j2" - "airflow/docker-compose-ytdlp-ops.yaml.j2"
- "docker-compose.config-generate.yaml" - "airflow/docker-compose.config-generate.yaml"
- "envoy.yaml.j2" - "airflow/generate_envoy_config.py"
- "airflow/init-yt-service.sh"
- "airflow/envoy.yaml.j2"
- name: Create .env file for YT-DLP master service - name: Create .env file for YT-DLP master service
template: template:
src: "../../templates/.env.j2" src: "../../templates/.env.master.j2"
dest: "{{ airflow_master_dir }}/.env" dest: "{{ airflow_master_dir }}/.env"
mode: "{{ file_permissions }}" mode: "{{ file_permissions }}"
owner: "{{ ssh_user }}" owner: "{{ ssh_user }}"
group: "{{ deploy_group }}" group: ytdl
become: yes become: yes
vars: vars:
service_role: "management" service_role: "master"
server_identity: "ytdlp-ops-service-mgmt" server_identity: "ytdlp-ops-service-mgmt"
- name: Template docker-compose file for YT-DLP master service - name: Make YT-DLP service init script executable
template: file:
src: "../airflow/configs/docker-compose-ytdlp-ops.yaml.j2" path: "{{ airflow_master_dir }}/init-yt-service.sh"
dest: "{{ airflow_master_dir }}/configs/docker-compose-ytdlp-ops.yaml" mode: "0755"
mode: "{{ file_permissions }}"
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
become: yes become: yes
vars:
service_role: "management" - name: Run YT-DLP service init script
shell:
cmd: "./init-yt-service.sh"
chdir: "{{ airflow_master_dir }}"
become: yes
become_user: "{{ ssh_user }}"
- name: "Log: Generating YT-DLP service configurations" - name: "Log: Generating YT-DLP service configurations"
debug: debug:
msg: "Running the configuration generator script inside a temporary Docker container. This creates docker-compose and envoy files based on .env variables." msg: "Running the configuration generator script inside a temporary Docker container. This creates docker-compose and envoy files based on .env variables."
- name: Ensure envoy.yaml is removed before generation
file:
path: "{{ airflow_master_dir }}/envoy.yaml"
state: absent
become: yes
- name: Generate YT-DLP service configurations
shell:
cmd: "docker compose --project-directory . --env-file .env -f configs/docker-compose.config-generate.yaml run --rm config-generator" cmd: "docker compose -f docker-compose.config-generate.yaml run --rm config-generator"
chdir: "{{ airflow_master_dir }}"
become: yes
become_user: "{{ ssh_user }}"
@ -92,30 +73,6 @@
community.docker.docker_image: community.docker.docker_image:
name: "{{ ytdlp_ops_image }}" name: "{{ ytdlp_ops_image }}"
source: pull source: pull
when: not fast_deploy | default(false)
- name: Ensure correct permissions for build context after generation
file:
path: "{{ airflow_master_dir }}"
state: directory
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
recurse: yes
become: yes
- name: Create dummy camoufox compose file for master to prevent errors
copy:
content: |
# This is a placeholder file.
# The master node does not run Camoufox, but the shared docker-compose-ytdlp-ops.yaml
# may unconditionally include this file, causing an error if it's missing.
# This file provides an empty services block to satisfy the include.
services: {}
dest: "{{ airflow_master_dir }}/configs/docker-compose.camoufox.yaml"
mode: "{{ file_permissions }}"
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
become: yes
- name: "Log: Starting YT-DLP management service" - name: "Log: Starting YT-DLP management service"
debug: debug:
@ -125,7 +82,6 @@
community.docker.docker_compose_v2:
project_src: "{{ airflow_master_dir }}"
files:
- "configs/docker-compose-ytdlp-ops.yaml" - "docker-compose-ytdlp-ops.yaml"
state: present
remove_orphans: true
pull: "{{ 'never' if fast_deploy | default(false) else 'missing' }}"

View File

@ -9,20 +9,11 @@
path: "{{ airflow_worker_dir }}" path: "{{ airflow_worker_dir }}"
state: directory state: directory
owner: "{{ ssh_user }}" owner: "{{ ssh_user }}"
group: "{{ deploy_group }}" group: ytdl
mode: '0755' mode: '0755'
become: yes become: yes
when: not worker_dir_stat.stat.exists when: not worker_dir_stat.stat.exists
- name: Ensure YT-DLP worker configs directory exists
file:
path: "{{ airflow_worker_dir }}/configs"
state: directory
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
mode: '0755'
become: yes
- name: "Log: Syncing YT-DLP service files" - name: "Log: Syncing YT-DLP service files"
debug: debug:
msg: "Syncing YT-DLP service components (config generator, envoy/camoufox templates) to the worker node." msg: "Syncing YT-DLP service components (config generator, envoy/camoufox templates) to the worker node."
@ -36,42 +27,37 @@
rsync_path: "sudo rsync" rsync_path: "sudo rsync"
rsync_opts: "{{ rsync_default_opts }}" rsync_opts: "{{ rsync_default_opts }}"
loop: loop:
- "airflow/docker-compose-ytdlp-ops.yaml.j2"
- "airflow/docker-compose.config-generate.yaml"
- "airflow/generate_envoy_config.py"
- "airflow/init-yt-service.sh"
- "airflow/envoy.yaml.j2"
- "airflow/camoufox" - "airflow/camoufox"
- name: Sync YT-DLP config generator to worker
synchronize:
src: "../airflow/generate_envoy_config.py"
dest: "{{ airflow_worker_dir }}/"
archive: yes
rsync_path: "sudo rsync"
rsync_opts: "{{ rsync_default_opts }}"
- name: Sync YT-DLP config files to worker
synchronize:
src: "../airflow/configs/{{ item }}"
dest: "{{ airflow_worker_dir }}/configs/"
archive: yes
recursive: yes
rsync_path: "sudo rsync"
rsync_opts: "{{ rsync_default_opts }}"
loop:
- "docker-compose-ytdlp-ops.yaml.j2"
- "docker-compose.config-generate.yaml"
- "envoy.yaml.j2"
- "docker-compose.camoufox.yaml.j2"
- name: Create .env file for YT-DLP worker service - name: Create .env file for YT-DLP worker service
template: template:
src: "../../templates/.env.j2" src: "../../templates/.env.worker.j2"
dest: "{{ airflow_worker_dir }}/.env" dest: "{{ airflow_worker_dir }}/.env"
mode: "{{ file_permissions }}" mode: "{{ file_permissions }}"
owner: "{{ ssh_user }}" owner: "{{ ssh_user }}"
group: "{{ deploy_group }}" group: ytdl
become: yes become: yes
vars: vars:
service_role: "worker" service_role: "worker"
server_identity: "ytdlp-ops-service-worker-{{ inventory_hostname }}" server_identity: "ytdlp-ops-service-worker-{{ inventory_hostname }}"
- name: Make YT-DLP service init script executable
file:
path: "{{ airflow_worker_dir }}/init-yt-service.sh"
mode: "0755"
become: yes
- name: Run YT-DLP service init script
shell:
cmd: "./init-yt-service.sh"
chdir: "{{ airflow_worker_dir }}"
become: yes
become_user: "{{ ssh_user }}"
- name: "Log: Generating YT-DLP service configurations" - name: "Log: Generating YT-DLP service configurations"
debug: debug:
@ -79,7 +65,7 @@
- name: Generate YT-DLP service configurations
shell:
cmd: "docker compose --project-directory . --env-file .env -f configs/docker-compose.config-generate.yaml run --rm config-generator" cmd: "docker compose -f docker-compose.config-generate.yaml run --rm config-generator"
chdir: "{{ airflow_worker_dir }}"
become: yes
become_user: "{{ ssh_user }}"
@ -88,7 +74,6 @@
community.docker.docker_image: community.docker.docker_image:
name: "{{ ytdlp_ops_image }}" name: "{{ ytdlp_ops_image }}"
source: pull source: pull
when: not fast_deploy | default(false)
- name: "Log: Building Camoufox (remote browser) image" - name: "Log: Building Camoufox (remote browser) image"
debug: debug:
@ -101,16 +86,6 @@
path: "{{ airflow_worker_dir }}/camoufox" path: "{{ airflow_worker_dir }}/camoufox"
source: build source: build
force_source: true force_source: true
when: not fast_deploy | default(false)
- name: Ensure correct permissions for build context after generation
file:
path: "{{ airflow_worker_dir }}"
state: directory
owner: "{{ ssh_user }}"
group: "{{ deploy_group }}"
recurse: yes
become: yes
- name: "Log: Starting YT-DLP worker services" - name: "Log: Starting YT-DLP worker services"
debug: debug:
@ -120,8 +95,6 @@
community.docker.docker_compose_v2:
project_src: "{{ airflow_worker_dir }}"
files:
- "configs/docker-compose-ytdlp-ops.yaml" - "docker-compose-ytdlp-ops.yaml"
- "configs/docker-compose.camoufox.yaml"
state: present
remove_orphans: true
pull: "{{ 'never' if fast_deploy | default(false) else 'missing' }}"

View File

@ -0,0 +1,14 @@
# This file is managed by Ansible.
AIRFLOW_UID={{ airflow_uid | default(1003) }}
AIRFLOW_GID=0
HOSTNAME={{ inventory_hostname }}
# Passwords
POSTGRES_PASSWORD={{ postgres_password }}
REDIS_PASSWORD={{ redis_password }}
AIRFLOW_ADMIN_PASSWORD={{ airflow_admin_password }}
# For DL workers, specify the master host IP
{% if 'worker' in service_role %}
MASTER_HOST_IP={{ master_host_ip }}
{% endif %}

View File

@ -0,0 +1,19 @@
HOSTNAME="{{ inventory_hostname }}"
REDIS_PASSWORD="{{ vault_redis_password }}"
POSTGRES_PASSWORD="{{ vault_postgres_password }}"
AIRFLOW_UID={{ airflow_uid }}
AIRFLOW_ADMIN_PASSWORD="{{ vault_airflow_admin_password }}"
YTDLP_BASE_PORT=9090
SERVER_IDENTITY=ytdlp-ops-service-mgmt
SERVICE_ROLE=management
AIRFLOW_GID=0
MINIO_ROOT_USER=admin
MINIO_ROOT_PASSWORD={{ vault_minio_root_password }}
AIRFLOW_VAR_MASTER_HOST_IP={{ hostvars[groups['airflow_master'][0]].ansible_host }}
# S3 Logging Configuration
AIRFLOW_VAR_S3_LOG_BUCKET=your-s3-bucket-name
AIRFLOW_VAR_S3_LOG_FOLDER=airflow-logs/master
AWS_ACCESS_KEY_ID={{ vault_aws_access_key_id | default('') }}
AWS_SECRET_ACCESS_KEY={{ vault_aws_secret_access_key | default('') }}
AWS_DEFAULT_REGION={{ aws_region | default('us-east-1') }}

View File

@ -0,0 +1,29 @@
HOSTNAME="{{ inventory_hostname }}"
MASTER_HOST_IP={{ hostvars[groups['airflow_master'][0]].ansible_host }}
REDIS_PASSWORD="{{ vault_redis_password }}"
POSTGRES_PASSWORD="{{ vault_postgres_password }}"
AIRFLOW_UID={{ airflow_uid }}
REDIS_HOST={{ hostvars[groups['airflow_master'][0]].ansible_host }}
REDIS_PORT=52909
SERVER_IDENTITY=ytdlp-ops-service-worker-{{ inventory_hostname }}
SERVICE_ROLE=worker
ENVOY_PORT=9080
ENVOY_ADMIN_PORT=9901
YTDLP_WORKERS=4
YTDLP_BASE_PORT=9090
CAMOUFOX_PROXIES={{ worker_proxies | join(',') }}
VNC_PASSWORD={{ vault_vnc_password }}
CAMOUFOX_BASE_VNC_PORT=5901
CAMOUFOX_PORT=12345
ACCOUNT_ACTIVE_DURATION_MIN=7
ACCOUNT_COOLDOWN_DURATION_MIN=30
MINIO_ROOT_USER=admin
MINIO_ROOT_PASSWORD={{ vault_minio_root_password }}
AIRFLOW_GID=0
# S3 Logging Configuration
AIRFLOW_VAR_S3_LOG_BUCKET=your-s3-bucket-name
AIRFLOW_VAR_S3_LOG_FOLDER=airflow-logs/workers/{{ inventory_hostname }}
AWS_ACCESS_KEY_ID={{ vault_aws_access_key_id | default('') }}
AWS_SECRET_ACCESS_KEY={{ vault_aws_secret_access_key | default('') }}
AWS_DEFAULT_REGION={{ aws_region | default('us-east-1') }}

View File

@ -1,48 +1,46 @@
# This file is managed by Ansible.
HOSTNAME="{{ inventory_hostname }}"
SERVICE_ROLE={{ service_role }}
{% if server_identity is defined %}
SERVER_IDENTITY={{ server_identity }} SERVER_IDENTITY={{ server_identity }}
{% endif %}
# Passwords # Passwords
REDIS_PASSWORD="{{ vault_redis_password }}" REDIS_PASSWORD="{{ redis_password }}"
POSTGRES_PASSWORD="{{ vault_postgres_password }}" POSTGRES_PASSWORD="{{ postgres_password }}"
# Common settings # Common settings
AIRFLOW_UID={{ airflow_uid | default(1003) }} AIRFLOW_UID={{ airflow_uid | default(1003) }}
AIRFLOW_GID={{ deploy_group_gid | default(1001) }} AIRFLOW_GID=0
YTDLP_BASE_PORT={{ ytdlp_base_port }} YTDLP_BASE_PORT={{ ytdlp_base_port }}
REDIS_PORT={{ redis_port }}
# Master-specific settings # Master-specific settings
{% if 'master' in service_role or 'management' in service_role %} {% if 'master' in service_role %}
AIRFLOW_ADMIN_PASSWORD="{{ vault_airflow_admin_password }}" AIRFLOW_ADMIN_PASSWORD="{{ airflow_admin_password }}"
AIRFLOW_VAR_MASTER_HOST_IP={{ hostvars[groups['airflow_master'][0]].ansible_host }} MINIO_ROOT_USER=admin
MASTER_HOST_IP={{ hostvars[groups['airflow_master'][0]].ansible_host }} MINIO_ROOT_PASSWORD=0153093693-0009
# Camoufox is not used on master, but the config generator expects the variable.
CAMOUFOX_PROXIES=
{% endif %}
# Worker-specific settings
{% if 'worker' in service_role %}
MASTER_HOST_IP={{ hostvars[groups['airflow_master'][0]].ansible_host }} MASTER_HOST_IP={{ master_host_ip }}
REDIS_HOST={{ master_host_ip }}
REDIS_PORT={{ redis_port }}
# --- Envoy & Worker Configuration --- # --- Envoy & Worker Configuration ---
ENVOY_PORT={{ envoy_port }} ENVOY_PORT={{ envoy_port }}
ENVOY_ADMIN_PORT={{ envoy_admin_port }} ENVOY_ADMIN_PORT={{ envoy_admin_port }}
MANAGEMENT_SERVICE_PORT={{ management_service_port }} MANAGEMENT_SERVICE_PORT={{ management_service_port }}
YTDLP_WORKERS=4 YTDLP_WORKERS=1
# --- Camoufox (Browser) Configuration --- # --- Camoufox (Browser) Configuration ---
CAMOUFOX_PROXIES="{{ (worker_proxies | default([])) | join(',') }}" CAMOUFOX_PROXIES="{{ camoufox_proxies }}"
VNC_PASSWORD="{{ vault_vnc_password }}" VNC_PASSWORD="{{ vnc_password }}"
CAMOUFOX_BASE_VNC_PORT={{ camoufox_base_vnc_port }} CAMOUFOX_BASE_VNC_PORT={{ camoufox_base_vnc_port }}
CAMOUFOX_PORT=12345 CAMOUFOX_PORT=12345
# --- General Proxy Configuration ---
SOCKS5_SOCK_SERVER_IP=172.17.0.1
# --- Account Manager Configuration ---
ACCOUNT_ACTIVE_DURATION_MIN=7
ACCOUNT_COOLDOWN_DURATION_MIN=30
{% endif %}

View File

@ -1,8 +1,8 @@
master:
af-green: 89.253.221.173 af-test: 89.253.223.97
workers:
dl003: dl001:
ip: 62.60.245.103 ip: 109.107.189.106
proxies:
- "socks5://sslocal-rust-1087:1087"

View File

@ -1,4 +0,0 @@
# This file is now auto-generated by tools/generate-inventory.py
# Do not edit; put overrides in cluster.yml instead
dl-worker-001: 109.107.189.106

Some files were not shown because too many files have changed in this diff.