V2 System: Separated Auth & Download Flow
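The v2 system splits the process into two distinct stages, each with its own set of queues. The base names for these queues are queue2_auth and queue2_dl. The one exception is the handoff list between the stages, queue_dl_format_tasks, which carries neither prefix.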
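The naming scheme can be sketched as a small helper. The suffixes are the ones that appear in the queue names in this document; the helper name `stage_queues` is hypothetical and is not taken from the codebase.

```python
# Suffixes that appear in the per-stage queue names in this document.
SUFFIXES = ("inbox", "progress", "result", "fail", "skipped")


def stage_queues(base: str) -> dict[str, str]:
    """Map each role to a full queue name, e.g. 'fail' -> 'queue2_auth_fail'."""
    return {suffix: f"{base}_{suffix}" for suffix in SUFFIXES}


auth = stage_queues("queue2_auth")
dl = stage_queues("queue2_dl")

# This document lists no queue2_dl_inbox: the download stage reads from the
# handoff list queue_dl_format_tasks, so that one entry is overridden.
dl["inbox"] = "queue_dl_format_tasks"
```

Here `auth["inbox"]` evaluates to `queue2_auth_inbox` and `dl["fail"]` to `queue2_dl_fail`.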
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1. Authentication Stage (ytdlp_ops_v02_worker_per_url_auth)
This stage is responsible for taking a raw YouTube URL, authenticating with the yt-ops-server to get an info.json, and creating granular download tasks.
• Getting Data (Input):
  • Queue: queue2_auth_inbox
  • Redis Type: LIST
  • Purpose: This is the main entry point for the entire v2 system. Raw YouTube URLs or video IDs are pushed here. The ytdlp_ops_v02_dispatcher_auth DAG pulls URLs from this list to start the process.
• Reporting Results:
  • Success:
    • Queue: queue2_auth_result (Redis HASH) - A success record for the authentication step is stored here.
    • Queue: queue_dl_format_tasks (Redis LIST) - This is the critical handoff queue. Upon successful authentication, the auth worker resolves the desired formats (e.g., bestvideo+bestaudio) into specific format IDs (e.g., 299, 140) and pushes one JSON job payload for each format into this list. This queue feeds the download stage.
  • Failure:
    • Queue: queue2_auth_fail (Redis HASH) - If the authentication fails due to a system error (like bot detection or a proxy failure), the error details are stored here.
  • Skipped:
    • Queue: queue2_auth_skipped (Redis HASH) - If the video is unavailable for a non-system reason (e.g., it's private, deleted, or geo-restricted), the URL is logged here. This is not considered a system failure.
• Tracking Tasks:
  • Queue: queue2_auth_progress
  • Redis Type: HASH
  • Purpose: When an auth worker picks up a URL, it adds an entry to this hash to show that the URL is actively being processed. The entry is removed upon completion (success, failure, or skip).
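The auth-stage lifecycle described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the worker's actual code: `InMemoryStore` stands in for a Redis client (its method names mirror redis-py's `lpush`, `rpop`, `hset`, and `hdel`), and `run_auth_once`, `VideoUnavailable`, the `authenticate` callback, the payload field names, and the LPUSH/RPOP ordering are all hypothetical.

```python
import json


class InMemoryStore:
    """Tiny stand-in for a Redis client; method names follow redis-py."""

    def __init__(self):
        self.lists = {}
        self.hashes = {}

    def lpush(self, name, *values):
        lst = self.lists.setdefault(name, [])
        for value in values:
            lst.insert(0, value)
        return len(lst)

    def rpop(self, name):
        lst = self.lists.get(name)
        return lst.pop() if lst else None

    def hset(self, name, key, value):
        self.hashes.setdefault(name, {})[key] = value

    def hdel(self, name, *keys):
        for key in keys:
            self.hashes.get(name, {}).pop(key, None)


class VideoUnavailable(Exception):
    """Hypothetical marker for private, deleted, or geo-restricted videos."""


def run_auth_once(store, authenticate):
    """Process one URL from queue2_auth_inbox; return the outcome name."""
    url = store.rpop("queue2_auth_inbox")
    if url is None:
        return None
    store.hset("queue2_auth_progress", url, "processing")
    try:
        info_path, format_ids = authenticate(url)
    except VideoUnavailable as exc:  # content issue, not a system failure
        store.hset("queue2_auth_skipped", url, str(exc))
        return "skipped"
    except Exception as exc:  # system error: bot detection, proxy failure
        store.hset("queue2_auth_fail", url, str(exc))
        return "fail"
    else:
        store.hset("queue2_auth_result", url, info_path)
        for fmt in format_ids:  # one job payload per resolved format ID
            job = {"url": url, "info_json_path": info_path, "format_id": fmt}
            store.lpush("queue_dl_format_tasks", json.dumps(job))
        return "result"
    finally:
        # The progress entry is removed on success, failure, or skip.
        store.hdel("queue2_auth_progress", url)


def private_video(url):
    raise VideoUnavailable("private video")


store = InMemoryStore()
store.lpush("queue2_auth_inbox", "https://www.youtube.com/watch?v=VIDEO_ID")
outcome = run_auth_once(store, lambda url: ("/tmp/info.json", ["299", "140"]))

store.lpush("queue2_auth_inbox", "https://www.youtube.com/watch?v=OTHER_ID")
skipped = run_auth_once(store, private_video)
```

After the first call, `queue_dl_format_tasks` holds two payloads (one for format 299, one for 140); after the second, the URL is recorded in `queue2_auth_skipped` and nothing is handed to the download stage.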
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
2. Download Stage (ytdlp_ops_v02_worker_per_url_dl)
This stage is responsible for executing the download and probing of a single media format, based on the job created by the auth worker.
• Getting Data (Input):
  • Queue: queue_dl_format_tasks
  • Redis Type: LIST
  • Purpose: The ytdlp_ops_v02_worker_per_url_dl DAG pulls granular job payloads from this list. Each payload contains everything needed to download a single format (the path to the info.json, the format ID, etc.).
• Reporting Results:
  • Success:
    • Queue: queue2_dl_result (Redis HASH) - A success record for the download of a specific format is stored here.
  • Failure:
    • Queue: queue2_dl_fail (Redis HASH) - If the download or probe fails, the error is logged here. As seen in ytdlp_mgmt_queues.py, these failed items can be requeued, which sends them back to queue2_auth_inbox to start the process over.
  • Skipped:
    • Queue: queue2_dl_skipped (Redis HASH) - Used for unrecoverable download errors (e.g., HTTP 403 Forbidden), similar to the auth stage.
• Tracking Tasks:
  • Queue: queue2_dl_progress
  • Redis Type: HASH
  • Purpose: Tracks download tasks that are actively in progress.
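The requeue path mentioned for queue2_dl_fail can be illustrated with plain Python containers. This is only a sketch and does not reproduce ytdlp_mgmt_queues.py: the function name `requeue_failed`, the hash keys, and the record layout (a JSON value with a `url` field) are assumptions made for the example.

```python
import json


def requeue_failed(fail_hash: dict, inbox: list) -> int:
    """Move failed download records back to the auth inbox.

    fail_hash stands in for the queue2_dl_fail HASH and inbox for the
    queue2_auth_inbox LIST. Returns the number of URLs requeued.
    """
    urls = []
    for key in list(fail_hash):
        record = json.loads(fail_hash.pop(key))  # read, then delete the entry
        if record["url"] not in urls:  # two failed formats, one URL
            urls.append(record["url"])
    inbox.extend(urls)  # a list push in Redis terms
    return len(urls)


video_url = "https://www.youtube.com/watch?v=VIDEO_ID"
dl_fail = {
    "VIDEO_ID:299": json.dumps({"url": video_url, "error": "HTTP 500"}),
    "VIDEO_ID:140": json.dumps({"url": video_url, "error": "probe failed"}),
}
auth_inbox = []
count = requeue_failed(dl_fail, auth_inbox)
```

Because a requeued item re-enters at queue2_auth_inbox rather than at queue_dl_format_tasks, the sketch deduplicates by URL: both failed formats of the same video lead to a single new authentication pass.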
Summary Table (V2)
Queue Name Pattern     Redis Type  Purpose
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
queue2_auth_inbox      LIST        Input for Auth: Holds raw YouTube URLs to be authenticated.
queue2_auth_progress   HASH        Tracks URLs currently being authenticated.
queue2_auth_result     HASH        Stores successful authentication results.
queue2_auth_fail       HASH        Stores failed authentication attempts.
queue2_auth_skipped    HASH        Stores URLs skipped due to content issues (private, deleted, etc.).
queue_dl_format_tasks  LIST        Input for Download: Holds granular download jobs (one per format) created by the auth worker.
queue2_dl_progress     HASH        Tracks download jobs currently in progress.
queue2_dl_result       HASH        Stores successful download results.
queue2_dl_fail         HASH        Stores failed download attempts.
queue2_dl_skipped      HASH        Stores downloads skipped due to unrecoverable errors (e.g., 403 Forbidden).
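Because the table mixes two Redis types, any tool that inspects queue sizes has to pick the matching command. A minimal sketch of that lookup follows; the names `QUEUE_TYPES` and `length_command` are hypothetical, while LLEN and HLEN are the standard Redis commands for list and hash sizes.

```python
# Queue name -> Redis type, copied from the summary table.
QUEUE_TYPES = {
    "queue2_auth_inbox": "LIST",
    "queue2_auth_progress": "HASH",
    "queue2_auth_result": "HASH",
    "queue2_auth_fail": "HASH",
    "queue2_auth_skipped": "HASH",
    "queue_dl_format_tasks": "LIST",
    "queue2_dl_progress": "HASH",
    "queue2_dl_result": "HASH",
    "queue2_dl_fail": "HASH",
    "queue2_dl_skipped": "HASH",
}


def length_command(queue: str) -> str:
    """Return the Redis command that reports the queue's size."""
    return "LLEN" if QUEUE_TYPES[queue] == "LIST" else "HLEN"


# Only the two stage inputs are lists; everything else is a hash.
list_queues = [name for name, kind in QUEUE_TYPES.items() if kind == "LIST"]
```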
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
V1 System (Monolithic) for Contrast
For completeness, the older v1 system (ytdlp_ops_v01_worker_per_url) uses a simpler, monolithic set of queues, typically with the base name video_queue.
• Input: video_queue_inbox (Redis LIST)
• Results: video_queue_result, video_queue_fail, video_queue_skipped (all Redis HASHes)
• In-Progress: video_queue_progress (Redis HASH)
In this model, there is no handoff between stages; a single worker handles both authentication and download for all requested formats of a URL.