# yt-dlp-dags/ytops_client-source/ytops_client/stress_policy/arg_parser.py

import argparse

def add_stress_policy_parser(subparsers):
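    """Add the parser for the 'stress-policy' command.

    Args:
        subparsers: The subparsers object returned by ArgumentParser.add_subparsers().

    Returns:
        The argparse.ArgumentParser created for the 'stress-policy' command.
    """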
    parser = subparsers.add_parser(
        'stress-policy',
        description=(
            "The primary, policy-driven stress-testing orchestrator.\n"
            "It runs complex, multi-stage stress tests based on a YAML policy file.\n"
            "Use '--list-policies' to see available pre-configured scenarios.\n\n"
            "Modes supported:\n"
            "- full_stack: Generate info.json and then download from it.\n"
            "- fetch_only: Only generate info.json files.\n"
            "- download_only: Only download from existing info.json files."
        ),
        formatter_class=argparse.RawTextHelpFormatter,
        help='Run advanced, policy-driven stress tests (recommended).',
        epilog="""
Examples:

1. Fetch info.jsons for a TV client with a single profile and a rate limit:
   ytops-client stress-policy --policy policies/1_fetch_only_policies.yaml \\
     --policy-name tv_downgraded_single_profile \\
     --set settings.urls_file=my_urls.txt \\
     --set execution_control.run_until.minutes=30
   # This runs a 'fetch_only' test using the 'tv_downgraded' client. It uses a single,
   # static profile for all requests and enforces a safety limit of 450 requests per hour.

2. Fetch info.jsons for an Android client using cookies for authentication:
   ytops-client stress-policy --policy policies/1_fetch_only_policies.yaml \\
     --policy-name android_sdkless_with_cookies \\
     --set settings.urls_file=my_urls.txt \\
     --set info_json_generation_policy.request_params.cookies_file_path=/path/to/my_cookies.txt
   # This demonstrates an authenticated 'fetch_only' test. It passes the path to a
   # Netscape cookie file, which the server will use for the requests.

3. Download from a folder of info.jsons, grouped by profile, with auto-workers:
   ytops-client stress-policy --policy policies/2_download_only_policies.yaml \\
     --policy-name basic_profile_aware_download \\
     --set settings.info_json_dir=/path/to/my/infojsons
   # This runs a 'download_only' test. It scans a directory, extracts profile names from
   # the filenames (e.g., 'tv_user_1' from '...-VIDEOID-tv_user_1.json'), and groups
   # them. 'workers=auto' sets the number of workers to the number of unique profiles found.

4. Full-stack test with multiple workers and profile rotation:
   ytops-client stress-policy --policy policies/3_full_stack_policies.yaml \\
     --policy-name tv_simply_profile_rotation \\
     --set settings.urls_file=my_urls.txt \\
     --set execution_control.workers=4 \\
     --set settings.profile_management.max_requests_per_profile=500
   # This runs a 'full_stack' test with 4 parallel workers. Each worker gets a unique
   # profile (e.g., tv_simply_user_0_0, tv_simply_user_1_0, etc.). After a profile is
   # used 500 times, it is retired, and a new "generation" is created (e.g., tv_simply_user_0_1).

5. Full-stack authenticated test with a pool of profiles and corresponding cookie files:
   ytops-client stress-policy --policy policies/3_full_stack_policies.yaml \\
     --policy-name mweb_multi_profile_with_cookies \\
     --set settings.urls_file=my_urls.txt \\
     --set settings.profile_management.cookie_files='["/path/c1.txt","/path/c2.txt"]'
   # This runs a 'full_stack' test using a pool of profiles (e.g., mweb_user_0, mweb_user_1).
   # It uses the 'cookie_files' list to assign a specific cookie file to each profile in the
   # pool, enabling multi-account authenticated testing. Note the JSON/YAML list format for the override.

6. Full-stack test submitting downloads to an aria2c RPC server:
   ytops-client stress-policy --policy policies/3_full_stack_policies.yaml \\
     --policy-name tv_simply_profile_rotation_aria2c_rpc \\
     --set settings.urls_file=my_urls.txt \\
     --set download_policy.aria_host=192.168.1.100 \\
     --set download_policy.aria_port=6801
   # This runs a test where downloads are not performed by the worker itself, but are
   # sent to a remote aria2c daemon. The policy specifies 'downloader: aria2c_rpc'
   # and provides connection details. This is useful for offloading download traffic.

--------------------------------------------------------------------------------
Overridable Policy Parameters via --set:

  Key                                     Description
  --------------------------------------  ------------------------------------------------
  [settings]
  settings.mode                           Test mode: 'full_stack', 'fetch_only', or 'download_only'.
  settings.urls_file                      Path to file with URLs/video IDs.
  settings.info_json_dir                  Path to directory with existing info.json files.
  settings.profile_extraction_regex       For 'download_only' stats, a regex to extract profile names from info.json filenames. The first capture group is used as the profile name. E.g., '.*-(.*?)\\.json'.
  settings.info_json_dir_sample_percent   Randomly sample this percentage of files from the directory (for 'once' scan mode).
  settings.directory_scan_mode            For 'download_only': 'once' (default) or 'continuous' to watch for new files.
  settings.mark_processed_files           For 'continuous' scan mode: if true, rename processed files to '*.<timestamp>.processed' to avoid reprocessing.
  settings.max_files_per_cycle            For 'continuous' scan mode: max new files to process per cycle.
  settings.sleep_if_no_new_files_seconds  For 'continuous' scan mode: seconds to sleep if no new files are found (default: 10).
  settings.profile_prefix                 (Legacy) Prefix for profile names (e.g., 'test_user').
  settings.profile_pool                   (Legacy) Size of the profile pool.
  settings.profile_mode                   Profile strategy. 'per_request' (legacy), 'per_worker' (legacy), or 'per_worker_with_rotation' (requires profile_management).
  settings.info_json_script               Command to run the info.json generation script (e.g., 'bin/ytops-client get-info').
  settings.save_info_json_dir             If set, save all successfully generated info.json files to this directory.

  [settings.profile_management] (New, preferred method for profile control)
  settings.profile_management.prefix      Prefix for profile names (e.g., 'dyn_user').
  settings.profile_management.suffix      Suffix for profile names. Set to 'auto' for a timestamp, or provide a string.
  settings.profile_management.initial_pool_size  The number of profiles to start with.
  settings.profile_management.auto_expand_pool  If true, create new profiles when the initial pool is exhausted (all sleeping).
  settings.profile_management.max_requests_per_profile  Max requests a profile can make before it must 'sleep'.
  settings.profile_management.sleep_minutes_on_exhaustion  How many minutes a profile 'sleeps' after hitting its request limit.
  settings.profile_management.cookie_files  A list of paths to cookie files. Used to assign a unique cookie file to each profile in a pool.

  [execution_control]
  execution_control.workers               Number of parallel worker threads. Set to "auto" to calculate from target_rate.
  execution_control.target_rate.requests  Target requests for 'auto' workers calculation.
  execution_control.target_rate.per_minutes Period in minutes for target_rate.
  execution_control.run_until.minutes     Stop test after N minutes. Will continuously cycle through sources.
  execution_control.run_until.cycles      Stop test after N cycles. A cycle is one full pass through all sources.
  execution_control.run_until.requests    Stop test after N total info.json requests (cumulative across runs).
  execution_control.sleep_between_tasks.{min,max}_seconds  Min/max sleep time between tasks, per worker.

  [info_json_generation_policy]
  info_json_generation_policy.client      Client to use (e.g., 'mweb', 'tv_camoufox').
  info_json_generation_policy.auth_host   Host for the auth/Thrift service.
  info_json_generation_policy.auth_port   Port for the auth/Thrift service.
  info_json_generation_policy.assigned_proxy_url A specific proxy to use for a request, overriding the server's proxy pool.
  info_json_generation_policy.proxy_rename  Regex substitution for the assigned proxy URL (e.g., 's/old/new/').
  info_json_generation_policy.command_template  A full command template for the info.json script. Overrides other keys.
  info_json_generation_policy.rate_limits.per_ip.max_requests   Max requests for the given time period from one IP.
  info_json_generation_policy.rate_limits.per_ip.per_minutes    Time period in minutes for the per_ip rate limit.
  info_json_generation_policy.rate_limits.per_profile.max_requests Max requests for a single profile in a time period.
  info_json_generation_policy.rate_limits.per_profile.per_minutes Time period in minutes for the per_profile rate limit.
  info_json_generation_policy.client_rotation_policy.major_client   The primary client to use for most requests.
  info_json_generation_policy.client_rotation_policy.refresh_client The client to use periodically to refresh context.
  info_json_generation_policy.client_rotation_policy.refresh_every.requests  Trigger refresh client after N requests for a profile.

  [download_policy]
  download_policy.formats                 Formats to download (e.g., '18,140', 'random:50%').
  download_policy.downloader              Orchestrator script to use: 'native-py' (default, Python lib), 'native-cli' (legacy CLI wrapper), or 'aria2c_rpc'.
  download_policy.external_downloader     For 'native-py' or default, the backend yt-dlp should use (e.g., 'aria2c', 'native').
  download_policy.downloader_args         Arguments for the external_downloader. For yt-dlp, e.g., 'aria2c:-x 8'.
  download_policy.merge_output_format     Container to merge to (e.g., 'mkv'). Defaults to 'mp4' via cli.config.
  download_policy.temp_path               For 'native-py', path to a directory for temporary files (e.g., a RAM disk like /dev/shm).
  download_policy.output_to_buffer        For 'native-py', download to an in-memory buffer and pipe to stdout instead of saving to a file (true/false). Best for single-file formats.
  download_policy.proxy                   Proxy for direct downloads (e.g., "socks5://127.0.0.1:1080").
  download_policy.proxy_rename            Regex substitution for the proxy URL (e.g., 's/old/new/').
  download_policy.pause_before_download_seconds Pause for N seconds before starting each download attempt.
  download_policy.continue_downloads      Enable download continuation (true/false).
  download_policy.cleanup                 After success, replace downloaded media file with a zero-byte '.empty' file.
  download_policy.run_ffprobe             After success, run ffprobe on the media file and save stream info to a .ffprobe.json file.
  download_policy.extra_args              A string of extra arguments for the download script (e.g., "--limit-rate 5M").
  download_policy.sleep_per_proxy_seconds Cooldown in seconds between downloads on the same proxy.
  download_policy.rate_limits.per_proxy.max_requests Max downloads for a single proxy in a time period.
  download_policy.rate_limits.per_proxy.per_minutes Time period in minutes for the per_proxy download rate limit.
  # For downloader: 'aria2c_rpc'
  download_policy.aria_host               Hostname of the aria2c RPC server.
  download_policy.aria_port               Port of the aria2c RPC server.
  download_policy.aria_secret             Secret token for the aria2c RPC server.
  download_policy.aria_wait               Wait for aria2c downloads to complete (true/false).
  download_policy.purge_on_complete       On success, purge ALL completed/failed downloads from aria2c history. Use as a workaround for older aria2c versions where targeted removal fails.
  download_policy.output_dir              Output directory for downloads.
  download_policy.aria_remote_dir         The absolute download path on the remote aria2c host.
  download_policy.aria_fragments_dir      The local path to find fragments for merging (if different from output_dir).
  download_policy.auto_merge_fragments    For fragmented downloads, automatically merge parts after download (true/false). Requires aria_wait=true.
  download_policy.remove_fragments_after_merge For fragmented downloads, delete fragment files after a successful merge (true/false). Requires auto_merge_fragments=true.

  [stop_conditions]
  stop_conditions.on_failure              Stop on any download failure (true/false).
  stop_conditions.on_http_403             Stop on any HTTP 403 error (true/false).
  stop_conditions.on_error_rate.max_errors  Stop test if more than N errors (of any type) occur within the time period.
  stop_conditions.on_error_rate.per_minutes Time period in minutes for the error rate calculation.
  stop_conditions.fatal_error_patterns A list of regex patterns. Errors matching these are always considered fatal and count towards 'on_error_rate', even if they also match a tolerated pattern.
  stop_conditions.tolerated_error_patterns A list of regex patterns. Fetch errors matching these will be ignored by 'on_error_rate'.
  stop_conditions.on_cumulative_403.max_errors Stop test if more than N HTTP 403 errors occur within the time period.
  stop_conditions.on_cumulative_403.per_minutes Time period in minutes for the cumulative 403 calculation.
  stop_conditions.on_quality_degradation.trigger_if_missing_formats A format ID or comma-separated list of IDs. Triggers if any are missing.
  stop_conditions.on_quality_degradation.max_triggers Stop test if quality degradation is detected N times.
  stop_conditions.on_quality_degradation.per_minutes Time period in minutes for the quality degradation calculation.
--------------------------------------------------------------------------------
"""
    )
    parser.add_argument('--policy', help='Path to the YAML policy file. Required unless --list-policies is used.')
    parser.add_argument('--policy-name', help='Name of the policy to run from a multi-policy file (if it contains "---" separators).')
    parser.add_argument('--list-policies', action='store_true', help='List all available policies from the default policies directory and exit.')
    parser.add_argument('--show-overrides', action='store_true', help='Load the specified policy and print all its defined values as a single line of --set arguments, then exit.')
    parser.add_argument('--set', action='append', default=[], help="Override a policy setting using 'key.subkey=value' format.\n(e.g., --set execution_control.workers=5)")
    parser.add_argument('--profile-prefix', help="Shortcut to override the profile prefix for profile locking mode. Affects both auth and download stages.")
    parser.add_argument('--start-from-url-index', type=int, help='Start processing from this line number (1-based) in the urls_file. Overrides saved state.')
    parser.add_argument('--expire-time-shift-minutes', type=int, help="Treat URLs that expire within the next N minutes as already expired. Overrides policy.")

    # Add a group for aria2c-specific overrides for clarity in --help
    aria_group = parser.add_argument_group('Aria2c RPC Downloader Overrides', 'Shortcuts for common --set options for the aria2c_rpc downloader.')
    aria_group.add_argument('--auto-merge-fragments', action=argparse.BooleanOptionalAction, default=None, help='Shortcut to enable/disable download_policy.auto_merge_fragments.')
    aria_group.add_argument('--remove-fragments-after-merge', action=argparse.BooleanOptionalAction, default=None, help='Shortcut to enable/disable download_policy.remove_fragments_after_merge.')
    aria_group.add_argument('--fragments-dir', help='Shortcut for --set download_policy.aria_fragments_dir=PATH.')
    aria_group.add_argument('--remote-dir', help='Shortcut for --set download_policy.aria_remote_dir=PATH.')
    aria_group.add_argument('--cleanup', action=argparse.BooleanOptionalAction, default=None, help='Shortcut to enable/disable download_policy.cleanup.')

    parser.add_argument('--verbose', action='store_true', help='Enable verbose output for the orchestrator and underlying scripts.')
    parser.add_argument('--print-downloader-log', action='store_true', help='Stream the live stdout/stderr from the download subprocess to the console.')
    parser.add_argument('--dry-run', action='store_true', help='Print the effective policy and exit without running the test.')
    parser.add_argument('--dummy', action='store_true', help='Simulate auth and download without running external commands. Used to test profile management logic.\nDummy behavior (e.g., failure rates, durations) can be configured in the policy file under settings.dummy_simulation_settings.')
    parser.add_argument('--dummy-batch', action='store_true', help="[Dummy Mode] Simulate batch modes ('direct_batch_cli', 'direct_docker_cli') by creating dummy info.json files without running yt-dlp. Updates profile counters for each simulated URL.")
    parser.add_argument('--dummy-auth-failure-rate', type=float, default=0.0, help='[Dummy Mode] The probability (0.0 to 1.0) of a simulated auth request failing fatally.')
    parser.add_argument('--dummy-auth-skipped-failure-rate', type=float, default=0.0, help='[Dummy Mode] The probability (0.0 to 1.0) of a simulated auth request having a tolerated failure (e.g., 429).')
    parser.add_argument('--disable-log-writing', action='store_true', help='Disable writing state, stats, and log files. By default, files are created for each run.')
    parser.add_argument('--requeue-failed', action='store_true', help='[Queue Modes] Requeue all tasks from the failure queues back into the inbox before starting.')

    # Add a group for download-specific utilities
    download_util_group = parser.add_argument_group('Download Mode Utilities')
    download_util_group.add_argument('--pre-cleanup-media', nargs='?', const='.', default=None,
                                     help='Before running, delete media files (.mp4, .m4a, .webm, etc.) from a directory. '
                                          'If a path is provided, cleans that directory. '
                                          'If used without a path, cleans the directory specified in download_policy.output_dir or direct_docker_cli_policy.docker_host_download_path. '
                                          'Fails if no such directory is configured.')
    download_util_group.add_argument('--run-ffprobe', action=argparse.BooleanOptionalAction, default=None,
                                     help='After a successful download, run ffprobe to generate a stream info JSON file. '
                                          'Overrides download_policy.run_ffprobe.')
    download_util_group.add_argument('--reset-local-cache-folder', nargs='?', const='.', default=None,
                                     help="Before running, delete the contents of the local cache folder used by direct_docker_cli mode. "
                                          "The cache folder is defined by 'direct_docker_cli_policy.docker_host_cache_path' in the policy. "
                                          "This is useful for forcing a fresh start for cookies, user-agents, etc. "
                                          "If a path is provided, cleans that directory instead of the one from the policy.")
    download_util_group.add_argument('--reset-infojson', action='store_true',
                                     help="Before running, reset all '.processed' and '.LOCKED' info.json files in the source directory "
                                          "back to '.json', allowing them to be re-processed.")

    # Add a group for Redis connection settings
    redis_group = parser.add_argument_group('Redis Connection Overrides (for profile locking mode)')
    redis_group.add_argument('--env-file', help='Path to a .env file to load environment variables from.')
    redis_group.add_argument('--redis-host', default=None, help='Redis host. Defaults to REDIS_HOST or MASTER_HOST_IP env var, or localhost.')
    redis_group.add_argument('--redis-port', type=int, default=None, help='Redis port. Defaults to REDIS_PORT env var, or 6379.')
    redis_group.add_argument('--redis-password', default=None, help='Redis password. Defaults to REDIS_PASSWORD env var.')
    redis_group.add_argument('--redis-db', type=int, default=None, help='Redis DB number. Defaults to REDIS_DB env var, or 0.')
    redis_group.add_argument('--env', default=None, help="Default environment name for Redis key prefix (e.g., 'stg', 'prod'). Used if --auth-env or --download-env are not specified. Overrides policy file setting.")
    redis_group.add_argument('--auth-env', help="Override the environment for the Auth simulation. Overrides --env.")
    redis_group.add_argument('--download-env', help="Override the environment for the Download simulation. Overrides --env.")
    redis_group.add_argument('--key-prefix', default=None, help='Explicit key prefix for Redis. Overrides --env and any defaults.')

    return parser
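

# Illustrative sketch, not part of the original module. The '--set' help text
# above describes overrides in 'key.subkey=value' form, and example 5 in the
# epilog passes a JSON list as the value. The helper below shows one way such
# strings could be folded into a nested policy dict. The name
# `apply_set_overrides` and the "try JSON, else keep the raw string" value
# parsing are assumptions; the orchestrator's actual override logic is not
# shown in this file and may differ.
import copy
import json


def apply_set_overrides(policy, overrides):
    """Return a copy of `policy` with 'a.b.c=value' overrides applied (hypothetical helper)."""
    result = copy.deepcopy(policy)
    for item in overrides:
        key, sep, raw_value = item.partition('=')
        if not sep or not key:
            raise ValueError(f"Expected 'key.subkey=value', got: {item!r}")
        try:
            # Numbers, booleans, lists and quoted strings parse as JSON.
            value = json.loads(raw_value)
        except json.JSONDecodeError:
            # Anything else (paths, host names, 'auto') stays a plain string.
            value = raw_value
        *parents, leaf = key.split('.')
        node = result
        for part in parents:
            # Create intermediate dicts on demand, replacing non-dict values.
            if not isinstance(node.get(part), dict):
                node[part] = {}
            node = node[part]
        node[leaf] = value
    return result


# Example, under the same assumptions:
#   apply_set_overrides({}, ['execution_control.workers=4'])
#   gives {'execution_control': {'workers': 4}}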