Implement multi-threading for concurrent downloads #49

@nazwr

Goal

Enable concurrent downloads using multi-threading to significantly speed up downloading multiple shows or tracks.

Current Behavior

Downloads currently happen sequentially, one item at a time.

Proposed Behavior

Use Ruby threading to download multiple items concurrently:

  • Download multiple shows in parallel (e.g., 3 shows at once)
  • Download multiple tracks in parallel within a show (e.g., 5 tracks at once)
  • Configurable thread pool size (e.g., --threads 5)

Technical Approach

Phase 1: Parallel tracks within a show

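# In Show#download_tracks or similar
require 'thread/pool' # from the `thread` gem, or a similar pool library

pool = Thread.pool(thread_count)
@tracks.each do |track|
  # Each download is queued and runs on one of the pool's worker threads
  pool.process do
    downloader.get(base_url, track)
  end
end
pool.shutdown # blocks until the queued downloads have finished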
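
If adding a gem turns out to be undesirable, the same bounded concurrency could be sketched with only the standard library. This is a rough illustration, not the project's code: `each_concurrently` is a hypothetical helper name, and the block body stands in for `downloader.get(base_url, track)`.

```ruby
# Hypothetical helper: yields each item on one of `thread_count` worker threads.
# Assumes items are never nil or false, since a nil pop ends the worker loop.
def each_concurrently(items, thread_count)
  queue = Queue.new
  items.each { |item| queue << item }
  queue.close # once drained, a closed queue returns nil from pop

  workers = Array.new(thread_count) do
    Thread.new do
      while (item = queue.pop)
        yield item
      end
    end
  end
  workers.each(&:join) # wait for every worker to finish
end

# Stand-in for the real download call
downloaded = Queue.new
each_concurrently(%w[track01 track02 track03 track04], 2) do |track|
  downloaded << track
end
```

The queue caps the number of simultaneous downloads at `thread_count` no matter how many tracks a show has.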

Phase 2: Parallel shows

# In DeadList#run when handling multiple shows
pool = Thread.pool(thread_count)
show_ids.each do |show_id|
  pool.process do
    # Create show, download tracks
  end
end
pool.shutdown # blocks until every queued show has finished
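
One thing to watch when combining both phases: a show pool of 3 with a track pool of 5 inside each show could open up to 15 connections at once. A possible way to keep the total capped is to flatten everything into a single job list and feed it to one pool. The sketch below assumes a hypothetical `shows` hash mapping show IDs to track names; `build_jobs` and `run_jobs` are illustrative names, not existing methods.

```ruby
# Hypothetical: turn { show_id => [tracks] } into a flat list of [show_id, track] jobs.
def build_jobs(shows)
  shows.flat_map { |show_id, tracks| tracks.map { |track| [show_id, track] } }
end

# Run every job on a single pool so total concurrency never exceeds thread_count.
def run_jobs(jobs, thread_count)
  queue = Queue.new
  jobs.each { |job| queue << job }
  queue.close

  Array.new(thread_count) do
    Thread.new do
      while (job = queue.pop)
        yield(*job) # yields show_id, track
      end
    end
  end.each(&:join)
end

shows = { 'show-a' => %w[t01 t02], 'show-b' => %w[t01] }
run_jobs(build_jobs(shows), 3) do |show_id, track|
  # Stand-in for creating the show and downloading the track
end
```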

Considerations

Thread Safety:

  • Logger is thread-safe by default ✓
  • File I/O needs to ensure unique filenames (already handled) ✓
  • Network connections (HTTParty) are thread-safe ✓

Error Handling:

  • Errors in one thread shouldn't crash others
  • Collect and report errors after all threads complete
  • Maintain existing error messages
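
A minimal sketch of how the error-handling points above might look, using only the standard library. `download_all` is a hypothetical helper, and the block stands in for the real download call. Each worker rescues its own errors so a failed track does not stop the others, and failures are reported together at the end.

```ruby
# Hypothetical helper: runs the block for each item and returns [item, error]
# pairs for the items that failed, instead of letting one error stop the rest.
def download_all(items, thread_count)
  queue = Queue.new
  items.each { |item| queue << item }
  queue.close
  failures = Queue.new # Queue is thread-safe, so no explicit Mutex is needed

  Array.new(thread_count) do
    Thread.new do
      while (item = queue.pop)
        begin
          yield item
        rescue StandardError => e
          failures << [item, e] # record the failure and keep going
        end
      end
    end
  end.each(&:join)

  Array.new(failures.size) { failures.pop }
end

failed = download_all(%w[t01 t02 t03], 2) do |track|
  raise IOError, "download failed for #{track}" if track == 't02'
end
failed.each { |track, error| warn "#{track}: #{error.message}" }
```

Keeping the original exception object means the existing error messages can be printed unchanged once all threads have finished.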

Resource Limits:

  • Don't overwhelm archive.org servers (be a good citizen)
  • Reasonable default thread count (3-5)
  • Allow users to configure via --threads N flag
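
The flag could be parsed with Ruby's standard `OptionParser`, roughly as below. The default of 4 and the upper bound of 8 are assumptions chosen for illustration (the issue only suggests a default in the 3-5 range), and `parse_thread_count` is a hypothetical helper name.

```ruby
require 'optparse'

DEFAULT_THREADS = 4 # assumption: somewhere in the suggested 3-5 range
MAX_THREADS = 8     # assumption: hypothetical cap to stay polite to archive.org

# Returns the thread count requested via --threads, clamped to 1..MAX_THREADS.
def parse_thread_count(argv)
  threads = DEFAULT_THREADS
  OptionParser.new do |opts|
    opts.on('--threads N', Integer, 'Number of concurrent downloads') do |n|
      threads = n.clamp(1, MAX_THREADS)
    end
  end.parse(argv) # parse (not parse!) leaves argv unmodified
  threads
end

thread_count = parse_thread_count(%w[--threads 5])
```

Clamping rather than rejecting large values keeps the tool usable while still enforcing a ceiling; raising an error instead would be an equally reasonable choice.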

Progress Tracking:

Dependencies

Implementation Phases

  1. Basic threading for tracks - Parallel track downloads within a show
  2. Add --threads flag - Let users control concurrency
  3. Parallel shows - Download multiple shows concurrently
  4. Polish - Error aggregation, proper shutdown handling

Benefits

  • Expected speedup of roughly 3-5x for shows with many tracks (an estimate, bandwidth permitting)
  • Much faster batch downloads of multiple shows
  • Better utilization of available bandwidth
  • Still respectful to archive.org (configurable limits)

Related Issues
