Public resource routing engine purpose-built for AI Agents.
Multi-source discovery -> intelligent ranking -> verified delivery.
Quarry is a resource discovery engine designed to be called by AI Agents (Hermes, OpenClaw, etc.).
It doesn't download files. It finds the best public routes (cloud drive links, magnet URIs, ebook pages) across 28 sources, ranks them by quality, verifies liveness, and returns structured JSON.
User: "Find me Oppenheimer 4K resources"
Agent translates -> hunt.py search "Oppenheimer 2023" --4k --json
Engine returns:
OK Top 1: Oppenheimer.2023.2160p.BluRay.REMUX -> aliyun link (verified alive)
OK Top 2: Oppenheimer.2023.1080p.WEB-DL -> magnet (42 seeders)
Suppressed: Oppenheimer.CAM.720p -> risky quality
28 source adapters across 3 channels:
| Channel | Sources | What they cover |
|---|---|---|
| Cloud Drive | upyunso, pansou, ps.252035, panhunt | Aliyun, Quark, Baidu, 115, PikPak, Lanzou, etc. |
| Torrent | torznab, nyaa, dmhy, bangumi_moe, eztv, torrentgalaxy, bitsearch, tpb, yts, 1337x, limetorrents, torlock, fitgirl, torrentmac, ext_to | Movies, TV, anime, games, music, macOS apps |
| Book | annas (Anna's Archive), libgen (Library Genesis) | PDF, EPUB, MOBI, academic papers, fiction & non-fiction |
- Title-family matching: canonical, phrase, token overlap scoring
- Quality parsing: resolution, codec, HDR, source type, lossless audio
- Category-aware: different scoring weights for movie/TV/anime/music/software/book
- Confidence tiers: top -> related -> risky (suppressed by default)
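The quality-parsing step above can be illustrated with a minimal sketch. The regular expressions and the `parse_release` helper below are assumptions made for illustration, not the contents of `parsers.py`, and they cover only a handful of common release tags.

```python
import re

# Hypothetical tag patterns; the real parser likely recognises many more.
RESOLUTION = re.compile(r"\b(2160p|1080p|720p|480p)\b", re.IGNORECASE)
SOURCE = re.compile(r"\b(REMUX|BluRay|WEB-DL|WEBRip|HDTV|CAM)\b", re.IGNORECASE)
CODEC = re.compile(r"\b(HEVC|x265|x264|AV1)\b", re.IGNORECASE)
HDR = re.compile(r"\b(HDR10|HDR|DV)\b", re.IGNORECASE)
AUDIO = re.compile(r"\b(FLAC|DTS-HD|TrueHD)\b", re.IGNORECASE)


def parse_release(title):
    """Extract quality tags from a release title; missing tags are None."""

    def first(pattern):
        # Take the leftmost match and normalise it to lowercase.
        match = pattern.search(title)
        return match.group(1).lower() if match else None

    return {
        "resolution": first(RESOLUTION),
        "source": first(SOURCE),
        "codec": first(CODEC),
        "hdr": first(HDR),
        "audio": first(AUDIO),
    }


info = parse_release("Oppenheimer.2023.2160p.BluRay.REMUX.HEVC.DTS-HD")
print(info["resolution"], info["source"], info["codec"], info["audio"])
```

A ranking step could then weight these tags differently per category, for example valuing `audio` more for music than for movies.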
Cloud drive links die constantly. The engine auto-probes before delivery:
| Provider | Method | Result |
|---|---|---|
| Aliyun (AliDrive) | Anonymous share API | alive / cancelled |
| Quark (Quark Drive) | Share token API | alive / expired |
| Baidu (Baidu Netdisk) | Page dead-signal detection | alive / removed |
Dead links are auto-demoted to risky tier and never shown in text output.
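The demotion rule can be sketched as follows. This is an illustration only: it assumes results are dictionaries carrying `tier` and `source_health.link_alive` as in the JSON output, and the function names `demote_dead_links` and `visible_in_text` are hypothetical.

```python
def demote_dead_links(results):
    """Return copies of results with dead links moved to the risky tier."""
    demoted = []
    for result in results:
        alive = result.get("source_health", {}).get("link_alive")
        if alive is False:  # None means "unknown" and is left untouched
            result = {**result, "tier": "risky"}
        demoted.append(result)
    return demoted


def visible_in_text(results):
    """Text output hides the risky tier, so demoted links never appear."""
    return [r for r in results if r["tier"] != "risky"]


sample = [
    {"title": "A", "tier": "top", "source_health": {"link_alive": True}},
    {"title": "B", "tier": "top", "source_health": {"link_alive": False}},
    {"title": "C", "tier": "related", "source_health": {"link_alive": None}},
]
print([r["title"] for r in visible_in_text(demote_dead_links(sample))])
```

Treating `None` as "keep" rather than "demote" reflects the three-state meaning of `link_alive` (verified, dead, unknown).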
Priority chain: httpx -> curl_cffi -> urllib
Install curl-cffi to bypass DDoS-Guard and similar TLS fingerprint checks. Zero config, auto-detected.
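One way to read the priority chain is "use the first backend that is importable". The sketch below shows that detection with the standard library. `pick_backend` and its `available` parameter are hypothetical names, and the real `HTTPClient` may instead fall back per request.

```python
import importlib.util

# Order taken from the priority chain above; urllib is always present.
BACKEND_PRIORITY = ("httpx", "curl_cffi", "urllib")


def pick_backend(available=None):
    """Return the first backend in priority order that can be used.

    `available` lets callers (and tests) inject a set of module names;
    when omitted, installed modules are detected with find_spec.
    """
    for name in BACKEND_PRIORITY:
        if available is not None:
            found = name in available
        else:
            found = importlib.util.find_spec(name) is not None
        if found:
            return name
    raise RuntimeError("no HTTP backend available")


print(pick_backend())
```

Because `urllib` ships with Python, detection without an injected set always succeeds, which matches the "zero dependencies for basic search" promise.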
Public video URL -> metadata extraction -> optional download:
hunt.py video probe "https://www.bilibili.com/video/BV..."
hunt.py video download "https://youtu.be/..." best
On-demand subtitle discovery (user-initiated, not automatic):
hunt.py subtitle "Breaking Bad" --season 1 --episode 1 --lang zh,en --json
Sources: SubDL (multilingual), SubHD (Chinese), Jimaku (Japanese anime).
git clone https://github.com/taffy-owo/quarry.git
cd quarry
# Zero dependencies for basic search
python3 scripts/hunt.py search "Oppenheimer 2023" --4k
# Optional performance extras
pip install httpx # HTTP/2 + connection pooling
pip install pycryptodome # Upyunso encrypted API
pip install curl-cffi # TLS fingerprint impersonation
# Movies
python3 scripts/hunt.py search "Oppenheimer 2023" --4k --json
python3 scripts/hunt.py search "Oppenheimer 2023" --4k --json --explain
# TV Shows
python3 scripts/hunt.py search "Breaking Bad S05E16" --tv
# Anime
python3 scripts/hunt.py search "Kamiina Botan" --anime
# Music (lossless)
python3 scripts/hunt.py search "Jay Chou Fantasy FLAC" --music
# Software
python3 scripts/hunt.py search "Adobe Photoshop 2024" --software --channel pan
# Books
python3 scripts/hunt.py search "Clean Code epub" --book
# Skip pan link probing (faster, but may include dead links)
python3 scripts/hunt.py search "Interstellar 2014" --no-probe
python3 scripts/hunt.py sources --probe --json # Source health check
python3 scripts/hunt.py doctor --json # System diagnostics + source_health metrics
python3 scripts/hunt.py benchmark # Offline precision benchmark
python3 scripts/hunt.py cache stats --json # Cache statistics
python3 scripts/hunt.py source validate local/sources/my_source.py --json
Updating is safe regardless of how you installed:
# Git users: just pull
cd quarry && git pull
# ZIP users: download new ZIP, extract over the old folder
# (or delete and re-extract, both work)
Auto-cleanup: On first run after an update, the engine automatically detects and removes deprecated files from previous versions. No manual cleanup is needed, even if you extract a ZIP on top of an old installation.
All user customizations go in the local/ directory, a safe zone that is never overwritten by updates:
local/
├── sources/ # Drop custom SourceAdapter .py files here (auto-discovered)
├── config.json # Override ranking weights
└── .env # Override environment variables (takes priority over root .env)
Validate custom adapters before relying on them:
python3 scripts/hunt.py source validate local/sources/my_tracker.py --json
The doctor --json command also exposes adaptive source-health rows such as success_rate_24h, median_latency_ms, result_yield, top_hit_rate, and recommended_query_budget. Quarry uses these cached metrics conservatively: it prefers strong sources and reduces the query budget for weak or recently failing ones.
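As a rough illustration of how such metrics could drive source ordering, the sketch below sorts health rows strongest-first and attaches a budget to each. The row shape (including the `source` key), the `default_budget` value, and the `order_sources` name are all assumptions. Only the metric names come from the doctor output described above.

```python
def order_sources(health_rows, default_budget=3):
    """Sort sources by recent success rate and attach a query budget."""
    ranked = sorted(
        health_rows,
        key=lambda row: row.get("success_rate_24h", 0.0),
        reverse=True,
    )
    plan = []
    for row in ranked:
        # Fall back to a default when no recommendation has been cached yet.
        budget = row.get("recommended_query_budget", default_budget)
        plan.append((row["source"], budget))
    return plan


rows = [
    {"source": "weak", "success_rate_24h": 0.2, "recommended_query_budget": 1},
    {"source": "strong", "success_rate_24h": 0.9, "recommended_query_budget": 4},
    {"source": "new"},
]
print(order_sources(rows))
```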
25 of 28 sources work out of the box. 3 optional sources need credentials for extra coverage:
| Source | Env Variable | How to Get |
|---|---|---|
| ps.252035 / panhunt | PANSOU_TOKEN | Register at linux.do, login at so.252035.xyz, copy JWT from browser cookies |
| pansou (self-hosted) | PANSOU_API_URL | Deploy fish2018/pansou, set your instance URL |
| torznab (Jackett) | TORZNAB_URL + TORZNAB_APIKEY | Install Jackett, copy API key from dashboard |
Add credentials to .env or local/.env:
PANSOU_TOKEN=eyJhbGciOiJIUzI1NiIs...
TORZNAB_URL=http://localhost:9117/api/v2.0/indexers/all/results/torznab
TORZNAB_APIKEY=your-api-key
See references/sources.md for detailed step-by-step instructions.
Custom source adapters, ranking tweaks, and env variables in local/ are update-proof. Both git pull and ZIP updates leave this directory untouched.
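The precedence between the root .env and local/.env can be sketched as a simple layered merge. This is a minimal illustration under assumptions: the parser handles only plain `KEY=VALUE` lines, and `parse_env` and `load_layered_env` are hypothetical helpers, not Quarry functions.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_layered_env(root_text, local_text):
    """Merge two .env contents; keys from local/.env win on conflict."""
    merged = parse_env(root_text)
    merged.update(parse_env(local_text))
    return merged


root = "TORZNAB_URL=http://localhost:9117\nTORZNAB_APIKEY=root-key\n"
local = "# personal override\nTORZNAB_APIKEY=local-key\n"
print(load_layered_env(root, local))
```

Keeping credentials in local/.env means an update that replaces the root .env cannot silently reset them.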
Quarry is designed as an AI Agent skill. It's meant to be called by Agents, not used directly by humans.
Agent config files are in agents/:
# agents/hermes.yaml - Agent instructions include:
# - Query translation workflow (CJK -> English)
# - Category-specific routing guidance
# - Result interpretation (link_alive, tiers, penalties)
# - Available command reference
SKILL.md is the Agent-readable skill contract:
- When to use: public resource discovery, release comparison, video probing
- Query normalization: Agent should translate to English/romanized titles first; the engine has a best-effort CJK alias fallback for movie/TV/anime/general queries
- Result interpretation: how to read link_alive, tier, penalties
- Category routing: which sources fire first for each content type
- 13 agent rules: ordering, fallback behavior, format hints
python3 scripts/hunt.py search "Oppenheimer 2023" --json
python3 scripts/hunt.py search "Oppenheimer 2023" --json --explain
{
"schema_version": "3",
"query": "Oppenheimer 2023",
"results": [
{
"tier": "top",
"title": "Oppenheimer.2023.2160p.BluRay.REMUX.HEVC.DTS-HD",
"link_or_magnet": "https://alipan.com/s/...",
"provider": "aliyun",
"source": "upyunso",
"source_health": {
"link_alive": true,
"link_probe_reason": "share active"
},
"quality": "2160p BluRay REMUX HDR",
"confidence": 0.95,
"match_bucket": "exact_title_family",
"canonical_identity": "movie:oppenheimer:2023"
}
]
}
With --explain, the response includes an agent-readable explanation:
{
"explain": {
"why_top": ["exact title-family match", "year matched 2023"],
"why_not_others": ["candidate X demoted: dead pan link"]
}
}
Key fields for Agents:
| Field | Meaning |
|---|---|
| tier | top = high confidence, related = decent, risky = unreliable |
| source_health.link_alive | true = verified, false = dead (skip it), null = unknown |
| confidence | 0.0 to 1.0 match confidence score |
| match_bucket | exact_title_family, title_family_match, weak_context_match, etc. |
| canonical_identity | Deduplication key (e.g. movie:oppenheimer:2023) |
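An Agent consuming this output might select a result along the following lines. This is a sketch under assumptions: it takes the already-parsed JSON payload with the field names shown above, and `pick_best` is a hypothetical helper, not part of Quarry.

```python
def pick_best(payload):
    """Return the most trustworthy usable result, or None if there is none."""
    usable = [
        r for r in payload.get("results", [])
        if r.get("tier") in ("top", "related")
        # Skip links verified dead; unknown (None) is still allowed.
        and r.get("source_health", {}).get("link_alive") is not False
    ]
    # Prefer the top tier, then the higher confidence score.
    usable.sort(key=lambda r: (r.get("tier") != "top", -r.get("confidence", 0.0)))
    return usable[0] if usable else None


payload = {
    "schema_version": "3",
    "results": [
        {"title": "A", "tier": "related", "confidence": 0.99,
         "source_health": {"link_alive": None}},
        {"title": "B", "tier": "top", "confidence": 0.95,
         "source_health": {"link_alive": True}},
        {"title": "C", "tier": "top", "confidence": 0.99,
         "source_health": {"link_alive": False}},
    ],
}
best = pick_best(payload)
print(best["title"] if best else "no usable result")
```

In practice the payload would come from `json.loads` applied to the stdout of a `--json` search.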
flowchart LR
Q["Query"] --> I["Intent\nParsing"]
I --> A["Alias\nResolver"]
A --> S["Multi-Source\nFan-out"]
S --> N["Normalize"]
N --> D["Dedup"]
D --> P{"Pan Probe"}
P --> R{"Ranking"}
R -->|"Top / Related"| Out["JSON / Text"]
R -->|"Risky"| Sup("Suppressed")
style P fill:#d4af37,stroke:#aa7c11,color:#000,stroke-width:2px
style R fill:#d4af37,stroke:#aa7c11,color:#000,stroke-width:2px
style Out fill:#10b981,stroke:#059669,color:#fff,stroke-width:2px
style Sup fill:#ef4444,stroke:#b91c1c,color:#fff
| Category | Primary -> Fallback | Key Signal |
|---|---|---|
| Movie | Pan -> YTS/TorrentGalaxy/TPB -> 1337x | Year match |
| TV | EZTV/TorrentGalaxy/TPB -> Pan | S{XX}E{XX} |
| Anime | Nyaa/DMHY/Bangumi Moe -> Pan | Romanized title |
| Book | Anna's Archive -> Pan -> 1337x/TorLock | Format (pdf/epub) |
| Music | Pan -> DMHY/Nyaa (noise-filtered) | Lossless tags |
| Software | Pan -> FitGirl/TorrentMac/TorrentGalaxy | Platform hint |
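The routing table can be expressed as data plus a small fallback loop. The sketch below is an assumption-laden illustration: `ROUTING` mirrors the table, `"pan"` is a placeholder for the cloud drive channel rather than a single source, and `run_with_fallback` with its injected `search` callable is hypothetical.

```python
# Each category maps to ordered groups; later groups are fallbacks.
ROUTING = {
    "movie": [["pan"], ["yts", "torrentgalaxy", "tpb"], ["1337x"]],
    "tv": [["eztv", "torrentgalaxy", "tpb"], ["pan"]],
    "anime": [["nyaa", "dmhy", "bangumi_moe"], ["pan"]],
    "book": [["annas"], ["pan"], ["1337x", "torlock"]],
    "music": [["pan"], ["dmhy", "nyaa"]],
    "software": [["pan"], ["fitgirl", "torrentmac", "torrentgalaxy"]],
}


def run_with_fallback(category, search, min_results=1):
    """Query groups in order, stopping once enough results are collected."""
    results = []
    for group in ROUTING[category]:
        for source in group:
            results.extend(search(source))
        if len(results) >= min_results:
            break  # later fallback groups are not queried
    return results


def fake_search(source):
    # Stand-in for a real adapter call: only "yts" returns a hit.
    return ["Oppenheimer.2023.1080p"] if source == "yts" else []


print(run_with_fallback("movie", fake_search))
```

Stopping at the first group that yields enough results keeps slower fallback sources idle for queries the primary sources already answer.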
quarry/
├── scripts/
│ ├── hunt.py # CLI entrypoint
│ └── quarry/
│ ├── engine.py # Search orchestration
│ ├── intent.py # Query -> Intent -> SearchPlan
│ ├── ranking.py # Scoring, tiers, deduplication
│ ├── pan_probe.py # Cloud drive link viability probe
│ ├── parsers.py # Release tag parsing (resolution, codec, HDR)
│ ├── config.py # RankingConfig weights
│ ├── cache.py # SQLite WAL cache
│ ├── source_validation.py # Custom SourceAdapter contract validator
│ ├── video_core.py # Public video pipeline (yt-dlp)
│ ├── subdl.py / subhd.py / jimaku.py # Subtitle sources
│ └── sources/ # 28 source adapters
│ ├── base.py # HTTPClient (httpx -> curl_cffi -> urllib)
│ ├── upyunso.py # Cloud drive aggregator (AES encrypted API)
│ ├── pansou.py # PanSou self-hosted pan aggregation API
│ ├── nyaa.py # Anime torrents (RSS)
│ ├── dmhy.py # 動漫花園 Chinese anime community (RSS)
│ ├── bangumi_moe.py # Bangumi Moe anime torrents (JSON API)
│ ├── torrentgalaxy.py # TorrentGalaxy general tracker (RARBG alt)
│ ├── torlock.py # TorLock verified torrents
│ ├── ext_to.py # EXT.to modern magnet search
│ ├── annas.py # Anna's Archive books (HTML scraper)
│ ├── torznab.py # Jackett/Prowlarr meta-indexer
│ └── ... # eztv, bitsearch, tpb, yts, 1337x, etc.
├── agents/
│ ├── hermes.yaml # Hermes Agent skill config
│ └── openclaw.yaml # OpenClaw Agent skill config
├── local/ # User safe zone (gitignored contents)
│ ├── sources/ # Custom source adapters (auto-discovered)
│ ├── config.json # Ranking weight overrides
│ └── .env # Environment variable overrides
├── tests/ # 39 unit, precision, CLI, video, and benchmark tests
├── references/ # Architecture, usage, source docs
├── SKILL.md # Agent-readable skill contract
├── CHANGELOG.md
└── pyproject.toml
| What this does | What this doesn't do |
|---|---|
| Find public download routes | Download files |
| Rank results by quality | Bypass DRM or logins |
| Verify cloud drive link liveness | Access private trackers |
| Provide structured JSON for Agents | Guarantee legality or permanence |
| Component | Dependency | Required? |
|---|---|---|
| Core search | Python 3.10+ | Yes |
| HTTP acceleration | httpx | Optional |
| TLS impersonation | curl-cffi | Optional |
| Upyunso API | pycryptodome | Optional |
| Video pipeline | yt-dlp + ffmpeg | Optional |
AI coding agents: Read CONTRIBUTING.md before making any changes.
User customizations go in local/, not in scripts/.
# Run benchmark before PR
python3 scripts/hunt.py benchmark
# Run tests
python -m pytest tests/ -v
MIT-0, no attribution required.
If you encounter any bugs, have feature requests, or need help with custom source adapters, please open an issue on GitHub. Pull requests are also highly welcome!