PeerTube Browser is a video discovery project for the federated PeerTube network. It crawls instances, builds a local dataset, creates ANN indexes for fast similarity search, and serves a web UI with recommendation feeds.
PeerTube has great content but weak discovery across the federation. This project tries to make exploration easier by aggregating public data and providing similarity-based recommendations.
- Crawler discovers instances via subscriptions and walks channels/videos.
- Filtering keeps instances that appear in the JoinPeerTube whitelist.
- Embeddings are built from video metadata (title, description, tags, channel, etc.).
- ANN index (FAISS) is created for fast similarity lookups.
- Server serves recommendations and metadata from the local DB/index.
- Client renders the feed and video pages.
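The embedding and lookup steps above can be illustrated with a minimal sketch. It assumes hypothetical field names (`title`, `description`, `tags`, `channel`) on a video record, and it uses an exact cosine-similarity scan over toy vectors as a stand-in for the FAISS index. The project's actual embedding model and index configuration are not described in this README, so nothing here should be read as its implementation.

```python
import math


def metadata_text(video: dict) -> str:
    """Join the metadata fields that would be passed to an embedding model."""
    parts = [
        video.get("title", ""),
        video.get("description", ""),
        " ".join(video.get("tags", [])),
        video.get("channel", ""),
    ]
    # Drop missing or blank fields so they do not add empty lines.
    return "\n".join(p.strip() for p in parts if p and p.strip())


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_similar(query, vectors: dict, k: int = 3) -> list:
    """Exact nearest neighbours by cosine similarity; an ANN index approximates this."""
    scored = [(video_id, cosine(query, vec)) for video_id, vec in vectors.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional vectors standing in for real embeddings (hypothetical data).
vectors = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.9, 0.1, 0.0],
    "c": [0.0, 1.0, 0.0],
}
```

An exact scan like this is linear in the number of videos, which is the cost an ANN index is meant to avoid on a large dataset.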
See DATA_BUILD.md for the end-to-end steps to build the SQLite dataset and ANN index.
- engine/: read/analytics workspace.
- engine/crawler/: crawler subsystem (part of Engine).
- engine/server/: read-only recommendation API + bridge ingest.
- client/frontend/: frontend app and static assets.
- client/backend/: client write/profile API that publishes normalized events to Engine.
The recommendation system is a mix of filtering + scoring:
- similarity to liked videos,
- freshness,
- popularity,
- layer mixing (explore/exploit/popular/random/fresh).
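As a rough illustration of how the first three signals could be combined, the sketch below uses a weighted sum. The weights, the 30-day half-life, and the log-scaled view count are all assumptions chosen for the example; the README does not state the real formula or values.

```python
import math

# Hypothetical weights; the real pipeline's values are not documented here.
WEIGHTS = {"similarity": 0.6, "freshness": 0.25, "popularity": 0.15}


def freshness(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 for a brand-new video, 0.5 after one half-life."""
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


def popularity(views: int, cap: int = 1_000_000) -> float:
    """Log-scaled view count squashed into [0, 1]."""
    return min(math.log1p(max(views, 0)) / math.log1p(cap), 1.0)


def score(similarity: float, age_days: float, views: int, weights=WEIGHTS) -> float:
    """Weighted sum of the three signals; every term stays inspectable."""
    return (
        weights["similarity"] * similarity
        + weights["freshness"] * freshness(age_days)
        + weights["popularity"] * popularity(views)
    )
```

Keeping each signal as its own small function makes it easy to log the individual terms next to the final score when tuning.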
Likes are used as a signal to find similar content. Engine reads likes from the
current request context only (provided by Client/Frontend) and does not depend
on local engine/server/db/users.db for recommendation ranking. Bridge-ingested
events update aggregated interaction_signals, which are also used by ranking.
This is not a heavy ML system; it is a transparent, controllable pipeline.
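Layer mixing can be pictured as interleaving candidates from several named pools according to fixed ratios. The sketch below is one possible way to do that; the function name, the integer ratios, and the round-robin schedule are assumptions for illustration, and only the layer names come from the list above.

```python
from itertools import cycle


def mix_layers(layers: dict, ratios: dict, limit: int) -> list:
    """Interleave candidates from each layer in proportion to integer ratios, skipping duplicates."""
    # Build a repeating schedule such as ["exploit", "exploit", "explore", "fresh"].
    schedule = [name for name, weight in ratios.items() for _ in range(weight)]
    # Only layers that appear in the schedule are queued, so the loop always terminates.
    queues = {name: list(layers.get(name, [])) for name, weight in ratios.items() if weight > 0}
    feed, seen = [], set()
    for name in cycle(schedule):
        if len(feed) >= limit or not any(queues.values()):
            break
        queue = queues[name]
        # Take the next unseen candidate from this layer, if it has one.
        while queue:
            candidate = queue.pop(0)
            if candidate not in seen:
                seen.add(candidate)
                feed.append(candidate)
                break
    return feed


# Hypothetical candidate pools and mix ratios.
layers = {"exploit": ["e1", "e2", "e3"], "explore": ["x1", "e1"], "fresh": ["f1"]}
ratios = {"exploit": 2, "explore": 1, "fresh": 1}
```

Because the ratios are plain data, a preset or a user-tunable setting could replace them without touching the mixing logic.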
| Concern | Owner | Contract | Forbidden coupling |
|---|---|---|---|
| Public read API (/recommendations, /videos/{id}/similar, /videos/similar, /api/video, /api/health) | Engine | Exposed by Engine HTTP API only. | Client backend importing Engine modules or reading Engine DB files directly. |
| Browser-facing write/profile API (/api/user-action, /api/user-profile/*) | Client backend | Exposed by Client backend only. | Moving write/profile ownership into Engine handlers. |
| Browser-facing read gateway (/recommendations, /videos/similar, /api/video, /api/channels) | Client backend | Frontend reads use Client API base and gateway routes only. | Direct frontend Engine API base usage. |
| Internal Client->Engine read contract (/internal/videos/resolve, /internal/videos/metadata) | Engine (provider), Client backend (consumer) | Client backend consumes these internal endpoints over HTTP. | Direct DB coupling instead of HTTP contract. |
| Temporary bridge ingest (/internal/events/ingest) | Engine (ingest), Client backend (publisher) | Client backend publishes normalized events to Engine ingest endpoint. | Frontend direct ingest calls or bypassing Client normalization path. |
Boundary guard policy:
- Client backend must not import engine.server.* / engine.* internals and must not read engine/server/db/* directly.
- Frontend runtime reads must stay Client-gateway only (no direct Engine API base or Engine internal route usage).
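The import half of this policy can be checked statically. The sketch below assumes the Client backend is written in Python (suggested by the dotted module paths, but not stated in this README) and scans source text for imports under the `engine` package. It is an illustration only and is not the project's own boundary check script.

```python
import ast

# Top-level package the Client backend must not import from.
FORBIDDEN_PREFIXES = ("engine",)


def forbidden_imports(source: str) -> list:
    """Return the imported module names that fall under a forbidden package."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Relative imports (level > 0) stay inside the current package and are ignored.
            names = [node.module]
        else:
            continue
        for name in names:
            if any(name == p or name.startswith(p + ".") for p in FORBIDDEN_PREFIXES):
                found.append(name)
    return found
```

Matching on the exact name or a dotted prefix avoids false positives on unrelated packages whose names merely start with the same letters.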
Installer topology:
- Service-specific installers:
  - engine/install-engine-service.sh (--mode prod|dev)
  - client/install-client-service.sh (--mode prod|dev)
- Centralized mode installer (source of truth):
  - install-service.sh --mode prod|dev|all
Examples:
# Prod contour defaults to: --force + --with-updater-timer
sudo bash install-service.sh --mode prod
# Dev contour defaults to: --force + --uninstall
sudo bash install-service.sh --mode dev
# Centralized direct mode usage
sudo bash install-service.sh --mode prod --force --with-updater-timer
sudo bash install-service.sh --mode dev --force --uninstall
Uninstall topology:
- Service-specific uninstallers:
  - engine/uninstall-engine-service.sh (--mode prod|dev)
  - client/uninstall-client-service.sh (--mode prod|dev)
- Centralized mode uninstaller (source of truth):
  - uninstall-service.sh --mode prod|dev|all
Examples:
# Keep updater state artifacts while uninstalling prod contour
sudo bash uninstall-service.sh --mode prod --keep-updater-state
# Purge updater state artifacts while uninstalling dev contour
sudo bash uninstall-service.sh --mode dev --purge-updater-state
# Centralized direct mode usage
sudo bash uninstall-service.sh --mode all --purge-updater-state

Two smoke scripts are available:
- Installer/uninstaller matrix + runtime checks:
# Contract-only checks (safe, no system changes)
bash tests/run-installers-smoke.sh --dry-run-only
# Full dev contour install/uninstall verification (systemd + HTTP + e2e)
sudo bash tests/run-installers-smoke.sh --mode dev
- Boundary/interaction checks with temporary local processes:
bash tests/run-arch-split-smoke.sh

run-arch-split-smoke.sh starts Engine (7072) and Client (7272) locally,
runs boundary + bridge checks, aggregates errors, prints diagnostics, and always
stops started processes.
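After starting local processes, a smoke run has to wait until each one answers on its health endpoint before running further checks. The following sketch shows one way to poll with a timeout. The function names, default timeout, and polling interval are assumptions, and the probe is injectable so the loop can be exercised without a running service.

```python
import time
import urllib.error
import urllib.request


def http_status(url: str, timeout: float = 2.0):
    """Return the HTTP status code for url, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 503.
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or socket timeout.
        return None


def wait_until_ready(probe, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll probe() until it returns 200 or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if probe() == 200:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Assuming the Engine listens on 127.0.0.1, a call such as `wait_until_ready(lambda: http_status("http://127.0.0.1:7072/api/health"))` would return `False` on timeout, which a caller can treat as a hard fail.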
- Contract preflight checks:
  - tests/check-client-engine-boundary.sh
  - tests/check-frontend-client-gateway.sh
  - Any failure here is a hard fail and startup is aborted.
- Runtime readiness:
  - Engine /api/health must become 200 within timeout.
  - Client /api/health must become 200 within timeout.
  - Timeout/non-200 is a hard fail.
- Boundary enforcement checks:
  - Engine must reject Client-owned endpoints (/api/user-profile, /api/user-action) with non-success.
  - Client gateway routes must answer successfully (/api/channels, /recommendations, /videos/similar).
  - Any unexpected status is a hard fail.
- Bridge flow checks:
  - Seed extraction from client recommendations response must succeed (uuid+host).
  - Client proxy /api/video must return 200.
  - Client /api/user-action response must validate ok=true, bridge_ok=true, and empty bridge_error.
  - Client /api/user-profile/likes must return a non-empty likes array after like action.
  - Parse/validation mismatch is a hard fail.
- Engine users DB ownership guard:
  - Engine process must not keep engine/server/db/users.db file descriptor open.
  - Any open FD match is a hard fail.
- Failure diagnostics:
  - Script prints aggregated check/error summary and stores run/check/error logs and service log tails for troubleshooting.
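The user-action validation among the bridge flow checks can be sketched as a small function that collects every problem instead of stopping at the first one. The field names `ok`, `bridge_ok`, and `bridge_error` come from the list above; the function name, the error wording, and the choice to treat a missing or null `bridge_error` as empty are assumptions.

```python
import json


def check_user_action(body: str) -> list:
    """Return a list of validation errors; an empty list means the check passes."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        # A response that cannot be parsed counts as a failure on its own.
        return [f"response is not valid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["response must be a JSON object"]

    errors = []
    if payload.get("ok") is not True:
        errors.append("ok must be true")
    if payload.get("bridge_ok") is not True:
        errors.append("bridge_ok must be true")
    if payload.get("bridge_error"):
        errors.append(f"bridge_error must be empty, got {payload['bridge_error']!r}")
    return errors
```

Returning all errors at once fits an aggregated summary: a caller can append them to a shared list and decide on the exit status after every check has run.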
- ActivityPub integration (receive new video events, send likes/comments).
- User accounts and server-side profiles (opt-in).
- Better discovery modes and ranking logic.
- Viewing modes (Hot / Popular / Random / Fresh) as separate feeds or tabs.
- User‑tunable recommendation settings (mix ratios, weights, or presets).
- Peer-to-peer communication between aggregators to share or refresh metadata.
- Far‑beyond‑the‑horizon experiments (collaborative indexing, distributed caches).
- etc
This project was started by a developer from Ukraine during the war. It is both a personal coping project and an attempt to improve discovery in federated media.
If you want to help, contributions are welcome. You can open issues or submit PRs.
If you want to support this project, here are quick options: