A Streamlit-based web application for importing, categorizing, and managing YouTube videos from your personalized home page feed. Uses AI (OpenAI) to automatically summarize video transcripts and categorize videos.
Note: This project is experimental. Some functionality is currently commented out, including:
- Video blurb generation
- Theme extraction
- Ollama/local LLM support (code is present but disabled in favor of OpenAI)
Features:
- Import videos from your YouTube home page using Selenium automation
- Automatic subtitle/transcript extraction via pytubefix
- AI-powered video summarization using OpenAI
- Automatic video categorization (with optional free-form category generation)
- Category management interface
- Progress tracking for partially watched videos
- Hide videos you're not interested in
- Daily PostgreSQL database backups
Requirements:
- Python 3.10+
- PostgreSQL database
- Google Chrome or Chromium browser (for Selenium)
- OpenAI API key
- YouTube account credentials
cd /path/to/youtuber
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
You may also need to install lxml for HTML parsing:
pip install lxml
Create a PostgreSQL database for the application:
# Connect to PostgreSQL
sudo -u postgres psql
-- Create the database
CREATE DATABASE youtuber;
-- Create a user (optional, if not using the default postgres user)
CREATE USER youtuber_user WITH PASSWORD 'your_password';
GRANT ALL PRIVILEGES ON DATABASE youtuber TO youtuber_user;
-- Exit psql
\q
The application will automatically create the required videos table on first run.
Copy the sample environment file and edit it with your credentials:
cp .env.sample .env
Edit .env with your settings:
| Variable | Description |
|---|---|
| MODEL | OpenAI model to use (e.g., gpt-4o-mini) |
| MAX_TOKENS | Maximum tokens for AI responses (e.g., 8192) |
| OPENAI_API_KEY | Your OpenAI API key |
| YOUTUBE_USERNAME | Your YouTube/Google account email |
| YOUTUBE_PASSWORD | Your YouTube/Google account password |
| ALLOW_ANY_CATEGORY | True to let the AI create new categories; False to stop generating new categories and use only the already existing ones plus the CATEGORIES list |
| CATEGORIES | Comma-separated list of predefined categories |
| POSTGRES_DB | PostgreSQL database name |
| POSTGRES_USER | PostgreSQL username |
| POSTGRES_PASSWORD | PostgreSQL password |
| POSTGRES_HOST | PostgreSQL host (e.g., localhost) |
| SERVER_PORT | Port for Streamlit server (default: 7086) |
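As an illustration of how these settings might be consumed, here is a minimal sketch that reads a few of them from a mapping such as os.environ. The helper names (parse_bool, load_settings) are hypothetical and not taken from youtuber.py. The fallback values for MODEL and MAX_TOKENS are just the examples from the table, not confirmed defaults. How the application actually loads .env is not shown here.

```python
import os


def parse_bool(value: str) -> bool:
    """Treat common truthy spellings as True; everything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None) -> dict:
    """Read a subset of the documented variables from a mapping.

    Hypothetical helper: the real application may parse these differently.
    """
    env = os.environ if env is None else env
    # CATEGORIES is comma-separated; drop empty entries and stray whitespace.
    categories = [c.strip() for c in env.get("CATEGORIES", "").split(",") if c.strip()]
    return {
        "model": env.get("MODEL", "gpt-4o-mini"),          # assumed fallback
        "max_tokens": int(env.get("MAX_TOKENS", "8192")),  # assumed fallback
        "allow_any_category": parse_bool(env.get("ALLOW_ANY_CATEGORY", "False")),
        "categories": categories,
        "server_port": int(env.get("SERVER_PORT", "7086")),  # documented default
    }
```

Passing a plain dict instead of os.environ keeps the function easy to try out without touching the real environment.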
./start.sh
Or manually:
source venv/bin/activate
streamlit run youtuber.py --server.port 7086
The application will be available at http://localhost:7086
The main view displays your imported videos organized by category:
- Select a category from the dropdown to filter videos
- View video thumbnails, titles, channels, and lengths
- Click "Subs" to view the video transcript
- Click "Sum" to view the AI-generated summary
- Click "Retry" to regenerate a summary
- Check "hidden" to hide videos from the list
- Click the link to open the video on YouTube
Imports videos from your YouTube home page:
- Opens a Chrome browser window
- Logs into your YouTube account (first run only)
- Scrolls through your home page to load videos
- Extracts video metadata (title, channel, thumbnail, etc.)
- Fetches subtitles/transcripts where available
- Generates AI summaries
- Auto-categorizes each video
Note: The first import requires manual login. You may need to complete 2FA or CAPTCHA challenges in the browser window.
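The auto-categorization step interacts with the ALLOW_ANY_CATEGORY setting. The sketch below shows one way a category suggested by the model could be mapped onto the allowed set. The function name resolve_category and the "Uncategorized" fallback are assumptions made for illustration. The application's real logic may differ.

```python
def resolve_category(suggested, known, allow_any, fallback="Uncategorized"):
    """Map a model-suggested category onto an allowed one.

    known     -- categories that already exist or come from CATEGORIES
    allow_any -- mirrors ALLOW_ANY_CATEGORY
    fallback  -- hypothetical bucket used when nothing is acceptable
    """
    cleaned = suggested.strip()
    # Match case-insensitively so "tech" reuses the existing "Tech".
    by_lower = {name.lower(): name for name in known}
    if cleaned.lower() in by_lower:
        return by_lower[cleaned.lower()]
    # An unknown suggestion is accepted only when new categories are allowed.
    if allow_any and cleaned:
        return cleaned
    return fallback
```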
Rename or merge categories:
- View all existing categories
- Enter a new name to rename a category (all videos in that category will be updated)
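Renaming a category amounts to a single UPDATE on the videos table, and renaming it to a name that already exists merges the two. The sketch below demonstrates this with an in-memory SQLite database as a stand-in for PostgreSQL, so it runs without a server. The function name rename_category is hypothetical. A PostgreSQL driver such as psycopg2 would use %s placeholders instead of ?.

```python
import sqlite3


def rename_category(conn, old_name, new_name):
    """Move every video from old_name to new_name; return the rows changed."""
    cur = conn.execute(
        "UPDATE videos SET category = ? WHERE category = ?",
        (new_name, old_name),
    )
    conn.commit()
    return cur.rowcount


# Stand-in table with only the columns needed for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE videos (id INTEGER PRIMARY KEY, title TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO videos (title, category) VALUES (?, ?)",
    [("a", "Tech"), ("b", "Technology"), ("c", "Music")],
)

# "Tech" is renamed to the existing "Technology", which merges the two.
changed = rename_category(conn, "Tech", "Technology")
remaining = sorted(row[0] for row in conn.execute("SELECT DISTINCT category FROM videos"))
```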
Append one of these query strings to the application URL to trigger an action:
- /?action=summarize: Generate summaries for videos that have subtitles but no summary
- /?action=subs: Import subtitles for videos missing them
- /?action=themes: Extract themes from videos (functionality partially commented out)
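The URL actions above suggest a simple dispatch on the action query parameter. The following sketch shows that pattern with placeholder handlers. All function names here are hypothetical, and the handlers only return a label instead of doing real work. In a Streamlit app the mapping would typically come from st.query_params, which is left out so the example runs without Streamlit installed.

```python
def summarize_missing():
    """Placeholder for: summarize videos that have subtitles but no summary."""
    return "summarize"


def import_missing_subs():
    """Placeholder for: fetch subtitles for videos that lack them."""
    return "subs"


def extract_themes():
    """Placeholder for: extract themes (partially disabled in the project)."""
    return "themes"


# Maps the value of ?action=... to its handler.
ACTIONS = {
    "summarize": summarize_missing,
    "subs": import_missing_subs,
    "themes": extract_themes,
}


def dispatch(query_params):
    """Run the handler named by the 'action' parameter, if there is one."""
    handler = ACTIONS.get(query_params.get("action"))
    if handler is None:
        return None  # no action or an unknown one: fall through to the normal page
    return handler()
```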
In youtuber.py, you can modify:
- SKIP_RELOAD = True: Set this to skip reloading the YouTube home page during import (useful if the import crashes and you want to resume without re-scraping)
The videos table contains:
| Column | Type | Description |
|---|---|---|
| id | SERIAL | Primary key |
| title | VARCHAR | Video title |
| link | VARCHAR | YouTube video URL |
| channel | VARCHAR | Channel name |
| thumbnail | VARCHAR | Thumbnail URL |
| subtitles | TEXT | Video transcript |
| summary | TEXT | AI-generated summary |
| blurb | TEXT | Short blurb (currently unused) |
| themes | TEXT | Extracted themes (currently unused) |
| progress | INT | Watch progress percentage |
| category | VARCHAR | Video category |
| video_created | TIMESTAMP | When video was published |
| video_length | INTERVAL | Video duration |
| record_created | TIMESTAMP | When record was imported |
| hidden | BOOLEAN | Whether video is hidden from view |
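For reference, the schema above can be written as a CREATE TABLE statement. The sketch below builds one from the documented column names and types. It is a reconstruction from the table, not the statement the application actually runs: defaults, NOT NULL constraints, VARCHAR lengths, and indexes are unknown and therefore omitted. The helper name build_create_table is hypothetical.

```python
# Column names and types as documented; anything beyond that is unknown.
COLUMNS = [
    ("id", "SERIAL PRIMARY KEY"),
    ("title", "VARCHAR"),
    ("link", "VARCHAR"),
    ("channel", "VARCHAR"),
    ("thumbnail", "VARCHAR"),
    ("subtitles", "TEXT"),
    ("summary", "TEXT"),
    ("blurb", "TEXT"),
    ("themes", "TEXT"),
    ("progress", "INT"),
    ("category", "VARCHAR"),
    ("video_created", "TIMESTAMP"),
    ("video_length", "INTERVAL"),
    ("record_created", "TIMESTAMP"),
    ("hidden", "BOOLEAN"),
]


def build_create_table(table, columns):
    """Render a CREATE TABLE IF NOT EXISTS statement for PostgreSQL."""
    body = ",\n".join(f"    {name} {sqltype}" for name, sqltype in columns)
    return f"CREATE TABLE IF NOT EXISTS {table} (\n{body}\n);"


CREATE_VIDEOS_TABLE = build_create_table("videos", COLUMNS)
```

Using IF NOT EXISTS matches the described behavior of creating the table on first run while leaving an existing table untouched.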
The application automatically creates daily PostgreSQL backups on startup:
- Backup files are named pg_dump_YYYYMMDD.sql.gz
- Only the 5 most recent backups are kept
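The naming and retention rules can be sketched as follows. This is a minimal illustration under the assumption that all backups live in one directory. The function names are hypothetical, and the step that actually runs pg_dump is left out.

```python
from datetime import date
from pathlib import Path

KEEP = 5  # number of most recent backups to retain, per the rule above


def backup_name(day):
    """Build the documented file name, e.g. pg_dump_20240309.sql.gz."""
    return f"pg_dump_{day:%Y%m%d}.sql.gz"


def prune_backups(directory, keep=KEEP):
    """Delete all but the `keep` newest backups; return the deleted names."""
    # YYYYMMDD sorts chronologically, so a plain name sort orders by date.
    backups = sorted(Path(directory).glob("pg_dump_*.sql.gz"))
    stale = backups[:-keep] if keep > 0 else backups
    for path in stale:
        path.unlink()
    return [path.name for path in stale]
```

Because the date is embedded in the file name, pruning does not depend on file modification times, which can change when backups are copied or restored.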
- Chrome/Selenium issues: Ensure Chrome/Chromium is installed and accessible. The app auto-detects the browser version.
- Login failures: YouTube may require CAPTCHA or 2FA. Complete these manually in the browser window.
- Import crashes: Set SKIP_RELOAD = True in youtuber.py to resume without re-fetching the home page.
This project is for personal use.