Multi-source entry browser and analyzer. Import, organize, tag, and export entries (notes, docs, and conversations) from ChatGPT, Google Drive, and Bear notes.
What is an entry? Jungle treats all content as "entries," whether it's a ChatGPT conversation, a Google Doc, or a Bear note. This unified approach lets you organize, search, and analyze all your textual content in one place.
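A unified entry might be modeled as a small record type. This sketch is illustrative only — the `Entry` class and its fields are assumptions, not jungle's actual data model:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class Entry:
    """One piece of content, regardless of where it came from (illustrative)."""
    title: str
    source: str                       # "chatgpt", "google_drive", or "bear"
    source_id: str                    # ID in the originating system
    content: str                      # markdown / plain-text body
    create_time: Optional[datetime] = None
    tags: list[str] = field(default_factory=list)

# A Bear note and a ChatGPT conversation share the same shape:
note = Entry(title="Reading list", source="bear", source_id="abc-123",
             content="- book one\n- book two", tags=["reading"])
```

Because every source normalizes to the same shape, tagging, search, and export code only ever has to handle one type.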
- Multi-Source Import: Import from ChatGPT, Google Drive/Docs, and Bear notes
- CLI Interface: Powerful command-line tool for all operations
- YAML Configuration: Flexible configuration with environment variable substitution
- Markdown Export: Export to generic markdown, Obsidian, or Bear formats
- SQLite Database: Fast, local storage with full-text search
- Auto-Tagging: Automatic topic extraction and categorization (coming soon)
- Similarity Search: Find related entries using embeddings (coming soon)
- Web UI: Optional Streamlit interface for browsing (coming soon)
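The full-text search feature can be sketched with SQLite's FTS5 extension (available in most Python builds). The schema below is a hypothetical illustration, not jungle's actual schema, and uses an in-memory database rather than `data/metadata.db`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # demo only; jungle stores a file on disk

# Hypothetical full-text index over entry titles and bodies
conn.execute("CREATE VIRTUAL TABLE entries_fts USING fts5(title, body)")
conn.execute("INSERT INTO entries_fts VALUES (?, ?)",
             ("Python tips", "Use dataclasses for plain records."))
conn.execute("INSERT INTO entries_fts VALUES (?, ?)",
             ("Trip notes", "Flights and hotels for the conference."))

# MATCH performs tokenized full-text search, not substring matching
rows = conn.execute(
    "SELECT title FROM entries_fts WHERE entries_fts MATCH ?",
    ("dataclasses",),
).fetchall()
print(rows)  # [('Python tips',)]
```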
```bash
# Clone the repository
git clone https://github.com/zmainen/jungle.git
cd jungle

# Create conda environment
conda create -n jungle python=3.11
conda activate jungle

# Install dependencies
pip install -e .

# Set up environment variables
export OPENAI_API_KEY="sk-..."  # Optional, for embeddings

# Initialize database
jungle db init
```

```bash
# Import from ChatGPT export
jungle import chatgpt /path/to/conversations.json
```

This will:
- Parse all entries from the JSON export
- Export each to markdown with YAML frontmatter
- Store metadata in SQLite database
- Save files to `data/conversations/` (note: directory name retained for backward compatibility)
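The export step above — markdown with YAML frontmatter — can be sketched as a single function. The input dict here is a simplified stand-in for what a ChatGPT-export parser might produce; jungle's real parser output and frontmatter fields may differ:

```python
def conversation_to_markdown(conv: dict) -> str:
    """Render one parsed conversation as markdown with YAML frontmatter
    (illustrative sketch, not jungle's actual exporter)."""
    lines = [
        "---",
        f'title: "{conv["title"]}"',
        "source: chatgpt",
        f'source_id: "{conv["id"]}"',
        f'create_time: "{conv["create_time"]}"',
        f'message_count: {len(conv["messages"])}',
        "---",
        "",
    ]
    for msg in conv["messages"]:
        lines.append(f"## {msg['role'].capitalize()}")
        lines.append("")
        lines.append(msg["text"])
        lines.append("")
    return "\n".join(lines)

md = conversation_to_markdown({
    "title": "Hello", "id": "uuid-1",
    "create_time": "2025-01-15T10:30:00Z",
    "messages": [{"role": "user", "text": "Hi"},
                 {"role": "assistant", "text": "Hello!"}],
})
```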
```bash
# General statistics
jungle stats

# By model
jungle stats --by-model

# By tag
jungle stats --by-tag
```

```bash
# View current config
jungle config show

# Validate config
jungle config validate

# Edit config
jungle config edit
```

Edit `config.yaml` to configure:
- Data sources: Enable/disable ChatGPT, Google Drive, Bear
- Storage paths: Database location, output directories
- OpenAI settings: API key, embedding model
- Topic categories: Customize auto-tagging topics
- UI preferences: Streamlit port, theme, etc.
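The environment variable substitution mentioned earlier can be done by walking the loaded config tree and expanding `$VAR` / `${VAR}` in string values. This is a minimal stdlib sketch operating on an already-parsed dict (jungle's actual loader in `config_loader.py` may work differently):

```python
import os

def expand_env(value):
    """Recursively substitute $VAR / ${VAR} in string values of a config tree."""
    if isinstance(value, dict):
        return {k: expand_env(v) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_env(v) for v in value]
    if isinstance(value, str):
        return os.path.expandvars(value)
    return value

os.environ["OPENAI_API_KEY"] = "sk-test"
config = {
    "openai": {"api_key": "${OPENAI_API_KEY}"},   # resolved from the environment
    "sources": {"google_drive": {"enabled": True}},
}
resolved = expand_env(config)
print(resolved["openai"]["api_key"])  # sk-test
```

Keeping secrets in the environment rather than in `config.yaml` means the config file can be committed safely.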
```yaml
sources:
  google_drive:
    enabled: true
    credentials_path: "~/.secrets"
    folder_ids:
      - "your-folder-id-here"
```

```bash
jungle import chatgpt <json-file>            # Import ChatGPT entries (conversations)
jungle import google-drive --folder-id <ID>  # Import from Google Drive (docs)
jungle import bear --all                     # Import all Bear notes
```

```bash
jungle search "keyword"    # Search entries (coming soon)
jungle similar <entry-id>  # Find similar entries (coming soon)
```

```bash
jungle tag list            # List all tags (coming soon)
jungle tag add <id> <tag>  # Add tag to entry (coming soon)
jungle tag auto            # Auto-generate tags (coming soon)
```

```bash
jungle export --tag research --format obsidian  # Export by tag (coming soon)
jungle export --all --format bear               # Export all (coming soon)
```

```bash
jungle db init            # Initialize database schema
jungle db backup          # Backup database (coming soon)
jungle db restore <file>  # Restore from backup (coming soon)
```

```bash
jungle ui  # Launch Streamlit web interface (coming soon)
```

```
jungle-viewer/
├── cli.py                      # Click CLI entry point
├── config.yaml                 # YAML configuration
├── config_loader.py            # Configuration parser
├── setup.py                    # Package setup
├── parsers/                    # Source-specific parsers
│   ├── base_parser.py          # Abstract base
│   ├── chatgpt_parser.py       # ChatGPT JSON parser
│   ├── google_drive_parser.py  # Google Drive API
│   └── bear_parser.py          # Bear SQLite parser
├── exporters/                  # Markdown exporters
│   └── markdown_exporter.py    # Generic, Obsidian, Bear
├── storage/                    # Data persistence
│   └── database.py             # SQLite interface
├── analysis/                   # Analysis modules (coming soon)
│   ├── topic_analyzer.py
│   ├── similarity.py
│   └── stats.py
└── data/                       # Generated data (gitignored)
    ├── conversations/          # Markdown files (name kept for backward compatibility)
    ├── exports/                # Exported collections
    └── metadata.db             # SQLite database
```
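The parser architecture centers on an abstract base (`parsers/base_parser.py`). A minimal sketch of that pattern, assuming a `parse()` method that yields one dict per entry — the actual interface in jungle may differ:

```python
import json
from abc import ABC, abstractmethod
from typing import Iterator

class BaseParser(ABC):
    """Common interface each source parser implements (illustrative)."""

    source: str  # e.g. "chatgpt", "google_drive", "bear"

    @abstractmethod
    def parse(self, location: str) -> Iterator[dict]:
        """Yield one metadata/content dict per entry found at `location`."""

class ChatGPTParser(BaseParser):
    source = "chatgpt"

    def parse(self, location: str) -> Iterator[dict]:
        # ChatGPT's export is a JSON array of conversation objects
        with open(location) as f:
            for conv in json.load(f):
                yield {"source": self.source,
                       "title": conv.get("title", "Untitled"),
                       "source_id": conv.get("id", "")}
```

New sources then only require a new subclass; the import, storage, and export layers stay untouched.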
Each entry is exported as markdown with YAML frontmatter:
```markdown
---
title: "Entry Title"
source: chatgpt          # or 'google_drive', 'bear'
source_id: "uuid-here"
create_time: "2025-01-15T10:30:00Z"
model: "gpt-4o"          # for ChatGPT entries
message_count: 42        # for conversation-type entries
tags:
  - technical
  - python
---

## User (2025-01-15 10:30)

Message content here...

## Assistant (2025-01-15 10:31)

Response content here...
```

Note: For non-conversation entries (Google Docs, Bear notes), the format adapts: `message_count` and role headers are omitted, and content appears as standard markdown.
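Tools consuming these exports need to split the frontmatter from the body. A minimal stdlib sketch for the simple `key: value` lines shown above — a real implementation would use a YAML library, and the helper name here is hypothetical:

```python
def split_frontmatter(text: str) -> tuple[dict, str]:
    """Split exported markdown into (frontmatter fields, body).
    Handles only flat `key: value` lines; illustrative, not jungle's parser."""
    _, fm, body = text.split("---\n", 2)
    fields = {}
    for line in fm.splitlines():
        # skip list items / indented continuation lines (e.g. tags)
        if ":" in line and not line.startswith(("-", " ")):
            key, _, val = line.partition(":")
            fields[key.strip()] = val.strip().strip('"')
    return fields, body

doc = '---\ntitle: "Entry Title"\nsource: chatgpt\n---\n## User\n\nHi\n'
meta, body = split_frontmatter(doc)
print(meta["source"])  # chatgpt
```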
- YAML configuration system
- Click-based CLI framework
- Multi-source parser architecture
- ChatGPT parser (tested with 1,200+ entries)
- Google Drive parser
- Bear notes parser
- SQLite database with multi-source support
- Markdown exporters (generic, Obsidian, Bear)
- ChatGPT import command
- Statistics command
- Package installation (`jungle` command)
- Topic analyzer with TF-IDF
- OpenAI embeddings for similarity
- Search commands
- Tag management commands
- Google Drive import implementation
- Bear import implementation
- Streamlit web UI
- Export commands
- Sync commands
- Full-text search
- Semantic search with embeddings
- Visualization pages
- Python 3.8+
- SQLite (included with Python)
- OpenAI API key (optional, for embeddings)
- Google Drive credentials (optional, for Google Drive import)
- Bear app (optional, for Bear import)
This is a personal project, but suggestions and improvements are welcome!
MIT License
Zach Mainen