Get up and running with LLM Playground in under 10 minutes!
```bash
# 1. Install Ollama (if not already installed)
brew install ollama  # macOS

# 2. Start Ollama
ollama serve  # In one terminal

# 3. Set up the project (in another terminal)
cd llm_101
./setup.sh

# 4. Launch!
streamlit run app.py
```

Your browser will open to the playground!
- This file (you're here!) ✓
- Launch `streamlit run app.py` - Click around and experiment!
- README.md - Project overview
- CONCEPTS.md - LLM theory
- TUTORIAL.md - Guided experiments
- ARCHITECTURE.md - System design
- Look at `models/` and `experiments/` code - Extend with your own features!
- Open the Streamlit app
- In sidebar, click "🔌 Connect to Model"
- Wait for "✅ Connected" message
- Go to "💬 Quick Chat" tab
- Type: "Explain machine learning in one sentence"
- Click "🚀 Generate"
- Watch the magic happen!
- Go to "🌡️ Temperature" tab
- Keep the default prompt
- Test temperatures: 0.1, 0.7, 1.5
- Click "🧪 Run Temperature Test"
- Compare the outputs!
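Under the hood, temperature rescales the model's next-token probabilities before sampling: low values sharpen the distribution toward the top token, high values flatten it. A minimal Python sketch of the idea (illustrative only, not the playground's actual code):

```python
import math
import random

def temperature_probs(logits, temperature):
    """Convert raw logits to a probability distribution, scaled by temperature."""
    scaled = [l / temperature for l in logits]
    max_l = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(l - max_l) for l in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                 # toy scores for three candidate tokens
cold = temperature_probs(logits, 0.1)    # sharp: top token dominates
hot = temperature_probs(logits, 1.5)     # flat: other tokens get real probability
token = random.choices(range(3), weights=hot)[0]  # the sampling step
```

This is why 0.1 feels deterministic and 1.5 feels creative: the same logits produce very different sampling distributions.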
Try different tasks without examples:
- Sentiment analysis
- Question answering
- Code generation
Observe: How accurate is the model?
Add examples and see improvement:
- Compare zero-shot vs few-shot
- Notice accuracy increase
- Understand the cost (more tokens)
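A few-shot prompt is just the zero-shot prompt with worked examples prepended, which is also why it costs more tokens. A hypothetical helper (the function name and format are illustrative, not part of the project):

```python
def few_shot_prompt(task, examples, query):
    """Build a few-shot prompt: task description, worked examples, then the query."""
    lines = [task]
    for text, label in examples:
        lines.append(f"Text: {text}\nSentiment: {label}")
    lines.append(f"Text: {query}\nSentiment:")  # model completes from here
    return "\n\n".join(lines)

examples = [
    ("I loved this movie!", "positive"),
    ("Terrible service, never again.", "negative"),
]
prompt = few_shot_prompt("Classify the sentiment of each text.",
                         examples, "The food was okay.")
# Every example adds prompt tokens, so accuracy gains come at a cost.
```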
Find the sweet spot:
- Test: 0.1 (deterministic)
- Test: 0.7 (balanced)
- Test: 1.5 (creative)
Question: Which is best for facts? For stories?
Small changes, big differences:
- Try "tone_changes" variations
- See how outputs differ
- Learn prompt engineering!
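You can also generate your own variations outside the app. A tiny sketch (it assumes nothing about how the app's "tone_changes" preset actually works):

```python
base = "Describe a rainy day"
tones = ["formal", "playful", "poetic"]  # pick any contrasting tones

# One prompt per tone; paste each into the playground and compare outputs
variants = [f"{base}, in a {tone} tone." for tone in tones]
for v in variants:
    print(v)
```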
Analyze your experiments:
- Go to "📋 Logs" tab
- Review your interactions
- Look for patterns
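If you prefer Python to jq, the same log fields (`metrics.latency_ms` and `metrics.total_tokens`, as used by the jq examples elsewhere in this guide) can be aggregated in a few lines. A sketch, assuming one JSON object per line:

```python
import glob
import json

def summarize(records):
    """Aggregate latency and token usage from parsed log records."""
    latencies = [r["metrics"]["latency_ms"] for r in records]
    tokens = [r["metrics"]["total_tokens"] for r in records]
    return {
        "runs": len(records),
        "avg_latency_ms": sum(latencies) / len(latencies),
        "total_tokens": sum(tokens),
    }

records = []
for path in glob.glob("logs/*.jsonl"):  # one JSON object per line
    with open(path) as f:
        records.extend(json.loads(line) for line in f if line.strip())

if records:
    print(summarize(records))
```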
- ✅ Setup and first experiments
- ✅ Understand temperature
- ✅ Try different prompts
- ✅ Read CONCEPTS.md
- ✅ Understand transformers
- ✅ Learn about tokens
- ✅ Follow TUTORIAL.md
- ✅ Complete all experiments
- ✅ Analyze your logs
- ✅ Read LEARNING_OUTCOMES.md
- ✅ Try follow-up experiments
- ✅ Build a mini-project
```bash
# Read theory
cat CONCEPTS.md

# Launch interactive playground
streamlit run app.py

# Experiment systematically
```

```bash
# Quick testing
python cli.py generate "Your prompt here"

# Compare variations
# Use Streamlit "Sensitivity" tab
```

```bash
# Test different models
ollama pull mistral
ollama pull phi

# Compare in Streamlit
# Switch models in sidebar
```

```bash
# Check logs for token usage
cat logs/*.jsonl | jq '.metrics.total_tokens'

# Calculate costs
# (automatic in logs)
```

```bash
# Start Ollama
ollama serve

# Verify it's running
curl http://localhost:11434/api/tags
```

```bash
# Pull the model
ollama pull llama2

# List available models
ollama list
```

```bash
# Activate virtual environment
source venv/bin/activate

# Reinstall dependencies
pip install -r requirements.txt
```

```bash
# Check Python version (need 3.8+)
python --version

# Reinstall Streamlit
pip install --upgrade streamlit
```

Create a prompt library in a text file:
```text
# good_prompts.txt
Explain [topic] in simple terms for a beginner
Write a [type] story about [subject] with a [tone] tone
Classify this as [categories]: [text]
```

Use the CLI for automation:

```bash
# Test multiple prompts
for temp in 0.1 0.5 0.9; do
  python cli.py generate "Your prompt" --temperature $temp
done
```

Use jq to query logs:

```bash
# Average latency
cat logs/*.jsonl | jq '.metrics.latency_ms' | awk '{sum+=$1} END {print sum/NR}'

# Total tokens used
cat logs/*.jsonl | jq '.metrics.total_tokens' | awk '{sum+=$1} END {print sum}'
```

Open two terminal windows:

```bash
# Terminal 1: llama2
python cli.py generate "Test prompt" --model llama2

# Terminal 2: mistral
python cli.py generate "Test prompt" --model mistral
```

Tokens:
- Prompt tokens: Your input
- Completion tokens: Model output
- Total tokens: Sum (used for billing)
Latency:
- Time from request to response
- Includes: network + processing + generation
- Lower is better (but quality matters more!)
Cost:
- $0.00 for Ollama (local)
- ~$0.0005-0.03 per 1K tokens for OpenAI
- Check logs for your actual usage
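The cost arithmetic is simple: hosted APIs price per 1,000 tokens. A quick sketch (the $0.002 rate below is illustrative, not a quote for any specific model):

```python
def estimate_cost(total_tokens, price_per_1k):
    """Estimate spend when pricing is quoted per 1,000 tokens."""
    return total_tokens / 1000 * price_per_1k

print(estimate_cost(5000, 0.0))    # Ollama runs locally, so cost is 0.0
print(estimate_cost(5000, 0.002))  # at $0.002/1K tokens: about $0.01
```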
- ✅ Complete first hour experiments
- ✅ Read key sections of CONCEPTS.md
- ✅ Try 10+ different prompts
- ✅ Understand temperature effects
- ✅ Complete full TUTORIAL.md
- ✅ Read all of CONCEPTS.md
- ✅ Try all experiment types
- ✅ Analyze your logs
- ✅ Read LEARNING_OUTCOMES.md
- ✅ Implement follow-up experiments
- ✅ Build a mini-project
- ✅ Read ARCHITECTURE.md
- ✅ Extend the system
- README.md - Overview
- CONCEPTS.md - Theory
- TUTORIAL.md - Hands-on guide
- ARCHITECTURE.md - Technical details
- example.py - 4 working examples
- experiments/ - Reusable patterns
- app.py - Full UI implementation
Q: Which model should I use?
A: Start with llama2. It's small, fast, and works well for learning.
Q: What's a good temperature?
A: 0.7 is a good default. Lower (0.1-0.3) for facts, higher (1.0-1.5) for creativity.
Q: How many tokens can I use?
A: Most models support 2K-4K tokens. Check context limits for your model.
Q: Is it free?
A: Yes! Ollama is completely free for local use.
Q: Can I use GPT-4?
A: Yes, if you have an OpenAI API key. Add it to .env file.
You now know enough to start experimenting. The best way to learn is by doing!
Start here:

```bash
streamlit run app.py
```

Have fun learning! 🚀
- Setup: `./setup.sh`
- Web UI: `streamlit run app.py`
- CLI: `python cli.py --help`
- Example: `python example.py`
- List models: `ollama list`
- Pull model: `ollama pull <model>`
- View logs: `cat logs/*.jsonl`
- Theory: `cat CONCEPTS.md`
- Tutorial: `cat TUTORIAL.md`
Bookmark this page for quick reference!