HappyHackingSpace · myz21 · Jan 17, 2026 · Jan 29, 2026 · Mar 2, 2026 · Mar 8, 2026
diff --git a/.env.example b/.env.example
@@ -1,3 +1,4 @@
 OPENAI_API_KEY=
 OPENAI_MODEL=
+GEMINI_API_KEY=
 LOG_LEVEL=INFO
diff --git a/README.md b/README.md
@@ -16,8 +16,13 @@ and **aggregate** category scores into an overall score with strengths, risks, r
 > **Part of [Happy Hacking Space](https://github.com/HappyHackingSpace) - A community-driven
 > organization focused on security, AI, and software development.**
 
+## Architecture & How It Works
+
+![Privacy Policy Analyzer System Workflow](docs/workflow.svg)
+
 ## Features
 
+- **Multi-LLM Provider Engine**: Switch between **Google Gemini** (via the official `google-genai` SDK) and **OpenAI GPT** models based on the model name prefix.
 - **Auto-discovery**: Common paths → robots.txt/sitemaps → footer links.
 - **HTTP-first extraction**: `trafilatura` (clean text) or `BeautifulSoup` fallback; **Selenium** for dynamic pages.
 - **Structured scoring (JSON)**: Per-category (0–10) scores + rationales; aggregated to 0–100 overall in `scoring.py`.
@@ -48,6 +53,7 @@ privacy-policy-analyzer/
 ├── pyproject.toml                 # Project configuration
 ├── requirements.txt               # Legacy requirements
 ├── .env.example                   # Environment template
+├── RESULTS.md                     # Benchmarks & real-world scores (GitHub, TikTok, Meta)
 ├── .gitignore
 ├── LICENSE
 └── README.md
@@ -56,7 +62,7 @@ privacy-policy-analyzer/
 ## Requirements
 
 - Python **3.10.11 or higher**
-- An **OpenAI API key**
+- An **OpenAI API key** (for GPT models) and/or a **Google Gemini API key** (for Gemini models)
 - (Optional) **Chrome/Chromium** on the machine (Selenium fallback; driver auto-installs)
 
 ## Installation
@@ -100,8 +106,10 @@ Copy `.env.example` → `.env` and set your credentials:
 
 ```
 OPENAI_API_KEY=sk-************************
-# Optional (overrides default):
 OPENAI_MODEL=gpt-4o
+
+# Required if using Gemini models (e.g. gemini-2.5-flash)
+GEMINI_API_KEY=AIzaSy*********************
 ```
 
 ## Usage
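
Note on the README changes: the new feature bullet says the provider is chosen from the model name prefix, and the configuration block says `GEMINI_API_KEY` is required for Gemini models. A minimal sketch of that selection logic might look like the following. The function name `resolve_provider`, the `_PROVIDERS` table, and the error messages are hypothetical and are not taken from this PR; only the environment variable names come from `.env.example`.

```python
import os
from typing import Mapping, Optional

# Hypothetical table: model-name prefix -> (provider label, required env var).
# Only the env var names come from .env.example; the rest is illustrative.
_PROVIDERS = {
    "gemini": ("gemini", "GEMINI_API_KEY"),
    "gpt": ("openai", "OPENAI_API_KEY"),
}


def resolve_provider(
    model: str, env: Optional[Mapping[str, str]] = None
) -> tuple[str, str]:
    """Return (provider, api_key) for a model name, based on its prefix."""
    env = os.environ if env is None else env
    name = model.strip().lower()
    for prefix, (provider, key_var) in _PROVIDERS.items():
        if name.startswith(prefix):
            api_key = env.get(key_var, "")
            if not api_key:
                # Fail early with a clear message instead of at request time.
                raise RuntimeError(f"{key_var} must be set to use {model!r}")
            return provider, api_key
    raise ValueError(f"No provider is registered for model {model!r}")
```

Passing `env` explicitly keeps the lookup easy to test; in normal use it falls back to `os.environ`.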

diff --git a/RESULTS.MD b/RESULTS.MD
@@ -0,0 +1,71 @@
+# Privacy Policy Analyzer - Test Results
+
+This document contains the analysis results for the **Facebook** and **TikTok** privacy policies, produced with the refactored CLI tool.
+
+---
+
+## 1. Facebook Analysis Results
+
+*   **Target Website:** `https://facebook.com`
+*   **Resolved Privacy URL:** `https://www.facebook.com/privacy/policy/?entry_point=facebook_page_footer`
+*   **LLM Model Used:** `openai/gpt-oss-120b:free` (OpenRouter)
+*   **Total Chunks Analyzed:** 6 (5 valid chunks aggregated)
+
+### Score & Overview
+*   **Overall Privacy Score:** `42.16 / 100` (Moderate risk profile)
+*   **Confidence Level:** `1.0`
+*   **Total Red Flags Identified:** `20`
+
+### Top Strengths
+1.  **Lawful Basis & Purpose:** `5.8 / 10`
+2.  **Transparency & Notice:** `5.8 / 10`
+3.  **Collection & Minimization:** `5.2 / 10`
+
+### Top Risks
+1.  **Retention & Deletion:** `2.8 / 10` (High Risk)
+2.  **User Rights & Redress:** `3.2 / 10` (High Risk)
+3.  **Sensitive Children, Ads & Profiling:** `3.2 / 10` (High Risk)
+
+---
+
+## 2. TikTok Analysis Results
+
+*   **Target Website:** `https://tiktok.com`
+*   **Resolved Privacy URL:** `https://www.tiktok.com/legal/privacy-policy-row`
+*   **LLM Model Used:** `openai/gpt-oss-120b:free` (OpenRouter)
+*   **Total Chunks Analyzed:** 4 (2 valid chunks aggregated)
+
+### Score & Overview
+*   **Overall Privacy Score:** `49.9 / 100` (Moderate risk profile, slightly better than Facebook)
+*   **Confidence Level:** `1.0`
+*   **Total Red Flags Identified:** `9`
+
+### Top Strengths
+1.  **Lawful Basis & Purpose:** `6.5 / 10`
+2.  **Third Parties & Processors:** `6.5 / 10`
+3.  **User Rights & Redress:** `6.0 / 10`
+
+### Top Risks
+1.  **Retention & Deletion:** `3.0 / 10` (High Risk)
+2.  **Security & Breach:** `3.0 / 10` (High Risk)
+3.  **Cross Border Transfers:** `3.5 / 10` (Medium-High Risk)
+
+---
+
+## 3. Performance & System Benchmarks
+
+| Metric | Facebook (`https://facebook.com`) | TikTok (`https://tiktok.com`) |
+| :--- | :--- | :--- |
+| **Discovery Time** | 1.60s | 5.11s (Discovered via Sitemap) |
+| **Fetching Time** | 3.50s | 1.97s |
+| **LLM Analysis Time** | 114.72s | 59.91s |
+| **Total Duration** | 119.83s | 66.99s |
+| **Average Per Chunk** | 19.12s | 14.98s |
+| **Text Characters** | 48,763 chars | 31,001 chars |
+
+---
+
+## 4. Key Takeaways
+1.  **Robust Sitemap Discovery:** The new scoring and sitemap heuristics located TikTok's privacy policy at `https://www.tiktok.com/legal/privacy-policy-row` in 5.11s.
+2.  **Parallel Execution & Rate Limits:** The multi-threaded pipeline analyzed multiple policy chunks concurrently. The integrated `tenacity` exponential-backoff retry logic handled the rate limits on the free-tier backend.
+3.  **Clean Output Architecture:** Routing error messages to stderr (`click.secho(..., err=True)`) kept stdout clean, so the output JSON can be piped directly into other tools.
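
Note on takeaway 2: the pattern it describes (chunks analyzed in parallel, each call retried with exponential backoff on rate limits) can be sketched with the standard library alone. This is a stand-in for illustration, not the PR's code: the project uses `tenacity`, whereas the sketch below hand-rolls the retry loop, and `RateLimitError`, `call_with_backoff`, and `analyze_chunks` are hypothetical names.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit (HTTP 429) error."""


def call_with_backoff(
    fn: Callable[[], T], attempts: int = 5, base_delay: float = 1.0
) -> T:
    """Call fn, retrying on RateLimitError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == attempts - 1:
                raise  # Out of retries: surface the error to the caller.
            # Delay doubles each attempt, plus up to 10% random jitter.
            delay = base_delay * (2 ** attempt)
            time.sleep(delay + random.uniform(0, delay * 0.1))
    raise AssertionError("unreachable")


def analyze_chunks(
    chunks: list[str],
    analyze: Callable[[str], T],
    max_workers: int = 4,
    base_delay: float = 1.0,
) -> list[T]:
    """Analyze chunks concurrently; results keep the input order."""
    def run(chunk: str) -> T:
        return call_with_backoff(lambda: analyze(chunk), base_delay=base_delay)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, chunks))
```

Threads suit this workload because each call spends most of its time waiting on the network rather than on the CPU.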
diff --git a/RESULTS.md b/RESULTS.md
@@ -0,0 +1,80 @@
+# Privacy Policy Analysis Results
+
+This document records the empirical results of evaluating real-world privacy policies with the Privacy Policy Analyzer. All analyses were run on **Google Gemini** (`gemini-2.5-flash`), one of the directly supported LLM providers.
+
+---
+
+## 📊 Summary of Analyzed Policies
+
+| Platform | Domain | Resolved Policy URL | Overall Score | Red Flags | Confidence | Analysis Time |
+| :--- | :--- | :--- | :---: | :---: | :---: | :---: |
+| **GitHub** | `github.com` | [GitHub Privacy Statement](https://docs.github.com/site-policy/privacy-policies/github-privacy-statement) | **44.40 / 100** | 8 | 100% | 23.66s |
+| **TikTok** | `tiktok.com` | [TikTok Privacy Policy](https://www.tiktok.com/legal/privacy-policy) | **41.41 / 100** | 15 | 100% | 36.90s |
+| **Facebook** | `facebook.com` | [Meta Privacy Policy](https://www.facebook.com/privacy/policy/) | **40.15 / 100** | 9 | 100% | 28.62s |
+
+---
+
+## 🔍 Detailed Breakdown per Platform
+
+### 1. GitHub (Score: 44.40/100)
+
+> [!NOTE]
+> GitHub provides relatively transparent data sharing disclosures and clear lawful bases, but scores extremely poorly on retention timelines and cross-border safeguards in user-facing summaries.
+
+* **Top Strengths:**
+  1. **Third Parties & Processors** (`8.5/10`) — Detailed processor disclosures and joint-controller roles are clearly demarcated.
+  2. **Transparency & Notice** (`8.0/10`) — Standard simplified language with comprehensive version notices and clear contacts.
+  3. **Lawful Basis & Purpose** (`7.5/10`) — Well-articulated data processing purposes and legitimate business interests.
+* **Top Risks:**
+  1. **Retention & Deletion** (`0.0/10`) — Criticized for indefinite retention clauses (e.g., "retained as long as your account is active").
+  2. **Cross-Border Transfers** (`0.5/10`) — Extremely vague transfer mechanism safety disclosures in main text chunks.
+  3. **Security & Breach** (`1.0/10`) — Lack of concrete technical and organizational safety measures detailed in the text excerpt analyzed.
+
+---
+
+### 2. TikTok (Score: 41.41/100)
+
+> [!WARNING]
+> TikTok suffers from a massive volume of Red Flags (15 distinct flags detected). While transparent about *why* they collect data, their security standards and user-redress frameworks remain highly problematic under global scoring rubrics.
+
+* **Top Strengths:**
+  1. **Lawful Basis & Purpose** (`7.0/10`) — Purposes of processing are detailed but heavily geared toward personalized content.
+  2. **Secondary Use & Limits** (`6.67/10`) — Stated boundaries on auxiliary features.
+  3. **Transparency & Notice** (`6.67/10`) — Highly structured layout despite massive length.
+* **Top Risks:**
+  1. **Security & Breach** (`0.0/10`) — Analyzed chunks contained zero actionable breach details or concrete technical guardrails.
+  2. **Retention & Deletion** (`2.0/10`) — "Keep as long as necessary" phrasing used frequently without specific schedules.
+  3. **User Rights & Redress** (`2.67/10`) — Complex mechanisms to object or request erasure for global users outside EEA/California jurisdictions.
+
+---
+
+### 3. Facebook / Meta (Score: 40.15/100)
+
+> [!CAUTION]
+> Meta's privacy policy is highly structured and written in plain language (high transparency scoring), yet the underlying substance scores heavily against user protection—particularly on indefinite data retention and aggressive cross-border transfers.
+
+* **Top Strengths:**
+  1. **Transparency & Notice** (`7.0/10`) — Excellent usage of plain-language headings, bullet points, and interactive cookie control guidance.
+  2. **Lawful Basis & Purpose** (`5.67/10`) — Categorized lists of purposes.
+  3. **Secondary Use & Limits** (`5.67/10`) — Stated boundaries on ad rendering vs. diagnostic telemetry.
+* **Top Risks:**
+  1. **Retention & Deletion** (`0.67/10`) — Extremely low scoring due to massive loops of indefinite storage policies.
+  2. **Security & Breach** (`1.67/10`) — Highly generic security assertions with no robust technical benchmarks.
+  3. **Cross-Border Transfers** (`2.0/10`) — Aggressive globally dispersed storage notices with standard contractual clauses (SCC) obscured in deeply nested external links.
+
+---
+
+## 🚀 Execution Commands
+
+To rerun these analyses, configure your `.env` with a `GEMINI_API_KEY` and run:
+
+```bash
+# Analyze GitHub
+uv run python src/main.py --url https://github.com --model gemini-2.5-flash --report summary --max-chunks 2
+
+# Analyze TikTok
+uv run python src/main.py --url https://tiktok.com --model gemini-2.5-flash --report summary --max-chunks 3
+
+# Analyze Facebook
+uv run python src/main.py --url https://facebook.com --model gemini-2.5-flash --report summary --max-chunks 3
+```
diff --git a/docs/workflow.svg b/docs/workflow.svg
diff --git a/pyproject.toml b/pyproject.toml
@@ -19,6 +19,8 @@ dependencies = [
     "selenium>=4.35.0",
     "chromedriver-autoinstaller>=0.6.4",
     "requests>=2.32.5",
+    "click>=8.1.0",
+    "google-genai>=1.0.0",
 ]
 
 [dependency-groups]

diff --git a/src/analyzer/scoring.py b/src/analyzer/scoring.py
@@ -19,17 +19,39 @@
 
 
 def _avg(nums: list[float]) -> float:
+    """Calculate the average of a list of floats.
+
+    Args:
+        nums: A list of float numbers.
+
+    Returns:
+        The average value as a float, or 0.0 if the list is empty.
+    """
     return sum(nums) / len(nums) if nums else 0.0
 
 
+def _get_score_descending(kv: tuple[str, float]) -> float:
+    """Helper key function to sort categories by score in descending order."""
+    return -kv[1]
+
+
+def _get_score_ascending(kv: tuple[str, float]) -> float:
+    """Helper key function to sort categories by score in ascending order."""
+    return kv[1]
+
+
 def aggregate_chunk_results(chunk_json_list: list[dict[str, Any]]) -> dict[str, Any]:
-    """
-    Aggregates per-chunk JSON results into a weighted overall report.
-    Expects each item to contain:
-      - "scores": {category: int 0..10}
-      - "rationales": {category: str}
-      - optional "red_flags": [str]
-      - optional "notes": [str]
+    """Aggregates per-chunk JSON results into a weighted overall report.
+
+    Args:
+        chunk_json_list: A list of chunk evaluation results from the LLM. Each item must contain
+            "scores" (dict mapping category keys to int 0..10), "rationales" (dict mapping
+            category keys to string explanations), optional "red_flags" (list of strings),
+            and optional "notes" (list of strings).
+
+    Returns:
+        An aggregated report dictionary with overall score, confidence, category scores,
+        top strengths, top risks, red flags, and recommendations.
     """
     per_cat: dict[str, list[float]] = {k: [] for k in _REQUIRED_KEYS}
     rationales: dict[str, list[str]] = {k: [] for k in _REQUIRED_KEYS}
@@ -73,11 +95,11 @@ def aggregate_chunk_results(chunk_json_list: list[dict[str, Any]]) -> dict[str,
 
     strengths: list[tuple[str, float]] = sorted(
         ((k, v["score"]) for k, v in category_scores.items()),
-        key=lambda kv: -kv[1],
+        key=_get_score_descending,
     )[:3]
     risks: list[tuple[str, float]] = sorted(
         ((k, v["score"]) for k, v in category_scores.items()),
-        key=lambda kv: kv[1],
+        key=_get_score_ascending,
     )[:3]
 
     return {
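
Note on `aggregate_chunk_results`: the overall flow (collect per-category scores across chunks, average them, then take the three highest as strengths and the three lowest as risks) can be illustrated with a simplified sketch. It rests on stated assumptions: the docstring describes a weighted report, but the weights are not visible in this hunk, so the sketch uses an unweighted mean scaled from 0–10 to 0–100. The function name `aggregate_sketch` and the output keys are hypothetical.

```python
from typing import Any


def aggregate_sketch(chunk_results: list[dict[str, Any]]) -> dict[str, Any]:
    """Average per-category scores across chunks and rank the categories."""
    per_cat: dict[str, list[float]] = {}
    for chunk in chunk_results:
        for category, score in chunk.get("scores", {}).items():
            per_cat.setdefault(category, []).append(float(score))

    # Mean score per category on the 0..10 scale.
    category_scores = {
        cat: round(sum(vals) / len(vals), 2) for cat, vals in per_cat.items()
    }

    # Assumption: unweighted mean of the categories, scaled to 0..100.
    if category_scores:
        overall = round(10 * sum(category_scores.values()) / len(category_scores), 2)
    else:
        overall = 0.0

    items = list(category_scores.items())
    # Named key functions or reverse=True both work; sorted() is stable either way.
    strengths = sorted(items, key=lambda kv: kv[1], reverse=True)[:3]
    risks = sorted(items, key=lambda kv: kv[1])[:3]

    return {
        "overall_score": overall,
        "category_scores": category_scores,
        "top_strengths": strengths,
        "top_risks": risks,
    }
```

With fewer than four categories the strengths and risks lists overlap, which is expected for a top-three selection from both ends.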