diff --git a/README.md b/README.md index cfa4c73..9613025 100644 --- a/README.md +++ b/README.md @@ -105,6 +105,9 @@ cd AIGoat The script handles everything: checks Ollama, downloads the Mistral model if missing, initializes the database, and starts both backend and frontend. +### Google Colab +Follow the instructions provided at https://github.com/AISecurityConsortium/AIGoat/blob/main/colab_notebooks/AIGoat_Colab.md + Once you see "AI Goat is running!", open your browser: | What | URL | diff --git a/colab_notebooks/AIGoat_Colab.ipynb b/colab_notebooks/AIGoat_Colab.ipynb new file mode 100644 index 0000000..b5a6b1f --- /dev/null +++ b/colab_notebooks/AIGoat_Colab.ipynb @@ -0,0 +1,533 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "ctFY_KDW9tQc" + }, + "source": [ + "# AIGoat on Google Colab\n", + "\n", + "Runs [AIGoat](https://github.com/AISecurityConsortium/AIGoat) end-to-end inside a Colab runtime. Installs Ollama, clones the repo, installs Python + npm deps, seeds the SQLite DB, starts the FastAPI backend and React frontend in the background, and exposes the app through Colab's built-in port proxy.\n", + "\n", + "## Before you run\n", + "\n", + "- **Runtime**: the default CPU runtime works. A GPU runtime (T4) makes Mistral dramatically faster — switch via `Runtime → Change runtime type` before running.\n", + "- **RAM**: you need at least ~12 GB free. Mistral alone is ~4.5 GB; embeddings + backend + Node dev server take another few GB.\n", + "- **First run is slow.** Expect ~3–5 min on the `pip install` cell and another several minutes on `ollama pull mistral` (it's a 4.5 GB download).\n", + "- **CPU inference is slow.** If you're on CPU-only, chat responses may take 30–60 seconds each.\n", + "- **Single-port proxy.** Colab can only realistically expose one port, so the frontend is patched at runtime to use relative URLs and the React dev server proxies all API calls to the backend. 
You will only open the `3000` URL.\n", + "\n", + "Run the cells top-to-bottom. The last cell prints the clickable link." + ], + "id": "ctFY_KDW9tQc" + }, + { + "cell_type": "markdown", + "metadata": { + "id": "D4rwbyRB9tQe" + }, + "source": [ + "## 1. Install Ollama and Serve Ollama" + ], + "id": "D4rwbyRB9tQe" + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "GmrN83Tr9tQe" + }, + "outputs": [], + "source": [ + "!sudo apt-get update && sudo apt-get install -y zstd pciutils\n", + "!curl -fsSL https://ollama.com/install.sh | sh\n", + "!ollama --version" + ], + "id": "GmrN83Tr9tQe" + }, + { + "cell_type": "code", + "source": [ + "import subprocess\n", + "import time\n", + "\n", + "# Start ollama serve in the background\n", + "subprocess.Popen([\"ollama\", \"serve\"])\n", + "time.sleep(5) # Wait for the server to start" + ], + "metadata": { + "id": "HhWxJRYFJdSh" + }, + "id": "HhWxJRYFJdSh", + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "RDswJO3y9tQf" + }, + "source": [ + "## 2. Clone AIGoat and set the working directory\n", + "\n", + "`os.chdir` is used instead of `!cd` so subsequent Python cells inherit the directory." + ], + "id": "RDswJO3y9tQf" + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "FBfnFzjq9tQg" + }, + "outputs": [], + "source": [ + "import os, subprocess\n", + "\n", + "REPO_DIR = '/content/AIGoat'\n", + "REPO_URL = 'https://github.com/AISecurityConsortium/AIGoat.git'\n", + "\n", + "if not os.path.exists(REPO_DIR):\n", + " subprocess.run(['git', 'clone', '--depth', '1', REPO_URL, REPO_DIR], check=True)\n", + "\n", + "os.chdir(REPO_DIR)\n", + "os.makedirs('logs', exist_ok=True)\n", + "\n", + "# Make the repo importable as a package (so 'from app.core.database import ...' 
works later).\n", + "import sys\n", + "if REPO_DIR not in sys.path:\n", + " sys.path.insert(0, REPO_DIR)\n", + "\n", + "print('cwd:', os.getcwd())" + ], + "id": "FBfnFzjq9tQg" + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9rZtW2nt9tQg" + }, + "source": [ + "## 3. Install Python dependencies\n", + "\n", + "No virtualenv — Colab already has Python 3.11. This step is slow on first run because `sentence-transformers`, `chromadb`, and `nemoguardrails` pull in a lot of transitive packages." + ], + "id": "9rZtW2nt9tQg" + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "t4P-MEN59tQg" + }, + "outputs": [], + "source": [ + "!ollama list\n", + "!pip install -r requirements.txt" + ], + "id": "t4P-MEN59tQg" + }, + { + "cell_type": "markdown", + "metadata": { + "id": "RwqMeM7e9tQh" + }, + "source": [ + "## 4. Patch the frontend to use relative API URLs\n", + "\n", + "`frontend/src/config/api.js` hardcodes `http://localhost:8000` as the fallback when `REACT_APP_API_URL` isn't set. Your browser is talking to a Colab proxy URL, not to localhost, so we rewrite the fallback to an empty string. That makes axios use relative URLs, which are sent to the React dev server on port 3000, which then forwards `/api/…` to `http://localhost:8000` via the `\"proxy\"` field already set in `frontend/package.json`. Net effect: one exposed port, everything works." 
+ ], + "id": "RwqMeM7e9tQh" + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "43F6zyxK9tQh" + }, + "outputs": [], + "source": [ + "from pathlib import Path\n", + "\n", + "p = Path('/content/AIGoat/frontend/src/config/api.js')\n", + "src = p.read_text()\n", + "\n", + "# Target string to be replaced\n", + "target_str = \"process.env.REACT_APP_API_URL || 'http://localhost:8000'\"\n", + "# Desired replacement string\n", + "replacement_str = \"''\"\n", + "\n", + "# Check if the file is already in the desired state\n", + "# The original code's replacement results in `BASE_URL: ''`\n", + "if f\"BASE_URL: {replacement_str},\" in src:\n", + " print('api.js is already patched.')\n", + "elif target_str in src:\n", + " patched = src.replace(target_str, replacement_str)\n", + " p.write_text(patched)\n", + " print('Patched', p)\n", + "else:\n", + " # If the target string is not found and the file isn't already patched,\n", + " # it means the repo layout changed in an unexpected way.\n", + " raise RuntimeError('api.js default URL not found — repo layout may have changed unexpectedly.')" + ], + "id": "43F6zyxK9tQh" + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Ar4hHe2y9tQh" + }, + "source": [ + "## 5. Install frontend npm packages\n", + "\n", + "Using `%%bash` so the `cd` is scoped to this cell." + ], + "id": "Ar4hHe2y9tQh" + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "8TKDrMPr9tQh" + }, + "outputs": [], + "source": [ + "%%bash\n", + "cd /content/AIGoat/frontend\n", + "npm install --silent --no-audit --no-fund 2>&1 | tail -20" + ], + "id": "8TKDrMPr9tQh" + }, + { + "cell_type": "markdown", + "metadata": { + "id": "cMtwW5979tQh" + }, + "source": [ + "## 6. Start the Ollama daemon (background)\n", + "\n", + "`subprocess.Popen` keeps the daemon alive for the whole kernel session. A plain `!ollama serve &` would be killed when the cell finishes." 
+ ], + "id": "cMtwW5979tQh" + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "5lb9OE7S9tQi" + }, + "outputs": [], + "source": [ + "import subprocess, time\n", + "import urllib.request, urllib.error\n", + "\n", + "ollama_log = open('/content/AIGoat/logs/ollama.log', 'wb')\n", + "ollama_proc = subprocess.Popen(\n", + " ['ollama', 'serve'],\n", + " stdout=ollama_log,\n", + " stderr=subprocess.STDOUT,\n", + ")\n", + "\n", + "for _ in range(30):\n", + " try:\n", + " with urllib.request.urlopen('http://localhost:11434/api/tags', timeout=2) as r:\n", + " if r.status == 200:\n", + " print('Ollama is up on :11434')\n", + " break\n", + " except (urllib.error.URLError, ConnectionError, TimeoutError):\n", + " pass\n", + " time.sleep(1)\n", + "else:\n", + " raise RuntimeError('Ollama did not start — check logs/ollama.log')" + ], + "id": "5lb9OE7S9tQi" + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-W4KS-JT9tQi" + }, + "source": [ + "## 7. Pull the Mistral model\n", + "\n", + "~4.5 GB download on first run. Output streams progress bars." + ], + "id": "-W4KS-JT9tQi" + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "8G1u2rWI9tQi" + }, + "outputs": [], + "source": [ + "!ollama pull mistral\n", + "!ollama list" + ], + "id": "8G1u2rWI9tQi" + }, + { + "cell_type": "markdown", + "metadata": { + "id": "CmLoS4Fv9tQi" + }, + "source": [ + "## 8. Initialize the database" + ], + "id": "CmLoS4Fv9tQi" + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "fWLHF6q99tQi" + }, + "outputs": [], + "source": [ + "import nest_asyncio\n", + "nest_asyncio.apply()\n", + "\n", + "import asyncio\n", + "from app.core.database import init_db\n", + "\n", + "asyncio.run(init_db())\n", + "print('DB initialized')\n" + ], + "id": "fWLHF6q99tQi" + }, + { + "cell_type": "markdown", + "metadata": { + "id": "5OL6tWaQ9tQi" + }, + "source": [ + "## 9. 
Seed demo data (idempotent)\n", + "\n", + "Matches the counts the bash script checks for: 5 users, 20 products, 9 challenges." + ], + "id": "5OL6tWaQ9tQi" + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "188fdOin9tQi" + }, + "outputs": [], + "source": [ + "import asyncio, subprocess\n", + "from sqlalchemy import select, func\n", + "from app.core.database import async_session\n", + "from app.models import User, Product, Challenge\n", + "\n", + "async def _needs_seed():\n", + " async with async_session() as db:\n", + " u = (await db.execute(select(func.count(User.id)))).scalar() or 0\n", + " p = (await db.execute(select(func.count(Product.id)))).scalar() or 0\n", + " c = (await db.execute(select(func.count(Challenge.id)))).scalar() or 0\n", + " return u < 5 or p < 20 or c < 9\n", + "\n", + "if asyncio.run(_needs_seed()):\n", + " subprocess.run(['python', '-m', 'scripts.seed'], check=True, cwd='/content/AIGoat')\n", + " print('Demo data seeded')\n", + "else:\n", + " print('Database already has required data')" + ], + "id": "188fdOin9tQi" + }, + { + "cell_type": "markdown", + "metadata": { + "id": "WHh4ERGd9tQj" + }, + "source": [ + "## 10. 
Start the backend (FastAPI / uvicorn) in the background" + ], + "id": "WHh4ERGd9tQj" + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "4tUAeK-F9tQj" + }, + "outputs": [], + "source": [ + "import subprocess, time\n", + "import urllib.request, urllib.error\n", + "\n", + "backend_log = open('/content/AIGoat/logs/backend.log', 'wb')\n", + "backend_proc = subprocess.Popen(\n", + " [\n", + " 'python', '-m', 'uvicorn', 'app.main:app',\n", + " '--host', '0.0.0.0',\n", + " '--port', '8000',\n", + " '--log-level', 'info',\n", + " ],\n", + " stdout=backend_log,\n", + " stderr=subprocess.STDOUT,\n", + " cwd='/content/AIGoat',\n", + ")\n", + "\n", + "for _ in range(60):\n", + " try:\n", + " with urllib.request.urlopen('http://localhost:8000/docs', timeout=2) as r:\n", + " if r.status < 500:\n", + " print('Backend is up on :8000')\n", + " break\n", + " except (urllib.error.URLError, ConnectionError, TimeoutError):\n", + " pass\n", + " if backend_proc.poll() is not None:\n", + " raise RuntimeError('Backend exited early — run the logs cell at the bottom to see why')\n", + " time.sleep(1)\n", + "else:\n", + " print('Backend slow to start — check logs/backend.log if the next cells fail')" + ], + "id": "4tUAeK-F9tQj" + }, + { + "cell_type": "markdown", + "metadata": { + "id": "X5byjE-a9tQj" + }, + "source": [ + "## 11. Start the frontend (React dev server) in the background\n", + "\n", + "`CI=true` suppresses CRA's interactive \"port in use\" prompt and disables the auto-browser-open." 
+ ], + "id": "X5byjE-a9tQj" + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "nOkpUJ8t9tQj" + }, + "outputs": [], + "source": [ + "import os, subprocess, time\n", + "import urllib.request, urllib.error\n", + "\n", + "env = os.environ.copy()\n", + "env['PORT'] = '3000'\n", + "env['HOST'] = '0.0.0.0'\n", + "env['BROWSER'] = 'none'\n", + "env['CI'] = 'true'\n", + "env['DANGEROUSLY_DISABLE_HOST_CHECK'] = 'true'\n", + "env['WDS_SOCKET_PORT'] = '0'\n", + "\n", + "frontend_log = open('/content/AIGoat/logs/frontend.log', 'wb')\n", + "frontend_proc = subprocess.Popen(\n", + " ['npm', 'start'],\n", + " stdout=frontend_log,\n", + " stderr=subprocess.STDOUT,\n", + " cwd='/content/AIGoat/frontend',\n", + " env=env,\n", + ")\n", + "\n", + "# CRA cold compile is slow — wait up to ~3 minutes.\n", + "for _ in range(180):\n", + " try:\n", + " with urllib.request.urlopen('http://localhost:3000/', timeout=2) as r:\n", + " if r.status == 200:\n", + " print('Frontend is up on :3000')\n", + " break\n", + " except (urllib.error.URLError, ConnectionError, TimeoutError):\n", + " pass\n", + " if frontend_proc.poll() is not None:\n", + " raise RuntimeError('Frontend exited early — run the logs cell at the bottom to see why')\n", + " time.sleep(1)\n", + "else:\n", + " print('Frontend slow to compile — check logs/frontend.log if the preview URL 404s')" + ], + "id": "nOkpUJ8t9tQj" + }, + { + "cell_type": "markdown", + "metadata": { + "id": "gwlAKoQ89tQj" + }, + "source": [ + "## 12. Open AIGoat in a browser tab\n", + "\n", + "`serve_kernel_port_as_window` prints a clickable proxy URL that forwards to `localhost:3000`. All `/api/…` requests sent to that URL are routed to the backend by the React dev server's proxy config, so the same URL also serves `/docs` (FastAPI Swagger)." 
+ ], + "id": "gwlAKoQ89tQj" + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "2XCVLXPl9tQj" + }, + "outputs": [], + "source": [ + "from google.colab import output\n", + "\n", + "output.serve_kernel_port_as_window(3000)\n", + "\n", + "print()\n", + "print('==========================================')\n", + "print(' AI Goat is running! ')\n", + "print('==========================================')\n", + "print()\n", + "print(' Click the link above to open the app.')\n", + "print(' API docs: append /docs to that URL')\n", + "print()\n", + "print(' Demo login: alice / password123')\n", + "print(' Admin login: admin / admin123')\n", + "print()" + ], + "id": "2XCVLXPl9tQj" + }, + { + "cell_type": "markdown", + "metadata": { + "id": "lDq697JU9tQj" + }, + "source": [ + "## 13. (Optional) Tail logs or shut everything down\n", + "\n", + "Uncomment whatever you need. Process handles (`ollama_proc`, `backend_proc`, `frontend_proc`) are held in kernel memory from the earlier cells." 
+ ], + "id": "lDq697JU9tQj" + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "gHXAv8Uo9tQk" + }, + "outputs": [], + "source": [ + "# !tail -n 80 /content/AIGoat/logs/backend.log\n", + "# !tail -n 80 /content/AIGoat/logs/frontend.log\n", + "# !tail -n 80 /content/AIGoat/logs/ollama.log\n", + "\n", + "# frontend_proc.terminate()\n", + "# backend_proc.terminate()\n", + "# ollama_proc.terminate()" + ], + "id": "gHXAv8Uo9tQk" + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + }, + "language_info": { + "name": "python", + "version": "3.11" + }, + "colab": { + "provenance": [], + "gpuType": "T4" + }, + "accelerator": "GPU" + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/colab_notebooks/AIGoat_Colab.md b/colab_notebooks/AIGoat_Colab.md new file mode 100644 index 0000000..9fb5d58 --- /dev/null +++ b/colab_notebooks/AIGoat_Colab.md @@ -0,0 +1,18 @@ +## How to Run the AIGoat Colab Notebook + +Follow these steps to get the AIGoat Colab notebook up and running: + +1. **Download the Notebook:** + * Download the `AIGoat_Colab.ipynb` file directly from its GitHub location: `https://github.com/AISecurityConsortium/AIGoat/blob/main/colab_notebooks/AIGoat_Colab.ipynb` + +2. **Access Google Colab:** + * Log in to your Google account and navigate to the Google Colab platform: `https://colab.research.google.com/`. + +3. **Upload the Notebook to Colab:** + * Once in Colab, go to `File` in the top menu. + * Select `Upload Notebook`. + * Choose the `AIGoat_Colab.ipynb` file that you downloaded in the first step. + +4. **Execute All Cells:** + * After the notebook has finished loading in Colab, open `Runtime → Change runtime type` and select a GPU (T4); the default CPU runtime also works, but chat responses are much slower. + * Select `Runtime → Run all` to execute every cell in the notebook sequentially.