The Study Agent is an automated system that analyzes exam question images (from any certification) and generates detailed explanations using Google's Gemini AI model. The system processes images in batches, extracts the questions, and produces Markdown-formatted study materials.
- Automated batch processing of exam question images
- Intelligent analysis using the Gemini 2.5 Flash model
- Markdown-formatted output with explanations
- Cloud-native architecture on Google Cloud Platform
- Duplicate prevention (already-processed images are skipped)
```
+---------------------------------------------------------------+
|                    Google Cloud Platform                      |
|                                                               |
|   +---------------+                 +---------------+         |
|   |  GCS INPUT    |                 |  GCS OUTPUT   |         |
|   |  BUCKET       |                 |  BUCKET       |         |
|   |  (Images)     |                 |  (Markdown)   |         |
|   +-------+-------+                 +-------^-------+         |
|           |                                 |                 |
|           |      +------------------+       |                 |
|           +----->|  Cloud Run Job   |-------+                 |
|                  |  (Docker Image)  |                         |
|                  |                  |                         |
|                  |  Process Images  |                         |
|                  |  + Gemini API    |                         |
|                  +--------+---------+                         |
|                           |                                   |
|                           v                                   |
|                   Vertex AI / Gemini                          |
|                    (LLM Analysis)                             |
+---------------------------------------------------------------+
```
- Input Phase: Exam question images are uploaded to the `INPUT_BUCKET`
- Processing Phase: The Cloud Run Job starts the containerized application
- Analysis Phase: For each image:
  - Load the Gemini model with the system prompt (tutor instructions)
  - Send the image + prompt to the Gemini 2.5 Flash model
  - The model returns a Markdown-formatted analysis
- Output Phase: Results are saved as `result_*.md` files to the `OUTPUT_BUCKET`
- Google Cloud Project with:
  - Cloud Run API enabled
  - Artifact Registry API enabled
  - Vertex AI API enabled
  - Two GCS buckets (input and output)
  - Service Account with appropriate permissions
- Local tools:
  - `gcloud` CLI installed and configured
  - Docker installed (for building images locally)
  - Python 3.11+ (for local testing)
Create a `.env` file in the project root:

```bash
# GCP Configuration
PROJECT_ID="your-gcp-project-id"
REGION="us-central1"
REPOSITORY_NAME="study-agent"
IMAGE_NAME="process-images"
JOB_NAME="study-agent-job"
SERVICE_ACCOUNT_EMAIL="your-sa@your-project.iam.gserviceaccount.com"

# GCS Buckets
INPUT_BUCKET="input-exam-images"
OUTPUT_BUCKET="output-study-materials"

# Model Variables
MODEL_NAME="gemini-2.5-flash"
```

Load the environment variables:
```bash
export REPO_FOLDER=${PWD}
set -o allexport && source .env && set +o allexport
```

The Dockerfile creates a lightweight containerized application using Python 3.11-slim:
```dockerfile
FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
COPY main.py .

RUN pip install --no-cache-dir -r requirements.txt

ENTRYPOINT ["python", "main.py"]
```

Build and push the image to Artifact Registry:
```bash
gcloud builds submit --tag ${REGION}-docker.pkg.dev/${PROJECT_ID}/${REPOSITORY_NAME}/${IMAGE_NAME}
```

First-time deployment (create a new job):
```bash
gcloud run jobs create ${JOB_NAME} \
  --image ${REGION}-docker.pkg.dev/${PROJECT_ID}/${REPOSITORY_NAME}/${IMAGE_NAME} \
  --region ${REGION} \
  --service-account ${SERVICE_ACCOUNT_EMAIL} \
  --set-env-vars GCP_PROJECT=${PROJECT_ID},GCP_REGION=${REGION},INPUT_BUCKET_NAME=${INPUT_BUCKET},OUTPUT_BUCKET_NAME=${OUTPUT_BUCKET} \
  --max-retries 0
```

Update environment variables (if the job already exists):
```bash
gcloud run jobs update ${JOB_NAME} \
  --region ${REGION} \
  --service-account ${SERVICE_ACCOUNT_EMAIL} \
  --update-env-vars INPUT_BUCKET_NAME=${INPUT_BUCKET},OUTPUT_BUCKET_NAME=${OUTPUT_BUCKET}
```

Execute the job:
```bash
gcloud run jobs execute ${JOB_NAME} --region ${REGION} --task-timeout 1200s
```

Purpose: Main processing script that orchestrates the image analysis.
Key Functions:

- System Prompt Loading:
  - Loads the system instruction from `system_prompt.txt`
  - Falls back to `system_prompt.txt.example` if the original doesn't exist
  - Returns the tutor agent instructions as a string
- Client Initialization: Creates the GCS and Gemini API clients
- Configuration: Sets up the model parameters:
  - Model: `gemini-2.5-flash`
  - Temperature: 0.2 (low randomness for consistent answers)
  - Max tokens: 2048
  - Safety settings: set to `BLOCK_NONE` for educational content
- Image Processing Loop:
  - Lists all objects in `INPUT_BUCKET`
  - Filters for image files (`.png`, `.jpg`, `.jpeg`, `.webp`)
  - Checks whether the output already exists (prevents reprocessing)
  - Sends the image + system prompt to the Gemini API
  - Saves the Markdown results to `OUTPUT_BUCKET`
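The prompt-loading fallback described above can be sketched as follows (the function name is illustrative; the actual helper in `main.py` may differ):

```python
from pathlib import Path


def load_system_prompt(primary="system_prompt.txt",
                       fallback="system_prompt.txt.example"):
    """Return the tutor instructions, preferring the real prompt file
    and falling back to the bundled example file."""
    for candidate in (primary, fallback):
        path = Path(candidate)
        if path.exists():
            return path.read_text(encoding="utf-8")
    raise FileNotFoundError(f"Neither {primary} nor {fallback} exists")
```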
Contains the tutor agent instructions, defining:
- Response format (Markdown)
- Analysis structure (Transcription, Correct Answer, Detailed Explanation)
- References to documentation
- Output labeling (REFERENCIA_PDF tag)
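As an illustration only (this is not the project's actual prompt), a `system_prompt.txt` following that structure might look like:

```
You are an expert exam tutor. For each question image, respond in Markdown:

## 1. Transcription
Transcribe the question and all answer options verbatim.

## 2. Correct Answer
State the correct option letter and text in bold.

## 3. Detailed Explanation (Tutor)
Explain why the answer is correct and the others are not, with references
to the relevant documentation.

End every response with: REFERENCIA_PDF: [relevant topics]
```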
Dependencies:

- `google-cloud-storage`: For GCS bucket operations
- `google-genai`: Gemini API client (includes Vertex AI support)

Install with:

```bash
pip install -r requirements.txt
```

| Variable | Description | Example |
|---|---|---|
| `GCP_PROJECT` | Google Cloud Project ID | `my-project-123` |
| `GCP_REGION` | GCP region for deployment | `us-central1` |
| `INPUT_BUCKET_NAME` | GCS bucket for input images | `exam-questions` |
| `OUTPUT_BUCKET_NAME` | GCS bucket for Markdown output | `study-materials` |
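Inside the container these settings come from the environment; a minimal fail-fast configuration loader (the helper name is hypothetical, the variable names are those in the table above) could look like:

```python
import os

REQUIRED_VARS = ("GCP_PROJECT", "GCP_REGION",
                 "INPUT_BUCKET_NAME", "OUTPUT_BUCKET_NAME")


def read_config():
    """Read the required settings from the environment,
    raising immediately if any of them are missing or empty."""
    missing = [v for v in REQUIRED_VARS if not os.environ.get(v)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {v: os.environ[v] for v in REQUIRED_VARS}
```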
```
1. List blobs in INPUT_BUCKET
   │
2. For each blob:
   ├── Is it an image file? (png, jpg, jpeg, webp)
   │   └── If no: skip and continue
   │
   ├── Does the output file exist?
   │   └── If yes: skip and continue
   │
   ├── Load image from GCS URI
   │   └── gs://input-bucket/image-name.jpg
   │
   ├── Create Gemini request with:
   │   ├── System prompt (tutor instructions)
   │   ├── Image content
   │   └── Analysis request text
   │
   ├── Call Gemini model
   │   └── Returns Markdown analysis
   │
   └── Upload result to OUTPUT_BUCKET
       └── result_sanitized-name.md
```
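The filtering and duplicate-prevention steps at the top of that flow can be sketched as a pure function (a simplified sketch, not the literal `main.py` code):

```python
IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".webp")


def plan_work(input_blobs, existing_outputs):
    """Return (input_name, result_name) pairs that still need processing.

    input_blobs: blob names listed from INPUT_BUCKET
    existing_outputs: set of result file names already in OUTPUT_BUCKET
    """
    work = []
    for name in input_blobs:
        if not name.lower().endswith(IMAGE_EXTS):
            continue  # not an image file: skip
        stem = name.rsplit(".", 1)[0].replace(".", "_").replace("/", "_")
        result_name = f"result_{stem}.md"
        if result_name in existing_outputs:
            continue  # output already exists: skip (duplicate prevention)
        work.append((name, result_name))
    return work
```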
Input: `question_001.jpg`
Output: `result_question_001.md`

The system sanitizes filenames by:

- Converting dots to underscores
- Converting slashes to underscores
- Removing extensions
- Prefixing with `result_`
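A minimal sketch of that sanitization (the helper name is illustrative; here the extension is dropped first so the remaining dots and slashes can be replaced):

```python
def sanitize_output_name(blob_name):
    """Map an input image name to its result file name."""
    stem = blob_name.rsplit(".", 1)[0]               # remove the extension
    stem = stem.replace(".", "_").replace("/", "_")  # dots/slashes -> underscores
    return f"result_{stem}.md"                       # add prefix and .md suffix

# e.g. "question_001.jpg" -> "result_question_001.md"
```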
Each generated markdown file follows this structure:

```markdown
# [Question Number/Title]

## 1. Transcription
[Original question text and options]

## 2. Correct Answer
**[Correct option letter and text]**

## 3. Detailed Explanation (Tutor)
[Comprehensive explanation with references to different docs]

REFERENCIA_PDF: [Topics like x1, x2, x3]
```

Check the job status:

```bash
gcloud run jobs describe ${JOB_NAME} --region ${REGION}
```

Read recent logs:

```bash
gcloud run jobs logs read ${JOB_NAME} --region ${REGION} --limit 50
```

| Issue | Solution |
|---|---|
| INPUT_BUCKET not found | Verify bucket name and service account permissions |
| No images processed | Check image format and bucket contents |
| Output files not created | Verify OUTPUT_BUCKET exists and is writable |
| Gemini API errors | Ensure Vertex AI API is enabled and quota available |
The `google-genai` library automatically includes Vertex AI dependencies. It is designed specifically for interacting with the Gemini API within the Google Cloud environment.
Cloud Run Jobs are ideal for batch processing tasks. Unlike Cloud Run services, jobs automatically terminate after completion, reducing unnecessary costs.
The 1200-second timeout (20 minutes) should be sufficient for processing 50-100 images depending on model response times. Adjust as needed based on volume.
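As a rough budgeting aid, the timeout can be estimated from the batch size (the ~12 s per image figure is an assumption for illustration, not a measured value):

```python
def estimate_task_timeout(n_images, avg_seconds_per_image=12, safety_factor=1.0):
    """Rough --task-timeout estimate, in seconds, for a batch of images."""
    return int(n_images * avg_seconds_per_image * safety_factor)

# 100 images at ~12 s each fits the default 1200 s timeout
```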
This project is licensed under the MIT License.
For commercial inquiries or specific licensing questions, feel free to contact me.
Jorge Aguirre
Last Updated: February 5, 2026