# SketchMind: A Multi-Agent Cognitive Framework for Assessing Student-Drawn Scientific Sketches [NeurIPS 2025 🔥]
- Sep-18-25: SketchMind accepted at NeurIPS 2025! 🔥🥳
- Jul-18-25: Released complete multi-agent framework with both GPT-4 and LLaMA-4 implementations 🔥
- May-15-25: Published comprehensive evaluation on 3,500+ scientific sketches across 6 NGSS-aligned assessment items 🔥
SketchMind introduces a cognitively grounded, multi-agent framework for assessing student-drawn scientific sketches using semantic structures known as Sketch Reasoning Graphs (SRGs). Each SRG is annotated with Bloom's Taxonomy levels and constructed via multimodal agents that collaboratively parse rubrics, analyze student sketches, evaluate conceptual understanding, and generate pedagogical feedback. A minimal, hypothetical illustration of an SRG follows the highlights below.
- Novel Multi-Agent Framework: First cognitively-grounded multi-agent system for scientific sketch assessment
- Sketch Reasoning Graphs (SRGs): New semantic representation combining visual elements with Bloom's taxonomy levels
- Comprehensive Evaluation: Extensive validation on 3,500+ student sketches across 6 NGSS-aligned assessment items
- Dual Model Implementation: Complete pipelines for both proprietary (GPT-4) and open-source (LLaMA-4) models
- Interactive Visualization: Web-based tools for exploring and understanding SRG structures
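To make the SRG representation concrete, here is a minimal sketch of how such a graph might be encoded. The node names, relations, Bloom levels, and the use of networkx are illustrative assumptions, not the framework's actual schema.

```python
# Hypothetical illustration of a Sketch Reasoning Graph (SRG).
# Node names, relations, and Bloom levels are invented for
# demonstration; see the paper for the actual SRG design.
import networkx as nx

srg = nx.DiGraph()

# Nodes: semantic elements found in a sketch, each annotated with an
# assumed Bloom's Taxonomy level.
srg.add_node("sun", bloom="Remember")
srg.add_node("plant", bloom="Remember")
srg.add_node("energy_transfer", bloom="Understand")
srg.add_node("photosynthesis", bloom="Apply")

# Edges: relationships the student expressed visually.
srg.add_edge("sun", "energy_transfer", relation="provides")
srg.add_edge("energy_transfer", "plant", relation="received_by")
srg.add_edge("plant", "photosynthesis", relation="performs")

for node, data in srg.nodes(data=True):
    print(f"{node}: Bloom level = {data['bloom']}")
```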
### Agent 1: Rubric Parsing
- Inputs: Rubric, question text, gold-standard sketches
- Outputs: Gold-Standard Reference SRG with Bloom's taxonomy levels and reverse mapping
- Function: Establishes cognitive benchmarks from expert-designed rubrics and 3 gold-standard reference scientific sketches for each assessment task
### Agent 2: Sketch Analysis
- Inputs: Student sketch image, reference SRG
- Outputs: Student SRG constructed from visible sketch content
- Function: Extracts semantic elements and relationships from hand-drawn sketches
### Agent 3: Evaluation
- Inputs: Reference SRG, student SRG
- Outputs: Cognitive alignment score, proficiency classification, concept gaps
- Function: Compares graphs using edit distance and Bloom-level analysis
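As a rough illustration of the edit-distance comparison described above (not the framework's actual scoring code), the sketch below uses `networkx.graph_edit_distance`; the Bloom-aware node matching and the normalization are assumptions.

```python
# Illustrative SRG comparison via graph edit distance. The node_match
# rule and normalization below are assumptions, not the paper's formula.
import networkx as nx

def alignment_score(reference: nx.DiGraph, student: nx.DiGraph) -> float:
    """Return a similarity score in [0, 1]; higher means closer alignment."""
    # Count a node pair as matching only if the assumed 'bloom'
    # attributes agree.
    distance = nx.graph_edit_distance(
        reference,
        student,
        node_match=lambda a, b: a.get("bloom") == b.get("bloom"),
    )
    # Normalize by the worst case: deleting everything in one graph
    # and inserting everything in the other.
    worst = (reference.number_of_nodes() + reference.number_of_edges()
             + student.number_of_nodes() + student.number_of_edges())
    return 1.0 - distance / worst if worst else 1.0
```

Exact graph edit distance is exponential in graph size, so a production pipeline would likely fall back to an approximation such as `nx.optimize_graph_edit_distance` for larger SRGs.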
### Agent 4: Feedback Generation
- Inputs: Evaluation results, original sketch, reference standards
- Outputs: Next-step sketch revision plan with Bloom-guided visual hints
- Function: Generates pedagogical feedback with visual overlays
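Chained together, the four agents form the flow sketched below. Every function here is a hypothetical, stubbed-out placeholder for the agents described above; the real implementations live in the GPT_* / Llama4_* scripts.

```python
# High-level sketch of the four-agent flow. All functions below are
# hypothetical, stubbed-out placeholders for the agents described above.
from typing import Dict, List

def build_reference_srg(rubric: str, question: str, gold_images: List[str]) -> Dict:
    return {"nodes": [], "edges": []}  # Agent 1 stub: reference SRG

def parse_student_sketch(image_path: str, reference: Dict) -> Dict:
    return {"nodes": [], "edges": []}  # Agent 2 stub: student SRG

def evaluate_srgs(reference: Dict, student: Dict) -> Dict:
    # Agent 3 stub: edit-distance + Bloom-level comparison
    return {"score": 0.0, "proficiency": "Beginning", "concept_gaps": []}

def generate_feedback(evaluation: Dict, image_path: str, reference: Dict) -> str:
    # Agent 4 stub: Bloom-guided revision plan with visual hints
    return "Hypothetical hint: add the missing energy-transfer arrow."

def assess_sketch(rubric: str, question: str,
                  gold_images: List[str], student_image: str) -> Dict:
    """Chain the agents: reference SRG -> student SRG -> score -> feedback."""
    reference_srg = build_reference_srg(rubric, question, gold_images)
    student_srg = parse_student_sketch(student_image, reference_srg)
    evaluation = evaluate_srgs(reference_srg, student_srg)
    feedback = generate_feedback(evaluation, student_image, reference_srg)
    return {"evaluation": evaluation, "feedback": feedback}
```

The repository is organized as follows: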
SketchMind/
├── config/                          # Configuration files
│   └── task_config.yaml             # Task-specific configuration
├── data/                            # Task-specific data
│   ├── README.md
│   └── {task_id}/                   # Per-task directories
│       ├── student_images/          # Student submissions
│       ├── golden_standard_images/  # 3 reference sketches
│       ├── question.txt             # Task question (optional)
│       └── rubric.txt               # Evaluation rubric (optional)
├── outputs/
│   └── {task_id}/
│       ├── logs/
│       ├── cache/                   # SRG cache files
│       └── results/                 # Visual hints and textual feedback
├── scripts/                         # Core implementation
│   ├── config_loader.py             # Configuration management
│   ├── GPT_SRG_Agents.py            # GPT-based agents
│   ├── GPT_SRG_MAS.py               # GPT pipeline
│   ├── Llama4_SRG_Agents.py         # Llama4-based agents
│   ├── Llama4_SRG_MAS.py            # Llama4 pipeline
│   └── requirements.txt
├── .env.example                     # API key template
├── .gitignore
├── run.py                           # Unified entry point
└── README.md                        # This file
- Python 3.8+
- pip package manager
- OpenAI API key (for GPT models)
- OpenRouter API key (for LLaMA-4 models; free tier available)
We recommend setting up a conda environment for the project:
# Create and activate environment
conda create --name sketchmind python=3.9
conda activate sketchmind
# Clone repository
git clone https://github.com/ehsanlatif/SketchMind.git
cd SketchMind
# Install dependencies
pip install -r requirements.txt
## Configure API Keys in .env
Copy .env.example to .env and add your keys:
# For GPT models
OPENAI_API_KEY=your_openai_api_key_here
# For Llama4 models via OpenRouter
OPENROUTER_API_KEY=your_openrouter_api_key_here
For a complete list of dependencies, see requirements.txt.
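To sanity-check that the keys are visible to Python, a snippet like the following can help; it assumes the python-dotenv package, which is a common choice but not necessarily what SketchMind's own scripts use.

```python
# Quick check that the .env keys are visible to Python. Assumes the
# python-dotenv package; whether SketchMind's own scripts load keys
# this way is an assumption.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

for key in ("OPENAI_API_KEY", "OPENROUTER_API_KEY"):
    print(f"{key}: {'set' if os.getenv(key) else 'MISSING'}")
```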
Set the model names and data paths in the task's .yaml configuration file before running an evaluation.
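For illustration only, a task config might be loaded like this; the keys shown are hypothetical, since the real schema is defined by config/task_config.yaml and scripts/config_loader.py.

```python
# Illustrative config load with PyYAML. The keys accessed below are
# hypothetical; consult config/task_config.yaml for the real schema.
import yaml

with open("config/example_task.yaml") as f:
    cfg = yaml.safe_load(f)

print(cfg.get("model_name"))  # assumed key, e.g. the vision model to call
print(cfg.get("data_dir"))    # assumed key, e.g. the task's data folder
```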
# Run the GPT pipeline
python run.py \
    --config config/example_task.yaml \
    --model-type gpt \
    --student-image data/Task{id}/student_images/student1_sketch.jpg

# Run the LLaMA-4 pipeline
python run.py \
    --config config/example_task.yaml \
    --model-type llama4 \
    --student-image data/Task{id}/student_images/student1_sketch.jpg

SketchMind is evaluated on a comprehensive dataset of 3,500+ student-drawn scientific sketches across 6 NGSS-aligned assessment items.
Each assessment item includes:
- ✅ Detailed textual rubric
- ✅ 3 gold-standard scientific sketches (Beginning, Developing, Proficient)
- ✅ Student scientific sketch images
- ✅ Expert-assigned proficiency labels
Note: The full dataset release is pending approval.
- OpenAI for GPT API access and multimodal capabilities
- Meta AI for open-sourcing multimodal models like LLaMA-4
- OpenRouter for making LLaMA models available via GPT-style API calls for easy reproducibility
Thanks to Dr. Xiaoming Zhai for his unwavering support throughout the project. Special thanks to our educators at the AI4STEM Education Center at the University of Georgia, who provided domain expertise for rubric development.
@misc{latif2025sketchmindmultiagentcognitiveframework,
title={SketchMind: A Multi-Agent Cognitive Framework for Assessing Student-Drawn Scientific Sketches},
author={Ehsan Latif and Zirak Khan and Xiaoming Zhai},
year={2025},
eprint={2507.22904},
archivePrefix={arXiv},
primaryClass={cs.HC},
url={https://arxiv.org/abs/2507.22904},
}

For questions, collaborations, or support:
- 📧 Email: Zirak.khan@uga.edu or Ehsan.Latif@uga.edu
- 🐛 Issues: GitHub Issues
This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Looking forward to your feedback, contributions, and stars! 🌟
