
SketchMind πŸ§ βœ’οΈ


SketchMind: A Multi-Agent Cognitive Framework for Assessing Student-Drawn Scientific Sketches [NeurIPS 2025 πŸ”₯]

University of Georgia




πŸ“’ Latest Updates

  • Sep-18-25: SketchMind accepted at NeurIPS 2025! πŸ”₯πŸ₯³
  • Jul-18-25: Released complete multi-agent framework with both GPT-4 and LLaMA-4 implementations πŸ”₯
  • May-15-25: Published comprehensive evaluation on 3,500+ scientific sketches across 6 NGSS-aligned assessment items πŸ”₯

SketchMind Overview πŸ’‘

SketchMind introduces a cognitively grounded, multi-agent framework for assessing student-drawn scientific sketches using semantic structures known as Sketch Reasoning Graphs (SRGs). Each SRG is annotated with Bloom's Taxonomy levels and constructed by multimodal agents that collaboratively parse rubrics, analyze student sketches, evaluate conceptual understanding, and generate pedagogical feedback.
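The paper defines the exact SRG schema; as a rough illustration only (the class and field names below are hypothetical, not the repository's API), an SRG can be thought of as a labeled graph whose nodes carry a Bloom's-taxonomy level:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of an SRG as a labeled graph; the real schema
# is defined in the paper and implemented in scripts/.
@dataclass
class SRGNode:
    label: str        # semantic element, e.g. "sun" or "energy arrow"
    bloom_level: int  # 1 = Remember ... 6 = Create

@dataclass
class SRG:
    nodes: dict = field(default_factory=dict)  # node_id -> SRGNode
    edges: set = field(default_factory=set)    # (src_id, relation, dst_id)

    def add_node(self, node_id, label, bloom_level):
        self.nodes[node_id] = SRGNode(label, bloom_level)

    def add_edge(self, src, relation, dst):
        self.edges.add((src, relation, dst))

srg = SRG()
srg.add_node("sun", "sun", 1)
srg.add_node("plant", "plant", 1)
srg.add_edge("sun", "provides_energy_to", "plant")
```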


Contributions πŸ†

  • Novel Multi-Agent Framework: First cognitively-grounded multi-agent system for scientific sketch assessment
  • Sketch Reasoning Graphs (SRGs): New semantic representation combining visual elements with Bloom's taxonomy levels
  • Comprehensive Evaluation: Extensive validation on 3,500+ student sketches across 6 NGSS-aligned assessment items
  • Dual Model Implementation: Complete pipelines for both proprietary (GPT-4) and open-source (LLaMA-4) models
  • Interactive Visualization: Web-based tools for exploring and understanding SRG structures

Multi-Agent Architecture βš™οΈ

Agent 1: Rubric Parser πŸ“„

  • Inputs: Rubric, question text, gold-standard sketches
  • Outputs: Gold-Standard Reference SRG with Bloom's taxonomy levels and reverse mapping
  • Function: Establishes cognitive benchmarks from the expert-designed rubric and 3 gold-standard reference scientific sketches for each assessment task

Agent 2: Sketch Parser πŸ‘

  • Inputs: Student sketch image, reference SRG
  • Outputs: Student SRG constructed from visible sketch content
  • Function: Extracts semantic elements and relationships from hand-drawn sketches

Agent 3: SRG Evaluator βš–οΈ

  • Inputs: Reference SRG, student SRG
  • Outputs: Cognitive alignment score, proficiency classification, concept gaps
  • Function: Compares graphs using edit distance and Bloom-level analysis
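As a simplified stand-in for this comparison step (the actual evaluator also weighs Bloom levels and produces concept gaps; the function below is illustrative, not the repository's code), an edit-distance-style alignment score over node and edge sets might look like:

```python
# Simplified illustration of an SRG alignment score: count the label
# insertions/deletions (symmetric difference) needed to turn the student
# graph into the reference graph, normalized to [0, 1].
def srg_alignment_score(ref_nodes, ref_edges, stu_nodes, stu_edges):
    """Return a score in [0, 1]; 1.0 means the student SRG matches the reference."""
    ref = set(ref_nodes) | set(ref_edges)
    stu = set(stu_nodes) | set(stu_edges)
    if not ref:
        return 1.0
    edits = len(ref ^ stu)  # elements present in only one graph
    return max(0.0, 1.0 - edits / (len(ref) + len(stu)))

score = srg_alignment_score(
    ref_nodes={"sun", "plant"},
    ref_edges={("sun", "provides_energy_to", "plant")},
    stu_nodes={"sun", "plant"},
    stu_edges=set(),  # student omitted the energy-flow relation
)
```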

Agent 4: Feedback Generator πŸ’¬

  • Inputs: Evaluation results, original sketch, reference standards
  • Outputs: Next-step sketch revision plan with Bloom-guided visual hints
  • Function: Generates pedagogical feedback with visual overlays
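The four agents above form a sequential pipeline. The wiring below is a hypothetical sketch of that data flow; the stub functions only mark inputs and outputs, while the real agents call multimodal LLMs (see scripts/GPT_SRG_MAS.py and scripts/Llama4_SRG_MAS.py, with run.py as the entry point).

```python
# Hypothetical end-to-end wiring of the four agents; stubs stand in for
# the LLM-backed implementations in scripts/.
def parse_rubric(rubric, question, gold_sketches):        # Agent 1
    return {"type": "reference_srg", "rubric": rubric}

def parse_sketch(student_image, reference_srg):           # Agent 2
    return {"type": "student_srg", "image": student_image}

def evaluate_srgs(reference_srg, student_srg):            # Agent 3
    return {"score": None, "gaps": []}

def generate_feedback(evaluation, student_image, reference_srg):  # Agent 4
    return {"revision_plan": [], "visual_hints": []}

def assess_sketch(rubric, question, gold_sketches, student_image):
    reference_srg = parse_rubric(rubric, question, gold_sketches)
    student_srg = parse_sketch(student_image, reference_srg)
    evaluation = evaluate_srgs(reference_srg, student_srg)
    feedback = generate_feedback(evaluation, student_image, reference_srg)
    return evaluation, feedback

evaluation, feedback = assess_sketch("rubric text", "question", [], "sketch.jpg")
```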

Directory Structure πŸ“

SketchMind/
β”œβ”€β”€ config/                          # Configuration files
β”‚   └── task_config.yaml             # Task-specific configuration
β”œβ”€β”€ data/                            # Task-specific data
β”‚   β”œβ”€β”€ README.md                    
β”‚   └── {task_id}/                   # Per-task directories
β”‚       β”œβ”€β”€ student_images/          # Student submissions
β”‚       β”œβ”€β”€ golden_standard_images/  # 3 reference sketches
β”‚       β”œβ”€β”€ question.txt             # Task question (optional)
β”‚       └── rubric.txt               # Evaluation rubric (optional)
β”œβ”€β”€ outputs/                         
β”‚   └── {task_id}/
β”‚       β”œβ”€β”€ logs/                    
β”‚       β”œβ”€β”€ cache/                   # SRG cache files
β”‚       └── results/                 # Visual hints and textual feedback
β”œβ”€β”€ scripts/                         # Core implementation
β”‚   β”œβ”€β”€ config_loader.py             # Configuration management
β”‚   β”œβ”€β”€ GPT_SRG_Agents.py            # GPT-based agents
β”‚   β”œβ”€β”€ GPT_SRG_MAS.py               # GPT pipeline
β”‚   β”œβ”€β”€ Llama4_SRG_Agents.py         # Llama4-based agents
β”‚   β”œβ”€β”€ Llama4_SRG_MAS.py            # Llama4 pipeline
β”‚   └── requirements.txt             
β”œβ”€β”€ .env.example                     # API key template
β”œβ”€β”€ .gitignore                       
β”œβ”€β”€ run.py                           # Unified entry point
└── README.md                        # This file

Installation

Prerequisites

  • Python 3.8+
  • pip package manager
  • OpenAI API key (for GPT models)
  • OpenRouter API key (for Llama4 models - free tier available)

Setup πŸ”§

We recommend setting up a conda environment for the project:

# Create and activate environment
conda create --name sketchmind python=3.9
conda activate sketchmind

# Clone repository
git clone https://github.com/ehsanlatif/SketchMind.git
cd SketchMind

# Install dependencies
pip install -r scripts/requirements.txt

# Configure API keys: copy the template and edit it
cp .env.example .env

# Then set in .env:
# For GPT models
OPENAI_API_KEY=your_openai_api_key_here

# For Llama4 models via OpenRouter
OPENROUTER_API_KEY=your_openrouter_api_key_here

For a complete list of dependencies, see scripts/requirements.txt.
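The pipelines read these keys from the environment. As a quick sanity check (a minimal helper of our own, not part of the repository), you can verify that the key for your chosen model type is visible to Python:

```python
import os

# Check that the API key required for the chosen model type is set.
# Assumes you exported the variable or loaded .env into the environment.
def check_api_key(model_type):
    var = "OPENAI_API_KEY" if model_type == "gpt" else "OPENROUTER_API_KEY"
    if not os.environ.get(var):
        raise RuntimeError(f"{var} is not set; add it to .env or export it")
    return var

os.environ.setdefault("OPENAI_API_KEY", "sk-example")  # placeholder for this demo
key_var = check_api_key("gpt")
```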


Quick Start Examples πŸš€

Before running, set the model names and data paths in the task's .yaml configuration file.
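A hypothetical shape for such a config is sketched below; the field names are illustrative only, so consult config/task_config.yaml for the real schema.

```yaml
# Illustrative only -- see config/task_config.yaml for the actual field names.
task_id: Task1
model:
  gpt_name: gpt-4o                 # hypothetical model identifier
  llama_name: meta-llama/llama-4   # hypothetical OpenRouter identifier
paths:
  student_images: data/Task1/student_images/
  golden_standard_images: data/Task1/golden_standard_images/
  rubric: data/Task1/rubric.txt
  question: data/Task1/question.txt
```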

GPT Pipeline

python run.py \
    --config config/example_task.yaml \
    --model-type gpt \
    --student-image data/Task{id}/student_images/student1_sketch.jpg

LLaMA-4 Pipeline

python run.py \
    --config config/example_task.yaml \
    --model-type llama4 \
    --student-image data/Task{id}/student_images/student1_sketch.jpg

Dataset πŸ“‚

SketchMind is evaluated on a comprehensive dataset of 3,500+ student-drawn scientific sketches across 6 NGSS-aligned assessment items:

Annotation Schema

Each assessment item includes:

  • βœ… Detailed textual rubric
  • βœ… 3 gold-standard scientific sketches (Beginning, Developing, Proficient)
  • βœ… Student scientific sketch images
  • βœ… Expert-assigned proficiency labels

Note: The full dataset release is pending approval.


Acknowledgements πŸ™

  • OpenAI for GPT API access and multimodal capabilities
  • Meta AI for open-sourcing multimodal models like LLaMA-4
  • OpenRouter for making LLaMA models available via GPT-style API calls for easy reproducibility

Thanks to Dr. Xiaoming Zhai for his unwavering support throughout the project. Special thanks to our educators at the AI4STEM Education Center at the University of Georgia who provided domain expertise for rubric development.

Citation πŸ“œ

@misc{latif2025sketchmindmultiagentcognitiveframework,
      title={SketchMind: A Multi-Agent Cognitive Framework for Assessing Student-Drawn Scientific Sketches}, 
      author={Ehsan Latif and Zirak Khan and Xiaoming Zhai},
      year={2025},
      eprint={2507.22904},
      archivePrefix={arXiv},
      primaryClass={cs.HC},
      url={https://arxiv.org/abs/2507.22904}, 
}

Contact βœ‰οΈ

For questions, collaborations, or support:


License

This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


Looking forward to your feedback, contributions, and stars! 🌟
