GaganaMD/QuestionMe

QuestionMe

Second-place system in AMD's AAIPL competition

A defensive multi-agent Q&A system designed for adversarial AI competitions, with strategies for mitigating bias exploitation, detecting poisoned questions, and evaluating answers in a balanced way.


🏆 Competition Features

  • Dual Agent Architecture: Combines a Question Generator (Q-Agent) with an Answer Predictor (A-Agent).
  • Robust Defensive Strategies: Implements poison detection, A-bias exploitation safeguards, and answer distribution control.
  • Local Model Support: Integrates Qwen3-4B without relying on external APIs.
  • Comprehensive Evaluation: Balanced testing with detailed bias analysis.

🚀 Quick Start

Setup Environment

pip install -r requirements.txt
cp .env.example .env

Run Test Mode

python scripts/run_competition.py --mode test --questions 5

Generate Questions

python agents/question_agent.py --num_questions 10 --output_file outputs/questions.json

Answer Questions

python agents/answer_agent.py --input_file outputs/questions.json --output_file outputs/answers.json

🛡️ Defensive Strategies

A-Agent Defenses

  • Bias Exploitation Counter: Defaults to "A" only when confidence < 50%
  • Poison Detection: Pattern-based recognition of adversarial manipulations
  • Format Compliance: Tolerant JSON parsing with structural fallbacks
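
The three defenses above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names, the `POISON_PATTERNS` list, the `"answer"` JSON key, and the choice to fall back to "A" when the output cannot be parsed are all assumptions made for the example.

```python
import json
import re

VALID_CHOICES = ("A", "B", "C", "D")

# Hypothetical patterns; the real pattern list is not shown in this README.
POISON_PATTERNS = [
    re.compile(r"\bnot\b.*\b(?:not|never|no)\b", re.IGNORECASE),  # double negative
    re.compile(r"ignore (?:all|the) (?:previous|above)", re.IGNORECASE),  # injected instruction
]


def looks_poisoned(question: str) -> bool:
    """Return True if any suspicious pattern matches the question text."""
    return any(p.search(question) for p in POISON_PATTERNS)


def parse_answer(raw: str):
    """Extract an answer letter from model output, tolerating malformed JSON."""
    # First attempt: a strict JSON object somewhere in the output.
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match:
        try:
            value = json.loads(match.group(0)).get("answer", "")
            letter = str(value).strip().upper()[:1]
            if letter in VALID_CHOICES:
                return letter
        except json.JSONDecodeError:
            pass  # fall through to the structural fallback
    # Structural fallback: text such as  'answer': 'c'  or  Answer: C
    match = re.search(r"answer\W+([A-D])\b", raw, re.IGNORECASE)
    return match.group(1).upper() if match else None


def choose_answer(raw: str, confidence: float, threshold: float = 0.5) -> str:
    """Keep the parsed answer when confident; otherwise default to "A"."""
    letter = parse_answer(raw)
    if letter is not None and confidence >= threshold:
        return letter
    # Assumption: unparseable output is treated like low confidence.
    return "A"
```

For example, `choose_answer('{"answer": "D"}', 0.9)` keeps "D", while the same output with a confidence of 0.3 falls back to "A".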

Q-Agent Techniques

  • Poison Generation: Uses complex grammar, misleading context, and double negatives
  • Balanced Answer Distribution: Evenly splits answers across A/B/C/D
  • Quality Control: Enforces token standards and question clarity
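
One way to get an even split of correct answers across A/B/C/D is to assign the answer keys up front and then place each correct option at its assigned position. The sketch below assumes four options per question; `balanced_answer_keys` and `place_correct_option` are hypothetical helper names, not functions from this repository.

```python
import random

CHOICES = ("A", "B", "C", "D")


def balanced_answer_keys(num_questions: int, seed=None) -> list:
    """Return one answer letter per question, spread as evenly as possible."""
    rng = random.Random(seed)
    # Cycle through A, B, C, D so counts differ by at most one, then shuffle
    # so the order of keys is not predictable.
    keys = [CHOICES[i % len(CHOICES)] for i in range(num_questions)]
    rng.shuffle(keys)
    return keys


def place_correct_option(correct: str, distractors: list, target_letter: str) -> dict:
    """Arrange the options so the correct one sits at the target letter."""
    if len(distractors) != len(CHOICES) - 1:
        raise ValueError("expected exactly three distractors")
    options = list(distractors)
    options.insert(CHOICES.index(target_letter), correct)
    return dict(zip(CHOICES, options))
```

When the number of questions is a multiple of four, each letter is the correct answer equally often; otherwise the counts differ by at most one.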

🔧 System Architecture

├── Q-Agent: Defensive question generation module
├── A-Agent: Bias-aware answer prediction module
└── Evaluation: Bias detection and robustness scoring
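
As an illustration of what the bias-detection step in Evaluation might compute, the sketch below measures the share of each answer letter and how far the most skewed share is from a uniform split. Both function names are hypothetical, and the actual scoring used by the project may differ.

```python
from collections import Counter


def answer_distribution(answers, choices=("A", "B", "C", "D")) -> dict:
    """Return the fraction of answers given for each choice."""
    counts = Counter(answers)
    total = len(answers)
    return {c: (counts[c] / total if total else 0.0) for c in choices}


def max_bias(distribution: dict) -> float:
    """Largest absolute deviation of any share from a uniform split."""
    uniform = 1 / len(distribution)
    return max(abs(share - uniform) for share in distribution.values())
```

A perfectly balanced set of answers gives a `max_bias` of 0.0, while an agent that answers "A" every time gives 0.75.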

🎯 Competition Results

  • Reduced A-bias from 84% to a balanced 25% per option
  • Achieved 80%+ poison question detection accuracy
  • Maintained perfect 25/25/25/25 answer distribution in evaluation
