Vishxlll20/AI-Gauntlet

AI-Gauntlet

What it is

AI-Gauntlet is a proof-of-concept app that presents two AI-generated answers to the user anonymously. The backend generates both candidate solutions with two different AI models and scores them with a third model. The planned frontend reveals which model wrote an answer only after the user picks it.

The design also includes a leaderboard: AI models are ranked by their judge scores so users can compare model performance over time.
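As a rough illustration of ranking models by judge score, the sketch below averages the scores per model and sorts from highest to lowest. The JudgedAnswer and LeaderboardRow shapes, the buildLeaderboard name, and the choice of a simple average are all assumptions made for this example. The repository does not specify how the leaderboard is computed.

```typescript
// Hypothetical record of one judged answer: the model that wrote it and
// the score the judge assigned. Field names are assumptions.
interface JudgedAnswer {
  model: string;
  score: number;
}

interface LeaderboardRow {
  model: string;
  rounds: number;
  averageScore: number;
}

// Aggregate judge scores per model and sort by average score, highest first.
function buildLeaderboard(results: JudgedAnswer[]): LeaderboardRow[] {
  const totals = new Map<string, { sum: number; rounds: number }>();
  for (const { model, score } of results) {
    const entry = totals.get(model) ?? { sum: 0, rounds: 0 };
    entry.sum += score;
    entry.rounds += 1;
    totals.set(model, entry);
  }
  return Array.from(totals, ([model, { sum, rounds }]) => ({
    model,
    rounds,
    averageScore: sum / rounds,
  })).sort((a, b) => b.averageScore - a.averageScore);
}

// Example input with made-up scores for two models over two rounds.
const board = buildLeaderboard([
  { model: "mistral", score: 8 },
  { model: "cohere", score: 6 },
  { model: "mistral", score: 6 },
  { model: "cohere", score: 9 },
]);
```

A plain average treats every round equally. Other aggregations (win rate, rating systems) would fit the same interface.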

How it works

  1. The backend starts an Express server in backend/server.ts.
  2. The /run-graph endpoint in backend/src/app.ts calls runGraph() from backend/src/services/graph.service.ts.
  3. graph.service.ts builds a LangGraph StateGraph with:
    • a solution node that creates two answers using MistralModel and cohereModel
    • a judge_node that evaluates both answers using GeminiModel
  4. The judge node parses the AI judge output and returns JSON scores plus reasoning.
  5. The final response includes both solutions and the judge results.
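The flow in steps 3 to 5 can be sketched without any LangGraph dependency as two plain async functions that pass a state object along. This is not the code in graph.service.ts. The state fields, the Model type, the prompt wording, and the judge's JSON shape (solution_1_score, solution_2_score, reasoning) are assumptions chosen for illustration.

```typescript
// Hypothetical judge output shape; the real field names may differ.
interface JudgeResult {
  solution_1_score: number;
  solution_2_score: number;
  reasoning: string;
}

// Hypothetical state handed from the solution step to the judge step.
interface GraphState {
  problem: string;
  solution_1?: string;
  solution_2?: string;
  judge?: JudgeResult;
}

// A model is reduced here to "prompt in, text out".
type Model = (prompt: string) => Promise<string>;

// Judges often wrap JSON in prose or code fences, so take the text between
// the first "{" and the last "}" and validate the fields before trusting it.
function parseJudgeOutput(raw: string): JudgeResult {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("Judge output contains no JSON object");
  }
  const parsed = JSON.parse(raw.slice(start, end + 1));
  if (
    typeof parsed.solution_1_score !== "number" ||
    typeof parsed.solution_2_score !== "number" ||
    typeof parsed.reasoning !== "string"
  ) {
    throw new Error("Judge output is missing scores or reasoning");
  }
  return {
    solution_1_score: parsed.solution_1_score,
    solution_2_score: parsed.solution_2_score,
    reasoning: parsed.reasoning,
  };
}

// Step 1: ask two different models for an answer to the same problem.
async function solutionNode(state: GraphState, modelA: Model, modelB: Model): Promise<GraphState> {
  const [solution_1, solution_2] = await Promise.all([
    modelA(state.problem),
    modelB(state.problem),
  ]);
  return { ...state, solution_1, solution_2 };
}

// Step 2: ask a third model to score both answers and parse its reply.
async function judgeNode(state: GraphState, judge: Model): Promise<GraphState> {
  const prompt =
    `Problem: ${state.problem}\n` +
    `Solution 1: ${state.solution_1}\n` +
    `Solution 2: ${state.solution_2}\n` +
    `Reply with JSON: {"solution_1_score": number, "solution_2_score": number, "reasoning": string}`;
  return { ...state, judge: parseJudgeOutput(await judge(prompt)) };
}

// Run both steps in order; the result holds both solutions and the judge result.
async function runPipeline(problem: string, modelA: Model, modelB: Model, judge: Model): Promise<GraphState> {
  const afterSolutions = await solutionNode({ problem }, modelA, modelB);
  return judgeNode(afterSolutions, judge);
}
```

Passing the models in as functions keeps the sketch testable with stubs. In the real service they would be the configured Mistral, Cohere, and Gemini clients.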

Tech stack

  • Backend: Node.js, Express, TypeScript
  • AI / LangChain: @langchain/langgraph, @langchain/google-genai, @langchain/mistralai, @langchain/cohere
  • Environment config: dotenv
  • Frontend: React + Tailwind CSS (planned for UI)

Backend structure

  • backend/server.ts — starts the Express server
  • backend/src/app.ts — defines routes and health checks
  • backend/src/services/graph.service.ts — builds and executes the LangGraph graph
  • backend/src/services/model.service.ts — configures AI model clients
  • backend/src/config/config.ts — loads API keys from .env
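A config module of this kind usually fails fast when a key is missing, so a bad .env shows up at startup and not on the first model call. The sketch below is one way to do that and is not the contents of config.ts. The loadConfig function and AppConfig type are hypothetical names. The three key names are the ones listed in the setup steps.

```typescript
// Keys the backend expects, taken from the .env example in the setup steps.
const REQUIRED_KEYS = ["GEMINI_API_KEY", "MISTRAL_API_KEY", "COHERE_API_KEY"] as const;

type RequiredKey = (typeof REQUIRED_KEYS)[number];
type AppConfig = Record<RequiredKey, string>;

// Hypothetical loader: takes an environment map (for example process.env)
// and throws one error naming every missing or empty key.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const config = {} as AppConfig;
  for (const key of REQUIRED_KEYS) {
    config[key] = env[key] as string;
  }
  return config;
}
```

In the backend this would typically be called as loadConfig(process.env) after dotenv has loaded the .env file. Taking the environment as a parameter keeps the function easy to test.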

Running this on your device

1. Clone the repository

git clone <repo-url>
cd AI-Gauntlet

2. Install backend dependencies

cd backend
npm install

3. Create a .env file in backend

Add keys for the supported AI providers:

GEMINI_API_KEY=your_gemini_api_key
MISTRAL_API_KEY=your_mistral_api_key
COHERE_API_KEY=your_cohere_api_key

4. Start the backend server

npm run dev

The backend listens on port 3000.

5. Verify the server

Open a browser or use curl:

curl http://localhost:3000/health

You should receive:

{ "status": "ok" }

6. Call the graph endpoint

curl http://localhost:3000/run-graph

The response is the graph result as JSON: both solutions plus the judge's scores and reasoning.

Frontend setup

The frontend is intended to use React and Tailwind CSS. If you want to build the UI locally:

  1. Create a React app under frontend
  2. Add Tailwind CSS
  3. Point UI requests to http://localhost:3000/run-graph
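For step 3, a small helper along these lines could wrap the request. It is only a sketch: fetchGraphResult, FetchLike, and HttpResponseLike are hypothetical names. The response is left as unknown because the README does not document its exact JSON shape.

```typescript
// Minimal structural types so the helper works with the browser's fetch
// or with a stub in tests. These names are assumptions for this example.
interface HttpResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type FetchLike = (url: string) => Promise<HttpResponseLike>;

// Request /run-graph from the backend and return the parsed JSON body.
// Trailing slashes on baseUrl are trimmed so the path is not doubled.
async function fetchGraphResult(baseUrl: string, fetchImpl: FetchLike): Promise<unknown> {
  const url = `${baseUrl.replace(/\/+$/, "")}/run-graph`;
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`run-graph request failed with status ${response.status}`);
  }
  return response.json();
}
```

In a React component this could be called as fetchGraphResult("http://localhost:3000", (url) => fetch(url)). The caller would then narrow the unknown result to whatever shape the backend actually returns.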

Planned UI behavior

  • The two candidate answers will be shown anonymously, without model names attached.
  • When a user selects one of the answers, the UI will reveal which AI model produced it.
  • The app will display a leaderboard ranking the AI models by the scores they receive from the judge.
  • This gives users a fair comparison experience and a model performance summary over time.
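The anonymous display and the reveal on selection could be modeled as below. This is a sketch of the planned behavior and not existing frontend code. The Candidate and AnonymousRound types, the anonymize and reveal functions, and the "A"/"B" labels are assumptions. The injectable random source exists only to make the shuffle deterministic in tests.

```typescript
// Hypothetical input: one answer per model.
interface Candidate {
  model: string;
  answer: string;
}

// What the UI would hold for one round: labelled answers without model
// names, plus a label-to-model key that stays hidden until the user picks.
interface AnonymousRound {
  answers: { label: string; answer: string }[];
  key: Record<string, string>;
}

// Shuffle the candidates (Fisher-Yates) so position does not hint at the
// model, then label them "A", "B", ... and strip the model names.
function anonymize(candidates: Candidate[], random: () => number = Math.random): AnonymousRound {
  const shuffled = [...candidates];
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  const answers: { label: string; answer: string }[] = [];
  const key: Record<string, string> = {};
  shuffled.forEach((candidate, index) => {
    const label = String.fromCharCode(65 + index);
    answers.push({ label, answer: candidate.answer });
    key[label] = candidate.model;
  });
  return { answers, key };
}

// Look up which model wrote the answer the user picked.
function reveal(round: AnonymousRound, pickedLabel: string): string {
  const model = round.key[pickedLabel];
  if (model === undefined) {
    throw new Error(`Unknown answer label: ${pickedLabel}`);
  }
  return model;
}
```

Shuffling matters because a fixed order (for example, Mistral always first) would let users learn which position belongs to which model.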

Note: The repository currently contains only the backend implementation. The frontend folder is reserved for a React + Tailwind CSS app.

About

AI-Gauntlet is a GenAI-based evaluation platform that generates multiple AI-driven solutions to a problem and uses an automated judging model to compare, score, and rank them. It leverages LangGraph workflows to orchestrate multi-model interactions, enabling fair comparison and performance tracking through a dynamic leaderboard system.
