AI-Gauntlet is a proof-of-concept app where two AI-generated answers are presented anonymously to the user. The backend creates both candidate solutions with different AI models, judges them with a third model, and the frontend reveals the model only after the user picks an answer.
The app also supports a leaderboard concept: AI models are ranked by judge score so users can compare model performance over time.
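The ranking idea can be sketched as a small aggregation over judged rounds. This is only an illustration: the `JudgedResult` and `LeaderboardEntry` shapes and the `buildLeaderboard` helper are assumptions, not code from the repository, and averaging is one possible way to combine judge scores over time.

```typescript
// Hypothetical shape of one judged answer; field names are assumptions.
interface JudgedResult {
  model: string; // e.g. "mistral" or "cohere"
  score: number; // score assigned by the judge model
}

interface LeaderboardEntry {
  model: string;
  rounds: number;
  averageScore: number;
}

// Aggregate judge scores per model and sort by average score, highest first.
function buildLeaderboard(results: JudgedResult[]): LeaderboardEntry[] {
  const totals: Record<string, { sum: number; rounds: number }> = {};
  for (const { model, score } of results) {
    if (!totals[model]) {
      totals[model] = { sum: 0, rounds: 0 };
    }
    totals[model].sum += score;
    totals[model].rounds += 1;
  }
  return Object.keys(totals)
    .map((model) => ({
      model,
      rounds: totals[model].rounds,
      averageScore: totals[model].sum / totals[model].rounds,
    }))
    .sort((a, b) => b.averageScore - a.averageScore);
}

// Example data (made up): two rounds, each scoring both models.
const leaderboard = buildLeaderboard([
  { model: "mistral", score: 8 },
  { model: "cohere", score: 6 },
  { model: "mistral", score: 6 },
  { model: "cohere", score: 9 },
]);
console.log(leaderboard);
```

With the sample data, `cohere` averages 7.5 and `mistral` averages 7, so `cohere` is listed first.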
- The backend starts an Express server in `backend/server.ts`.
- The `/run-graph` endpoint in `backend/src/app.ts` calls `runGraph()` from `backend/src/services/graph.service.ts`.
- `graph.service.ts` builds a LangChain `StateGraph` with:
  - a `solution` node that creates two answers using `MistralModel` and `cohereModel`
  - a `judge_node` that evaluates both answers using `GeminiModel`
- The judge node parses the AI judge output and returns JSON scores plus reasoning.
- The final response includes both solutions and the judge results.
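The flow described above can be sketched in plain TypeScript. This is a dependency-free illustration, not the repository's `graph.service.ts`, and it deliberately does not use the LangGraph API. The `Model` type, the state and score field names, and the stub models are all assumptions. Real model calls would be asynchronous.

```typescript
// Plain-TypeScript sketch of the two-step flow. It does NOT use the LangGraph API;
// every type, field name, and stub model here is an assumption for illustration.

// A model is reduced to a synchronous text-in, text-out function for brevity.
type Model = (prompt: string) => string;

interface JudgeResult {
  solution_1_score: number;
  solution_2_score: number;
  reasoning: string;
}

interface GraphState {
  problem: string;
  solution_1: string;
  solution_2: string;
  judge?: JudgeResult;
}

// Solution step: ask two different models to answer the same problem.
function solutionNode(problem: string, modelA: Model, modelB: Model): GraphState {
  return { problem, solution_1: modelA(problem), solution_2: modelB(problem) };
}

// Judge models may wrap their JSON in extra prose, so take the text between
// the first "{" and the last "}" before parsing.
function parseJudgeOutput(raw: string): JudgeResult {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end < start) {
    throw new Error("Judge output contains no JSON object");
  }
  const parsed = JSON.parse(raw.slice(start, end + 1));
  if (
    typeof parsed.solution_1_score !== "number" ||
    typeof parsed.solution_2_score !== "number" ||
    typeof parsed.reasoning !== "string"
  ) {
    throw new Error("Judge output is missing scores or reasoning");
  }
  return {
    solution_1_score: parsed.solution_1_score,
    solution_2_score: parsed.solution_2_score,
    reasoning: parsed.reasoning,
  };
}

// Judge step: a third model scores both answers.
function judgeNode(state: GraphState, judge: Model): GraphState {
  const prompt =
    `Problem: ${state.problem}\n` +
    `Solution 1: ${state.solution_1}\n` +
    `Solution 2: ${state.solution_2}\n` +
    "Reply with JSON: solution_1_score, solution_2_score, reasoning.";
  return { ...state, judge: parseJudgeOutput(judge(prompt)) };
}

// Stub models standing in for the Mistral, Cohere, and Gemini clients.
const stubMistral: Model = (p) => `mistral-stub answer to: ${p}`;
const stubCohere: Model = (p) => `cohere-stub answer to: ${p}`;
const stubGemini: Model = () =>
  'Verdict: {"solution_1_score": 7, "solution_2_score": 9, "reasoning": "Second answer is more complete."}';

const finalState = judgeNode(
  solutionNode("Reverse a string", stubMistral, stubCohere),
  stubGemini
);
console.log(finalState);
```

The final state carries both solutions and the parsed judge result, which mirrors the response shape described in the list above.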
- Backend: Node.js, Express, TypeScript
- AI / LangChain: `@langchain/langgraph`, `@langchain/google-genai`, `@langchain/mistralai`, `@langchain/cohere`
- Environment config: `dotenv`
- Frontend: React + Tailwind CSS (planned for UI)

- `backend/server.ts`: starts the Express server
- `backend/src/app.ts`: defines routes and health checks
- `backend/src/services/graph.service.ts`: builds and executes the LangChain graph
- `backend/src/services/model.service.ts`: configures AI model clients
- `backend/src/config/config.ts`: loads API keys from `.env`

    git clone <repo-url>
    cd AI-Gauntlet
    cd backend
    npm install

Add keys for the supported AI providers:

    GEMINI_API_KEY=your_gemini_api_key
    MISTRAL_API_KEY=your_mistral_api_key
    COHERE_API_KEY=your_cohere_api_key

Then start the backend:

    npm run dev

The backend listens on port 3000.
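
A missing key is easier to diagnose at startup than on the first model call. The sketch below shows one way to validate the three variables listed above. `loadApiKeys` is a hypothetical helper, not the contents of `backend/src/config/config.ts`. It takes an env-like object, so in the backend it would presumably be called with `process.env` after `dotenv` has loaded `.env`.

```typescript
// The three variable names listed above.
const REQUIRED_KEYS = ["GEMINI_API_KEY", "MISTRAL_API_KEY", "COHERE_API_KEY"] as const;

type KeyName = (typeof REQUIRED_KEYS)[number];
type ApiKeys = Record<KeyName, string>;

// Hypothetical helper: checks an env-like object and returns the keys,
// throwing one error that names every missing or blank variable.
function loadApiKeys(env: Record<string, string | undefined>): ApiKeys {
  const missing = REQUIRED_KEYS.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    GEMINI_API_KEY: env.GEMINI_API_KEY as string,
    MISTRAL_API_KEY: env.MISTRAL_API_KEY as string,
    COHERE_API_KEY: env.COHERE_API_KEY as string,
  };
}

// Example with placeholder values; real code would pass process.env instead.
const keys = loadApiKeys({
  GEMINI_API_KEY: "your_gemini_api_key",
  MISTRAL_API_KEY: "your_mistral_api_key",
  COHERE_API_KEY: "your_cohere_api_key",
});
console.log(Object.keys(keys));
```

Reporting all missing variables in one error avoids the fix-one, rerun, fix-the-next loop during first-time setup.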
Open a browser or use curl:

    curl http://localhost:3000/health

You should receive:

    { "status": "ok" }

To run the AI graph, request the `/run-graph` endpoint:

    curl http://localhost:3000/run-graph

It returns the AI graph result as JSON.
The frontend is intended to use React and Tailwind CSS. If you want to build the UI locally:
- Create a React app under `frontend`
- Add Tailwind CSS
- Point UI requests to `http://localhost:3000/run-graph`
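
As a rough sketch of the last step, the UI could wrap the request in a small client function. Everything here is an assumption, since the frontend is not implemented yet: `runGraphUrl`, `fetchGraphResult`, and the `FetchLike` type are hypothetical names, and the response is left as `unknown` because its exact shape is not documented here.

```typescript
// Minimal fetch-like signature so the sketch compiles without browser typings.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Port 3000 comes from the backend setup; the constant name is an assumption.
const DEFAULT_BACKEND_URL = "http://localhost:3000";

// Build the endpoint URL, tolerating a trailing slash on the base URL.
function runGraphUrl(baseUrl: string = DEFAULT_BACKEND_URL): string {
  return `${baseUrl.replace(/\/+$/, "")}/run-graph`;
}

// Request the graph result and fail loudly on a non-2xx response.
async function fetchGraphResult(fetcher: FetchLike, baseUrl?: string): Promise<unknown> {
  const response = await fetcher(runGraphUrl(baseUrl));
  if (!response.ok) {
    throw new Error(`run-graph request failed with status ${response.status}`);
  }
  return response.json();
}

console.log(runGraphUrl());
```

In a React component this could be called as `fetchGraphResult((url) => fetch(url))`. Passing the fetcher in also makes the function easy to exercise with a fake in tests.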
- The two candidate answers will be shown anonymously, without model names attached.
- When a user selects one of the answers, the UI will reveal which AI model produced it.
- The app will display a leaderboard ranking the AI models by the scores they receive from the judge.
- This gives users a fair comparison experience and a model performance summary over time.
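
The anonymous presentation and the reveal step could work roughly as follows. This is a sketch under stated assumptions: `anonymize`, `reveal`, and the data shapes are hypothetical, and shuffling the answer order is an added assumption so that position does not hint at the model.

```typescript
interface Candidate {
  model: string;
  answer: string;
}

interface AnonymousRound {
  // What the UI shows: labels and answer text only, no model names.
  answers: { label: string; answer: string }[];
  // Label-to-model mapping, kept hidden until the user picks an answer.
  key: Record<string, string>;
}

// Shuffle the candidates (Fisher-Yates) and replace model names with labels.
// The random source is injectable so the behaviour can be tested.
function anonymize(
  candidates: Candidate[],
  random: () => number = Math.random
): AnonymousRound {
  const shuffled = candidates.slice();
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = shuffled[i];
    shuffled[i] = shuffled[j];
    shuffled[j] = tmp;
  }
  const round: AnonymousRound = { answers: [], key: {} };
  shuffled.forEach((candidate, index) => {
    const label = `Answer ${String.fromCharCode(65 + index)}`; // "Answer A", "Answer B"
    round.answers.push({ label, answer: candidate.answer });
    round.key[label] = candidate.model;
  });
  return round;
}

// Reveal which model wrote the answer the user picked.
function reveal(round: AnonymousRound, pickedLabel: string): string {
  const model = round.key[pickedLabel];
  if (model === undefined) {
    throw new Error(`Unknown answer label: ${pickedLabel}`);
  }
  return model;
}

const round = anonymize([
  { model: "mistral", answer: "Use a two-pointer swap." },
  { model: "cohere", answer: "Split, reverse, and join." },
]);
console.log(round.answers);
console.log(reveal(round, "Answer A"));
```

Keeping the label-to-model mapping separate from the displayed answers means the UI component that renders the choices never needs to receive model names.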
Note: The current repository has backend implementation only. The frontend folder is prepared for a React + Tailwind CSS app.