AI-powered business card scanner that instantly delivers company intelligence, strategic insights, and contact profiles — all from a single photo.
CardSense transforms a business card photo into a full intelligence briefing in seconds. It is designed for sales professionals, recruiters, and business developers who need instant context before meetings.
- 📸 Scan — Snap a photo of a business card (or enter details manually)
- 🤖 AI Analysis — Three specialized agents research the person and company in real-time
- 📊 Dashboard — View structured intelligence across multiple dimensions
┌─────────────────────────────────────────────────────┐
│ Frontend (React 18) │
│ Babel Standalone + Tailwind CSS │
│ Mobile-First SPA │
└──────────────────┬──────────────────────────────────┘
│ REST API
┌──────────────────▼──────────────────────────────────┐
│ Backend (FastAPI + Uvicorn) │
│ │
│ ┌────────────┐ ┌────────────────────────────────┐ │
│ │ VLM Scan │ │ Agent Orchestrator │ │
│ │ (OCR) │ │ │ │
│ │ │ │ ┌──────┐ ┌────────┐ ┌──────┐ │ │
│ │ Image → │ │ │Macro │ │Strategy│ │Person│ │ │
│ │ Structured │ │ │Agent │ │Agent │ │Agent │ │ │
│ │ Data │ │ └──┬───┘ └───┬────┘ └──┬───┘ │ │
│ └────────────┘ │ │ │ │ │ │
│ └─────┼─────────┼─────────┼──────┘ │
│ │ │ │ │
└────────────────────────┼─────────┼─────────┼─────────┘
▼ ▼ ▼
Google Gemini API + Search Grounding
The Vision Language Model (VLM) module extracts structured data from business card images:
- Company name, person name (Kanji + Romaji), title, department
- Contact details (email, phone, URL, address)
- "Vibe" analysis — Classifies card design aesthetics (e.g., "Innovative & Ambitious", "Traditional & Solid") to infer corporate culture
- Confidence scoring (0.0–1.0) based on image clarity
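The extracted fields listed above could be carried in a structure like the following. This is a minimal sketch, not the project's actual `models.py`: it uses a standard-library dataclass in place of the Pydantic models the backend uses, and the class name `CardScanResult`, the field names, and the review threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CardScanResult:
    """Hypothetical container for the fields the VLM module extracts."""

    company_name: str
    person_name_kanji: Optional[str] = None
    person_name_romaji: Optional[str] = None
    title: Optional[str] = None
    department: Optional[str] = None
    email: Optional[str] = None
    phone: Optional[str] = None
    url: Optional[str] = None
    address: Optional[str] = None
    vibe: Optional[str] = None  # e.g. "Traditional & Solid"
    confidence: float = 0.0     # 0.0-1.0, based on image clarity

    def __post_init__(self) -> None:
        # Reject scores outside the documented 0.0-1.0 range.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(
                f"confidence must be in [0.0, 1.0], got {self.confidence}"
            )

    def needs_manual_review(self, threshold: float = 0.6) -> bool:
        # The 0.6 threshold is an illustrative assumption, not a project setting.
        return self.confidence < threshold
```

A low-confidence scan could then be routed to the manual entry path, for example `if result.needs_manual_review(): ...`.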
Each agent leverages Google Search Grounding for real-time web research:
| Agent | Role | Output |
|---|---|---|
| 🏢 Macro Agent | Company IR Analyst | Company overview, financials, latest news (last 3 months) |
| 🔧 Strategy Agent | Tech Strategy Consultant | Tech stack analysis, hiring pattern insights, growth direction |
| 👤 Person Agent | Executive Headhunter | Professional profile, career timeline, social media links |
The Person Agent applies strict identity verification to reduce the risk of AI hallucination:
Identity Lock Protocol:
1. Search queries MUST include "{name}" AND "{company}"
2. Cross-reference temporal consistency (career timeline)
3. Verify domain expertise alignment
4. If uncertain → explicitly state "limited public information found"
5. Never fabricate or guess — absent data = "not found"
6. Past affiliations are noted as "previously at {company}" with caveats
These rules are meant to keep the agent from confusing the target person with someone who has the same name, and from filling gaps with invented information.
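Rules 1, 5, and 6 of the protocol lend themselves to small helper functions. The sketch below is one possible shape, assuming plain string handling. The function names `build_identity_query`, `report_field`, and `describe_affiliation` are hypothetical and do not come from `person_agent.py`, which may enforce these rules purely through the agent prompt.

```python
from typing import Optional

NOT_FOUND = "not found"


def build_identity_query(name: str, company: str, topic: str = "") -> str:
    """Rule 1: every query quotes both the person's name and the company."""
    name, company, topic = name.strip(), company.strip(), topic.strip()
    if not name or not company:
        raise ValueError("both name and company are required")
    query = f'"{name}" AND "{company}"'
    return f"{query} {topic}" if topic else query


def report_field(value: Optional[str]) -> str:
    """Rule 5: absent data is reported as "not found", never guessed."""
    if value is None or not value.strip():
        return NOT_FOUND
    return value.strip()


def describe_affiliation(company: str, current_company: str) -> str:
    """Rule 6: label past affiliations instead of presenting them as current."""
    company = company.strip()
    if company.lower() == current_company.strip().lower():
        return company
    return f"previously at {company}"
```

Refusing to build a query when either the name or the company is missing keeps a search from running on the name alone, which is the case most likely to match a namesake.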
| Layer | Technology | Purpose |
|---|---|---|
| Frontend | React 18 + TypeScript | Single-page application (transpiled via Babel Standalone) |
| Styling | Tailwind CSS 3.x | Utility-first CSS with dark mode, glassmorphism effects |
| Backend | Python 3.11+ / FastAPI | REST API server with async support |
| AI Model | Google Gemini 2.0 Flash | VLM scanning + agent reasoning (configurable via .env) |
| Search | Google Search Grounding | Real-time web research for each agent |
| Server | Uvicorn | ASGI server with hot-reload |
- Python 3.11+
- Node.js (for optional tunnel access)
- Google Gemini API key (available from Google AI Studio)
# Clone the repository
git clone https://github.com/YOUR_USERNAME/cardsense.git
cd cardsense
# Install Python dependencies
cd backend
pip install -r requirements.txt
# Configure environment
cp .env.example .env
# Edit .env and add your GEMINI_API_KEY
# Start the server
cd backend
python -m uvicorn main:app --host 0.0.0.0 --port 8000 --reload
Open http://localhost:8000 in your browser.
To test on a mobile device with camera access (requires HTTPS):
npx localtunnel --port 8000
# Opens a public HTTPS URL you can access from your phone
All model settings are managed via backend/.env:
GEMINI_API_KEY=your_api_key_here
MODEL_ID=gemini-2.0-flash # Main model for all agents
DEEP_RESEARCH_MODEL_ID=gemini-2.0-flash # Model for deep research feature
cardsense/
├── frontend/
│ ├── index.html # Entry point (loads React + Babel via CDN)
│ └── index.tsx # Full SPA (components, state, routing)
├── backend/
│ ├── main.py # FastAPI app, routes, static file serving
│ ├── models.py # Pydantic data models
│ ├── vlm.py # Vision Language Model — business card OCR
│ ├── requirements.txt # Python dependencies
│ ├── .env.example # Environment template
│ └── agents/
│ ├── __init__.py # Agent module exports
│ ├── macro_agent.py # Company overview + news agent
│ ├── strategy_agent.py # Tech & hiring strategy agent
│ └── person_agent.py # Person profile + career agent
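As an illustration of how the backend might read the three variables from `backend/.env`, here is a minimal sketch. It assumes the values have already been loaded into the process environment (for example by a dotenv loader). The `Settings` class, the `load_settings` function, and the fallback to `gemini-2.0-flash` when a model variable is unset are assumptions, not the project's confirmed behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_MODEL = "gemini-2.0-flash"  # assumed fallback, mirrors the example .env


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    model_id: str
    deep_research_model_id: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the documented variables from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        # Fail early rather than at the first API call.
        raise RuntimeError("GEMINI_API_KEY is not set")
    return Settings(
        gemini_api_key=api_key,
        model_id=env.get("MODEL_ID", "").strip() or DEFAULT_MODEL,
        deep_research_model_id=(
            env.get("DEEP_RESEARCH_MODEL_ID", "").strip() or DEFAULT_MODEL
        ),
    )
```

Accepting a mapping instead of reading `os.environ` directly makes the function easy to exercise in tests without touching the real environment.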
- No build step required — Frontend uses Babel Standalone for in-browser TypeScript transpilation, making it instantly deployable without npm/webpack configuration
- Sequential agent execution — Agents run one at a time with configurable delays to respect API rate limits on free-tier plans
- Mobile-first UI — Large camera button, touch-friendly interface optimized for on-the-go use at business events
- Graceful degradation — Each agent handles errors independently; if one fails, others still display results
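The sequential execution and graceful degradation points above can be sketched together. This is an illustrative outline under stated assumptions, not the project's orchestrator: the function `run_agents_sequentially`, the result dictionary shape, and the 2-second default delay are hypothetical, and each agent is modeled as a plain callable that takes the scanned card data.

```python
import time
from typing import Any, Callable, Dict

Agent = Callable[[Dict[str, Any]], Dict[str, Any]]


def run_agents_sequentially(
    agents: Dict[str, Agent],
    card: Dict[str, Any],
    delay_seconds: float = 2.0,  # assumed default, tune to your rate limit
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Dict[str, Any]]:
    """Run agents one at a time, pausing between calls and isolating failures."""
    results: Dict[str, Dict[str, Any]] = {}
    for index, (name, agent) in enumerate(agents.items()):
        if index > 0:
            # Pause between agents to stay under free-tier rate limits.
            sleep(delay_seconds)
        try:
            results[name] = {"status": "ok", "data": agent(card)}
        except Exception as exc:
            # One failing agent must not hide the others' results.
            results[name] = {"status": "error", "error": str(exc)}
    return results
```

Because every agent gets its own entry with a `status` field, a frontend could render the successful sections and show an error placeholder only for the one that failed.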
MIT License — feel free to use, modify, and distribute.