Generate image captions and audio descriptions with AI, in a fast and intuitive way.
Layout • Features • Getting Started • Useful Scripts • Supported Images • How to Use • Common Issues • Preview
Project prototype:
- Image upload or image URL
- Caption generation with Gemini (Google GenAI)
- Easy copy to clipboard
- Audio description (voice)
- Install dependencies:
  npm install
- Create a .env file in the project root:
  VITE_GOOGLE_GENAI_API_KEY=your_key_here
  (Get your key from Google AI Studio)
- Start in dev mode:
  npm run dev

Useful scripts:
- npm run dev: start local server
- npm run build: production build
- npm run preview: preview production build
- npm run lint: code linting
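
As a rough sketch of how the .env step above could be checked at startup: the helper below is hypothetical (the name `requireApiKey` and the error text are not from this project). It takes the environment object as a parameter so it can run outside Vite. In a Vite app, variables prefixed with `VITE_` are exposed on `import.meta.env`, so that object would be the likely argument.

```typescript
// Hypothetical helper, not part of this project: fails early when the key is absent.
type Env = Record<string, string | undefined>;

function requireApiKey(env: Env): string {
  const key = env.VITE_GOOGLE_GENAI_API_KEY?.trim();
  // Treat the placeholder from the setup instructions as "not configured".
  if (!key || key === "your_key_here") {
    throw new Error(
      "VITE_GOOGLE_GENAI_API_KEY is missing. Add it to .env in the project root."
    );
  }
  return key;
}

// Assumed usage inside the app: const apiKey = requireApiKey(import.meta.env);
```

Failing at startup gives a clearer message than a failed caption request later on.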
- Supported formats: JPG, PNG, GIF, WEBP, BMP
- Maximum file size: 5 MB
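
The format and size limits above could be enforced before any request is sent. This is a minimal sketch, not the project's actual validation: the MIME type mapping, the reading of "5 MB" as 5 * 1024 * 1024 bytes, and the name `validateImage` are all assumptions.

```typescript
// Assumed MIME types for JPG, PNG, GIF, WEBP, and BMP.
const SUPPORTED_TYPES = new Set([
  "image/jpeg",
  "image/png",
  "image/gif",
  "image/webp",
  "image/bmp",
]);

// Assumption: "5 MB" is interpreted as 5 * 1024 * 1024 bytes.
const MAX_BYTES = 5 * 1024 * 1024;

// Only the two fields the check needs; a browser File object has both.
interface ImageInfo {
  type: string;
  size: number;
}

// Returns an error message, or null when the image passes both checks.
function validateImage(file: ImageInfo): string | null {
  if (!SUPPORTED_TYPES.has(file.type.toLowerCase())) {
    return "Unsupported format. Use JPG, PNG, GIF, WEBP, or BMP.";
  }
  if (file.size > MAX_BYTES) {
    return "File is larger than 5 MB.";
  }
  return null;
}
```

Because `ImageInfo` is structural, a `File` taken from an upload input could be passed in directly.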
- Enter the image URL or upload a file
- Click "Generate descriptions"
- Copy the text or listen to the audio description
- Error generating caption: check VITE_GOOGLE_GENAI_API_KEY in .env
- Image not loading: make sure the URL is public
- Audio not playing: use a modern browser that supports the Web Speech API
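
For the audio issue above, one option is to detect Web Speech API support before trying to speak. The sketch below is illustrative only: `speakDescription` and `SpeechScope` are hypothetical names, and the scope is passed in so the function can also run where no browser globals exist. In a browser, the default `globalThis` provides `speechSynthesis` and `SpeechSynthesisUtterance`.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
type SpeechScope = {
  speechSynthesis?: { speak(utterance: unknown): void };
  SpeechSynthesisUtterance?: new (text: string) => unknown;
};

// Hypothetical helper: returns true if speech was requested,
// false if the Web Speech API is unavailable in the given scope.
function speakDescription(
  text: string,
  scope: SpeechScope = globalThis as unknown as SpeechScope
): boolean {
  const { speechSynthesis, SpeechSynthesisUtterance } = scope;
  if (!speechSynthesis || !SpeechSynthesisUtterance) {
    return false;
  }
  speechSynthesis.speak(new SpeechSynthesisUtterance(text));
  return true;
}
```

A `false` result could be used to hide the audio button or show a hint to switch browsers.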
