cartoonimator

AI illustrates, code animates.

▶ Watch in HD (MP4, 4.3 MB) — a real KaiCalls commercial rendered with cartoonimator

Deterministic mascot animation: feed it a pose PNG, an audio WAV, and a few anchor coordinates. Get back a lip-synced cartoon video. No diffusion in the render loop, no frame-to-frame wobble, no GPU required.

from cartoonimator import load_mascot, render_scene

render_scene(
    mascot=load_mascot("mascots/kai"),
    audio_wav="hello.wav",
    background_png="assets/backgrounds/solid_deep_navy_1080x1920.png",
    output="hello.mp4",
)

The idea

Diffusion models are great at illustrating character art in a single image. They are bad at frame-to-frame consistency: every frame redraws the body, eyebrows, hands, and costume seams. Faces wobble. Hands grow extra fingers between frames.

So split the job:

AI is the illustrator. Generate one canonical body per pose, once. Lock it. (Use Stable Diffusion, GPT Image, Midjourney — whatever produces the model sheet you want.)
Code is the animator. PIL draws mouth states on top at known anchor coordinates. Rhubarb maps audio → visemes. FFmpeg muxes the result. The body never changes pixels between talking frames.

That's it. The body never wobbles because the body is a static image. The mouth follows the audio because Rhubarb says it should. The eyes blink because we draw a horizontal line over them every few seconds.
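
To make the audio-to-mouth step concrete, here is a minimal sketch of expanding Rhubarb's cue list into one mouth shape per video frame. It assumes Rhubarb's JSON export (a `mouthCues` list of `start`, `end`, `value` entries, with `X` as the rest shape). The function name `cues_to_frames`, the frame rate, and the sample cues are illustrative and are not taken from cartoonimator's internals.

```python
import json
import math


def cues_to_frames(rhubarb_json: str, fps: int = 30, rest_shape: str = "X") -> list[str]:
    """Expand Rhubarb mouth cues into one mouth-shape letter per video frame."""
    cues = json.loads(rhubarb_json)["mouthCues"]
    if not cues:
        return []
    duration = max(cue["end"] for cue in cues)
    n_frames = math.ceil(duration * fps)
    # Start with the rest shape everywhere, then paint each cue over its frames.
    frames = [rest_shape] * n_frames
    for cue in cues:
        first = round(cue["start"] * fps)
        last = min(round(cue["end"] * fps), n_frames)
        for i in range(first, last):
            frames[i] = cue["value"]
    return frames


# Hand-written sample in the shape of Rhubarb's JSON output (not real data).
sample = json.dumps({"mouthCues": [
    {"start": 0.0, "end": 0.1, "value": "X"},
    {"start": 0.1, "end": 0.3, "value": "B"},
    {"start": 0.3, "end": 0.5, "value": "E"},
]})
frames = cues_to_frames(sample, fps=10)
print("".join(frames))
```

Each letter then selects which mouth drawing is pasted at the pose's mouth anchor for that frame. The same input always yields the same frame list, which is where the determinism comes from.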

Quickstart

Install system dependencies:

# ffmpeg
sudo apt-get install ffmpeg                         # Debian/Ubuntu
brew install ffmpeg                                  # macOS

# Rhubarb Lip Sync (https://github.com/DanielSWolf/rhubarb-lip-sync/releases)
# Download the binary for your OS, put it on $PATH (or set RHUBARB_BINARY)

Install the package:

pip install cartoonimator

Verify the install:

cartoonimator demo out.mp4

Render with your own audio:

cartoonimator render \
    --mascot mascots/kai \
    --audio examples/hello.wav \
    --output hello.mp4

Or in Python:

from cartoonimator import load_mascot, render_scene

render_scene(
    mascot=load_mascot("mascots/kai"),
    audio_wav="hello.wav",
    background_png="assets/backgrounds/solid_deep_navy_1080x1920.png",
    output="hello.mp4",
    pose_cut_interval_s=2.0,
)
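
One way to picture `pose_cut_interval_s`: the audio is divided into fixed-length segments and each segment is shown with one pose. The sketch below is an assumption about that behavior, not the renderer's code. `pose_schedule` is a hypothetical helper, and cycling through poses in order is just one possible selection rule.

```python
def pose_schedule(
    duration_s: float, interval_s: float, pose_ids: list[str]
) -> list[tuple[float, float, str]]:
    """Split a clip into (start, end, pose_id) segments of at most interval_s seconds."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    if not pose_ids:
        raise ValueError("at least one pose is required")
    segments = []
    index = 0
    start = 0.0
    while start < duration_s:
        end = min(start + interval_s, duration_s)
        # Assumption: poses are cycled in order; a real renderer may choose differently.
        segments.append((start, end, pose_ids[index % len(pose_ids)]))
        index += 1
        start = index * interval_s  # multiply instead of accumulating to avoid drift
    return segments


schedule = pose_schedule(5.0, 2.0, ["standing_open_hands", "pointing_at_camera"])
for start, end, pose in schedule:
    print(f"{start:.1f}-{end:.1f}s  {pose}")
```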

Bring your own mascot

A mascot is a directory with three files plus a poses folder:

my_mascot/
├── anchors.json            # mouth + eye coordinates per pose
├── poses-manifest.json     # pose IDs and filenames
├── character-bible.md      # personality notes (optional)
└── poses/
    ├── standing_open_hands.png
    ├── pointing_at_camera.png
    └── ...

The included mascots/kai/ is the reference. To make your own:

Generate pose PNGs at 1024×1024 with transparent backgrounds. (Tip: green-screen the AI output, then run bg_remover.remove_green to get clean alpha.)
Run cartoonimator tag --mascot my_mascot --port 8801 and click mouth/eye anchors in the browser at http://localhost:8801.
Render.

See docs/anchors.md for the anchor schema and tagger workflow, and docs/architecture.md for how the renderer assembles a scene.
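
Before rendering, it can help to confirm that a mascot directory matches the layout shown above. This is a minimal sketch that only checks for the listed files and folder. It does not validate the JSON contents, whose schema lives in docs/anchors.md, and `check_mascot_dir` is a hypothetical helper rather than part of the package.

```python
import tempfile
from pathlib import Path

# character-bible.md is optional in the layout above, so it is not checked here.
REQUIRED_FILES = ("anchors.json", "poses-manifest.json")


def check_mascot_dir(mascot_dir) -> list[str]:
    """Return a list of layout problems; an empty list means the layout looks right."""
    root = Path(mascot_dir)
    problems = []
    for name in REQUIRED_FILES:
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")
    poses = root / "poses"
    if not poses.is_dir():
        problems.append("missing folder: poses/")
    elif not any(poses.glob("*.png")):
        problems.append("poses/ contains no PNG files")
    return problems


# Demo on a throwaway directory that is deliberately incomplete.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "my_mascot"
    (root / "poses").mkdir(parents=True)
    (root / "anchors.json").write_text("{}")
    problems = check_mascot_dir(root)
print(problems)
```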

TTS — bring your own audio (or plug in a provider)

The core API takes a WAV path. Generate audio however you want — record yourself, use a local TTS like Piper, hit ElevenLabs, etc.

For convenience, optional providers are bundled:

from cartoonimator import render_scene
from cartoonimator.tts import ElevenLabsProvider

provider = ElevenLabsProvider(api_key="...", voice_id="...")
audio_path = provider.synthesize("Hello there.", out_path="hello.mp3")
render_scene(audio_wav=audio_path, ...)

A MOSS-TTS provider stub is included for users running a self-hosted MOSS-TTS server.
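
Because the core API takes a WAV path, it can be worth confirming that a file is a readable WAV before handing it to the renderer. The sketch below uses only Python's standard `wave` module. `wav_info` is an illustrative helper, not a cartoonimator function, and the silent clip exists only to make the example self-contained.

```python
import os
import tempfile
import wave


def wav_info(path) -> dict:
    """Read basic properties of a PCM WAV file; raises wave.Error if it is not one."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_s": frames / rate,
        }


# Demo: write one second of 16-bit mono silence, then inspect it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "silence.wav")
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)       # 2 bytes per sample = 16-bit
        out.setframerate(16000)
        out.writeframes(b"\x00\x00" * 16000)
    info = wav_info(path)
print(info)
```

A file in another format that merely carries a `.wav` extension makes `wave.open` raise `wave.Error`, which surfaces the problem before a render starts.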

Other commands

# Trim a clip
cartoonimator cut --input long.mp4 --start 12.0 --end 27.5 --output clip.mp4

# Mix music under an existing video
cartoonimator mix-music --video clip.mp4 --music score.mp3 --volume 0.15 --output final.mp4

# Generate a fresh mascot pose library from prompts (needs OPENROUTER_API_KEY)
# Requires mascots/yourname/character-bible.md (with a fenced base prompt)
# and mascots/yourname/pose-specs.json (list of {id, prompt, description}).
cartoonimator build-library --char-dir mascots/yourname --workers 5
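
The comments above describe pose-specs.json as a list of {id, prompt, description}. As a rough sketch, a spec list can be checked for that shape before an API-backed run. `validate_pose_specs` is a hypothetical helper, the prompt and description strings are placeholders, and the unique-id rule is an assumption rather than a documented requirement.

```python
import json

REQUIRED_KEYS = {"id", "prompt", "description"}


def validate_pose_specs(specs) -> list[str]:
    """Return a list of problems found in a pose-specs structure (empty if none)."""
    if not isinstance(specs, list):
        return ["top level must be a list"]
    errors = []
    seen_ids = set()
    for i, spec in enumerate(specs):
        if not isinstance(spec, dict):
            errors.append(f"entry {i}: not an object")
            continue
        missing = REQUIRED_KEYS - set(spec)
        if missing:
            errors.append(f"entry {i}: missing keys {sorted(missing)}")
        pose_id = spec.get("id")
        # Assumption: ids should be unique because they name the pose files.
        if pose_id is not None and pose_id in seen_ids:
            errors.append(f"entry {i}: duplicate id {pose_id!r}")
        seen_ids.add(pose_id)
    return errors


# Placeholder content; only the three keys come from the description above.
specs = [
    {"id": "standing_open_hands", "prompt": "placeholder prompt", "description": "placeholder"},
    {"id": "pointing_at_camera", "prompt": "placeholder prompt", "description": "placeholder"},
]
errors = validate_pose_specs(specs)
print(json.dumps(specs, indent=2))
print(errors)
```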

The library generator uses GPT Image 2 via OpenRouter to render each pose against a green-screen background, then keys the green out to produce transparent PNGs ready for cartoonimator tag. See mascots/kai/pose-specs.json for a reference spec format.
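
The keying step can be illustrated on raw RGBA tuples. This is a minimal sketch of one common chroma-key heuristic (treat a pixel as background when green clearly dominates red and blue). It is not the algorithm the library generator uses, and both the `key_out_green` name and the `dominance` threshold are assumptions.

```python
Pixel = tuple[int, int, int, int]  # (red, green, blue, alpha), each 0-255


def key_out_green(pixels: list[Pixel], dominance: int = 60) -> list[Pixel]:
    """Make pixels fully transparent where green exceeds red and blue by `dominance`."""
    keyed = []
    for r, g, b, a in pixels:
        if g - max(r, b) >= dominance:
            keyed.append((r, g, b, 0))   # background: drop alpha to zero
        else:
            keyed.append((r, g, b, a))   # foreground: leave untouched
    return keyed


# One row of sample pixels: pure green, a skin-like tone, and a darker green.
row = [(0, 255, 0, 255), (250, 220, 180, 255), (30, 200, 40, 255)]
keyed_row = key_out_green(row)
print(keyed_row)
```

A production keyer would typically also soften edges and remove green spill. The hard threshold here is only meant to show the idea.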

How it compares

	cartoonimator	Adobe Character Animator	Live2D	Synthesia / HeyGen
Open source	✅	❌	❌	❌
Self-hosted, no API costs	✅	❌ (Adobe sub)	✅	❌ (per-minute)
Code-driven (CI, scriptable)	✅	❌ (GUI)	❌ (GUI)	⚠️ (API only)
Deterministic output	✅	⚠️ (live capture varies)	✅	❌
GPU required	❌	❌	❌	✅ (cloud)
Photorealistic	❌ (cartoon)	❌	❌	✅
Bring your own art	✅	⚠️ (Adobe puppets)	✅	❌ (avatar library)

In one line: a CLI version of Adobe Character Animator's lip-sync, brand-agnostic, MIT-licensed, with no facial-tracking webcam loop.
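
Since rendering is a plain CLI call, batch jobs in CI can be scripted. The sketch below builds one `cartoonimator render` command per audio file, using only the flags shown in the Quickstart. `build_render_commands`, `run_all`, and the file names are hypothetical, and nothing is executed unless `run_all` is called.

```python
import subprocess
from pathlib import Path


def build_render_commands(mascot: str, audio_files: list[str], out_dir: str) -> list[list[str]]:
    """Build one `cartoonimator render` argv list per audio file, in sorted order."""
    commands = []
    for audio in sorted(Path(a) for a in audio_files):
        output = Path(out_dir) / (audio.stem + ".mp4")
        commands.append([
            "cartoonimator", "render",
            "--mascot", mascot,
            "--audio", audio.as_posix(),
            "--output", output.as_posix(),
        ])
    return commands


def run_all(commands: list[list[str]]) -> None:
    """Run each command in turn, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


commands = build_render_commands("mascots/kai", ["audio/intro.wav", "audio/cta.wav"], "renders")
for cmd in commands:
    print(" ".join(cmd))
# run_all(commands)  # uncomment on a machine with cartoonimator installed
```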

What this is not

Not a video editor. Cuts are pose changes every N seconds. No transitions, no effects, no zooms.
Not photorealistic. Cartoon mascots only. The whole point is that the body doesn't move between frames.
Not real-time. It's batch — render once, deliver an MP4.

License

MIT. See LICENSE.

Made by

Built by Connor Gallic alongside KaiCalls (AI voice agent) and MeetKai (AI marketing execution). Kai — the reference mascot — is the shared brand character across both.

