Turn a short list of evocative text prompts into a seamless, SDXL‑powered video—no video editor required.
Creating smooth AI‑art animations usually means juggling multiple notebooks, CLI tools, and video editors. This repo compresses the whole workflow into one Python script:
- Encode each prompt once with Stable Diffusion XL (SDXL).
- Blend their embeddings in latent space so the story flows naturally.
- Render every in‑between frame with a stylistic LoRA adapter.
- Hand the PNG sequence to FFmpeg to publish a ready‑to‑share .mp4.
That’s it—one command, one folder, one video.
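The blending step can be illustrated with a small sketch. It assumes a plain linear blend between neighbouring prompts with frames spread evenly across the sequence; the repo's `generate_interpolated_embeddings` may use a different scheme, and the names `lerp`, `frame_schedule`, and `interpolate` are hypothetical. The arithmetic is duck-typed, so the same functions work on floats (as in the usage note) and on tensor-like objects that support `*` and `+`.

```python
def lerp(a, b, t):
    """Linear blend between a and b; t=0 gives a, t=1 gives b."""
    return (1.0 - t) * a + t * b


def frame_schedule(num_prompts, num_frames):
    """Return one (left_index, right_index, t) triple per frame.

    Frames are spread evenly over the prompt sequence, so the first
    frame sits on the first prompt and the last frame on the last one.
    """
    if num_prompts < 2:
        raise ValueError("need at least two prompts to interpolate")
    if num_frames < 2:
        raise ValueError("need at least two frames")
    segments = num_prompts - 1
    schedule = []
    for frame in range(num_frames):
        position = frame * segments / (num_frames - 1)  # ranges over 0..segments
        left = min(int(position), segments - 1)         # clamp the final frame
        schedule.append((left, left + 1, position - left))
    return schedule


def interpolate(embeddings, num_frames):
    """Blend a list of embeddings into num_frames in-between values."""
    return [
        lerp(embeddings[i], embeddings[j], t)
        for i, j, t in frame_schedule(len(embeddings), num_frames)
    ]
```

For example, `interpolate([0.0, 10.0, 20.0], 5)` walks through the three "prompts" in five evenly spaced steps. With real embeddings, each list element would be a prompt-embedding tensor instead of a float.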
flowchart TD
A[main.py] --> B["interpolation.py<br/>(generate_interpolated_embeddings)"]
B --> C["utils.py<br/>(generate_images)"]
C --> D["video.py<br/>(create_video)"]
| Stage | Key Function / File | Responsibility |
|---|---|---|
| Prompt → Embedding | pipe.encode_prompt in main.py | Vectorises each prompt & negative prompt once. |
| Interpolation | generate_interpolated_embeddings in interpolation.py | Creates a smooth path between prompt embeddings so motion feels organic. |
| Frame Synthesis | generate_images in utils.py | Renders each latent pair to a 960 × 544 PNG with LoRA style applied. |
| Video Assembly | create_video in video.py | Calls FFmpeg to concat the PNGs into H.264 / yuv420p video. |
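As a rough illustration of the video assembly stage, the sketch below builds an FFmpeg command that encodes a numbered PNG sequence to H.264 with the yuv420p pixel format. The function names, the `frame_%04d.png` naming pattern, and the default of one frame per second are assumptions for illustration; the actual `create_video` in video.py may differ.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(frame_dir, output_path, fps=1, pattern="frame_%04d.png"):
    """Return the argument list for encoding a PNG sequence to H.264.

    The frame naming pattern is hypothetical; adjust it to match the
    files the image generation loop actually writes.
    """
    return [
        "ffmpeg",
        "-y",                                   # overwrite an existing output file
        "-framerate", str(fps),                 # input rate of the image sequence
        "-i", str(Path(frame_dir) / pattern),   # numbered PNG input
        "-c:v", "libx264",                      # H.264 encoder
        "-pix_fmt", "yuv420p",                  # widely compatible pixel format
        str(output_path),
    ]


def render_video(frame_dir, output_path, fps=1):
    """Run FFmpeg and raise CalledProcessError if encoding fails."""
    subprocess.run(build_ffmpeg_command(frame_dir, output_path, fps), check=True)
```

Keeping the command builder separate from the `subprocess.run` call makes it possible to inspect or log the exact arguments without FFmpeg installed.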
.
├── main.py # orchestration entry‑point
├── interpolation.py # latent‑space blending helpers
├── utils.py # image generation loop
├── video.py # FFmpeg wrapper
└── README.md # you’re here
- Python ≥ 3.10
- FFmpeg on your PATH
- A GPU is optional but highly recommended (CUDA or Apple M‑series).
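The first two requirements can be checked before a long render starts. This is a minimal sketch using only the standard library; the `preflight` helper is hypothetical and not part of the repo.

```python
import shutil
import sys


def preflight(min_python=(3, 10)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted} or newer is required")
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(f"preflight: {problem}")
```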
# 1. Clone and enter
$ git clone https://github.com/your‑handle/lora‑interp‑video.git
$ cd lora‑interp‑video
# 2. Create env & install deps
$ python -m venv .venv && source .venv/bin/activate
$ pip install torch diffusers transformers accelerate tqdm
# 3. Fetch or copy your SDXL‑compatible LoRA weights
$ mkdir -p lora && cp /path/to/aidmaMJ6.1SDXL-v0.5.safetensors lora/
# 4. Run
$ python main.py
The script warms the pipeline, generates one PNG per second of video, and writes output/video.mp4.
All tunables live at the top of main.py:
| Variable | Meaning | Typical tweak |
|---|---|---|
| lora_weight_name | Which LoRA file to load from lora/ | Swap styles by filename |
| prompts | List[str] | Craft your narrative here |
| total_duration | Seconds = frames | Longer video → more frames |
| guidance_scale | CFG strength in utils.generate_images | 5.0–8.0 for vivid output |
| height,width | Output resolution | Keep within GPU VRAM |
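A settings block along these lines might sit at the top of main.py. The variable names come from the table; the prompt strings, duration, and guidance value are placeholders, the LoRA filename is the one used in the install step, and `validate_settings` is a hypothetical helper. The divisible-by-8 check reflects a constraint that Stable Diffusion pipelines place on image dimensions.

```python
# Placeholder values for illustration; edit to taste.
lora_weight_name = "aidmaMJ6.1SDXL-v0.5.safetensors"
prompts = [
    "a misty forest at dawn",   # hypothetical prompt
    "the same forest at dusk",  # hypothetical prompt
]
total_duration = 10             # seconds, which here also means frames
guidance_scale = 7.0            # within the suggested 5.0-8.0 range
height, width = 544, 960


def validate_settings(prompts, total_duration, height, width):
    """Return a list of configuration problems; empty means the values look sane."""
    errors = []
    if len(prompts) < 2:
        errors.append("need at least two prompts to interpolate between")
    if total_duration < 1:
        errors.append("total_duration must be at least 1 second")
    if height % 8 or width % 8:
        errors.append("height and width must be divisible by 8")
    return errors
```

Calling `validate_settings(prompts, total_duration, height, width)` before loading the pipeline surfaces typos in seconds rather than after the model has been loaded.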
| Symptom | Cause | Fix |
|---|---|---|
| CUDA out‑of‑memory | Resolution or batch too big | Lower height/width or use CPU |
| Pink/gray images | Incorrect LoRA base model | Use SDXL‑trained LoRA only |
| FFmpeg not found | PATH mis‑configured | brew install ffmpeg / apt install ffmpeg |
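For the out-of-memory case, one way to lower the resolution while keeping the aspect ratio roughly intact is sketched below. The `scaled_resolution` helper is hypothetical; it rounds each side down to a multiple of 8, which Stable Diffusion pipelines require.

```python
def scaled_resolution(width, height, factor):
    """Scale both sides by factor and round each down to a multiple of 8."""
    if not 0 < factor <= 1:
        raise ValueError("factor must be in the range (0, 1]")

    def snap(value):
        # Never go below 8 pixels, the smallest valid side length.
        return max(8, int(round(value * factor)) // 8 * 8)

    return snap(width), snap(height)
```

Starting from the default 960 × 544, `scaled_resolution(960, 544, 0.75)` gives 720 × 408 and a factor of 0.5 gives 480 × 272.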
- 🎞️ Variable FPS & cross‑fade support
- 🎨 Per‑frame positive/negative prompt mixing
- 🧩 CLI flags instead of hard‑coded vars
MIT—see LICENSE for details. LoRA weights retain their respective licenses.
Questions or integration requests? basalelr@gmail.com