Basalel5Mill/SDXL-interpolation

Interpolation‑Driven LoRA Animation

Turn a short list of evocative text prompts into a seamless, SDXL‑powered video—no video editor required.


1 · Why this exists

Creating smooth AI‑art animations usually means juggling multiple notebooks, CLI tools, and video editors. This repo compresses the whole workflow into one Python script:

  1. Encode each prompt once with Stable Diffusion XL (SDXL).
  2. Interpolate between consecutive prompt embeddings so each scene flows naturally into the next.
  3. Render every in‑between frame with a stylistic LoRA adapter.
  4. Hand the PNG sequence to FFmpeg to publish a ready‑to‑share .mp4.

That’s it—one command, one folder, one video.


2 · How it works

flowchart TD
    A[main.py] --> B["interpolation.py<br/>(generate_interpolated_embeddings)"]
    B --> C["utils.py<br/>(generate_images)"]
    C --> D["video.py<br/>(create_video)"]

| Stage | Key function / file | Responsibility |
| --- | --- | --- |
| Prompt → Embedding | pipe.encode_prompt in main.py | Encodes each prompt and negative prompt once. |
| Interpolation | generate_interpolated_embeddings in interpolation.py | Creates a smooth path between prompt embeddings so motion feels organic. |
| Frame Synthesis | generate_images in utils.py | Renders each interpolated embedding to a 960 × 544 PNG with the LoRA style applied. |
| Video Assembly | create_video in video.py | Calls FFmpeg to concatenate the PNGs into an H.264 / yuv420p video. |
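
The interpolation stage can be illustrated with a minimal sketch. It assumes plain linear blending between consecutive prompt embeddings and uses Python lists in place of tensors. The names `lerp` and `interpolate_path` are hypothetical, and the repo's `generate_interpolated_embeddings` may well use a different scheme (for example spherical interpolation).

```python
from typing import List, Sequence

Vector = List[float]


def lerp(a: Sequence[float], b: Sequence[float], t: float) -> Vector:
    """Linear blend of two vectors: t=0 returns a, t=1 returns b."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def interpolate_path(keyframes: Sequence[Sequence[float]], num_frames: int) -> List[Vector]:
    """Spread num_frames evenly along the piecewise-linear path through keyframes."""
    if len(keyframes) < 2:
        raise ValueError("need at least two keyframes")
    if num_frames < 2:
        raise ValueError("need at least two frames")
    segments = len(keyframes) - 1
    frames: List[Vector] = []
    for i in range(num_frames):
        pos = i / (num_frames - 1) * segments  # position along the path, in [0, segments]
        seg = min(int(pos), segments - 1)      # clamp so the last frame stays in the final segment
        t = pos - seg                          # local blend factor within that segment
        frames.append(lerp(keyframes[seg], keyframes[seg + 1], t))
    return frames


# Three toy "prompt embeddings" and five frames (illustrative values only).
path = interpolate_path([[0.0, 0.0], [1.0, 2.0], [2.0, 0.0]], num_frames=5)
```

With this layout the first and last frames land exactly on the first and last prompts, and intermediate prompts are hit whenever the frame count lines up with a segment boundary.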

3 · Repository layout

.
├── main.py              # orchestration entry‑point
├── interpolation.py     # prompt‑embedding blending helpers
├── utils.py             # image generation loop
├── video.py             # FFmpeg wrapper
└── README.md            # you’re here

4 · Quick start

Prerequisites

  • Python ≥ 3.10
  • FFmpeg on your PATH
  • A GPU is optional but highly recommended (CUDA or Apple M‑series).

# 1. Clone and enter
$ git clone https://github.com/your-handle/lora-interp-video.git
$ cd lora-interp-video

# 2. Create env & install deps
$ python -m venv .venv && source .venv/bin/activate
$ pip install torch diffusers transformers accelerate tqdm

# 3. Fetch or copy your SDXL‑compatible LoRA weights
$ mkdir -p lora && cp /path/to/aidmaMJ6.1SDXL-v0.5.safetensors lora/

# 4. Run
$ python main.py

The script warms the pipeline, generates one PNG per second of video, and writes output/video.mp4.
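
For orientation, the video assembly step could look roughly like the sketch below. It only builds the FFmpeg argument list. The helper name `build_ffmpeg_command`, the `frames` directory, and the `frame_%04d.png` naming pattern are assumptions for illustration, not the repo's actual `create_video` code. The default of 1 frame per second mirrors the one-PNG-per-second behaviour described above.

```python
from typing import List


def build_ffmpeg_command(
    frames_dir: str,
    output_path: str,
    fps: int = 1,
    pattern: str = "frame_%04d.png",  # assumed frame naming scheme
) -> List[str]:
    """Return an FFmpeg argv that turns a numbered PNG sequence into an H.264 video."""
    return [
        "ffmpeg",
        "-y",                              # overwrite the output file if it exists
        "-framerate", str(fps),            # input frame rate of the image sequence
        "-i", f"{frames_dir}/{pattern}",   # numbered PNG input
        "-c:v", "libx264",                 # H.264 encoder
        "-pix_fmt", "yuv420p",             # widely compatible pixel format
        output_path,
    ]


cmd = build_ffmpeg_command("frames", "output/video.mp4")
# To actually encode, pass the list to subprocess.run(cmd, check=True).
```

Note that yuv420p with libx264 needs even width and height, which the default 960 × 544 resolution satisfies.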


5 · Configuration

All tunables live at the top of main.py:


| Variable | Meaning | Typical tweak |
| --- | --- | --- |
| lora_weight_name | Which LoRA file to load from lora/ | Swap styles by filename |
| prompts | List of prompt strings (List[str]) | Craft your narrative here |
| total_duration | Video length in seconds; one frame is rendered per second | Longer video → more frames |
| guidance_scale | CFG strength in utils.generate_images | 5.0–8.0 for vivid output |
| height, width | Output resolution | Keep within GPU VRAM |
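
As a rough illustration, the configuration block might look like the sketch below. The prompt texts and the `total_duration` value are made up, the LoRA filename is the one from the quick start, and `validate_config` is a hypothetical helper that is not part of the repo. The divisible-by-8 check reflects a constraint the diffusers SDXL pipeline enforces on `height` and `width`; the duration check is an assumption of this sketch.

```python
from typing import List, Sequence

# Tunables, named as in the table above (values are illustrative).
lora_weight_name = "aidmaMJ6.1SDXL-v0.5.safetensors"
prompts = [
    "a misty forest at dawn",       # hypothetical prompt
    "the same forest at dusk",      # hypothetical prompt
]
total_duration = 10                 # seconds, and therefore frames
guidance_scale = 7.0                # within the suggested 5.0-8.0 range
height, width = 544, 960


def validate_config(
    prompts: Sequence[str], total_duration: int, height: int, width: int
) -> List[str]:
    """Return a list of human-readable problems; an empty list means the config looks sane."""
    problems: List[str] = []
    if len(prompts) < 2:
        problems.append("need at least two prompts to interpolate between")
    if total_duration < len(prompts):
        # Assumption: every prompt should get at least one frame.
        problems.append("total_duration should be at least the number of prompts")
    for name, value in (("height", height), ("width", width)):
        if value % 8 != 0:
            problems.append(f"{name} must be divisible by 8")
    return problems


problems = validate_config(prompts, total_duration, height, width)
```

Running a check like this before loading the pipeline surfaces a bad resolution in milliseconds rather than after the model has been loaded.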

6 · Troubleshooting


| Symptom | Cause | Fix |
| --- | --- | --- |
| CUDA out‑of‑memory | Resolution or batch too big | Lower height/width or use CPU |
| Pink/gray images | Incorrect LoRA base model | Use SDXL‑trained LoRA only |
| FFmpeg not found | PATH mis‑configured | brew install ffmpeg / apt install ffmpeg |
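
To catch the "FFmpeg not found" case before any frames are rendered, a small pre-flight check along these lines could be run at startup. This is a sketch; `find_ffmpeg` is a hypothetical helper, not something the repo is known to provide.

```python
import shutil


def find_ffmpeg(executable: str = "ffmpeg") -> str:
    """Return the full path of the FFmpeg binary, or raise if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        raise RuntimeError(
            f"{executable!r} was not found on PATH; install it "
            "(e.g. brew install ffmpeg or apt install ffmpeg) and retry."
        )
    return path
```

Calling `find_ffmpeg()` at the top of the script fails fast with a clear message instead of erroring only after the whole PNG sequence has been generated.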

7 · Roadmap

  • 🎞️ Variable FPS & cross‑fade support
  • 🎨 Per‑frame positive/negative prompt mixing
  • 🧩 CLI flags instead of hard‑coded vars

8 · License

MIT—see LICENSE for details. LoRA weights retain their respective licenses.


9 · Contact

Questions or integration requests? basalelr@gmail.com
