Image style transfer application using Stable Diffusion XL with ControlNet conditioning. Runs fully locally - no external API dependencies.
- Stable Diffusion XL - image generation backbone (SDXL 1.0 Base, JuggernautXL, YamerMIX)
- ControlNet - structural conditioning via Canny edge detection and depth estimation (a preprocessing sketch follows this list)
- Gradio - web-based GUI
- PyTorch - deep learning framework
- Docker - containerization (experimental)
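For orientation, here is roughly what the two conditioning inputs look like in code. This is a minimal sketch assuming OpenCV for the Canny edges and a Hugging Face depth-estimation pipeline; the thresholds, file name, and `Intel/dpt-hybrid-midas` model are illustrative assumptions, not necessarily what this project uses.

```python
import cv2
import numpy as np
from PIL import Image
from transformers import pipeline

src = Image.open("input.jpg").convert("RGB")

# Canny condition image: single-channel edge map replicated to 3 channels
edges = cv2.Canny(np.array(src), 100, 200)  # thresholds are illustrative
canny_image = Image.fromarray(np.stack([edges] * 3, axis=-1))

# Depth condition image from a monocular depth estimator (assumed model)
depth_estimator = pipeline("depth-estimation", model="Intel/dpt-hybrid-midas")
depth_image = depth_estimator(src)["depth"].convert("RGB")
```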
SimpleGen's main features:
- Apply artistic styles (Watercolor, Film Noir, Cyberpunk, Anime, Pixelart, and more) to any input image
- 3 model checkpoints with different generation characteristics
- Dual ControlNet support (Canny + Depth) for fine-grained structural control (see the pipeline sketch after this list)
- Configurable generation parameters (sampling method, steps, CFG scale, resolution, seed)
- Self-contained - all inference runs on your local GPU, no cloud APIs needed
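Dual conditioning like this maps directly onto the diffusers multi-ControlNet API. The sketch below is a minimal illustration that loads the two ControlNets from the Hugging Face Hub (the app itself loads local checkpoints from assets/); the prompt, scales, and condition-image files are placeholders.

```python
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline

# Condition images prepared as in the preprocessing sketch above
canny_image = Image.open("canny.png")
depth_image = Image.open("depth.png")

controlnets = [
    ControlNetModel.from_pretrained("diffusers/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16),
    ControlNetModel.from_pretrained("diffusers/controlnet-depth-sdxl-1.0", torch_dtype=torch.float16),
]
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnets,
    torch_dtype=torch.float16,
).to("cuda")

result = pipe(
    "a watercolor painting of the scene",      # placeholder style prompt
    image=[canny_image, depth_image],          # one condition image per ControlNet
    controlnet_conditioning_scale=[0.7, 0.5],  # illustrative per-ControlNet weights
    num_inference_steps=30,
    guidance_scale=7.0,
).images[0]
result.save("output.png")
```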
Known limitations:
- GPU required - CPU-only inference is not supported (the pipelines run in float16 precision, and generation would take hours on a CPU regardless)
- ControlNet style transfer works best when the output aspect ratio is similar to the input's; otherwise cropping is applied
- Requires 10-14GB of VRAM for dual ControlNet inference at the default 1024x1024 resolution (see the memory-saving sketch below)
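If your card sits near the lower end of that range, diffusers ships standard memory savers that trade generation speed for VRAM; whether SimpleGen exposes them is not claimed here. Applied to the `pipe` object from the sketch above:

```python
# Optional VRAM reducers (CPU offloading requires the accelerate package).
# Note: enable_model_cpu_offload() should be used instead of .to("cuda").
pipe.enable_model_cpu_offload()  # keeps idle submodules on the CPU between steps
pipe.enable_vae_tiling()         # decodes the VAE in tiles, useful at 1024x1024+
```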
Planned improvements:
- Per-model ControlNet conditioning scales, separate condition images, multi-image conditioning
- ControlNet scheduling, e.g. apply conditioning for the first 80% of steps, then pure diffusion for refinement (sketched after this list)
- Additional ControlNet types (OpenPose, etc.) and more Canny preprocessor options
- Generation metadata embedded in output images for reproducibility
- Automatic aspect ratio handling for conditioning images (padding/stretching instead of cropping)
- Avatar generation via IP-Adapter, Instant-ID, or LoRA fine-tuning
- Video conditioning using Live Portrait
- Post-processing pipelines: upscaling, detailing, and refining
- Production architecture: separate GUI, API gateway, and pipeline hosting for independent scaling
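The scheduling idea above already has a counterpart in the diffusers call signature (`control_guidance_start` / `control_guidance_end`). A sketch reusing `pipe` and the condition images from earlier, with 0.8 mirroring the 80% example:

```python
# Apply both ControlNets only for the first 80% of the denoising steps,
# then let the model refine freely for the remaining 20%
result = pipe(
    "a watercolor painting of the scene",
    image=[canny_image, depth_image],
    control_guidance_start=0.0,
    control_guidance_end=0.8,
    num_inference_steps=30,
).images[0]
```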
Style transfer examples are grouped by checkpoint - all outputs can be found in Examples/. (Example image grids omitted here; styles shown per checkpoint:)
- Checkpoint 1: Pixelart, Steampunk, Lowpoly, Watercolor, Fantasy, Horror, Lowpoly
- Checkpoint 2: Pixelart, Fantasy, Film Noir, Long Exposure, Minecraft, Watercolor, Cyberpunk, Anime, Horror, Steampunk
- Checkpoint 3: Pixelart, Minecraft, Horror
Additional images and other GenAI projects I have contributed to can be found in my Google Drive Portfolio if you would like to see more of my work.
- NVIDIA GPU with at least 12GB of VRAM
- NVIDIA CUDA installed
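A quick way to verify both requirements with PyTorch (which the project depends on anyway); this snippet is a sketch, not part of the repo:

```python
import torch

# Fails fast if no CUDA-capable GPU is visible to PyTorch
assert torch.cuda.is_available(), "No CUDA-capable GPU detected"
props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")  # want >= 12 GB
```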
All inference runs locally - model weights (~15GB total) need to be downloaded before first use.
A Dockerfile is included but not fully supported. The image is ~33.7GB and requires the NVIDIA Container Toolkit with CUDA. Canny-based style transfer works in the container, but depth estimation has unresolved issues in the containerized environment.
- Download the codebase:

  ```bash
  git clone https://github.com/Lorakszak/img_gen_project
  ```
- Change into the root directory of the project:

  ```bash
  cd img_gen_project/
  ```
- Create a Python 3.10 environment using either conda or venv (make sure it is Python 3.10):
  - Using conda:

    ```bash
    conda create --name=heavy_env python=3.10
    conda activate heavy_env
    # Make sure you have the correct environment active
    which python && which pip
    ```

  - Using venv (on Linux/macOS):

    ```bash
    python3.10 -m venv myenv
    source myenv/bin/activate
    # Make sure you have the correct environment active
    which python && which pip
    ```

  - Using venv (on Windows):

    ```bash
    python3.10 -m venv myenv
    myenv\Scripts\activate
    # Make sure you have the correct environment active (`where` is the Windows equivalent of `which`)
    where python && where pip
    ```
- Install all the necessary dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Create the expected directory structure:

  ```bash
  mkdir -p ./assets/SDXL_base_1_0
  mkdir -p ./assets/Juggernaut_XL
  mkdir -p ./assets/YamerMIX
  mkdir -p ./assets/Controlnets/canny_XL
  mkdir -p ./assets/Controlnets/depth_XL
  ```
- Download the models (~15GB total, from Hugging Face and Civitai):
  - SDXL base 1.0
  - JuggernautXL Rundiffusionphoto2
  - YamerMIX
  - ControlNet Canny
  - ControlNet Depth

  Keep the URLs quoted so the shell does not interpret the `&` characters in the query strings:

  ```bash
  curl -L -o ./assets/SDXL_base_1_0/model.safetensors "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors?download=true" && \
  curl -L -o ./assets/Juggernaut_XL/juggernautXL_v9Rundiffusionphoto2.safetensors "https://civitai.com/api/download/models/348913?type=Model&format=SafeTensor&size=full&fp=fp16" && \
  curl -L -o ./assets/YamerMIX/sdxlUnstableDiffusers_nihilmania.safetensors "https://civitai.com/api/download/models/395107?type=Model&format=SafeTensor&size=pruned&fp=fp16" && \
  curl -L -o ./assets/Controlnets/canny_XL/diffusion_pytorch_model.fp16.safetensors "https://huggingface.co/diffusers/controlnet-canny-sdxl-1.0/resolve/main/diffusion_pytorch_model.fp16.safetensors?download=true" && \
  curl -L -o ./assets/Controlnets/depth_XL/diffusion_pytorch_model.fp16.safetensors "https://huggingface.co/diffusers/controlnet-depth-sdxl-1.0/resolve/main/diffusion_pytorch_model.fp16.safetensors?download=true"
  ```
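  Once the downloads finish, a quick sanity check (a hypothetical helper, not part of the repo) confirms every weight file landed where the app expects it:

  ```python
  from pathlib import Path

  expected = [
      "assets/SDXL_base_1_0/model.safetensors",
      "assets/Juggernaut_XL/juggernautXL_v9Rundiffusionphoto2.safetensors",
      "assets/YamerMIX/sdxlUnstableDiffusers_nihilmania.safetensors",
      "assets/Controlnets/canny_XL/diffusion_pytorch_model.fp16.safetensors",
      "assets/Controlnets/depth_XL/diffusion_pytorch_model.fp16.safetensors",
  ]
  for path in map(Path, expected):
      status = f"{path.stat().st_size / 1024**3:.1f} GB" if path.exists() else "MISSING"
      print(f"{path}: {status}")
  ```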
- Run the app:

  ```bash
  python main.py
  ```
- From the same device, open the app in your browser at:

  http://127.0.0.1:7860
(+) For verbose logging, set the DEBUG environment variable:

```bash
DEBUG=true python main.py
```

The app will then be accessible at http://127.0.0.1:7860.
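For reference, a flag like DEBUG is typically wired to Python's logging level roughly as below; this is a hypothetical sketch, not the app's actual code:

```python
import logging
import os

# Hypothetical: DEBUG=true enables verbose logging, anything else stays at INFO
level = logging.DEBUG if os.environ.get("DEBUG", "").lower() == "true" else logging.INFO
logging.basicConfig(level=level)
```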