Caelum「澄空」

English | 中文

Caelum「澄空」

We share the same sky — I just want it to be clear.

Inspired by Porter Robinson — "Look At The Sky"

What is Caelum?

Anime illustrations shared across the internet typically go through multiple rounds of resizing and JPEG/WebP compression, ending up as a blurry, lossy, artifact-ridden version of the original.

Caelum's goal is to restore them as faithfully as possible.

This is a ×4 super-resolution reconstruction network specifically targeting real-world internet image degradation (anime illustrations distributed on platforms like Pixiv, X, Facebook, etc.). It is not a diffusion model and it is not "generating" — it is attempting to restore the image as close as possible to its original form.

🔍 How Does Caelum Compare?

Caelum focuses on a specific, underserved problem: restoring anime illustrations degraded by real-world internet multi-platform re-upload chains. Unlike general-purpose upscalers, it is trained exclusively on a simulated pipeline that matches how images actually deteriorate across Pixiv, Twitter/X, Facebook, Discord, and screenshot cycles.

	Caelum	waifu2x	Real-ESRGAN	RealCUGAN	Anime4K
Target content	Anime illustration	Anime / Art	General / Anime	Anime	Anime (video)
Degradation model	Multi-platform internet re-upload chain	Noise + blur	Real-world general	Noise + compression	—
Architecture	PPBUNet (U-Net + HAT Attention + Mamba)	SwinUNet / CNN	RRDB	U-Net	GLSL shader
Output scale	×4	×1/×2/×4	×2/×4	×2/×3/×4	Variable
JPEG/WebP artifact removal	✅ Multi-stage	✅	✅	✅	—
Inference backend	DirectML (ONNX)	NCNN / Vulkan	NCNN / CUDA	NCNN	OpenCL / Vulkan
Windows GUI	✅ Native	Partial	Partial	Partial	✅
Free	✅	✅	✅	✅	✅

⚠️ Caelum is in active training. Quantitative comparisons (PSNR/SSIM/LPIPS) will be published with the first stable Release.

✨ Features

🎯 Scenario-Focused — Specifically simulates the real degradation pipeline of modern social/image platforms (resize + JPEG/WebP compression), not generic degradation
🏗️ PPBUNet — Palette-Painter-Brush U-Net for Anime Super-Resolution
⚡ ×4 Upscaling — Dedicated 4× super-resolution reconstruction
🪟 Windows GUI — Ready-to-use .exe application — no Python or command line needed
🔓 Free & Open Source — Free forever; source code available under AGPL-3.0 / CC BY-NC-SA 4.0

🖼️ Results

⚠️ Early checkpoint — The network is still in training; current results are from an early checkpoint and do not represent the final quality.

Degraded Input	waifu2x SwinUNet noise2 ×4	Caelum PPBUNet CAR2 ×4	Ground Truth

📝 Comparison images were generated by Google Gemini, featuring Reimu Hakurei from Touhou Project (ZUN). Touhou Project permits non-commercial fan works; this project is a non-commercial open-source project with no copyright concerns.

🚀 Quick Start

Download & Use

Releases are split into two separate packages — download both from Releases:

Package	Contents
`Caelum_vX.Y.Z.zip`	GUI application
`Models_YYYYMMDD.zip`	Model weights (date-versioned; only changed models are included per release)

After extracting both archives, place the Models/ folder into the application directory:

Caelum/        ← app directory
└── Models/    ← extracted Models folder goes here

Then run Caelum.exe.

System Requirements

OS: Windows 10 version 2004 (build 19041) or later / Windows 11
Architecture: x64 or ARM64
.NET Runtime: .NET 10 Desktop Runtime
GPU (optional): Any DirectX 12 compatible GPU (NVIDIA / AMD / Intel)
- If no DX12 GPU is available, inference will automatically fall back to CPU (slower)

Adding a New Language

The app uses a JSON sidecar localization system. No recompilation is needed — just add a file.

Steps:

Go to the Langs/ folder next to the executable.
Create a new JSON file named with the BCP-47 culture code (e.g. fr-FR.json for French, ko-KR.json for Korean).

Copy the contents of en-US.json as a template and translate all values in the "Strings" section. Set Metadata.DisplayName to the language's own native name (this appears in the Settings menu):

{
  "Metadata": {
    "Culture": "fr-FR",
    "DisplayName": "Français",
    "Author": "Your Name"
  },
  "Strings": {
    "App.Title": "Caelum",
    "Menu.Settings": "Paramètres",
    "..."
  }
}

Restart the app. The new language will appear automatically under Settings → Language.

Note: Key names must not be changed. Only translate the values on the right side of each "Key": "Value" pair. Missing keys will fall back to displaying [KeyName] as a placeholder.

🏗️ Architecture

Current version: PPBUNet v1.2

A 2-level U-Net pyramid with a parallel geometry bypass, operating in three stages: Palette extracts global color prototypes → Painter reconstructs global structure → Brush performs geometry refinement and upsampling.

flowchart TD
    In(["Input Image"]) --> SE["Shallow Feature Extraction"]

    SE -->|"parallel bypass"| GB["Geometry Bypass<br/>Directional · Full-resolution"]
    SE --> L0

    subgraph Encoder["Encoder — 2-Level Pyramid"]
        L0["Level 0 · 64ch"] --> L1["Level 1 · 128ch"]
    end

    L1 --> FR

    subgraph BN["Bottleneck — Frequency Decoupled"]
        FR["Frequency Router<br/>DC / AC split"]
        FR -->|"DC"| PAL["Palette<br/>Color prototype extraction"]
        FR -->|"AC"| PAI["Painter<br/>Global structure reconstruction"]
    end

    L1 -.->|"skip"| SCR1["Skip Refinement L1<br/>Filtered · Aligned"]
    PAI --> SCR1
    SCR1 --> DL1["Decoder Level 1<br/>128ch · Attention"]
    DL1 --> R1["Residual L1"]

    L0 -.->|"skip"| SCR0["Skip Refinement L0<br/>Filtered · Aligned"]
    R1 --> SCR0
    SCR0 --> DL0["Decoder Level 0<br/>64ch · Attention"]
    DL0 --> R0["Residual L0"]

    subgraph Brush["Brush Stage"]
        GR["Geometry Refinement<br/>Deformable edge & curve correction"]
        GR --> CM["Color Modulation<br/>Palette-guided broadcast"]
    end

    GB -->|"geometry prior"| GR
    R0 --> GR
    PAL -->|"palette"| CM
    CM --> FUSE
    SE -->|"latent residual"| FUSE["Feature Fusion"]
    FUSE --> UP["Adaptive ×4 Upsampler<br/>Coordinate-aware · Phase-zero residual"]
    UP --> Out(["Output Image · 4H × 4W"])

classDef io fill:#4A90D9,stroke:#2C5F8A,color:#fff
classDef se fill:#5D6D7E,stroke:#2E4057,color:#fff
classDef bypass fill:#8E44AD,stroke:#5B2C6F,color:#fff
classDef enc fill:#2980B9,stroke:#1A4F7A,color:#fff
classDef btn fill:#E67E22,stroke:#9A5C12,color:#fff
classDef pal fill:#D4AC0D,stroke:#8B6E0A,color:#fff
classDef skip fill:#16A085,stroke:#0B6B5A,color:#fff
classDef dec fill:#2471A3,stroke:#154360,color:#fff
classDef brush fill:#C0392B,stroke:#7B241C,color:#fff
classDef up fill:#1ABC9C,stroke:#0E7560,color:#fff

class In,Out io
class SE se
class GB bypass
class L0,L1 enc
class FR,PAI btn
class PAL pal
class SCR0,SCR1 skip
class DL1,DL0,R1,R0 dec
class GR,CM brush
class FUSE,UP up

Architecture Stages

Stage	Function
Shallow Feature Extraction	Converts input image to a shared latent space; provides a residual bypass to the upsampler
Geometry Bypass	Parallel full-resolution branch capturing directional high-frequency features before any downsampling
Encoder	2-level pyramid that progressively aggregates multi-scale spatial context
Frequency Router	Explicitly splits the latent into low-freq DC (color / flat regions) and high-freq AC (edges / lines) streams
Palette	Extracts global color prototypes from the DC stream; conditions the Brush stage on a stable color prior
Painter	Reconstructs global high-frequency topology from the AC stream; tracks long-range structural continuity
Skip Refinement	Filters and aligns encoder features before decoder merge, suppressing compressed-artifact propagation
Decoder	2-level attention-based decoder restoring spatial detail from bottleneck representations
Geometry Refinement	Deformable correction guided by the geometry bypass; recovers sharp edges, fine strokes, and curves
Color Modulation	Broadcasts palette prototypes across the feature map to enforce global color fidelity
Adaptive Upsampler	Coordinate-aware ×4 upsampling with phase-zero high-frequency residuals for alias-free reconstruction

For full design rationale and module specifications, see Caelum/model/PPBUnet_v1/ARCHITECTURE.md.

Degradation Pipeline

When anime illustrations circulate online, they don't undergo a single compression — they go through a full multi-platform re-upload chain. The training data pipeline (dataset.py) simulates this chain online using 5 degradation modes, generating (LR, HR) pairs in real time each batch without pre-storing degraded images.

Degradation Modes

Mode	Name	Scenario	Sample Rate
0	Pure Bicubic	Mathematical downsampling only (validation set)	—
1	Pre-blur + Bicubic	Simulates anti-aliased upload	10%
2	Light Compression	Bicubic ↓4× + random 1–3× JPEG/WebP	30%
3	Moderate Degradation	3-stage high-order degradation lv2 (social platform re-upload)	35%
4	Heavy Degradation	3-stage high-order degradation lv3 (deep compression artifacts)	25%

Training uses CaelumMixedDataset, where each image undergoes a different degradation mode each epoch, providing massive equivalent data augmentation.

3-Stage High-Order Degradation Chain (Mode 3/4)

HR Original
  │
  ▼ Stage 1 — Creator Upload
  ├─ 50% chance pre-blur (Gaussian r=2 / r=1+Box)
  ├─ Random rescale (50%→0.5× · 25%→1.0× · 25%→uniform sample)
  └─ JPEG/WebP compression (q = 75–95)  [JPEG 70% / WebP 30%]
  │
  ▼ Stage 2 — Platform Re-upload
  ├─ 30% chance Sinc ringing (Hamming window, lv2: ω∈[2π/3,π] / lv3: ω∈[π/3,2π/3])
  ├─ 50% chance secondary blur
  ├─ Bilinear/Bicubic random rescale → target LR size
  ├─ DCT grid shift 1–7px (breaks quantization grid alignment, produces realistic overlapping block artifacts)
  └─ JPEG/WebP compression (q = 50–80)
  │
  ▼ Stage 3 — End-user Retrieval
  ├─ lv2: 25% / lv3: 50% chance screenshot upscale + recompress
  ├─ DCT grid shift + final compression (lv2: q=40–75 / lv3: q=10–40)
  └─ Restore coordinate alignment
  │
  ▼ LR Output

Key Technical Details

Technique	Implementation	Purpose
JPEG/WebP Piecewise Linear Mapping	`jpeg_quality_to_webp()`	WebP is far more efficient at low quality than JPEG; equal perceptual strength requires differentiated mapping
DCT Grid Shift	`break_dct_grid()` cyclic shift 1–7px	Two compression passes produce non-overlapping block boundaries, creating realistic multi-compression block artifacts
Sinc Ringing	Hamming-windowed sinc + À Trous multi-scale	Simulates ringing overshoot (Gibbs phenomenon) introduced by downsampling/resampling
Mixed Interpolation	50% Bicubic / 50% Bilinear	Covers scaling algorithm differences across platforms
Geometric Augmentation	Horizontal flip × Vertical flip × 90° rotation	8 combinations, 8× effective data expansion
In-Memory Compression	`io.BytesIO` in-memory encode/decode	Full DCT encode/decode ensures degradation authenticity with no filesystem overhead

Loss Function Design

CaelumLossV2 coordinates 13 sub-losses spanning pixel, color, frequency, spatial, perceptual, and adversarial dimensions, using a two-phase progressive activation strategy.

Two-Phase Progressive Strategy

Training Progress
0%──────────────30%──────────────────────────100%
│      Phase 1        │          Phase 2           │
│  Pixel + Color Anchor│  + High-freq + Perceptual + Adversarial │
└─────────────────────┴────────────────────────────┘

Phase 1 lets the network converge to the correct color and pixel distribution; Phase 2 then introduces stronger constraints to refine edges, frequency content, and semantic detail, avoiding early-stage gradient oscillation.

Sub-Loss Overview

Phase 1 (active throughout)

Loss	Weight	Purpose
`L1`	1.0	Pixel-level absolute error baseline
`AdaptiveDCAnchorLoss`	1.0	Scharr gradient energy drives exponential-decay soft weights; amplifies L1 in flat regions, fades in textured regions — eliminates hard-threshold variance instability
`OklchColorLoss`	5.0	OKLCH perceptual color space: chroma L1 + hue cosine joint constraint, atan2-free
`StrictFlatTGVLoss`	1.0	Morphological hard mask isolates flat regions; Charbonnier penalizes 1st+2nd derivatives→0, eliminating flat-area ripple
`SmoothGradientHessianLoss`	1.5	Structure-tensor guided Hessian penalty on smooth gradient regions; suppresses color banding and micro-ripple in graduation areas

Phase 2 (added when training progress ≥ 30%)

Loss	Weight	Purpose
`ChromaGradientLoss`	1.5	Sobel directly constrains Oklab a/b chroma gradients to align with GT, preventing color overflow
`CreviceColorLoss`	6.0	Morphological closing detects line crevices; corrects hue shift caused by JPEG 4:2:0 chroma subsampling
`MaskedAsymmetricHistogramLoss`	1.5	Soft histogram over edge-dilated regions; asymmetric divergence heavily penalizes "hallucinated" colors (×5) while lightly penalizing "unrecovered" detail (×1)
`GibbsRingingPenaltySWT`	4.0	Haar SWT three sub-bands (HL/LH/HH) × À Trous multi-scale (d=1,2,4); one-sided penalty on high-frequency overshoot without interfering with normal sharpening
`AngularFluencyLoss`	5.0	Farid 7×7 rotation-equivariant operator computes gradient direction angular distance, directly eliminating super-resolution aliasing
`MacroscopicTurningPointLoss`	0.5	Dilated Scharr (d=2) + 11×11 Gaussian macroscopic integration; contrast-invariant corner response C = 4·det(S)/trace(S)²
`LaplacianResonanceTopologyLoss`	1.5	Dual-scale dilated Laplacian resonance + R^75 cosine topology + Charbonnier intensity; restores crevice topology
`AnimePerceptualLossV2`	0.5	Danbooru ConvNeXt cosine manifold distance (stage0+stage1); GT magnitude gating focuses on edge regions

GAN Components (optional)

Component	Description
`DecoupledUNetDiscriminatorSN`	Guided filter front-end decomposes structure/texture into dual streams; structure branch full-power U-Net, texture branch lightweight global statistics — prevents D from forcing G to hallucinate unrecoverable textures
`DecoupledGANLoss`	Structure adversarial weight ×1.0, texture adversarial weight ×0.1 (`texture_tolerance`); spectral normalization stabilizes training

MIM Auxiliary Loss

Obtain the InfoNCE skip-connection mutual information loss via model.mi_loss during training; recommended weight λ=0.01:

loss = criterion(pred, hr) + 0.01 * model.mi_loss

BADI Gate Regularization (optional)

GateTolerancePenalty applies a hinge penalty on model.badi.last_gate over GT flat regions, discouraging gate laziness that would allow shallow-layer noise to leak through. Cosine anneals to zero at 70% training progress; recommended weight λ≈0.001:

loss = loss + 0.001 * gate_penalty(model.badi.last_gate, gt)

📊 Experimental Results

This project is not aimed at academic publication and does not include controlled ablation studies.

The architecture is theoretically derived to be at the frontier, particularly for the specific scenario of restoring real-world internet image degradation.

If you process an image with it and the result looks good — that's enough.

📁 Project Structure

Caelum/                            ← Repository root
├── .github/
│   └── FUNDING.yml
├── Caelum/                        ← Python project directory
│   ├── assets/
│   │   ├── logo.png               # Project logo
│   │   ├── demo0.png              # GUI demo image
│   │   ├── demo1.png              # GUI demo image
│   │   └── compare/               # Comparison images
│   │       ├── degradation.png
│   │       ├── GT.png
│   │       ├── PPBUnet_CAR2_x4.png
│   │       └── waifu2x_SwinUNet_noise2_x4.png
│   └── model/
│       └── PPBUnet_v1/
│           ├── ARCHITECTURE.md       # Detailed architecture design doc
│           ├── PPBUNet_v1_x4.py      # Main network definition       [AGPL-3.0]
│           ├── modules.py            # Core module library           [AGPL-3.0]
│           ├── hat.py                # HAT Decoder                   [AGPL-3.0]
│           ├── ps_mamba.py           # PS-Mamba SSM module           [AGPL-3.0]
│           ├── dataset.py            # Online degradation pipeline   [CC BY-NC-SA 4.0]
│           ├── losses.py             # Custom loss function system   [CC BY-NC-SA 4.0]
│           └── train.py              # Training script               [AGPL-3.0]
├── README.md
├── README_zh.md
├── LICENSE
└── LICENSE-AGPL-3.0

Model weights and packaged applications (*.onnx, Caelum.exe) are distributed via Releases under CC BY-NC-SA 4.0.

🗺️ Roadmap

✅ Completed

PPBUNet v1.2 architecture design (ParallelOAM · FrequencyRouter · MIM · RMA · HAT · CornerAwareDCN · AnimeCommitteeRefiner · SATUpsampler)
Multi-platform degradation simulation pipeline (5 modes · 3-stage high-order chain · online real-time generation)
Custom loss function system (CaelumLossV2 · 13 sub-losses · two-phase progressive strategy · committee orthogonality + BADI gate regularizer)
Early checkpoint validation

🔄 In Progress

Complete model training → update final results showcase
Package exe + ONNX export → publish Release

🔭 Future Plans

Hair reconstruction — Recovery of hair tips and line-crevice detail is the current biggest weakness; planning hair-aware perceptual loss and targeted geometry refinement modules
Residual-free architecture exploration — Skip-add residuals have a fundamental limitation in artifact suppression (harmful input information is difficult to cut off); exploring purely attention/Mamba forward architectures free of skip-add residuals
New architecture exploration — Building on PPBUNet experience, continuing to explore more efficient and interesting anime SR architecture directions
Expand training dataset — Current dataset scale and diversity remain a bottleneck; planning to incorporate larger-scale anime illustration data (Danbooru · Pixiv etc.) while researching data cleaning and quality filtering pipelines

🌌 Origin Story

Years ago, I just wanted to upscale the anime illustrations I loved so they could be desktop wallpapers.

waifu2x was the first time I realized that neural networks could do this remarkably well — and at the time, the results were a complete paradigm shift over traditional interpolation upscaling. It sparked an intense curiosity in me — how does it do that?

To find out, I started learning deep learning and built my first super-resolution network, Entropia. Life got in the way and I set it aside for a long time.

Now, with the help of LLMs, I'm back — standing on the shoulders of my past self and everyone who came before.

Caelum is not a paper, not a research contribution. It's simply a continuation of a question — is there anything wrong with wanting the things I love to be a little clearer?

I've benefited too much from too many people's unpaid contributions along this road.

❤️ Support This Project

Caelum is and always will be free. If it helped you, you can buy me a coffee via Ko-fi — it directly translates into more GPU time.

📄 License

This project uses a dual license model:

Scope	License	Key Constraint
Network architecture & training source code (`PPBUNet_v1_x4.py`, `modules.py`, `hat.py`, `ps_mamba.py`, `train.py`)	AGPL-3.0	Derivatives must be open-sourced (including network services); commercial use permitted
Degradation/loss code + model & application release files (`dataset.py`, `losses.py`, `*.onnx`, `Caelum.exe`)	CC BY-NC-SA 4.0	Commercial use prohibited; derivatives must use the same license

For commercial use of training code or model weights, please contact the author for a separate commercial license.

📬 Acknowledgements

Thanks to the author of waifu2x.

Thanks to everyone willing to make the world a little better.

"Look at the sky — I'm still here."

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
.github		.github
Caelum		Caelum
.editorconfig		.editorconfig
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
LICENSE-AGPL-3.0		LICENSE-AGPL-3.0
README.md		README.md
README_zh.md		README_zh.md

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

Caelum「澄空」

What is Caelum?

🔍 How Does Caelum Compare?

✨ Features

🖼️ Results

🚀 Quick Start

Download & Use

System Requirements

Adding a New Language

🏗️ Architecture

Architecture Stages

Degradation Pipeline

Degradation Modes

3-Stage High-Order Degradation Chain (Mode 3/4)

Key Technical Details

Loss Function Design

Two-Phase Progressive Strategy

Sub-Loss Overview

📊 Experimental Results

📁 Project Structure

🗺️ Roadmap

✅ Completed

🔄 In Progress

🔭 Future Plans

🌌 Origin Story

❤️ Support This Project

📄 License

📬 Acknowledgements

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Sponsor this project

Uh oh!

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages