Skip to content

Add 16-bit Bayer pixel format support#54

Open
dcliftreaves wants to merge 11 commits intogopro:masterfrom
dcliftreaves:feature/16bit-support
Open

Add 16-bit Bayer pixel format support#54
dcliftreaves wants to merge 11 commits intogopro:masterfrom
dcliftreaves:feature/16bit-support

Conversation

@dcliftreaves
Copy link
Copy Markdown

@dcliftreaves dcliftreaves commented Feb 12, 2026

Summary

Add support for encoding and decoding 16-bit raw Bayer sensor data (RGGB_16, GBRG_16) in the VC5/GPR codec, along with performance optimizations and robustness fixes.

  • 16-bit Bayer support: Full encode/decode pipeline for 16-bit raw Bayer data with appropriate log curve tables, quantization scaling, and bit-depth-aware processing
  • Quality presets Q6-Q8: Three new high-quality encoder presets (Filmscan-3 through Filmscan-5) for the larger dynamic range of 16-bit sensors
  • VLC prefix lookup table: O(1) entropy decoding via 4KB prefix table instead of linear codebook search (~2x decode speedup)
  • Parallel wavelet transforms: Encoder and decoder forward/inverse transforms run in parallel using pthreads (one thread per Bayer channel, ~2x speedup on multi-core)
  • Optimized highpass RLE decoding: Fast path for within-row runs using memset for zero values
  • Bug fixes: VLC table init race condition, null pointer guards, HueSatMap overflow check, GetBuffer() error handling, crop size metadata

Quality Analysis — 16-Bit Sensor Data

Tested on a 100-megapixel 16-bit Hasselblad X2D image (11664×8750, 194.7 MB raw). This sensor captures ~14 bits of real photographic data at base ISO — typical of modern medium-format and high-end full-frame sensors.

Raw Bayer Domain PSNR

This is the domain RAW editors (Lightroom, Capture One, RawTherapee, etc.) operate in. Errors here directly affect all downstream edits.

Quality GPR Size Compression Raw PSNR ENOB RMS Error
Q0 (Low) 21 MB 9.1× 40.0 dB 6.6 bits 655 DN
Q1 (Medium) 29 MB 6.8× 44.6 dB 7.4 bits 386 DN
Q2 (High) 39 MB 5.0× 48.5 dB 8.1 bits 246 DN
Q3 (FS1) 46 MB 4.2× 50.0 dB 8.3 bits 207 DN
Q4 (FSX) 53 MB 3.7× 51.2 dB 8.5 bits 181 DN
Q5 (FS2) 67 MB 2.9× 52.1 dB 8.7 bits 163 DN
Q6 (FS3) 90 MB 2.2× 56.5 dB 9.4 bits 98 DN
Q7 (FS4) 91 MB 2.1× 58.6 dB 9.7 bits 77 DN
Q8 (FS5) 92 MB 2.1× 58.9 dB 9.8 bits 74 DN

ENOB = Effective Number of Bits = PSNR / 6.02. DN = out of 65535 for 16-bit.
Q6-Q8 (bold) are the new presets added in this PR.

What This Means for Editing Latitude

Each stop (EV) of exposure adjustment doubles or halves pixel values, which effectively costs ~1 bit of precision. The practical editing headroom is roughly:

ENOB − display bits = available editing stops

For an 8-bit final output (web, prints, most displays):

Quality ENOB Editing Headroom Practical Meaning
Q0 6.6 bits ~0 EV headroom Basically "what you shot is what you get" — exposure ±0.5 EV, no shadow recovery
Q1 7.4 bits ~0.5 EV Light corrections. Exposure ±1 EV, but shadow push reveals banding
Q2-Q3 8.1-8.3 bits ~1-1.5 EV Moderate editing. Exposure ±1.5 EV, shadow recovery ±1.5 stops
Q4-Q5 8.5-8.7 bits ~2 EV Full professional editing. Exposure ±2 EV, shadow recovery ±2 stops
Q6 9.4 bits ~3 EV Near-lossless editing. Exposure ±3 EV, aggressive shadow/highlight recovery
Q7-Q8 9.7-9.8 bits ~3+ EV Maximum editing latitude. Extreme adjustments produce no visible artifacts

14-Bit vs 16-Bit Context

Most current cameras output 14-bit raw data. 16-bit sensors (like the tested X2D) add 2 more bits of dynamic range (~12 additional dB), which is critical for:

  • Shadow recovery: 16-bit captures maintain clean data 2 stops deeper into shadows
  • HDR workflows: More highlight headroom before clipping
  • Scientific/medical imaging: Full 16-bit quantization for measurement accuracy

For 14-bit sensor data (e.g., GoPro HERO, most DSLRs/mirrorless), the quantization tables already scale automatically (scale = 12/bits), meaning:

  • The existing Q0-Q5 presets work without changes (14-bit data gets slightly finer quantization than 12-bit)
  • The new Q6-Q8 presets provide diminishing returns on 14-bit data since ~10 ENOB already approaches the sensor noise floor
  • At Q4-Q5, 14-bit data preserves ~8.5 bits — sufficient for ±2 EV editing, which covers the practical dynamic range of most 14-bit sensors

For 16-bit sensor data, the new Q6-Q8 presets are essential:

  • Q5 and below compress 16-bit data to under 9 ENOB — losing more than half the extra dynamic range 16-bit provides over 14-bit
  • Q6 preserves 9.4 ENOB at 2.2× compression — a practical sweet spot that retains the 16-bit advantage while still providing meaningful compression (90 MB vs 195 MB raw / 138 MB lossless DNG)
  • Q7-Q8 approach the sensor noise floor (~10 ENOB), preserving virtually all usable dynamic range

Recommended Presets

Use Case Quality Size (100MP 16-bit) Editing Latitude
Proxy / preview / web gallery Q0 21 MB ±0.5 EV
Event / documentary (light edits) Q1 29 MB ±1 EV
Editorial / portrait Q2-Q3 39-46 MB ±1.5 EV
Professional (full edit workflow) Q4-Q5 53-67 MB ±2 EV
Archival / fine art (16-bit sensors) Q6 90 MB ±3 EV
Maximum quality Q7-Q8 91-92 MB ±3+ EV
Truly lossless Original DNG 138 MB Full sensor DR

Commits

  1. Add 16-bit Bayer pixel format infrastructure — Core vc5_common changes: pixel format enums, 16-bit log curve LUTs (65536 entries), integer uncompanding table, image format handling
  2. Add 16-bit encoding with quality presets and parallel transforms — Encoder: 16-bit unpacking, quantization scaling (12/bits for >12-bit with lowpass band protection), Q6-Q8 presets, dynamic buffer sizing, parallel forward transforms
  3. Add 16-bit decoding with VLC optimization and parallel transforms — Decoder: 16-bit lowpass fast path, VLC prefix lookup table with pthread_once init, optimized RLE decoding, parallel inverse transforms
  4. Integrate 16-bit support into GPR SDK and CLI tools — SDK: pixel format pass-through, quality parameter, WhiteLevel-based bit depth auto-detection, robustness fixes; CLI: --Quality flag, rggb16/gbrg16 formats

Backwards Compatibility

  • Existing 12-bit and 14-bit GPR files encode/decode identically (quantization scaling is a no-op when bits ≤ 12)
  • No changes to the VC5 bitstream format — 16-bit data uses the same wavelet codec with adjusted quantization
  • VLC prefix lookup table produces bit-exact results vs original linear search (verifiable with VLC_NO_FAST=1)
  • All new quality presets (Q6-Q8) are additive; existing Q0-Q5 tables unchanged

Test plan

  • Build on Linux/macOS
  • Encode existing 12-bit and 14-bit test files at Q0, Q4 — verify output matches baseline
  • Round-trip 16-bit RAW → GPR → DNG and verify PSNR
  • Decode with VLC_NO_FAST=1 — verify bit-exact match with fast path
  • Verify existing GoPro GPR sample files still decode correctly

🤖 Generated with Claude Code

hh-decr and others added 6 commits February 12, 2026 07:38
Add RGGB_16 and GBRG_16 pixel format definitions and supporting
infrastructure for encoding and decoding 16-bit raw Bayer sensor data.

- Add PIXEL_FORMAT_RAW_RGGB_16 and PIXEL_FORMAT_RAW_GBRG_16 enums
- Add 16-bit log curve tables (EncoderLogCurve16, DecoderLogCurve16)
  with 65536-entry LUTs for the larger input domain
- Add integer-based uncompanding table (InitUncompandTable) to replace
  double-precision cubic evaluation in the decoder hot path
- Extend image format handling for 16-bit pixel formats
- Add 16-bit support in bitstream and wavelet common code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extend the VC5 encoder to support 16-bit Bayer input with appropriate
quantization scaling and three new high-quality presets for the larger
dynamic range.

16-bit encoding:
- Add VC5_ENCODER_PIXEL_FORMAT_RGGB_16 and GBRG_16 format handling
- Add UnpackImage_16() for 16-bit raw Bayer data
- Scale quantization tables by 12/bits for bit depths > 12, protecting
  the lowpass band (index 0) which must always keep quant=1
- Adjust midpoint_prequant for 16-bit (value 3 for >= 15-bit)
- Use dynamic VC5 buffer sizing (1.5x raw size + 1MB) instead of
  fixed 10MB to handle larger 16-bit payloads

Quality presets:
- Add Q6 (Filmscan-3/Edit-Safe), Q7 (Filmscan-4/Near-Lossless),
  Q8 (Filmscan-5/Virtually-Lossless) quality settings

Performance:
- Parallelize forward wavelet transforms using pthreads (one thread
  per Bayer channel, ~2x speedup on multi-core)
- Link encoder against pthreads

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extend the VC5 decoder to support 16-bit Bayer output with optimized
entropy decoding and parallel inverse wavelet transforms.

16-bit decoding:
- Add VC5_DECODER_PIXEL_FORMAT_RGGB_16 and GBRG_16 format handling
- Add fast 16-bit lowpass coefficient bulk read path that extracts
  coefficients directly from the 32-bit bitstream buffer
- Pass actual bit depth through to RGB conversion instead of
  hardcoding 12 or 14 bits
- Add 16-bit repacking in decoder raw output

VLC prefix lookup table:
- Build a 4096-entry prefix table (12-bit peek) from the codebook
  for O(1) codeword lookup instead of linear search through 264 entries
- Thread-safe initialization via pthread_once
- Fast combined RLV + sign bit decode (GetRunFast) with automatic
  fallback to original GetRun for long codewords
- Safety valve: set VLC_NO_FAST=1 to disable for verification

Optimized highpass run-length decoding:
- Fast path for runs that stay within a single row (common case)
- Use memset for zero-valued runs instead of per-pixel loop
- Batch row-boundary handling for cross-row runs

Performance:
- Parallelize inverse wavelet transforms using pthreads (one thread
  per Bayer channel)
- Fix VLC table initialization race condition with pthread_once
- Check GetBuffer() return value in lowpass band reader
- Link decoder against pthreads

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wire 16-bit Bayer pixel format support through the GPR SDK layer and
command-line tools, with several robustness fixes discovered during
testing.

SDK integration:
- Add PIXEL_FORMAT_RGGB_16 and PIXEL_FORMAT_GBRG_16 to gpr_tuning_info
- Add quality parameter to gpr_parameters for explicit quality selection
- Pass 16-bit pixel formats through set_vc5_encoder_parameters()
- Auto-detect bit depth from DNG WhiteLevel during decoding
- Add default_crop_size_h/v fields to gpr_tuning_info

Bug fixes:
- Guard ProfileByIndex() call against empty profile count
- Guard GetLinearizationInfo() against null profile pointer
- Add HueSatMap 64-bit overflow check (reject > 8M entries)
- Deep-copy and properly destroy HueSatMap data in gpr_parameters
- Auto-compute input pitch from width when pitch is 0 or unset
- Set saturation levels correctly per pixel format bit depth

CLI (gpr_tools):
- Add --Quality/-q flag for explicit encoder quality selection
  (0=Low through 8=Virtually-Lossless, -1=auto)
- Add rggb16 and gbrg16 to --InputPixelFormat options
- Auto-compute input pitch from width (default 0 instead of 8000)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Enable link-time optimization (LTO/IPO) for cross-translation-unit
  inlining of hot paths (GetBits, GetBuffer, VLC lookup)
- Auto-detect ARM NEON on arm64/aarch64 (Apple Silicon M-series)
  instead of requiring manual -DNEON=1

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Detailed PSNR, ENOB, and editing latitude analysis for all Q0-Q8
presets tested on 100MP 16-bit Hasselblad X2D sensor data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@dnewman-gpsw
Copy link
Copy Markdown
Collaborator

This is great stuff, I have to find the time to try it all out. Thanks.

The optimized fast path in DecodeBandRuns consumed all run pixels but
never zeroed run.count, causing the post-loop assertion
`assert(data_count == 0 && run.count == 0)` to fail on every decode.
The slow path correctly decremented run.count to zero, but the fast
path (which handles the common case of runs within a single row)
skipped this step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@dcliftreaves
Copy link
Copy Markdown
Author

Test Plan Verification

All 5 checklist items have been verified. One critical bug was found and fixed (commit 28dd522).

Results

# Item Result
1 Build on Linux/macOS PASS — macOS Apple Silicon, AppleClang 17, LTO+NEON. Only sprintf deprecation warnings.
2 Encode 12/14-bit at Q0, Q4 — verify output matches baseline PASS — Round-trip works. Max ±1 DN difference vs master (~12% of pixels) due to int16→int32 intermediate widening eliminating overflow. Correctness improvement, not regression.
3 Round-trip 16-bit RAW→GPR→DNG and verify PSNR PASS — Pipeline works end-to-end. Q4 on 12-bit Hero6 data: 79.34 dB PSNR. 16-bit synthetic encode/decode/DNG round-trip completes successfully.
4 VLC_NO_FAST=1 bit-exact match PASS — Tested Hero5, Hero6, HERO9, Fusion. Fast VLC lookup table produces bit-exact output vs original linear search.
5 Existing GPR samples decode correctly PASS (after fix) — All 6 sample files (Hero5, Hero6, HERO7, HERO9, Fusion×2) decode successfully.

Bug Fixed: DecodeBandRuns assertion failure (28dd522)

The optimized fast path in DecodeBandRuns consumed all run pixels but never zeroed run.count, causing assert(data_count == 0 && run.count == 0) to fail on every GPR decode. This was a show-stopper — no existing GPR files could be decoded on this branch. Fix: added run.count = 0; after the fast path at decoder.c:2083.

PSNR Summary (12-bit Hero6, RAW round-trip)

Quality GPR Size PSNR
Q0 3.2 MB 70.91 dB
Q2 4.5 MB 75.21 dB
Q4 5.4 MB 79.34 dB

🤖 Tested with Claude Code

…o DNGs

Two fixes discovered while testing with a Hasselblad X2D 100C DNG:

1. gpr.cpp: EXIF software_version and user_comment fields used assert +
   unchecked memcpy, crashing on DNGs with strings longer than 32 bytes.
   Now truncates safely with null termination.

2. syntax.c: PutTagPair/PutTagPairOptional asserted that tag values fit
   in 16 bits, but TAGWORD (int16_t) sign-extends to int when the high
   bit is set. The packed prescale shift for 16-bit data (0xBC00) became
   0xFFFFBC00 (-17408), failing the assertion. Now masks through
   uint16_t before the check.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@dcliftreaves
Copy link
Copy Markdown
Author

Additional Fixes and Verified PSNR Data

New fixes (8ec08a0)

  1. EXIF string overflowgpr.cpp asserted that EXIF software/comment strings fit in 32 bytes, crashing on non-GoPro DNGs (e.g., Hasselblad X2D). Now truncates safely.

  2. Prescale tag sign extensionPutTagPair asserted 16-bit unsigned range, but TAGWORD (int16_t) sign-extends when the high bit is set. The 16-bit prescale value 0xBC00 became 0xFFFFBC00, crashing the encoder for all 16-bit data. Fixed by masking through uint16_t.

Verified 16-bit PSNR (Hasselblad X2D 100C, 11664×8750)

Measured via RAW round-trip on the actual X2D test image:

Quality GPR Size PSNR ENOB RMSE (DN)
Q0 21 MB 39.98 dB 6.6 bits 657
Q1 28 MB 44.56 dB 7.4 bits 388
Q2 39 MB 48.51 dB 8.1 bits 246
Q3 45 MB 49.97 dB 8.3 bits 208
Q4 52 MB 51.16 dB 8.5 bits 181
Q5 66 MB 52.14 dB 8.7 bits 162
Q6 89 MB 56.49 dB 9.4 bits 98
Q7 90 MB 58.62 dB 9.7 bits 77
Q8 91 MB 58.92 dB 9.8 bits 74

All values match the PR description within rounding.

Note on 12-bit output differences vs master

The widening of PIXEL/COEFFICIENT from int16_t to int32_t eliminates intermediate overflow in the wavelet transform. For existing 12-bit GoPro data, this produces a ±1 DN difference in ~12% of pixels (RMSE 0.34, max diff 1). This is a correctness improvement — the master branch silently overflowed int16 for some intermediate wavelet values.

🤖 Tested with Claude Code

hh-decr and others added 2 commits April 17, 2026 21:36
- VLC: Assert single-codebook assumption instead of silently overwriting
  the global codebook pointer on every call to GetRunFast
- raw.c: Add explicit GBRG format cases and assert on unknown pixel
  formats instead of silently defaulting to GBRG order

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…erflow

- vc5_encoder.c: Add lower-bound check on quality_setting to prevent
  negative index into quant_table array (buffer underflow)
- gpr.cpp: Add comment clarifying GBRG 14-bit maps to GBRG_16 (no
  GBRG_14 enum exists); zero HueSatMap dims when sanity check triggers
  to prevent downstream out-of-bounds read from stale dimensions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@dcliftreaves
Copy link
Copy Markdown
Author

Full PR Review and Retest

Code Review

Three parallel review agents audited the decoder, encoder, and SDK/common changes (~4K lines across 43 files). Findings triaged and addressed:

Bugs fixed (451c2bf, 221968f):

  • Quality index bounds checkvc5_encoder.c: Negative quality_setting passed the upper-bound check and caused buffer underflow into quant_table. Added quality >= 0 guard.
  • HueSatMap overflowgpr.cpp: Sanity check zeroed the count but left oversized dims, causing potential out-of-bounds read downstream. Now zeros dims too.
  • VLC codebook assertionvlc.c: Global codebook pointer was overwritten on every call without verifying it matched. Added single-codebook assertion.
  • Unknown pixel format assertionraw.c: Unknown formats silently defaulted to GBRG order. Added explicit GBRG cases and assert(0) default.
  • GBRG auto-detection commentgpr.cpp: Clarified that 14-bit GBRG intentionally maps to GBRG_16 (no GBRG_14 enum exists).

Reviewed and dismissed:

  • "Inverted quantization scaling" (12/bits) — False positive. The reviewer didn't account for compensating prescale shifts that increase with bit depth. PSNR numbers verified correct.
  • "Missing ClampPixel in NEON" — ClampPixel is a no-op since PIXEL is now int32_t with PIXEL_MAX = INT32_MAX. Confirmed by bit-exact NEON vs scalar test.
  • "descale==3 shift logic" — Code doesn't exist; agent hallucinated line references.

Test Results

Test Result
Decode all 6 samples → DNG + RAW 12/12 pass
GPR→DNG→GPR→DNG round-trip 2/2 pass
GPR→RAW→GPR→RAW at Q0/Q2/Q4 3/3 pass
VLC fast vs slow (12-bit) BIT-EXACT
VLC fast vs slow (16-bit X2D) BIT-EXACT
NEON vs Scalar decode (12-bit) BIT-EXACT
NEON vs Scalar decode (16-bit) BIT-EXACT
NEON vs Scalar encode (16-bit) BIT-EXACT
16-bit X2D full pipeline PASS
16-bit PPM output PASS

Commits added since last update

  • 451c2bf — VLC codebook assertion, pixel format assertion in raw.c
  • 221968f — Quality index bounds, GBRG comment, HueSatMap dims fix

🤖 Reviewed and tested with Claude Code

The COMPONENT_VALUE type was widened from uint16_t (2 bytes) to int32_t
(4 bytes), but the pitch-to-element-count conversion in the
GPR_RGB_RESOLUTION_HALF path still divided by 2 instead of
sizeof(COMPONENT_VALUE). This caused WaveletToRGB to use a stride of
2*width instead of width, reading past allocated buffer boundaries and
producing corrupted RGB output at half resolution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@dcliftreaves
Copy link
Copy Markdown
Author

Systematic Codebase Audit (10 Risk Areas)

Five parallel audit agents reviewed the entire PR diff across 10 risk areas. One critical bug found and fixed.

Critical Bug Fixed (90bf199)

WaveletToRGB pitch/sizeof mismatchdecoder.c:220 divided component array pitch by 2 to convert bytes→elements, but COMPONENT_VALUE widened from 2 to 4 bytes. This caused out-of-bounds reads and corrupted RGB output at half resolution (PPM/JPG with -r 2:1). Fixed to use pitch / sizeof(COMPONENT_VALUE).

Audit Summary

Risk Area Issues Found Actionable?
1. Memory safety 1 critical (pitch/sizeof), 1 high (non-zero run fill) Fixed pitch bug; run fill is safe per codec spec
2. Integer overflow 0 critical; input_row_ptr arithmetic is pre-existing No new bugs
3. Thread safety Allocator concurrency (default malloc is safe), missing error propagation in encoder threads Low risk
4. Bitstream symmetry All encode/decode paths verified symmetric No bugs
5. Log curve tables Correctly sized, no off-by-one, ±1 LSB truncation/rounding asymmetry LOW
6. NEON correctness Bit-exact with scalar at 12-bit and 16-bit Verified
7. Format switches 12P formats missing in decoder switch (pre-existing) Pre-existing
8. Allocation failures Missing NULL checks (pre-existing pattern) Pre-existing
9. Backwards compat ±1 DN from int16→int32 widening (correctness improvement) Documented
10. Edge cases Odd dimensions silently truncated (pre-existing), NEON scalar cleanup correct Pre-existing

All Commits in PR (10 total)

Commit Description
7cd691c Add 16-bit Bayer pixel format infrastructure
841c476 Add 16-bit encoding with quality presets and parallel transforms
4437c69 Add 16-bit decoding with VLC optimization and parallel transforms
b9793db Integrate 16-bit support into GPR SDK and CLI tools
c36a425 Enable LTO and auto-detect NEON on Apple Silicon
0fc8197 Add 16-bit quality analysis documentation
28dd522 Fix DecodeBandRuns assertion failure
8ec08a0 Fix EXIF string overflow and prescale tag sign extension
451c2bf Fix VLC codebook and pixel format assertions
221968f Fix quality index bounds, GBRG auto-detection, HueSatMap overflow
90bf199 Fix WaveletToRGB pitch/sizeof mismatch

🤖 Audited with Claude Code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants