[Updated] Likelihood-informed processors#410

Open
odunbar wants to merge 28 commits into main from orad/likelihood-informed-proc-v2
Conversation

@odunbar
Member

@odunbar odunbar commented Apr 1, 2026

Co-authored by @ArneBouillon

Purpose

Content

  • Migration of the likelihood-informed subspace (LIS) from "Add likelihood-informed data processors" (#376), with the following simplifications/changes:
    • [change] encoder_kwargs for structure vectors now keep samples_in and samples_out in their respective distributions, and also contain the algorithm time values dt corresponding to the distributions.
    • [simplification] we no longer provide the dim_criterion, only a threshold for doing dimension reduction.
    • [change] the user provides iters (Vector{Int}), not alphas (Vector{FT}), to indicate which dt values they would like to use for dimension reduction. A message reports the dt of the requested iterations.
    • [change] if multiple iters are requested, by default we compute the LIS for each iter, then use a trapezoidal rule to create the final subspace.
    • [simplification] reduced the complexity of the DimReduction example.
    • [change] added a shift by the data mean.
    • [change] use a more robust information measure f(k) = log(1+k)^2 and truncate based on cumsum(f(k))/sum(f(k)) > eps. This yields cutoffs eps that need not be values like "0.99999".
  • Extended encoder_kwargs_from(eki,prior; g_final=...) to obtain all kwargs for encoding a typical problem. Additionally, as a convenience, the user can provide the "g" ensemble for a final evaluation of the "u" ensemble from the eki.
  • Recreated the DimensionReduction example for a linear problem to assess performance.
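
The truncation rule above can be sketched in a few lines of Julia. This is a minimal illustration, not the code in this PR: the names `info_measure` and `truncation_dim` are hypothetical, and it assumes that `k` denotes nonnegative eigenvalues of the diagnostic matrix sorted in descending order, and that `log(1+k)^2` means the square of the logarithm (which is how Julia parses it).

```julia
# Hypothetical helpers for illustration only; not part of the package API.

# Information measure from the PR description: f(k) = log(1+k)^2,
# i.e. the square of log(1 + k).
info_measure(k) = log(1 + k)^2

# Return the smallest number of leading directions whose cumulative share of
# the total information measure exceeds `threshold` (the "eps" above).
# Assumes `eigvals` are nonnegative and sorted in descending order.
function truncation_dim(eigvals::AbstractVector{<:Real}, threshold::Real)
    f = info_measure.(eigvals)
    total = sum(f)
    total > 0 || return 0            # no information in any direction
    frac = cumsum(f) ./ total        # cumulative fraction, nondecreasing
    idx = findfirst(>(threshold), frac)
    # If the threshold is never exceeded, keep every direction.
    return idx === nothing ? length(eigvals) : idx
end

# Example call on a made-up spectrum.
reduced_dim = truncation_dim([10.0, 1.0, 0.1, 0.001], 0.94)
```

Because log(1+k) grows slowly for large k and is approximately k for small k, a measure of this form damps the dominance of the leading eigenvalues, which is one plausible reason moderate cutoffs become usable.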

MISC

  • Refactored the boundary between public/private encode/decode functions.
    • API (to Utilities): encode_data, encode_structure_matrix, encode_with_schedule applied to encoder_schedule. NOT applicable to processors.
    • Private: _encode_data and _encode_structure_matrix, only applicable to exact processors etc.
    • API (to Emulators): encode_data and encode_structure_matrix (and decode counterparts) allow application to emulator in place of encoder_schedule.
  • Added a small check for an error message.

Example DimensionReduction

We try the dimension-reduction (DR) methods, using the forward map wrapper in place of an emulator, on a linear example mapping 100D -> 100D. The output is as follows:

┌ Info: (reduced_dim, error) of posterior mean to whitened "reference" solution,
└  when using the truncation ({for PCA}, {for LI})
5×4 DataFrame
 Row │ truncation         PCA-in          LI-in 1:1       LI-in 1:5      
     │ Tuple…             Tuple…          Tuple…          Tuple…         
─────┼───────────────────────────────────────────────────────────────────
   1 │ (0.995, 0.999999)  (55, 0.165108)  (74, 0.155773)  (55, 0.157305)
   2 │ (0.99, 0.9999)     (38, 0.128842)  (45, 0.137292)  (29, 0.133462)
   3 │ (0.98, 0.999)      (23, 0.216203)  (30, 0.154637)  (21, 0.136362)
   4 │ (0.94, 0.94)       (9, 0.540077)   (11, 0.316728)  (10, 0.128643)
   5 │ (0.9, 0.88)        (6, 0.514944)   (9, 0.42256)    (8, 0.294448)

We see that as truncation becomes more aggressive (smaller reduced dimension), the error of our new LI 1:5 method grows much more slowly than that of PCA. Meanwhile, LI 1:1 is similar.

@odunbar odunbar changed the title Likelikehood-informed processors [Updated] Likelikehood-informed processors Apr 1, 2026
@odunbar odunbar changed the title [Updated] Likelikehood-informed processors [Updated] Likelihood-informed processors Apr 1, 2026
@codecov

codecov Bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 92.41379% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.03%. Comparing base (6341ca2) to head (7134a07).

Files with missing lines               Patch %   Lines
src/Utilities.jl                       81.81%    18 Missing ⚠️
src/Utilities/likelihood_informed.jl   97.70%    4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #410      +/-   ##
==========================================
- Coverage   94.18%   94.03%   -0.15%     
==========================================
  Files          10       11       +1     
  Lines        1977     2214     +237     
==========================================
+ Hits         1862     2082     +220     
- Misses        115      132      +17     

@odunbar odunbar force-pushed the orad/likelihood-informed-proc-v2 branch 4 times, most recently from ae22812 to 8c3e37c Compare April 2, 2026 22:28
odunbar added 13 commits April 30, 2026 16:10
typos

add pkgs

encoder_kwargs in nice form

remove dim criterion for now

docstring

API refinement

add example

builds encoder, but mcmc bug

examples run, bugs for dimensionality and product order resolved

noise injection reduced

add some get_encoded_dim functions

add consistency for both in-and-out dimensions for li, when using multiple distributions

runs through with output reduction, logic simplified for now
@odunbar odunbar force-pushed the orad/likelihood-informed-proc-v2 branch from 002a511 to b5525d6 Compare April 30, 2026 23:10
Development

Successfully merging this pull request may close these issues.

Add new utilities for the likelihood-informed DataProcessor

1 participant