Replies: 4 comments 5 replies
Hey @meditans, thanks for reading the paper and trying to recreate it! I'll talk you through your questions one by one. Mostly you're right, and most of your confusion comes from my own laziness in not cleaning up the code while writing the paper :) Let's go through them.
I hope this helps! If you have any more questions, feel free to ask!
Hi @wouterwln, thank you so much for the answers. No worries, it all makes sense! I have just two follow-up questions that stem from debugging my own adaptation of the code.
I traced this problem back to the published code in biaslab/EFEasVFE, and I wonder if you could confirm. In your function `generate_maze_tensors`, I changed the default noisy observations:

```diff
 generate_maze_tensors(grid_size_x::Int, grid_size_y::Int, n_actions::Int;
     sink_states::Vector{Tuple{Int,Int}}=[(4, 2), (4, 4)],
     stochastic_states::Vector{Tuple{Int,Int}}=[(2, 3), (3, 3), (4, 3)],
-    noisy_observations::Vector{Tuple{Int,Int,Float64}}=[(1, 5, 0.1), (2, 5, 0.1), (3, 5, 0.4), (4, 5, 0.1), (2, 3, 0.4), (2, 4, 0.4), (3, 2, 0.4), (3, 3, 0.4), (4, 3, 0.2), (2, 2, 0.3), (3, 2, 0.4), (3, 4, 0.4)])
+    noisy_observations::Vector{Tuple{Int,Int,Float64}}=[(1, 5, 0.0)])
```

Look at how the performance of the EFE agent drops. It seems to me that the agent was avoiding the stochastic transitions because they happened to lie along paths of noisy observability, but it happily tries them (and often fails) once it can observe clearly. But maybe I'm missing something important. Thank you again!
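For context on what a noise level means here, a minimal sketch of the kind of observation distribution a noise level induces (my own illustration; the helper name `observation_distribution` is hypothetical, not from the repository — I assume a noise level `p` moves probability mass from the true observation to the other outcomes):

```julia
# Hypothetical helper: with noise level p, the true observation keeps
# probability 1 - p and the remaining p is spread over the other outcomes.
function observation_distribution(true_obs::Int, n_obs::Int, p::Float64)
    dist = fill(p / (n_obs - 1), n_obs)
    dist[true_obs] = 1 - p
    return dist
end

# With p = 0.0 the state is perfectly observable:
observation_distribution(2, 5, 0.0)  # [0.0, 1.0, 0.0, 0.0, 0.0]

# With p = 0.4 the state is only weakly observable:
observation_distribution(2, 5, 0.4)  # [0.1, 0.6, 0.1, 0.1, 0.1]
```

Under this reading, setting all noise levels to `0.0` makes every state perfectly observable, which is what removes the epistemic incentive to avoid those cells.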
Hey @meditans and @skoghoern, thanks for your responses. I'll try to give some insight.

Exactly, the point of the paper is that we no longer need to enumerate all possible sequences of actions. Essentially we are doing the same "trick" as variational inference: exact Bayesian inference is an intractable problem, since it requires enumerating all possible configurations. By introducing an auxiliary variational distribution, we turn that enumeration into a tractable optimization problem.

As for your second question, and here's where I have to correct @skoghoern a bit: you're right that the agent prefers stochastic transitions; this is what the epistemic prior on action essentially says. The awkward thing with this optimization objective is that the two epistemic priors have somewhat counteracting effects: one wants to maximize the entropy of beliefs over future states (to explore), while the other wants the entropy of future state beliefs to be minimal (so states can be identified). In my further work on this topic I have found that this objective is very hard to optimize because of these counteracting forces. It could very well be the case that putting in different values of transition and observation noise is detrimental to performance.

However, the behavior is perfectly rational. Think about it like this: the agent does not know whether the uncertainty in the transition distribution of these noisy states is epistemic or purely aleatoric (we know it is aleatoric), so for the agent it makes perfect sense to try these transitions; it is actively exploring them to reduce its uncertainty about them.
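The variational "trick" mentioned above can be made concrete on a toy discrete model (my own illustration, not code from the paper): instead of computing the posterior p(s | o) by enumeration, introduce a variational distribution q(s) and minimize the free energy F[q] = E_q[log q(s) - log p(s, o)], whose unique minimum is the exact posterior, with minimal value -log p(o):

```julia
# Toy model: hidden state s ∈ {1, 2, 3}, one observed outcome o.
prior      = [0.5, 0.3, 0.2]
likelihood = [0.1, 0.7, 0.2]          # p(o | s) for the observed o

joint     = prior .* likelihood       # p(s, o)
posterior = joint ./ sum(joint)       # exact p(s | o), by enumeration

# Free energy of a candidate q(s); equals -log p(o) + KL(q ‖ posterior),
# so it is minimized exactly when q is the posterior.
free_energy(q) = sum(q .* (log.(q) .- log.(joint)))

F_uniform   = free_energy(fill(1/3, 3))
F_posterior = free_energy(posterior)   # ≤ F_uniform, equals -log p(o)
```

Enumeration is affordable here, but the free-energy objective stays well-defined when the state space (or the space of action sequences) is far too large to enumerate, which is the point being made above.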
Since this thread is about EFEasVFE:

```julia
@model function efe_tmaze_agent(reward_observation_tensor, location_transition_tensor, prior_location, prior_reward_location, reward_to_location_mapping, u_prev, T, reward_observation, location_observation)
    old_location ~ prior_location               # = beliefs.location
    reward_location ~ prior_reward_location     # = beliefs.reward_location
    current_location ~ DiscreteTransition(old_location, location_transition_tensor, u_prev)
    location_observation ~ DiscreteTransition(current_location, diageye(5))
    reward_observation ~ DiscreteTransition(current_location, reward_observation_tensor, reward_location)
    previous_location = current_location
    for t in 1:T
        # Epistemic Action Prior (Exploration Node) - ideally u[t] ~ Categorical([0.25, 0.25, 0.25, 0.25])
        u[t] ~ Categorical(calculate_epistemic_action_prior(reward_observation))
        location[t] ~ DiscreteTransition(previous_location, location_transition_tensor, u[t])
        # Epistemic State Prior (Ambiguity Node) - ideally location[t] ~ Categorical([1/3, 1/6, 1/6, 1/6, 1/6])
        location[t] ~ Categorical(calculate_epistemic_prior_vec(reward_observation_tensor))
        previous_location = location[t]
    end
    location[end] ~ DiscreteTransition(reward_location, reward_to_location_mapping)
end
```

where I defined:

```julia
function calculate_epistemic_action_prior(tensor)
    return [0.25, 0.25, 0.25, 0.25]
end

function calculate_epistemic_prior_vec(tensor)
    return [1/3, 1/6, 1/6, 1/6, 1/6]
end
```
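The two helper functions above ignore their tensor argument and return the constant vectors from the comments. Purely as a sketch of how such a prior could instead be derived from the observation model (this is my guess at the intent, not the paper's actual rule; the function name is hypothetical), one might weight states by how unambiguous their observations are:

```julia
# Hypothetical sketch: favor states whose observation distributions have
# low entropy (i.e. unambiguous observations). NOT the paper's definition.
entropy(p) = -sum(x -> x > 0 ? x * log(x) : 0.0, p)

function epistemic_prior_from_tensor(A::Matrix{Float64})
    # A[o, s] = p(o | s); compute one entropy per hidden state (per column)
    H = [entropy(A[:, s]) for s in 1:size(A, 2)]
    w = exp.(-H)              # low ambiguity → high weight
    return w ./ sum(w)
end

A = [1.0 0.5;                 # state 1 is observed perfectly,
     0.0 0.5]                 # state 2 is maximally ambiguous
epistemic_prior_from_tensor(A)  # puts more mass on state 1
```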
Hi! I recently read @wouterwln's, @Nimrais's and @ThijsvdLaar's very insightful paper, "A message passing realization of EFE minimization". While working through the code I distilled the stochastic maze example into a minimal single-file presentation that could eventually be added to the RxInfer examples. While trying to understand the ideas behind it, I've encountered six questions I'd like to hear your thoughts on. I apologize for the combined length of the post, but it seemed better than splitting the questions into separate threads.
1 - Closing `y_future[t]` half-edges
In the model definition, `y_future` is constrained with a `Categorical` distribution. I believe this constraint is added because otherwise the `y_future` edges would remain half-open. I tried substituting the constraint with `y_future[t] ~ Uninformative()`, but I couldn't find a combination that worked. If I understand correctly, the difference is that in the `Uninformative` case no message is sent from the node, while the `Categorical` node nudges the variable toward the uniform distribution. That seems somehow extraneous to the model. Why doesn't this cause problems in practice?
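To make my intuition for question 1 concrete: for discrete variables, incoming messages combine by elementwise product followed by normalization, so a uniform `Categorical` message should be multiplicatively neutral. A small standalone check (my own illustration; `combine` is a hypothetical helper, not RxInfer's internal machinery):

```julia
normalize_probs(v) = v ./ sum(v)

# Combine incoming discrete messages: elementwise product, then normalize.
combine(msgs...) = normalize_probs(reduce((a, b) -> a .* b, msgs))

m1 = [0.7, 0.2, 0.1]
m2 = [0.1, 0.1, 0.8]
uniform = [1/3, 1/3, 1/3]

# The uniform message changes nothing after normalization:
combine(m1, m2) ≈ combine(m1, m2, uniform)  # true
```

If this carries over to the real message-passing schedule, it would explain why the extraneous uniform prior is harmless in practice, though it doesn't explain the `Uninformative()` failure.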
2 - The lack of a prior on observations
The paper defines three types of priors:
Am I correct in thinking that the third type of node is absent because it concerns parameter estimation, and this model doesn't include parameters? For instance, if we were to learn the slippery coefficient of each state (which remains constant across runs), would we then need to include this prior? How?
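To illustrate what I mean by learning the slippery coefficient: in the discrete setting, the standard approach (my sketch, not necessarily the paper's) is a Dirichlet prior over each transition column, which is conjugate to the Categorical likelihood, so the posterior pseudo-counts are just prior counts plus observed counts:

```julia
# Conjugate Dirichlet update for one state's transition distribution.
# alpha: Dirichlet pseudo-counts; counts: observed next-state counts.
posterior_alpha(alpha, counts) = alpha .+ counts

# Posterior-mean estimate of the transition probabilities:
posterior_mean(alpha) = alpha ./ sum(alpha)

alpha0 = [1.0, 1.0, 1.0]     # uninformative prior
counts = [8, 1, 1]           # transitions observed from this state across runs
α = posterior_alpha(alpha0, counts)
posterior_mean(α)            # ≈ [0.69, 0.15, 0.15]: state looks mostly deterministic
```

I assume the third prior type in the paper would play this role, placing a distribution over the tensor entries themselves; how that is wired into the model is exactly what I'm asking.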
3 - On missing marginal rules
While writing a minimal version of the code, I realized that if the marginal rules that introduce the handling of `JointMarginalStorage` were missing, the result would be incorrect (since the version without an explicit meta would be used, and `JointMarginalMeta` would never be updated), yet crucially no error would be raised. Would it be worth notifying the user when the meta parameter goes unused, to prevent this kind of issue?
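The silent fallback I describe comes straight from multiple dispatch: a generic method happily accepts the meta and ignores it, so nothing errors. A minimal standalone illustration (all names here are mine, not from the repository):

```julia
struct MyMeta
    payload::Vector{Float64}
end

# Generic rule: accepts any meta and silently ignores it.
rule(msg, meta) = msg

# The specific method that actually uses the meta. If this definition
# were missing, the call below would silently fall back to the generic one.
rule(msg, meta::MyMeta) = msg .* meta.payload

rule([1.0, 2.0], nothing)              # generic path: returns msg unchanged
rule([1.0, 2.0], MyMeta([10.0, 0.5]))  # specific path: [10.0, 1.0]
```

Deleting the `meta::MyMeta` method changes the second result without any warning, which is exactly the failure mode I hit.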
4 - On which rules are fired
This is related to the previous question. While debugging that problem, I wanted to know which `@rule`s are fired during the inference process. Is there a recommended way to inspect this? By the way, this might be something that could be shown to the user in the new observation library, @bvdmitri.
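For plain Julia functions, `InteractiveUtils.@which` reports the method a given call would invoke; whether this composes usefully with the `@rule` machinery I don't know, but it was my starting point (a generic sketch, not ReactiveMP-specific):

```julia
using InteractiveUtils  # provides @which

g(x::Int) = "int method"
g(x::AbstractFloat) = "float method"

m1 = @which g(1)    # the Method object for g(x::Int64)
m2 = @which g(1.0)  # the Method object for g(x::AbstractFloat)
println(m1)
println(m2)
```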
5 - On the meaning of the `sumdims` parameter
In the definition of `JointMarginalMetaComponent`, there's a `sumdims` parameter. Since it's never used as a runtime value, I assume it helps select the correct dispatch as a type (similar to `out_dim` and `in_dim`). Is that correct? Could you explain why it's needed and what it represents?
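The pattern I'm guessing at (a value carried at the type level purely to select a method) is common in Julia; a generic standalone sketch of it, unrelated to the actual `sumdims` semantics:

```julia
# A value can be lifted into the type domain with Val so that dispatch,
# rather than a runtime branch, selects the implementation.
sum_over(x::Matrix, ::Val{1}) = vec(sum(x; dims=1))  # collapse rows
sum_over(x::Matrix, ::Val{2}) = vec(sum(x; dims=2))  # collapse columns

M = [1.0 2.0;
     3.0 4.0]
sum_over(M, Val(1))  # [4.0, 6.0]
sum_over(M, Val(2))  # [3.0, 7.0]
```

If `sumdims` works like this, it would explain why it never appears as a runtime value; I'd still like to know what the chosen dimensions represent in the joint marginal.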
6 - On the true dependencies of Exploration and Ambiguity nodes
Consider the `Exploration` node. Nominally, it depends on the `y_current` variable (based on how the model is connected), but the rule doesn't actually use the information coming from that node. In fact, the only thing that matters for exploration is the joint distribution calculated during the marginal rule in the state transition node, which is passed through via `JointMarginalMeta`. Here's a crude drawing: the red connections represent the hidden links through which the marginal information is shared. Again, both `Exploration` and `Ambiguity` nodes nominally depend on `y_current`.
My question is: is `y_current` simply a placeholder input that we need because an incoming message is required to trigger the rule? Could we have used something else instead? Moreover, regarding the scheduling of messages, what prompts the `Exploration` node to send a message? It can't be `y_current` again (?), and I'm not sure whether a change in the meta can trigger a message from the node. I'm just confused about this.