Why do we need to freeze the batch normalization layers when training badencoder?

Hi! I have been reading your awesome paper these days, and I have one question I hope you can answer: why do we need to freeze the BN layers when training the badencoder? Do we also need to do the same for other normalization layers, such as layer normalization?