Hello, I noticed that your training command doesn't set mv_unet to True, but during inference mv_unet is loaded as True whenever a ref_image is provided. Is it intended that cross-view attention is disabled during training but enabled during inference?