Thank you for your great work!
I have a coding question.
Does the code support streaming generation, chunk by chunk? In other words, can it take a single chunk (one window) as input and output the corresponding 3D structure, rather than first processing the whole video and then internally splitting it into multiple windows? I think this would be more helpful for streaming video world models.
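
To make the question concrete, here is a minimal sketch of the calling pattern I have in mind. Everything in it is hypothetical and not taken from this repository: `stream_windows`, `window_size`, and especially `process_window`, which stands in for whatever call would turn one window of frames into a 3D structure.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

Frame = TypeVar("Frame")
Result = TypeVar("Result")


def stream_windows(
    frames: Iterable[Frame],
    window_size: int,
    process_window: Callable[[List[Frame]], Result],
) -> Iterator[Result]:
    """Yield one result per window as soon as that window is full.

    `process_window` is a hypothetical stand-in for the model call that
    maps one window of frames to its 3D structure.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")

    buffer: List[Frame] = []
    for frame in frames:
        buffer.append(frame)
        if len(buffer) == window_size:
            # Emit immediately; later frames have not been read yet.
            yield process_window(buffer)
            buffer = []

    if buffer:
        # Trailing partial window at the end of the stream.
        yield process_window(buffer)
```

The point is that each result is produced before the rest of the video is read, so the caller never needs the full sequence in memory. Whether windows can really be processed independently like this (for example, without shared state or overlap between windows) is exactly what I am unsure about.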