This is very interesting work, but I ran into a problem when reproducing the training on PI3. With full-sequence training, my results are similar to PI3, but once I enable chunking, the model's performance drops significantly, and I don't know why.