From b87773382c1725dfefb13c31799650589ecc46ed Mon Sep 17 00:00:00 2001
From: Jinwei Gu
Date: Mon, 25 May 2026 14:01:07 -0700
Subject: [PATCH 1/9] Update with Cosmos3-Super results

Update with Cosmos3-Super results (on I2V and V2V, with and without WMReward(BoN)
---
 README.md | 40 ++++++++++++++++++++++------------------
 1 file changed, 22 insertions(+), 18 deletions(-)

diff --git a/README.md b/README.md
index a7921e8..97b56db 100644
--- a/README.md
+++ b/README.md
@@ -37,24 +37,28 @@ If you test your model on Physics-IQ and would like your score/paper/model to be

 | **#** | **Model** | **input type** | **Physics-IQ score** | **date added (YYYY-MM-DD)** |
 | -- | --- | --- | --- | --- |
-| 1 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | multiframe (v2v) | **62.6 %** :1st_place_medal: | 2025-10-28 |
-| 2 | [Magi-1](https://arxiv.org/abs/2505.13211) reported [here](https://arxiv.org/pdf/2505.13211) | multiframe (v2v) | **56.0 %** :2nd_place_medal: | 2025-04-21 |
-| 3 | [Sora2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **46.4 %** :3rd_place_medal: | 2026-04-01 |
-| 4 | [Wan2.2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 44.4 % | 2026-04-01 |
-| 5 | [Sora2](https://openai.com/index/sora-2/) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 42.3 % | 2026-04-01 |
-| 6 | [Wan2.2](https://github.com/Wan-Video/Wan2.2) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 38.3 % | 2026-04-01 |
-| 7 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 36.9 % | 2025-10-28 |
-| 8 | [Video-GPT](https://arxiv.org/abs/2505.12489) reported [here](https://arxiv.org/abs/2505.12489) | multiframe (v2v) | 35.0 % | 2025-05-22 |
-| 9 | [CogVideoX-5b](https://github.com/ved015/CogVideoX-5b-Physics_iq_benchmarking) reported [here](https://github.com/ved015/CogVideoX-5b-Physics_iq_benchmarking) | i2v | 32.3 % | 2026-01-06 |
-| 10 | [Magi-1](https://arxiv.org/abs/2505.13211) reported [here](https://arxiv.org/pdf/2505.13211) | i2v | 30.2 % | 2025-04-21 |
-| 11 | [VideoPoet](https://arxiv.org/abs/2312.14125) reported [here](https://arxiv.org/abs/2501.09038) | multiframe (v2v) | 29.5 % | 2025-02-19 |
-| 12 | [Lumiere](https://arxiv.org/abs/2401.12945) reported [here](https://arxiv.org/abs/2501.09038) | multiframe (v2v) | 23.0 % | 2025-02-19 |
-| 13 | [Runway Gen 3](https://runwayml.com/research/introducing-gen-3-alpha) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 22.8 % | 2025-02-19 |
-| 14 | [VideoPoet](https://arxiv.org/abs/2312.14125) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 20.3 % | 2025-02-19 |
-| 15 | [Lumiere](https://arxiv.org/abs/2401.12945) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 19.0 % | 2025-02-19 |
-| 16 | [Stable Video Diffusion](https://arxiv.org/abs/2311.15127) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 14.8 % | 2025-02-19 |
-| 17 | [Pika](https://pika.art/) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 13.0 % | 2025-02-19 |
-| 18 | [Sora](https://openai.com/sora/) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 10.0 % | 2025-02-19 |
+| 1 | [Cosmos3-Super + WMReward (BoN)](https://placeholder) reported [here](https://placeholder) | multiframe (v2v) | **63.4 %** :1st_place_medal: | 2026-05-25 |
+| 2 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | multiframe (v2v) | **62.6 %** :2nd_place_medal: | 2025-10-28 |
+| 3 | [Cosmos3-Super](https://placeholder) reported [here](https://placeholder) | multiframe (v2v) | **57.7 %** :3rd_place_medal: | 2026-05-25 |
+| 4 | [Magi-1](https://arxiv.org/abs/2505.13211) reported [here](https://arxiv.org/pdf/2505.13211) | multiframe (v2v) | 56.0 % | 2025-04-21 |
+| 5 | [Cosmos3-Super + WMReward (BoN)](https://placeholder) reported [here](https://placeholder) | i2v | **48.9 %** :1st_place_medal: | 2026-05-25 |
+| 6 | [Sora2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **46.4 %** :2nd_place_medal: | 2026-04-01 |
+| 7 | [Wan2.2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **44.4 %** :3rd_place_medal: | 2026-04-01 |
+| 8 | [Cosmos3-Super](https://placeholder) reported [here](https://placeholder) | multiframe (v2v) | 43.8 % | 2026-05-25 |
+| 9 | [Sora2](https://openai.com/index/sora-2/) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 42.3 % | 2026-04-01 |
+| 10 | [Wan2.2](https://github.com/Wan-Video/Wan2.2) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 38.3 % | 2026-04-01 |
+| 11 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 36.9 % | 2025-10-28 |
+| 12 | [Video-GPT](https://arxiv.org/abs/2505.12489) reported [here](https://arxiv.org/abs/2505.12489) | multiframe (v2v) | 35.0 % | 2025-05-22 |
+| 13 | [CogVideoX-5b](https://github.com/ved015/CogVideoX-5b-Physics_iq_benchmarking) reported [here](https://github.com/ved015/CogVideoX-5b-Physics_iq_benchmarking) | i2v | 32.3 % | 2026-01-06 |
+| 14 | [Magi-1](https://arxiv.org/abs/2505.13211) reported [here](https://arxiv.org/pdf/2505.13211) | i2v | 30.2 % | 2025-04-21 |
+| 15 | [VideoPoet](https://arxiv.org/abs/2312.14125) reported [here](https://arxiv.org/abs/2501.09038) | multiframe (v2v) | 29.5 % | 2025-02-19 |
+| 16 | [Lumiere](https://arxiv.org/abs/2401.12945) reported [here](https://arxiv.org/abs/2501.09038) | multiframe (v2v) | 23.0 % | 2025-02-19 |
+| 17 | [Runway Gen 3](https://runwayml.com/research/introducing-gen-3-alpha) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 22.8 % | 2025-02-19 |
+| 18 | [VideoPoet](https://arxiv.org/abs/2312.14125) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 20.3 % | 2025-02-19 |
+| 19 | [Lumiere](https://arxiv.org/abs/2401.12945) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 19.0 % | 2025-02-19 |
+| 20 | [Stable Video Diffusion](https://arxiv.org/abs/2311.15127) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 14.8 % | 2025-02-19 |
+| 21 | [Pika](https://pika.art/) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 13.0 % | 2025-02-19 |
+| 22 | [Sora](https://openai.com/sora/) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 10.0 % | 2025-02-19 |

 *Note to early adopters of the benchmark: results from the paper were finalized on February 19, 2025; if you used the toolbox before please re-run since we changed and improved a few aspects. Likewise, if you downloaded the dataset before that date, it is recommended to re-download it, ensuring the ground truth video masks have a duration of five seconds.*

From b51dff897ceb7a66697cfd01c9bbd38b31dacd38 Mon Sep 17 00:00:00 2001
From: Jinwei Gu
Date: Mon, 25 May 2026 14:02:02 -0700
Subject: [PATCH 2/9] Fix typo

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 97b56db..6db64cb 100644
--- a/README.md
+++ b/README.md
@@ -44,7 +44,7 @@ If you test your model on Physics-IQ and would like your score/paper/model to be
 | 5 | [Cosmos3-Super + WMReward (BoN)](https://placeholder) reported [here](https://placeholder) | i2v | **48.9 %** :1st_place_medal: | 2026-05-25 |
 | 6 | [Sora2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **46.4 %** :2nd_place_medal: | 2026-04-01 |
 | 7 | [Wan2.2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **44.4 %** :3rd_place_medal: | 2026-04-01 |
-| 8 | [Cosmos3-Super](https://placeholder) reported [here](https://placeholder) | multiframe (v2v) | 43.8 % | 2026-05-25 |
+| 8 | [Cosmos3-Super](https://placeholder) reported [here](https://placeholder) | i2v | 43.8 % | 2026-05-25 |
 | 9 | [Sora2](https://openai.com/index/sora-2/) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 42.3 % | 2026-04-01 |
 | 10 | [Wan2.2](https://github.com/Wan-Video/Wan2.2) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 38.3 % | 2026-04-01 |
 | 11 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 36.9 % | 2025-10-28 |

From 2a0e40f63c60d18553914dca9bb8f47f2493e1cf Mon Sep 17 00:00:00 2001
From: Jinwei Gu
Date: Mon, 25 May 2026 14:46:32 -0700
Subject: [PATCH 3/9] fix typo

fix typo
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 6db64cb..3e8ea42 100644
--- a/README.md
+++ b/README.md
@@ -39,7 +39,7 @@ If you test your model on Physics-IQ and would like your score/paper/model to be
 | -- | --- | --- | --- | --- |
 | 1 | [Cosmos3-Super + WMReward (BoN)](https://placeholder) reported [here](https://placeholder) | multiframe (v2v) | **63.4 %** :1st_place_medal: | 2026-05-25 |
 | 2 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | multiframe (v2v) | **62.6 %** :2nd_place_medal: | 2025-10-28 |
-| 3 | [Cosmos3-Super](https://placeholder) reported [here](https://placeholder) | multiframe (v2v) | **57.7 %** :3rd_place_medal: | 2026-05-25 |
+| 3 | [Cosmos3-Super](https://placeholder) reported [here](https://placeholder) | multiframe (v2v) | **59.7 %** :3rd_place_medal: | 2026-05-25 |
 | 4 | [Magi-1](https://arxiv.org/abs/2505.13211) reported [here](https://arxiv.org/pdf/2505.13211) | multiframe (v2v) | 56.0 % | 2025-04-21 |
 | 5 | [Cosmos3-Super + WMReward (BoN)](https://placeholder) reported [here](https://placeholder) | i2v | **48.9 %** :1st_place_medal: | 2026-05-25 |
 | 6 | [Sora2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **46.4 %** :2nd_place_medal: | 2026-04-01 |

From 4ea2f9706a10133eb33dda3e88af11ee9a4b96fc Mon Sep 17 00:00:00 2001
From: Jinwei Gu
Date: Mon, 25 May 2026 15:50:25 -0700
Subject: [PATCH 4/9] add Cosmos3-Nano results in

---
 README.md | 42 +++++++++++++++++++++++-------------------
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/README.md b/README.md
index 3e8ea42..c2bc698 100644
--- a/README.md
+++ b/README.md
@@ -40,25 +40,29 @@ If you test your model on Physics-IQ and would like your score/paper/model to be
 | 1 | [Cosmos3-Super + WMReward (BoN)](https://placeholder) reported [here](https://placeholder) | multiframe (v2v) | **63.4 %** :1st_place_medal: | 2026-05-25 |
 | 2 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | multiframe (v2v) | **62.6 %** :2nd_place_medal: | 2025-10-28 |
 | 3 | [Cosmos3-Super](https://placeholder) reported [here](https://placeholder) | multiframe (v2v) | **59.7 %** :3rd_place_medal: | 2026-05-25 |
-| 4 | [Magi-1](https://arxiv.org/abs/2505.13211) reported [here](https://arxiv.org/pdf/2505.13211) | multiframe (v2v) | 56.0 % | 2025-04-21 |
-| 5 | [Cosmos3-Super + WMReward (BoN)](https://placeholder) reported [here](https://placeholder) | i2v | **48.9 %** :1st_place_medal: | 2026-05-25 |
-| 6 | [Sora2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **46.4 %** :2nd_place_medal: | 2026-04-01 |
-| 7 | [Wan2.2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **44.4 %** :3rd_place_medal: | 2026-04-01 |
-| 8 | [Cosmos3-Super](https://placeholder) reported [here](https://placeholder) | i2v | 43.8 % | 2026-05-25 |
-| 9 | [Sora2](https://openai.com/index/sora-2/) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 42.3 % | 2026-04-01 |
-| 10 | [Wan2.2](https://github.com/Wan-Video/Wan2.2) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 38.3 % | 2026-04-01 |
-| 11 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 36.9 % | 2025-10-28 |
-| 12 | [Video-GPT](https://arxiv.org/abs/2505.12489) reported [here](https://arxiv.org/abs/2505.12489) | multiframe (v2v) | 35.0 % | 2025-05-22 |
-| 13 | [CogVideoX-5b](https://github.com/ved015/CogVideoX-5b-Physics_iq_benchmarking) reported [here](https://github.com/ved015/CogVideoX-5b-Physics_iq_benchmarking) | i2v | 32.3 % | 2026-01-06 |
-| 14 | [Magi-1](https://arxiv.org/abs/2505.13211) reported [here](https://arxiv.org/pdf/2505.13211) | i2v | 30.2 % | 2025-04-21 |
-| 15 | [VideoPoet](https://arxiv.org/abs/2312.14125) reported [here](https://arxiv.org/abs/2501.09038) | multiframe (v2v) | 29.5 % | 2025-02-19 |
-| 16 | [Lumiere](https://arxiv.org/abs/2401.12945) reported [here](https://arxiv.org/abs/2501.09038) | multiframe (v2v) | 23.0 % | 2025-02-19 |
-| 17 | [Runway Gen 3](https://runwayml.com/research/introducing-gen-3-alpha) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 22.8 % | 2025-02-19 |
-| 18 | [VideoPoet](https://arxiv.org/abs/2312.14125) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 20.3 % | 2025-02-19 |
-| 19 | [Lumiere](https://arxiv.org/abs/2401.12945) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 19.0 % | 2025-02-19 |
-| 20 | [Stable Video Diffusion](https://arxiv.org/abs/2311.15127) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 14.8 % | 2025-02-19 |
-| 21 | [Pika](https://pika.art/) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 13.0 % | 2025-02-19 |
-| 22 | [Sora](https://openai.com/sora/) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 10.0 % | 2025-02-19 |
+| 4 | [Cosmos3-Nano + WMReward (BoN)](https://placeholder) reported [here](https://placeholder) | multiframe (v2v) | 57.7 % | 2026-05-25 |
+| 5 | [Magi-1](https://arxiv.org/abs/2505.13211) reported [here](https://arxiv.org/pdf/2505.13211) | multiframe (v2v) | 56.0 % | 2025-04-21 |
+| 6 | [Cosmos3-Nano](https://placeholder) reported [here](https://placeholder) | multiframe (v2v) | 50.2% | 2026-05-25 |
+| 7 | [Cosmos3-Super + WMReward (BoN)](https://placeholder) reported [here](https://placeholder) | i2v | **48.9 %** :1st_place_medal: | 2026-05-25 |
+| 8 | [Sora2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **46.4 %** :2nd_place_medal: | 2026-04-01 |
+| 9 | [Wan2.2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **44.4 %** :3rd_place_medal: | 2026-04-01 |
+| 10 | [Cosmos3-Super](https://placeholder) reported [here](https://placeholder) | i2v | 43.8 % | 2026-05-25 |
+| 11 | [Cosmos3-Nano + WMReward (BoN)](https://placeholder) reported [here](https://placeholder) | i2v | 43.8 % | 2026-05-25 |
+| 12 | [Sora2](https://openai.com/index/sora-2/) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 42.3 % | 2026-04-01 |
+| 13 | [Cosmos3-Nano](https://placeholder) reported [here](https://placeholder) | i2v | 40.2% | 2026-05-25 |
+| 14 | [Wan2.2](https://github.com/Wan-Video/Wan2.2) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 38.3 % | 2026-04-01 |
+| 15 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 36.9 % | 2025-10-28 |
+| 16 | [Video-GPT](https://arxiv.org/abs/2505.12489) reported [here](https://arxiv.org/abs/2505.12489) | multiframe (v2v) | 35.0 % | 2025-05-22 |
+| 17 | [CogVideoX-5b](https://github.com/ved015/CogVideoX-5b-Physics_iq_benchmarking) reported [here](https://github.com/ved015/CogVideoX-5b-Physics_iq_benchmarking) | i2v | 32.3 % | 2026-01-06 |
+| 18 | [Magi-1](https://arxiv.org/abs/2505.13211) reported [here](https://arxiv.org/pdf/2505.13211) | i2v | 30.2 % | 2025-04-21 |
+| 19 | [VideoPoet](https://arxiv.org/abs/2312.14125) reported [here](https://arxiv.org/abs/2501.09038) | multiframe (v2v) | 29.5 % | 2025-02-19 |
+| 20 | [Lumiere](https://arxiv.org/abs/2401.12945) reported [here](https://arxiv.org/abs/2501.09038) | multiframe (v2v) | 23.0 % | 2025-02-19 |
+| 21 | [Runway Gen 3](https://runwayml.com/research/introducing-gen-3-alpha) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 22.8 % | 2025-02-19 |
+| 22 | [VideoPoet](https://arxiv.org/abs/2312.14125) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 20.3 % | 2025-02-19 |
+| 23 | [Lumiere](https://arxiv.org/abs/2401.12945) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 19.0 % | 2025-02-19 |
+| 24 | [Stable Video Diffusion](https://arxiv.org/abs/2311.15127) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 14.8 % | 2025-02-19 |
+| 25 | [Pika](https://pika.art/) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 13.0 % | 2025-02-19 |
+| 26 | [Sora](https://openai.com/sora/) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 10.0 % | 2025-02-19 |

 *Note to early adopters of the benchmark: results from the paper were finalized on February 19, 2025; if you used the toolbox before please re-run since we changed and improved a few aspects. Likewise, if you downloaded the dataset before that date, it is recommended to re-download it, ensuring the ground truth video masks have a duration of five seconds.*

From 1e774d101346f14d8f94cce72c4c32143ec0bc21 Mon Sep 17 00:00:00 2001
From: Jinwei Gu
Date: Mon, 25 May 2026 18:38:03 -0700
Subject: [PATCH 5/9] Update URL

Update URL https://research.nvidia.com/publication/2026-05-cosmos3
---
 README.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index c2bc698..3acfbfd 100644
--- a/README.md
+++ b/README.md
@@ -37,19 +37,19 @@ If you test your model on Physics-IQ and would like your score/paper/model to be

 | **#** | **Model** | **input type** | **Physics-IQ score** | **date added (YYYY-MM-DD)** |
 | -- | --- | --- | --- | --- |
-| 1 | [Cosmos3-Super + WMReward (BoN)](https://placeholder) reported [here](https://placeholder) | multiframe (v2v) | **63.4 %** :1st_place_medal: | 2026-05-25 |
+| 1 | [Cosmos3-Super + WMReward (BoN)](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | multiframe (v2v) | **63.4 %** :1st_place_medal: | 2026-05-25 |
 | 2 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | multiframe (v2v) | **62.6 %** :2nd_place_medal: | 2025-10-28 |
-| 3 | [Cosmos3-Super](https://placeholder) reported [here](https://placeholder) | multiframe (v2v) | **59.7 %** :3rd_place_medal: | 2026-05-25 |
-| 4 | [Cosmos3-Nano + WMReward (BoN)](https://placeholder) reported [here](https://placeholder) | multiframe (v2v) | 57.7 % | 2026-05-25 |
+| 3 | [Cosmos3-Super](https://placeholder) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | multiframe (v2v) | **59.7 %** :3rd_place_medal: | 2026-05-25 |
+| 4 | [Cosmos3-Nano + WMReward (BoN)](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | multiframe (v2v) | 57.7 % | 2026-05-25 |
 | 5 | [Magi-1](https://arxiv.org/abs/2505.13211) reported [here](https://arxiv.org/pdf/2505.13211) | multiframe (v2v) | 56.0 % | 2025-04-21 |
 | 6 | [Cosmos3-Nano](https://placeholder) reported [here](https://placeholder) | multiframe (v2v) | 50.2% | 2026-05-25 |
-| 7 | [Cosmos3-Super + WMReward (BoN)](https://placeholder) reported [here](https://placeholder) | i2v | **48.9 %** :1st_place_medal: | 2026-05-25 |
+| 7 | [Cosmos3-Super + WMReward (BoN)](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | i2v | **48.9 %** :1st_place_medal: | 2026-05-25 |
 | 8 | [Sora2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **46.4 %** :2nd_place_medal: | 2026-04-01 |
 | 9 | [Wan2.2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **44.4 %** :3rd_place_medal: | 2026-04-01 |
-| 10 | [Cosmos3-Super](https://placeholder) reported [here](https://placeholder) | i2v | 43.8 % | 2026-05-25 |
-| 11 | [Cosmos3-Nano + WMReward (BoN)](https://placeholder) reported [here](https://placeholder) | i2v | 43.8 % | 2026-05-25 |
+| 10 | [Cosmos3-Super](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | i2v | 43.8 % | 2026-05-25 |
+| 11 | [Cosmos3-Nano + WMReward (BoN)](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | i2v | 43.8 % | 2026-05-25 |
 | 12 | [Sora2](https://openai.com/index/sora-2/) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 42.3 % | 2026-04-01 |
-| 13 | [Cosmos3-Nano](https://placeholder) reported [here](https://placeholder) | i2v | 40.2% | 2026-05-25 |
+| 13 | [Cosmos3-Nano](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | i2v | 40.2% | 2026-05-25 |
 | 14 | [Wan2.2](https://github.com/Wan-Video/Wan2.2) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 38.3 % | 2026-04-01 |
 | 15 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 36.9 % | 2025-10-28 |
 | 16 | [Video-GPT](https://arxiv.org/abs/2505.12489) reported [here](https://arxiv.org/abs/2505.12489) | multiframe (v2v) | 35.0 % | 2025-05-22 |

From 41c7577113314b8b3c49835b70e2513250debb7c Mon Sep 17 00:00:00 2001
From: Jinwei Gu
Date: Mon, 25 May 2026 18:39:14 -0700
Subject: [PATCH 6/9] Update URL

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 3acfbfd..4f02563 100644
--- a/README.md
+++ b/README.md
@@ -39,10 +39,10 @@ If you test your model on Physics-IQ and would like your score/paper/model to be
 | -- | --- | --- | --- | --- |
 | 1 | [Cosmos3-Super + WMReward (BoN)](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | multiframe (v2v) | **63.4 %** :1st_place_medal: | 2026-05-25 |
 | 2 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | multiframe (v2v) | **62.6 %** :2nd_place_medal: | 2025-10-28 |
-| 3 | [Cosmos3-Super](https://placeholder) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | multiframe (v2v) | **59.7 %** :3rd_place_medal: | 2026-05-25 |
+| 3 | [Cosmos3-Super](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | multiframe (v2v) | **59.7 %** :3rd_place_medal: | 2026-05-25 |
 | 4 | [Cosmos3-Nano + WMReward (BoN)](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | multiframe (v2v) | 57.7 % | 2026-05-25 |
 | 5 | [Magi-1](https://arxiv.org/abs/2505.13211) reported [here](https://arxiv.org/pdf/2505.13211) | multiframe (v2v) | 56.0 % | 2025-04-21 |
-| 6 | [Cosmos3-Nano](https://placeholder) reported [here](https://placeholder) | multiframe (v2v) | 50.2% | 2026-05-25 |
+| 6 | [Cosmos3-Nano](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | multiframe (v2v) | 50.2% | 2026-05-25 |
 | 7 | [Cosmos3-Super + WMReward (BoN)](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | i2v | **48.9 %** :1st_place_medal: | 2026-05-25 |
 | 8 | [Sora2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **46.4 %** :2nd_place_medal: | 2026-04-01 |
 | 9 | [Wan2.2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **44.4 %** :3rd_place_medal: | 2026-04-01 |

From 0f7ce3e6fbe8cb88b2dc58159ca44fa6353aed68 Mon Sep 17 00:00:00 2001
From: Jinwei Gu
Date: Mon, 25 May 2026 22:36:43 -0700
Subject: [PATCH 7/9] Update Cosmos3 paper URL

Update Cosmos3 paper URL
---
 README.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index 4f02563..e1b4292 100644
--- a/README.md
+++ b/README.md
@@ -37,19 +37,19 @@ If you test your model on Physics-IQ and would like your score/paper/model to be

 | **#** | **Model** | **input type** | **Physics-IQ score** | **date added (YYYY-MM-DD)** |
 | -- | --- | --- | --- | --- |
-| 1 | [Cosmos3-Super + WMReward (BoN)](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | multiframe (v2v) | **63.4 %** :1st_place_medal: | 2026-05-25 |
+| 1 | [Cosmos3-Super + WMReward (BoN)](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | multiframe (v2v) | **63.4 %** :1st_place_medal: | 2026-05-25 |
 | 2 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | multiframe (v2v) | **62.6 %** :2nd_place_medal: | 2025-10-28 |
-| 3 | [Cosmos3-Super](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | multiframe (v2v) | **59.7 %** :3rd_place_medal: | 2026-05-25 |
-| 4 | [Cosmos3-Nano + WMReward (BoN)](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | multiframe (v2v) | 57.7 % | 2026-05-25 |
+| 3 | [Cosmos3-Super](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | multiframe (v2v) | **59.7 %** :3rd_place_medal: | 2026-05-25 |
+| 4 | [Cosmos3-Nano + WMReward (BoN)](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | multiframe (v2v) | 57.7 % | 2026-05-25 |
 | 5 | [Magi-1](https://arxiv.org/abs/2505.13211) reported [here](https://arxiv.org/pdf/2505.13211) | multiframe (v2v) | 56.0 % | 2025-04-21 |
 | 6 | [Cosmos3-Nano](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | multiframe (v2v) | 50.2% | 2026-05-25 |
-| 7 | [Cosmos3-Super + WMReward (BoN)](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | i2v | **48.9 %** :1st_place_medal: | 2026-05-25 |
+| 7 | [Cosmos3-Super + WMReward (BoN)](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | i2v | **48.9 %** :1st_place_medal: | 2026-05-25 |
 | 8 | [Sora2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **46.4 %** :2nd_place_medal: | 2026-04-01 |
 | 9 | [Wan2.2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **44.4 %** :3rd_place_medal: | 2026-04-01 |
-| 10 | [Cosmos3-Super](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | i2v | 43.8 % | 2026-05-25 |
-| 11 | [Cosmos3-Nano + WMReward (BoN)](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | i2v | 43.8 % | 2026-05-25 |
+| 10 | [Cosmos3-Super](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | i2v | 43.8 % | 2026-05-25 |
+| 11 | [Cosmos3-Nano + WMReward (BoN)](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | i2v | 43.8 % | 2026-05-25 |
 | 12 | [Sora2](https://openai.com/index/sora-2/) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 42.3 % | 2026-04-01 |
-| 13 | [Cosmos3-Nano](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | i2v | 40.2% | 2026-05-25 |
+| 13 | [Cosmos3-Nano](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | i2v | 40.2% | 2026-05-25 |
 | 14 | [Wan2.2](https://github.com/Wan-Video/Wan2.2) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 38.3 % | 2026-04-01 |
 | 15 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 36.9 % | 2025-10-28 |
 | 16 | [Video-GPT](https://arxiv.org/abs/2505.12489) reported [here](https://arxiv.org/abs/2505.12489) | multiframe (v2v) | 35.0 % | 2025-05-22 |

From 9657f0d5e6a54d742e09b8b278f51b08a079fb4c Mon Sep 17 00:00:00 2001
From: Jinwei Gu
Date: Mon, 25 May 2026 22:38:12 -0700
Subject: [PATCH 8/9] Update Cosmos3 paper URL

Update Cosmos3 paper URL
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index e1b4292..5d65ed8 100644
--- a/README.md
+++ b/README.md
@@ -42,7 +42,7 @@ If you test your model on Physics-IQ and would like your score/paper/model to be
 | 3 | [Cosmos3-Super](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | multiframe (v2v) | **59.7 %** :3rd_place_medal: | 2026-05-25 |
 | 4 | [Cosmos3-Nano + WMReward (BoN)](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | multiframe (v2v) | 57.7 % | 2026-05-25 |
 | 5 | [Magi-1](https://arxiv.org/abs/2505.13211) reported [here](https://arxiv.org/pdf/2505.13211) | multiframe (v2v) | 56.0 % | 2025-04-21 |
-| 6 | [Cosmos3-Nano](https://research.nvidia.com/publication/2026-05-cosmos3) reported [here](https://research.nvidia.com/publication/2026-05-cosmos3) | multiframe (v2v) | 50.2% | 2026-05-25 |
+| 6 | [Cosmos3-Nano](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | multiframe (v2v) | 50.2% | 2026-05-25 |
 | 7 | [Cosmos3-Super + WMReward (BoN)](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | i2v | **48.9 %** :1st_place_medal: | 2026-05-25 |
 | 8 | [Sora2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **46.4 %** :2nd_place_medal: | 2026-04-01 |
 | 9 | [Wan2.2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **44.4 %** :3rd_place_medal: | 2026-04-01 |

From e4731937ea26061ce000c9b08aab1c790cfc6367 Mon Sep 17 00:00:00 2001
From: Robert Geirhos <23079422+rgeirhos@users.noreply.github.com>
Date: Tue, 26 May 2026 09:51:26 +0200
Subject: [PATCH 9/9] Update formatting in leaderboard

---
 README.md | 52 ++++++++++++++++++++++++++--------------------------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/README.md b/README.md
index 5d65ed8..a16aae7 100644
--- a/README.md
+++ b/README.md
@@ -37,32 +37,32 @@ If you test your model on Physics-IQ and would like your score/paper/model to be

 | **#** | **Model** | **input type** | **Physics-IQ score** | **date added (YYYY-MM-DD)** |
 | -- | --- | --- | --- | --- |
-| 1 | [Cosmos3-Super + WMReward (BoN)](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | multiframe (v2v) | **63.4 %** :1st_place_medal: | 2026-05-25 |
-| 2 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | multiframe (v2v) | **62.6 %** :2nd_place_medal: | 2025-10-28 |
-| 3 | [Cosmos3-Super](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | multiframe (v2v) | **59.7 %** :3rd_place_medal: | 2026-05-25 |
-| 4 | [Cosmos3-Nano + WMReward (BoN)](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | multiframe (v2v) | 57.7 % | 2026-05-25 |
-| 5 | [Magi-1](https://arxiv.org/abs/2505.13211) reported [here](https://arxiv.org/pdf/2505.13211) | multiframe (v2v) | 56.0 % | 2025-04-21 |
-| 6 | [Cosmos3-Nano](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | multiframe (v2v) | 50.2% | 2026-05-25 |
-| 7 | [Cosmos3-Super + WMReward (BoN)](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | i2v | **48.9 %** :1st_place_medal: | 2026-05-25 |
-| 8 | [Sora2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **46.4 %** :2nd_place_medal: | 2026-04-01 |
-| 9 | [Wan2.2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | **44.4 %** :3rd_place_medal: | 2026-04-01 |
-| 10 | [Cosmos3-Super](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | i2v | 43.8 % | 2026-05-25 |
-| 11 | [Cosmos3-Nano + WMReward (BoN)](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | i2v | 43.8 % | 2026-05-25 |
-| 12 | [Sora2](https://openai.com/index/sora-2/) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 42.3 % | 2026-04-01 |
-| 13 | [Cosmos3-Nano](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | i2v | 40.2% | 2026-05-25 |
-| 14 | [Wan2.2](https://github.com/Wan-Video/Wan2.2) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 38.3 % | 2026-04-01 |
-| 15 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 36.9 % | 2025-10-28 |
-| 16 | [Video-GPT](https://arxiv.org/abs/2505.12489) reported [here](https://arxiv.org/abs/2505.12489) | multiframe (v2v) | 35.0 % | 2025-05-22 |
-| 17 | [CogVideoX-5b](https://github.com/ved015/CogVideoX-5b-Physics_iq_benchmarking) reported [here](https://github.com/ved015/CogVideoX-5b-Physics_iq_benchmarking) | i2v | 32.3 % | 2026-01-06 |
-| 18 | [Magi-1](https://arxiv.org/abs/2505.13211) reported [here](https://arxiv.org/pdf/2505.13211) | i2v | 30.2 % | 2025-04-21 |
-| 19 | [VideoPoet](https://arxiv.org/abs/2312.14125) reported [here](https://arxiv.org/abs/2501.09038) | multiframe (v2v) | 29.5 % | 2025-02-19 |
-| 20 | [Lumiere](https://arxiv.org/abs/2401.12945) reported [here](https://arxiv.org/abs/2501.09038) | multiframe (v2v) | 23.0 % | 2025-02-19 |
-| 21 | [Runway Gen 3](https://runwayml.com/research/introducing-gen-3-alpha) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 22.8 % | 2025-02-19 |
-| 22 | [VideoPoet](https://arxiv.org/abs/2312.14125) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 20.3 % | 2025-02-19 |
-| 23 | [Lumiere](https://arxiv.org/abs/2401.12945) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 19.0 % | 2025-02-19 |
-| 24 | [Stable Video Diffusion](https://arxiv.org/abs/2311.15127) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 14.8 % | 2025-02-19 |
-| 25 | [Pika](https://pika.art/) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 13.0 % | 2025-02-19 |
-| 26 | [Sora](https://openai.com/sora/) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 10.0 % | 2025-02-19 |
+| 1 | [Cosmos3-Super + WMReward (BoN)](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | multiframe (v2v) | **63.4 %** :1st_place_medal:
v2v | 2026-05-26 | +| 2 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | multiframe (v2v) | **62.6 %** :2nd_place_medal: v2v | 2025-10-28 | +| 3 | [Cosmos3-Super](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | multiframe (v2v) | **59.7 %** :3rd_place_medal: v2v | 2026-05-26 | +| 4 | [Cosmos3-Nano + WMReward (BoN)](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | multiframe (v2v) | 57.7 % | 2026-05-26 | +| 5 | [Magi-1](https://arxiv.org/abs/2505.13211) reported [here](https://arxiv.org/pdf/2505.13211) | multiframe (v2v) | 56.0 % | 2025-04-21 | +| 6 | [Cosmos3-Nano](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | multiframe (v2v) | 50.2 % | 2026-05-26 | +| 7 | [Cosmos3-Super + WMReward (BoN)](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | i2v | 48.9 % :1st_place_medal: i2v | 2026-05-26 | +| 8 | [Sora2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 46.4 % :2nd_place_medal: i2v | 2026-04-01 | +| 9 | [Wan2.2 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 44.4 % :3rd_place_medal: i2v | 2026-04-01 | +| 10 | [Cosmos3-Super](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | i2v | 43.8 % | 2026-05-26 | +| 11 | [Cosmos3-Nano + WMReward 
(BoN)](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | i2v | 43.8 % | 2026-05-26 | +| 12 | [Sora2](https://openai.com/index/sora-2/) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 42.3 % | 2026-04-01 | +| 13 | [Cosmos3-Nano](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) reported [here](https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf) | i2v | 40.2 % | 2026-05-26 | +| 14 | [Wan2.2](https://github.com/Wan-Video/Wan2.2) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 38.3 % | 2026-04-01 | +| 15 | [Magi-1 + WMReward (BoN)](https://arxiv.org/abs/2601.10553) reported [here](https://arxiv.org/abs/2601.10553) | i2v | 36.9 % | 2025-10-28 | +| 16 | [Video-GPT](https://arxiv.org/abs/2505.12489) reported [here](https://arxiv.org/abs/2505.12489) | multiframe (v2v) | 35.0 % | 2025-05-22 | +| 17 | [CogVideoX-5b](https://github.com/ved015/CogVideoX-5b-Physics_iq_benchmarking) reported [here](https://github.com/ved015/CogVideoX-5b-Physics_iq_benchmarking) | i2v | 32.3 % | 2026-01-06 | +| 18 | [Magi-1](https://arxiv.org/abs/2505.13211) reported [here](https://arxiv.org/pdf/2505.13211) | i2v | 30.2 % | 2025-04-21 | +| 19 | [VideoPoet](https://arxiv.org/abs/2312.14125) reported [here](https://arxiv.org/abs/2501.09038) | multiframe (v2v) | 29.5 % | 2025-02-19 | +| 20 | [Lumiere](https://arxiv.org/abs/2401.12945) reported [here](https://arxiv.org/abs/2501.09038) | multiframe (v2v) | 23.0 % | 2025-02-19 | +| 21 | [Runway Gen 3](https://runwayml.com/research/introducing-gen-3-alpha) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 22.8 % | 2025-02-19 | +| 22 | [VideoPoet](https://arxiv.org/abs/2312.14125) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 20.3 % | 2025-02-19 | +| 23 | [Lumiere](https://arxiv.org/abs/2401.12945) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 
19.0 % | 2025-02-19 | +| 24 | [Stable Video Diffusion](https://arxiv.org/abs/2311.15127) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 14.8 % | 2025-02-19 | +| 25 | [Pika](https://pika.art/) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 13.0 % | 2025-02-19 | +| 26 | [Sora](https://openai.com/sora/) reported [here](https://arxiv.org/abs/2501.09038) | i2v | 10.0 % | 2025-02-19 | *Note to early adopters of the benchmark: results from the paper were finalized on February 19, 2025; if you used the toolbox before please re-run since we changed and improved a few aspects. Likewise, if you downloaded the dataset before that date, it is recommended to re-download it, ensuring the ground truth video masks have a duration of five seconds.*