Results of VLMs on BabyVision-Gen #6

Some VLMs can now complete BabyVision-Gen tasks through actions such as writing code. Have you tested, or do you plan to test, the performance of non-image-generation VLMs on BabyVision-Gen?

For example, GPT-5.4 Thinking solving a maze: https://chatgpt.com/share/69ae8b45-746c-8003-9b57-b8be05412d2a
