Results of VLMs on BabyVision-Gen #6

@agoyang

Description

Some VLMs can now complete BabyVision-Gen tasks through actions such as writing code, rather than generating images directly. Have you tested, or do you plan to test, non-image-generation VLMs on BabyVision-Gen?

For example, GPT-5.4 Thinking solving a maze:
https://chatgpt.com/share/69ae8b45-746c-8003-9b57-b8be05412d2a
