The prompt in the evaluation code seems to be wrong #3

@mozhengzy

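In Evaluation/test_benchmark.py, the suffix built by the function get_prompt seems to describe the starting point incorrectly.
For every 2D task the starting point is hardcoded as " the toothpaste on the table", and for every 3D task it is hardcoded as " the pale blue pillow on the sofa which is the second pale blue pillow from the right ". These should be replaced with {object_name}.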

```python
def get_prompt(model_name, task_name, object_name, target_location, prompt):

    if "Gemini" in model_name or "Qwen" in model_name:
        if task_name == "2D":
            prefix = f"You are currently a robot performing robotic manipulation tasks. You have already pick up {object_name}. The task instruction is: move {object_name} to {target_location}."
            suffix = f"Please predict up to 10 key 2D trajectory points starting from the toothpaste on the table to complete the task. Your answer should be formatted as a list of tuples, i.e. [[x1, y1], [x2, y2], ...], where each tuple contains the x and y coordinates of the image. The x and y should be 0-1000 to indicate a point in the image. Output the point coordinates in JSON format."
        elif task_name == "3D":
            prefix = f"You are currently a robot performing robotic manipulation tasks. You have already pick up {object_name}. The task instruction is: move {object_name} to {target_location}."
            suffix = f"Please predict up to 10 key 3D trajectory points starting from the pale blue pillow on the sofa which is the second pale blue pillow from the right to complete the task. Your answer should be formatted as a list of tuples, i.e. [[x1, y1, d1], [x2, y2, d2], ...], where each tuple contains the x, y, and depth coordinates of the image. The x and y should be 0-1000 to indicate a point in the image.  The unit of depth is meters. But you never output unit, just output number. Output the point coordinates in JSON format."
        else:
            raise ValueError(f"Unsupported task: {task_name}")
        full_input_instruction = f"{prefix} {suffix}"
```
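
To make the suggestion concrete, here is a minimal sketch of what the suffix might look like if the starting point were taken from `object_name` instead of being hardcoded. The helper name `build_suffix` is hypothetical and not part of the repository; the prompt wording is copied from the snippet above, and I have not checked how the rest of `test_benchmark.py` uses it.

```python
def build_suffix(task_name: str, object_name: str) -> str:
    """Hypothetical helper: build the prompt suffix with the start point taken from object_name."""
    if task_name == "2D":
        return (
            f"Please predict up to 10 key 2D trajectory points starting from {object_name} "
            "to complete the task. Your answer should be formatted as a list of tuples, "
            "i.e. [[x1, y1], [x2, y2], ...], where each tuple contains the x and y "
            "coordinates of the image. The x and y should be 0-1000 to indicate a point "
            "in the image. Output the point coordinates in JSON format."
        )
    if task_name == "3D":
        return (
            f"Please predict up to 10 key 3D trajectory points starting from {object_name} "
            "to complete the task. Your answer should be formatted as a list of tuples, "
            "i.e. [[x1, y1, d1], [x2, y2, d2], ...], where each tuple contains the x, y, "
            "and depth coordinates of the image. The x and y should be 0-1000 to indicate "
            "a point in the image. The unit of depth is meters. But you never output unit, "
            "just output number. Output the point coordinates in JSON format."
        )
    # Same error behaviour as the original snippet for unknown tasks.
    raise ValueError(f"Unsupported task: {task_name}")


# Usage with a made-up object description (not taken from the benchmark data).
print(build_suffix("2D", "the red cup on the shelf"))
```

With this change the start point in the suffix would always match the object named in the prefix, whatever object the sample actually uses.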
