[Bug] Qwen3.5-35B, TextVQA evaluation: predictions look correct on visual inspection, but the evaluation score is 0 #212

@YuhanBai

Operating system and version

Ubuntu 22.04

Python environment where the tool is installed

A Python virtual environment created with anaconda/miniconda

Python version

3.10

AISBench tool version

3.0.20251103

AISBench command executed

ais_bench --models vllm_api_stream_chat --datasets textvqa_gen

Contents of the model config file or custom config file

from ais_bench.benchmark.models import VLLMCustomAPIChatStream
from ais_bench.benchmark.utils.model_postprocessors import extract_non_reasoning_content

models = [
    dict(
        attr="service",
        type=VLLMCustomAPIChatStream,
        abbr='vllm-api-stream-chat',
        path="",
        model="",
        request_rate = 0,
        rpm_verbose = False,
        retry = 2,
        host_ip = "localhost",
        host_port = 21116,
        enable_ssl = False,
        max_out_len = 20480,
        # Spelled as in the original report; the intended key is presumably batch_size.
        batch_sizie=1024,
        generation_kwargs = dict(
            temperature = 0,
            seed = 1234,
            max_tokens = 20*1024,
        ),
    )
]

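Expected behavior

Inference and evaluation complete, and the evaluation score is reported.

Actual behavior

After the evaluation finishes, the score is 0.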

Opening the prediction file, the answer looks correct on visual inspection:
"prediction": "The user wants to know the brand of the camera in the image.\n\n1. Analyze the image: I see a blue and silver disposable camera.\n2. Locate text: There is text printed on the top left of the camera body.\n3. Read the text: The text says \"DAKOTA DIGITAL\" in a blue box, and below it says \"Single-Use Camera\".\n4. Identify the brand: The most prominent brand name is \"Dakota Digital\". There is also a smaller logo at the bottom left that says \"pure digital\", but \"Dakota Digital\" is clearly the primary branding on the front.\n5. Formulate the answer: The user requested a single word or phrase. \"Dakota Digital\" fits this perfectly.\n\n\nDakota Digital",
"gold": [
    {"answer": "nous les gosses", "answer_confidence": "yes", "answer_id": 0},
    {"answer": "dakota", "answer_confidence": "yes", "answer_id": 1},
    {"answer": "clos culombu", "answer_confidence": "yes", "answer_id": 2},
    {"answer": "dakota digital", "answer_confidence": "yes", "answer_id": 3},
    {"answer": "dakota", "answer_confidence": "yes", "answer_id": 4},
    {"answer": "dakota", "answer_confidence": "yes", "answer_id": 5},
    {"answer": "dakota digital", "answer_confidence": "yes", "answer_id": 6},
    {"answer": "dakota digital", "answer_confidence": "yes", "answer_id": 7},
    {"answer": "dakota", "answer_confidence": "yes", "answer_id": 8},
    {"answer": "dakota", "answer_confidence": "yes", "answer_id": 9}
]
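
One possible explanation, offered as a sketch rather than a confirmed diagnosis: if the TextVQA evaluator compares the whole prediction string against the gold answers, as the commonly cited simplified VQA accuracy formula min(1, matches / 3) does, then a prediction that still contains the model's reasoning can never match, even when its final line is correct. The snippet below is not AISBench code. `vqa_accuracy` and `last_nonempty_line` are hypothetical helpers, normalization is reduced to trimming and lowercasing, and the prediction is abbreviated from the one above.

```python
def normalize(text: str) -> str:
    """Simplified normalization (assumption): trim whitespace and lowercase."""
    return text.strip().lower()


def vqa_accuracy(prediction: str, gold_answers: list[str]) -> float:
    """Simplified VQA accuracy: min(1, number of exactly matching gold answers / 3)."""
    pred = normalize(prediction)
    matches = sum(1 for gold in gold_answers if normalize(gold) == pred)
    return min(1.0, matches / 3)


def last_nonempty_line(prediction: str) -> str:
    """Hypothetical post-processing step: keep only the final non-empty line."""
    lines = [line.strip() for line in prediction.splitlines() if line.strip()]
    return lines[-1] if lines else ""


# Gold answers copied from the report above.
GOLD = [
    "nous les gosses", "dakota", "clos culombu", "dakota digital", "dakota",
    "dakota", "dakota digital", "dakota digital", "dakota", "dakota",
]

# Abbreviated version of the reported prediction: reasoning text, then the answer.
PREDICTION = (
    "The user wants to know the brand of the camera in the image.\n\n"
    "5. Formulate the answer: The user requested a single word or phrase.\n\n\n"
    "Dakota Digital"
)

print(vqa_accuracy(PREDICTION, GOLD))                      # 0.0
print(vqa_accuracy(last_nonempty_line(PREDICTION), GOLD))  # 1.0
```

If this is the cause, the reasoning text would need to be stripped before scoring. The config above imports `extract_non_reasoning_content` but never references it, which may be worth checking.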

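Pre-submission checks

  • I have read the Quick Start in the main documentation and it did not resolve the issue
  • I have searched the FAQ and found no duplicate
  • I have searched existing issues and found no duplicate
  • I have updated to the latest version and the problem persists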

Labels: bug, content_check_passed
