Description
Operating system and version
Ubuntu 22.04
Python environment in which the tool is installed
Python virtual environment created with anaconda/miniconda
Python version
3.10
AISBench tool version
3.0.20251103
AISBench command executed
ais_bench --models vllm_api_stream_chat --datasets textvqa_gen
Contents of the model configuration file or custom configuration file
from ais_bench.benchmark.models import VLLMCustomAPIChatStream
from ais_bench.benchmark.utils.model_postprocessors import extract_non_reasoning_content

models = [
    dict(
        attr="service",
        type=VLLMCustomAPIChatStream,
        abbr='vllm-api-stream-chat',
        path="",
        model="",
        request_rate=0,
        rpm_verbose=False,
        retry=2,
        host_ip="localhost",
        host_port=21116,
        enable_ssl=False,
        max_out_len=20480,
        batch_sizie=1024,
        generation_kwargs=dict(
            temperature=0,
            seed=1234,
            max_tokens=20 * 1024,
        ),
    )
]
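One thing worth checking in a configuration like this is whether every key is spelled the way the tool expects, since a misspelled keyword in a `dict(...)` raises no error by itself. The following is a minimal sketch of such a check. The `EXPECTED_KEYS` set and the `find_suspect_keys` helper are hypothetical and written for illustration only; the key list is an assumption, not taken from the AISBench source.

```python
import difflib

# Assumption: the parameter names this model config is expected to accept.
# This list is illustrative and NOT taken from the AISBench source code.
EXPECTED_KEYS = {
    "attr", "type", "abbr", "path", "model", "request_rate", "rpm_verbose",
    "retry", "host_ip", "host_port", "enable_ssl", "max_out_len",
    "batch_size", "generation_kwargs",
}


def find_suspect_keys(config: dict) -> dict:
    """Map each unexpected key to its closest expected name (or None)."""
    suspects = {}
    candidates = sorted(EXPECTED_KEYS)
    for key in config:
        if key not in EXPECTED_KEYS:
            close = difflib.get_close_matches(key, candidates, n=1)
            suspects[key] = close[0] if close else None
    return suspects


# A reduced version of the config above, keeping only plain-valued keys.
suspects = find_suspect_keys(
    {"retry": 2, "max_out_len": 20480, "batch_sizie": 1024}
)
print(suspects)
```

Under these assumptions the check flags `batch_sizie` and suggests `batch_size`. Whether that key has any bearing on the zero score is not established by this report.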
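Expected behavior
Inference and evaluation complete, and the evaluation score is output.
Actual behavior
After the evaluation completes, the score is 0.
Opening the prediction file, the answers look correct on visual inspection: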
"prediction": "The user wants to know the brand of the camera in the image.\n\n1. Analyze the image: I see a blue and silver disposable camera.\n2. Locate text: There is text printed on the top left of the camera body.\n3. Read the text: The text says \"DAKOTA DIGITAL\" in a blue box, and below it says \"Single-Use Camera\".\n4. Identify the brand: The most prominent brand name is \"Dakota Digital\". There is also a smaller logo at the bottom left that says \"pure digital\", but \"Dakota Digital\" is clearly the primary branding on the front.\n5. Formulate the answer: The user requested a single word or phrase. \"Dakota Digital\" fits this perfectly.\n\n\nDakota Digital",
"gold": [
    {
        "answer": "nous les gosses",
        "answer_confidence": "yes",
        "answer_id": 0
    },
    {
        "answer": "dakota",
        "answer_confidence": "yes",
        "answer_id": 1
    },
    {
        "answer": "clos culombu",
        "answer_confidence": "yes",
        "answer_id": 2
    },
    {
        "answer": "dakota digital",
        "answer_confidence": "yes",
        "answer_id": 3
    },
    {
        "answer": "dakota",
        "answer_confidence": "yes",
        "answer_id": 4
    },
    {
        "answer": "dakota",
        "answer_confidence": "yes",
        "answer_id": 5
    },
    {
        "answer": "dakota digital",
        "answer_confidence": "yes",
        "answer_id": 6
    },
    {
        "answer": "dakota digital",
        "answer_confidence": "yes",
        "answer_id": 7
    },
    {
        "answer": "dakota",
        "answer_confidence": "yes",
        "answer_id": 8
    },
    {
        "answer": "dakota",
        "answer_confidence": "yes",
        "answer_id": 9
    }
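One possible explanation, not confirmed in this report, is that the scorer compares the whole prediction string, reasoning text included, against the gold answers. The sketch below illustrates that hypothesis with a simplified VQA-style accuracy (number of matching annotators divided by 3, capped at 1). Both `vqa_accuracy` and `last_nonempty_line` are hypothetical helpers written for this illustration and are not AISBench functions. The prediction string is abbreviated from the record above.

```python
def vqa_accuracy(prediction: str, gold_answers: list[str]) -> float:
    """Simplified VQA-style accuracy: min(matching annotators / 3, 1)."""
    normalized = prediction.strip().lower()
    matches = sum(
        1 for answer in gold_answers if answer.strip().lower() == normalized
    )
    return min(matches / 3.0, 1.0)


def last_nonempty_line(text: str) -> str:
    """Hypothetical helper: keep only the final non-empty line of a response."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return lines[-1] if lines else ""


# Gold answers copied from the record above.
gold = [
    "nous les gosses", "dakota", "clos culombu", "dakota digital", "dakota",
    "dakota", "dakota digital", "dakota digital", "dakota", "dakota",
]

# Abbreviated form of the prediction above: reasoning first, answer last.
prediction = (
    "The user wants to know the brand of the camera in the image.\n\n"
    "5. Formulate the answer: The user requested a single word or phrase."
    "\n\n\nDakota Digital"
)

full_score = vqa_accuracy(prediction, gold)
final_score = vqa_accuracy(last_nonempty_line(prediction), gold)
print(full_score, final_score)  # prints "0.0 1.0"
```

If the real evaluator behaves like this sketch, the reasoning text would need to be stripped before scoring for the final answer to match. Taking the last line is only one crude way to do that, and it assumes the answer always appears last.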
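Pre-submission checklist
- I have read the Quick Start in the main documentation and could not resolve the problem
- I have searched the FAQ and found no duplicate question
- I have searched the existing issues and found no duplicate
- I have updated to the latest version and the problem persists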