small fixes for LLM config by farook-edev · Pull Request #1130 · mlcommons/mobile_app_open

farook-edev · 2026-04-20T09:33:24Z

This PR fixes the incorrect model filename for 3B and 8B benchmarks, and disables all benchmark set options by default.

github-actions · 2026-04-20T09:33:35Z

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

Mostelk · 2026-04-21T09:52:56Z

@farook-edev @mohitmundhragithub @freedomtan This PR solves the IC offline issue, but LLM accuracy with IFEval still doesn't work. Error: "throughput must be a finite number". It must be related to the previous PR, where we started reporting throughput as 10^12 / mean token output time. Interestingly, this issue shows up only in accuracy mode and not in performance mode.

freedomtan · 2026-04-21T13:58:47Z

@farook-edev @mohitmundhragithub @freedomtan This PR solves the IC offline issue, but LLM accuracy with IFEval still doesn't work. Error: "throughput must be a finite number". It must be related to the previous PR, where we started reporting throughput as 10^12 / mean token output time. Interestingly, this issue shows up only in accuracy mode and not in performance mode.

This is consistent with my tests conducted at #1098 (comment) and a previous macOS CLI test. That is, the C++ part is mostly working fine.

farook-edev · 2026-04-21T16:35:26Z

@freedomtan @Mostelk I ran the app and it errors out at the very end of the MMLU benchmark, which is why there were no IFEval logs. It is related to the Flutter code. I'm working on a fix and will push ASAP. (I suspect it happens in Accuracy mode because no throughput numbers are getting reported.)

farook-edev · 2026-04-21T18:27:48Z

Update: I'm still running the test, but it just passed MMLU and is progressing through IFEval.

I pushed the fix for further testing.

The problem was that token latency was coming through as 0, because the value doesn't exist in loadgen's accuracy logs. My previous PR takes the reciprocal of that value, which is ∞.

For some reason, the app emits an error if the throughput is not a finite number, regardless of run mode. Hence the error.
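
The failure mode described above can be sketched as follows. This is a minimal illustration in Python, not the app's actual Dart code: the names `SCALE`, `tokens_per_second`, and `check_throughput` are hypothetical, and the 10^12 factor is taken from the discussion without assuming what unit it implies. Note that a floating-point division by zero yields Infinity in Dart, whereas Python raises an exception, so the guard here is explicit.

```python
import math
from typing import Optional

# Scale factor quoted in the thread ("10^12 / mean token output time").
# The unit of the latency value is not stated there, so none is assumed here.
SCALE = 1e12


def tokens_per_second(mean_token_latency: float) -> Optional[float]:
    """Return the throughput, or None when no usable latency was logged.

    In accuracy mode the latency is absent from the logs and is read as 0,
    so the reciprocal must not be taken.
    """
    if not math.isfinite(mean_token_latency) or mean_token_latency <= 0:
        return None
    return SCALE / mean_token_latency


def check_throughput(throughput: Optional[float]) -> None:
    """Hypothetical stand-in for the finiteness check that produced the error."""
    if throughput is not None and not math.isfinite(throughput):
        raise ValueError(f"throughput must be a finite number: {throughput}")


# Accuracy mode: latency is 0, so no throughput is reported and the check passes.
check_throughput(tokens_per_second(0.0))
```

Returning `None` instead of a placeholder number keeps a missing measurement distinguishable from a real one. Whether the actual fix does this or substitutes a default value is not stated in this thread.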

Mostelk · 2026-04-21T18:46:04Z

I think we can adopt 2 out of 3 commits. Commit 3e51658 was not the reason for the previous crash, so we don't need it. The error still persists:

I flutter : Error: throughput must be a finite number: Infinity
I flutter : #0 new RunInfo (package:mlperfbench/benchmark/run_info.dart:19)

farook-edev · 2026-04-21T18:49:26Z

@Mostelk I don't believe the CI artifact for the latest commit is built yet. Could you please test once CI is finished? If all is well, I'll re-add the accuracy.txt part.

This reverts commit 3e51658.

farook-edev · 2026-04-21T20:17:59Z

The latest tflite-only CI build (before re-adding accuracy.txt) is available here; the others are available from Action 151's artifacts.

sonarqubecloud · 2026-04-21T20:50:37Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

farook-edev · 2026-04-21T21:39:13Z

@freedomtan @Mostelk @mohitmundhragithub I was able to complete the LLM-1B and LLM-1B-Instruct benchmarks on my S25 in Accuracy mode, and got results for both benchmarks with no errors or crashes on 0492023. The later commit shouldn't affect this result, since it was already tested and confirmed to be working separately.

Please test the latest CI artifact and let me know if any errors occur. APK Build number is 802 and can be found here.

freedomtan · 2026-04-22T01:13:59Z

@freedomtan @Mostelk @mohitmundhragithub I was able to complete the LLM-1B and LLM-1B-Instruct benchmarks on my S25 in Accuracy mode, and got results for both benchmarks with no errors or crashes on 0492023. The later commit shouldn't affect this result, since it was already tested and confirmed to be working separately.

Please test the latest CI artifact and let me know if any errors occur. APK Build number is 802 and can be found here.

Running the 802 tflite app on Pixel 10 Pro now, will report the results later.

update:

Yes, accuracy mode for 1B model finished running and reported expected numbers.

freedomtan

LGTM.

small fixes for LLM config

2ef4af7

farook-edev requested review from a team and anhappdev as code owners April 20, 2026 09:33

farook-edev mentioned this pull request Apr 20, 2026

Update v6.0 LLM Implementation #1098

Open

10 tasks

farook-edev added 2 commits April 21, 2026 08:22

undo accuracy.txt writing

3e51658

re-added min-query-count line to driver

2783cc8

farook-edev mentioned this pull request Apr 21, 2026

v6.0 LLM Loadgen does not query all samples for Offline mode #1127

Open

fixed division by 0 in accuracy mode

0492023

Revert "undo accuracy.txt writing"

ef3d1ea

This reverts commit 3e51658.

freedomtan approved these changes Apr 22, 2026

View reviewed changes

farook-edev merged commit 5e5472d into submission-v6.0 Apr 22, 2026
29 of 30 checks passed

farook-edev deleted the config-fix branch April 22, 2026 05:03

github-actions Bot locked and limited conversation to collaborators Apr 22, 2026


farook-edev merged 5 commits into submission-v6.0 from config-fix
