small fixes for LLM config #1130
|
MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅ |
|
@farook-edev @mohitmundhragithub @freedomtan This PR solves the IC offline issue, but LLM accuracy with IFEval still doesn't work. Error: throughput must be a finite number. It must be related to the previous PR, where we reported Throughput as 10^12/mean token output time. Interestingly, this issue shows up only in accuracy mode and not in performance mode. |
This is consistent with my tests conducted at #1098 (comment) and a previous macOS CLI test. That is, the C++ part is mostly working fine. |
|
@freedomtan @Mostelk I ran the app and it errors out at the very end of the MMLU benchmark, which is why there were no IFEval logs. It is related to the Flutter code; I'm working on a fix and will push ASAP. (I suspect it happens in Accuracy mode because no Throughput numbers are reported there.) |
|
Update: I'm still running the test, but it just passed MMLU and is progressing through IFEval. I pushed the fix for further testing. The problem was that token latency was coming through as 0 because the value doesn't exist in loadgen's accuracy logs, and my previous PR takes the reciprocal of that, which is ∞. For some reason, the app emits an error if the throughput is not a finite number, regardless of run mode. Hence the error. |
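The failure mode described above can be sketched as follows. This is only an illustration in Python, not the Flutter fix pushed to this PR: the function names, the `accuracy_mode` flag, and the choice to skip the check in accuracy mode are all assumptions, and the 10^12 scale is taken from the earlier comment in this thread. Note that Python raises `ZeroDivisionError` on `1e12 / 0.0`, whereas IEEE-754 division in Dart or C++ silently yields infinity, so the guard here returns `None` instead of dividing.

```python
import math
from typing import Optional

# Assumption: the 10^12 factor mentioned earlier in this thread.
SCALE = 1e12


def throughput_from_latency(mean_token_latency: float,
                            scale: float = SCALE) -> Optional[float]:
    """Hypothetical helper: return scale / latency, or None when no
    usable latency was logged (e.g. 0 in accuracy-mode logs)."""
    if not math.isfinite(mean_token_latency) or mean_token_latency <= 0:
        return None
    return scale / mean_token_latency


def check_throughput(throughput: Optional[float], accuracy_mode: bool) -> None:
    """Hypothetical validation: only require a finite throughput when the
    run is expected to report one (i.e. not in accuracy mode)."""
    if accuracy_mode:
        return
    if throughput is None or not math.isfinite(throughput):
        raise ValueError(f"throughput must be a finite number: {throughput}")
```

With a guard like this, a latency of 0 in accuracy mode produces no throughput value at all rather than Infinity, while a performance-mode run with a missing value would still be rejected.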
|
I think we can adopt 2 out of 3 commits. Commit 3e51658 was not the reason for the previous crash, so we don't need it. The error still persists: I flutter : Error: throughput must be a finite number: Infinity |
|
@Mostelk I don't believe the CI artifact for the latest commit is built yet. Could you please test once CI is finished? If all is well, I'll re-add the
This reverts commit 3e51658.
|
latest tflite-only CI build (before re-adding |
|
|
@freedomtan @Mostelk @mohitmundhragithub I was able to complete the LLM-1B and LLM-1B-Instruct benchmarks on my S25 in Accuracy mode, and got results for both benchmarks with no errors or crashes on 0492023. The later commit shouldn't affect this result, since it was already tested separately and confirmed to be working. Please test the latest CI artifact and let me know if any errors occur. The APK build number is 802 and can be found here. |
Running the 802 tflite app on Pixel 10 Pro now, will report the results later. update:
|



This PR fixes the incorrect model filename for 3B and 8B benchmarks, and disables all benchmark set options by default.