[llm] Add a generic text only LLM runner by larryliu0820 · Pull Request #11342 · pytorch/executorch

larryliu0820 · 2025-06-03T23:08:01Z

Stack from ghstack (oldest at bottom):

-> [llm] Add a generic text only LLM runner #11342

This PR introduces text_llm_runner, a generic runner for all text-only, decoder-only LLMs supported by ExecuTorch.

Metadata is read from the .pte file and used to construct the runner object.
examples/models/llama/runner.h[.cpp] now contains only a simple wrapper around text_llm_runner.h[.cpp].
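
To illustrate the idea of building a runner from metadata stored with the model, here is a minimal, self-contained sketch. It does not use the ExecuTorch API: `ExportedMetadata`, `read_metadata`, `TextRunner`, and `make_llama_runner` are hypothetical names, and the metadata keys and default values are assumptions chosen for illustration, not taken from this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the metadata stored in a .pte file.
// A real runner would query the loaded program instead of a plain map.
using ExportedMetadata = std::unordered_map<std::string, int64_t>;

// Illustrative keys and fallback values (assumptions, not from the PR).
inline ExportedMetadata default_metadata() {
  return {
      {"get_max_seq_len", 128},
      {"get_vocab_size", 32000},
      {"use_kv_cache", 1},
  };
}

// Start from the defaults and overwrite each known key that the model
// file provides. Keys the runner does not know about are ignored.
inline ExportedMetadata read_metadata(const ExportedMetadata& exported) {
  ExportedMetadata metadata = default_metadata();
  for (auto& entry : metadata) {
    auto it = exported.find(entry.first);
    if (it != exported.end()) {
      entry.second = it->second;
    }
  }
  return metadata;
}

// Hypothetical generic runner: configured purely from metadata, so it
// contains no model-specific constants.
class TextRunner {
 public:
  explicit TextRunner(ExportedMetadata metadata)
      : metadata_(std::move(metadata)) {
    // Sanity check on the resolved configuration.
    assert(metadata_.at("get_max_seq_len") > 0);
  }

  int64_t max_seq_len() const { return metadata_.at("get_max_seq_len"); }
  int64_t vocab_size() const { return metadata_.at("get_vocab_size"); }
  bool use_kv_cache() const { return metadata_.at("use_kv_cache") != 0; }

 private:
  ExportedMetadata metadata_;
};

// A model-specific entry point reduces to a thin wrapper that forwards
// to the generic runner.
inline TextRunner make_llama_runner(const ExportedMetadata& exported) {
  return TextRunner(read_metadata(exported));
}
```

In this sketch, calling `make_llama_runner` with a map that only sets `get_max_seq_len` yields a runner that uses that value and falls back to the defaults for everything else. The design choice it illustrates is that model-specific configuration lives with the exported model rather than in the runner code.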

In follow-up PRs I will move examples/models/phi-3-mini/runner to the generic runner.

I will also look into the QNN and MediaTek runners.

Differential Revision: D75910889

ghstack-source-id: 288004703
Pull Request resolved: #11342

pytorch-bot · 2025-06-03T23:08:05Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11342

❌ 2 New Failures, 1 Cancelled Job, 1 Unrelated Failure

As of commit 04dbca7 with merge base b2c02fe:

NEW FAILURES - The following jobs have failed:

pull / test-llava-runner-linux / linux-job (gh)
RuntimeError: Command docker exec -t de24e39974a1b210231018ee8ba7c2e4383ef5fc2dba4afee82e157540f0e7ee /exec failed with exit code 139
trunk / test-arm-cortex-m-size-test / linux-job (gh)
RuntimeError: Command docker exec -t e3003c783cb7f90061a6349f5a7a88ba44a560c8b67714cb10f6d3deab996e38 /exec failed with exit code 1

CANCELLED JOB - The following job was cancelled. Please retry:

trunk / test-llama-runner-linux (bf16, portable, linux.2xlarge, executorch-ubuntu-22.04-clang12) / linux-job (gh)

BROKEN TRUNK - The following job failed but was already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / android / run-emulator (gh) (trunk failure)
The process '/usr/bin/sh' failed with exit code 255

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-06-03T23:08:15Z