DyLLM

DyLLM selects salient tokens after attention to remove redundant computation in the FFN, and uses approximate attention to lighten the attention operation. Without hurting the accuracy of the original implementation, DyLLM achieves ~9.6x higher throughput.

How to install

conda create --name dyllm python=3.10 -y
conda activate dyllm
bash setup_env.sh

How to run

python run.py

Algorithm

After the attention context operation, DyLLM computes the cosine similarity between each token's context activation and the same token's activation from the previous step. If the similarity is smaller than the given threshold $\tau$, the token is selected as a salient token. Only the salient tokens are computed in the FFN, which significantly reduces the computational overhead.
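The selection rule above can be illustrated with a minimal, dependency-free sketch. It is not the code in this repository: the function names `cosine_similarity` and `select_salient_tokens`, the list-of-lists activation layout, and the handling of zero vectors are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: a zero vector counts as "changed" (similarity 0).
        return 0.0
    return dot / (norm_a * norm_b)


def select_salient_tokens(curr_ctx, prev_ctx, tau):
    """Return indices of tokens whose context activation changed enough.

    A token is salient when the cosine similarity between its current and
    previous-step context activation is smaller than tau.
    """
    return [
        i
        for i, (curr, prev) in enumerate(zip(curr_ctx, prev_ctx))
        if cosine_similarity(curr, prev) < tau
    ]


# Toy activations for three tokens (hypothetical values).
prev_ctx = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
curr_ctx = [[1.0, 0.1], [1.0, 0.0], [2.0, 2.0]]

salient = select_salient_tokens(curr_ctx, prev_ctx, tau=0.99)
print(salient)  # only token 1 changed direction enough to be salient
```

In this toy input, token 0 barely moves and token 2 only changes magnitude, so both stay above the threshold; only token 1 would be sent through the FFN.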

We further reduce the runtime by focusing on response tokens. DyLLM picks salient tokens mainly from the response tokens and attends over the whole sentence only periodically.
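One way to read this schedule is sketched below. This is an interpretation, not the repository's implementation: the helper `candidate_positions`, its parameters (`step`, `prompt_len`, `seq_len`, `full_period`), and the choice of a fixed period are hypothetical.

```python
def candidate_positions(step, prompt_len, seq_len, full_period):
    """Positions considered at a given step.

    Assumption: every `full_period` steps the whole sentence (prompt plus
    response) is processed; on all other steps only response positions are.
    """
    if step % full_period == 0:
        return list(range(seq_len))          # periodic pass over everything
    return list(range(prompt_len, seq_len))  # response tokens only


# Hypothetical setup: 2 prompt tokens, 3 response tokens, full pass every 4 steps.
for step in range(5):
    print(step, candidate_positions(step, prompt_len=2, seq_len=5, full_period=4))
```

With these toy values, steps 0 and 4 cover all five positions, while steps 1 to 3 cover only the three response positions.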

Overall Comparison

Commands to reproduce

bash ./scripts/run_gsm8k_acc_llada.sh # accuracy test
bash ./scripts/run_gsm8k_llada.sh # throughput test

Citation

If you find our code useful, please cite our paper.

@inproceedings{dyllm2026,
    title={Dy{LLM}: Efficient Diffusion {LLM} inference via saliency-based token selection and partial attention},
    author={Younjoo Lee and Seungkyun Dan and Junghoo Lee and Jaiyoung Park and {Jung Ho} Ahn},
    booktitle={Forty-third International Conference on Machine Learning},
    year={2026},
    url={https://openreview.net/forum?id=0azUrmsSyA}
}
