[Executorch][llm] Add support for ring kv cache and ring attention #10608
Introduces CachePositionManager, which tracks the sequence position currently stored in each slot of the ring KV cache. These positions are used to generate the attention mask. Differential Revision: [D73891427](https://our.internmc.facebook.com/intern/diff/D73891427/) [ghstack-poisoned]
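For readers unfamiliar with the idea, here is a minimal sketch of how per-slot position tracking can drive mask generation in a ring cache. It is not the code from this PR: the class name `RingCachePositions`, its methods, the `window` parameter, and the use of plain Python lists instead of tensors are all assumptions made for illustration.

```python
class RingCachePositions:
    """Hypothetical stand-in for a cache position tracker (plain Python, no torch)."""

    def __init__(self, cache_size: int) -> None:
        self.cache_size = cache_size
        # slot_pos[i] is the sequence position stored in slot i; -1 means "empty".
        self.slot_pos = [-1] * cache_size

    def update(self, start_pos: int, seq_len: int) -> list:
        """Record seq_len new tokens starting at start_pos; return the slots written."""
        if seq_len > self.cache_size:
            raise ValueError("seq_len must not exceed cache_size")
        slots = []
        for offset in range(seq_len):
            pos = start_pos + offset
            slot = pos % self.cache_size  # wrap around: the oldest entry is overwritten
            self.slot_pos[slot] = pos
            slots.append(slot)
        return slots

    def mask(self, start_pos: int, seq_len: int, window: int) -> list:
        """mask[q][slot] is True when the query at start_pos + q may attend to that slot."""
        rows = []
        for offset in range(seq_len):
            q_pos = start_pos + offset
            rows.append([
                # Filled slot, not in the future, and within the attention window.
                p >= 0 and p <= q_pos and q_pos - p < window
                for p in self.slot_pos
            ])
        return rows


mgr = RingCachePositions(cache_size=4)
mgr.update(0, 3)  # positions 0, 1, 2 land in slots 0, 1, 2
mgr.update(3, 3)  # positions 3, 4, 5 land in slots 3, 0, 1
print(mgr.slot_pos)                 # [4, 5, 2, 3]
print(mgr.mask(5, 1, window=3))     # [[True, True, False, True]]
```

Because slots are reused, slot order no longer matches token order, so the mask has to be computed from the stored positions rather than from slot indices. That is the role the description assigns to the position tracker.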
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10608
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure
As of commit c5cdb0c with merge base bf50527, the following job has failed (NEW FAILURE):
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D73891427
3539275 into gh/kimishpatel/185/base
…10832) Pull Request resolved: #10608 Introduced CachePositionManager to keep track of what is the position for each slot in ring kv cache. This is used to generate mask. ghstack-source-id: 283404678 @exported-using-ghexport Differential Revision: [D73891427](https://our.internmc.facebook.com/intern/diff/D73891427/)