Commit 36aa8be
ssjia
[ET-VK][runtime] Add prepack cache to avoid duplicate weight prepacking
Pull Request resolved: #18361
When embedding and linear ops share tied weights and both use the same
prepacking function (prepack_quantized_linear_weight), the weight gets
prepacked twice, wasting GPU memory. Add a cache to ComputeGraph keyed
by (input ValueRef, kernel name) that returns the already-prepacked
tensor on cache hit, avoiding the duplicate allocation.
ghstack-source-id: 355397958
@exported-using-ghexport
Differential Revision: [D97430801](https://our.internmc.facebook.com/intern/diff/D97430801/)
1 parent b5e7462 commit 36aa8be
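
The cache described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `PrepackCache`, `get_or_prepack`, and the `int32_t` alias for `ValueRef` are hypothetical names chosen for the example, and the actual change adds the cache to `ComputeGraph` in a form not captured here. The sketch only shows the idea of keying on (input ValueRef, kernel name) and returning the already-prepacked value on a hit.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the graph's value handle; the real type may differ.
using ValueRef = int32_t;

// Hypothetical helper class; in the commit the cache is a member of ComputeGraph.
class PrepackCache {
 public:
  // Returns the prepacked ref cached for (input, kernel_name). On a miss,
  // calls `prepack` exactly once and stores its result under that key.
  ValueRef get_or_prepack(
      ValueRef input,
      const std::string& kernel_name,
      const std::function<ValueRef()>& prepack) {
    const auto key = std::make_pair(input, kernel_name);
    const auto it = cache_.find(key);
    if (it != cache_.end()) {
      return it->second; // cache hit: reuse the existing prepacked tensor
    }
    const ValueRef packed = prepack(); // cache miss: allocate and prepack
    cache_.emplace(key, packed);
    return packed;
  }

  // Number of distinct (input, kernel name) pairs prepacked so far.
  std::size_t size() const {
    return cache_.size();
  }

 private:
  std::map<std::pair<ValueRef, std::string>, ValueRef> cache_;
};
```

With tied weights, the embedding op and the linear op would both pass the same input ref and the same kernel name, so the second call returns the first call's result instead of allocating again. Including the kernel name in the key keeps a weight that is prepacked by two different kernels from being wrongly shared.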
3 files changed
Lines changed: 64 additions & 5 deletions
Lines changed: 14 additions & 5 deletions