Commit 182be0e
cuda: add per-session mutable state rebinding (#20241)
Local agent serving needs to host multiple logical conversations on one
CUDA-resident model without multiplying the model weights. Loading one
AOTI module per conversation is not viable for large local models, while
sharing the default mutable state across conversations would let
KV/recurrent/conv buffers bleed between users.
This adds the CUDA-private foundation for separating those concerns:
weights remain owned by the loaded AOTI container, while mutable buffer
FQNs can be registered as per-session state and rebound before
execution. The path is fail-closed and dormant until a model opts in by
creating a mutable-state context and validating coverage, so existing
CUDA models keep their current behavior.
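
A minimal sketch of the register, validate, rebind flow described above may help make the fail-closed behavior concrete. Every name here (`MutableStateContext`, `Session`, `Status`, `register_fqn`, `validate_coverage`, `rebind`) is hypothetical and is not taken from this commit, and host `std::vector<float>` storage stands in for CUDA device buffers so the example stays self-contained.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not the API added by the commit.
enum class Status { Ok, NotValidated, UncoveredBuffer, MissingBuffer };

// One conversation's private copies of the mutable buffers, keyed by FQN.
// Assumption: host memory stands in for device allocations.
struct Session {
  std::map<std::string, std::vector<float>> buffers;
};

class MutableStateContext {
 public:
  // Mark a buffer FQN as per-session state.
  void register_fqn(const std::string& fqn) {
    registered_.insert(fqn);
    validated_ = false;  // a changed registration needs a fresh coverage check
  }

  // Coverage check: every mutable buffer the model reports must be registered.
  Status validate_coverage(const std::vector<std::string>& model_mutable_fqns) {
    for (const auto& fqn : model_mutable_fqns) {
      if (registered_.count(fqn) == 0) return Status::UncoveredBuffer;
    }
    validated_ = true;
    return Status::Ok;
  }

  // Rebind every registered FQN to this session's buffer before execution.
  // Fail closed: refuse if coverage was never validated or a buffer is absent.
  Status rebind(Session& session) {
    if (!validated_) return Status::NotValidated;
    std::map<std::string, float*> next;
    for (const auto& fqn : registered_) {
      auto it = session.buffers.find(fqn);
      if (it == session.buffers.end()) return Status::MissingBuffer;
      next[fqn] = it->second.data();
    }
    bound_ = std::move(next);  // commit only after every FQN resolved
    return Status::Ok;
  }

  // Pointer currently bound for an FQN, or nullptr if none.
  const float* bound(const std::string& fqn) const {
    auto it = bound_.find(fqn);
    return it == bound_.end() ? nullptr : it->second;
  }

 private:
  std::set<std::string> registered_;
  std::map<std::string, float*> bound_;
  bool validated_ = false;
};
```

In this sketch the weights never appear: only the registered mutable buffers are swapped per session, and a context that was never created or never validated leaves nothing bound, which mirrors the dormant-until-opt-in behavior described in the message.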
The branch also wires the new source and fail-closed unit test into both
Buck and CMake so the primitive can land independently before any
model-specific engine consumes it.
#200011 parent 99ca02f commit 182be0e
6 files changed
Lines changed: 1783 additions & 6 deletions
File tree
- backends/cuda
- runtime
- test