
Fix illegal memory access caused by missing _kernel_block_sizes attribute in some vLLM versions #330

Open: lianghao208 wants to merge 1 commit into ovg-project:main from lianghao208:lianghao_c++
@lianghao208
Contributor

In vLLM 0.16, `gpu_model_runner` does not have the `self._kernel_block_sizes` attribute, so `kernel_block_sizes = getattr(self, "_kernel_block_sizes", None)` returns `None`. kvcached then incorrectly assumes `ratio = 1`, i.e. that the kernel block size equals the virtual block size. When the virtual block is actually larger than the attention kernel's block, the block-dimension stride computed from it is far too large, and indexing with that stride results in an illegal memory access.
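The failure mode described above can be sketched in a few lines. This is a minimal illustration, not kvcached's actual code: `resolve_ratio` is a hypothetical helper, and the layout of `_kernel_block_sizes` (assumed here to be a list indexed by KV cache group) is an assumption made only for the example.

```python
from types import SimpleNamespace


def resolve_ratio(runner, group_idx, virtual_block_size):
    """Return (kernel_block_size, ratio) using a getattr-with-None fallback.

    Hypothetical helper: it mirrors the pattern described in the PR text,
    not the real kvcached function.
    """
    kernel_block_sizes = getattr(runner, "_kernel_block_sizes", None)
    if kernel_block_sizes is None:
        # Attribute missing: the kernel block size silently defaults to the
        # virtual block size, so the ratio collapses to 1.
        return virtual_block_size, 1
    # Assumption: one entry per KV cache group.
    kernel_block_size = kernel_block_sizes[group_idx]
    return kernel_block_size, virtual_block_size // kernel_block_size


# A runner without the attribute versus one that has it.
runner_without_attr = SimpleNamespace()
runner_with_attr = SimpleNamespace(_kernel_block_sizes=[None, None, None, 16])

print(resolve_ratio(runner_without_attr, 3, 528))  # (528, 1)
print(resolve_ratio(runner_with_attr, 3, 528))     # (16, 33)
```

The silent default is the problem: nothing fails at the point where the attribute is missing, and the wrong ratio only surfaces later as a bad stride.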

For example:

| Variable | Value |
| --- | --- |
| Model | Qwen3.5-9B (hybrid architecture) |
| kv_cache_group[0,1,2] | MambaSpec |
| kv_cache_group[3] | FullAttentionSpec, layers=8, block_size=528 |
| num_kv_heads | 4 |
| head_size | 256 |
| page_size_bytes | 2,162,688 (> 2 MB) |
| virtual block size | 528 |
| FA3 kernel_block_size | 16 (determined by get_supported_kernel_block_sizes()) |

| Variable | Incorrect Value | Correct Value |
| --- | --- | --- |
| kernel_block_size | 528 | 16 |
| ratio | 1 | 33 (= 528 ÷ 16) |
| kernel_kvcache_shape | [2, 2699, 528, 4, 256] | [2, 89067, 16, 4, 256] |
| kvcached's perspective | One FA3 kernel block holds 528 tokens, up to 2,699 blocks | One FA3 kernel block holds 16 tokens, up to 89,067 blocks |
| vLLM's perspective | One FA3 kernel block holds 16 tokens, up to 89,067 blocks | One FA3 kernel block holds 16 tokens, up to 89,067 blocks |
| block dimension stride | 1,081,344 (= 528 × 4 × 256 × 2), which causes the illegal memory access | 32,768 (= 2 × 16 × 4 × 256) |
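The numbers in the tables can be checked with a short calculation. The sketch below assumes a 2-byte element type (which is what makes `page_size_bytes` come out to 2,162,688) and models the buffer as a flat array of elements addressed by `block_index * stride`; that addressing model is a simplification for illustration, not the kernel's real indexing code.

```python
# Values taken from the tables above.
KV_PLANES = 2            # K and V
NUM_KV_HEADS = 4
HEAD_SIZE = 256
VIRTUAL_BLOCK_SIZE = 528
KERNEL_BLOCK_SIZE = 16
NUM_VIRTUAL_BLOCKS = 2699
DTYPE_BYTES = 2          # assumption: 2-byte dtype, consistent with page_size_bytes


def block_stride(block_size: int) -> int:
    """Elements between consecutive blocks for a given tokens-per-block."""
    return KV_PLANES * block_size * NUM_KV_HEADS * HEAD_SIZE


ratio = VIRTUAL_BLOCK_SIZE // KERNEL_BLOCK_SIZE
num_kernel_blocks = NUM_VIRTUAL_BLOCKS * ratio
wrong_stride = block_stride(VIRTUAL_BLOCK_SIZE)
right_stride = block_stride(KERNEL_BLOCK_SIZE)
page_size_bytes = wrong_stride * DTYPE_BYTES
buffer_elems = NUM_VIRTUAL_BLOCKS * wrong_stride  # total elements in the buffer

# vLLM may hand out any kernel block index below num_kernel_blocks.
last_block = num_kernel_blocks - 1
right_offset_ok = last_block * right_stride < buffer_elems
wrong_offset_ok = last_block * wrong_stride < buffer_elems

print(ratio, num_kernel_blocks)       # 33 89067
print(wrong_stride, right_stride)     # 1081344 32768
print(page_size_bytes)                # 2162688
print(right_offset_ok, wrong_offset_ok)  # True False
```

Under this model, the two strides describe the same buffer size (2,699 × 1,081,344 = 89,067 × 32,768 elements), so any block index at or above 2,699 combined with the larger stride points outside the allocation.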

This PR adds proper handling for the case where _kernel_block_sizes is not defined, ensuring the stride is computed correctly across different vLLM versions.
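The diff itself is not shown here, so the following is not the PR's implementation. It is only a sketch of one possible fallback under stated assumptions: `kernel_block_size_for` is a hypothetical helper, `supported_sizes` stands in for whatever the attention backend reports as supported kernel block sizes (assumed to be plain integers), and picking the largest size that evenly divides the virtual block size is an illustrative policy rather than confirmed vLLM behavior.

```python
from types import SimpleNamespace


def kernel_block_size_for(runner, group_idx, virtual_block_size, supported_sizes):
    """Resolve the kernel block size even when the runner lacks the attribute.

    Hypothetical helper for illustration only.
    """
    kernel_block_sizes = getattr(runner, "_kernel_block_sizes", None)
    if kernel_block_sizes is not None:
        # Assumption: one entry per KV cache group.
        return kernel_block_sizes[group_idx]
    # Fallback: derive the size from the backend's supported sizes instead of
    # silently assuming it equals the virtual block size.
    candidates = [s for s in supported_sizes if virtual_block_size % s == 0]
    if not candidates:
        # Nothing divides evenly: keep the old behavior (ratio = 1).
        return virtual_block_size
    return max(candidates)


old_runner = SimpleNamespace()  # no _kernel_block_sizes attribute
size = kernel_block_size_for(old_runner, 3, 528, supported_sizes=[16, 32, 64])
print(size, 528 // size)  # 16 33
```

With the values from the example above, 528 is divisible by 16 but not by 32 or 64, so the fallback recovers the kernel block size of 16 and a ratio of 33.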
