[NPUW]Fixed high memory footprint with blob cache. #36195

Merged: dmatveev merged 1 commit into openvinotoolkit:master from intelgaoxiong:xiong/fixed_RSS_with_blob_cache on Jun 3, 2026

@intelgaoxiong intelgaoxiong commented Jun 2, 2026

Details:

Running an LLM with the blob cache (import blob) results in a very high memory footprint: 20GB RSS, compared with only 8GB when running the LLM with the default compilation approach.

The root cause is that we exported and imported a blob for every repeating layer. RSS increased after the import for each repeating layer, so the accumulated RSS became very large.
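
To illustrate the root cause above, here is a toy model of the accounting, not the NPUW implementation. It compares importing one blob copy per repeating layer against importing one blob per unique function and reusing it. The names `Layer`, `function_id`, `import_per_layer`, and `import_per_function`, as well as the layer count and blob size, are hypothetical and chosen only for illustration; the PR text does not describe the actual code change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    index: int        # position of the layer in the model
    function_id: str  # hypothetical key: repeating layers share one id


def import_per_layer(layers, blob_size_mb):
    """Import a separate blob copy for every layer; return total MB held."""
    imported = [(layer.index, blob_size_mb) for layer in layers]
    return sum(size for _, size in imported)


def import_per_function(layers, blob_size_mb):
    """Import one blob per unique function; repeating layers reuse it."""
    cache = {}
    for layer in layers:
        # Only the first layer with a given function_id triggers an import.
        cache.setdefault(layer.function_id, blob_size_mb)
    return sum(cache.values())


# Illustrative numbers only: 32 identical layers, 100 MB per imported blob.
layers = [Layer(i, "decoder_block") for i in range(32)]
per_layer_mb = import_per_layer(layers, blob_size_mb=100)
per_function_mb = import_per_function(layers, blob_size_mb=100)
print(per_layer_mb, per_function_mb)
```

In this model the per-layer total grows linearly with the number of repeating layers, while the per-function total stays constant, which matches the accumulation pattern described above.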

Tickets:

AI Assistance:

  • AI assistance used: no / yes
  • If yes, summarize how AI was used and what human validation was performed (build/tests/manual checks).

@intelgaoxiong intelgaoxiong changed the title from "Fixed high memory footprint with blob cache." to "[NPUW]Fixed high memory footprint with blob cache." Jun 2, 2026
@github-actions github-actions Bot added the "category: NPU" (OpenVINO NPU plugin) and "category: NPUW" (NPUW plugin) labels Jun 2, 2026
@intelgaoxiong intelgaoxiong marked this pull request as ready for review June 2, 2026 22:11
@intelgaoxiong intelgaoxiong requested review from a team as code owners June 2, 2026 22:11
@intelgaoxiong intelgaoxiong requested a review from dmatveev June 2, 2026 22:11
@dmatveev dmatveev added this to the 2026.3 milestone Jun 2, 2026
Signed-off-by: intelgaoxiong <xiong.gao@intel.com>
@dmatveev dmatveev added this pull request to the merge queue Jun 3, 2026
Merged via the queue into openvinotoolkit:master with commit a959d01 Jun 3, 2026
343 checks passed
@dmatveev dmatveev deleted the xiong/fixed_RSS_with_blob_cache branch June 3, 2026 09:53

Labels

category: NPU (OpenVINO NPU plugin), category: NPUW (NPUW plugin)

2 participants