Reduce FFmpeg cover-art storms#3109

Open
lukaszwawrzyk wants to merge 3 commits into music-assistant:dev from lukaszwawrzyk:img-cache-fix

@lukaszwawrzyk

Background (my setup)

  • Running a recent beta of Music Assistant server.
  • Music library is local at home but accessed by the server over SMB.
  • Many tracks have embedded cover art (and my playlist folders can mix tracks from different albums, each with its own artwork).
  • When skipping/changing tracks, the server spawns many ffmpeg processes that repeatedly extract the same embedded images (the same or similar tracks on every song change). This causes high CPU usage and frequent playback interruptions, and changing songs can take a few seconds.

Flow (before)

  • UI/client requests artwork via GET /imageproxy.
  • MetaDataController.handle_imageproxy → get_thumbnail → helpers/images.get_image_thumb.
  • get_image_thumb calls helpers/images.get_image_data.
  • For local tracks where the image path points to the media file (embedded cover art):
    • get_image_data falls back to helpers/tags.get_embedded_image,
    • which spawns ffmpeg (-vcodec mjpeg -f mjpeg -) to extract the picture.

Issue

  • helpers/images.get_image_data had no local caching and no in-flight de-duplication [TODO in code]
  • Result: every imageproxy request (and especially concurrent ones) could trigger a fresh embedded-art extraction.
  • This caused bursts of ffmpeg processes for the same file, leading to significant slowdowns (amplified on slower I/O, e.g. SMB mounts).

Fix (this PR)

  • Add a small per-server in-memory cache for raw image bytes (TTL + max entries).
  • De-duplicate concurrent requests for the same (provider, path) so only one extraction runs and other callers await the same task.
  • Add a semaphore to cap concurrent embedded-art extractions, preventing ffmpeg bursts even across different files.
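
The de-duplication and semaphore bullets above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `InflightDeduper`, `fetch`, and `MAX_CONCURRENT_EXTRACTIONS` are hypothetical names, and the real change may be structured differently.

```python
from __future__ import annotations

import asyncio
from collections.abc import Awaitable, Callable

# Hypothetical constant for illustration (the commit message mentions a cap of 2).
MAX_CONCURRENT_EXTRACTIONS = 2


class InflightDeduper:
    """Share one running task per (provider, path) and cap concurrent work."""

    def __init__(self, max_concurrent: int = MAX_CONCURRENT_EXTRACTIONS) -> None:
        self._tasks: dict[tuple[str, str], asyncio.Task[bytes]] = {}
        self._semaphore = asyncio.Semaphore(max_concurrent)

    async def _run(self, fetch: Callable[[], Awaitable[bytes]]) -> bytes:
        # At most `max_concurrent` fetches run at once, even for different keys.
        async with self._semaphore:
            return await fetch()

    async def get(
        self, provider: str, path: str, fetch: Callable[[], Awaitable[bytes]]
    ) -> bytes:
        key = (provider, path)
        task = self._tasks.get(key)
        if task is None:
            # First caller for this key starts the work; later callers reuse it.
            task = asyncio.create_task(self._run(fetch))
            self._tasks[key] = task
            # Forget the task once it finishes so a later request can retry.
            task.add_done_callback(lambda _t, k=key: self._tasks.pop(k, None))
        # shield() keeps one cancelled caller from cancelling the shared task.
        return await asyncio.shield(task)
```

In this sketch, five concurrent callers asking for the same (provider, path) trigger a single `fetch()` call, and requests for different files never run more than two fetches at a time.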

Add a TTL-based LRU memory cache (256 entries, 15min TTL) to
get_image_data() to avoid redundant fetches of the same image.
Deduplicate concurrent in-flight requests via task_id and limit
embedded image extractions to 2 concurrent ffmpeg processes
using a semaphore.

Includes unit tests for cache hits, TTL expiry, eviction, and
request deduplication.
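
For readers unfamiliar with the pattern, a TTL-based LRU cache like the one this commit message describes could look roughly like the sketch below. The class name `TTLLRUCache` and the injectable `clock` parameter are assumptions made for illustration and testing; only the 256-entry limit and 15-minute TTL come from the message above.

```python
from __future__ import annotations

import time
from collections import OrderedDict
from collections.abc import Callable


class TTLLRUCache:
    """Small in-memory cache with least-recently-used eviction and expiry."""

    def __init__(
        self,
        max_entries: int = 256,
        ttl_seconds: float = 15 * 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps key -> (expiry timestamp, value); order tracks recency of use.
        self._data: OrderedDict[str, tuple[float, bytes]] = OrderedDict()

    def get(self, key: str) -> bytes | None:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._data[key]
            return None
        # Mark as most recently used.
        self._data.move_to_end(key)
        return value

    def set(self, key: str, value: bytes) -> None:
        self._data[key] = (self._clock() + self._ttl, value)
        self._data.move_to_end(key)
        # Evict least recently used entries beyond the size limit.
        while len(self._data) > self._max_entries:
            self._data.popitem(last=False)
```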
@OzGav OzGav added the bugfix label Feb 7, 2026
@marcelveldt
Member

In general, on decent hardware this should not cause any real issues.
Browsers should cache the data and only request the thumbnail once.

If we want to improve this, I'd rather implement an on-disk cache of thumbs (per size) with a small/limited memory cache and a max file size on the thumbnails cache folder.
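
One way to read the size limit on the thumbnails folder is as a cap on the folder's total size, enforced by deleting the oldest files first. The sketch below assumes that reading; `prune_cache_dir` and `max_total_bytes` are hypothetical names and nothing here is taken from the Music Assistant codebase.

```python
from pathlib import Path


def prune_cache_dir(cache_dir: Path, max_total_bytes: int) -> int:
    """Delete the oldest files until the folder fits; return how many were removed."""
    # Collect (stat, path) pairs once so sizes and mtimes are consistent.
    entries = [(p.stat(), p) for p in cache_dir.iterdir() if p.is_file()]
    entries.sort(key=lambda entry: entry[0].st_mtime)  # oldest first
    total = sum(st.st_size for st, _ in entries)
    removed = 0
    for st, path in entries:
        if total <= max_total_bytes:
            break
        path.unlink(missing_ok=True)
        total -= st.st_size
        removed += 1
    return removed
```

Such a function could run periodically or after each write; which of the two fits better depends on how often thumbnails are generated.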

@lukaszwawrzyk
Author

Well, trust me, it is a significant problem IRL. I have a 4-core Intel J5005. Initially I was using 2 cores for this VM, but I upgraded to 4 and it didn't help at all. It takes whatever there is... I am not even looking at the MASS interface directly. I have a bubble card in my HA dashboard and I use it to change songs. Then, with docker top, I can observe tens of ffmpeg processes spawning and a CPU spike on my machine. Each time I change songs, it re-extracts the thumbnails.

I am using https://github.com/droans/mass-player-card and https://github.com/droans/mass_queue (it is in hidden popup from bubble cards), so maybe this is bugged? But still, as you say, wouldn't the browser cache these?

I have a separate server at home that serves the music over SMB, so it might add to the cost of extracting these images. But it is surprising to me that so many images are fetched all over again on each song change. I can confirm that with the changes from this PR my playback is quite solid and CPU usage never exceeds 100. I also see 0 image extraction processes on song change. Also, there is this # TODO: add local cache here ! comment, so I believe some cache is necessary.

I wanted to add some context to maybe help you reason about this issue. LMK if you have any extra thoughts. I can reimplement it as a disk cache.

@marcelveldt
Member

My plan has always been to add an on-disk cache (as you could also see in the comment I left in the code), so if we want to touch this now, I want to go with a disk cache, as that makes much more sense. We can also combine it with a limited FIFO cache to hold the last 50 thumbnails or so, but in the end you just do not want to go through ffmpeg every time to grab the thumbnail.

When using the normal MA web interface, I am 100% confident that the images are cached by the browser, but as you confirmed, this is not the case with external tools accessing the MA data.

Let me know if you want us/me to implement the disk cache or whether you'd like to take a stab at it yourself.
I think we need to create a safe hash of the path to store on disk and append the size.

So something like this:

provider + item_id ==> some hash string
thumbnails folder (subfolder of data dir)
hash_{size}.jpg
etc.
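
The naming scheme outlined above could be sketched as below. The function name `thumbnail_cache_path` and its parameters are hypothetical, and SHA-256 is just one reasonable choice for "some hash string"; the comment itself does not prescribe an algorithm.

```python
import hashlib
from pathlib import Path


def thumbnail_cache_path(data_dir: Path, provider: str, item_id: str, size: int) -> Path:
    """Map (provider, item_id, size) to a filesystem-safe thumbnail path."""
    # Hashing avoids problems with slashes, spaces, or very long paths in item_id.
    digest = hashlib.sha256(f"{provider}/{item_id}".encode("utf-8")).hexdigest()
    # Assumed layout: <data_dir>/thumbnails/<hash>_<size>.jpg
    return data_dir / "thumbnails" / f"{digest}_{size}.jpg"
```

Because the hash depends only on provider and item id, every size of the same image shares a filename prefix, which makes it easy to find or invalidate all thumbnails for one item.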

@lukaszwawrzyk
Author

If you have cycles to implement this soon-ish, I'd greatly appreciate that. But if it is likely to wait a few months, I'd probably throw AI at the problem, do some review, and let you take a look and iterate on it.
In any case, I can definitely help with testing/reproducing the issue.

@OzGav OzGav removed the bugfix label Feb 11, 2026
@OzGav
Contributor

OzGav commented Feb 11, 2026

Converting to draft so we can focus on those needing review

@OzGav OzGav marked this pull request as draft February 11, 2026 11:36
@MarvinSchenkel
Contributor

> If you have cycles to implement soon-ish, I'd greatly appreciate that. But if it is likely to wait a few months to get implemented, I'd probably throw AI at the problem, do some review and let you take a look and iterate with this. In any case I can definitely help with testing/reproducing issue.

Currently we have other priorities, so I would be very thankful if you could take a stab at it 🙏

…tractions

Implement a two-tier caching system for image thumbnails:

- On-disk cache: thumbnails stored as {sha256(provider/path)}_{size}.{ext}
  in a "thumbnails" subfolder of the cache directory, surviving restarts.
- In-memory FIFO cache: last 50 thumbnails for instant access on hot paths.
- In-flight deduplication: concurrent requests for the same thumbnail share
  a single generation task via create_task, preventing ffmpeg bursts.

This eliminates repeated ffmpeg spawns for embedded cover art extraction,
which caused high CPU usage and playback interruptions especially on
lower-powered hardware and network-mounted music libraries (e.g. SMB).
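
The lookup order of the two tiers described in this commit message (memory first, then disk, then generate) can be illustrated with a simplified synchronous sketch. `TwoTierThumbCache`, `generate`, and the temp-file write are assumptions for illustration; the actual implementation is async, de-duplicates in-flight tasks, and may differ in detail.

```python
from __future__ import annotations

from collections import OrderedDict
from collections.abc import Callable
from pathlib import Path


class TwoTierThumbCache:
    """In-memory FIFO cache in front of an on-disk thumbnail folder."""

    def __init__(self, cache_dir: Path, max_memory_entries: int = 50) -> None:
        self._dir = cache_dir
        self._dir.mkdir(parents=True, exist_ok=True)
        self._max = max_memory_entries
        self._memory: OrderedDict[str, bytes] = OrderedDict()

    def _remember(self, name: str, data: bytes) -> None:
        if name not in self._memory:
            self._memory[name] = data
            # FIFO: drop the oldest inserted entry, regardless of recent use.
            while len(self._memory) > self._max:
                self._memory.popitem(last=False)

    def get(self, name: str, generate: Callable[[], bytes]) -> bytes:
        # Tier 1: memory.
        data = self._memory.get(name)
        if data is not None:
            return data
        # Tier 2: disk (survives restarts).
        path = self._dir / name
        if path.is_file():
            data = path.read_bytes()
        else:
            # Miss on both tiers: do the expensive work once and persist it.
            data = generate()
            tmp = path.with_name(path.name + ".tmp")
            tmp.write_bytes(data)
            tmp.replace(path)  # rename so readers never see a partial file
        self._remember(name, data)
        return data
```

In an asyncio server the blocking file reads and writes would normally be moved off the event loop, for example with `asyncio.to_thread`.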
@lukaszwawrzyk lukaszwawrzyk marked this pull request as ready for review February 17, 2026 22:39
@lukaszwawrzyk
Copy link
Author

OK, I updated the PR with disk caching. It still fixes my issue: I can see 314 files in /data/.cache/thumbnails/ and no ffmpeg image extractions on song change.
