Reduce FFmpeg cover-art storms #3109
lukaszwawrzyk wants to merge 3 commits into music-assistant:dev from
Add a TTL-based LRU memory cache (256 entries, 15min TTL) to get_image_data() to avoid redundant fetches of the same image. Deduplicate concurrent in-flight requests via task_id and limit embedded image extractions to 2 concurrent ffmpeg processes using a semaphore. Includes unit tests for cache hits, TTL expiry, eviction, and request deduplication.
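As a rough illustration of the memory cache described above, here is a minimal sketch of an LRU cache with per-entry expiry. The class name `TTLLRUCache` and the injectable `clock` parameter are assumptions made for this example; they are not taken from the PR's code.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional


class TTLLRUCache:
    """Tiny LRU cache whose entries also expire after a fixed TTL (hypothetical helper)."""

    def __init__(
        self,
        max_entries: int = 256,
        ttl: float = 15 * 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_entries = max_entries
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items: OrderedDict = OrderedDict()  # key -> (stored_at, value)

    def get(self, key: str) -> Optional[bytes]:
        entry = self._items.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            # Expired: drop the entry and report a miss.
            del self._items[key]
            return None
        # Fresh hit: mark the entry as most recently used.
        self._items.move_to_end(key)
        return value

    def set(self, key: str, value: bytes) -> None:
        self._items[key] = (self._clock(), value)
        self._items.move_to_end(key)
        while len(self._items) > self._max_entries:
            # Evict the least recently used entry.
            self._items.popitem(last=False)

    def __len__(self) -> int:
        return len(self._items)
```

A caller would check `cache.get(key)` before fetching image data and call `cache.set(key, data)` after a successful fetch.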
|
In general, for decent hardware, this should not give any real issues. If we want to improve this, I'd rather implement an on-disk cache of thumbs (per size) with a small/limited memory cache and a max file size on the thumbnails cache folder. |
|
Well, trust me, it is a significant problem IRL. I have a 4-core Intel J5005. Initially I was using 2 cores for this VM, but I upgraded to 4 and it didn't help at all. It takes whatever there is... I am not even looking at the MASS interface directly. I have a bubble card in my HA dashboard and I use it to change songs. Then, with docker top, I can observe tens of ffmpeg processes spawning and a CPU spike on my machine. Each time I change the song it re-extracts thumbnails. I am using https://github.com/droans/mass-player-card and https://github.com/droans/mass_queue (it is in a hidden popup from bubble cards), so maybe this is bugged? But still, as you say, wouldn't the browser cache these? I have a separate server at home that serves the music over SMB, so it might add to the cost of extracting these images. But it is surprising to me that on each song change so many images are fetched all over again. I can confirm that after the changes from this PR my playback is quite solid and CPU usage never exceeds 100. I also see 0 image extraction processes on song change. Also there is this I wanted to add some context, to maybe help you reason about this error. LMK if you have any extra thoughts. I can reimplement it as a disk cache. |
|
My plan has always been to add an on-disk cache (as you could also see in the comment I left in the code), so if we want to touch this now, I want to go with a disk cache as that makes much more sense. We can also combine it with a limited FIFO cache to cache the last 50 thumbnails or something, but in the end you just do not want to go through ffmpeg every time to grab the thumbnail. When using the normal MA web interface I am 100% confident that the images are cached by the browser but, as you confirmed, this is not the case with any external tools accessing the MA data. Let me know if you want us/me to fix the disk cache or if you'd like to take a stab at it yourself. So something like this: provider + item_id ==> some hash string |
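One possible reading of the "provider + item_id ==> some hash string" idea, as a minimal sketch. The function name `thumbnail_cache_key`, the `/` separator and the choice of SHA-256 are assumptions for illustration only.

```python
import hashlib


def thumbnail_cache_key(provider: str, item_id: str) -> str:
    """Map provider + item_id to a fixed-length, filesystem-safe hex string.

    Hypothetical helper: the separator and hash function are assumptions.
    """
    raw = f"{provider}/{item_id}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()
```

Hashing keeps the key the same length regardless of how long or oddly formed the item path is, so it can double as a file name in a cache folder.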
|
If you have cycles to implement this soon-ish, I'd greatly appreciate that. But if it is likely to wait a few months to get implemented, I'd probably throw AI at the problem, do some review and let you take a look and iterate on this. |
|
Converting to draft so we can focus on those needing review |
Currently we have other priorities, so I would be very thankful if you could take a stab at it 🙏 |
…age data" This reverts commit a676a24.
…tractions
Implement a two-tier caching system for image thumbnails:
- On-disk cache: thumbnails stored as {sha256(provider/path)}_{size}.{ext}
in a "thumbnails" subfolder of the cache directory, surviving restarts.
- In-memory FIFO cache: last 50 thumbnails for instant access on hot paths.
- In-flight deduplication: concurrent requests for the same thumbnail share
a single generation task via create_task, preventing ffmpeg bursts.
This eliminates repeated ffmpeg spawns for embedded cover art extraction,
which caused high CPU usage and playback interruptions especially on
lower-powered hardware and network-mounted music libraries (e.g. SMB).
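
The two cache tiers from this commit message could be sketched roughly as follows. This is not the PR's implementation: the class `ThumbnailCache`, its method names and the default `jpg` extension are assumptions; only the file naming scheme, the "thumbnails" subfolder and the 50-entry FIFO come from the message above.

```python
import hashlib
from collections import OrderedDict
from pathlib import Path
from typing import Optional


class ThumbnailCache:
    """Memory FIFO in front of an on-disk thumbnail folder (hypothetical sketch)."""

    def __init__(self, cache_dir: Path, max_memory_items: int = 50) -> None:
        self._dir = Path(cache_dir) / "thumbnails"
        self._dir.mkdir(parents=True, exist_ok=True)
        self._max = max_memory_items
        self._memory: OrderedDict = OrderedDict()  # file name -> bytes

    def path_for(self, provider: str, path: str, size: int, ext: str = "jpg") -> Path:
        # File name follows {sha256(provider/path)}_{size}.{ext}.
        digest = hashlib.sha256(f"{provider}/{path}".encode("utf-8")).hexdigest()
        return self._dir / f"{digest}_{size}.{ext}"

    def _remember(self, key: str, data: bytes) -> None:
        # Assigning to an existing key keeps its position, so order is insertion order.
        self._memory[key] = data
        while len(self._memory) > self._max:
            self._memory.popitem(last=False)  # FIFO: drop the oldest insert

    def get(self, provider: str, path: str, size: int, ext: str = "jpg") -> Optional[bytes]:
        file_path = self.path_for(provider, path, size, ext)
        if file_path.name in self._memory:
            return self._memory[file_path.name]
        if file_path.is_file():
            data = file_path.read_bytes()
            self._remember(file_path.name, data)
            return data
        return None  # caller has to generate the thumbnail

    def put(self, provider: str, path: str, size: int, data: bytes, ext: str = "jpg") -> None:
        file_path = self.path_for(provider, path, size, ext)
        tmp_path = file_path.with_name(file_path.name + ".tmp")
        tmp_path.write_bytes(data)
        tmp_path.replace(file_path)  # rename so readers never see a partial file
        self._remember(file_path.name, data)
```

A caller would try `get(...)` first and only fall back to ffmpeg, followed by `put(...)`, on a miss.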
a5334dc to c5b7b9c
|
OK, I updated the PR with disk caching. It still helps with my issue. I can see 314 files in |
Background (my setup)
- … ffmpeg processes to extract the same embedded image repeatedly (same/similar tracks on every song change), causing high CPU usage and frequent interruptions in playback, changing song can take a few seconds.
Flow (before)
- GET /imageproxy.
- MetaDataController.handle_imageproxy → get_thumbnail → helpers/images.get_image_thumb.
- get_image_thumb calls helpers/images.get_image_data.
- get_image_data falls back to helpers/tags.get_embedded_image, … ffmpeg (-vcodec mjpeg -f mjpeg -) to extract the picture.
Issue
- helpers/images.get_image_data had no local caching and no in-flight de-duplication [TODO in code]
- … ffmpeg processes for the same file, leading to significant slowdowns (amplified on slower I/O, e.g. SMB mounts).
Fix (this PR)
- … (provider, path) so only one extraction runs and other callers await the same task.
- … ffmpeg bursts even across different files.
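
To make the in-flight de-duplication and the concurrency limit concrete, here is a minimal asyncio sketch under stated assumptions. `ExtractionCoordinator` and the `extract` callable are hypothetical stand-ins (the real code calls ffmpeg through its own helpers); the limit of 2 matches the number mentioned in the original description.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional, Tuple


class ExtractionCoordinator:
    """Share one task per (provider, path) and cap concurrent extractions (sketch)."""

    def __init__(
        self,
        extract: Callable[[str, str], Awaitable[bytes]],  # hypothetical extractor
        max_concurrent: int = 2,
    ) -> None:
        self._extract = extract
        self._max_concurrent = max_concurrent
        self._semaphore: Optional[asyncio.Semaphore] = None
        self._in_flight: Dict[Tuple[str, str], asyncio.Task] = {}

    async def _run(self, key: Tuple[str, str]) -> bytes:
        try:
            # At most max_concurrent extractions run at once, across all files.
            async with self._semaphore:
                return await self._extract(*key)
        finally:
            # Later requests for this key start a fresh extraction.
            self._in_flight.pop(key, None)

    async def get(self, provider: str, path: str) -> bytes:
        if self._semaphore is None:
            # Created lazily so it belongs to the running event loop.
            self._semaphore = asyncio.Semaphore(self._max_concurrent)
        key = (provider, path)
        task = self._in_flight.get(key)
        if task is None:
            task = asyncio.create_task(self._run(key))
            self._in_flight[key] = task
        # shield: cancelling one waiter must not cancel the shared extraction.
        return await asyncio.shield(task)
```

With this shape, ten simultaneous requests for the same track trigger one extraction, and requests for different tracks queue behind the semaphore instead of spawning a burst of processes.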