2026-04-28 13:41:48,416 - omlx.scheduler - INFO - [-] - Loaded 1 additional EOS token(s) from generation_config.json: {248044}
2026-04-28 13:41:48,416 - omlx.scheduler - INFO - [-] - Enlarging paged cache block_size=256 to 2048 for ArraysCache hybrid model (reduces boundary snapshot overhead)
2026-04-28 13:41:48,416 - omlx.scheduler - INFO - [-] - paged SSD-only mode: max_blocks=100000, block_size=2048 tokens
2026-04-28 13:41:48,417 - omlx.cache.paged_cache - INFO - [-] - PagedCacheManager initialized: block_size=2048, initial_blocks=256, max_blocks=100000, max_tokens=204800000
2026-04-28 13:41:48,417 - omlx.cache.paged_ssd_cache - INFO - [-] - Scanning SSD cache directory: /Users/jarhead/.omlx/cache
2026-04-28 13:41:48,532 - omlx.cache.paged_ssd_cache - INFO - [-] - SSD cache scan complete: scanned=485, indexed=0, errors=0, total_size=0 B
2026-04-28 13:41:48,532 - omlx.cache.paged_ssd_cache - INFO - [-] - PagedSSDCacheManager initialized: dir=/Users/jarhead/.omlx/cache, max_size=372.00 GB, hot_cache=12.00 GB, existing_files=0, disk_free=104.59 GB, cache_used=0 B
2026-04-28 13:41:48,532 - omlx.cache.paged_cache - INFO - [-] - paged SSD cache manager connected to PagedCacheManager
2026-04-28 13:41:48,532 - omlx.cache.prefix_cache - INFO - [-] - PagedSSDCacheManager connected to BlockAwarePrefixCache
2026-04-28 13:41:48,532 - omlx.scheduler - INFO - [-] - paged SSD cache enabled: cache_dir=/Users/jarhead/.omlx/cache, max_size=372.00 GB, block_size=2048 tokens
2026-04-28 13:41:48,532 - omlx.scheduler - INFO - [-] - paged SSD cache enabled: /Users/jarhead/.omlx/cache, block_size=2048, max_blocks=100000
2026-04-28 13:41:48,533 - omlx.engine_core - INFO - [-] - Engine started
2026-04-28 13:41:48,535 - omlx.patches.turboquant_attention - INFO - [-] - TurboQuant attention patch applied
2026-04-28 13:41:48,535 - omlx.engine.vlm - INFO - [-] - TurboQuant KV cache enabled for VLM: 4.0 bits
2026-04-28 13:41:48,582 - omlx.engine.vlm - INFO - [-] - VLM tool calling enabled: parser=qwen3_coder
2026-04-28 13:41:48,585 - omlx.engine.vlm - INFO - [-] - VLMBatchedEngine loaded: /Users/jarhead/.omlx/models/Qwen3.6-35B-A3B-qx86-hi-mlx
2026-04-28 13:41:48,585 - omlx.engine.dflash - INFO - [-] - DFlash fallback engine started: vlm
2026-04-28 13:41:49,462 - omlx.patches.gated_delta_advance - INFO - [-] - [gdn-body-replacement] call #1 B,S=1,2048 cache=set cache.lengths=None
2026-04-28 13:41:49,633 - omlx.patches.qwen3_5_attention - INFO - [-] - [qwen3_5-attn-patch] plain-rope call #1 B,L=1,2048 cache_offset=0
2026-04-28 13:41:53,893 - omlx.patches.gated_delta_advance - INFO - [-] - [gdn-body-replacement] call #100 B,S=1,2048 cache=set cache.lengths=None
2026-04-28 13:42:04,027 - omlx.patches.qwen3_5_attention - INFO - [-] - [qwen3_5-attn-patch] plain-rope call #100 B,L=1,2048 cache_offset=18432
2026-04-28 13:43:15,710 - omlx.patches.gated_delta_advance - INFO - [-] - [gdn-body-replacement] call #1000 B,S=1,2048 cache=set cache.lengths=None
2026-04-28 13:43:29,905 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] Aborting request 96fe74f0-91c7-4732-9973-f7e7d2a52f30
2026-04-28 13:43:34,071 - omlx.scheduler - INFO - [-] - Prefill interrupted at 75776/116377 tokens: 1 request(s) aborted
2026-04-28 13:49:23,949 - omlx.patches.qwen3_5_attention - INFO - [-] - [qwen3_5-attn-patch] plain-rope call #1000 B,L=1,1 cache_offset=array([116323], dtype=int32)
2026-04-28 13:49:28,895 - omlx.scheduler - INFO - [-] - Using boundary cache snapshot for ccc541af-6812-45ab-ac00-1b1fd6313ea8: storing 114688/116448 tokens (skipping trailing partial block, 55 intermediate snapshots)
2026-04-28 13:49:30,399 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] GeneratorExit for request ccc541af-6812-45ab-ac00-1b1fd6313ea8
2026-04-28 13:49:48,313 - omlx.scheduler - INFO - [-] - Using boundary cache snapshot for 1ed5772f-50af-4256-884c-f61148ab8737: storing 116736/117054 tokens (skipping trailing partial block, 0 intermediate snapshots)
2026-04-28 13:49:48,431 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] GeneratorExit for request 1ed5772f-50af-4256-884c-f61148ab8737
2026-04-28 13:49:58,688 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] GeneratorExit for request e69f2fb3-89b2-402d-8089-0c6612ab95d3
2026-04-28 13:50:13,362 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] GeneratorExit for request 16f19737-0dfe-466f-a8f4-5356d38c1012
2026-04-28 13:50:30,830 - omlx.scheduler - INFO - [-] - Using boundary cache snapshot for 82324925-613f-4131-bb96-58d2df92924f: storing 118784/118988 tokens (skipping trailing partial block, 0 intermediate snapshots)
2026-04-28 13:50:30,940 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] GeneratorExit for request 82324925-613f-4131-bb96-58d2df92924f
2026-04-28 13:50:41,449 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] GeneratorExit for request 9e2891ae-27ce-4503-be66-98486671a523
2026-04-28 13:50:55,483 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] GeneratorExit for request 6daf58ff-df66-4732-a0cf-801803123f87
2026-04-28 13:51:11,374 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] GeneratorExit for request 0d9bf662-d596-4eb0-b5bc-873b21c7876e
2026-04-28 13:52:00,415 - omlx.scheduler - INFO - [-] - Using boundary cache snapshot for f14ee070-ea21-4662-b8fd-4e5e79620800: storing 120832/122282 tokens (skipping trailing partial block, 0 intermediate snapshots)
2026-04-28 13:52:00,543 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] GeneratorExit for request f14ee070-ea21-4662-b8fd-4e5e79620800
2026-04-28 13:52:34,739 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] GeneratorExit for request 818c9f80-cecc-4d77-9d66-67137948b7cb
2026-04-28 13:52:49,511 - omlx.scheduler - INFO - [-] - Using boundary cache snapshot for 72112a44-fd18-41ee-a07c-12c9f006c4b3: storing 122880/123018 tokens (skipping trailing partial block, 0 intermediate snapshots)
2026-04-28 13:52:49,633 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] GeneratorExit for request 72112a44-fd18-41ee-a07c-12c9f006c4b3
2026-04-28 13:53:16,791 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] GeneratorExit for request aa5978e6-84eb-4c50-a998-ba76825ee0a1
2026-04-28 13:54:56,832 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] GeneratorExit for request 19c6871b-5b57-4da9-b770-81d5a5673c76
2026-04-28 13:55:09,757 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] GeneratorExit for request f4249fa6-ccf9-4ab7-900b-753fc5905bcb
2026-04-28 13:55:25,361 - omlx.scheduler - INFO - [-] - Using boundary cache snapshot for e1152b6a-e805-4510-9fbd-591fdb319f4d: storing 124928/125660 tokens (skipping trailing partial block, 0 intermediate snapshots)
2026-04-28 13:55:25,482 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] GeneratorExit for request e1152b6a-e805-4510-9fbd-591fdb319f4d
2026-04-28 13:55:39,571 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] GeneratorExit for request d912767e-3503-4c2e-abf9-c681b41b3902
2026-04-28 13:55:58,444 - omlx.engine.vlm - INFO - [-] - [vlm_stream_generate] GeneratorExit for request 753d9c9a-698e-4273-87ad-c13639115a8f