Open
Labels
bug (Something isn't working)
Description
At the end of an NVFLARE training run, the error below appears in the FL client logs. It seems that the run directory has been cleaned up before the 'stats pool summary file' can be created. All the results are uploaded to S3, so this is low priority, but it would be good to handle this more gracefully.
This only happens in 'prod' where the directory is actually removed (in 'dev' mode we don't remove it).
@virginiafdez FYI
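One possible way to handle this more gracefully is to check that the run directory still exists before writing the summary, and to treat a missing directory as an expected condition instead of a warning. The following is only a minimal sketch: `write_stats_summary`, its parameters, and the logger name are hypothetical and are not part of NVFLARE's API.

```python
import json
import logging
from pathlib import Path

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("stats_summary_sketch")


def write_stats_summary(run_dir: str, summary: dict,
                        file_name: str = "stats_pool_summary.json") -> bool:
    """Write the summary into run_dir; return False if the directory is gone."""
    target_dir = Path(run_dir)
    if not target_dir.is_dir():
        # The run directory was already cleaned up (as happens in 'prod'),
        # so skip the write quietly instead of raising a warning.
        logger.debug("Run directory %s no longer exists; skipping %s",
                     target_dir, file_name)
        return False
    try:
        (target_dir / file_name).write_text(json.dumps(summary, indent=2))
    except OSError as err:
        # Covers the race where the directory disappears between the check
        # above and the write itself.
        logger.debug("Could not write %s: %s", file_name, err)
        return False
    return True
```

With a helper like this, the caller can ignore a `False` return when the results have already been uploaded elsewhere.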
Log output or error messages
From docker logs of the FL client when running spleen segmentation tutorial:
2026-03-12 10:59:06,737 - FederatedClient - INFO - Starting to push execute result.
2026-03-12 10:59:06,740 - Communicator - INFO - SubmitUpdate to: server. size: 875B (875 Bytes). time: 0.003483 seconds
2026-03-12 10:59:06,740 - ClientRunner - INFO - [identity=Trust_1, run=ec995c28-f742-4166-8f45-b047a34fa30e, peer=net-2, peer_run=ec995c28-f742-4166-8f45-b047a34fa30e, task_name=post_validation, task_id=bce645bd-45a2-4621-b381-ae6bc87ee520] - task result sent to server
2026-03-12 10:59:06,953 - ClientRunner - INFO - [identity=Trust_1, run=ec995c28-f742-4166-8f45-b047a34fa30e, peer=net-2, peer_run=ec995c28-f742-4166-8f45-b047a34fa30e] - received request from Server to end current RUN
2026-03-12 10:59:08,741 - ClientRunner - INFO - [identity=Trust_1, run=ec995c28-f742-4166-8f45-b047a34fa30e] - started end-run events sequence
2026-03-12 10:59:08,741 - ClientRunner - INFO - [identity=Trust_1, run=ec995c28-f742-4166-8f45-b047a34fa30e] - ABOUT_TO_END_RUN fired
2026-03-12 10:59:08,741 - ClientRunner - INFO - [identity=Trust_1, run=ec995c28-f742-4166-8f45-b047a34fa30e] - Firing CHECK_END_RUN_READINESS ...
2026-03-12 10:59:08,741 - ClientRunner - INFO - [identity=Trust_1, run=ec995c28-f742-4166-8f45-b047a34fa30e] - END_RUN fired
2026-03-12 10:59:08,741 - ReliableMessage - INFO - ReliableMessage is shutdown
2026-03-12 10:59:08,744 - conn_manager - INFO - Connection [CN00003 Not Connected] is closed PID: 24
2026-03-12 10:59:08,744 - conn_manager - INFO - Connection [CN00002 Not Connected] is closed PID: 1050
2026-03-12 10:59:08,744 - conn_manager - INFO - Connection [CN00003 Not Connected] is closed PID: 1050
2026-03-12 10:59:08,745 - GrpcDriver - INFO - CLIENT: finished connection [CN00003 Not Connected]
2026-03-12 10:59:08,745 - FederatedClient - INFO - Shutting down client run: Trust_1
2026-03-12 10:59:08,870 - ClientRunner - INFO - [identity=Trust_1, run=ec995c28-f742-4166-8f45-b047a34fa30e] - Client is stopping ...
2026-03-12 10:59:08,872 - worker_process - WARNING - Failed to create stats pool summary file /app/startup/../ec995c28-f742-4166-8f45-b047a34fa30e/stats_pool_summary.json: FileNotFoundError: [Errno 2] No such file or directory: '/app/startup/../ec995c28-f742-4166-8f45-b047a34fa30e/stats_pool_summary.json'
--- Logging error ---
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/logging/handlers.py", line 73, in emit
    if self.shouldRollover(record):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/logging/handlers.py", line 191, in shouldRollover
    self.stream = self._open()
                  ^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1263, in _open
    return open_func(self.baseFilename, self.mode,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/app/ec995c28-f742-4166-8f45-b047a34fa30e/log_fl.txt'
Call stack:
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/app/.venv/lib/python3.12/site-packages/nvflare/private/fed/app/client/worker_process.py", line 207, in <module>
    rc = mpm.run(main_func=main, run_dir=run_dir, args=args)
  File "/app/.venv/lib/python3.12/site-packages/nvflare/fuel/f3/mpm.py", line 154, in run
    rc = main_func(**kwargs)
  File "/app/.venv/lib/python3.12/site-packages/nvflare/private/fed/app/client/worker_process.py", line 140, in main
    logger.warning(err)
Message: "Failed to create stats pool summary file /app/startup/../ec995c28-f742-4166-8f45-b047a34fa30e/stats_pool_summary.json: FileNotFoundError: [Errno 2] No such file or directory: '/app/startup/../ec995c28-f742-4166-8f45-b047a34fa30e/stats_pool_summary.json'"
Arguments: ()
2026-03-12 10:59:11,874 - MPM - INFO - MPM: Good Bye!
2026-03-12 10:59:12,685 - JobExecutor - INFO - run (ec995c28-f742-4166-8f45-b047a34fa30e): child worker process finished with RC 0
Acceptance criteria
- No errors at the end of an NVFLARE run in the FL Client.
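The secondary "Logging error" in the output above comes from a file handler trying to reopen `log_fl.txt` inside the run directory after that directory was removed. One way to avoid it might be to detach any file handlers that point into the run directory before cleaning it up. The sketch below uses only the standard `logging` module; `detach_file_handlers_under` is a hypothetical helper, and where such a call would belong in the cleanup sequence is an assumption.

```python
import logging
import os


def detach_file_handlers_under(directory: str) -> int:
    """Close and remove every FileHandler whose log file lives under directory.

    Returns the number of handler attachments that were removed.
    """
    root = os.path.abspath(directory)
    detached = 0
    # Root logger plus every named logger created so far
    # (loggerDict can also hold placeholders, which are skipped).
    loggers = [logging.getLogger()] + [
        lg for lg in logging.Logger.manager.loggerDict.values()
        if isinstance(lg, logging.Logger)
    ]
    for lg in loggers:
        for handler in list(lg.handlers):
            # RotatingFileHandler is a FileHandler subclass, so it matches too.
            if isinstance(handler, logging.FileHandler):
                path = os.path.abspath(handler.baseFilename)
                if path.startswith(root + os.sep):
                    lg.removeHandler(handler)
                    handler.close()
                    detached += 1
    return detached
```

Calling this just before the run directory is deleted means later log calls go only to the remaining handlers (for example the console), so a late warning no longer triggers a `FileNotFoundError` inside the logging machinery.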
Assign the relevant label(s) and, if known, Milestone to this bug report
- I confirm I have assigned the relevant label(s) and, if known, Milestone to this bug report