[Bug]: worker_process - WARNING - Failed to create stats pool summary file /app/startup/.. #166

@atriaybagur

Description

At the end of an NVFLARE training run, the warning below appears in the FL client logs. It seems that the run directory is cleaned up before the stats pool summary file can be created. All the results are uploaded to S3, so this is low priority, but it would be good to handle it more gracefully.

This only happens in 'prod' mode, where the directory is actually removed (in 'dev' mode we don't remove it).
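
One possible way to handle this more gracefully is to check that the run directory still exists before writing the summary, and to skip the write quietly if it is gone. The sketch below is only an illustration of that idea: `write_stats_summary` and the logger name are hypothetical and are not part of NVFLARE, whose actual write happens inside `worker_process.py`.

```python
import json
import logging
from pathlib import Path

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("worker_process_sketch")


def write_stats_summary(run_dir: str, summary: dict) -> bool:
    """Write stats_pool_summary.json into run_dir if the directory still exists.

    Returns True when the file was written, False when the write was skipped.
    """
    target_dir = Path(run_dir)
    if not target_dir.is_dir():
        # The run directory was already cleaned up (the 'prod' case): skip quietly.
        logger.debug("Run directory %s is gone; skipping stats summary", run_dir)
        return False
    try:
        (target_dir / "stats_pool_summary.json").write_text(json.dumps(summary))
    except FileNotFoundError:
        # The directory vanished between the check and the write.
        logger.debug("Run directory %s removed during write; skipping", run_dir)
        return False
    return True
```

An alternative would be to postpone the directory cleanup until the worker process has exited, which would avoid the race rather than tolerate it.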

@virginiafdez FYI

Log output or error messages

From the Docker logs of the FL client when running the spleen segmentation tutorial:

2026-03-12 10:59:06,737 - FederatedClient - INFO - Starting to push execute result.
2026-03-12 10:59:06,740 - Communicator - INFO - SubmitUpdate to: server. size: 875B (875 Bytes). time: 0.003483 seconds
2026-03-12 10:59:06,740 - ClientRunner - INFO - [identity=Trust_1, run=ec995c28-f742-4166-8f45-b047a34fa30e, peer=net-2, peer_run=ec995c28-f742-4166-8f45-b047a34fa30e, task_name=post_validation, task_id=bce645bd-45a2-4621-b381-ae6bc87ee520] - task result sent to server
2026-03-12 10:59:06,953 - ClientRunner - INFO - [identity=Trust_1, run=ec995c28-f742-4166-8f45-b047a34fa30e, peer=net-2, peer_run=ec995c28-f742-4166-8f45-b047a34fa30e] - received request from Server to end current RUN
2026-03-12 10:59:08,741 - ClientRunner - INFO - [identity=Trust_1, run=ec995c28-f742-4166-8f45-b047a34fa30e] - started end-run events sequence
2026-03-12 10:59:08,741 - ClientRunner - INFO - [identity=Trust_1, run=ec995c28-f742-4166-8f45-b047a34fa30e] - ABOUT_TO_END_RUN fired
2026-03-12 10:59:08,741 - ClientRunner - INFO - [identity=Trust_1, run=ec995c28-f742-4166-8f45-b047a34fa30e] - Firing CHECK_END_RUN_READINESS ...
2026-03-12 10:59:08,741 - ClientRunner - INFO - [identity=Trust_1, run=ec995c28-f742-4166-8f45-b047a34fa30e] - END_RUN fired
2026-03-12 10:59:08,741 - ReliableMessage - INFO - ReliableMessage is shutdown
2026-03-12 10:59:08,744 - conn_manager - INFO - Connection [CN00003 Not Connected] is closed PID: 24
2026-03-12 10:59:08,744 - conn_manager - INFO - Connection [CN00002 Not Connected] is closed PID: 1050
2026-03-12 10:59:08,744 - conn_manager - INFO - Connection [CN00003 Not Connected] is closed PID: 1050
2026-03-12 10:59:08,745 - GrpcDriver - INFO - CLIENT: finished connection [CN00003 Not Connected]
2026-03-12 10:59:08,745 - FederatedClient - INFO - Shutting down client run: Trust_1
2026-03-12 10:59:08,870 - ClientRunner - INFO - [identity=Trust_1, run=ec995c28-f742-4166-8f45-b047a34fa30e] - Client is stopping ...
2026-03-12 10:59:08,872 - worker_process - WARNING - Failed to create stats pool summary file /app/startup/../ec995c28-f742-4166-8f45-b047a34fa30e/stats_pool_summary.json: FileNotFoundError: [Errno 2] No such file or directory: '/app/startup/../ec995c28-f742-4166-8f45-b047a34fa30e/stats_pool_summary.json'
--- Logging error ---
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/logging/handlers.py", line 73, in emit
    if self.shouldRollover(record):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/logging/handlers.py", line 191, in shouldRollover
    self.stream = self._open()
                  ^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1263, in _open
    return open_func(self.baseFilename, self.mode,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/app/ec995c28-f742-4166-8f45-b047a34fa30e/log_fl.txt'
Call stack:
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/app/.venv/lib/python3.12/site-packages/nvflare/private/fed/app/client/worker_process.py", line 207, in <module>
    rc = mpm.run(main_func=main, run_dir=run_dir, args=args)
  File "/app/.venv/lib/python3.12/site-packages/nvflare/fuel/f3/mpm.py", line 154, in run
    rc = main_func(**kwargs)
  File "/app/.venv/lib/python3.12/site-packages/nvflare/private/fed/app/client/worker_process.py", line 140, in main
    logger.warning(err)
Message: "Failed to create stats pool summary file /app/startup/../ec995c28-f742-4166-8f45-b047a34fa30e/stats_pool_summary.json: FileNotFoundError: [Errno 2] No such file or directory: '/app/startup/../ec995c28-f742-4166-8f45-b047a34fa30e/stats_pool_summary.json'"
Arguments: ()
2026-03-12 10:59:11,874 - MPM - INFO - MPM: Good Bye!
2026-03-12 10:59:12,685 - JobExecutor - INFO - run (ec995c28-f742-4166-8f45-b047a34fa30e): child worker process finished with RC 0
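
The secondary "--- Logging error ---" traceback in the log above comes from the rotating file handler trying to reopen `log_fl.txt` inside the run directory that has already been removed. A minimal sketch of one way to avoid it, assuming the file handlers are attached to a logger we can reach at shutdown, is to detach any file handler whose target directory no longer exists before emitting the final messages. `drop_stale_file_handlers` is a hypothetical helper, not an NVFLARE function.

```python
import logging
import os


def drop_stale_file_handlers(logger: logging.Logger) -> int:
    """Detach file handlers whose log directory has been deleted.

    Returns the number of handlers that were removed.
    """
    removed = 0
    # Iterate over a copy, because removeHandler mutates logger.handlers.
    for handler in list(logger.handlers):
        if not isinstance(handler, logging.FileHandler):
            continue  # Console and other handlers are left untouched.
        log_dir = os.path.dirname(handler.baseFilename)
        if not os.path.isdir(log_dir):
            logger.removeHandler(handler)
            handler.close()
            removed += 1
    return removed
```

If the handlers are attached to the root logger, the call would be `drop_stale_file_handlers(logging.getLogger())`. Remaining console handlers would still print the final messages to the Docker logs.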

Acceptance criteria

  • No errors or warnings are logged by the FL client at the end of an NVFLARE run.

Assign the relevant label(s) and, if known, Milestone to this bug report

  • I confirm I have assigned the relevant label(s) and, if known, Milestone to this bug report

Metadata

Assignees: No one assigned
Labels: bug (Something isn't working)
Projects: No projects
Milestone: No milestone
