Conversation
In my case I kept finding celery running at 100% and doing nothing. py-spy pointed to close_open_fds, and then ulimit inside the container showed the gory detail:

```
❯ docker run -it --rm --entrypoint bash dandiarchive/dandiarchive-api -c "ulimit -n"
1073741816
```

The situation is not unique to me. See more at dandi/dandi-cli#1488
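For context, here is a minimal sketch (not billiard's exact code) of the kind of fd cleanup that gets keyed off the reported limit; with SC_OPEN_MAX around 10^9, closing "all possible" descriptors means on the order of a billion close() calls, which is where the worker spends its CPU:

```python
import os

# Hypothetical illustration of fd cleanup driven by the reported limit.
def close_all_fds_up_to_limit(keep=()):
    fdmax = os.sysconf('SC_OPEN_MAX')   # ~1073741816 in the container above
    for fd in range(3, fdmax):          # skip stdin/stdout/stderr
        if fd in keep:
            continue
        try:
            os.close(fd)                # one syscall per possible descriptor
        except OSError:
            pass                        # most fds are not open; EBADF is expected
```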
hm, why is pre-commit.ci even configured if there is no …
auvipy left a comment:
Let's ignore the pre-commit. Can you elaborate more on the change, please? Also, should we consider adding some tests to verify the proposed changes?
I would be happy to elaborate! ATM I can only reiterate what I tried to describe in the original description -- on some systems ulimit would return a HUGE number for the maximal number of open descriptors, which would be infeasible to loop through. So billiard should not try to loop through all the possible billion of them.
Maybe we can add some unit tests for the suggested changes as well.
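In case it helps, a rough sketch of what such a test could look like, assuming the cap and warning land in billiard.compat.get_fdmax and that it consults os.sysconf('SC_OPEN_MAX') first (the monkeypatching details are illustrative):

```python
import warnings
from unittest import mock

from billiard.compat import get_fdmax


def test_get_fdmax_caps_huge_sysconf_value():
    # Simulate the container above, where SC_OPEN_MAX reports ~1e9.
    with mock.patch('os.sysconf', return_value=1073741816):
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter('always')
            # With the proposed change, the passed default should come back
            # instead of the billion-descriptor limit.
            assert get_fdmax(default=2048) == 2048
    # ...and a warning should have been emitted about the capping.
    assert caught
```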
Pull Request Overview
The PR adds logic to cap and warn about excessively high file descriptor limits returned by get_fdmax, preventing performance issues when iterating open descriptors.
- Capture `os.sysconf('SC_OPEN_MAX')` into `fdmax` and handle errors uniformly
- Introduce a threshold (100k) to cap `fdmax` to a sensible default (either the passed `default` or 10 000) and emit a warning
- Import `warnings` and emit a deprecation-style warning instead of returning an oversized limit (sketched below)
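Put together, the capped helper described above could look roughly like this; this is a simplified paraphrase of the diff further down, not the exact code in the PR (the real get_fdmax also has a resource.getrlimit fallback and differs in details):

```python
import os
import warnings


def get_fdmax(default=None):
    # Simplified sketch of the capping behaviour described above.
    try:
        fdmax = os.sysconf('SC_OPEN_MAX')
    except (AttributeError, ValueError, OSError):
        return default
    if fdmax >= 1e5:
        # limiting value is ad-hoc and already more than sensible
        if default:
            fdmax_limited = default
            msg = "the default"
        else:
            fdmax_limited = 10000
            msg = "10000"
        warnings.warn(
            "Reported maximum number of open file descriptors (%d) is "
            "unreasonably high; capping to %s (%d)."
            % (fdmax, msg, fdmax_limited))
        return fdmax_limited
    return fdmax
```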
Comments suppressed due to low confidence (1)
billiard/compat.py:118
- Consider adding unit tests for the new high-value cap branch to verify that large `fdmax` values produce the expected warning and capped return value.

`if fdmax >= 1e5:`
```python
if fdmax >= 1e5:
    # limiting value is ad-hoc and already more than sensible
    if default:
        fdmax_limited = default
        msg = "the default"
    else:
        fdmax_limited = 10000
```
The magic numbers 1e5 and 10000 should be extracted to named constants for clarity and easier future tuning.
Suggested change:

```diff
-if fdmax >= 1e5:
+if fdmax >= FD_MAX_LIMIT_THRESHOLD:
     # limiting value is ad-hoc and already more than sensible
     if default:
         fdmax_limited = default
         msg = "the default"
     else:
-        fdmax_limited = 10000
+        fdmax_limited = FD_MAX_DEFAULT_LIMIT
```
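For completeness, adopting that suggestion would also mean defining the two constants at module level; the names come from the suggestion above and the values from the literals in the original code:

```python
# Hypothetical module-level constants backing the suggested rename.
FD_MAX_LIMIT_THRESHOLD = 100_000  # was the literal 1e5
FD_MAX_DEFAULT_LIMIT = 10_000     # was the literal 10000
```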
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

I verified that with this fix my celery container gets unstuck and proceeds to report useful errors ;)