This repository was archived by the owner on Jan 23, 2026. It is now read-only.

Load balancer with 60s inactivity timeout disconnects exporters/clients #345

@mangelajo

Description


For example:

$ jmp exporter --log-level DEBUG run --config exp-mk.yml
DEBUG:grpc._cython.cygrpc:Using AsyncIOEngine.POLLER as I/O engine
DEBUG:jumpstarter.exporter.session:GetReport()
INFO:jumpstarter.exporter.exporter:Registering exporter with controller
INFO:jumpstarter.exporter.exporter:Currently not leased
INFO:jumpstarter.exporter.exporter:Unregistering exporter with controller
Exception while serving on the exporter: (<AioRpcError of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Socket closed"
	debug_error_string = "UNKNOWN:Error received from peer ipv4:3.12.180.35:443 {created_time:"2025-03-14T16:01:16.478446+01:00", grpc_status:14, grpc_message:"Socket closed"}"
>,)
Traceback (most recent call last):
  File "/Users/majopela/work/jumpstarter/packages/jumpstarter/jumpstarter/exporter/exporter.py", line 81, in serve
    async for status in controller.Status(jumpstarter_pb2.StatusRequest()):
  File "/Users/majopela/work/jumpstarter/.venv/lib/python3.12/site-packages/grpc/aio/_call.py", line 365, in _fetch_stream_responses
    await self._raise_for_status()
  File "/Users/majopela/work/jumpstarter/.venv/lib/python3.12/site-packages/grpc/aio/_call.py", line 272, in _raise_for_status
    raise _create_rpc_error(
grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Socket closed"
	debug_error_string = "UNKNOWN:Error received from peer ipv4:3.12.180.35:443 {created_time:"2025-03-14T16:01:16.478446+01:00", grpc_status:14, grpc_message:"Socket closed"}"

Lowering the gRPC keepalive to 45s, below the load balancer's 60s inactivity timeout, seems to fix the issue.

Notes:

  • we should make the keepalive time configurable in the client and exporter configurations
  • we should make the server accept a very low gRPC keepalive (1s or so) for extreme situations
  • we should not allow configurations to go below our server's default gRPC keepalive limit
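
A minimal sketch of what the notes above could look like on the client/exporter side, assuming the keepalive time is read from configuration and turned into gRPC channel arguments. The helper name `keepalive_channel_options`, the constant `SERVER_MIN_KEEPALIVE_S`, and the default values are hypothetical and not part of Jumpstarter; only the `grpc.*` channel argument keys are standard gRPC options.

```python
# Hypothetical helper: names and defaults are assumptions, not Jumpstarter's API.

# Assumed lower bound accepted by the server (the notes suggest "1s or so").
SERVER_MIN_KEEPALIVE_S = 1.0


def keepalive_channel_options(keepalive_time_s=45.0, keepalive_timeout_s=20.0):
    """Build gRPC channel arguments for a configurable keepalive interval.

    45s is the value reported to work behind a 60s inactivity timeout.
    """
    # Refuse configurations below the (assumed) server-side limit.
    if keepalive_time_s < SERVER_MIN_KEEPALIVE_S:
        raise ValueError(
            f"keepalive time {keepalive_time_s}s is below the server "
            f"minimum of {SERVER_MIN_KEEPALIVE_S}s"
        )
    return [
        # Send an HTTP/2 keepalive ping after this much inactivity.
        ("grpc.keepalive_time_ms", int(keepalive_time_s * 1000)),
        # How long to wait for the ping acknowledgement before closing.
        ("grpc.keepalive_timeout_ms", int(keepalive_timeout_s * 1000)),
        # Keep pinging even when no RPC is in flight.
        ("grpc.keepalive_permit_without_calls", 1),
        # Do not cap the number of pings sent without data frames.
        ("grpc.http2.max_pings_without_data", 0),
    ]
```

The resulting list could then be passed as the `options` argument when creating the channel, for example `grpc.aio.secure_channel(target, credentials, options=keepalive_channel_options(45))`. The server must also be configured to tolerate pings at this rate, otherwise it may reject them as too frequent.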
