feat: Implement graceful shutdown with connection tracking #578
+368
−16
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
This PR implements graceful shutdown support for chproxy, allowing the proxy to properly handle SIGTERM and SIGINT signals and wait for active connections to complete before terminating.
Motivation: When chproxy is deployed in orchestrated environments like Kubernetes, abrupt termination can kill in-flight queries, leading to failed client requests and incomplete data operations. This implementation ensures clean shutdowns by waiting for active connections to complete (or timeout) before process termination.
Key changes:
signal.NotifyContextfor SIGTERM/SIGINThttp.Server.ConnStatecallback with atomic countergraceful_shutdown_timeout(default: 25s, fits within K8s 30s grace period)Example shutdown flow:
Pull request type
Please check the type of change your PR introduces:
Checklist
Does this introduce a breaking change?
Further comments
Implementation Highlights:
Connection Tracking: Uses
http.Server.ConnStatecallback. This correctly tracks TCP connections and handles HTTP keep-alive.Concurrent Server Shutdown: HTTP and HTTPS servers shut down in parallel using
sync.WaitGroupfor efficiency.Thread Safety:
configLock->lock) to prevent deadlocksreloadSignalError Handling: Errors collected via buffered channel and aggregated with
errors.JoinTesting Strategy: Uses subprocess pattern to test shutdown behavior in isolation
Files Changed:
main.go: Signal handling, connection tracking, graceful shutdown logicproxy.go: Resource cleanup (caches, heartbeat goroutines, transport)config/config.go: AddedGracefulShutdownTimeoutfieldmain_test.go: Two comprehensive test cases (normal shutdown + timeout)Testing: