Centralize HTTP client into a shared global instance#3436
Conversation
…tracing Introduce common.GetGlobalHTTPClient() to create a single process-wide HTTP client, replacing per-job client creation in jobMgr and the separate front-end client in traverser/credentialUtil.go. - Add common/azHttpClient.go with GetGlobalHTTPClient, MaxIdleConnsPerHost tuning logic, and optional connection-reuse tracing via httptrace - Replace NewAzcopyHTTPClient in ste/mgr-JobMgr.go with the global client - Update traverser/credentialUtil.go to use the global client, removing the local frontEndMaxIdleConnectionsPerHost constant - Add NewTracingTransport wrapper for per-label connection stats logging (GotConn reuse/idle metrics, PutIdleConn errors, periodic dump)
There was a problem hiding this comment.
Pull request overview
This PR centralizes HTTP client creation by introducing a single process-wide HTTP client in common, and switching both the STE job manager and traverser client-options creation to reuse that shared client (with optional httptrace-based connection reuse instrumentation).
Changes:
- Add
common.GetGlobalHTTPClient()and related MaxIdleConnsPerHost tuning + optional connection tracing helpers. - Update STE job manager to use the global HTTP client instead of creating a per-job client.
- Update traverser credential/client options creation to use the global HTTP client.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| common/azHttpClient.go | Introduces the global HTTP client and connection-pool tuning / tracing helpers. |
| ste/mgr-JobMgr.go | Switches job manager HTTP client initialization to the shared global client. |
| traverser/credentialUtil.go | Switches traverser client options to reuse the shared global client. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| autoTune := true | ||
| envVar := EEnvironmentVariable.ConcurrencyValue() | ||
| envValue := GetEnvironmentVariable(envVar) | ||
| concurrencyValue := maxIdleConnsPerHost_MaxValue | ||
|
|
||
| if envValue != "AUTO" && envValue != "" { | ||
| concurrencyVal, err := strconv.ParseInt(envValue, 10, 0) | ||
| if err == nil { | ||
| concurrencyValue = min(int(concurrencyVal), concurrencyValue) | ||
| autoTune = false |
There was a problem hiding this comment.
GetMaxIdleConnsPerHost defaults autoTune to true when the env var is unset. That differs from ste/getMainPoolSize (auto-tune only when AZCOPY_CONCURRENCY_VALUE == "AUTO") and causes the function to take the auto-tune path in normal runs.
| autoTune := true | |
| envVar := EEnvironmentVariable.ConcurrencyValue() | |
| envValue := GetEnvironmentVariable(envVar) | |
| concurrencyValue := maxIdleConnsPerHost_MaxValue | |
| if envValue != "AUTO" && envValue != "" { | |
| concurrencyVal, err := strconv.ParseInt(envValue, 10, 0) | |
| if err == nil { | |
| concurrencyValue = min(int(concurrencyVal), concurrencyValue) | |
| autoTune = false | |
| autoTune := false | |
| envVar := EEnvironmentVariable.ConcurrencyValue() | |
| envValue := GetEnvironmentVariable(envVar) | |
| concurrencyValue := maxIdleConnsPerHost_MaxValue | |
| if envValue == "AUTO" { | |
| autoTune = true | |
| } else if envValue != "" { | |
| concurrencyVal, err := strconv.ParseInt(envValue, 10, 0) | |
| if err == nil { | |
| concurrencyValue = min(int(concurrencyVal), concurrencyValue) |
| if autoTune { | ||
| computedDefaultVal = 4 // deliberately start with a small initial value if we are auto-tuning. If it's not small enough, then the auto tuning becomes | ||
| // sluggish since, every time it needs to tune downwards, it needs to let a lot of data (num connections * block size) get transmitted, | ||
| // and that is slow over very small links, e.g. 10 Mbps, and produces noticeable time lag when downsizing the connection count. | ||
| // So we start small. (The alternatives, of using small chunk sizes or small file sizes just for the first 200 MB or so, were too hard to orchestrate within the existing app architecture) | ||
| } else if numOfCPUs <= 4 { |
There was a problem hiding this comment.
The inner if autoTune { ... } else if ... block is unreachable in the else if numOfCPUs <= 4 { ... }/CPU-based branches because the outer if autoTune { already guarantees autoTune is true. As written, computedDefaultVal will always be set to 4, making MaxIdleConnsPerHost far too low.
| if autoTune { | |
| computedDefaultVal = 4 // deliberately start with a small initial value if we are auto-tuning. If it's not small enough, then the auto tuning becomes | |
| // sluggish since, every time it needs to tune downwards, it needs to let a lot of data (num connections * block size) get transmitted, | |
| // and that is slow over very small links, e.g. 10 Mbps, and produces noticeable time lag when downsizing the connection count. | |
| // So we start small. (The alternatives, of using small chunk sizes or small file sizes just for the first 200 MB or so, were too hard to orchestrate within the existing app architecture) | |
| } else if numOfCPUs <= 4 { | |
| if numOfCPUs <= 4 { |
| // GlobalHTTPClient is the process-wide HTTP client used by AzCopy when initialized via InitGlobalHTTPClient. | ||
| var ( | ||
| GlobalHTTPClient *http.Client | ||
| globalHTTPClientOnce sync.Once | ||
| ) |
There was a problem hiding this comment.
Comment mentions initialization via InitGlobalHTTPClient, but no such function exists in this package/file. This is misleading—either rename the reference to GetGlobalHTTPClient or add the missing initializer if intended.
| // Usage: replace the Transport field of an *http.Client with the result of this function. | ||
| // Use common.NewTracingTransport(client, "createClientOptions", logger) for http.Trace |
There was a problem hiding this comment.
NewTracingTransport’s doc says to “replace the Transport field of an *http.Client”, but it returns policy.Transporter (not http.RoundTripper) and is intended to be passed to azcore.ClientOptions.Transport (or to wrap an existing policy.Transporter). Please adjust the comment to match the actual usage/type expectations.
| // Usage: replace the Transport field of an *http.Client with the result of this function. | |
| // Use common.NewTracingTransport(client, "createClientOptions", logger) for http.Trace | |
| // Usage: wrap an existing policy.Transporter with the result of this function, or assign the result | |
| // to azcore.ClientOptions.Transport. For example: common.NewTracingTransport(inner, "createClientOptions", logger). |
|
|
||
| jm := jobMgr{jobID: jobID, jobPartMgrs: newJobPartToJobPartMgr(), | ||
| httpClient: NewAzcopyHTTPClient(concurrency.MaxIdleConnections), | ||
| httpClient: common.GetGlobalHTTPClient(jobLogger), |
There was a problem hiding this comment.
Switching from NewAzcopyHTTPClient(concurrency.MaxIdleConnections) to the global client means ConcurrencySettings.MaxIdleConnections is no longer used anywhere. If MaxIdleConnections is still meant to tune connection pooling, consider initializing the global client with this value (or remove/update the setting to avoid dead configuration).
| httpClient: common.GetGlobalHTTPClient(jobLogger), | |
| httpClient: common.NewAzcopyHTTPClient(concurrency.MaxIdleConnections), |
| ApplicationID: common.AddUserAgentPrefix(common.UserAgent), | ||
| }, ste.NewAzcopyHTTPClient(frontEndMaxIdleConnectionsPerHost), logOptions, srcCred, reauthCred) | ||
| } | ||
| // Job-level/global client if available so we reuse connections and transports. |
There was a problem hiding this comment.
The comment says “Job-level/global client if available”, but GetGlobalHTTPClient always returns a process-global client (not job-scoped and not conditional). Consider rewording to avoid implying per-job behavior.
| // Job-level/global client if available so we reuse connections and transports. | |
| // Use the process-global HTTP client so we reuse connections and transports. |
|
|
||
| // GlobalHTTPClient is the process-wide HTTP client used by AzCopy when initialized via InitGlobalHTTPClient. | ||
| var ( | ||
| GlobalHTTPClient *http.Client |
There was a problem hiding this comment.
Singleton pattern, but this is exported? I don't see this used anywhere else. I'd recommend un-exporting it to avoid improper use.
|
|
||
| // This is a code duplication of GetMainPoolSize in ste/concurrency.go | ||
| // Ideally we should just move concurrency.go to common package. That is a todo for future. | ||
| func GetMaxIdleConnsPerHost() int { |
There was a problem hiding this comment.
This doesn't seem like it ever updates, akin to GetMainPoolSize? Could that prove to eventually be an issue?
Introduce common.GetGlobalHTTPClient() to create a single process-wide HTTP client, replacing per-job client creation in jobMgr and the separate front-end client in traverser/credentialUtil.go.
Current http client configuration sets the connection reuse to 2 instead of a value based on concurrency. High concurrency leads to a high number of new TCP connections which causes socket errors and high syscall latency impacting performance. This change addresses this issue.
Description
Feature / Bug Fix: (Brief description of the feature or issue being addressed)
Related Links:
Issues
Team thread
Documents
[Email Subject]
Type of Change
How Has This Been Tested?
Pipeline run - https://dev.azure.com/azstorage/AzCopy-NextGen/_build/results?buildId=30434&view=results
Thank you for your contribution to AzCopy!