feat: implement fast multipart download API for envd#1802
feat: implement fast multipart download API for envd#1802mishushakov wants to merge 4 commits intomainfrom
Conversation
Implements efficient multipart file downloads using direct ReadAt() for random access reads, mirroring the existing multipart upload pattern. Supports concurrent part downloads with configurable part sizes (default 5MB, max 100MB) and session-based management with 1-hour TTL and automatic cleanup. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
| } | ||
|
|
||
| // Check if file exists and get its size | ||
| stat, err := os.Stat(filePath) |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 16 days ago
In general, to fix uncontrolled path usage, we should resolve the user-provided path against a well-defined base directory and then verify that the resulting absolute path is still inside that base. If it is not, we reject the request. This prevents directory traversal or arbitrary absolute path usage even when the attacker supplies .. components or absolute paths.
The best place to fix this without changing existing behavior too much is in permissions.ExpandAndResolve. That function is the centralized helper used to turn a user-supplied path plus user and defaultPath info into an absolute path. We can extend it so that it (1) computes a base directory (either from defaultPath or user.HomeDir), (2) expands ~ relative to that base, (3) resolves to an absolute path, and (4) ensures the resulting absolute path stays under the base directory. If the path escapes the base (for example, via .. or an absolute path like /etc/passwd), we return an error instead of the path.
Concretely, in permissions/path.go we will:
- Import
"strings"to support prefix checks. - Change
ExpandAndResolveto:- Compute an absolute
baseDirfromdefaultPath(if provided) oruser.HomeDir. - Apply
execcontext.ResolveDefaultWorkdiras before to handle empty/default values. - Use
expandto deal with~, but ensure it uses the base directory semantics. - Always join the resulting path to
baseDir(ignoring whether the user supplied an absolute path). - Call
filepath.Absto normalize the joined path. - Check that
absis withinbaseDirusing a prefix check after ensuring both are absolute, with a trailing separator logic to avoid partial prefix issues. - Return an error if the check fails.
- Compute an absolute
We will not modify multipart_download.go because it is already centralizing resolution via permissions.ExpandAndResolve; after we harden that function, filePath will be safe to use with os.Stat and subsequent file operations.
| @@ -7,6 +7,7 @@ | ||
| "os/user" | ||
| "path/filepath" | ||
| "slices" | ||
| "strings" | ||
|
|
||
| "github.com/e2b-dev/infra/packages/envd/internal/execcontext" | ||
| ) | ||
| @@ -28,23 +29,43 @@ | ||
| } | ||
|
|
||
| func ExpandAndResolve(path string, user *user.User, defaultPath *string) (string, error) { | ||
| // Determine the base directory for all paths: prefer defaultPath (if provided), | ||
| // otherwise fall back to the user's home directory. | ||
| base := execcontext.ResolveDefaultWorkdir("", defaultPath) | ||
| if base == "" { | ||
| base = user.HomeDir | ||
| } | ||
|
|
||
| baseAbs, err := filepath.Abs(base) | ||
| if err != nil { | ||
| return "", fmt.Errorf("failed to resolve base path '%s' for user '%s': %w", base, user.Username, err) | ||
| } | ||
|
|
||
| // Resolve the requested path relative to the base directory. | ||
| path = execcontext.ResolveDefaultWorkdir(path, defaultPath) | ||
|
|
||
| path, err := expand(path, user.HomeDir) | ||
| path, err = expand(path, baseAbs) | ||
| if err != nil { | ||
| return "", fmt.Errorf("failed to expand path '%s' for user '%s': %w", path, user.Username, err) | ||
| } | ||
|
|
||
| if filepath.IsAbs(path) { | ||
| return path, nil | ||
| // Always join with the base directory so that user-supplied absolute | ||
| // paths cannot escape the intended root. | ||
| joined := filepath.Join(baseAbs, path) | ||
|
|
||
| abs, err := filepath.Abs(joined) | ||
| if err != nil { | ||
| return "", fmt.Errorf("failed to resolve path '%s' for user '%s' with base dir '%s': %w", joined, user.Username, baseAbs, err) | ||
| } | ||
|
|
||
| // The filepath.Abs can correctly resolve paths like /home/user/../file | ||
| path = filepath.Join(user.HomeDir, path) | ||
| // Ensure the resolved path is within the base directory. | ||
| baseWithSep := baseAbs | ||
| if !strings.HasSuffix(baseWithSep, string(os.PathSeparator)) { | ||
| baseWithSep += string(os.PathSeparator) | ||
| } | ||
|
|
||
| abs, err := filepath.Abs(path) | ||
| if err != nil { | ||
| return "", fmt.Errorf("failed to resolve path '%s' for user '%s' with home dir '%s': %w", path, user.Username, user.HomeDir, err) | ||
| if abs != baseAbs && !strings.HasPrefix(abs, baseWithSep) { | ||
| return "", fmt.Errorf("resolved path '%s' is outside of allowed base directory '%s' for user '%s'", abs, baseAbs, user.Username) | ||
| } | ||
|
|
||
| return abs, nil |
| } | ||
|
|
||
| // Open the file for reading | ||
| srcFile, err := os.Open(filePath) |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 16 days ago
General fix approach: After resolving the user-supplied path into an absolute path, ensure that it is contained within a safe base directory (such as the user’s home directory or a configured workdir). Reject paths that escape this base. This is done by: (1) computing the base directory as an absolute path, (2) resolving the requested path relative to that base (or expanding ~ into that base), and (3) verifying that the final absolute path still has the base as its prefix on a path-component boundary.
Best concrete fix here without changing external behavior more than necessary:
-
Strengthen
permissions.ExpandAndResolveso it:- Always resolves the user-supplied
pathto an absolute path. - Determines a “base directory” from the given
useranddefaultPath(viaexeccontext.ResolveDefaultWorkdir), normalizes that base to an absolute path, and uses it as the root. - Expands
~usinguser.HomeDiras before. - If the resulting path is not absolute, join it relative to the base directory, then normalize.
- Finally, verify that the resulting absolute path is within the base directory using a safe prefix check (e.g.,
strings.HasPrefix(absPathWithSeparator, baseWithSeparator)after both are cleaned and made absolute). If it is outside, return an error instead of the path.
- Always resolves the user-supplied
-
To avoid breaking existing behavior where a fully absolute path is passed, we still accept it only if it lies under the computed base directory; otherwise, we now reject it. This is the key security change.
-
The caller in
PostFilesDownloadInitcontinues to callpermissions.ExpandAndResolveand then usesfilePathforos.Statandos.Open. With the added containment check,filePathcan no longer reference arbitrary filesystem locations outside the intended base, which addresses the CodeQL finding.
Implementation details / specific changes:
-
In
packages/envd/internal/permissions/path.go:- Add an import of
strings. - Introduce computation of a
baseDirat the start ofExpandAndResolve:- Use
execcontext.ResolveDefaultWorkdir(path, defaultPath)as now, but we’ll separate the concerns: first resolveresolvedPath := execcontext.ResolveDefaultWorkdir(path, defaultPath), then derivebaseDirfromdefaultPath(oruser.HomeDirif no default path). - Compute
baseDirAbs := filepath.Abs(baseDir)and handle errors.
- Use
- Apply
expandon the resolved path as now. - If the expanded path is not absolute, join it with
baseDirAbsand normalize viafilepath.Abs. - If the (possibly already absolute) expanded path is absolute but not under
baseDirAbs, return an error indicating the path is outside the allowed base. - Implement a helper-style check inline in
ExpandAndResolveusingstrings.TrimRight+HasPrefixto enforce directory containment.
- Add an import of
-
No changes are needed in
packages/envd/internal/api/multipart_download.go, because we centralize the security inExpandAndResolve. The call site already handles an error fromExpandAndResolveand returns a400 Bad Request, which is appropriate when the requested path is invalid or disallowed.
- Remove unused mu mutex field from MultipartDownloadSession - Add activeReads ref-counting to prevent race condition when closing file during active read operations - Use sync.Pool for buffer reuse to reduce memory pressure under load - Accept context in cleanupExpiredDownloads for graceful shutdown - Add Close() method to API for proper resource cleanup - Update tests to call Close() on cleanup Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
Implements efficient multipart file downloads using direct
ReadAt()for random access reads, mirroring the existing multipart upload pattern. Supports concurrent part downloads with configurable part sizes (default 5MB, max 100MB) and session-based management with 1-hour TTL and automatic cleanup.Test plan
🤖 Generated with Claude Code