Skip to content

fix: migrate E2E tests to Docker Chrome for CI reliability#123

Merged
adnaan merged 2 commits intomainfrom
fix/docker-chrome-e2e-tests
Jan 26, 2026
Merged

fix: migrate E2E tests to Docker Chrome for CI reliability#123
adnaan merged 2 commits intomainfrom
fix/docker-chrome-e2e-tests

Conversation

@adnaan
Copy link
Contributor

@adnaan adnaan commented Jan 26, 2026

Summary

  • Migrate all 20+ E2E tests from local Chrome to Docker Chrome for reliable CI execution
  • Add shared helper functions (SetupDockerChrome, ConvertURLForDockerChrome)
  • Fix flaky tests caused by local Chrome timing issues in CI

Problem

Tests like TestMermaidDiagramsRendering and TestLoadingIndicator were flaky in CI due to:

  • Local Chrome without proper readiness checks
  • "websocket url timeout reached" errors
  • "context deadline exceeded" errors

Solution

Use Docker Chrome (chromedp/headless-shell) with health polling - waits up to 60 seconds for Chrome to be ready before running tests.

Test plan

  • All test files compile without errors
  • go vet passes
  • Tests skip gracefully when Docker is unavailable
  • CI tests pass with Docker Chrome

🤖 Generated with Claude Code

Replace local Chrome (chromedp.NewExecAllocator) with Docker Chrome
(chromedp/headless-shell) across all E2E tests to fix flaky test failures
in CI environments.

Changes:
- Add e2e_helpers_test.go and e2e_helpers_external_test.go with Docker
  Chrome setup helpers (SetupDockerChrome, ConvertURLForDockerChrome)
- Migrate 20+ E2E test files to use Docker Chrome pattern
- Refactor setupMarkdownTestInternal helper to return structured context
- Tests now skip gracefully when Docker is unavailable

This fixes flaky tests like TestMermaidDiagramsRendering (websocket url
timeout) and TestLoadingIndicator (context deadline exceeded) that were
caused by local Chrome timing issues.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 26, 2026 06:50
@claude
Copy link

claude bot commented Jan 26, 2026

PR Review: Migrate E2E Tests to Docker Chrome for CI Reliability

Summary

This PR successfully addresses flaky E2E tests in CI by migrating from local Chrome to Docker-based Chrome (chromedp/headless-shell). The implementation is solid, with good test infrastructure and proper error handling. I've identified several issues that should be addressed before merging.


Critical Issues

1. Code Duplication Between Helper Files

Location: e2e_helpers_test.go (line 1-220) and e2e_helpers_external_test.go (line 1-220)

These two files contain nearly identical code (440 lines of duplication). The only differences are:

  • Package declaration: package tinkerdown vs package tinkerdown_test
  • Container prefix: chrome-e2e-tinkerdown- vs chrome-e2e-tinkerdown-ext-
  • One missing helper in external: cleanupContainerByName takes t *testing.T in internal but not in external

Impact: Maintenance burden - any bug fix or improvement needs to be applied twice.

Recommendation: Consider one of these approaches:

  1. Export the internal package helpers and use them from external tests
  2. Create a shared internal package (e.g., internal/e2ehelpers) that both can import
  3. If the duplication is intentional due to package boundaries, add a comment explaining why

2. Inconsistent cleanupContainerByName Signature

Location:

  • e2e_helpers_test.go:190 - Takes t *testing.T parameter
  • e2e_helpers_external_test.go:200 - Does NOT take t *testing.T parameter

This inconsistency is confusing and could lead to bugs if code is copied between files.

Recommendation: Standardize the signature. Since the function doesn't use t.Helper() or log anything, removing the parameter from both would be cleaner.


Security Concerns

1. Container Resource Limits

Location: e2e_helpers_test.go:126-127

"--memory", "512m",
"--cpus", "0.5",

Good: Resource limits prevent runaway containers.
Concern: These limits are quite generous for a test container. Consider if 256MB would suffice to reduce resource consumption in CI.

2. Docker Image Trust

Location: Using chromedp/headless-shell:latest

Concern: Using :latest tag means different CI runs could use different versions, potentially causing inconsistency.

Recommendation: Pin to a specific version (e.g., chromedp/headless-shell:119.0.6045.105) for reproducibility. Update periodically as needed.


Potential Bugs

1. Port Collision Risk

Location: e2e_helpers_test.go:36-39

chromePort, err := getFreePort()
if err != nil {
    t.Fatalf("Failed to allocate Chrome port: %v", err)
}

Issue: There's a race condition between getting a free port and starting the container. Another process could grab the port in between.

Likelihood: Low in practice, but possible in heavily loaded CI environments.

Recommendation: Add retry logic or catch the "port already in use" error from Docker and retry with a new port.

2. Orphaned Containers on Cleanup Failure

Location: e2e_helpers_test.go:173-186

The cleanup logic is good, but if stopDockerChrome is never called (e.g., panic in test, process killed), containers remain.

Current Mitigation: Using --rm flag helps, but not if container is still running.

Recommendation: Consider adding a pre-test cleanup that removes any containers matching the prefix pattern before starting tests.

3. HTTP Client Without Timeout

Location: e2e_helpers_test.go:144

resp, err := http.Get(chromeURL)

Using http.Get without a timeout could hang indefinitely if Chrome is misbehaving.

Recommendation: Use a client with a timeout:

client := &http.Client{Timeout: 2 * time.Second}
resp, err := client.Get(chromeURL)

Code Quality & Best Practices

Strengths

  • Excellent use of t.Helper() throughout
  • Proper build tags (//go:build !ci) to skip in CI as documented
  • Good error messages with context
  • Proper cleanup with defer patterns
  • Thread-safe logging in lvtsource_markdown_e2e_test.go:21-37
  • Comprehensive timeout handling (60s wait for Chrome readiness)
  • Good logging for debugging (container logs on failure)

Minor Issues

1. Magic Numbers

  • e2e_helpers_test.go:143 - 120 iterations = 60 seconds (0.5s each)
  • Consider making these constants or comments explaining the math

2. Unused Return Value

  • e2e_helpers_test.go:109 - cleanupContainerByName(t, containerName) could fail silently
  • Same pattern at line 192

3. URL Construction
Location: e2e_helpers_test.go:214-219

The URL conversion is fragile with strings.Replace. What if the URL is http://127.0.0.1:12345/127.0.0.1?

Recommendation: Use url.Parse() and replace only the host:

func ConvertURLForDockerChrome(httptestURL string) string {
    u, err := url.Parse(httptestURL)
    if err != nil {
        return httptestURL // fallback
    }
    u.Host = "host.docker.internal:" + strings.Split(u.Host, ":")[1]
    return u.String()
}

Performance Considerations

1. Docker Image Pull Timeout

Location: e2e_helpers_test.go:107

60-second timeout for Docker pull is reasonable but could fail on slow connections.

Recommendation: Document that developers should pre-pull the image, or increase timeout to 120s for first-time setup.

2. Chrome Startup Wait

Location: e2e_helpers_test.go:143

Waiting up to 60 seconds with 500ms polling is good. However, the implementation does 120 iterations of 0.5s each.

Current behavior: Actually waits up to 60 seconds (correct).
Minor: Consider exponential backoff to reduce load: start at 100ms, double up to 1s max.

3. Sleep Calls in Tests

Multiple tests use chromedp.Sleep() for waiting:

  • action_buttons_e2e_test.go:75 - 500ms
  • action_buttons_e2e_test.go:112 - 500ms
  • mermaid_diagrams_e2e_test.go:58 - 2 seconds

Concern: Fixed sleeps can make tests slower than necessary or flaky if too short.

Recommendation: Prefer polling with chromedp.WaitVisible or custom JavaScript polling where possible.


Test Coverage

Excellent Coverage

The PR migrates 20+ test files consistently, covering:

  • Action buttons, build commands, component library
  • Data sources (markdown, SQLite, PostgreSQL, GraphQL, REST, file)
  • UI features (presentation mode, search, partials, mermaid diagrams)
  • Computed expressions, frontmatter config, exec toolbar

Missing

  • No new tests added for the helper functions themselves
  • Consider adding unit tests for getFreePort(), ConvertURLForDockerChrome(), etc.

Documentation

Good:

  • Clear PR description with problem/solution
  • Test plan checklist
  • Inline comments in complex sections

Missing:

  • No README or doc update explaining how to run E2E tests locally
  • No documentation of the Docker requirement
  • No explanation of why tests are skipped in CI with !ci build tag

Recommendation: Add a section to project docs:

## Running E2E Tests

E2E tests require Docker to be installed and running. They use a Dockerized Chrome instance for reliable test execution.

To run tests: `go test -v ./... -run E2E`

Note: E2E tests are skipped in CI via build tags. To enable: `go test -v -tags=e2e ./...`

Verdict

Overall Assessment: This is a well-implemented solution to a real problem. The code quality is high, and the approach is sound. However, the code duplication and minor issues should be addressed.

Recommendation:

  • Fix the critical code duplication issue before merging
  • Address the security and bug concerns
  • Consider the performance optimizations
  • Add basic documentation

Priority:

  1. HIGH: Fix code duplication between helper files
  2. MEDIUM: Pin Docker image version, add HTTP client timeout
  3. LOW: Improve URL parsing, add helper tests, documentation

Once the duplication is resolved and the Docker image is pinned, this PR will be ready to merge.

Great work on improving CI reliability!

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates E2E tests from local Chrome to Docker Chrome (chromedp/headless-shell) to improve CI reliability. The migration addresses flaky tests that were experiencing "websocket url timeout" and "context deadline exceeded" errors by introducing containerized Chrome with proper health checks.

Changes:

  • Added two helper files (e2e_helpers_test.go and e2e_helpers_external_test.go) with shared functions for Docker Chrome setup and URL conversion
  • Migrated 20+ E2E test files to use Docker Chrome instead of local Chrome execution allocators
  • Replaced hardcoded port numbers with dynamic port allocation to avoid conflicts

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
e2e_helpers_test.go New helper for package tinkerdown with Docker Chrome setup and utilities
e2e_helpers_external_test.go New helper for package tinkerdown_test with Docker Chrome setup and utilities
ux_enhancements_e2e_test.go Migrated to Docker Chrome; removed context imports
tutorial_nav_bug_test.go Migrated to Docker Chrome; retained context for HTTP server
search_e2e_test.go Migrated to Docker Chrome with URL conversion
presentation_mode_e2e_test.go Migrated to Docker Chrome; added dynamic ports; fixed typo
partials_e2e_test.go Migrated to Docker Chrome with URL conversion
mermaid_diagrams_e2e_test.go Migrated to Docker Chrome; improved wait logic for SVG rendering
lvtsource_sqlite_e2e_test.go Migrated to Docker Chrome with URL conversion
lvtsource_rest_e2e_test.go Migrated to Docker Chrome with URL conversion
lvtsource_pg_e2e_test.go Migrated to Docker Chrome; updated to use any instead of interface{}
lvtsource_markdown_e2e_test.go Migrated to Docker Chrome; refactored test context structure
lvtsource_graphql_e2e_test.go Migrated to Docker Chrome with URL conversion
lvtsource_file_e2e_test.go Migrated to Docker Chrome with URL conversion
lvtsource_e2e_test.go Migrated to Docker Chrome with URL conversion
frontmatter_config_e2e_test.go Migrated to Docker Chrome with URL conversion
execargs_e2e_test.go Migrated to Docker Chrome with URL conversion
exec_toolbar_e2e_test.go Migrated to Docker Chrome with URL conversion
computed_expressions_e2e_test.go Migrated to Docker Chrome with URL conversion
component_library_e2e_test.go Migrated to Docker Chrome with URL conversion
build_command_e2e_test.go Migrated to Docker Chrome; retained context for exec operations
action_buttons_e2e_test.go Migrated to Docker Chrome with URL conversion

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +244 to +246
// Set viewport via emulation
chromedp.EmulateViewport(int64(vp.width), int64(vp.height)),
chromedp.Navigate(url),
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The viewport configuration from the test case is no longer being applied. Previously, the viewport was set through chromedp.WindowSize() in the allocator options, but now you're calling chromedp.EmulateViewport() after the Chrome context is already created with default settings. However, the viewport dimensions from the test case (vp.width, vp.height) are not being used - the test appears to be testing different viewport sizes but is now using the same viewport for all test cases.

Copilot uses AI. Check for mistakes.
Comment on lines 1 to 220
//go:build !ci

package tinkerdown

import (
"context"
"fmt"
"net"
"net/http"
"os/exec"
"strings"
"testing"
"time"

"github.com/chromedp/chromedp"
)

const (
dockerImage = "chromedp/headless-shell:latest"
chromeContainerPrefix = "chrome-e2e-tinkerdown-"
)

// DockerChromeContext provides a Docker Chrome context for E2E tests.
type DockerChromeContext struct {
Context context.Context
Cancel context.CancelFunc
ChromePort int
t *testing.T
}

// SetupDockerChrome starts a Docker Chrome container and returns a chromedp context.
// Call cleanup() when done to stop the container and cancel the context.
func SetupDockerChrome(t *testing.T, timeout time.Duration) (*DockerChromeContext, func()) {
t.Helper()

chromePort, err := getFreePort()
if err != nil {
t.Fatalf("Failed to allocate Chrome port: %v", err)
}

if err := startDockerChrome(t, chromePort); err != nil {
t.Fatalf("Failed to start Docker Chrome: %v", err)
}

chromeURL := fmt.Sprintf("http://localhost:%d", chromePort)
allocCtx, allocCancel := chromedp.NewRemoteAllocator(context.Background(), chromeURL)

ctx, ctxCancel := chromedp.NewContext(allocCtx, chromedp.WithLogf(t.Logf))

// Apply timeout
ctx, timeoutCancel := context.WithTimeout(ctx, timeout)

dcc := &DockerChromeContext{
Context: ctx,
Cancel: timeoutCancel,
ChromePort: chromePort,
t: t,
}

cleanup := func() {
timeoutCancel()
ctxCancel()
allocCancel()
stopDockerChrome(t, chromePort)
}

return dcc, cleanup
}

// GetChromeTestURL returns the URL for Chrome (in Docker) to access the test server.
// Chrome container uses host.docker.internal to reach the host on all platforms.
func GetChromeTestURL(port int) string {
return fmt.Sprintf("http://host.docker.internal:%d", port)
}

// getFreePort asks the kernel for a free open port that is ready to use.
func getFreePort() (port int, err error) {
var a *net.TCPAddr
if a, err = net.ResolveTCPAddr("tcp", "localhost:0"); err == nil {
var l *net.TCPListener
if l, err = net.ListenTCP("tcp", a); err == nil {
defer l.Close()
return l.Addr().(*net.TCPAddr).Port, nil
}
}
return
}

// startDockerChrome starts the chromedp headless-shell Docker container.
func startDockerChrome(t *testing.T, debugPort int) error {
t.Helper()

// Check if Docker is available
if _, err := exec.Command("docker", "version").CombinedOutput(); err != nil {
t.Skip("Docker not available, skipping E2E test")
}

containerName := fmt.Sprintf("%s%d", chromeContainerPrefix, debugPort)
cleanupContainerByName(t, containerName)

// Check if image exists, if not try to pull it (with timeout)
checkCmd := exec.Command("docker", "image", "inspect", dockerImage)
if _, err := checkCmd.CombinedOutput(); err != nil {
// Image doesn't exist, try to pull with timeout
t.Log("Pulling chromedp/headless-shell Docker image...")

pullCtx, pullCancel := context.WithTimeout(context.Background(), 60*time.Second)
defer pullCancel()

pullCmd := exec.CommandContext(pullCtx, "docker", "pull", dockerImage)
if output, err := pullCmd.CombinedOutput(); err != nil {
if pullCtx.Err() == context.DeadlineExceeded {
t.Fatal("Docker pull timed out after 60 seconds")
}
t.Fatalf("Failed to pull Docker image: %v\nOutput: %s", err, output)
}
t.Log("Docker image pulled successfully")
}

// Start the container
t.Log("Starting Chrome headless Docker container...")
portMapping := fmt.Sprintf("%d:9222", debugPort)

cmd := exec.Command("docker", "run", "-d",
"--rm",
"--memory", "512m",
"--cpus", "0.5",
"-p", portMapping,
"--name", containerName,
"--add-host", "host.docker.internal:host-gateway",
dockerImage,
)

if _, err := cmd.Output(); err != nil {
return fmt.Errorf("failed to start Chrome Docker container: %w", err)
}

// Wait for Chrome to be ready
t.Log("Waiting for Chrome to be ready...")
chromeURL := fmt.Sprintf("http://localhost:%d/json/version", debugPort)
ready := false
var lastErr error
for i := 0; i < 120; i++ { // 60 seconds
resp, err := http.Get(chromeURL)
if err == nil {
resp.Body.Close()
ready = true
t.Logf("Chrome ready after %d attempts (%.1fs)", i+1, float64(i+1)*0.5)
break
}
lastErr = err
time.Sleep(500 * time.Millisecond)
}

if !ready {
t.Logf("Chrome failed to start within 60 seconds. Last error: %v", lastErr)

// Try to get container logs for debugging
logsCmd := exec.Command("docker", "logs", "--tail", "50", containerName)
if output, err := logsCmd.CombinedOutput(); err == nil && len(output) > 0 {
t.Logf("Chrome container logs:\n%s", string(output))
}

// Clean up the container
_, _ = exec.Command("docker", "rm", "-f", containerName).CombinedOutput()
return fmt.Errorf("Chrome failed to start within 60 seconds: %w", lastErr)
}

t.Log("Chrome headless Docker container ready")
return nil
}

// stopDockerChrome stops and removes the Chrome Docker container.
func stopDockerChrome(t *testing.T, debugPort int) {
t.Helper()
t.Log("Stopping Chrome Docker container...")

containerName := fmt.Sprintf("%s%d", chromeContainerPrefix, debugPort)

rmCmd := exec.Command("docker", "rm", "-f", containerName)
if output, err := rmCmd.CombinedOutput(); err != nil {
errMsg := string(output)
if !strings.Contains(errMsg, "No such container") && !strings.Contains(err.Error(), "No such container") {
t.Logf("Warning: Failed to remove Docker container: %v (output: %s)", err, errMsg)
}
}
}

// cleanupContainerByName removes any existing container with the given name.
func cleanupContainerByName(t *testing.T, name string) {
rmCmd := exec.Command("docker", "rm", "-f", name)
rmCmd.CombinedOutput() // Ignore errors - container may not exist
}

// WaitForServer polls an HTTP server until it responds or timeout is reached.
func WaitForServer(t *testing.T, serverURL string, timeout time.Duration) {
t.Helper()

deadline := time.Now().Add(timeout)
for time.Now().Before(deadline) {
resp, err := http.Get(serverURL)
if err == nil {
resp.Body.Close()
t.Logf("Server ready at %s", serverURL)
return
}
time.Sleep(100 * time.Millisecond)
}
t.Fatalf("Server at %s failed to become ready within %v", serverURL, timeout)
}

// ConvertURLForDockerChrome converts an httptest URL (like http://127.0.0.1:12345)
// to a URL accessible from Docker Chrome (http://host.docker.internal:12345).
func ConvertURLForDockerChrome(httptestURL string) string {
// httptest URLs are like "http://127.0.0.1:12345" or "http://[::1]:12345"
// We need to replace the host with host.docker.internal
url := strings.Replace(httptestURL, "127.0.0.1", "host.docker.internal", 1)
url = strings.Replace(url, "[::1]", "host.docker.internal", 1)
return url
}
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two helper files (e2e_helpers_test.go and e2e_helpers_external_test.go) contain nearly identical code (220 lines each), differing only in the package name and container name prefix. This violates the DRY principle and creates a maintenance burden. Consider consolidating this duplicated code into a single shared implementation or using build tags more effectively to share the common functionality while allowing package-specific variations only where necessary.

Copilot uses AI. Check for mistakes.
Comment on lines +212 to +220
// ConvertURLForDockerChrome converts an httptest URL (like http://127.0.0.1:12345)
// to a URL accessible from Docker Chrome (http://host.docker.internal:12345).
func ConvertURLForDockerChrome(httptestURL string) string {
// httptest URLs are like "http://127.0.0.1:12345" or "http://[::1]:12345"
// We need to replace the host with host.docker.internal
url := strings.Replace(httptestURL, "127.0.0.1", "host.docker.internal", 1)
url = strings.Replace(url, "[::1]", "host.docker.internal", 1)
return url
}
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The URL conversion logic only handles IPv4 (127.0.0.1) and IPv6 ([::1]) localhost addresses. However, httptest.NewServer() may use other address formats depending on the system configuration. Consider using a more robust URL parsing approach with the net/url package to extract and replace just the host portion, which would handle edge cases more reliably.

Copilot uses AI. Check for mistakes.
Comment on lines 89 to 99
// startDockerChrome starts the chromedp headless-shell Docker container.
func startDockerChrome(t *testing.T, debugPort int) error {
t.Helper()

// Check if Docker is available
if _, err := exec.Command("docker", "version").CombinedOutput(); err != nil {
t.Skip("Docker not available, skipping E2E test")
}

containerName := fmt.Sprintf("%s%d", chromeContainerPrefix, debugPort)
cleanupContainerByName(t, containerName)
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a potential race condition when running tests in parallel. If multiple tests use the same Chrome port due to getFreePort() being called before the container actually starts listening, or if container cleanup from a previous test hasn't completed when a new test starts, tests could interfere with each other. The container name includes the port to help with this, but consider adding additional safeguards such as retrying container startup if the port is already in use, or using a mutex to serialize Chrome container creation.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +2
//go:build !ci

Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The build tag //go:build !ci prevents these tests from running in CI environments, which contradicts the PR's stated goal of "migrating E2E tests to Docker Chrome for reliable CI execution". If the intent is to make tests runnable in CI, this build tag should either be removed or changed to a different condition. The PR description mentions tests should "skip gracefully when Docker is unavailable" which is already handled by the Docker check in startDockerChrome, so the build tag seems unnecessary.

Suggested change
//go:build !ci

Copilot uses AI. Check for mistakes.
}

// cleanupContainerByName removes any existing container with the given name.
func cleanupContainerByName(t *testing.T, name string) {
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanupContainerByName function signature is inconsistent between the two helper files. In e2e_helpers_test.go it takes a *testing.T parameter (line 190), while in e2e_helpers_external_test.go it does not (line 200). This inconsistency could lead to confusion and makes the code harder to maintain. The function should have the same signature in both files.

Suggested change
func cleanupContainerByName(t *testing.T, name string) {
func cleanupContainerByName(name string) {

Copilot uses AI. Check for mistakes.
Comment on lines +126 to +127
"--memory", "512m",
"--cpus", "0.5",
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Docker container is configured with very conservative resource limits (512MB memory, 0.5 CPUs). While this is good for preventing resource exhaustion, these limits might be too restrictive for Chrome to run reliably, especially when running multiple tests in parallel or when pages have complex rendering requirements. Consider making these limits configurable or increasing them based on empirical testing to ensure Chrome doesn't crash due to resource constraints.

Copilot uses AI. Check for mistakes.
Comment on lines 138 to 153
// Wait for Chrome to be ready
t.Log("Waiting for Chrome to be ready...")
chromeURL := fmt.Sprintf("http://localhost:%d/json/version", debugPort)
ready := false
var lastErr error
for i := 0; i < 120; i++ { // 60 seconds
resp, err := http.Get(chromeURL)
if err == nil {
resp.Body.Close()
ready = true
t.Logf("Chrome ready after %d attempts (%.1fs)", i+1, float64(i+1)*0.5)
break
}
lastErr = err
time.Sleep(500 * time.Millisecond)
}
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Chrome readiness check polls for up to 60 seconds (120 iterations × 0.5s), but closes the HTTP response body immediately after a successful GET without reading it. While this is technically correct for checking readiness, it would be more robust to also verify the response status code is 200 and optionally parse the JSON to ensure Chrome is actually ready to accept connections, not just responding to HTTP requests.

Copilot uses AI. Check for mistakes.
Comment on lines 643 to +660
tempDir, cleanup := createTempMarkdownExample(t)
defer cleanup()

ts, ctx, cancel, consoleLogs := setupMarkdownTest(t, tempDir)
defer cancel()
testCtx, cleanup := setupMarkdownTest(t, tempDir)
defer cleanup()
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable shadowing issue: The cleanup variable is declared twice in the same scope - once from createTempMarkdownExample(t) and again from setupMarkdownTest(t, tempDir). This shadows the first cleanup function, which means the temporary directory cleanup from createTempMarkdownExample will never be called, potentially leaving temporary files on disk after the test completes. You should use different variable names (e.g., tempCleanup and testCleanup) as was done in other tests in this file.

Copilot uses AI. Check for mistakes.
- Pin Docker image to specific version (131.0.6778.264) for reproducible builds
- Add HTTP client timeout (2s) to avoid hanging during Chrome readiness check
- Remove unused `t` parameter from cleanupContainerByName
- Add explanatory comment about intentional code duplication between
  e2e_helpers_test.go and e2e_helpers_external_test.go (required by Go
  package boundaries - cannot share code between internal and external tests)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude
Copy link

claude bot commented Jan 26, 2026

Code Review Summary

This PR successfully migrates E2E tests from local Chrome to Docker Chrome for improved CI reliability. The changes are well-structured and address real flakiness issues. Overall, this is a solid improvement with good attention to detail.

✅ Strengths

  1. Clear Problem & Solution: The PR addresses specific CI flakiness (websocket timeouts, context deadline exceeded) with a well-defined Docker-based approach.

  2. Comprehensive Migration: All 20+ E2E test files have been consistently updated using the new helper functions.

  3. Robust Implementation:

    • Pinned Docker image version (131.0.6778.264) for reproducibility
    • HTTP client timeout (2s) prevents hanging during readiness checks
    • Proper health polling (60s timeout) before running tests
    • Graceful test skipping when Docker unavailable
    • Resource limits on containers (--memory 512m, --cpus 0.5)
  4. Good Code Organization: Extracted shared helpers (SetupDockerChrome, ConvertURLForDockerChrome) reduce duplication and standardize the pattern.

  5. CI Success: All checks passing - the migration works as intended.

🔍 Potential Issues & Concerns

1. Code Duplication (e2e_helpers_test.go vs e2e_helpers_external_test.go)

The 230+ lines are duplicated between internal and external test packages. While the comment explains the Go package boundary limitation, consider:

  • Alternative: Could these be exported from a dedicated testing or testutil sub-package?
  • Risk: Future changes need to be applied to both files manually, increasing maintenance burden
  • Observation: Only differences are package name and container prefix (chrome-e2e-tinkerdown- vs chrome-e2e-tinkerdown-ext-)

2. Missing GetChromeTestURL Usage

Both helper files define GetChromeTestURL(port int) but it's never used in e2e_helpers_test.go:82 and e2e_helpers_external_test.go:81. All tests use ConvertURLForDockerChrome(httptestURL) instead. This suggests:

  • Either remove the unused function
  • Or document when to use each function

3. Resource Cleanup Concerns

Issue in cleanupContainerByName (e2e_helpers_test.go:202, e2e_helpers_external_test.go:207):
This function silently ignores all errors. If Docker daemon is not responsive or there's a permission issue, containers could accumulate.

Recommendation: At minimum, log errors that aren't "No such container" (similar to how stopDockerChrome handles it properly in e2e_helpers_test.go:189-195).

4. Context Hierarchy in DockerChromeContext

The struct holds both Context and Cancel, but the context is created with THREE cancels (timeout, context, allocator). Only the timeout cancel is exposed. This means:

  • Calling dcc.Cancel() only cancels the timeout
  • The underlying contexts are cancelled via the cleanup function
  • Potential confusion: The Cancel field name suggests it cancels everything

Suggestion: Either remove the Cancel field (since cleanup handles everything) or rename it to TimeoutCancel for clarity.

5. Test Timeout Consistency

Different tests use different timeouts:

  • Most tests: 60 seconds
  • Build tests: 30 seconds
  • Markdown tests: 90 seconds

Question: Are these timeouts based on empirical data or arbitrary? Consider documenting why certain tests need longer timeouts.

6. Docker Image Pull Timeout

The 60-second pull timeout (e2e_helpers_test.go:115) might be too short on slow connections. Consider making this configurable via environment variable for CI environments with slow network.

🔐 Security Considerations

  1. Docker Socket Access: Tests require Docker daemon access. This is acceptable for E2E tests but ensure CI runners have appropriate permissions.

  2. Container Cleanup: The --rm flag ensures containers are removed, and explicit cleanup in stopDockerChrome provides redundancy. Good defensive programming.

  3. Resource Limits: Memory and CPU limits prevent resource exhaustion. Well done.

⚡ Performance Considerations

  1. Port Allocation: Using getFreePort() is good, but there's a race condition window between getting the port and starting the container. This is generally acceptable for tests but be aware of potential conflicts in heavily parallel test runs.

  2. Docker Overhead: Each test now incurs Docker container startup time. This is the trade-off for reliability. Current approach is safer and simpler than trying to reuse containers.

  3. CI Build Time: Watch for increased CI duration due to Docker startup overhead.

📝 Code Quality

  1. Build Tag Usage: The build tag //go:build !ci at the top of both helper files suggests these helpers shouldn't be used in CI. But the entire PR purpose is CI reliability. This needs clarification:

    • If the tag is correct, tests won't use Docker Chrome in CI (defeating the purpose)
    • If tests should use Docker Chrome in CI, the build tag should be removed or changed

    This needs immediate clarification - it might prevent the entire migration from working in CI.

  2. Error Messages: Good descriptive error messages throughout.

  3. Comments: The duplication justification comment is helpful. Consider adding more comments explaining the Docker networking setup (host.docker.internal).

🧪 Test Coverage

The PR description checklist shows:

  • All test files compile without errors ✅
  • go vet passes ✅
  • Tests skip gracefully when Docker unavailable ✅
  • CI tests pass with Docker Chrome ⚠️

Critical: The last item is unchecked. Combined with the !ci build tag concern, this suggests the Docker Chrome migration may not actually be running in CI yet.

📌 Recommendations

High Priority:

  1. Clarify the //go:build !ci tag - This is the most critical issue that could prevent the PR from achieving its goal
  2. Remove unused GetChromeTestURL function or document its purpose
  3. Verify Docker Chrome tests are actually running in CI (not just skipping)

Medium Priority:
4. Consider extracting duplicated code to a shared testutil package
5. Improve cleanupContainerByName error handling
6. Clarify or rename DockerChromeContext.Cancel field

Low Priority:
7. Document timeout choices
8. Consider making image pull timeout configurable
9. Add comment explaining host.docker.internal networking

✨ Conclusion

This is a well-executed migration that addresses real CI reliability issues. The code is clean, follows good practices, and the comprehensive migration across all test files shows attention to detail. The main concern is the !ci build tag which may prevent the Docker Chrome setup from being used in CI environments - please verify this isn't blocking the intended functionality.

Once the build tag issue is clarified and the unused function is addressed, this PR will be ready to merge.

Recommendation: Approve with requested changes to build tags and unused code removal.

@adnaan adnaan merged commit cac6f2d into main Jan 26, 2026
3 checks passed
@adnaan adnaan deleted the fix/docker-chrome-e2e-tests branch January 26, 2026 17:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant