Skip to content

refactor(testing): add Docker Chrome helpers for reliable CI testing#85

Open
adnaan wants to merge 3 commits intomainfrom
refactor/docker-chrome-test-helpers
Open

refactor(testing): add Docker Chrome helpers for reliable CI testing#85
adnaan wants to merge 3 commits intomainfrom
refactor/docker-chrome-test-helpers

Conversation

@adnaan
Copy link
Contributor

@adnaan adnaan commented Feb 2, 2026

Summary

  • Add SetupDockerChrome() helper to simplify Docker Chrome setup in tests that manage their own server
  • Add WaitForServer() helper to replace arbitrary time.Sleep() calls with proper readiness checks
  • Refactor loading indicator and focus preservation tests to use dynamic ports, Docker Chrome, and proper server readiness

Changes

File Description
testing/testing.go Added DockerChromeContext struct and SetupDockerChrome() helper
testing/chrome.go Added WaitForServer() helper function
e2e/livetemplate_core_test.go Refactored 4 tests to use new helpers

Benefits

  • No more port conflicts: Dynamic port allocation via GetFreePort() instead of hardcoded 9001-9004
  • Reliable CI: Docker Chrome for consistent headless browser execution
  • No race conditions: WaitForServer() replaces time.Sleep(100ms) with proper polling
  • Simpler test code: New helpers reduce boilerplate in tests with custom server setup

Test plan

  • CI browser tests pass
  • Loading indicator tests work correctly
  • Focus preservation tests work correctly

🤖 Generated with Claude Code

- Add SetupDockerChrome() helper to simplify Docker Chrome setup in tests
- Add WaitForServer() helper to replace arbitrary time.Sleep() calls
- Refactor loading indicator and focus preservation tests to use:
  - Dynamic port allocation via GetFreePort()
  - Docker Chrome for consistent CI execution
  - Proper server readiness checks
  - GetChromeTestURL() for Docker-to-host networking

This improves test reliability by eliminating hardcoded ports and
race conditions from sleep-based waiting.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 2, 2026 06:38
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds reusable helpers to make browser E2E tests more reliable in CI by standardizing Docker-based Chrome setup and replacing arbitrary sleeps with a server-readiness poll, then refactors selected tests to use these helpers and dynamic ports.

Changes:

  • Added SetupDockerChrome() and a DockerChromeContext helper to simplify Docker Chrome usage in tests that run their own HTTP server.
  • Added WaitForServer() helper to poll for server readiness instead of using time.Sleep.
  • Refactored 4 E2E tests to use dynamic ports + Docker Chrome + readiness polling.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
testing/testing.go Introduces DockerChromeContext and SetupDockerChrome() helper for Docker-based chromedp contexts.
testing/chrome.go Adds WaitForServer() helper for server readiness polling.
e2e/livetemplate_core_test.go Updates loading/focus E2E tests to use dynamic ports, Docker Chrome, and server readiness checks.
Comments suppressed due to low confidence (1)

e2e/livetemplate_core_test.go:1216

  • url is now GetChromeTestURL(port) (host.docker.internal), but this http.Get(url) runs on the host test process, not inside the Docker Chrome container. On Linux CI, host.docker.internal typically won’t resolve from the host, causing this check to fail. Use http://localhost:<port> for the host-side fetch, and keep GetChromeTestURL only for chromedp navigation inside Docker.
	// Use host.docker.internal for Docker Chrome to access host server
	url := e2etest.GetChromeTestURL(port)

	resp, err := http.Get(url)
	if err != nil {
		t.Fatalf("Failed to fetch page: %v", err)
	}

//
// // Use GetChromeTestURL for Docker Chrome to access host
// url := e2etest.GetChromeTestURL(port)
// chromedp.Run(chromeCtx, chromedp.Navigate(url), ...)
Copy link

Copilot AI Feb 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the usage comment, SetupDockerChrome returns a *DockerChromeContext, but the example calls chromedp.Run(chromeCtx, ...). That won’t compile and may mislead future test authors; the example should use chromeCtx.Context (or return a context.Context directly).

Suggested change
// chromedp.Run(chromeCtx, chromedp.Navigate(url), ...)
// chromedp.Run(chromeCtx.Context, chromedp.Navigate(url), ...)

Copilot uses AI. Check for mistakes.
Comment on lines 260 to 294
type DockerChromeContext struct {
Context context.Context
Cancel context.CancelFunc
ChromePort int
t *testing.T
}

// SetupDockerChrome starts a Docker Chrome container and returns a chromedp context.
// Call cleanup() when done to stop the container and cancel the context.
func SetupDockerChrome(t *testing.T, timeout time.Duration) (*DockerChromeContext, func()) {
t.Helper()

chromePort, err := GetFreePort()
if err != nil {
t.Fatalf("Failed to allocate Chrome port: %v", err)
}

if err := StartDockerChrome(t, chromePort); err != nil {
t.Fatalf("Failed to start Docker Chrome: %v", err)
}

chromeURL := fmt.Sprintf("http://localhost:%d", chromePort)
allocCtx, allocCancel := chromedp.NewRemoteAllocator(context.Background(), chromeURL)

ctx, ctxCancel := chromedp.NewContext(allocCtx, chromedp.WithLogf(t.Logf))

// Apply timeout
ctx, timeoutCancel := context.WithTimeout(ctx, timeout)

dcc := &DockerChromeContext{
Context: ctx,
Cancel: timeoutCancel,
ChromePort: chromePort,
t: t,
}
Copy link

Copilot AI Feb 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DockerChromeContext.Cancel is set to timeoutCancel, which only cancels the timeout context and does not cancel the chromedp context/allocator or stop the Docker container. Since this is an exported field, it’s easy to misuse and leak containers/resources; consider removing the Cancel field or making it cancel the full Chrome session (or expose only the cleanup function).

Copilot uses AI. Check for mistakes.
Comment on lines 353 to 366
func WaitForServer(t *testing.T, serverURL string, timeout time.Duration) {
t.Helper()

deadline := time.Now().Add(timeout)
for time.Now().Before(deadline) {
resp, err := http.Get(serverURL)
if err == nil {
resp.Body.Close()
t.Logf("✅ Server ready at %s", serverURL)
return
}
time.Sleep(100 * time.Millisecond)
}
t.Fatalf("Server at %s failed to become ready within %v", serverURL, timeout)
Copy link

Copilot AI Feb 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WaitForServer uses http.Get with the default client, which has no request/response timeout. If the connection hangs (e.g., SYN stalls or server accepts but never responds), this function can block well past the provided timeout. Use an http.Client with a small per-request timeout (or NewRequestWithContext) so the overall deadline is respected.

Copilot uses AI. Check for mistakes.
@claude
Copy link

claude bot commented Feb 2, 2026

Code Review: PR #85 - Docker Chrome Test Helpers

Summary

This PR refactors E2E testing infrastructure by introducing reusable Docker Chrome helpers (SetupDockerChrome() and WaitForServer()) to eliminate port conflicts, race conditions, and improve CI reliability. The changes reduce test boilerplate and make tests more maintainable.

✅ Strengths

1. Excellent Design & Abstraction

  • SetupDockerChrome() encapsulates complex Chrome setup into a simple helper with clean cleanup
  • DockerChromeContext struct provides good encapsulation of Chrome context and metadata
  • WaitForServer() replaces arbitrary sleeps with proper readiness polling - much more reliable
  • Dynamic port allocation via GetFreePort() eliminates hardcoded ports and race conditions

2. Code Quality

  • Consistent error handling with clear error messages
  • Proper use of t.Helper() in test helpers (testing/testing.go:270, testing/chrome.go:354)
  • Good documentation in docstrings explaining usage patterns
  • Consistent timeout handling (10s for server, 30s for Chrome)

3. Cleanup & Resource Management

  • Proper cleanup chain: timeout cancel → context cancel → allocator cancel → Docker stop
  • Sequential cleanup prevents resource leaks
  • Consistent use of defer cleanup() pattern in refactored tests

4. Testing Best Practices

  • Eliminates flaky time.Sleep(100ms) with proper server readiness checks
  • Uses host.docker.internal correctly for Docker Chrome → host communication
  • Consistent naming convention for shutdown contexts (shutdownCtx)

🔍 Issues & Concerns

1. Missing Test Coverage ⚠️

Severity: Medium

The new helper functions SetupDockerChrome() and WaitForServer() lack unit tests:

  • testing/chrome.go:351-367 (WaitForServer) - no tests validating timeout behavior
  • testing/testing.go:267-304 (SetupDockerChrome) - no tests for cleanup chain

2. Potential Resource Leak in Error Path ⚠️

Severity: Low
Location: testing/testing.go:277-303

If StartDockerChrome() succeeds but chromedp.NewRemoteAllocator() or chromedp.NewContext() fails, the Docker container may not be cleaned up. Add defer StopDockerChrome(t, chromePort) immediately after StartDockerChrome() succeeds.

3. WaitForServer HTTP Response Not Closed in Error Path ⚠️

Severity: Low
Location: testing/chrome.go:358-365

While unlikely, http.Get can return both a non-nil response and an error. Add defensive body close for the error path.

4. No Timeout Context for HTTP.Get ⚠️

Severity: Low
Location: testing/chrome.go:358

Consider using a client with timeout to prevent indefinite blocking.

🔒 Security Considerations

✅ No security concerns identified

⚡ Performance Considerations

Positive Impact:

  • Dynamic port allocation prevents CI failures from port conflicts
  • WaitForServer() with 100ms polling is efficient

📝 Documentation & Maintainability

Excellent documentation:

  • Clear usage examples in docstrings
  • Inline comments explain the "why"

🎯 Verdict

APPROVE with minor fixes recommended

This is a solid refactoring that significantly improves test reliability and maintainability. All issues identified are low severity and can be addressed as follow-ups.

Should fix soon:

  • Add test coverage for new helpers
  • Fix potential resource leak in SetupDockerChrome error path
  • Add defensive body close in WaitForServer

Nice to have:

  • Extract magic numbers to constants
  • Add HTTP client timeout to WaitForServer

Great work on improving test infrastructure! 🚀

Generated by Claude Code PR Review

- Fix docstring example to use chromeCtx.Context (Copilot)
- Remove Cancel field from DockerChromeContext to prevent misuse (Copilot)
  Users should use the cleanup function returned by SetupDockerChrome
- Add per-request HTTP timeout to WaitForServer (Copilot)
  Prevents indefinite blocking if connection hangs

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude
Copy link

claude bot commented Feb 2, 2026

PR Review: Refactor Testing Helpers for Docker Chrome

Overview

This PR refactors E2E tests to use new helper functions for Docker Chrome setup and server readiness checks. The changes improve test reliability by eliminating hardcoded ports and race conditions.


✅ Strengths

1. Excellent Problem-Solution Fit

  • Port conflicts eliminated: Dynamic port allocation via GetFreePort() replaces hardcoded ports (9001-9004)
  • Race conditions removed: WaitForServer() replaces time.Sleep(100*time.Millisecond) with proper polling
  • Code reuse: New helpers reduce boilerplate across 4 tests (55 lines added vs 71 deleted)

2. Well-Designed Abstractions

  • SetupDockerChrome() provides a clean, composable API for tests managing their own servers
  • DockerChromeContext struct properly encapsulates Chrome context and metadata
  • Clear separation of concerns between server setup and browser setup

3. Good Documentation

  • Comprehensive godoc comments with usage examples
  • Inline comments explain the host.docker.internal pattern
  • PR description clearly articulates benefits

🔍 Code Quality Observations

testing/testing.go:268-301

The SetupDockerChrome() function is well-structured with proper resource cleanup:

  • Multiple context cancel functions properly chained in cleanup closure
  • Timeout applied at the chromedp context level
  • t.Helper() correctly marks this as a test helper

testing/chrome.go:351-370

WaitForServer() implementation is solid:

  • Per-request timeout prevents indefinite blocking
  • 100ms polling interval is reasonable
  • Proper t.Helper() usage for test output
  • Log message provides useful feedback

e2e/livetemplate_core_test.go

Refactored tests follow consistent patterns:

  • Dynamic port allocation before server start
  • Server readiness check before Chrome operations
  • Proper cleanup with defer
  • Consistent use of GetChromeTestURL() for Docker networking

🐛 Potential Issues

1. Missing Error Check (Minor)

Location: e2e/livetemplate_core_test.go:1286 (and similar lines)

The tests check err from GetFreePort() but then immediately use chromeCtx.Context without checking if SetupDockerChrome() succeeded:

chromeCtx, cleanup := e2etest.SetupDockerChrome(t, 30*time.Second)
defer cleanup()

ctx := chromeCtx.Context  // Used immediately without checking for setup errors

Impact: Low - SetupDockerChrome() calls t.Fatalf() on failure, so the test will fail appropriately. However, the pattern is slightly inconsistent.

Recommendation: This is acceptable given t.Fatalf() usage, but consider documenting this behavior in the function godoc.

2. Cleanup Order with Deferred Functions

Location: All refactored tests in e2e/livetemplate_core_test.go

The tests have two deferred cleanup functions:

chromeCtx, cleanup := e2etest.SetupDockerChrome(t, 30*time.Second)
defer cleanup()  // Chrome cleanup (second to execute)

// ... server setup ...

defer func() {   // Server cleanup (first to execute)
    shutdownCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()
    if err := server.Shutdown(shutdownCtx); err != nil {
        t.Logf("Server shutdown warning: %v", err)
    }
}()

Impact: Low - The cleanup order (server first, then Chrome) is actually correct for this use case, since Chrome depends on the server.

Observation: The order is correct but worth noting for future reviewers.


🚀 Performance & Reliability

Positive Impact

  1. Dynamic ports eliminate test suite parallelization issues
  2. Proper server readiness checks make tests more deterministic
  3. Docker Chrome provides consistent headless browser environment across CI/local

Concerns

None identified. The polling intervals (100ms) are reasonable for test scenarios.


🔒 Security Considerations

No security concerns. These are test-only utilities.


🧪 Test Coverage

Coverage Impact

  • The new helper functions (SetupDockerChrome, WaitForServer) are tested indirectly through the refactored E2E tests
  • 4 existing tests refactored to use new helpers
  • No new test cases added (refactor-only PR)

Recommendation

Consider adding unit tests for WaitForServer() to verify:

  • Proper timeout behavior
  • Correct handling of server startup delays
  • HTTP client timeout isolation

This could be a follow-up PR rather than blocking this one.


📝 Documentation & Maintainability

Strengths

  • Godoc comments on new functions are excellent
  • Usage examples in comments match actual usage
  • PR description provides clear motivation and benefits
  • Inline comments explain Docker networking pattern

Minor Suggestions

  1. Consider adding a comment in SetupDockerChrome godoc about the t.Fatalf() behavior on errors
  2. The DockerChromeContext.t field is unexported and unused - consider removing it or documenting why it's there for future use

✅ Final Verdict

LGTM - Approve with minor observations

This is a well-executed refactoring that improves test reliability without introducing regressions. The new abstractions are clean, well-documented, and properly tested through usage. The minor issues noted are cosmetic and don't affect functionality.

Checklist

  • ✅ Code quality and best practices followed
  • ✅ No significant bugs identified
  • ✅ Performance improvements (eliminates race conditions)
  • ✅ No security concerns
  • ✅ Test coverage adequate (tests the refactored code)
  • ✅ Documentation is comprehensive

Recommendations for Future PRs

  1. Add unit tests for WaitForServer() helper
  2. Consider removing unused DockerChromeContext.t field or document its purpose
  3. Document t.Fatalf() behavior in SetupDockerChrome godoc

Great work! This refactoring significantly improves the test infrastructure. 🎉

On Linux CI, host.docker.internal only resolves from inside Docker
containers, not from the host machine. This fixes the test failure:

  Failed to fetch page: Get "http://host.docker.internal:38295":
  dial tcp: lookup host.docker.internal: no such host

Now uses:
- localhost for host-side http.Get() calls (runs on test host)
- GetChromeTestURL() only for chromedp navigation (runs inside Docker)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude
Copy link

claude bot commented Feb 2, 2026

Code Review - PR #85

Overall, this is a well-structured refactoring that significantly improves test reliability. The changes follow good practices and the implementation is solid. Below are my detailed findings:

✅ Strengths

  1. Excellent Abstraction: The SetupDockerChrome() and WaitForServer() helpers reduce boilerplate and improve test maintainability
  2. Proper Resource Management: Dynamic port allocation via GetFreePort() eliminates hardcoded ports and prevents conflicts
  3. Reliability Improvements: Replacing time.Sleep() with WaitForServer() polling eliminates race conditions
  4. Good Documentation: Clear godoc comments explain usage patterns with helpful examples
  5. Consistent Cleanup: Proper defer patterns ensure resources are cleaned up even on test failure

🔍 Code Quality Observations

testing/testing.go:268-302 (SetupDockerChrome)

Good:

  • Proper use of t.Helper() for better stack traces
  • Multiple context cancellation functions properly chained
  • Clear separation of concerns in cleanup function

Minor Suggestion:
The cleanup function stacks three cancel functions (timeoutCancel, ctxCancel, allocCancel) plus StopDockerChrome. Consider adding a comment explaining the cleanup order matters (timeout → context → allocator → container):

cleanup := func() {
    // Cancel in reverse order of creation
    timeoutCancel()  // Cancel timeout first
    ctxCancel()      // Then context
    allocCancel()    // Then allocator
    StopDockerChrome(t, chromePort)  // Finally stop container
}

testing/chrome.go:348-369 (WaitForServer)

Good:

  • 2-second per-request timeout prevents indefinite blocking
  • Proper use of t.Helper()
  • Closes response body immediately after successful check
  • Clear success logging

Potential Issue:
The function doesn't distinguish between different types of HTTP errors. A server returning 500 will be considered "ready" when it might actually be broken. Consider checking the status code:

resp, err := client.Get(serverURL)
if err == nil {
    resp.Body.Close()
    if resp.StatusCode < 500 {  // Accept any non-server-error status
        t.Logf("✅ Server ready at %s", serverURL)
        return
    }
}

However, this might be intentional if you want to detect the server is listening regardless of its health state.

e2e/livetemplate_core_test.go (Test Refactoring)

Good:

  • Consistent pattern across all 4 refactored tests
  • Proper distinction between localhostURL (for host-side HTTP) and chromeURL (for Docker Chrome)
  • Variable naming (shutdownCtx) is clearer than reusing ctx

Observation:
Lines 1210-1212 show careful handling of the URL distinction:

localhostURL := fmt.Sprintf("http://localhost:%d", port)
chromeURL := e2etest.GetChromeTestURL(port)
resp, err := http.Get(localhostURL)  // Correct - host-side fetch

This is excellent and prevents a common mistake.

🔒 Security Considerations

No security concerns identified. The changes:

  • Use proper resource cleanup to prevent resource exhaustion
  • Don't introduce any new external inputs or attack surfaces
  • Memory limits on Docker containers (512MB) are retained from existing code

⚡ Performance Considerations

Positive:

  • Dynamic port allocation enables parallel test execution
  • Docker Chrome container reuse is more efficient than spawning new Chrome instances

Minor Note:
WaitForServer() polls every 100ms which is reasonable, but the total timeout of 10 seconds might be conservative for slow CI environments. Current implementation is fine, but if tests occasionally timeout, this is a tuning knob.

🧪 Test Coverage

Observation:
The PR modifies 4 existing tests to use the new helpers:

  • TestLoadingIndicator
  • TestLoadingIndicatorDisabled
  • TestFocusPreservation
  • TestFocusPreservationMultipleInputs

Suggestion:
Consider adding a dedicated test for the new helpers themselves:

  • Test WaitForServer() timeout behavior
  • Test SetupDockerChrome() cleanup on test failure
  • Test that GetFreePort() actually returns available ports

This would increase confidence that the helpers work correctly in edge cases.

📋 Documentation

Good:

  • Clear godoc on DockerChromeContext struct with usage example
  • Inline comments explain the localhostURL vs chromeURL distinction
  • PR description clearly explains benefits

Suggestion:
Update CONTRIBUTING.md or e2e/README.md (if it exists) to document the new testing patterns for future contributors.

🐛 Potential Issues

  1. Error Handling in WaitForServer (Low Priority): As mentioned above, any HTTP response (including 500 errors) is considered success. This is likely intentional but worth confirming.

  2. Context Naming Collision (Low Priority): In the refactored tests, ctx := chromeCtx.Context reuses the variable name from the old code. While this works fine, it could be clearer to use chromedpCtx to distinguish it from other contexts like shutdownCtx.

  3. No Verification of Docker Availability (Very Low Priority): SetupDockerChrome() assumes Docker is available. The underlying StartDockerChrome() likely handles this, but a more explicit error message for missing Docker would improve developer experience.

✅ Recommendations

Must Fix: None

Should Consider:

  1. Add status code check to WaitForServer() or document why it's not needed
  2. Add tests for the new helper functions themselves
  3. Document the new testing patterns in contributing guides

Nice to Have:

  1. Add comment explaining cleanup order in SetupDockerChrome()
  2. Consider more descriptive variable names (chromedpCtx instead of ctx)

📊 Summary

This is a high-quality refactoring that achieves its stated goals:

  • ✅ Eliminates port conflicts
  • ✅ Improves CI reliability with Docker Chrome
  • ✅ Removes race conditions from time.Sleep() calls
  • ✅ Reduces boilerplate in tests with custom server setup

The code is well-written, properly documented, and ready for merge. The suggestions above are minor improvements and edge case considerations rather than blocking issues.

Verdict: Approved with minor suggestions for future improvements.


Review generated with assistance from Claude Code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant