refactor(testing): add Docker Chrome helpers for reliable CI testing by adnaan · Pull Request #85 · livetemplate/lvt

adnaan · 2026-02-02T06:38:31Z

Summary

Add SetupDockerChrome() helper to simplify Docker Chrome setup in tests that manage their own server
Add WaitForServer() helper to replace arbitrary time.Sleep() calls with proper readiness checks
Refactor loading indicator and focus preservation tests to use dynamic ports, Docker Chrome, and proper server readiness

Changes

File	Description
`testing/testing.go`	Added `DockerChromeContext` struct and `SetupDockerChrome()` helper
`testing/chrome.go`	Added `WaitForServer()` helper function
`e2e/livetemplate_core_test.go`	Refactored 4 tests to use new helpers

Benefits

No more port conflicts: Dynamic port allocation via GetFreePort() instead of hardcoded 9001-9004
Reliable CI: Docker Chrome for consistent headless browser execution
No race conditions: WaitForServer() replaces time.Sleep(100ms) with proper polling
Simpler test code: New helpers reduce boilerplate in tests with custom server setup

Test plan

CI browser tests pass
Loading indicator tests work correctly
Focus preservation tests work correctly

🤖 Generated with Claude Code

- Add SetupDockerChrome() helper to simplify Docker Chrome setup in tests - Add WaitForServer() helper to replace arbitrary time.Sleep() calls - Refactor loading indicator and focus preservation tests to use: - Dynamic port allocation via GetFreePort() - Docker Chrome for consistent CI execution - Proper server readiness checks - GetChromeTestURL() for Docker-to-host networking This improves test reliability by eliminating hardcoded ports and race conditions from sleep-based waiting. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Copilot

Pull request overview

Adds reusable helpers to make browser E2E tests more reliable in CI by standardizing Docker-based Chrome setup and replacing arbitrary sleeps with a server-readiness poll, then refactors selected tests to use these helpers and dynamic ports.

Changes:

Added SetupDockerChrome() and a DockerChromeContext helper to simplify Docker Chrome usage in tests that run their own HTTP server.
Added WaitForServer() helper to poll for server readiness instead of using time.Sleep.
Refactored 4 E2E tests to use dynamic ports + Docker Chrome + readiness polling.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File	Description
testing/testing.go	Introduces `DockerChromeContext` and `SetupDockerChrome()` helper for Docker-based chromedp contexts.
testing/chrome.go	Adds `WaitForServer()` helper for server readiness polling.
e2e/livetemplate_core_test.go	Updates loading/focus E2E tests to use dynamic ports, Docker Chrome, and server readiness checks.

Comments suppressed due to low confidence (1)

e2e/livetemplate_core_test.go:1216

url is now GetChromeTestURL(port) (host.docker.internal), but this http.Get(url) runs on the host test process, not inside the Docker Chrome container. On Linux CI, host.docker.internal typically won’t resolve from the host, causing this check to fail. Use http://localhost:<port> for the host-side fetch, and keep GetChromeTestURL only for chromedp navigation inside Docker.

	// Use host.docker.internal for Docker Chrome to access host server
	url := e2etest.GetChromeTestURL(port)

	resp, err := http.Get(url)
	if err != nil {
		t.Fatalf("Failed to fetch page: %v", err)
	}

Copilot · 2026-02-02T06:42:32Z

testing/testing.go

+//
+//	// Use GetChromeTestURL for Docker Chrome to access host
+//	url := e2etest.GetChromeTestURL(port)
+//	chromedp.Run(chromeCtx, chromedp.Navigate(url), ...)


In the usage comment, SetupDockerChrome returns a *DockerChromeContext, but the example calls chromedp.Run(chromeCtx, ...). That won’t compile and may mislead future test authors; the example should use chromeCtx.Context (or return a context.Context directly).

Suggested change

// chromedp.Run(chromeCtx, chromedp.Navigate(url), ...)

// chromedp.Run(chromeCtx.Context, chromedp.Navigate(url), ...)

Copilot · 2026-02-02T06:42:32Z

testing/testing.go

+type DockerChromeContext struct {
+	Context    context.Context
+	Cancel     context.CancelFunc
+	ChromePort int
+	t          *testing.T
+}
+
+// SetupDockerChrome starts a Docker Chrome container and returns a chromedp context.
+// Call cleanup() when done to stop the container and cancel the context.
+func SetupDockerChrome(t *testing.T, timeout time.Duration) (*DockerChromeContext, func()) {
+	t.Helper()
+
+	chromePort, err := GetFreePort()
+	if err != nil {
+		t.Fatalf("Failed to allocate Chrome port: %v", err)
+	}
+
+	if err := StartDockerChrome(t, chromePort); err != nil {
+		t.Fatalf("Failed to start Docker Chrome: %v", err)
+	}
+
+	chromeURL := fmt.Sprintf("http://localhost:%d", chromePort)
+	allocCtx, allocCancel := chromedp.NewRemoteAllocator(context.Background(), chromeURL)
+
+	ctx, ctxCancel := chromedp.NewContext(allocCtx, chromedp.WithLogf(t.Logf))
+
+	// Apply timeout
+	ctx, timeoutCancel := context.WithTimeout(ctx, timeout)
+
+	dcc := &DockerChromeContext{
+		Context:    ctx,
+		Cancel:     timeoutCancel,
+		ChromePort: chromePort,
+		t:          t,
+	}


DockerChromeContext.Cancel is set to timeoutCancel, which only cancels the timeout context and does not cancel the chromedp context/allocator or stop the Docker container. Since this is an exported field, it’s easy to misuse and leak containers/resources; consider removing the Cancel field or making it cancel the full Chrome session (or expose only the cleanup function).

Copilot · 2026-02-02T06:42:32Z

testing/chrome.go

+func WaitForServer(t *testing.T, serverURL string, timeout time.Duration) {
+	t.Helper()
+
+	deadline := time.Now().Add(timeout)
+	for time.Now().Before(deadline) {
+		resp, err := http.Get(serverURL)
+		if err == nil {
+			resp.Body.Close()
+			t.Logf("✅ Server ready at %s", serverURL)
+			return
+		}
+		time.Sleep(100 * time.Millisecond)
+	}
+	t.Fatalf("Server at %s failed to become ready within %v", serverURL, timeout)


WaitForServer uses http.Get with the default client, which has no request/response timeout. If the connection hangs (e.g., SYN stalls or server accepts but never responds), this function can block well past the provided timeout. Use an http.Client with a small per-request timeout (or NewRequestWithContext) so the overall deadline is respected.

claude · 2026-02-02T06:43:01Z

Code Review: PR #85 - Docker Chrome Test Helpers

Summary

This PR refactors E2E testing infrastructure by introducing reusable Docker Chrome helpers (SetupDockerChrome() and WaitForServer()) to eliminate port conflicts, race conditions, and improve CI reliability. The changes reduce test boilerplate and make tests more maintainable.

✅ Strengths

1. Excellent Design & Abstraction

SetupDockerChrome() encapsulates complex Chrome setup into a simple helper with clean cleanup
DockerChromeContext struct provides good encapsulation of Chrome context and metadata
WaitForServer() replaces arbitrary sleeps with proper readiness polling - much more reliable
Dynamic port allocation via GetFreePort() eliminates hardcoded ports and race conditions

2. Code Quality

Consistent error handling with clear error messages
Proper use of t.Helper() in test helpers (testing/testing.go:270, testing/chrome.go:354)
Good documentation in docstrings explaining usage patterns
Consistent timeout handling (10s for server, 30s for Chrome)

3. Cleanup & Resource Management

Proper cleanup chain: timeout cancel → context cancel → allocator cancel → Docker stop
Sequential cleanup prevents resource leaks
Consistent use of defer cleanup() pattern in refactored tests

4. Testing Best Practices

Eliminates flaky time.Sleep(100ms) with proper server readiness checks
Uses host.docker.internal correctly for Docker Chrome → host communication
Consistent naming convention for shutdown contexts (shutdownCtx)

🔍 Issues & Concerns

1. Missing Test Coverage ⚠️

Severity: Medium

The new helper functions SetupDockerChrome() and WaitForServer() lack unit tests:

testing/chrome.go:351-367 (WaitForServer) - no tests validating timeout behavior
testing/testing.go:267-304 (SetupDockerChrome) - no tests for cleanup chain

2. Potential Resource Leak in Error Path ⚠️

Severity: Low
Location: testing/testing.go:277-303

If StartDockerChrome() succeeds but chromedp.NewRemoteAllocator() or chromedp.NewContext() fails, the Docker container may not be cleaned up. Add defer StopDockerChrome(t, chromePort) immediately after StartDockerChrome() succeeds.

3. WaitForServer HTTP Response Not Closed in Error Path ⚠️

Severity: Low
Location: testing/chrome.go:358-365

While unlikely, http.Get can return both a non-nil response and an error. Add defensive body close for the error path.

4. No Timeout Context for HTTP.Get ⚠️

Severity: Low
Location: testing/chrome.go:358

Consider using a client with timeout to prevent indefinite blocking.

🔒 Security Considerations

✅ No security concerns identified

⚡ Performance Considerations

Positive Impact:

Dynamic port allocation prevents CI failures from port conflicts
WaitForServer() with 100ms polling is efficient

📝 Documentation & Maintainability

✅ Excellent documentation:

Clear usage examples in docstrings
Inline comments explain the "why"

🎯 Verdict

APPROVE with minor fixes recommended

This is a solid refactoring that significantly improves test reliability and maintainability. All issues identified are low severity and can be addressed as follow-ups.

Should fix soon:

Add test coverage for new helpers
Fix potential resource leak in SetupDockerChrome error path
Add defensive body close in WaitForServer

Nice to have:

Extract magic numbers to constants
Add HTTP client timeout to WaitForServer

Great work on improving test infrastructure! 🚀

Generated by Claude Code PR Review

- Fix docstring example to use chromeCtx.Context (Copilot) - Remove Cancel field from DockerChromeContext to prevent misuse (Copilot) Users should use the cleanup function returned by SetupDockerChrome - Add per-request HTTP timeout to WaitForServer (Copilot) Prevents indefinite blocking if connection hangs Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude · 2026-02-02T06:46:38Z

PR Review: Refactor Testing Helpers for Docker Chrome

Overview

This PR refactors E2E tests to use new helper functions for Docker Chrome setup and server readiness checks. The changes improve test reliability by eliminating hardcoded ports and race conditions.

✅ Strengths

1. Excellent Problem-Solution Fit

Port conflicts eliminated: Dynamic port allocation via GetFreePort() replaces hardcoded ports (9001-9004)
Race conditions removed: WaitForServer() replaces time.Sleep(100*time.Millisecond) with proper polling
Code reuse: New helpers reduce boilerplate across 4 tests (55 lines added vs 71 deleted)

2. Well-Designed Abstractions

SetupDockerChrome() provides a clean, composable API for tests managing their own servers
DockerChromeContext struct properly encapsulates Chrome context and metadata
Clear separation of concerns between server setup and browser setup

3. Good Documentation

Comprehensive godoc comments with usage examples
Inline comments explain the host.docker.internal pattern
PR description clearly articulates benefits

🔍 Code Quality Observations

testing/testing.go:268-301 ✅

The SetupDockerChrome() function is well-structured with proper resource cleanup:

Multiple context cancel functions properly chained in cleanup closure
Timeout applied at the chromedp context level
t.Helper() correctly marks this as a test helper

testing/chrome.go:351-370 ✅

WaitForServer() implementation is solid:

Per-request timeout prevents indefinite blocking
100ms polling interval is reasonable
Proper t.Helper() usage for test output
Log message provides useful feedback

e2e/livetemplate_core_test.go ✅

Refactored tests follow consistent patterns:

Dynamic port allocation before server start
Server readiness check before Chrome operations
Proper cleanup with defer
Consistent use of GetChromeTestURL() for Docker networking

🐛 Potential Issues

1. Missing Error Check (Minor)

Location: e2e/livetemplate_core_test.go:1286 (and similar lines)

The tests check err from GetFreePort() but then immediately use chromeCtx.Context without checking if SetupDockerChrome() succeeded:

chromeCtx, cleanup := e2etest.SetupDockerChrome(t, 30*time.Second)
defer cleanup()

ctx := chromeCtx.Context  // Used immediately without checking for setup errors

Impact: Low - SetupDockerChrome() calls t.Fatalf() on failure, so the test will fail appropriately. However, the pattern is slightly inconsistent.

Recommendation: This is acceptable given t.Fatalf() usage, but consider documenting this behavior in the function godoc.

2. Cleanup Order with Deferred Functions

Location: All refactored tests in e2e/livetemplate_core_test.go

The tests have two deferred cleanup functions:

chromeCtx, cleanup := e2etest.SetupDockerChrome(t, 30*time.Second)
defer cleanup()  // Chrome cleanup (second to execute)

// ... server setup ...

defer func() {   // Server cleanup (first to execute)
    shutdownCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()
    if err := server.Shutdown(shutdownCtx); err != nil {
        t.Logf("Server shutdown warning: %v", err)
    }
}()

Impact: Low - The cleanup order (server first, then Chrome) is actually correct for this use case, since Chrome depends on the server.

Observation: The order is correct but worth noting for future reviewers.

🚀 Performance & Reliability

Positive Impact

Dynamic ports eliminate test suite parallelization issues
Proper server readiness checks make tests more deterministic
Docker Chrome provides consistent headless browser environment across CI/local

Concerns

None identified. The polling intervals (100ms) are reasonable for test scenarios.

🔒 Security Considerations

No security concerns. These are test-only utilities.

🧪 Test Coverage

Coverage Impact

The new helper functions (SetupDockerChrome, WaitForServer) are tested indirectly through the refactored E2E tests
4 existing tests refactored to use new helpers
No new test cases added (refactor-only PR)

Recommendation

Consider adding unit tests for WaitForServer() to verify:

Proper timeout behavior
Correct handling of server startup delays
HTTP client timeout isolation

This could be a follow-up PR rather than blocking this one.

📝 Documentation & Maintainability

Strengths

Godoc comments on new functions are excellent
Usage examples in comments match actual usage
PR description provides clear motivation and benefits
Inline comments explain Docker networking pattern

Minor Suggestions

Consider adding a comment in SetupDockerChrome godoc about the t.Fatalf() behavior on errors
The DockerChromeContext.t field is unexported and unused - consider removing it or documenting why it's there for future use

✅ Final Verdict

LGTM - Approve with minor observations

This is a well-executed refactoring that improves test reliability without introducing regressions. The new abstractions are clean, well-documented, and properly tested through usage. The minor issues noted are cosmetic and don't affect functionality.

Checklist

✅ Code quality and best practices followed
✅ No significant bugs identified
✅ Performance improvements (eliminates race conditions)
✅ No security concerns
✅ Test coverage adequate (tests the refactored code)
✅ Documentation is comprehensive

Recommendations for Future PRs

Add unit tests for WaitForServer() helper
Consider removing unused DockerChromeContext.t field or document its purpose
Document t.Fatalf() behavior in SetupDockerChrome godoc

Great work! This refactoring significantly improves the test infrastructure. 🎉

On Linux CI, host.docker.internal only resolves from inside Docker containers, not from the host machine. This fixes the test failure: Failed to fetch page: Get "http://host.docker.internal:38295": dial tcp: lookup host.docker.internal: no such host Now uses: - localhost for host-side http.Get() calls (runs on test host) - GetChromeTestURL() only for chromedp navigation (runs inside Docker) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude · 2026-02-02T07:00:30Z

Code Review - PR #85

Overall, this is a well-structured refactoring that significantly improves test reliability. The changes follow good practices and the implementation is solid. Below are my detailed findings:

✅ Strengths

Excellent Abstraction: The SetupDockerChrome() and WaitForServer() helpers reduce boilerplate and improve test maintainability
Proper Resource Management: Dynamic port allocation via GetFreePort() eliminates hardcoded ports and prevents conflicts
Reliability Improvements: Replacing time.Sleep() with WaitForServer() polling eliminates race conditions
Good Documentation: Clear godoc comments explain usage patterns with helpful examples
Consistent Cleanup: Proper defer patterns ensure resources are cleaned up even on test failure

🔍 Code Quality Observations

testing/testing.go:268-302 (SetupDockerChrome)

Good:

Proper use of t.Helper() for better stack traces
Multiple context cancellation functions properly chained
Clear separation of concerns in cleanup function

Minor Suggestion:
The cleanup function stacks three cancel functions (timeoutCancel, ctxCancel, allocCancel) plus StopDockerChrome. Consider adding a comment explaining the cleanup order matters (timeout → context → allocator → container):

cleanup := func() {
    // Cancel in reverse order of creation
    timeoutCancel()  // Cancel timeout first
    ctxCancel()      // Then context
    allocCancel()    // Then allocator
    StopDockerChrome(t, chromePort)  // Finally stop container
}

testing/chrome.go:348-369 (WaitForServer)

Good:

2-second per-request timeout prevents indefinite blocking
Proper use of t.Helper()
Closes response body immediately after successful check
Clear success logging

Potential Issue:
The function doesn't distinguish between different types of HTTP errors. A server returning 500 will be considered "ready" when it might actually be broken. Consider checking the status code:

resp, err := client.Get(serverURL)
if err == nil {
    resp.Body.Close()
    if resp.StatusCode < 500 {  // Accept any non-server-error status
        t.Logf("✅ Server ready at %s", serverURL)
        return
    }
}

However, this might be intentional if you want to detect the server is listening regardless of its health state.

e2e/livetemplate_core_test.go (Test Refactoring)

Good:

Consistent pattern across all 4 refactored tests
Proper distinction between localhostURL (for host-side HTTP) and chromeURL (for Docker Chrome)
Variable naming (shutdownCtx) is clearer than reusing ctx

Observation:
Lines 1210-1212 show careful handling of the URL distinction:

localhostURL := fmt.Sprintf("http://localhost:%d", port)
chromeURL := e2etest.GetChromeTestURL(port)
resp, err := http.Get(localhostURL)  // Correct - host-side fetch

This is excellent and prevents a common mistake.

🔒 Security Considerations

No security concerns identified. The changes:

Use proper resource cleanup to prevent resource exhaustion
Don't introduce any new external inputs or attack surfaces
Memory limits on Docker containers (512MB) are retained from existing code

⚡ Performance Considerations

Positive:

Dynamic port allocation enables parallel test execution
Docker Chrome container reuse is more efficient than spawning new Chrome instances

Minor Note:
WaitForServer() polls every 100ms which is reasonable, but the total timeout of 10 seconds might be conservative for slow CI environments. Current implementation is fine, but if tests occasionally timeout, this is a tuning knob.

🧪 Test Coverage

Observation:
The PR modifies 4 existing tests to use the new helpers:

TestLoadingIndicator
TestLoadingIndicatorDisabled
TestFocusPreservation
TestFocusPreservationMultipleInputs

Suggestion:
Consider adding a dedicated test for the new helpers themselves:

Test WaitForServer() timeout behavior
Test SetupDockerChrome() cleanup on test failure
Test that GetFreePort() actually returns available ports

This would increase confidence that the helpers work correctly in edge cases.

📋 Documentation

Good:

Clear godoc on DockerChromeContext struct with usage example
Inline comments explain the localhostURL vs chromeURL distinction
PR description clearly explains benefits

Suggestion:
Update CONTRIBUTING.md or e2e/README.md (if it exists) to document the new testing patterns for future contributors.

🐛 Potential Issues

Error Handling in WaitForServer (Low Priority): As mentioned above, any HTTP response (including 500 errors) is considered success. This is likely intentional but worth confirming.
Context Naming Collision (Low Priority): In the refactored tests, ctx := chromeCtx.Context reuses the variable name from the old code. While this works fine, it could be clearer to use chromedpCtx to distinguish it from other contexts like shutdownCtx.
No Verification of Docker Availability (Very Low Priority): SetupDockerChrome() assumes Docker is available. The underlying StartDockerChrome() likely handles this, but a more explicit error message for missing Docker would improve developer experience.

✅ Recommendations

Must Fix: None

Should Consider:

Add status code check to WaitForServer() or document why it's not needed
Add tests for the new helper functions themselves
Document the new testing patterns in contributing guides

Nice to Have:

Add comment explaining cleanup order in SetupDockerChrome()
Consider more descriptive variable names (chromedpCtx instead of ctx)

📊 Summary

This is a high-quality refactoring that achieves its stated goals:

✅ Eliminates port conflicts
✅ Improves CI reliability with Docker Chrome
✅ Removes race conditions from time.Sleep() calls
✅ Reduces boilerplate in tests with custom server setup

The code is well-written, properly documented, and ready for merge. The suggestions above are minor improvements and edge case considerations rather than blocking issues.

Verdict: Approved with minor suggestions for future improvements.

Review generated with assistance from Claude Code

Copilot AI review requested due to automatic review settings February 2, 2026 06:38

Copilot started reviewing on behalf of adnaan February 2, 2026 06:38 View session

Copilot AI reviewed Feb 2, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(testing): add Docker Chrome helpers for reliable CI testing#85

refactor(testing): add Docker Chrome helpers for reliable CI testing#85
adnaan wants to merge 3 commits intomainfrom
refactor/docker-chrome-test-helpers

adnaan commented Feb 2, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Feb 2, 2026

Uh oh!

Copilot AI Feb 2, 2026

Uh oh!

Copilot AI Feb 2, 2026

Uh oh!

claude bot commented Feb 2, 2026

Uh oh!

claude bot commented Feb 2, 2026

Uh oh!

claude bot commented Feb 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	// chromedp.Run(chromeCtx, chromedp.Navigate(url), ...)
	// chromedp.Run(chromeCtx.Context, chromedp.Navigate(url), ...)

Conversation

adnaan commented Feb 2, 2026

Summary

Changes

Benefits

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Feb 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 2, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot commented Feb 2, 2026

Code Review: PR #85 - Docker Chrome Test Helpers

Summary

✅ Strengths

1. Excellent Design & Abstraction

2. Code Quality

3. Cleanup & Resource Management

4. Testing Best Practices

🔍 Issues & Concerns

1. Missing Test Coverage ⚠️

2. Potential Resource Leak in Error Path ⚠️

3. WaitForServer HTTP Response Not Closed in Error Path ⚠️

4. No Timeout Context for HTTP.Get ⚠️

🔒 Security Considerations

⚡ Performance Considerations

📝 Documentation & Maintainability

🎯 Verdict

Uh oh!

claude bot commented Feb 2, 2026

PR Review: Refactor Testing Helpers for Docker Chrome

Overview

✅ Strengths

1. Excellent Problem-Solution Fit

2. Well-Designed Abstractions

3. Good Documentation

🔍 Code Quality Observations

testing/testing.go:268-301 ✅

testing/chrome.go:351-370 ✅

e2e/livetemplate_core_test.go ✅

🐛 Potential Issues

1. Missing Error Check (Minor)

2. Cleanup Order with Deferred Functions

🚀 Performance & Reliability

Positive Impact

Concerns

🔒 Security Considerations

🧪 Test Coverage

Coverage Impact

Recommendation

📝 Documentation & Maintainability

Strengths

Minor Suggestions

✅ Final Verdict

Checklist

Recommendations for Future PRs

Uh oh!

claude bot commented Feb 2, 2026

Code Review - PR #85

✅ Strengths

🔍 Code Quality Observations

testing/testing.go:268-302 (SetupDockerChrome)

testing/chrome.go:348-369 (WaitForServer)

e2e/livetemplate_core_test.go (Test Refactoring)

🔒 Security Considerations

⚡ Performance Considerations

🧪 Test Coverage

📋 Documentation

🐛 Potential Issues

✅ Recommendations