Issue 2: Expand Automated Test Coverage (Unit, Integration, and Visual Tests) #3

Summary: To increase confidence in the game’s stability (especially before a Steam release), we need to significantly improve automated test coverage. Currently, the project has a basic test setup using Jest and Puppeteer, but coverage is incomplete – some critical systems and scenarios are either untested or only partially tested. This issue proposes adding a comprehensive suite of unit tests (for game logic modules and utilities) and integration/end-to-end tests (using Puppeteer for in-browser gameplay) to cover key gameplay aspects: input handling, physics responses, terrain generation consistency, UI flows, player actions (tricks, collisions), and rendering correctness.

Why It Matters: Robust test coverage will catch regressions when we refactor or add features. With the upcoming code modularization (Issue 1) and other refactors, having tests for each component (e.g. terrain generator, input mapper, trick system) will ensure we haven’t broken anything. For a Steam release, we want to guarantee a stable, high-quality experience – automated tests will continuously verify core mechanics (such as movement and scoring) and prevent subtle bugs from reaching players. Moreover, writing tests now forces us to define clear expectations for each part of the game, thereby improving design quality.

Current State: The project already includes Jest and Puppeteer in its toolchain. Some tests exist (for example, there are integration tests that simulate gameplay using Puppeteer, and a few unit tests for scenes), but coverage is below the desired 80% threshold. We also have incomplete .bak test files indicating planned but unfinished tests (e.g., trick detection). Areas like the input system, physics configuration, and certain UI behaviors appear to have few or no automated tests. We should build on the existing testing framework to fill these gaps.

Proposed Testing Improvements:
Unit Tests for Game Logic Modules: After refactoring, we will have discrete modules (InputController, TerrainManager, etc.) which can be tested in isolation. Write Jest unit tests for each:
Input handling: Simulate keyboard events or call input module methods to ensure that pressing keys (or gamepad buttons) correctly sets the expected action flags. For example, test that when the Input module receives a "jump" key, the internal state reflects that and resets appropriately after use. If using the Manette class, test methods like isActionActive() and toggling walk mode (simulate pressing Tab and ensure walkMode toggles).
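A minimal sketch of such a test, assuming Manette exposes handleKeyDown/handleKeyUp handlers and an isActionActive() query (the module path, constructor, and handler names are assumptions to check against the real API):

```js
// tests/unit/manette-input.test.js (sketch): all names below are assumptions
// about the Manette API and should be adjusted to match the actual code.
const Manette = require('../../js/Manette'); // hypothetical path

describe('Manette input mapping', () => {
  let input;
  beforeEach(() => {
    input = new Manette();
  });

  test('pressing Space activates the jump action and releasing clears it', () => {
    input.handleKeyDown({ code: 'Space' }); // hypothetical handler
    expect(input.isActionActive('jump')).toBe(true);
    input.handleKeyUp({ code: 'Space' });
    expect(input.isActionActive('jump')).toBe(false);
  });

  test('Tab toggles walk mode', () => {
    expect(input.walkMode).toBe(false);
    input.handleKeyDown({ code: 'Tab' });
    expect(input.walkMode).toBe(true);
  });
});
```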
Physics helper logic: Test any deterministic functions or config usage. For instance, if there are utility functions (like seed generation or angle normalization), verify their outputs. Test RotationSystem (flip and landing detection) by feeding sequences of angles and checking that it triggers flip completion events at 360° rotations, or classifies landing angles as clean vs. crash correctly.
Terrain generation: Given a fixed random seed, ensure that TerrainManager produces a consistent sequence of terrain segment endpoints. We can inject a predictable RNG (e.g., mock initializeRandomWithSeed to return a constant sequence) and verify the segments generated match expected coordinates. This ensures determinism for a given seed and catches any algorithm changes.
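For example (a sketch that assumes TerrainManager accepts an injectable RNG; alternatively, initializeRandomWithSeed could be mocked directly):

```js
// tests/unit/terrain-manager.test.js (sketch): constructor options and method
// names are assumptions; the point is that two identical RNG sequences must
// yield identical segment lists.
const TerrainManager = require('../../js/TerrainManager'); // hypothetical path

function makeRng() {
  // Deterministic stand-in for the seeded RNG.
  const sequence = [0.12, 0.48, 0.93, 0.27, 0.66];
  let i = 0;
  return () => sequence[i++ % sequence.length];
}

test('identical RNG sequences produce identical terrain segments', () => {
  const a = new TerrainManager({ rng: makeRng() });
  const b = new TerrainManager({ rng: makeRng() });

  a.generateSegments(10); // hypothetical method
  b.generateSegments(10);

  expect(a.segments).toEqual(b.segments);
});
```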
UI logic: For simple UI updates, we can write unit tests that instantiate the HUD module with a mock scene (Phaser objects can be stubbed) and call its update methods. For example, call hud.updateLives(3) and verify that the lives display object now contains 3 life icons. Similarly, test that showToast("Test") adds a text child to the toast container, and that it is cleared after the intended duration (this might involve using Jest timers or simulating the passage of time).
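A sketch of the toast test, assuming the HUD schedules cleanup with setTimeout (if it uses Phaser’s clock instead, the timer would need to be stepped through a mocked scene; the activeToasts field is hypothetical):

```js
// tests/unit/hud.test.js (sketch): HUD's constructor, showToast, and
// activeToasts are assumptions about the module's shape.
jest.useFakeTimers();
const HUD = require('../../js/HUD'); // hypothetical path

test('showToast adds a message and clears it after its duration', () => {
  const fakeScene = {
    add: { text: jest.fn(() => ({ destroy: jest.fn() })) }, // stubbed Phaser factory
  };
  const hud = new HUD(fakeScene);

  hud.showToast('Test');
  expect(fakeScene.add.text).toHaveBeenCalled();

  jest.runAllTimers(); // fast-forward past the toast duration
  expect(hud.activeToasts).toHaveLength(0); // hypothetical bookkeeping field
});
```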
Edge cases: Each unit test suite should also cover edge conditions (e.g., terrain manager with no initial segments, input controller with no input, rotation system at boundary angles like exactly 30° or 110° to ensure correct classification).
Integration Tests (Gameplay Flows with Puppeteer): Augment the existing browser tests to cover more scenarios end-to-end:
Input and Movement: Use Puppeteer to simulate key presses and ensure the player responds correctly. For example, press the right arrow (or ‘D’ key) and verify the player’s X position increases on the next frame (the existing integration tests already do a basic check for movement; we can expand on that by checking the magnitude of movement or continuous movement over time). Simulate a jump (space bar) and confirm the player’s Y velocity changes (the player should go up).
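A sketch in the existing Mocha/Chai style (assumes the suite’s setup exposes a shared Puppeteer page, and that the scene is reachable at window.game.scene.keys.GameScene, as referenced elsewhere in this issue):

```js
const { expect } = require('chai');

it('moves the player right while D is held', async () => {
  const playerX = () =>
    page.evaluate(() => window.game.scene.keys.GameScene.player.x);

  const before = await playerX();
  await page.keyboard.down('KeyD');
  await new Promise((r) => setTimeout(r, 500)); // let several frames run
  await page.keyboard.up('KeyD');
  const after = await playerX();

  expect(after).to.be.greaterThan(before);
});
```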
Trick execution: Write a test to perform a backflip and verify it is detected and scored. For instance, using Puppeteer, have the player gain speed, jump, then hold a rotation key (like ‘W’/up arrow) to flip. When the player lands, check that the game registers a flip (perhaps via a score increase or a console message or an in-game indicator). We can also verify the “wobble” state on a bad landing by scripting a scenario where the player lands at a slightly off angle – the RotationSystem’s wobble callback could trigger a known outcome (like a certain message or state flag we can inspect via page.evaluate).
UI flow and transitions: Automate the start-to-game flow and game-over reset. E.g., verify that clicking the "START GAME" button on the start scene actually launches the GameScene. Then simulate losing all lives: intentionally crash the player enough times (you can force the player’s y position high and drop them to simulate a crash, or manipulate gameScene.lives in the page context) and ensure that the game-over sequence triggers (e.g., the game might show a “Game Over” text or return to the start menu). If the game automatically restarts or goes to a game-over screen, check that flow. There is an existing test for restarting after lives are lost; we should ensure it passes and possibly extend it to check that the state truly resets (score back to 0, terrain reset).
Physics and Collision: Use integration tests to validate physics interactions. For example, test that the player cannot fall through the ground: push the player downward at high speed and ensure they collide and stop above terrain. Or spawn an extra life collectible (perhaps by calling a method or fast-forwarding the spawn timer) and simulate the player hitting it, then verify lives increased by one and the collectible was removed.
Rendering consistency (Visual Regression): Introduce visual snapshot tests for critical game states. Using Puppeteer’s screenshot capabilities, we can capture images of the canvas at certain points (e.g., immediately after starting the game, after a crash, after resizing the window) and compare them to expected images. This will catch any unintended changes in rendering. We can utilize a library (like jest-image-snapshot or pixelmatch) to automate the comparison. For example, after starting the game, take a screenshot of the entire game canvas and compare against a baseline image to ensure the terrain and UI are drawn correctly (no obvious graphical glitch). Likewise, after performing a trick, take a screenshot to verify the toast message appears. These tests would flag if, say, a refactor accidentally stopped the terrain from drawing or a UI element from updating, even if the game logic is running.
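If we go the jest-image-snapshot route, the test body is small (this sketch assumes the E2E run exposes a shared Puppeteer page, e.g. via jest-puppeteer, and a dev server at an assumed URL):

```js
const { toMatchImageSnapshot } = require('jest-image-snapshot');
expect.extend({ toMatchImageSnapshot });

test('start screen renders as expected', async () => {
  await page.goto('http://localhost:8080'); // assumed dev-server URL
  await page.waitForSelector('canvas');

  const image = await page.screenshot();
  expect(image).toMatchImageSnapshot({
    failureThreshold: 0.01,           // tolerate 1% of pixels differing
    failureThresholdType: 'percent',  // absorbs minor anti-aliasing noise
  });
});
```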
Multi-seed determinism: Write an integration test for terrain determinism using seeds. For instance, launch the game with a known seed (we can programmatically set window.gameSeed before starting GameScene), run the player forward for a few seconds, then record the generated terrain points. Restart the game with the same seed and ensure the terrain generated is identical. This could be done by evaluating gameScene.terrainSegments in the browser context and comparing arrays. The existing tests had a concept of storing terrain in localStorage for comparison; we can turn that into a Puppeteer test that does the comparison in code and uses expect assertions to fail if there's a mismatch.
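A sketch of that comparison (window.gameSeed and gameScene.terrainSegments come from this issue; the per-segment fields and dev-server URL are assumptions):

```js
const { expect } = require('chai');

// Boot the game with a fixed seed and return the generated terrain points.
async function terrainForSeed(seed) {
  await page.evaluateOnNewDocument((s) => { window.gameSeed = s; }, seed);
  await page.goto('http://localhost:8080'); // assumed dev-server URL
  await page.waitForFunction(() => window.game && window.game.scene.keys.GameScene);
  await new Promise((r) => setTimeout(r, 3000)); // let terrain generate
  return page.evaluate(() =>
    window.game.scene.keys.GameScene.terrainSegments.map(
      (seg) => [seg.x1, seg.y1, seg.x2, seg.y2] // assumed segment fields
    )
  );
}

it('generates identical terrain for the same seed', async () => {
  const first = await terrainForSeed(12345);
  const second = await terrainForSeed(12345);
  expect(second).to.deep.equal(first);
});
```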
UI responsiveness: Test the game at different viewport sizes via Puppeteer’s page.setViewport to simulate window resizing. Verify that on resize the game resizes properly and UI elements (like HUD text) remain correctly positioned (e.g., the lives display stays in the top-right). There is already a test for window-resizing behavior; we should keep it and possibly enhance it (check some UI element’s new position or the canvas dimensions).
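A sketch of the enhanced check (whether the canvas tracks the viewport exactly depends on the game’s Phaser scale mode, so the expected values below are placeholders):

```js
const { expect } = require('chai');

it('resizes the canvas with the viewport', async () => {
  await page.setViewport({ width: 1280, height: 720 });
  await new Promise((r) => setTimeout(r, 250)); // allow the resize handler to run

  const size = await page.evaluate(() => {
    const canvas = document.querySelector('canvas');
    return { w: canvas.clientWidth, h: canvas.clientHeight };
  });

  expect(size.w).to.equal(1280); // placeholder expectation; depends on scale mode
  expect(size.h).to.equal(720);
});
```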
Test Organization and Tools: We will organize new tests into appropriate directories (e.g., tests/unit/ for unit tests by module, and tests/e2e/ for integration tests). We should leverage the existing configuration:
Continue using Jest for all unit tests. Use Jest’s mocks for Phaser classes to test modules in isolation (there are already mocks in start-scene.test.js for Phaser methods, which we can emulate for other scenes or modules).
Use Puppeteer (via Mocha or Jest) for browser tests. The project currently runs Puppeteer tests through Mocha (npm run test:puppeteer uses Mocha and Chai). We can continue with Mocha for now to avoid reconfiguration, writing new .test.js files for Puppeteer scenarios as needed. Alternatively, consider switching to Jest for E2E as well for consistency, but that can be decided later – the immediate goal is more coverage, not tool refactoring.
Aim to incorporate these tests into the CI pipeline (if one exists or when one is set up). All tests (unit + integration) should be run in CI to catch issues early.
Plan & Steps:
Audit and Identify Gaps: Review the current test suite to list which functionalities are not covered. For example, note that trick detection has a .bak placeholder – that’s a gap to fill. Make a list: (a) Input actions (no direct test), (b) Terrain shape (no direct test), (c) Rotation/flip logic (no direct test), (d) UI text updates (no test), (e) Edge cases like max lives, pause (if any), etc.
Write Unit Tests for Modules: For each core module or utility, write a Jest test file. Use descriptive names like terrain-manager.test.js, rotation-system.test.js, manette-input.test.js, etc. Start with the most critical logic:
Test outputs and state changes, not just internal functions. For instance, in rotation-system.test.js, simulate a full 360° rotation sequence by calling rotationSystem.update() with incremented angles and assert that onFlipCompleteCallback was called the expected number of times.
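For instance (a sketch; RotationSystem’s constructor, callback name, and angle units are assumptions to verify against the code):

```js
// tests/unit/rotation-system.test.js (sketch)
const RotationSystem = require('../../js/RotationSystem'); // hypothetical path

test('a full 360° rotation fires the flip-complete callback exactly once', () => {
  const onFlipComplete = jest.fn();
  const rotation = new RotationSystem({ onFlipComplete }); // assumed constructor

  // Feed one revolution in 10° steps.
  for (let angle = 0; angle <= 360; angle += 10) {
    rotation.update(angle);
  }

  expect(onFlipComplete).toHaveBeenCalledTimes(1);
});
```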
In terrain-manager.test.js, perhaps call a method to generate several segments and verify the continuity of segments (e.g., x2 of segment N matches x1 of segment N+1, no large gaps or overlaps, and y values within expected ranges). Use a fixed seed to get deterministic results.
In input-controller.test.js, simulate pressing and releasing keys. You might call the internal callback that the Input module uses (for example, trigger the 'keydown-SPACE' handler and check that an internal jump flag becomes true). Similarly, test that toggling walk mode (Tab key) flips the walkMode state.
Use mocks/stubs for any Phaser global objects as needed (similar to how StartScene tests create a fake scene). For physics body methods (Matter.Body), you can spy on them if the module calls them, to ensure e.g. applyForce is invoked with expected values when a trick is active.
Enhance Integration Tests: Create new Puppeteer test files or extend existing ones for the scenarios listed above. For example, add tests in puppeteer-tests.js (or split into multiple files if it’s becoming very large) for:
Trick performance: script a flip and check outcome.
Collectible pickup: perhaps spawn an extra life by adjusting the game state via page.evaluate (set gameScene.nextLifeAvailableTime = now to force a spawn) and then move the player to collect it.
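A sketch of that flow (gameScene.nextLifeAvailableTime comes from this issue; the collectible and player fields are guesses to replace with the real ones):

```js
const { expect } = require('chai');

it('collecting an extra life increments lives', async () => {
  const lives = () =>
    page.evaluate(() => window.game.scene.keys.GameScene.lives);
  const before = await lives();

  // Force an immediate spawn, then give the game a moment to create it.
  await page.evaluate(() => {
    window.game.scene.keys.GameScene.nextLifeAvailableTime = Date.now();
  });
  await new Promise((r) => setTimeout(r, 1000));

  // Teleport the player onto the collectible (field names are hypothetical).
  await page.evaluate(() => {
    const scene = window.game.scene.keys.GameScene;
    const pickup = scene.collectibles && scene.collectibles[0];
    if (pickup) scene.player.setPosition(pickup.x, pickup.y);
  });
  await new Promise((r) => setTimeout(r, 500)); // let the collision resolve

  expect(await lives()).to.equal(before + 1);
});
```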
UI elements: after certain actions, use page.evaluate to get text content of HUD elements (e.g., document.querySelector('#game-container').innerText or better, access window.game.scene.keys.GameScene.speedText.text if accessible) to verify the HUD updated (like speedText shows a non-zero speed when moving).
Game reset: deliberately lose and ensure restart works and variables reset (score back to zero, etc.).
Structure these as separate describe blocks in the test file for clarity (e.g., describe('Tricks', ...), describe('UI', ...)).
Introduce Visual Snapshots: Decide on a snapshot approach (optional but recommended for rendering). Possibly use Puppeteer to take screenshots:
Add a dependency like pixelmatch or jest-image-snapshot. We can have Puppeteer save an image, then use Node canvas or an image diff library to compare to a stored baseline image. This requires storing baseline images in the repo (perhaps in a tests/screenshots/ folder).
Start with one or two critical screens (title screen, in-game with HUD). Manually verify the baseline images are correct, then have the test assert that the current render matches within a small tolerance.
Alternatively, simpler: use Puppeteer’s page.screenshot() and keep that image as an artifact that a developer can manually compare if a test fails. This is slightly more manual, but still flags that “rendering changed” which is the main goal.
If implementing fully, write a test like it('renders start screen correctly', async () => { /* navigate to start, screenshot, compare */ }).
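If we implement the comparison ourselves rather than via jest-image-snapshot, a pixelmatch-based version might look like this (the paths under tests/screenshots/ are illustrative):

```js
const fs = require('fs');
const { PNG } = require('pngjs');
const pixelmatch = require('pixelmatch');
const { expect } = require('chai');

it('renders start screen correctly', async () => {
  await page.screenshot({ path: 'tests/screenshots/start.current.png' });

  const baseline = PNG.sync.read(fs.readFileSync('tests/screenshots/start.baseline.png'));
  const current = PNG.sync.read(fs.readFileSync('tests/screenshots/start.current.png'));
  const { width, height } = baseline;
  const diff = new PNG({ width, height });

  const mismatched = pixelmatch(
    baseline.data, current.data, diff.data, width, height,
    { threshold: 0.1 } // per-pixel color tolerance
  );

  expect(mismatched).to.be.below(width * height * 0.01); // <1% of pixels differ
});
```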
Run and Refine: Execute the full test suite frequently while writing to catch issues. Some tests, especially integration ones, may be flaky initially (due to timing or asynchronous behavior). Use Puppeteer’s wait-for functions (like page.waitForFunction or explicit delays) to ensure the game is in a stable state before assertions (e.g., wait for the scene to load, wait a few frames after a jump for the player to actually move). Adjust timeouts if needed (the test:puppeteer script allows up to 30s; we should ensure our tests finish well within this).
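For example, a condition-based wait like this is far less flaky than a fixed sleep (the window.game path matches the usage elsewhere in this issue):

```js
// Wait until the GameScene and player actually exist before asserting.
await page.waitForFunction(
  () =>
    window.game &&
    window.game.scene.keys.GameScene &&
    window.game.scene.keys.GameScene.player,
  { timeout: 10000 } // fail fast, well inside the 30s Mocha limit
);
```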
Review Coverage: After adding tests, run npm run test:coverage to get a coverage report. Ensure we meet or exceed the 80% coverage threshold globally (and ideally each important file is well above that). Identify any remaining low-coverage areas and consider adding tests for those if time permits.
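To make the 80% bar enforceable rather than aspirational, the Jest config can fail the run when coverage dips below it (a sketch; merge into the existing config rather than replacing it):

```js
// jest.config.js (sketch): numbers match the threshold described above.
module.exports = {
  collectCoverage: true,
  coverageThreshold: {
    global: {
      branches: 80,
      functions: 80,
      lines: 80,
      statements: 80,
    },
  },
};
```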
Continuous Integration: (If not already in place) add steps in the CI pipeline to run npm run test:all (which runs both Jest and Puppeteer tests). This will automatically catch failing tests on future commits. Also, consider running the Puppeteer tests in a headless environment on CI (ensuring the environment has a browser or use Puppeteer’s built-in Chromium).
Acceptance Criteria:
Test coverage increased to ≥ 80% across the codebase, with critical modules nearly 100% covered by unit tests. (Coverage reports should show high coverage for input, terrain, physics logic, etc.)
New unit tests are in place for Input, Terrain, Rotation/Trick logic, and UI modules. These tests should pass consistently and cover normal and edge cases (as described, e.g., toggling walk mode, multiple flips, etc.).
New integration tests validate all major gameplay flows: starting a game, playing (moving, jumping), performing a trick, collecting items, losing lives, and restarting. The tests should automatically exercise these behaviors and use assertions to confirm the game’s state or output is correct (for example, the player’s position changes, the lives counter goes up/down, the appropriate scene is active, etc.).
A form of rendering validation is in place. At minimum, integration tests confirm that crucial UI text elements contain expected values at runtime (e.g., score increments when it should). Ideally, screenshot comparisons are implemented for key screens with an acceptable tolerance for minor differences.
All tests (old and new) are passing reliably on local and CI environments. Flaky tests (ones that fail intermittently) have been fixed or stabilized (for instance, by adjusting waits or using deterministic seeds for randomness).
The test suite is documented: there should be an updated section in the README or a CONTRIBUTING.md explaining how to run all tests, and describing any new testing utilities (like how to update a baseline screenshot if a deliberate rendering change is made).
Developers can run npm test and get quick feedback on whether any core functionality has regressed. This safety net will be crucial as we continue refactoring and optimizing for the Steam launch.
Verification Checklist:
Coverage report shows ≥ 80% coverage globally (with no major code section untested) – attach or link the coverage summary in the issue comments.
New unit tests for game logic modules are all green. Confirm tests for Input (keyboard/gamepad), Terrain, Rotation/Trick system, and UI updates are executed and passing.
New integration tests (Puppeteer) pass consistently. Verify they cover start->game transition, in-game actions (movement, trick, pause if any), and game over/restart loop. No test should be flaky over multiple runs.
Visual check: Run the game manually and compare against any reference images or expected UI text to ensure the game still looks and behaves normally. (This is to double-confirm that our assertions match reality – e.g., if a screenshot test failed, verify if it was a true regression or a test issue.)
CI integration: The CI pipeline (if configured) runs the full test suite and passes. If CI is not yet set up, at minimum, document the intention to include these tests in CI for future work.
Documentation updated: The project documentation includes instructions for running tests and interpreting results (especially if using new tools like image snapshot testing), so other developers or QA folks can easily use the new tests.
