feat: add run comparison type definitions and diff engine by Harshil-Malisetty · Pull Request #199 · Netflix/metaflow-ui

Harshil-Malisetty · 2026-03-27T15:25:29Z

Requirements for a pull request

Unit tests related to the change have been updated
Documentation related to the change has been updated

Description of the Change

This PR adds foundational types and a client-side diff engine for a run comparison view, where users can select any two runs of the same flow and immediately see what changed between them: parameter values, step durations, and artifact presence.

The goal was to add this incrementally rather than in a single large PR, so this is the data layer only. UI components and hooks will follow separately once this is reviewed.

`src/types/comparison.ts`

All types are built by composing from the existing interfaces in src/types.ts (Run, Step, Task, TaskStatus, RunParam) so there is no duplication and the comparison view stays in sync with any future changes to those interfaces.

RunSnapshot holds a run, its RunParam map, and an array of StepSnapshot objects
StepSnapshot holds a Step, a TaskStatus (sourced from getStepStatus in taskdataUtils.ts), its Task[], and artifact key names
RunComparisonData is just { runA, runB } and is the input to every diff function
RunDiff is the output: { params: ParamDiff[], steps: StepDiff[], artifacts: ArtifactDiff[] }
StepDiff includes a delta_ms field (null when either side has no duration) which maps directly to how the existing timeline already handles incomplete timing data

The comments at the top of types/comparison.ts map each field to the specific existing endpoint it comes from (/runs/{run_number}, /parameters, /steps, /tasks, /artifacts).
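As a rough illustration of how these types might compose, here is a minimal sketch. The stand-ins for Run, Step, Task, TaskStatus and RunParam are heavily trimmed assumptions rather than the real definitions in src/types.ts. Every field name inside the snapshot and diff types (other than runA, runB, params, steps, artifacts and delta_ms, which the description names) is hypothetical.

```typescript
// Trimmed stand-ins for the existing interfaces in src/types.ts (assumed shapes).
type TaskStatus = 'completed' | 'running' | 'failed' | 'pending' | 'unknown';
interface Run { flow_id: string; run_number: number; }
interface Step { step_name: string; duration?: number; }
interface Task { task_id: number; }
type RunParam = Record<string, { value: string }>;

// Field names below are assumptions; the PR only describes what each type holds.
interface StepSnapshot {
  step: Step;
  status: TaskStatus;      // sourced from getStepStatus in the real code
  tasks: Task[];
  artifactNames: string[]; // artifact key names only, not values
}

interface RunSnapshot {
  run: Run;
  params: RunParam;
  steps: StepSnapshot[];
}

// Input to every diff function.
interface RunComparisonData { runA: RunSnapshot; runB: RunSnapshot; }

// Hypothetical per-item diff shapes.
interface ParamDiff { key: string; valueA: string | null; valueB: string | null; changed: boolean; }
interface StepDiff {
  step_name: string;
  durationA: number | null;
  durationB: number | null;
  delta_ms: number | null; // null when either side has no duration
}
interface ArtifactDiff { step_name: string; name: string; inA: boolean; inB: boolean; }

// Output of the diff engine.
interface RunDiff { params: ParamDiff[]; steps: StepDiff[]; artifacts: ArtifactDiff[]; }

// A small value that type-checks against the sketch.
const example: RunComparisonData = {
  runA: {
    run: { flow_id: 'TrainFlow', run_number: 1 },
    params: { alpha: { value: '0.1' } },
    steps: [
      { step: { step_name: 'train', duration: 5000 }, status: 'completed', tasks: [{ task_id: 1 }], artifactNames: ['model'] },
    ],
  },
  runB: { run: { flow_id: 'TrainFlow', run_number: 2 }, params: { alpha: { value: '0.2' } }, steps: [] },
};

const emptyDiff: RunDiff = { params: [], steps: [], artifacts: [] };
```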

`src/utils/comparison.ts`

Four pure functions, no side effects, no API calls:

computeParamDiffs(paramsA, paramsB): unions all keys from both RunParam objects (which are plain string-keyed maps), checks value equality per key, sorts results alphabetically so diffs are stable across renders
computeStepDiffs(data): matches steps by step_name, computes delta_ms = durationB - durationA. Steps missing from one side get null durations and 'unknown' status, consistent with how the rest of the app handles missing step data
computeArtifactDiffs(data): compares artifact key presence per step, not artifact values. Comparing values would require an additional fetch per artifact, which is expensive; presence diffing is enough to show which artifacts a run gained or lost
computeRunDiff(data): composes all three into a single RunDiff
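The first two functions could be sketched roughly as follows. This is not the PR's code: the type shapes are trimmed assumptions, and the statusA, statusB, durationA, durationB, valueA, valueB and changed field names are hypothetical, as is the indexByName helper. Only the behavior (key union, per-key equality, alphabetical sort, delta_ms = durationB - durationA, null durations and 'unknown' status for missing steps) is taken from the description above.

```typescript
// Trimmed stand-ins for the existing app types (assumed shapes).
type TaskStatus = 'completed' | 'running' | 'failed' | 'pending' | 'unknown';
type RunParam = Record<string, { value: string }>;
interface StepSnapshot { step: { step_name: string; duration?: number }; status: TaskStatus; }
interface RunComparisonData { runA: { steps: StepSnapshot[] }; runB: { steps: StepSnapshot[] }; }

// Hypothetical output shapes.
interface ParamDiff { key: string; valueA: string | null; valueB: string | null; changed: boolean; }
interface StepDiff {
  step_name: string;
  statusA: TaskStatus;
  statusB: TaskStatus;
  durationA: number | null;
  durationB: number | null;
  delta_ms: number | null;
}

function computeParamDiffs(paramsA: RunParam, paramsB: RunParam): ParamDiff[] {
  // Union of keys from both maps, sorted so the output order is stable.
  return Object.keys({ ...paramsA, ...paramsB }).sort().map((key) => {
    const valueA = paramsA[key]?.value ?? null;
    const valueB = paramsB[key]?.value ?? null;
    return { key, valueA, valueB, changed: valueA !== valueB };
  });
}

// Index snapshots by step_name; a prototype-less object avoids inherited keys.
function indexByName(steps: StepSnapshot[]): Record<string, StepSnapshot> {
  const out: Record<string, StepSnapshot> = Object.create(null);
  for (const s of steps) out[s.step.step_name] = s;
  return out;
}

function computeStepDiffs(data: RunComparisonData): StepDiff[] {
  const a = indexByName(data.runA.steps);
  const b = indexByName(data.runB.steps);
  return Object.keys({ ...a, ...b }).sort().map((step_name) => {
    // A step missing from one side gets a null duration and 'unknown' status.
    const durationA = a[step_name]?.step.duration ?? null;
    const durationB = b[step_name]?.step.duration ?? null;
    return {
      step_name,
      statusA: a[step_name]?.status ?? 'unknown',
      statusB: b[step_name]?.status ?? 'unknown',
      durationA,
      durationB,
      delta_ms: durationA !== null && durationB !== null ? durationB - durationA : null,
    };
  });
}

// Example inputs (invented for illustration).
const paramDiffs = computeParamDiffs(
  { alpha: { value: '0.1' }, epochs: { value: '10' } },
  { alpha: { value: '0.2' }, seed: { value: '42' } },
);
const stepDiffs = computeStepDiffs({
  runA: { steps: [
    { step: { step_name: 'train', duration: 5000 }, status: 'completed' },
    { step: { step_name: 'start', duration: 1000 }, status: 'completed' },
  ] },
  runB: { steps: [
    { step: { step_name: 'start', duration: 1200 }, status: 'completed' },
    { step: { step_name: 'train' }, status: 'running' },
    { step: { step_name: 'evaluate', duration: 300 }, status: 'completed' },
  ] },
});
```

With these inputs, paramDiffs lists alpha, epochs and seed in that order, all marked as changed. stepDiffs lists evaluate, start and train: start has a delta_ms of 200, while evaluate (absent from run A) and train (no duration yet in run B) both have a null delta_ms.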

`src/utils/__tests__/comparison.test.cypress.ts`

12 test cases using createRun, createStep, createTask from testhelper.ts and Chai assertions, matching the conventions in the existing test files. A local makeSnapshot helper keeps each test focused on the case being tested rather than boilerplate.

computeParamDiffs: identical params, changed value, params in only one run, empty params
computeStepDiffs: duration delta, step in only one run, null delta when duration missing, alphabetical sort by step_name
computeArtifactDiffs: both runs have artifact, one run missing artifact, empty artifact lists
computeRunDiff: full integration with params + steps + artifacts in one snapshot
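To give a feel for the three computeArtifactDiffs cases, here is a dependency-free sketch. It is not the PR's test file: the real tests use Chai and the createRun, createStep and createTask helpers, whereas this version uses a hypothetical check function and a simplified makeSnapshot. The computeArtifactDiffs body and the ArtifactDiff field names are likewise assumptions based only on the description.

```typescript
// Trimmed, assumed shapes; only what the artifact diff needs.
interface StepSnapshot { step: { step_name: string }; artifactNames: string[]; }
interface RunSnapshot { steps: StepSnapshot[]; }
interface ArtifactDiff { step_name: string; name: string; inA: boolean; inB: boolean; }

// Hypothetical stand-in for the local makeSnapshot helper: step name -> artifact names.
function makeSnapshot(artifactsByStep: Record<string, string[]>): RunSnapshot {
  return {
    steps: Object.keys(artifactsByStep).map((step_name) => ({
      step: { step_name },
      artifactNames: artifactsByStep[step_name],
    })),
  };
}

function namesByStep(run: RunSnapshot): Record<string, string[]> {
  const out: Record<string, string[]> = Object.create(null);
  for (const s of run.steps) out[s.step.step_name] = s.artifactNames;
  return out;
}

// Presence-only diff: one entry per (step, artifact name) seen in either run.
function computeArtifactDiffs(data: { runA: RunSnapshot; runB: RunSnapshot }): ArtifactDiff[] {
  const a = namesByStep(data.runA);
  const b = namesByStep(data.runB);
  const diffs: ArtifactDiff[] = [];
  for (const step_name of Object.keys({ ...a, ...b }).sort()) {
    const inA = a[step_name] ?? [];
    const inB = b[step_name] ?? [];
    // De-duplicated, sorted union of artifact names for this step.
    const names = inA.concat(inB).filter((n, i, all) => all.indexOf(n) === i).sort();
    for (const name of names) {
      diffs.push({ step_name, name, inA: inA.indexOf(name) !== -1, inB: inB.indexOf(name) !== -1 });
    }
  }
  return diffs;
}

// Minimal assertion helper standing in for Chai's expect(...).to.deep.equal(...).
function check(label: string, actual: unknown, expected: unknown): void {
  if (JSON.stringify(actual) !== JSON.stringify(expected)) throw new Error(`failed: ${label}`);
}

const bothHave = computeArtifactDiffs({
  runA: makeSnapshot({ train: ['model'] }),
  runB: makeSnapshot({ train: ['model'] }),
});
check('both runs have artifact', bothHave, [{ step_name: 'train', name: 'model', inA: true, inB: true }]);

const oneMissing = computeArtifactDiffs({
  runA: makeSnapshot({ train: ['model', 'metrics'] }),
  runB: makeSnapshot({ train: ['model'] }),
});
check('one run missing artifact', oneMissing, [
  { step_name: 'train', name: 'metrics', inA: true, inB: false },
  { step_name: 'train', name: 'model', inA: true, inB: true },
]);

const emptyLists = computeArtifactDiffs({ runA: makeSnapshot({ train: [] }), runB: makeSnapshot({ train: [] }) });
check('empty artifact lists', emptyLists, []);
```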

No new API endpoints are needed.

Alternate Designs

Server-side diff endpoint: Considered adding a /compare endpoint that returns a pre-computed diff. Decided against it because all the necessary data is already fetched by existing hooks (useResource) when viewing a run, the dataset per run is small enough that client-side diffing adds no perceptible cost, and it avoids requiring backend changes for a frontend-driven feature.

General-purpose diff library (deep-diff, microdiff): Rejected to avoid adding a new dependency. The three diff operations are domain-specific enough (step duration delta, artifact presence, param value equality) that a small handwritten implementation is clearer and easier to test precisely.

Possible Drawbacks

This PR contains no UI and will produce no visible change on its own. The types may need revision once the comparison view components are designed in detail, but since they compose from existing interfaces rather than duplicating fields, the surface area for breaking changes is small.

Verification Process

yarn cypress run --component --spec "src/utils/__tests__/comparison.test.cypress.ts"

Groundwork for the run comparison view (GSoC 2026 deliverable).

- RunSnapshot, StepSnapshot, RunComparisonData types composing from existing interfaces
- Client-side diff engine: computeParamDiffs, computeStepDiffs, computeArtifactDiffs
- 10 Cypress test cases following existing conventions (Chai, testhelper.ts)
- All data sourced from existing API endpoints - no new backend routes needed

Harshil-Malisetty added 2 commits March 27, 2026 19:35

fix: sort computeStepDiffs alphabetically, add sort order test

7dc006b

Harshil-Malisetty wants to merge 2 commits into Netflix:master from Harshil-Malisetty:feat/run-comparison-types
