117 changes: 95 additions & 22 deletions AGENTS.md
@@ -4,7 +4,7 @@ This file provides guidance to AI agents when working with code in this reposito

## Project Overview

This is the `oc-mirror` repository - an OpenShift client tool for mirroring container registry content for disconnected cluster installs.
This is the `oc-mirror` repository: an OpenShift command-line tool for mirroring container registry content for disconnected cluster installs.
oc-mirror reads an `ImageSetConfiguration` YAML file and mirrors container images from source registries to:
- a target mirror registry (direct mirroring)
- a local cache on disk, then generates a tarball for later mirroring (air-gapped scenarios)
@@ -15,51 +15,124 @@ oc-mirror reads an `ImageSetConfiguration` YAML file and mirrors container image
1. **diskToMirror (d2m)** : copy images from tarball to target registry

### Version structure
- **v2** (Current) : code lives in the root directory - **THIS IS WHAT YOU SHOULD WORK ON**
- **v1** (Deprecated) : code lives under the v1/ folder - **DO NOT MODIFY v1 CODE**
- **v2** (Current) : code lives in the root directory. **THIS IS WHAT YOU SHOULD WORK ON**
- **v1** (Deprecated) : code lives under the `v1/` folder. **DO NOT MODIFY v1 CODE**

## Key Architecture components
## Architecture

The `oc-mirror` project relies heavily on the [container-libs](https://github.com/containers/container-libs) library for low-level container image operations.
### Data flow

For each container image type, `oc-mirror` defines a `Collector` interface responsible for discovering all the related images:
The overall pipeline follows this flow:

| Collector | Location | Purpose |
|-----------|----------|---------|
| Release | `internal/pkg/release` | OpenShift release payloads |
| Operator | `internal/pkg/operator` | Red Hat operator catalogs |
| Helm | `internal/pkg/helm` | Helm charts |
| Additional | `internal/pkg/additional` | generic container images |
```text
ImageSetConfiguration → Collectors → Batch Worker → Output (Archive or ClusterResources)
→ Metadata persisted for incremental runs
```

1. The user provides an `ImageSetConfiguration` YAML describing what to mirror (releases, operators, helm charts, additional images).
2. **Collectors** read that configuration and discover all container images that need to be mirrored, returning normalized `CopyImageSchema` (source/destination pairs).
3. The **Batch Worker** copies images concurrently using goroutine semaphores.
4. Depending on the workflow, output is either a tar archive (m2d) or Kubernetes resources pointing the cluster at the mirror registry (m2m, d2m).
5. **Metadata** is persisted so subsequent runs can perform incremental mirroring.

### Mirror orchestration

The three workflows are orchestrated by executor types that wire together collectors, batch workers, and output generators:

- **MirrorToMirror**: discovers images, rebuilds operator catalogs, copies images directly to the target registry, and generates Kubernetes resources (IDMS/ITMS, CatalogSource, etc.)
- **MirrorToDisk**: discovers images, rebuilds operator catalogs, copies images to a local cache, and packages everything into a tar archive for transport
- **DiskToMirror**: extracts a previously created archive, discovers the images within it, copies them to the target registry, and generates Kubernetes resources

### Collectors

Each image type has a dedicated `CollectorInterface` implementation that discovers images to mirror:

| Collector | Purpose |
|-----------|---------|
| **Release** | Discovers OpenShift/OKD release payload images by querying the Cincinnati update graph for the requested channels and version ranges |
| **Operator** | Discovers operator catalog images, applies filtering (by package, channel, version range), and identifies all related bundle images |
| **Helm** | Discovers container images referenced by Helm charts |
| **Additional** | Passes through user-specified additional container images |

Release, Helm, and Additional collectors return `[]v2alpha1.CopyImageSchema` — a list of source→destination image pairs. The Operator collector returns `v2alpha1.CollectorSchema`, which wraps `CopyImageSchema` along with additional operator-specific metadata. All image lists are ultimately aggregated into a `CollectorSchema` by `CollectAll()` so the batch worker can process them uniformly.
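The aggregation step described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `CopyImage`, `Collector`, `staticCollector`, and `collectAll` are hypothetical stand-ins for `v2alpha1.CopyImageSchema`, `CollectorInterface`, and `CollectAll()`, whose real signatures may differ, and the registry names are placeholders.

```go
package main

import "fmt"

// CopyImage is a hypothetical stand-in for v2alpha1.CopyImageSchema:
// one source/destination pair to be copied.
type CopyImage struct {
	Source      string
	Destination string
}

// Collector is a hypothetical, simplified collector contract.
// The real CollectorInterface may take a context and other arguments.
type Collector interface {
	Collect() ([]CopyImage, error)
}

// staticCollector returns a fixed list, standing in for real discovery logic.
type staticCollector struct {
	images []CopyImage
}

func (s staticCollector) Collect() ([]CopyImage, error) {
	return s.images, nil
}

// collectAll runs every collector in order and concatenates the results,
// stopping at the first error.
func collectAll(collectors ...Collector) ([]CopyImage, error) {
	var all []CopyImage
	for _, c := range collectors {
		imgs, err := c.Collect()
		if err != nil {
			return nil, fmt.Errorf("collect: %w", err)
		}
		all = append(all, imgs...)
	}
	return all, nil
}

func main() {
	release := staticCollector{images: []CopyImage{
		{Source: "docker://source.example/release:4.16", Destination: "docker://mirror.example/release:4.16"},
	}}
	additional := staticCollector{images: []CopyImage{
		{Source: "docker://source.example/ubi:latest", Destination: "docker://mirror.example/ubi:latest"},
	}}

	all, err := collectAll(release, additional)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, img := range all {
		fmt.Println(img.Source, "->", img.Destination)
	}
}
```

Keeping every collector behind one small interface is what lets the downstream copy step treat all image types the same way.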

### Batch Worker

The batch worker (`ChannelConcurrentBatch`) handles concurrent image copy/delete operations. It uses goroutine semaphores and channels to limit parallelism. It tracks errors per image type and handles operator bundle dependencies (skipping bundles whose related images failed).
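The semaphore pattern mentioned above can be illustrated with a buffered channel. This is a generic sketch of bounded concurrency in Go, not the `ChannelConcurrentBatch` implementation: `copyAll`, `copyFunc`, and `errCopyFailed` are hypothetical names, and the per-type error tracking and bundle-dependency handling are omitted.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errCopyFailed is a hypothetical sentinel error used by the example.
var errCopyFailed = errors.New("copy failed")

// copyFunc is a hypothetical stand-in for the real image copy operation.
type copyFunc func(src string) error

// copyAll calls do for every source with at most limit calls in flight,
// and returns the errors keyed by source.
func copyAll(sources []string, limit int, do copyFunc) map[string]error {
	if limit < 1 {
		limit = 1 // a zero-capacity semaphore would block forever
	}
	sem := make(chan struct{}, limit) // buffered channel used as a counting semaphore
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex
		failed = make(map[string]error)
	)
	for _, src := range sources {
		wg.Add(1)
		sem <- struct{}{} // acquire a slot; blocks while limit copies are running
		go func(src string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			if err := do(src); err != nil {
				mu.Lock()
				failed[src] = err
				mu.Unlock()
			}
		}(src)
	}
	wg.Wait()
	return failed
}

func main() {
	sources := []string{"release", "operator-bundle", "helm-chart", "additional"}
	failed := copyAll(sources, 2, func(src string) error {
		if src == "helm-chart" {
			return errCopyFailed
		}
		return nil
	})
	fmt.Println("failed copies:", len(failed))
}
```

Collecting failures in a map instead of aborting on the first error mirrors the idea that one failed image should not stop the rest of the batch.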

### ImageSetConfiguration

The `ImageSetConfiguration` type defines what content to mirror. Its spec includes sections for platform releases (channels, architectures, update graph), operators (packages, channels, version ranges), additional images, Helm charts, blocked images, and archive size limits for segmented tarballs.

### Archive

The archive package handles packaging mirrored images into tar archives (used by MirrorToDisk) and extracting archives back to the filesystem (used by DiskToMirror). This enables the air-gapped workflow where images are transported physically.
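The pack-then-extract round trip can be sketched with the standard library's `archive/tar`. This is only an in-memory illustration under assumed names (`pack`, `unpack`, and the file names are hypothetical); the real archive package works on the filesystem and also handles concerns such as segmented archives that are not shown here.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"sort"
)

// pack writes the given files into an in-memory tar archive.
// Names are sorted so the output is deterministic.
func pack(files map[string][]byte) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)

	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names)

	for _, name := range names {
		hdr := &tar.Header{Name: name, Mode: 0o644, Size: int64(len(files[name]))}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write(files[name]); err != nil {
			return nil, err
		}
	}
	// Close flushes the tar footer; without it the archive is incomplete.
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// unpack reads every entry of a tar archive back into memory.
func unpack(archive []byte) (map[string][]byte, error) {
	out := make(map[string][]byte)
	tr := tar.NewReader(bytes.NewReader(archive))
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return out, nil // end of archive
		}
		if err != nil {
			return nil, err
		}
		data, err := io.ReadAll(tr)
		if err != nil {
			return nil, err
		}
		out[hdr.Name] = data
	}
}

func main() {
	archive, err := pack(map[string][]byte{
		"blobs/example-layer": []byte("layer bytes"),
		"metadata.json":       []byte(`{"images":1}`),
	})
	if err != nil {
		fmt.Println("pack error:", err)
		return
	}
	files, err := unpack(archive)
	if err != nil {
		fmt.Println("unpack error:", err)
		return
	}
	fmt.Println("entries restored:", len(files))
}
```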

The image copying itself happens concurrently in batches via `internal/pkg/batch` for optimal performance.
### Cluster Resources

The `ImageSetConfiguration` is defined in `internal/pkg/api/v2alpha1/`. See `docs/image-set-examples/imaget-set-config.yaml` for examples.
After mirroring, oc-mirror generates Kubernetes resources that configure the target cluster to use the mirror registry:

- **IDMS** (ImageDigestMirrorSet) and **ITMS** (ImageTagMirrorSet) — redirect image pulls to the mirror
- **CatalogSource** / **ClusterCatalog** — point the cluster's OLM at mirrored operator catalogs
- **UpdateService** — configure Cincinnati graph for mirrored release channels

### History

The `History` interface (`internal/pkg/history`) tracks which images were synced across runs via `Read()` and `Append()` methods. This enables incremental mirroring — subsequent runs only copy new or updated images.
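To make the incremental idea concrete, here is a small sketch of how a history with `Read()` and `Append()` can be used to skip already-synced images. Only the two method names come from the text above; the signatures, the in-memory `memHistory` type, and the `pending` helper are assumptions for illustration and may not match `internal/pkg/history`.

```go
package main

import "fmt"

// History is a hypothetical simplification of the real interface:
// only the method names Read and Append are taken from the project docs.
type History interface {
	Read() (map[string]struct{}, error)
	Append(synced map[string]struct{}) error
}

// memHistory keeps the history in memory; the real implementation persists it.
type memHistory struct {
	seen map[string]struct{}
}

func newMemHistory() *memHistory {
	return &memHistory{seen: make(map[string]struct{})}
}

// Read returns a copy of the recorded entries so callers cannot mutate state.
func (h *memHistory) Read() (map[string]struct{}, error) {
	out := make(map[string]struct{}, len(h.seen))
	for k := range h.seen {
		out[k] = struct{}{}
	}
	return out, nil
}

// Append merges newly synced entries into the history.
func (h *memHistory) Append(synced map[string]struct{}) error {
	for k := range synced {
		h.seen[k] = struct{}{}
	}
	return nil
}

// pending returns the candidates not yet recorded in history, preserving order.
func pending(h History, candidates []string) ([]string, error) {
	seen, err := h.Read()
	if err != nil {
		return nil, err
	}
	var todo []string
	for _, c := range candidates {
		if _, ok := seen[c]; !ok {
			todo = append(todo, c)
		}
	}
	return todo, nil
}

func main() {
	h := newMemHistory()

	// First run: nothing is recorded, so everything is pending.
	first, err := pending(h, []string{"sha256:aaa", "sha256:bbb"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	synced := make(map[string]struct{})
	for _, d := range first {
		synced[d] = struct{}{}
	}
	if err := h.Append(synced); err != nil {
		fmt.Println("error:", err)
		return
	}

	// Second run: only the new digest still needs to be copied.
	second, err := pending(h, []string{"sha256:aaa", "sha256:bbb", "sha256:ccc"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("pending on second run:", second)
}
```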

### Image transport

The project uses [podman-container-tools/container-libs](https://github.com/podman-container-tools/container-libs/tree/main/image) for low-level container image transport and copy operations.

## Common development commands

### Building

```bash
make clean # clean up build artifacts
make build # compiles the oc-mirror binary
make build # build v1 + v2 binary into ./bin/
make clean # remove build artifacts
make cross-build # cross-compile for amd64, arm64, ppc64le, s390x
```

**Important**: always use `make build`, not `go build` directly - the Makefile sets required build tags.
**Important**: always use `make build`, not `go build` directly — the Makefile sets required build tags (`json1`, btrfs/libdm exclusions, etc.) and embeds the v1 binary.

Individual cross-build targets are also available: `cross-build-linux-amd64`, `cross-build-linux-arm64`, `cross-build-linux-ppc64le`, `cross-build-linux-s390x`.

### Testing

See [docs/testing/](docs/testing/) for the full testing strategy, conventions, and examples.

```bash
make test-unit # run unit tests
make test-integration # run integration tests
make test-unit # unit tests (./internal/pkg/... -short)
make test-integration # integration tests (runs specific Integration* test funcs)
make cover # generate HTML coverage report from unit test results
```

### Validation and Verification
Unit tests write coverage to `tests/results/cover.out`. Integration tests write to `tests/results-integration/`.

To run a single test or package directly, you **must** pass the build tags:

```bash
make verify # run golangci-lint
make sanity # runs: tidy, format, and vet checks
go test -tags "json1 exclude_graphdriver_devicemapper exclude_graphdriver_btrfs containers_image_openpgp" \
-short -race -count=1 ./internal/pkg/image/...

go test -tags "json1 exclude_graphdriver_devicemapper exclude_graphdriver_btrfs containers_image_openpgp" \
-short -race -count=1 -run TestSpecificName ./internal/pkg/release/...
```

### Validation and verification

```bash
make verify # run golangci-lint
make vet # run go vet
make format # check gofmt compliance
make tidy # run go mod tidy
make sanity # runs tidy + format + vet, then fails if working tree is dirty
make all # clean + tidy + build (full pipeline)
```

Run `make sanity` before committing — it will fail if there are uncommitted formatting or module changes.

## Contributing

1. Write understandable code. Always prefer clarity over other things.