Add support for ItemReader, ItemBatcher, ToleratedFailure* in Map states for both JSONpath and JSONata #18

Open
johnbenjaminmccarthy wants to merge 1 commit into bbc:main from johnbenjaminmccarthy:map_state_updates

johnbenjaminmccarthy commented Apr 25, 2026

This commit adds support for the following Map state configuration fields:

  • ItemReader specifying a resource or resources in S3 to read items from
  • ItemBatcher to batch map state executions
  • ToleratedFailureCount and ToleratedFailurePercentage on Map states (and the corresponding -Path variants for JSONPath).
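
To illustrate what batching means for child executions, here is a minimal sketch of the grouping that `ItemBatcher` describes in the Amazon States Language: items are split into groups of at most `MaxItemsPerBatch`, and each child receives an object with an `Items` array plus the optional `BatchInput`. The function name `batch_items` and its parameters are hypothetical and are not taken from this PR's code.

```python
from typing import Any, Optional


def batch_items(
    items: list[Any],
    max_items_per_batch: int,
    batch_input: Optional[dict[str, Any]] = None,
) -> list[dict[str, Any]]:
    """Group items into the per-child inputs that ItemBatcher describes.

    Hypothetical helper for illustration only; not this PR's implementation.
    """
    if max_items_per_batch < 1:
        raise ValueError("max_items_per_batch must be a positive integer")

    batches: list[dict[str, Any]] = []
    for start in range(0, len(items), max_items_per_batch):
        # Each child execution sees one slice of the original item list.
        batch: dict[str, Any] = {"Items": items[start:start + max_items_per_batch]}
        if batch_input is not None:
            # BatchInput is the same fixed JSON object for every batch.
            batch["BatchInput"] = batch_input
        batches.append(batch)
    return batches


batches = batch_items([1, 2, 3, 4, 5], max_items_per_batch=2, batch_input={"env": "test"})
```

With five items and a batch size of two, the last batch holds the single leftover item.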

It also adds support and checks for:

  • MaxItems on Map states (and MaxItemsPath for JSONPath)
  • ProcessorConfig.Mode ("INLINE" or "DISTRIBUTED") for validating whether ItemReader should be processed
  • Label for adding (more) accurate context.Execution information for distributed child map state executions
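
As a rough sketch of the first two checks, the snippet below decides whether an `ItemReader` applies based on `ProcessorConfig.Mode` and then truncates the item list to `MaxItems`. The field locations (`ItemProcessor.ProcessorConfig.Mode`, `ItemReader.ReaderConfig.MaxItems`) follow the Amazon States Language. The helper names and the choice to raise on an unknown mode or a negative limit are assumptions made for illustration and may differ from what this PR does.

```python
from typing import Any, Optional


def item_reader_applies(map_state: dict[str, Any]) -> bool:
    """Return True when the state is DISTRIBUTED and declares an ItemReader."""
    processor_config = map_state.get("ItemProcessor", {}).get("ProcessorConfig", {})
    mode = processor_config.get("Mode", "INLINE")  # INLINE is the ASL default
    if mode not in ("INLINE", "DISTRIBUTED"):
        # Assumption: an unrecognised mode is treated as a definition error.
        raise ValueError(f"Unknown ProcessorConfig.Mode: {mode!r}")
    return mode == "DISTRIBUTED" and "ItemReader" in map_state


def apply_max_items(items: list[Any], max_items: Optional[int]) -> list[Any]:
    """Keep at most max_items items; None means no limit was configured."""
    if max_items is None:
        return items
    if max_items < 0:
        # Assumption: negative limits are rejected rather than ignored.
        raise ValueError("MaxItems must not be negative")
    return items[:max_items]


reader = {"Resource": "arn:aws:states:::s3:getObject", "ReaderConfig": {"MaxItems": 2}}
distributed_state = {
    "Type": "Map",
    "ItemProcessor": {"ProcessorConfig": {"Mode": "DISTRIBUTED"}},
    "ItemReader": reader,
}
inline_state = {
    "Type": "Map",
    "ItemProcessor": {"ProcessorConfig": {"Mode": "INLINE"}},
    "ItemReader": reader,
}
limited = apply_max_items(["a", "b", "c"], reader["ReaderConfig"]["MaxItems"])
```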

All changes are implemented for both JSONPath and JSONata configuration of Map states.
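
For readers less familiar with the two query languages, the sketch below shows how a tolerated-failure threshold might be written in each, followed by a threshold check. In JSONPath the dynamic form uses the `-Path` field, while in JSONata the plain field takes a `{% ... %}` expression. The input field `maxFailures` and the function `map_run_fails` are hypothetical, and the check reflects one reading of the AWS documentation (the run fails once either configured threshold is exceeded), not necessarily this PR's logic.

```python
from typing import Optional

# JSONPath: the threshold is read from the state input through a -Path field.
jsonpath_state = {
    "Type": "Map",
    "QueryLanguage": "JSONPath",
    "ToleratedFailureCountPath": "$.maxFailures",
}

# JSONata: the same field name without -Path takes a JSONata expression.
jsonata_state = {
    "Type": "Map",
    "QueryLanguage": "JSONata",
    "ToleratedFailureCount": "{% $states.input.maxFailures %}",
}


def map_run_fails(
    failed: int,
    total: int,
    tolerated_count: Optional[int] = None,
    tolerated_percentage: Optional[float] = None,
) -> bool:
    """Return True when failures exceed any configured threshold.

    Assumption: with no threshold configured, a single failure fails the run.
    """
    if tolerated_count is None and tolerated_percentage is None:
        return failed > 0
    if tolerated_count is not None and failed > tolerated_count:
        return True
    if tolerated_percentage is not None and total > 0:
        # Compare the failure share as a percentage of all processed items.
        if failed * 100 / total > tolerated_percentage:
            return True
    return False
```

For example, `map_run_fails(3, 10, tolerated_percentage=30)` is `False` because 30% does not exceed the threshold, while four failures out of ten would.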

All changes are covered by new unit tests for Map states, which exercise normal input, edge cases, and input validation for ItemReader, ItemBatcher, and ToleratedFailure*, as well as INLINE vs. DISTRIBUTED mode.

This commit does not address the following AWS Map state features, which remain unimplemented:

  • MaxConcurrency
  • MaxInputBytesPerBatch (and MaxInputBytesPerBatchPath for JSONPath)
  • ProcessorConfig.ExecutionType for distributed map states.
  • ResultWriter
  • S3 object input types of JSONL, CSV, MANIFEST, or PARQUET in Map states with ItemReader configured to use S3 getObject.
  • Transformation === "LOAD_AND_FLATTEN" functionality.
