[3/4] Add Step Functions Lambda handlers for orchestration#77

Closed
ericc59 wants to merge 5 commits into aws-integration-pr2 from aws-integration-pr3

ericc59 commented Jan 29, 2026

Summary

Implements the Lambda handlers that AWS Step Functions invokes to orchestrate benchmark runs. This is PR 3/4 in the AWS integration series and completes the distributed execution layer.

Lambda Handlers:

| Handler | Purpose |
| --- | --- |
| `initialize` | Creates run record in DynamoDB, initializes task records, returns `run_id` |
| `handle_error` | Handles Batch job failures, manages retry logic (max 3 attempts) |
| `aggregate` | Calculates metrics (accuracy, cost), stores results in S3 |
| `complete` | Marks run complete, publishes CloudWatch metrics |
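
The retry rule in `handle_error` (max 3 attempts) could be sketched roughly as below. This is an illustration only, not the code in this PR: the `RetryDecision` type, the `decide_retry` name, and the `PENDING` status are hypothetical, and the sketch assumes "max 3 attempts" means a task is marked `FAILED` once its third attempt fails.

```python
from dataclasses import dataclass

MAX_RETRIES = 3  # from the PR description: "max 3 attempts"


@dataclass
class RetryDecision:
    status: str         # "PENDING" (hypothetical name: retry) or "FAILED" (give up)
    attempt: int        # attempts made so far, including the one that just failed
    should_retry: bool  # True if the task should be resubmitted


def decide_retry(previous_attempts: int, max_retries: int = MAX_RETRIES) -> RetryDecision:
    """Decide what to do after one more failed attempt of a task."""
    attempt = previous_attempts + 1
    if attempt < max_retries:
        # Attempts remain: leave the task eligible for resubmission.
        return RetryDecision(status="PENDING", attempt=attempt, should_retry=True)
    # Retries exhausted: the task is marked FAILED.
    return RetryDecision(status="FAILED", attempt=attempt, should_retry=False)
```

Keeping the decision in a pure function like this would let it be unit-tested without moto; the actual handler would additionally persist the attempt count and status to DynamoDB.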

State Machine Flow:

Initialize → ProcessTasks (Map/Parallel) → Aggregate → Complete
                    ↓
              SubmitBatchJob → HandleError (on failure)
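
For readers who want to see how the flow above might map onto Amazon States Language, here is a minimal sketch that builds a definition as a Python dict. It is an assumption-laden illustration, not the state machine shipped with this series: the ARNs, job names, and JSON paths (`$.task_ids`, `$.run_id`, `$.error`, `$.task_results`) are placeholders, and the choice of an inline `Map` state with a `Catch` on the Batch task is one plausible reading of the diagram.

```python
import json

# Hypothetical placeholders: none of these ARNs or names come from the PR.
LAMBDA_ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:{name}"


def lambda_task(name: str, **transition) -> dict:
    """Build a Task state that invokes one of the Lambda handlers."""
    return {"Type": "Task", "Resource": LAMBDA_ARN.format(name=name), **transition}


def build_definition() -> dict:
    """Return an Amazon States Language definition mirroring the flow above."""
    per_task_states = {
        "SubmitBatchJob": {
            "Type": "Task",
            # Service integration that waits for the Batch job to finish.
            "Resource": "arn:aws:states:::batch:submitJob.sync",
            "Parameters": {
                "JobName": "benchmark-task",
                "JobQueue": "JOB_QUEUE_ARN",
                "JobDefinition": "JOB_DEFINITION_ARN",
            },
            # Any failure is routed to the HandleError Lambda.
            "Catch": [
                {"ErrorEquals": ["States.ALL"], "ResultPath": "$.error", "Next": "HandleError"}
            ],
            "End": True,
        },
        "HandleError": lambda_task("handle_error", End=True),
    }
    return {
        "StartAt": "Initialize",
        "States": {
            "Initialize": lambda_task("initialize", Next="ProcessTasks"),
            "ProcessTasks": {
                "Type": "Map",
                # Assumes initialize returns run_id and task_ids at the top level.
                "ItemsPath": "$.task_ids",
                "ItemSelector": {"run_id.$": "$.run_id", "task_id.$": "$$.Map.Item.Value"},
                "ItemProcessor": {
                    "ProcessorConfig": {"Mode": "INLINE"},
                    "StartAt": "SubmitBatchJob",
                    "States": per_task_states,
                },
                "ResultPath": "$.task_results",
                "Next": "Aggregate",
            },
            "Aggregate": lambda_task("aggregate", Next="Complete"),
            "Complete": lambda_task("complete", End=True),
        },
    }


def top_level_order(definition: dict) -> list:
    """Follow the Next pointers from StartAt to list the top-level states in order."""
    order, name = [], definition["StartAt"]
    while name:
        order.append(name)
        name = definition["States"][name].get("Next")
    return order


if __name__ == "__main__":
    print(json.dumps(build_definition(), indent=2))
```

Running the module prints JSON that could be pasted into a state machine definition once the placeholders are replaced with real resources.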

Files Added:

  • src/arc_agi_benchmarking/lambdas/__init__.py
  • src/arc_agi_benchmarking/lambdas/initialize.py
  • src/arc_agi_benchmarking/lambdas/handle_error.py
  • src/arc_agi_benchmarking/lambdas/aggregate.py
  • src/arc_agi_benchmarking/lambdas/complete.py
  • src/arc_agi_benchmarking/lambdas/README.md (deployment instructions)
  • src/arc_agi_benchmarking/tests/test_lambdas.py (8 tests)

Dependencies

Test plan

  • All existing tests pass (440 tests)
  • New Lambda handler tests pass with moto mocking (8 tests)
  • Integration test with LocalStack
  • End-to-end test with Step Functions in AWS sandbox

Implements Lambda handlers invoked by AWS Step Functions to orchestrate
benchmark runs:

- initialize: Creates run record in DynamoDB, initializes task records
  for all tasks to be processed, returns run_id and task_ids

- handle_error: Handles task failures from Batch jobs, manages retry
  logic (up to MAX_RETRIES), marks tasks as FAILED after exhausting
  retries

- aggregate: Queries all task records, calculates metrics (accuracy,
  cost, completion rate), stores aggregated results in S3

- complete: Marks run as COMPLETED/COMPLETED_WITH_ERRORS/FAILED based
  on task outcomes, publishes final metrics to CloudWatch
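
The `aggregate` and `complete` steps described above could be sketched as two pure functions. This is a rough illustration rather than the handlers in this PR: the task record fields (`status`, `correct`, `cost`), the function names, and the exact rules are assumptions. In particular, the sketch computes accuracy over completed tasks only and treats a run with no completed tasks as `FAILED`.

```python
from typing import Iterable, Mapping


def aggregate_metrics(tasks: Iterable[Mapping]) -> dict:
    """Summarize task records into accuracy, cost, and completion rate."""
    tasks = list(tasks)
    total = len(tasks)
    completed = [t for t in tasks if t["status"] == "COMPLETED"]
    correct = sum(1 for t in completed if t.get("correct"))
    return {
        "total_tasks": total,
        "completed_tasks": len(completed),
        # Assumption: anything not COMPLETED at aggregation time counts as failed.
        "failed_tasks": total - len(completed),
        "completion_rate": len(completed) / total if total else 0.0,
        # Assumption: accuracy is measured over completed tasks only.
        "accuracy": correct / len(completed) if completed else 0.0,
        "total_cost": sum(float(t.get("cost", 0.0)) for t in tasks),
    }


def final_run_status(metrics: Mapping) -> str:
    """Map aggregated metrics to one of the three run statuses named above."""
    if metrics["completed_tasks"] == 0:
        return "FAILED"
    if metrics["failed_tasks"] > 0:
        return "COMPLETED_WITH_ERRORS"
    return "COMPLETED"
```

For example, a run with two completed tasks and one failed task would be reported as `COMPLETED_WITH_ERRORS` under these rules.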

Also includes:
- Comprehensive test suite using moto (8 tests)
- README with deployment instructions and state machine flow diagram
- Support for LocalStack testing via AWS_ENDPOINT_URL
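
One way the `AWS_ENDPOINT_URL` switch for LocalStack might work is sketched below. The helper name `aws_client_kwargs` is hypothetical and the handlers in this PR may wire it up differently; the only library behavior relied on is that `boto3.client` accepts an `endpoint_url` keyword argument.

```python
import os
from typing import Mapping, Optional


def aws_client_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return extra keyword arguments for boto3.client().

    If AWS_ENDPOINT_URL is set (for example to a LocalStack endpoint),
    route requests there; otherwise return nothing so boto3 uses real AWS.
    """
    env = os.environ if env is None else env
    endpoint = env.get("AWS_ENDPOINT_URL")
    return {"endpoint_url": endpoint} if endpoint else {}


# Usage inside a handler (requires boto3, not imported in this sketch):
#   dynamodb = boto3.client("dynamodb", **aws_client_kwargs())
```

With LocalStack running on its default edge port, exporting `AWS_ENDPOINT_URL=http://localhost:4566` before invoking a handler would point its clients at the local stack.
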
ericc59 changed the title from "Add Step Functions Lambda handlers for orchestration (PR 3/3)" to "Add Step Functions Lambda handlers for orchestration (PR 3/4)" on Jan 29, 2026
ericc59 changed the title from "Add Step Functions Lambda handlers for orchestration (PR 3/4)" to "[3/4] Add Step Functions Lambda handlers for orchestration" on Jan 29, 2026
ericc59 force-pushed the aws-integration-pr3 branch from ca9e4e6 to 187f23a on January 29, 2026 17:59
ericc59 closed this on Jan 30, 2026
ericc59 deleted the aws-integration-pr3 branch on January 30, 2026 16:49