
Add another layer of rate limiting for verification. #672

Draft

carte7000 wants to merge 6 commits into main from simon/aggregator/verification-level-rate-limiting

Conversation

@carte7000
Contributor

This pull request introduces a verification rate limiter for commit verification records in the aggregator service. The main change is the addition of a Redis-backed rate limiter that uses a median absolute deviation (MAD) algorithm to detect and limit outlier verification rates among committee members. The implementation is configurable and includes a no-op fallback, integration into the handler logic, and comprehensive tests.

Verification Rate Limiting Implementation:

  • Added a new VerificationRateLimiter interface and its Redis-backed implementation in rate_limiting/verification_rate_limiter.go, which uses a MAD-based algorithm to detect outliers and rate limit verification attempts; see the sketch after this list. [1] [2]
  • Introduced a NoopVerificationRateLimiter as a default implementation that always allows requests when rate limiting is not enabled.
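
For reference, a minimal sketch of a MAD-based outlier check, written against plain in-process slices with hypothetical function names; the PR's actual implementation in rate_limiting/verification_rate_limiter.go is Redis-backed and may differ in detail:

```go
package ratelimiting

import (
	"math"
	"sort"
)

// median returns the median of xs (which must be non-empty); it sorts a
// copy so the caller's slice is left untouched.
func median(xs []float64) float64 {
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	n := len(s)
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// isOutlier reports whether candidate is more than k median absolute
// deviations above the median of its committee peers' counts.
func isOutlier(counts []float64, candidate, k float64) bool {
	med := median(counts)
	devs := make([]float64, len(counts))
	for i, c := range counts {
		devs[i] = math.Abs(c - med)
	}
	mad := median(devs)
	if mad == 0 {
		// All peers report identical counts; fall back to a plain
		// comparison against the median.
		return candidate > med
	}
	return candidate > med+k*mad
}
```

The appeal of MAD over a fixed ceiling is that the limit is relative: a verifier is throttled only when it diverges from its own committee's behavior, not merely because overall traffic is high.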

Handler Integration:

  • Updated WriteCommitVerifierNodeResultHandler to accept and use a VerificationRateLimiter, invoking TryAcquire before saving commit verifications and returning a ResourceExhausted error if the rate limit is exceeded; see the sketch below. [1] [2] [3]
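
A minimal sketch of that ordering, assuming hypothetical request, response, and store types; only the handler name, TryAcquire, and the ResourceExhausted status come from the PR summary:

```go
package aggregator

import (
	"context"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// VerificationRateLimiter is the interface the PR introduces; the two
// string arguments here are an assumption about its shape.
type VerificationRateLimiter interface {
	TryAcquire(ctx context.Context, verifierID, sourceChain string) (bool, error)
}

// Request, Response, and CommitVerificationStore are placeholders.
type Request struct{ VerifierID, SourceChain string }
type Response struct{}

type CommitVerificationStore interface {
	SaveCommitVerification(ctx context.Context, req *Request) (*Response, error)
}

type WriteCommitVerifierNodeResultHandler struct {
	rateLimiter VerificationRateLimiter
	store       CommitVerificationStore
}

func (h *WriteCommitVerifierNodeResultHandler) Handle(ctx context.Context, req *Request) (*Response, error) {
	// Check the limiter before touching storage.
	ok, err := h.rateLimiter.TryAcquire(ctx, req.VerifierID, req.SourceChain)
	if err != nil {
		// Whether limiter errors fail open or closed is not shown in the PR.
		return nil, err
	}
	if !ok {
		return nil, status.Error(codes.ResourceExhausted, "verification rate limit exceeded")
	}
	return h.store.SaveCommitVerification(ctx, req)
}
```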

Configuration Enhancements:

  • Added new config types for verification rate limiting, including support for Redis connection details and rate limiting parameters, and extended AggregatorConfig to include these options. [1] [2]
  • Implemented environment variable loading for the verification rate limiter's Redis configuration; see the sketch below. [1] [2]
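
A sketch of what that configuration shape could look like; every field name and environment variable below is an assumption for illustration, not the PR's actual keys:

```go
package config

import (
	"os"
	"strconv"
)

// VerificationRateLimiterConfig mirrors the options described above.
type VerificationRateLimiterConfig struct {
	Enabled       bool
	RedisAddress  string
	RedisPassword string
	RedisDB       int
	// MADThreshold is the multiplier k: verifiers whose rate exceeds
	// median + k*MAD of their committee are limited.
	MADThreshold float64
}

// loadVerificationRateLimiterEnv overlays Redis settings from the environment.
func loadVerificationRateLimiterEnv(c *VerificationRateLimiterConfig) {
	if addr := os.Getenv("AGGREGATOR_VERIFICATION_RL_REDIS_ADDRESS"); addr != "" {
		c.RedisAddress = addr
		c.Enabled = true
	}
	if pw := os.Getenv("AGGREGATOR_VERIFICATION_RL_REDIS_PASSWORD"); pw != "" {
		c.RedisPassword = pw
	}
	if db := os.Getenv("AGGREGATOR_VERIFICATION_RL_REDIS_DB"); db != "" {
		if n, err := strconv.Atoi(db); err == nil {
			c.RedisDB = n
		}
	}
}
```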

Testing:

  • Updated and extended tests to cover rate limiting scenarios, including the case where the rate limit is exceeded and the handler returns the correct gRPC error; see the sketch below. [1] [2] [3] [4] [5] [6] [7] [8]
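
As a sketch, a test along these lines would pin that behavior; the mocks import path and the newTestHandler/validRequest helpers are assumptions:

```go
package aggregator_test

import (
	"context"
	"testing"

	"github.com/stretchr/testify/mock"
	"github.com/stretchr/testify/require"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

func TestHandler_RateLimitExceeded(t *testing.T) {
	// mocks.NewVerificationRateLimiter is the constructor mockery generates
	// for the interface registered in .mockery.yaml (name and path assumed).
	limiter := mocks.NewVerificationRateLimiter(t)
	limiter.On("TryAcquire", mock.Anything, mock.Anything, mock.Anything).
		Return(false, nil)

	h := newTestHandler(t, limiter) // hypothetical test helper
	_, err := h.Handle(context.Background(), validRequest(t))

	require.Equal(t, codes.ResourceExhausted, status.Code(err))
}
```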

Dependency and Mockery Updates:

  • Registered the new VerificationRateLimiter interface in .mockery.yaml to enable mock generation for testing.

The goal is to have all verifiers for a given source chain send roughly the same volume of verifications. A verifier sending significantly more verifications than the median will be rate limited.

@carte7000 force-pushed the simon/aggregator/verification-level-rate-limiting branch from d0ac49e to 98544c8 on February 10, 2026 at 21:49
@carte7000 force-pushed the simon/aggregator/verification-level-rate-limiting branch from 98544c8 to 10415d7 on February 10, 2026 at 21:50
@github-actions

Code coverage report:

| Package | main | simon/aggregator/verification-level-rate-limiting | diff |
| --- | --- | --- | --- |
| github.com/smartcontractkit/chainlink-ccv/aggregator | 46.98% | 49.26% | +2.28% |
| github.com/smartcontractkit/chainlink-ccv/cmd | 0.00% | 0.00% | +0.00% |
| github.com/smartcontractkit/chainlink-ccv/committee | 100.00% | 100.00% | +0.00% |
| github.com/smartcontractkit/chainlink-ccv/common | 52.13% | 52.13% | +0.00% |
| github.com/smartcontractkit/chainlink-ccv/executor | 38.51% | 38.45% | -0.06% |
| github.com/smartcontractkit/chainlink-ccv/indexer | 37.14% | 37.16% | +0.02% |
| github.com/smartcontractkit/chainlink-ccv/integration | 37.03% | 37.03% | +0.00% |
| github.com/smartcontractkit/chainlink-ccv/pkg | 100.00% | 100.00% | +0.00% |
| github.com/smartcontractkit/chainlink-ccv/pricer | 15.70% | 15.70% | +0.00% |
| github.com/smartcontractkit/chainlink-ccv/protocol | 62.88% | 62.88% | +0.00% |
| github.com/smartcontractkit/chainlink-ccv/verifier | 55.59% | 55.59% | +0.00% |

@mateusz-sekara (Collaborator) left a comment


My main concern is: is this layer worth adding? The algorithm is pretty complex and requires some extra logic to run properly. Likely, any bug in TryAcquire will vastly limit the availability of the system, rejecting all incoming writes. Can't we rely on fixed rate limits per sender, but with thresholds adjusted to the throughput of the underlying PG storage? We can run an overprovisioned DB as a safety measure.
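
For comparison, the fixed per-sender alternative proposed here is considerably simpler. A sketch using golang.org/x/time/rate, with illustrative limits sized to storage throughput:

```go
package ratelimiting

import (
	"sync"

	"golang.org/x/time/rate"
)

// fixedSenderLimiter applies the same static budget to every sender,
// sized against the PG storage throughput rather than peer behavior.
type fixedSenderLimiter struct {
	mu       sync.Mutex
	limiters map[string]*rate.Limiter
	limit    rate.Limit // e.g. writes/second the DB can absorb per sender
	burst    int
}

func newFixedSenderLimiter(limit rate.Limit, burst int) *fixedSenderLimiter {
	return &fixedSenderLimiter{
		limiters: make(map[string]*rate.Limiter),
		limit:    limit,
		burst:    burst,
	}
}

// Allow lazily creates a token bucket per sender and consumes one token.
func (f *fixedSenderLimiter) Allow(sender string) bool {
	f.mu.Lock()
	l, ok := f.limiters[sender]
	if !ok {
		l = rate.NewLimiter(f.limit, f.burst)
		f.limiters[sender] = l
	}
	f.mu.Unlock()
	return l.Allow()
}
```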

```go
const (
	tryAcquireTimeout = 5 * time.Second
)
```
Collaborator

This should probably be in milliseconds, no? If Redis is not responding (or some other issue happens during TryAcquire), we will be delaying every incoming request by at least 5 seconds with the current setting. I'd expect the timeout here to be very aggressive so it doesn't impact the request processing duration.
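
A sketch of the plumbing that suggestion implies: derive a millisecond-scale deadline inside TryAcquire so a stalled Redis fails the check quickly. The 50ms value and the withinLimit helper are illustrative, not from the diff, and the actual MAD logic is elided:

```go
package ratelimiting

import (
	"context"
	"time"

	"github.com/redis/go-redis/v9" // import path assumed
)

const tryAcquireTimeout = 50 * time.Millisecond // illustrative, not the PR's value

type VerificationRateLimiter struct {
	redisClient *redis.Client
}

func (r *VerificationRateLimiter) TryAcquire(ctx context.Context, key string) (bool, error) {
	// Bound the whole Redis exchange: if Redis stalls, the caller gets
	// context.DeadlineExceeded after 50ms instead of waiting seconds per request.
	ctx, cancel := context.WithTimeout(ctx, tryAcquireTimeout)
	defer cancel()

	count, err := r.redisClient.Incr(ctx, key).Result()
	if err != nil {
		return false, err
	}
	return r.withinLimit(count), nil
}

// withinLimit stands in for the MAD comparison described in the PR summary.
func (r *VerificationRateLimiter) withinLimit(count int64) bool { return true }
```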

```go
	return &VerificationRateLimiter{
		redisClient: redisClient,
		// ...
	}
```
@mateusz-sekara (Collaborator) commented on Feb 11, 2026

Just curious, but why not keep the data in memory here? You wouldn't need the timeouts because no IO is performed. It's probably fine to lose the rate limiter state after a node restart; it doesn't need to be durable, does it?
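
A sketch of that alternative, with per-verifier counters behind a mutex and all names hypothetical:

```go
package ratelimiting

import "sync"

// inMemoryLimiter keeps counts per verifier in process memory. State is
// lost on restart, which is acceptable if the limiter only needs a rough
// recent picture of traffic.
type inMemoryLimiter struct {
	mu     sync.Mutex
	counts map[string]int64
}

func newInMemoryLimiter() *inMemoryLimiter {
	return &inMemoryLimiter{counts: make(map[string]int64)}
}

func (l *inMemoryLimiter) TryAcquire(verifierID string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.counts[verifierID]++
	// The same MAD comparison as the Redis version would run here over
	// l.counts; elided for brevity.
	return true
}
```

One trade-off not stated in the thread: if the aggregator runs more than one replica, an in-memory map only observes its own replica's share of traffic, which is a possible reason to keep shared state in Redis despite the IO cost.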

@carte7000
Contributor Author

> My main concern is: is this layer worth adding? The algorithm is pretty complex and requires some extra logic to run properly. Likely, any bug in TryAcquire will vastly limit the availability of the system, rejecting all incoming writes. Can't we rely on fixed rate limits per sender, but with thresholds adjusted to the throughput of the underlying PG storage? We can run an overprovisioned DB as a safety measure.

Yeah, definitely worth the discussion; we can compensate with alerting as well. I'd like to bring more people into that discussion. cc @KodeyThomas, @emate WDYT?

