Add another layer of rate limiting for verification. #672
…ave all verifiers for a given source chain send more or less the same amount of verification. A verifier sending far more verification than the median will be rate limited.
My main concern is whether this layer is worth adding. The algorithm is fairly complex and requires extra logic to run properly. Any bug in TryAcquire would likely severely limit the availability of the system by rejecting all incoming writes. Can't we rely on fixed rate limits per sender, with thresholds adjusted to the throughput of the underlying PG storage? We can run an overprovisioned DB as a safety measure.
)

const (
	tryAcquireTimeout = 5 * time.Second
This should probably be in milliseconds, no? If Redis is not responding (or some other issue occurs during TryAcquire), the current setting delays every incoming request by at least 5 seconds. I'd expect the timeout here to be very aggressive so that it doesn't impact request processing duration.
}

return &VerificationRateLimiter{
	redisClient: redisClient,
Just curious, but why not keep the data in memory here? You wouldn't need the timeouts because no I/O is performed. It's probably fine to lose the rate limiter state after a node restart. It doesn't need to be durable, does it?
Yeah, definitely worth the discussion; we can compensate with alerts as well. I'd like to bring more people into that discussion. cc: @KodeyThomas, @emate WDYT?
This pull request introduces a verification rate limiter for commit verification records in the aggregator service. The main change is the addition of a Redis-backed rate limiter that uses a median absolute deviation (MAD) algorithm to detect and limit outlier verification rates among committee members. The implementation is configurable and includes a no-op fallback, integration into the handler logic, and comprehensive tests.
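To make the MAD idea concrete, here is a minimal sketch of outlier detection over per-verifier counts. It is not the PR's implementation: the function names, the multiplier `k`, and the rule "count greater than median plus k times MAD" are assumptions, and the real code may window, scale, or floor the values differently.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// median returns the median of xs without modifying the input slice.
func median(xs []float64) float64 {
	n := len(xs)
	if n == 0 {
		return 0
	}
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// medianAndMAD returns the median of xs and the median absolute deviation,
// i.e. the median of |x - median(xs)|.
func medianAndMAD(xs []float64) (med, mad float64) {
	med = median(xs)
	devs := make([]float64, len(xs))
	for i, x := range xs {
		devs[i] = math.Abs(x - med)
	}
	return med, median(devs)
}

// isOutlier reports whether count exceeds median + k*MAD of all counts.
// Only the upper side is checked, since the goal is to limit verifiers that
// send far more than their peers. k is a hypothetical tuning parameter.
func isOutlier(count float64, all []float64, k float64) bool {
	med, mad := medianAndMAD(all)
	return count > med+k*mad
}

func main() {
	// Verification counts per verifier for one source chain (made-up data).
	counts := []float64{100, 102, 98, 101, 500}

	med, mad := medianAndMAD(counts)
	fmt.Println(med, mad)                   // 101 1
	fmt.Println(isOutlier(500, counts, 3))  // true
	fmt.Println(isOutlier(102, counts, 3))  // false
}
```

One caveat with this rule: when most verifiers send identical counts the MAD is 0, so any count above the median is flagged. A production version would likely need a minimum threshold to avoid that.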
Verification Rate Limiting Implementation:
- VerificationRateLimiter interface and its Redis-backed implementation in rate_limiting/verification_rate_limiter.go, which uses a MAD-based algorithm to detect outliers and rate limit verification attempts. [1] [2]
- NoopVerificationRateLimiter as a default implementation that always allows requests when rate limiting is not enabled.

Handler Integration:
- WriteCommitVerifierNodeResultHandler to accept and use a VerificationRateLimiter, invoking TryAcquire before saving commit verifications and returning a ResourceExhausted error if the rate limit is exceeded. [1] [2] [3]

Configuration Enhancements:
- AggregatorConfig to include these options. [1] [2]

Testing:

Dependency and Mockery Updates:
- VerificationRateLimiter interface in .mockery.yaml to enable mock generation for testing.

The goal is to have all verifiers for a given source chain send more or less the same amount of verification. A verifier sending far more verification than the median will be rate limited.