Reduce tooBusy false-negative by sauvainr · Pull Request #17 · STRML/node-toobusy

sauvainr · 2016-11-08T06:20:05Z

At my company we found that toobusy.js generates a large number of false positives (our settings are highWater = 80 and smoothingFactor = 1/5).
Even with this conservative factor, a sudden load increase on the server can cause currentLag to jump from 0 to 200+ ms, so that all subsequent requests are rejected even though the server has ample resources to handle them.

This commit proposes a fix for this situation:

1. Limit the maximum lag value:
As highWater * 2 == (100% too busy), we want to prevent the current lag from suddenly jumping into a state where all requests are rejected.
Limiting the lag metric to highWater * 2 ensures a smooth and coherent increase of the current lag.
2. Invert the smoothing factor when decrementing the currentLag value:
After a brief, isolated overload of the system, recovery should be fast, to avoid false-positive rejections when the resources are already available.
Inverting the smoothing factor when the measured lag is smaller than the current lag ensures that the available resources are fully used.
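
The two changes can be sketched as a single update step. This is a minimal illustration, not the code from the commit: the function names `nextLag` and `blockProbability` are hypothetical, "inverting" is read here as using `1 - smoothingFactor` as the weight of a falling measurement, and the cap is applied to the raw measurement before smoothing. The rejection probability is an assumption derived only from the statement that highWater * 2 means 100% too busy.

```javascript
// Hypothetical helper (not the library's API): one update of the smoothed lag.
function nextLag(currentLag, measuredLag, highWater, smoothingFactor) {
  // 1. Cap the measurement at highWater * 2, so a single spike cannot push
  //    the smoothed value straight into "reject everything" territory.
  const capped = Math.min(measuredLag, highWater * 2);

  // 2. Assumed meaning of "inverted": when the lag is falling, give the new
  //    measurement the weight (1 - smoothingFactor) so recovery is fast.
  const weight = capped < currentLag ? 1 - smoothingFactor : smoothingFactor;

  // Exponential moving average with the chosen weight.
  return weight * capped + (1 - weight) * currentLag;
}

// Assumed rejection model: 0% at highWater, rising linearly to 100% at
// highWater * 2.
function blockProbability(currentLag, highWater) {
  const pct = (currentLag - highWater) / highWater;
  return Math.min(1, Math.max(0, pct));
}
```

With highWater = 80 and smoothingFactor = 1/5, a 250 ms spike from an idle state moves the smoothed lag to about 32 ms instead of 50 ms, and a zero-lag measurement taken at 160 ms drops it to about 32 ms instead of 128 ms.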


asilvas · 2016-11-08T16:07:52Z

I've used this pattern for quite some time, and agree it has its limitations. It's especially flaky with apps that can have bursts of CPU-intensive workloads, which easily spike lag times even during relatively low traffic. If your entire app/API is reliably lightweight, the toobusy pattern works quite well.

For more complex apps, I recommend concurrency monitoring: cap the number of concurrently processed requests and respond too busy if the threshold is exceeded. Alternatively, monitor the request/response timings averaged over the last X requests, and respond too busy if the average exceeds Y.
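
Both alternatives can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not code from toobusy or from any library: `createConcurrencyLimiter` and `createLatencyWindow` are hypothetical names, and the thresholds are whatever the application chooses.

```javascript
// Hypothetical helper: cap the number of requests processed at once.
function createConcurrencyLimiter(maxConcurrent) {
  let inFlight = 0;
  return {
    // Returns true and takes a slot, or false if the cap is reached.
    tryAcquire() {
      if (inFlight >= maxConcurrent) return false;
      inFlight += 1;
      return true;
    },
    // Frees a slot; call exactly once per successful tryAcquire().
    release() {
      if (inFlight > 0) inFlight -= 1;
    },
    inFlight() {
      return inFlight;
    },
  };
}

// Hypothetical helper: average the timings of the last `size` requests
// and report "too busy" when that average exceeds `thresholdMs`.
function createLatencyWindow(size, thresholdMs) {
  const samples = [];
  let sum = 0;
  return {
    record(ms) {
      samples.push(ms);
      sum += ms;
      // Drop the oldest sample once the window is full.
      if (samples.length > size) sum -= samples.shift();
    },
    average() {
      return samples.length === 0 ? 0 : sum / samples.length;
    },
    tooBusy() {
      return this.average() > thresholdMs;
    },
  };
}
```

In a request handler, one would typically respond with a 503 when `tryAcquire()` returns false or `tooBusy()` returns true, and call `release()` and `record()` once the response has been sent.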

knoxcard · 2018-01-28T10:54:18Z

Sounds critical, has this been merged in and pushed up?

sauvainr · 2018-01-29T08:52:04Z

Hi @knoxcard, as the thread didn't have any updates, I am still using our fork at my company -> https://github.com/exosite/node-toobusy It has been in production for more than a year now and has provided the expected behavior for us.

@asilvas may want to have another look and decide whether to integrate the change or document the limitation.

sauvainr wants to merge 1 commit into STRML:master from sauvainr:master
