Add a function that updates value of the count min sketch instead of updating it one at a time #38

Open
ghost wants to merge 1 commit into tylertreat:master from darkcoderrises:master

ghost commented Nov 29, 2024

No description provided.

dimitarvdimitrov added a commit to dimitarvdimitrov/BoomFilters that referenced this pull request Jun 28, 2025
Implements an AddN function that adds N occurrences of an item to the Count-Min Sketch,
unlike the existing Set function in PR tylertreat#38 which sets the count to a specific value.

The AddN function works similarly to Add but takes a uint64 parameter for the count:
- Increments matrix buckets by N instead of 1
- Increments total count by N instead of 1
- Returns the sketch instance for method chaining

Includes comprehensive tests covering functionality, chaining, and total count verification.
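
The behaviour listed in this commit message can be sketched roughly as follows. This is a toy illustration, not the BoomFilters implementation: the `toySketch` type, its fields, and the FNV-based `column` helper are all hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// toySketch is a hypothetical, minimal Count-Min Sketch used only for
// illustration; it is not the BoomFilters type.
type toySketch struct {
	matrix [][]uint64 // depth rows, each holding width counters
	width  uint64
	total  uint64 // total number of occurrences added so far
}

func newToySketch(depth, width int) *toySketch {
	m := make([][]uint64, depth)
	for i := range m {
		m[i] = make([]uint64, width)
	}
	return &toySketch{matrix: m, width: uint64(width)}
}

// column picks the bucket for data in the given row by prefixing the
// row index to the FNV-1a input (an assumption made for brevity).
func (s *toySketch) column(row int, data []byte) uint64 {
	h := fnv.New64a()
	h.Write([]byte{byte(row)})
	h.Write(data)
	return h.Sum64() % s.width
}

// AddN adds n occurrences of data: every bucket the item maps to and
// the total count grow by n. It returns the sketch to allow chaining.
func (s *toySketch) AddN(data []byte, n uint64) *toySketch {
	for row := range s.matrix {
		s.matrix[row][s.column(row, data)] += n
	}
	s.total += n
	return s
}

// Count returns the smallest counter among the item's buckets. With
// increment-only updates this never underestimates the true count.
func (s *toySketch) Count(data []byte) uint64 {
	lowest := s.matrix[0][s.column(0, data)]
	for row := 1; row < len(s.matrix); row++ {
		if c := s.matrix[row][s.column(row, data)]; c < lowest {
			lowest = c
		}
	}
	return lowest
}

func main() {
	s := newToySketch(3, 64)
	s.AddN([]byte("foo"), 10).AddN([]byte("bar"), 5)
	fmt.Println(s.Count([]byte("foo")) >= 10, s.total) // prints "true 15"
}
```

Because AddN only ever increments, the usual Count-Min guarantee (estimates are never below the true count) is preserved in this sketch.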
@dimitarvdimitrov

Contributor

Wouldn't this API skew the counts of other values?

Let's say we wanted to add these elements:

foo: 10
bar: 5
baz: 1

Let's say that foo and bar have been added already:

| Col 0 | Col 1 | Col 2 | Col 3 | Col 4 |
| --- | --- | --- | --- | --- |
| 15 (foo, bar) | 0 | 0 | 0 | 0 |
| 0 | 5 (bar) | 10 (foo) | 0 | 0 |
| 10 (foo) | 0 | 0 | 5 (bar) | 0 |

Now we want to call Set(baz, 1). baz also maps to cell (0,0), so overwriting that cell with 1 throws off the values for both foo and bar: a query takes the minimum across rows, so we will get counts of 1 for all 3 elements.

| Col 0 | Col 1 | Col 2 | Col 3 | Col 4 |
| --- | --- | --- | --- | --- |
| 1 (foo, bar, baz) | 0 | 0 | 0 | 0 |
| 1 (baz) | 5 (bar) | 10 (foo) | 0 | 0 |
| 10 (foo) | 0 | 1 (baz) | 5 (bar) | 0 |

I wonder whether an AddN(baz, 1) would be more in line with how CMS are normally used. The CMS paper doesn't describe set operations, only increment and query. I'm also not sure how Set interacts with the total count.
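
To make the difference concrete, here is a small Go sketch that replays the worked example above with both operations. The `grid` type, the `set`/`addN`/`count` helpers, and the hard-coded column positions are assumptions made for illustration (they mirror the tables, not real hash output or the code in either PR).

```go
package main

import "fmt"

// cols maps each item to its column in each of the three rows. The
// positions are hard-coded to mirror the tables in the example; a real
// sketch would derive them from hash functions.
var cols = map[string][3]int{
	"foo": {0, 2, 0},
	"bar": {0, 1, 3},
	"baz": {0, 0, 2},
}

// grid is a hypothetical 3x5 counter matrix.
type grid [3][5]uint64

// addN increments every cell the item maps to by n.
func (g *grid) addN(item string, n uint64) {
	for row, col := range cols[item] {
		g[row][col] += n
	}
}

// set overwrites every cell the item maps to with v. This is a
// stand-in for the Set behaviour questioned in the comment.
func (g *grid) set(item string, v uint64) {
	for row, col := range cols[item] {
		g[row][col] = v
	}
}

// count returns the minimum over the cells the item maps to.
func (g *grid) count(item string) uint64 {
	c := cols[item]
	lowest := g[0][c[0]]
	for row := 1; row < len(c); row++ {
		if v := g[row][c[row]]; v < lowest {
			lowest = v
		}
	}
	return lowest
}

func main() {
	var withSet, withAddN grid
	for _, g := range []*grid{&withSet, &withAddN} {
		g.addN("foo", 10)
		g.addN("bar", 5)
	}
	withSet.set("baz", 1)   // overwrites shared cell (0,0)
	withAddN.addN("baz", 1) // only increments it

	for _, item := range []string{"foo", "bar", "baz"} {
		fmt.Printf("%s: set=%d addN=%d\n",
			item, withSet.count(item), withAddN.count(item))
	}
	// Output:
	// foo: set=1 addN=10
	// bar: set=1 addN=5
	// baz: set=1 addN=1
}
```

With these assumed positions, the overwrite drags foo and bar down to 1, while the increment leaves their estimates at 10 and 5.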

I opened another PR to implement AddN: #41
