dnsdist: Optionally keep all data in LRU cache #16692
karelbilek wants to merge 2 commits into PowerDNS:master
Conversation
pdns/dnsdistdist/dnsdist-cache.cc
Outdated
}
}

std::pair<bool, bool> DNSDistPacketCache::getWriteLocked(MaybeLruCache<uint32_t, CacheValue>& map, DNSQuestion& dnsQuestion, bool& stale, PacketBuffer& response, time_t& age, uint32_t key, bool recordMiss, time_t now, uint32_t allowExpired, bool receivedOverUDP, bool dnssecOK, const std::optional<Netmask>& subnet, bool truncatedOK, uint16_t queryId, const DNSName::string_t& dnsQName)
These are all new functions, but the code is moved from get; it is here just so it can work with both a read lock and a write lock. The function signature is very ugly though.
Force-pushed from 6139fca to d83bf64
Pull Request Test Coverage Report for Build 21174868678

Warning: This coverage report may be inaccurate. This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

💛 - Coveralls
Just to keep you updated, I have not forgotten about this PR! The code is actually a lot cleaner than I feared, and I'm trying to see if we can make it more generic so it would also fit with a S3-FIFO experiment that I have had in a branch for a while.
I have looked at S3-FIFO before and it went a bit over my head. LRU was simpler to understand and implement. I will look at what you have implemented. S3-FIFO is lock-free, which is great.
I am looking at your branch: https://github.com/rgacogne/pdns/commits/ddist-s3-fifo-rebased With that branch, I am not sure how much of my branch would even be necessary.
Note that:
The more I read about S3-FIFO, the more I realize it's not really the right fit for my use-case here. Broadly speaking, my use-case is: I want to use dnsdist as a local stub resolver, and in the case where the upstream is down, I want to return at least something, even if the TTL is old. I do not care about throughput (as it's minimal anyway in the local case); I want to keep things in cache as long as possible. I don't care about lock contention that much, because there aren't that many requests anyway. The LRU with write locking on each operation, as I did in my implementation, is in the end ideal (for my use-case). The only upside of potentially using S3-FIFO would be that it would already be implemented in dnsdist and there wouldn't need to be another cache eviction algorithm :)
Yup, I agree that S3-FIFO is not really the silver bullet I was hoping for, which is why this branch is dusty :) It works very nicely in a lot of cases but is not optimal in others.
I have realized an LRU cache might be good even when you don't have keepStaleData, so I have changed the code somewhat; instead of the ugly named This seems to me less ugly.
Force-pushed from 224d787 to 9c9277e
I'm tentatively putting this in the 2.1.0 milestone and I'll do my best to make it land for 2.1.0, but it might have to wait until 2.2.0.
Note to self: also consider SIEVE which is much simpler than S3-FIFO and still has the property of not requiring a write-lock on a cache-hit: https://cachemon.github.io/SIEVE-website/ |
SIEVE looks very easy to implement indeed. I will have a look at it |
There is only one thing that requires special attention: the
@rgacogne do you think it would be better to allow both SIEVE and LRU? That is more configurable, but will make the code slightly more complex for doubtful benefit. I cannot decide. Looking at others, BIND seems to implement just SIEVE after recent changes.
I think supporting only SIEVE is enough. |
Looking more closely at SIEVE, it seems it might have some surprising properties that might not be ideal. If the "hand" comes all the way to the "head", SIEVE will keep evicting the latest entry unless it's visited at least once. Which might be fine! But it can be surprising. I am not entirely sure under what circumstances that would happen. I have implemented SIEVE already (not committed yet) but I am thinking I might keep both in the end.
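For reference, the mechanics being discussed can be sketched roughly as follows. This is a hypothetical toy version (illustrative names, no locking, no TTLs), not the PR's code: a FIFO list with one "visited" bit per entry, where a hit only sets the bit, and the eviction "hand" scans from the oldest entry toward the newest, clearing bits and evicting the first unvisited entry.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <unordered_map>

struct SieveCache {
  struct Entry { int key; std::string value; bool visited = false; };
  size_t capacity;
  std::list<Entry> fifo;                                // front = newest, back = oldest
  std::unordered_map<int, std::list<Entry>::iterator> index;
  std::list<Entry>::iterator hand = fifo.end();         // end() means "restart at the oldest"

  explicit SieveCache(size_t cap) : capacity(cap) {}

  std::string* get(int key) {
    auto it = index.find(key);
    if (it == index.end()) return nullptr;
    it->second->visited = true;                         // a hit only sets a bit, no reordering
    return &it->second->value;
  }

  void put(int key, std::string value) {
    if (auto it = index.find(key); it != index.end()) {
      it->second->value = std::move(value);
      it->second->visited = true;
      return;
    }
    if (index.size() >= capacity) evict();
    fifo.push_front(Entry{key, std::move(value)});
    index[key] = fifo.begin();
  }

  void evict() {
    if (hand == fifo.end()) hand = std::prev(fifo.end());   // (re)start at the oldest entry
    while (hand->visited) {                                 // visited entries get a second chance
      hand->visited = false;
      hand = (hand == fifo.begin()) ? std::prev(fifo.end()) : std::prev(hand);
    }
    auto victim = hand;                                     // first unvisited entry is evicted
    hand = (hand == fifo.begin()) ? fifo.end() : std::prev(hand);
    index.erase(victim->key);
    fifo.erase(victim);
  }
};
```

Note how the hand keeps its position between evictions; that is what produces the surprising behaviour near the head, where fresh unvisited entries can be evicted immediately.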
pdns/dnsdistdist/dnsdist-cache.hh
Outdated
struct DNSQuestion;

// lru cache is NOT locked; we need to use shared lock
template <typename K, typename V>
As I am editing the code more, I start to wonder if the templating is worth it, when it's never used anywhere else. I will keep it for now though.
I'd say it's not useful at the moment, and we can add it back later if we reuse that code with different types.
This is not yet doing any actual change; it is just to make the next commit diff smaller

Signed-off-by: Karel Bilek <kb@karelbilek.com>
Signed-off-by: Karel Bilek <kb@karelbilek.com>
Force-pushed from ec7a948 to a40bf9e
I have reworked the PR very significantly. I have made the cache container abstract and made the methods virtual; that allowed me to implement all 3 caches (the default "non-eviction" map-based one, LRU and SIEVE). The virtual dispatch might add some overhead, but my intuition was that the burtle hashing would be more expensive anyway and the virtual dispatch won't matter; I did not do any bigger measurements or profiling. The SIEVE container still goes through every element for TTL expiration. The SIEVE authors mention in the paper that a possible optimization is to use TTL range buckets as in Segcache (another cache system by some of the same authors); I have not done that here and leave it for later.
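The "abstract container with virtual methods" shape described here can be sketched like this. All names (CacheContainer, MapContainer, markVisited) are illustrative guesses, not the actual classes in the PR; the point is only that the three eviction strategies share one virtual interface and callers dispatch through a base pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>

struct CacheValue {
  uint32_t ttd = 0; // time-to-die; real values hold the packet, flags, etc.
};

// Common interface; LRU and SIEVE containers would override these too.
class CacheContainer {
public:
  virtual ~CacheContainer() = default;
  virtual CacheValue* lookup(uint32_t key) = 0;
  virtual void insert(uint32_t key, CacheValue value) = 0;
  virtual void markVisited(uint32_t key) = 0; // eviction-state hook
  virtual size_t size() const = 0;
};

// The default "non-eviction" map-based container.
class MapContainer final : public CacheContainer {
  std::unordered_map<uint32_t, CacheValue> d_map;
public:
  CacheValue* lookup(uint32_t key) override {
    auto it = d_map.find(key);
    return it == d_map.end() ? nullptr : &it->second;
  }
  void insert(uint32_t key, CacheValue value) override {
    d_map[key] = value;
  }
  void markVisited(uint32_t) override {
    // no eviction metadata to update in the plain map
  }
  size_t size() const override {
    return d_map.size();
  }
};
```

The cost mentioned above is one indirect call per operation, which is indeed likely to be dwarfed by hashing the packet to compute the key.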
When I try to run regression tests with
rgacogne
left a comment
I've only done a very high-level scan but I like it, thanks a lot!
The next step would be to test the new eviction algorithms in the regression tests as well, I think?
It's too late in the cycle for 2.1.0 but I'd like to have this in 2.2 :)
if (d_shards.at(shardIndex).d_entriesCount >= (d_settings.d_maxEntries / d_settings.d_shardCount)) {
  return;
// with LRU cache, we don't check size; we always insert

Suggested change:
- // with LRU cache, we don't check size; we always insert
+ // with LRU or SIEVE cache, we don't check size; we always insert
value = newValue;

map.visit(key);
Would it make sense to pass value to visit()? It would prevent a lookup in the SIEVE case.
As written now, this code doesn't have the visited atomic flag in value here, so I don't see how we would skip a lookup?
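For context, the kind of design being discussed usually stores the visited marker as an atomic inside the cached value, so a hit that already holds a reference to the value can set it without another map lookup and without upgrading to a write lock. A minimal sketch, with hypothetical names (not the PR's actual types):

```cpp
#include <atomic>
#include <cassert>
#include <string>

struct CacheValue {
  std::string payload;
  std::atomic<bool> visited{false}; // SIEVE hit marker, safe to set under a shared lock
};

// Called on a cache hit when the value is already in hand; no map lookup needed.
void markHit(CacheValue& value)
{
  value.visited.store(true, std::memory_order_relaxed);
}
```

Whether this applies depends on where the flag lives in the actual code; if visit() only receives the key, a lookup inside visit() is indeed unavoidable, which is the point being made above.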
I refrained from doing any extensive testing before the general shape is agreed on. Yeah, I should add some regression tests plus some general tests of the cache eviction logic. I am not entirely sure whether to leave in the code the
Definitely don't spend time on new tests specific to the eviction logic yet; my idea was to see if the existing tests are still passing with the new eviction logic. I expect some won't, because they rely on the existing behaviour, but most of them should pass.
The only reason it would make sense for me to keep the "none" case around is if the new eviction algorithms have a noticeable impact on performance that we cannot solve. Otherwise I don't see any reason to keep it; the memory overhead is likely negligible.
Yeah, I did test it manually with the regression tests (by manual text replace) and they pass, except for

There shouldn't be a big performance impact, but I have not measured it yet.
I have realized the code in this MR doesn't regularly delete the expired entries in the background thread; however, when the entry is older than I don't know how to solve it though; I don't want to go through the whole cache in the background thread, and I like that I got rid of that for SIEVE/LRU; I want to keep the idea of TTL buckets for later. One solution is really to just go all-in and implement just SIEVE and the TTL buckets (as they don't really work with LRU).
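The TTL-bucket idea mentioned in this thread can be sketched as follows. This is a hypothetical illustration (names and the 60-second window are assumptions, not from the PR or Segcache itself): entries are grouped by the time window their TTL ends in, so the cleanup pass only drains buckets whose window has fully passed instead of scanning the whole cache.

```cpp
#include <cassert>
#include <cstdint>
#include <ctime>
#include <map>
#include <vector>

struct TtlBuckets {
  static constexpr time_t bucketWidth = 60;           // assumed 60-second windows
  std::map<time_t, std::vector<uint32_t>> buckets;    // window start -> keys expiring in it

  void add(uint32_t key, time_t ttd)
  {
    // place the key in the window containing its time-to-die
    buckets[ttd - (ttd % bucketWidth)].push_back(key);
  }

  // Drain every bucket whose whole window is already in the past.
  std::vector<uint32_t> takeExpired(time_t now)
  {
    std::vector<uint32_t> out;
    while (!buckets.empty() && buckets.begin()->first + bucketWidth <= now) {
      auto& keys = buckets.begin()->second;
      out.insert(out.end(), keys.begin(), keys.end());
      buckets.erase(buckets.begin());
    }
    return out;
  }
};
```

The trade-off is coarse granularity: an entry can linger up to one bucket width past its TTL, in exchange for cleanup that touches only expired buckets.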
Short description
To be able to keep all data in cache, and not delete expired entries, we need another mechanism to evict from the cache; LRU is the most useful here, as the actually used elements will stay in the cache.
MaybeLruCache either behaves like a regular map (if isLRU is false) or additionally keeps a linked list + map for constant-time putFront.
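A rough sketch of that idea, using hypothetical simplified names rather than the PR's actual code: an unordered_map holds iterators into a std::list ordered by recency, so moving an entry to the front on access (the putFront above) is a single O(1) splice, and eviction pops the back.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

template <typename K, typename V>
class MaybeLruCache {
  bool d_isLRU;
  std::list<std::pair<K, V>> d_order;  // front = most recently used
  std::unordered_map<K, typename std::list<std::pair<K, V>>::iterator> d_map;
public:
  explicit MaybeLruCache(bool isLRU) : d_isLRU(isLRU) {}

  V* get(const K& key)
  {
    auto it = d_map.find(key);
    if (it == d_map.end()) return nullptr;
    if (d_isLRU) {
      // putFront: relink the node to the head in O(1), no allocation
      d_order.splice(d_order.begin(), d_order, it->second);
    }
    return &it->second->second;
  }

  void put(const K& key, V value)
  {
    if (auto it = d_map.find(key); it != d_map.end()) {
      it->second->second = std::move(value);
      if (d_isLRU) d_order.splice(d_order.begin(), d_order, it->second);
      return;
    }
    d_order.emplace_front(key, std::move(value));
    d_map[key] = d_order.begin();
  }

  bool evictOldest()                   // only meaningful when isLRU is true
  {
    if (d_order.empty()) return false;
    d_map.erase(d_order.back().first);
    d_order.pop_back();
    return true;
  }

  size_t size() const { return d_map.size(); }
};
```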
One of the changes is also that a cache lookup can now require a write lock.
I have hand-tested this code and it works for my purposes. I have not yet written any unit tests or regression tests.
For my local testing, I have had an assert in the code to check that all three containers have the same size at all times, but I have removed it here.

Checklist
I have: