
ECS Scope Preservation and NODATA‑Scoped Caching for Incoming EDNS Subnet#16819

Open
jd82k wants to merge 1 commit into PowerDNS:master from jd82k:master

Conversation

@jd82k

@jd82k jd82k commented Feb 4, 2026

Options

  1. ecs-scope-zero-on-no-record (scope_zero_on_no_record), default true

    • Meaning: when enabled, the recursor forces ECS scope to 0 for NODATA responses (including CNAME‑chain NODATA), regardless of the authoritative scope.
    • When disabled, the authoritative scope is preserved when available.
    • Note: when scope_zero_on_no_record=false and the authoritative scope is 0, negative caching remains global (negcache is not ECS‑aware), so there is no ECS scoping effect in that case.
  2. return-incoming-edns-subnet (return_incoming_edns_subnet), default true

    • Meaning: when enabled (and use-incoming-edns-subnet is enabled), the recursor echoes an ECS option back to the client.
    • Purpose: returning ECS in the response conveys the applicable scope, so downstream resolvers and caches can avoid reusing a subnet‑specific answer for unrelated clients. This prevents incorrect upstream/downstream cache reuse.
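As a rough sketch of how these two options combine when deciding which ECS to return downstream (all names here are illustrative, not the PR's actual code):

```cpp
#include <cstdint>
#include <optional>

// Hypothetical, simplified model of the decision described above.
struct EcsOption {
  uint8_t sourcePrefix; // prefix length the client sent
  uint8_t scopePrefix;  // scope reported back to the client
};

// isNoData: the final answer is NoError with no RRset for the QTYPE.
// authScope: scope learned from the authoritative answer, if any.
std::optional<EcsOption> buildResponseEcs(bool returnIncomingEcs,
                                          bool scopeZeroOnNoRecord,
                                          bool clientSentEcs,
                                          uint8_t clientSourcePrefix,
                                          bool isNoData,
                                          uint8_t authScope)
{
  if (!returnIncomingEcs || !clientSentEcs) {
    return std::nullopt; // no ECS option echoed downstream
  }
  uint8_t scope = authScope;
  if (isNoData && scopeZeroOnNoRecord) {
    scope = 0; // force scope /0 for NODATA, per option 1
  }
  return EcsOption{clientSourcePrefix, scope};
}
```

The client's address and prefix are echoed unchanged; only the scope varies with the option settings.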

Record Cache (ECS scope preservation)

  1. Record cache entries now store ECS scope

    • Added d_ecsScope to MemRecursorCache::CacheEntry.
    • replace() writes scope into the cache entry (only when the entry is ECS‑specific).
    • get() and handleHit() return the stored scope via an output parameter.
  2. Cache hits now propagate ECS scope back to the response path

    • SyncRes::doCacheCheck() and SyncRes::doCNAMECacheCheck() read ECS scope from record cache hits and set d_answerECSScope, so cached answers return the correct scope.
  3. Cache dump/load includes ECS scope

    • PBCacheEntry now carries optional_uint32_ecsScope so ECS scope survives getRecordSets() / putRecordSets().
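A minimal model of the cache-side change described above, with the scope carried out of the lookup via an output parameter (the types and names are simplified stand-ins, not the real MemRecursorCache code):

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Illustrative-only shape of the change: the cache entry keeps the
// authoritative scope alongside its records, and lookups hand it back
// through an output parameter.
struct CacheEntry {
  std::vector<int> records;          // stand-in for the stored RRset
  std::optional<uint8_t> d_ecsScope; // set only for ECS-specific entries
};

bool get(const CacheEntry& entry, std::vector<int>& out,
         std::optional<uint8_t>* ecsScope)
{
  out = entry.records;
  if (ecsScope != nullptr) {
    *ecsScope = entry.d_ecsScope; // propagate scope to the response path
  }
  return !out.empty();
}
```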

Negative Cache (NODATA + ECS)

  1. Avoid global negcache pollution for ECS‑specific NODATA
    • In SyncRes::processRecords(), when scope_zero_on_no_record=false and the authoritative NODATA response contains a non‑zero ECS scope, we skip inserting the NODATA entry into negcache.
    • This avoids a global negative cache entry that would incorrectly apply to all scopes.
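The guard described above could look roughly like this (a hypothetical helper, not the actual processRecords() code):

```cpp
#include <cstdint>

// Skip the global negcache insert when the NODATA answer was ECS-scoped
// and the operator disabled scope_zero_on_no_record: a global entry
// would wrongly apply to all subnets.
bool shouldInsertIntoNegCache(bool scopeZeroOnNoRecord, uint8_t authScope)
{
  if (!scopeZeroOnNoRecord && authScope > 0) {
    return false; // ECS-specific NODATA: do not cache globally
  }
  return true; // legacy behavior, or a genuinely global negative answer
}
```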

Response ECS Scope when scope_zero_on_no_record=true

  1. Scope forced to 0 only for true NODATA
    • In pdns_recursor.cc, ECS scope is forced to 0 only when the final response is NODATA (NoError with no relevant RRSet for the requested QTYPE), including CNAME‑chain NODATA.
    • Other responses keep the authoritative scope.
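The "true NODATA" test described above amounts to: NoError, and no RRset of the requested QTYPE in the answer section, even if CNAMEs are present. A simplified sketch with illustrative types:

```cpp
#include <cstdint>
#include <vector>

// Illustrative record-type enum; the real code uses QType.
enum class RType : uint8_t { A, AAAA, CNAME, SOA };

bool isNoData(bool rcodeIsNoError, RType qtype, const std::vector<RType>& answer)
{
  if (!rcodeIsNoError) {
    return false; // NXDOMAIN, SERVFAIL, etc. are not NODATA
  }
  for (const auto& rtype : answer) {
    if (rtype == qtype) {
      return false; // a relevant RRset exists, so not NODATA
    }
  }
  return true; // covers plain NODATA and CNAME-chain NODATA alike
}
```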

Short description

Preserves authoritative ECS scope across record cache hits, returns ECS to downstream clients, and avoids global caching of ECS‑specific NODATA responses.

Checklist

I have:

  • read the CONTRIBUTING.md document
  • read and accepted the Developer Certificate of Origin document, including the AI Policy, and added a "Signed-off-by" to my commits
  • compiled this code
  • tested this code
  • included documentation (including possible behaviour changes)
  • documented the code
  • added or modified regression test(s)
  • added or modified unit test(s)
  • checked that this code was merged to master

@omoerbeek
Member

omoerbeek commented Feb 5, 2026

Thanks. I'll try to look at this PR soon. In the meantime, please make sure that the PR at least compiles.

@coveralls

coveralls commented Feb 5, 2026

Pull Request Test Coverage Report for Build 21971072129

Details

  • 77 of 88 (87.5%) changed or added relevant lines in 7 files are covered.
  • 13401 unchanged lines in 188 files lost coverage.
  • Overall coverage increased (+5.8%) to 71.563%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| pdns/recursordist/syncres.cc | 29 | 30 | 96.67% |
| pdns/recursordist/pdns_recursor.cc | 26 | 36 | 72.22% |

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| modules/gpgsqlbackend/gpgsqlbackend.cc | 1 | 88.89% |
| modules/ldapbackend/ldapauthenticator.cc | 1 | 6.05% |
| pdns/auth-zonecache.hh | 1 | 93.33% |
| pdns/backends/gsql/gsqlbackend.hh | 1 | 98.88% |
| pdns/dnspacket.hh | 1 | 94.44% |
| pdns/dnswriter.hh | 1 | 76.6% |
| pdns/dolog.hh | 1 | 6.66% |
| pdns/dynlistener.hh | 1 | 0.0% |
| pdns/logging.cc | 1 | 80.34% |
| pdns/recursordist/lwres.hh | 1 | 75.0% |
Totals Coverage Status
Change from base Build 21966021133: 5.8%
Covered Lines: 129600
Relevant Lines: 166044

💛 - Coveralls

@rgacogne
Member

rgacogne commented Feb 5, 2026

Options

1. `ecs-scope-zero-on-no-record` (`scope_zero_on_no_record`), default **true**
   
   * Meaning: when enabled, the recursor forces ECS scope to **0** for NODATA responses (including CNAME‑chain NODATA), regardless of the authoritative scope.

That sounds like a very bad idea, especially for (but not limited to) CNAME chains. An authoritative server can serve a scoped CNAME to some clients and a different one (or A, AAAA, whatever) to others.

2. `return-incoming-edns-subnet` (`return_incoming_edns_subnet`), default **true**
   
   * Meaning: when enabled (and `use-incoming-edns-subnet` is enabled), the recursor echoes an ECS option back to the client.
   * Purpose: returning ECS in the response conveys the applicable scope, so downstream resolvers and caches can avoid reusing a subnet‑specific answer for unrelated clients. This prevents incorrect upstream/downstream cache reuse.

This might work in some very specific environments where you can ensure that clients are always sending a very narrow source (/32 and /128), otherwise you might be sending a badly scoped response.

None of this should be enabled by default and I'm not convinced we should merge it at all.

@jd82k
Author

jd82k commented Feb 5, 2026

@rgacogne
Thanks for the feedback — I agree these options should be used carefully. A few clarifications on intent and defaults:

1) ecs-scope-zero-on-no-record (scope_zero_on_no_record)

  • The reason I set the default to true is not because I think it is ideal behavior.
  • It reflects current PowerDNS behavior: when the response is NODATA, the recursor effectively treats it as scope‑zero and inserts it into negcache, which then applies globally and can override record cache entries. This happens even if the authoritative server returned a non‑zero ECS scope.
  • I added this option so operators can turn that behavior off. In real deployments I would set it to false specifically to avoid the global‑scope negcache issue you described (e.g., scoped CNAME chains or scoped A/AAAA).
  • In other words: the default keeps existing behavior, and the option is there to allow the safer behavior.

2) return-incoming-edns-subnet (return_incoming_edns_subnet)

  • I agree it must be used responsibly. The motivation is correctness in caching, not blindly echoing ECS.
  • A “good recursor” should forward ECS to the client only when the authoritative response actually contained ECS scope information. If the auth doesn’t return ECS, we should not return it.
  • Today PowerDNS ignores the authoritative ECS scope in the downstream response, which makes downstream caches treat scoped answers as globally reusable. This becomes a real problem in multi‑site setups where one recursor uses another as upstream.
  • This option exists to let operators enable correct ECS propagation when they need it. If your environment can’t guarantee narrow ECS or you don’t want ECS in responses, you can disable it.

@rgacogne
Member

rgacogne commented Feb 5, 2026

* It reflects _current PowerDNS behavior_: when the response is NODATA, the recursor effectively treats it as scope‑zero and inserts it into negcache, which then applies globally and can override record cache entries. This happens **even if the authoritative server returned a non‑zero ECS scope**.

Are you sure this is true? I don't think it is:

  • a negative response cannot be ECS-scoped, so the recursor will effectively add it to the negative cache without a scope
  • but a CNAME can be, and the recursor will store it with its scope, and later will retrieve the scope with the CNAME when reconstructing the CNAME chain.

@omoerbeek
Member

Also: see https://www.rfc-editor.org/rfc/rfc7871#section-7.4 which basically says: negative answers should/can be treated as having scope /0.

@rgacogne
Member

rgacogne commented Feb 5, 2026

For CNAME chains, see https://www.rfc-editor.org/rfc/rfc7871#section-7.2.1:

For the specific case of a Canonical Name (CNAME) chain, the Authoritative Nameserver SHOULD only place the initial CNAME record in the Answer section, to have it cached unambiguously and appropriately. Most modern Recursive Resolvers restart the query with the CNAME, so the remainder of the chain is typically ignored anyway. For message-focused resolvers, rather than RRset-focused ones, this will mean caching the entire CNAME chain at the longest PREFIX-LENGTH of any RRset in the chain.

@jd82k
Author

jd82k commented Feb 5, 2026

@rgacogne @omoerbeek
Thanks for the pointers. I’ve read those sections too, and I agree they set a baseline. A few clarifications on why I still think the change (and configurability) is needed:

1) RFC7871 §7.4 (negative answers and scope /0)
Yes, it says negative answers can be treated as /0, but it is not a strict MUST in all operational contexts. It’s guidance meant to avoid over‑scoping negatives.
The practical issue is that PowerDNS currently makes every NODATA globally negative even when the authoritative response explicitly carries ECS scope > 0. In real deployments with ECS‑aware CDNs or geo routing, that can produce incorrect results for other clients until the negcache expires.
That’s why I added the option: keep legacy behavior by default, but allow operators who need correctness to preserve ECS scope and avoid global negcache pollution.

2) RFC7871 §7.2.1 (CNAME chains)
Agreed — authoritative servers should ideally return only the initial CNAME, and recursive resolvers should restart the query.
However, in practice:

  • Many servers include CNAME + SOA, which triggers NODATA handling in PowerDNS for the original QTYPE.
  • The CNAME RRSet can indeed be scoped and cached, but the negcache entry for the original QTYPE is global, so it still masks valid answers for other ECS scopes.
  • This is exactly what we observed in tests: scope‑specific CNAME in record cache is not enough because the negcache entry blocks other scopes.

So the change is not contradicting the RFC; it’s adding a switch to avoid a real operational failure mode when ECS is used. If a deployment wants strict RFC7871 semantics, they keep the default behavior. If they need ECS correctness across scopes, they can disable it.

@jd82k
Author

jd82k commented Feb 5, 2026

Example scenario:

For www.youtube.com AAAA:

  • With ECS subnet 149.22.95.0/24, the authoritative server returns an AAAA record.
  • With ECS subnet 111.32.70.0/24, the authoritative server returns NODATA + SOA, and the SOA carries a non‑zero ECS scope.

PowerDNS currently ignores that non‑zero scope for NODATA and inserts the negative answer into the global negcache.
As a result, the negcache entry overrides the previously cached scoped AAAA answer.

So when 149.22.95.0/24 queries AAAA again, it receives the same NODATA/SOA result as 111.32.70.0/24, even though the authoritative response intended different answers for different ECS scopes.

That is why I added the option: it allows administrators to decide whether NODATA should be forced to scope /0 (current behavior) or whether ECS‑scoped NODATA should not be inserted into the global negcache. The default keeps existing behavior for backward compatibility.

@jd82k
Author

jd82k commented Feb 5, 2026

The option return-incoming-edns-subnet (return_incoming_edns_subnet) is also related to PowerDNS issue #14022. In clustered deployments, the recursor does not return ECS to downstream clients even when the upstream provided ECS, which leads to incorrect downstream caching. return-incoming-edns-subnet makes that behavior configurable so ECS scope can be preserved end‑to‑end when needed.

@rgacogne
Member

rgacogne commented Feb 5, 2026

I'm not disagreeing on the problem statement that right now chaining caching resolvers with ECS does not work, but I'm unconvinced about some details of the proposed solution. Otto will do a proper review in the coming days in any case, as he stated already.

@omoerbeek
Member

Example scenario:

For www.youtube.com AAAA:

* With ECS subnet **149.22.95.0/24**, the authoritative server returns an AAAA record.

* With ECS subnet **111.32.70.0/24**, the authoritative server returns **NODATA + SOA**, and the SOA carries a **non‑zero ECS scope**.

PowerDNS currently ignores that non‑zero scope for NODATA and inserts the negative answer into the global negcache. As a result, the negcache entry overrides the previously cached scoped AAAA answer.

So when 149.22.95.0/24 queries AAAA again, it receives the same NODATA/SOA result as 111.32.70.0/24, even though the authoritative response intended different answers for different ECS scopes.

That is why I added the option: it allows administrators to decide whether NODATA should be forced to scope /0 (current behavior) or whether ECS‑scoped NODATA should not be inserted into the global negcache. The default keeps existing behavior for backward compatibility.

I get

$ dig +nord @ns1.google.com +subnet=111.32.70.0/24 www.youtube.com aaaa

; <<>> DiG 9.20.11-1ubuntu2.1-Ubuntu <<>> +nord @ns1.google.com +subnet=111.32.70.0/24 www.youtube.com aaaa
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2870
;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
; CLIENT-SUBNET: 111.32.70.0/24/17
;; QUESTION SECTION:
;www.youtube.com.		IN	AAAA

;; ANSWER SECTION:
www.youtube.com.	300	IN	CNAME	youtube-ui.l.google.com.

;; Query time: 15 msec
;; SERVER: 2001:4860:4802:32::a#53(ns1.google.com) (UDP)
;; WHEN: Thu Feb 12 08:34:32 CET 2026
;; MSG SIZE  rcvd: 89

That is not a NODATA + SOA response. It isn't the final complete answer, but rec follows the target and constructs the complete response.

}
sendit:;

if (g_useIncomingECS && comboWriter->d_ecsFound && !resolver.wasVariable() && !variableAnswer) {
Member


This is wrong. It will break setting variable from Lua scripts and also cases where SyncRes indicates the answer should not be packetcached.

Author


In the old implementation, we always returned ECS with scope=0 in that if block.
A scope=0 ECS tells downstream caches the answer is broadly reusable.

That is why we had to exclude variable answers before:
if a variable answer was returned with scope=0, downstream might cache and reuse it too broadly, even though we intentionally do not packet-cache it locally.

My change is different: we no longer force scope=0.
We now return the scope learned from upstream ECS.
So downstream gets the real ECS scope and can shard cache correctly by subnet, instead of being told “global” by scope=0.

Because of that, the old !wasVariable()/!variableAnswer guard is no longer needed for this specific ECS path.
That guard was mainly protecting against the incorrect scope=0 behavior.

QType d_qtype; // 2
mutable vState d_state{vState::Indeterminate}; // 1
bool d_auth; // 1
uint8_t d_ecsScope{0};
Member


This field seems to add double bookkeeping of scope, which is also in d_netmask.

Author


d_ecsScope is not duplicate bookkeeping of d_netmask.
d_netmask is a cache selection key (which entry matches a client subnet), while d_ecsScope preserves the response ECS scope to be returned downstream. They can diverge (for example we clamp the stored netmask to min(scope, source-prefix) and in routing-tagged entries netmask may be empty by design), so scope cannot be reliably reconstructed from d_netmask alone.
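The divergence claimed here can be illustrated with a small sketch of the clamping rule, assuming the min(scope, source-prefix) behavior described above (hypothetical names, not the PR's code):

```cpp
#include <algorithm>
#include <cstdint>

// Why the stored netmask and the returned scope can diverge:
// the cache key is clamped, the scope is kept as received.
struct StoredEcs {
  uint8_t netmaskBits; // cache selection key (d_netmask analogue)
  uint8_t scopeBits;   // scope to report downstream (d_ecsScope analogue)
};

StoredEcs storeEcs(uint8_t sourcePrefix, uint8_t authScope)
{
  return StoredEcs{
      std::min(authScope, sourcePrefix), // clamp the key to min(scope, source)
      authScope                          // preserve the real scope
  };
}
```

With a /24 source and a /32 auth scope, the key stays at /24 while the reported scope is /32, so the scope cannot be recovered from the key alone.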

Member


I don't think you are right. srcmask in asyncresolve is set to the received ECS including scope. It is an in/out argument to asyncresolve. It is further processed as ednsmask in doResolveAtThisIP and finally ends up in updateCacheFromRecords.

Cache entries store the received scope, as it indicates which client IPs match.

Author


Thanks. I agree with your flow description, but one key point is incorrect: srcmask is not a full representation of the received ECS scope.

Code evidence:

  1. In lwres.cc, srcmask is reset before parsing the reply (lwres.cc:688).
  2. The received scope is stored in lwr->d_ednsECScope (lwres.cc:870).
  3. srcmask is only set when scope != 0, and it is clamped with min(scope, source-prefix) (lwres.cc:875-879).

So srcmask cannot represent all received scope states (notably scope=0).

Also, cache selection uses d_netmask, not d_ecsScope:

  • ECS index / best-match is done on netmask (recursor_cache.cc:330+).
  • entryMatches() checks d_netmask.match(who) (recursor_cache.cc:406-407).

And d_netmask can intentionally differ from scope:

  • cache key netmask is empty when routingTag is present (recursor_cache.cc:670),
  • while d_ecsScope is still stored (recursor_cache.cc:717).

Therefore d_ecsScope is not duplicate bookkeeping of d_netmask; they carry different semantics (selection key vs returned ECS scope metadata).
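The difference between the two representations can be modeled as follows; parseReplyScope and both out-parameters are illustrative, not the actual lwres.cc code:

```cpp
#include <cstdint>
#include <optional>

// A srcmask-style out-parameter is only updated when scope != 0, so after
// the call a caller cannot tell "scope 0" apart from "no ECS at all".
// An explicit optional scope field preserves every state.
void parseReplyScope(bool ecsPresent, uint8_t scope,
                     std::optional<uint8_t>& srcmaskStyle,
                     std::optional<uint8_t>& explicitScope)
{
  srcmaskStyle.reset(); // reset before parsing the reply
  explicitScope.reset();
  if (ecsPresent) {
    explicitScope = scope; // records every state, including scope 0
    if (scope != 0) {
      srcmaskStyle = scope; // lossy: scope 0 leaves this unset
    }
  }
}
```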

@omoerbeek
Member

For the caching of scoped negative answers: I agree that it can lead to issues, but I don't agree that the recursor is the place to fix this. The single example you mention does not reproduce and the RFC is clear about it (even though it does not use MUST):

   It is RECOMMENDED that no specific behavior regarding
   negative answers be relied upon, but that Authoritative Nameservers
   should conservatively expect that Intermediate Nameservers will treat
   all negative answers as /0; therefore, they SHOULD set SCOPE PREFIX-
   LENGTH accordingly.

ECS handling already comes with quite a high cost and complexity. Conditionally not caching negative scoped answers will make this cost and complexity even higher. Not caching some negative answers will cause a potential big increase of outgoing traffic.

@omoerbeek
Member

I have made some comments on detail, but let's look at the main logic of your PR.

An ECS EDNS record has three fields: address (specifying a network), prefixlength and scope.

Desired behavior:

If a client sends an ECS (address and prefixlength set) it should receive in the answer the same address and prefixlength, but with a specific scope added to it. The scope is received from the auth and may be wider than the prefixlength sent. The received scope is also what gets cached and determines if a lookup with ECS set matches in the recordcache.

If an answer is constructed from multiple records received from auths (e.g. a CNAME chain), it should set a scope that corresponds to the most narrow scope of the records involved.

The ECS received from the client is available from comboWriter->d_ednssubnet. The ECS sent out (which might differ, see SyncRes::setQuerySource) is available as ecsSubnet in the PacketID struct. The received scope from the auth is available in the d_netmask field of the recordcache entries and in the srcmask field of the asyncresolve call. The subnet is passed to the record cache lookup in the who argument.

Currently the scope is not returned by the recordcache get() call. I would expect to see code to return the scope, taking the d_netmask field and returning the scope value (probably the right place is the Extra struct). The PR is adding an extra argument to the get() call and also an extra redundant field to the cache entry struct.

class LWResult also has an extra field that I think is doing double bookkeeping: asyncresolve() already returns the received scope via the srcmask argument.

But more importantly:

The scope end result to be sent to the client has to be computed taking into account that multiple records (originally received from multiple auths / the result of multiple recordcache lookups) can be involved. But I don't see that code. Instead I see code that sets the scope based on characteristics of the produced recordset, and a single scope value that looks to be set from the last answer record involved.

So the main logic of your PR is not doing the right thing. The (recursive) calls to SyncRes::doResolve should do the actual computation of the scope to be returned to the client.
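The aggregation rule described here (the final scope is the narrowest scope, i.e. the largest prefix length, of all contributing records) can be sketched as a simple fold over per-RRset scopes; this is a hypothetical helper, not code from the recursor:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Fold the per-RRset scopes of a composed answer (e.g. a CNAME chain)
// into the scope reported to the client: the largest prefix length wins.
uint8_t mergeAnswerScope(const std::vector<uint8_t>& rrsetScopes)
{
  uint8_t merged = 0; // scope /0 when no record carried ECS
  for (uint8_t scope : rrsetScopes) {
    merged = std::max(merged, scope); // narrowest (longest prefix) wins
  }
  return merged;
}
```

For a chain whose CNAME carries scope /16 and whose target A record carries /24, the merged result is /24, independent of processing order.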

@jd82k
Author

jd82k commented Feb 12, 2026

I have made some comments on detail, but let's look at the main logic of your PR.

An ECS EDNS record has three fields: address (specifying a network), prefixlength and scope.

Desired behavior:

If a client sends an ECS (address and prefixlength set) it should receive in the answer the same address and prefixlength, but with a specific scope added to it. The scope is received from the auth and may be wider than the prefixlength sent. The received scope is also what gets cached and determines if a lookup with ECS set matches in the recordcache.

If an answer is constructed from multiple records received from auths (e.g. a CNAME chain), it should set a scope that corresponds to the most narrow scope of the records involved.

The ECS received from the client is available from comboWriter->d_ednssubnet. The ECS sent out (which might differ, see SyncRes::setQuerySource) is available as ecsSubnet in the PacketID struct. The received scope from the auth is available in the d_netmask field of the recordcache entries and in the srcmask field of the asyncresolve call. The subnet is passed to the record cache lookup in the who argument.

Currently the scope is not returned by the recordcache get() call. I would expect to see code to return the scope, taking the d_netmask field and returning the scope value (probably the right place is the Extra struct). The PR is adding an extra argument to the get() call and also an extra redundant field to the cache entry struct.

class LWResult also has an extra field that I think is doing double bookkeeping: asyncresolve() already returns the received scope via the srcmask argument.

But more importantly:

The scope end result to be sent to the client has to be computed taking into account that multiple records (originally received from multiple auths / the result of multiple recordcache lookups) can be involved. But I don't see that code. Instead I see code that sets the scope based on characteristics of the produced recordset, and a single scope value that looks to be set from the last answer record involved.

So the main logic of your PR is not doing the right thing. The (recursive) calls to SyncRes::doResolve should do the actual computation of the scope to be returned to the client.

Thanks for the detailed review. I agree with your main point, and I have now changed the implementation to compute the final ECS scope as the narrowest scope (largest prefix length) across all RRsets that contribute to the final answer.

What is fixed now:

  1. Scope is aggregated across recursive resolution
    I added explicit scope merging in the SyncRes::doResolve paths, instead of overwriting with a single last value.
    The implementation uses max(prefixLength), i.e. the narrowest scope wins.

Example:

  • RRset A from auth#1 has scope /16
  • RRset B from auth#2 (the CNAME target) has scope /24
    The final returned scope is /24 (the narrowest), not whichever RRset happened to be processed last.

  2. recordcache get() now returns and aggregates scope correctly
    get() now carries ecsScope out, and handleHit() aggregates with max() across multiple cache hits.
    (See pdns/recursordist/recursor_cache.cc:276)

Example:

  • QTYPE=ADDR hits both A (/20) and AAAA (/24) cache entries
    The final scope returned from the cache lookup is /24 (the narrowest), not overwritten by the later hit.

  3. Why d_netmask cannot replace d_ecsScope
    d_netmask is the cache key mask used for lookup/match, not the authoritative scope from the upstream answer.
    Those are not equivalent.

Example:

  • The outgoing ECS source prefix is /24
  • The auth reply returns scope /16 (or scope 0)
    d_netmask remains /24, but the returned scope is /16 (or 0), so the scope cannot be reconstructed from d_netmask.
    Therefore d_ecsScope is not redundant bookkeeping.

  4. Why LWResult::d_ednsECScope is not redundant with srcmask
    srcmask is only set when scope != 0 in asyncresolve, so it cannot represent "ECS present with scope=0".
    (See pdns/recursordist/lwres.cc around :815 and :870-879)
    So we still need an explicit optional scope field to preserve semantics.

@jd82k jd82k requested a review from omoerbeek February 12, 2026 09:25
@omoerbeek
Member

I need to find some time to confirm your observations and the updates to the PR. That will not be this week.

@jd82k
Author

jd82k commented Feb 12, 2026

I need to find some time to confirm your observations and the updates to the PR. That will not be this week.

Thanks for the update — no problem.

Take your time to verify. I’ve already adjusted the PR logic to aggregate ECS scope as the narrowest scope across all contributing RRsets, and I can help rework any remaining parts once you’ve had time to review.

I’ll wait for your follow-up.

@rgacogne
Member

The unit test runner no longer links:

  clang++-19  -o testrunner testrunner.p/testrunner.cc.o testrunner.p/pollmplexer.cc.o testrunner.p/epollmplexer.cc.o -L/usr/local/lib -fsanitize=address,undefined -Wl,--as-needed -Wl,--no-undefined -Wl,--whole-archive -Wl,--start-group librec-test.a -Wl,--no-whole-archive -Wl,-z,relro -Wl,-z,now -fprofile-instr-generate -fcoverage-mapping -Wl,-rpath,/usr/local/lib -O1 -Werror=vla -Werror=shadow -Wformat=2 -Werror=format-security -fstack-clash-protection -fstack-protector-strong -Werror=string-plus-int -Wp,-D_GLIBCXX_ASSERTIONS '-Wl,-rpath,$ORIGIN/rec-rust-lib/rust' librec-common.a rec-rust-lib/rust/librecrust.a ext/arc4random/libarc4random.a ext/json11/libjson11.a librec-dnslabeltext.a ext/probds/libprobds.a /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.83.0 /usr/lib/x86_64-linux-gnu/libboost_context.so.1.83.0 -pthread /usr/lib/x86_64-linux-gnu/libcrypto.so -lresolv -L/usr/lib/x86_64-linux-gnu -lnetsnmpmibs -lnetsnmpagent -lnetsnmp /usr/lib/x86_64-linux-gnu/libsodium.so /usr/lib/x86_64-linux-gnu/libssl.so /usr/lib/x86_64-linux-gnu/libgnutls.so /usr/lib/x86_64-linux-gnu/libluajit-5.1.so /usr/lib/x86_64-linux-gnu/libfstrm.so /usr/lib/x86_64-linux-gnu/libcurl.so /usr/lib/x86_64-linux-gnu/libcap.so -lresolv -L/usr/lib/x86_64-linux-gnu -lnetsnmpmibs -lnetsnmpagent -lnetsnmp /usr/lib/x86_64-linux-gnu/libboost_filesystem.so.1.83.0 -Wl,--end-group
  /usr/bin/ld: librec-common.a.p/syncres.cc.o: in function `SyncRes::processRecords(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, DNSName const&, QType, DNSName const&, LWResult&, bool, std::vector<DNSRecord, std::allocator<DNSRecord> >&, std::set<DNSName, std::less<DNSName>, std::allocator<DNSName> >&, DNSName&, DNSName&, bool&, bool&, vState&, bool, bool, unsigned int, int&, bool&, unsigned int)':
  /__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/../../../../../../tmp/rec-meson-dist-build/meson-dist/pdns-recursor-0.0.0-git1/syncres.cc:(.text+0x81b86): undefined reference to `g_ECSScopeZeroOnNoRecord'
  Error: /usr/bin/ld: /__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/../../../../../../tmp/rec-meson-dist-build/meson-dist/pdns-recursor-0.0.0-git1/syncres.cc:(.text+0x81b98): undefined reference to `g_ECSScopeZeroOnNoRecord'
  Error: /usr/bin/ld: /__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/../../../../../../tmp/rec-meson-dist-build/meson-dist/pdns-recursor-0.0.0-git1/syncres.cc:5306:(.text+0x873e3): undefined reference to `g_ECSScopeZeroOnNoRecord'
  Error: /usr/bin/ld: /__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/../../../../../../tmp/rec-meson-dist-build/meson-dist/pdns-recursor-0.0.0-git1/syncres.cc:5306:(.text+0x89910): undefined reference to `g_ECSScopeZeroOnNoRecord'
  clang++-19: error: linker command failed with exit code 1 (use -v to see invocation)

@jd82k
Author

jd82k commented Feb 13, 2026

The unit test runner no longer links:

  clang++-19  -o testrunner testrunner.p/testrunner.cc.o testrunner.p/pollmplexer.cc.o testrunner.p/epollmplexer.cc.o -L/usr/local/lib -fsanitize=address,undefined -Wl,--as-needed -Wl,--no-undefined -Wl,--whole-archive -Wl,--start-group librec-test.a -Wl,--no-whole-archive -Wl,-z,relro -Wl,-z,now -fprofile-instr-generate -fcoverage-mapping -Wl,-rpath,/usr/local/lib -O1 -Werror=vla -Werror=shadow -Wformat=2 -Werror=format-security -fstack-clash-protection -fstack-protector-strong -Werror=string-plus-int -Wp,-D_GLIBCXX_ASSERTIONS '-Wl,-rpath,$ORIGIN/rec-rust-lib/rust' librec-common.a rec-rust-lib/rust/librecrust.a ext/arc4random/libarc4random.a ext/json11/libjson11.a librec-dnslabeltext.a ext/probds/libprobds.a /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.83.0 /usr/lib/x86_64-linux-gnu/libboost_context.so.1.83.0 -pthread /usr/lib/x86_64-linux-gnu/libcrypto.so -lresolv -L/usr/lib/x86_64-linux-gnu -lnetsnmpmibs -lnetsnmpagent -lnetsnmp /usr/lib/x86_64-linux-gnu/libsodium.so /usr/lib/x86_64-linux-gnu/libssl.so /usr/lib/x86_64-linux-gnu/libgnutls.so /usr/lib/x86_64-linux-gnu/libluajit-5.1.so /usr/lib/x86_64-linux-gnu/libfstrm.so /usr/lib/x86_64-linux-gnu/libcurl.so /usr/lib/x86_64-linux-gnu/libcap.so -lresolv -L/usr/lib/x86_64-linux-gnu -lnetsnmpmibs -lnetsnmpagent -lnetsnmp /usr/lib/x86_64-linux-gnu/libboost_filesystem.so.1.83.0 -Wl,--end-group
  /usr/bin/ld: librec-common.a.p/syncres.cc.o: in function `SyncRes::processRecords(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, DNSName const&, QType, DNSName const&, LWResult&, bool, std::vector<DNSRecord, std::allocator<DNSRecord> >&, std::set<DNSName, std::less<DNSName>, std::allocator<DNSName> >&, DNSName&, DNSName&, bool&, bool&, vState&, bool, bool, unsigned int, int&, bool&, unsigned int)':
  /__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/../../../../../../tmp/rec-meson-dist-build/meson-dist/pdns-recursor-0.0.0-git1/syncres.cc:(.text+0x81b86): undefined reference to `g_ECSScopeZeroOnNoRecord'
  Error: /usr/bin/ld: /__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/../../../../../../tmp/rec-meson-dist-build/meson-dist/pdns-recursor-0.0.0-git1/syncres.cc:(.text+0x81b98): undefined reference to `g_ECSScopeZeroOnNoRecord'
  Error: /usr/bin/ld: /__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/../../../../../../tmp/rec-meson-dist-build/meson-dist/pdns-recursor-0.0.0-git1/syncres.cc:5306:(.text+0x873e3): undefined reference to `g_ECSScopeZeroOnNoRecord'
  Error: /usr/bin/ld: /__w/pdns/pdns/pdns/recursordist/pdns-recursor-0.0.0-git1/../../../../../../tmp/rec-meson-dist-build/meson-dist/pdns-recursor-0.0.0-git1/syncres.cc:5306:(.text+0x89910): undefined reference to `g_ECSScopeZeroOnNoRecord'
  clang++-19: error: linker command failed with exit code 1 (use -v to see invocation)

I've updated the code; the undefined references to `g_ECSScopeZeroOnNoRecord` should be resolved now and the build should pass.

@jqyisbest

I was confused why there was no subnet information in the response, so I spent a day troubleshooting until I saw this issue from two years ago...

This might work in some very specific environments where you can ensure that clients always send a very narrow source (/32 or /128); otherwise you might be sending a badly scoped response.

How to use the information returned by the authoritative/upstream server is the user's business. In any case, silently hiding information returned by the upstream server is not something a recursor should do...

@jd82k
Author

jd82k commented Feb 24, 2026

> I was confused why there was no subnet information in the response, so I spent a day troubleshooting until I saw this issue from two years ago...
>
> This might work in some very specific environments where you can ensure that clients always send a very narrow source (/32 or /128); otherwise you might be sending a badly scoped response.
>
> How to use the information returned by the authoritative/upstream server is the user's business. In any case, silently hiding information returned by the upstream server is not something a recursor should do...

Thanks for raising this, and I agree with the main point: a recursor should not silently drop ECS information returned by upstream.

I have updated the logic so that when incoming ECS is present (and ECS return is enabled), we include ECS in the downstream response. The address/prefix is taken from the incoming client ECS, and the scope is taken from upstream/auth data. For multi-step answers (for example CNAME chains / multiple cache hits), the final scope is now aggregated as the narrowest one (largest prefix length), instead of being overwritten by the last record.

So we no longer hide upstream ECS scope by default, and we avoid bad scope selection in composed answers.
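The narrowest-scope aggregation described above can be sketched as follows. This is a hypothetical helper, not the PR's actual code: it combines the ECS scopes of every record set that contributed to a composed answer (for example the hops of a CNAME chain), where "narrowest" means the largest prefix length.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the PR's exact code): given the ECS scope of
// each record set that contributed to a composed answer, pick the final
// scope for the response. The narrowest scope is the largest prefix
// length, so a single /24-scoped hop prevents the whole answer from
// being advertised (and re-cached downstream) as /0.
uint8_t aggregateEcsScope(const std::vector<uint8_t>& hopScopes)
{
    uint8_t finalScope = 0;
    for (uint8_t scope : hopScopes) {
        finalScope = std::max(finalScope, scope);
    }
    return finalScope;
}
```

With this rule, a chain of scopes {0, 24, 16} yields a final scope of 24 instead of whatever the last record happened to carry.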

**Options**
1. `ecs-scope-zero-on-no-record` (`scope_zero_on_no_record`), default **true**
   - Meaning: when enabled, the recursor forces ECS scope to **0** for NODATA responses (including CNAME‑chain NODATA), regardless of the authoritative scope.
   - When disabled, the authoritative scope is preserved when available.
   - **Note:** when `scope_zero_on_no_record=false` and the authoritative scope is **0**, negative caching remains **global** (negcache is not ECS‑aware), so there is no ECS scoping effect in that case.

2. `return-incoming-edns-subnet` (`return_incoming_edns_subnet`), default **true**
   - Meaning: when enabled (and `use-incoming-edns-subnet` is enabled), the recursor echoes an ECS option back to the client.
   - Purpose: returning ECS in the response conveys the applicable scope, so downstream resolvers and caches can avoid reusing a subnet‑specific answer for unrelated clients. This prevents incorrect upstream/downstream cache reuse.
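For illustration, the two options might look like this in a `recursor.yml` fragment. This is a sketch only: since the PR is not merged, the exact section placement is an assumption based on the recursor's existing `ecs` settings group.

```yaml
# Hypothetical recursor.yml fragment (section placement is an assumption):
ecs:
  # Force ECS scope 0 for NODATA answers, including CNAME-chain NODATA
  # (default per this PR: true).
  scope_zero_on_no_record: true
  # Echo an ECS option back to the client; only takes effect when
  # use-incoming-edns-subnet is also enabled (default per this PR: true).
  return_incoming_edns_subnet: true
```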

**Record Cache (ECS scope preservation)**
1. **Record cache entries now store ECS scope**
   - Added `d_ecsScope` to `MemRecursorCache::CacheEntry`.
   - `replace()` writes scope into the cache entry (only when the entry is ECS‑specific).
   - `get()` and `handleHit()` return the stored scope via an output parameter.

2. **Cache hits now propagate ECS scope back to the response path**
   - `SyncRes::doCacheCheck()` and `SyncRes::doCNAMECacheCheck()` read ECS scope from record cache hits and set `d_answerECSScope`, so cached answers return the correct scope.

3. **Cache dump/load includes ECS scope**
   - `PBCacheEntry` now carries `optional_uint32_ecsScope` so ECS scope survives `getRecordSets()` / `putRecordSets()`.
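A condensed sketch of the cache-entry changes above (the names mirror the description, but this is not the actual `MemRecursorCache` code): the entry carries an optional ECS scope, and a `handleHit`-style accessor hands it back through an output parameter.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Simplified sketch of a cache entry that stores the authoritative ECS
// scope. The scope is only set for ECS-specific entries, hence optional.
struct CacheEntry {
    uint32_t d_ttd{0};
    std::optional<uint8_t> d_ecsScope; // set only for ECS-specific entries
};

// handleHit-style accessor: copy the stored scope into the caller's
// output parameter; returns whether the entry carried a scope at all.
bool handleHit(const CacheEntry& entry, std::optional<uint8_t>& scopeOut)
{
    scopeOut = entry.d_ecsScope;
    return scopeOut.has_value();
}
```

The response path can then feed the returned scope into `d_answerECSScope`, so cache hits advertise the same scope the original upstream answer did.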

**Negative Cache (NODATA + ECS)**
1. **Avoid global negcache pollution for ECS‑specific NODATA**
   - In `SyncRes::processRecords()`, when `scope_zero_on_no_record=false` and the authoritative NODATA response contains a **non‑zero** ECS scope, we skip inserting the NODATA entry into `negcache`.
   - This avoids a global negative cache entry that would incorrectly apply to all scopes.
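The negcache decision above reduces to a small predicate; the following is an illustrative condensation, not the PR's exact code in `SyncRes::processRecords()`.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative predicate (hypothetical): may a NODATA answer be inserted
// into the global negative cache? When scope_zero_on_no_record is off and
// the authoritative scope is non-zero, the NODATA only applies to that
// subnet, so a global negcache entry would be wrong and the insert is
// skipped.
bool mayInsertIntoNegcache(bool scopeZeroOnNoRecord, uint8_t authScope)
{
    if (!scopeZeroOnNoRecord && authScope != 0) {
        return false; // subnet-specific NODATA: keep it out of the global negcache
    }
    return true;
}
```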

**Response ECS Scope when `scope_zero_on_no_record=true`**
1. **Scope forced to 0 only for true NODATA**
   - In `pdns_recursor.cc`, ECS scope is forced to 0 only when the final response is NODATA (NoError with no relevant RRSet for the requested QTYPE), including CNAME‑chain NODATA.
   - Other responses keep the authoritative scope.
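The final-scope rule can be summarized as below. This is a hypothetical condensed form: the real logic lives in `pdns_recursor.cc` and determines NODATA by inspecting the RRSets of the final response.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical condensation of the response-scope decision: with
// scope_zero_on_no_record enabled, only a true NODATA answer (NoError
// with no relevant RRSet for the requested QTYPE, including CNAME-chain
// NODATA) gets scope 0; every other answer keeps the authoritative scope.
uint8_t finalResponseScope(bool scopeZeroOnNoRecord, bool isNoData, uint8_t authScope)
{
    if (scopeZeroOnNoRecord && isNoData) {
        return 0;
    }
    return authScope;
}
```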

Signed-off-by: Miaosen Wang <secretandanon@gmail.com>
@jd82k jd82k reopened this Feb 24, 2026
@jd82k
Author

jd82k commented Feb 25, 2026

@omoerbeek @rgacogne Please review this PR. Thanks!

@rgacogne
Member

It's on our list; spamming does not help.
