Fix policy server signature merging again #19797

Open
tulir wants to merge 4 commits into element-hq:develop from tulir:fix-policy-server-signatures-again

@tulir tulir commented May 21, 2026

Fixes #19796

Haven't tested this in practice yet, but it's pretty simple

Pull Request Checklist

@tulir tulir requested a review from a team as a code owner May 21, 2026 21:14
@MadLittleMods MadLittleMods added the A-Abuse Reports, media quarantine, policy servers, etc label May 21, 2026
Comment thread synapse/handlers/room_policy.py
# than simply `update` the signatures on the event.
# because the event will fail authorization. This is why we add items individually
# rather than simply `update` the signatures on the event.
#
Contributor

Is there some sort of test we could have for this kind of thing to prevent regressions?

Ideally, some sort of end-to-end Complement test. But I don't think we have any policy server tests in Complement yet. I'm guessing such a test would need an engineered dummy policy server, like the engineered homeservers we already have in Complement to federate with.

Contributor Author

Hmm, a unit test is probably possible, there are already other unit tests for this stuff

Contributor

Might be good enough.

I would love to see an actual end-to-end test, since it could check whether the event is visible on other federating homeservers after it goes through the policy server, instead of testing some arbitrary detail.

Contributor Author

Added a unit test

Comment on lines +112 to +114
# Remove any existing signatures to ensure we only return the new signature
# like the policy server spec says.
pdu_dict.pop("signatures", None)
Contributor

Where in the spec is this defined? I'm not seeing it at https://spec.matrix.org/v1.18/server-server-api/#post_matrixpolicyv1sign or more generally in https://spec.matrix.org/v1.18/server-server-api/#policy-servers

If a room has enabled a Policy Server, the Policy Server’s signature appears alongside the normal event signatures, [...]

Contributor Author

The 200 response schema says

The Policy Server has signed the event, indicating that it recommends the event for inclusion in the room. Only the Policy Server’s signature is returned. This signature is to be added to the event before sending or processing the event further.

Contributor

Ahh, this is mocking the federation endpoint response (confusing method name).

Perhaps this would also be clearer if we plucked out the single relevant signature to return instead.
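
A minimal sketch of what "plucking out the single relevant signature" could look like in the mock, assuming the signed PDU is a plain dict. The helper name `pluck_policy_server_signature`, the server names, and the key IDs are hypothetical and only serve the illustration; this is not the code in the PR.

```python
from typing import Any


def pluck_policy_server_signature(
    signed_pdu: dict[str, Any], policy_server_name: str
) -> dict[str, dict[str, str]]:
    """Return only the policy server's entry from a signed PDU's signatures.

    Hypothetical helper: mirrors the spec wording that only the Policy
    Server's signature is returned from the sign endpoint.
    """
    signatures = signed_pdu.get("signatures", {})
    # Copy the inner dict so callers cannot mutate the original PDU.
    return {policy_server_name: dict(signatures.get(policy_server_name, {}))}


# Illustrative data only; the names and signature values are made up.
signed_pdu = {
    "type": "m.room.message",
    "signatures": {
        "example.org": {"ed25519:origin": "origin-sig"},
        "policy.example.org": {"ed25519:policy_server": "policy-sig"},
    },
}
response = pluck_policy_server_signature(signed_pdu, "policy.example.org")
```

Compared with popping `signatures` before signing, this makes it explicit which single entry the mocked endpoint hands back.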

Comment on lines +335 to +341
self._sign_with_random_key("example.org", event)
self.mock_federation_transport_client.ask_policy_server_to_sign_event.side_effect = self.policy_server_signs_event
self.get_success(
    self.handler.ask_policy_server_to_sign_event(event, verify=True)
)
self.assertEqual(len(event.signatures), 2)
self.assertEqual(len(event.signatures[self.OTHER_SERVER_NAME]), 1)
Contributor

Probably needs some comments to describe what we're trying to do at each step and what it's trying to test

Same with the other test

Contributor Author

Added comments to both

},
},
)
self._sign_with_random_key("example.org", event)
Contributor

Does this test really care about this detail? Feels like something that should be split out to another test

Contributor Author

@tulir tulir May 22, 2026

Having signatures from 2 different servers (origin & policy server) is the standard case that the vast majority of events will hit, which is why I added it to the "ok" test. Having only a signature from a policy server never happens in practice. I could add another test anyway instead if that's better

Contributor

We should add a comment describing this case, e.g. "Sign the event as the origin homeserver (normal part of the process)", or something better/more precise.

self.get_success(
    self.handler.ask_policy_server_to_sign_event(event, verify=True)
)
self.assertEqual(len(event.signatures), 2)
Contributor

@MadLittleMods MadLittleMods May 22, 2026

Kinda part of the existing code but we should use a better assertion that actually compares the servers and number of keys and has a better error message of what's actually different.

Suggested change
- self.assertEqual(len(event.signatures), 2)
+ self.assertEqual(
+     {server: len(signatures) for server, signatures in event.signatures.items()},
+     TODO,
+     "Expected signatures for the origin homeserver (%s) and policy server (%s)"
+ )

Contributor Author

Just added another assert for the origin server signatures length

Contributor

The point of the assertion I'm suggesting is that it gives better info when it fails on top of asserting things more fully.
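
As a rough sketch of the idea, assuming the signatures are a plain `server -> {key_id: signature}` mapping: comparing a per-server count dict means a failure shows exactly which server has too many or too few keys. The helper `signature_counts`, the server names, and the expected values are hypothetical and are not taken from the PR's tests.

```python
def signature_counts(signatures: dict[str, dict[str, str]]) -> dict[str, int]:
    """Map each signing server to the number of keys it signed with."""
    return {server: len(keys) for server, keys in signatures.items()}


# Illustrative data only: one origin signature and one policy server signature.
signatures = {
    "example.org": {"ed25519:origin": "origin-sig"},
    "policy.example.org": {"ed25519:policy_server": "policy-sig"},
}

actual = signature_counts(signatures)
expected = {"example.org": 1, "policy.example.org": 1}
assert actual == expected, (
    "Expected one signature each from the origin homeserver and the "
    f"policy server, got {actual}"
)
```

With a bare `len(...) == 2` check, a failure only reports the mismatched number; here the failure message names the servers involved.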

self.assertEqual(len(event.signatures), 2)
self.assertEqual(len(event.signatures[self.OTHER_SERVER_NAME]), 1)

def test_ask_origin_server_to_sign_event_doesnt_replace_signatures(self) -> None:
Contributor

This could definitely use a test description explaining the point of this test, especially when compared to test_ask_policy_server_to_sign_event_ok.

Contributor Author

Are the comments on the asserts enough for this or should there be something else?

Contributor

The comment is decent, but I think it deserves a better overview of what we care about and why.

Comment thread tests/handlers/test_room_policy.py Outdated
},
},
)
self._sign_with_random_key(self.OTHER_SERVER_NAME, event)
Contributor

self.OTHER_SERVER_NAME is special here (and used on purpose), as it matches what the policy_server_signs_event(...) mock is doing, in order to stress what this PR is fixing. We should add a comment to call that out.

_sign_with_random_key(...) could use a better name as in this case, it's not that "random".

Comment on lines +264 to +266
signatures = signature.get(policy_server.server_name, {})
for key_id, sig in signatures.items():
event.signatures.add_signature(policy_server.server_name, key_id, sig)
Contributor

Doing it this way is also better because we're strictly pulling out the signature for the policy_server.server_name instead of blindly trusting the entire response which can overwrite and add stray signatures for other servers.

Although we're probably in a cooperative setting anyway (we can assume you trust the policy server you have set).

It would be good to validate this earlier when we get the response. Something that can be tackled in another PR.
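
To illustrate the difference discussed in this thread, here is a small sketch using plain dicts rather than Synapse's event signature type. It assumes the policy server shares a server name with the origin homeserver (the case in the linked issue); the function names, key IDs, and signature values are hypothetical.

```python
import copy

Signatures = dict[str, dict[str, str]]


def merge_with_update(existing: Signatures, response: Signatures) -> Signatures:
    """Naive merge: a top-level update replaces whole per-server entries."""
    merged = copy.deepcopy(existing)
    merged.update(response)
    return merged


def merge_per_key(
    existing: Signatures, response: Signatures, policy_server_name: str
) -> Signatures:
    """Add only the policy server's keys, one at a time, keeping existing keys."""
    merged = copy.deepcopy(existing)
    for key_id, sig in response.get(policy_server_name, {}).items():
        merged.setdefault(policy_server_name, {})[key_id] = sig
    return merged


# Illustrative data: origin and policy server both use "example.org".
existing = {"example.org": {"ed25519:origin": "origin-sig"}}
response = {
    "example.org": {"ed25519:policy_server": "policy-sig"},
    "stray.example": {"ed25519:x": "stray-sig"},  # unexpected extra entry
}

replaced = merge_with_update(existing, response)
merged = merge_per_key(existing, response, "example.org")
```

In this sketch the naive merge drops the origin signature and accepts the stray entry, while the per-key merge keeps the origin signature and ignores entries for other servers.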

Labels

A-Abuse Reports, media quarantine, policy servers, etc

Development

Successfully merging this pull request may close these issues.

Synapse doesn't merge signatures correctly when a policy server is running on the same domain

2 participants