Port Event class to Rust #19701

Open
erikjohnston wants to merge 24 commits into develop from erikj/events_rust

Conversation

@erikjohnston
Member

@erikjohnston erikjohnston commented Apr 16, 2026

Ports the event class to Rust.

The main differences here are:

  1. There is now a single event class
  2. We now validate much more at event construction time than before (previously we checked almost nothing). This required some changes to the tests, including "Fix tests to work with Rust event" (matrix-org/sytest#1423).

Reviewable commit-by-commit.

Overview of Event Rust structure

The format of the event struct in Rust is quite different from that in Python.

The top level looks like:

pub struct Event {
    /// The parsed event JSON.
    fields: FormattedEvent,

    /// The event ID. For format v1 this is read directly from the JSON;
    /// for v2+ it is computed from the canonical-JSON hash at
    /// construction time and cached here.
    event_id: Arc<str>,

    /// Synapse-internal per-event state that lives outside the federated
    /// JSON (e.g. outlier flag, soft-failure, stream positions).
    #[pyo3(get)]
    internal_metadata: EventInternalMetadata,

    /// The room version this event was parsed for.
    #[pyo3(get)]
    room_version: &'static RoomVersion,

    /// `None` for accepted events; otherwise a short reason set by auth
    /// when the event was rejected.
    rejected_reason: Option<Box<str>>,
}

This holds the actual parsed event in FormattedEvent, plus the rest of the event metadata.

pub struct FormattedEvent<E = Arc<EventFormatEnum>> {
    #[serde(default)]
    pub signatures: Signatures,

    #[serde(default)]
    pub unsigned: Unsigned,

    #[serde(flatten)]
    pub specific_fields: E,

    #[serde(flatten)]
    pub common_fields: Arc<EventCommonFields>,
}

The struct is further split into the common fields, the format-specific fields, plus signatures and unsigned. We split out the signatures and unsigned fields because they are mutable: when we clone the event we can share the common and specific fields and copy only signatures and unsigned.
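
As a rough illustration of that sharing, here is a minimal sketch using only the standard library. The types `SketchEvent`, `CommonFields` and `SpecificFields`, their fields, and the plain string maps are hypothetical stand-ins, not the PR's actual definitions; only the pattern (clone the mutable parts, `Arc::clone` the immutable ones) is taken from the description above.

```rust
use std::collections::BTreeMap;
use std::sync::Arc;

// Hypothetical, simplified stand-ins for the PR's types; the field sets
// and the plain string maps are assumptions made for this sketch.
struct CommonFields {
    sender: String,
}

struct SpecificFields {
    room_id: String,
}

struct SketchEvent {
    // Mutable parts: every copy gets its own map.
    signatures: BTreeMap<String, String>,
    unsigned: BTreeMap<String, String>,
    // Immutable parts: copies share them through the reference count.
    specific_fields: Arc<SpecificFields>,
    common_fields: Arc<CommonFields>,
}

impl SketchEvent {
    fn deep_copy(&self) -> SketchEvent {
        SketchEvent {
            signatures: self.signatures.clone(),
            unsigned: self.unsigned.clone(),
            specific_fields: Arc::clone(&self.specific_fields),
            common_fields: Arc::clone(&self.common_fields),
        }
    }
}

fn sample_event() -> SketchEvent {
    SketchEvent {
        signatures: BTreeMap::new(),
        unsigned: BTreeMap::new(),
        specific_fields: Arc::new(SpecificFields {
            room_id: "!room:example.org".to_string(),
        }),
        common_fields: Arc::new(CommonFields {
            sender: "@alice:example.org".to_string(),
        }),
    }
}

fn main() {
    let original = sample_event();
    let mut copy = original.deep_copy();

    // Mutating the copy's `unsigned` leaves the original untouched.
    copy.unsigned.insert("age".to_string(), "1234".to_string());
    assert!(original.unsigned.is_empty());

    // The immutable parts are the same allocation, not a second copy.
    assert!(Arc::ptr_eq(&original.common_fields, &copy.common_fields));
    assert!(Arc::ptr_eq(&original.specific_fields, &copy.specific_fields));
    println!("{} in {}", copy.common_fields.sender, copy.specific_fields.room_id);
}
```

The cost of a copy in this sketch is two map clones plus two reference-count increments, regardless of how large the shared parts are.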

The specific_fields are the fields that depend on the format version. They can either be a specific format (e.g. E = EventFormatV1) or a type-erased enum, EventFormatEnum, that covers all room versions:

pub enum EventFormatEnum {
    V1(EventFormatV1),
    V2V3(EventFormatV2V3),
    V4(EventFormatV4),
    VMSC4242(EventFormatVMSC4242),
}

For example:

/// Shared flat-list encoding of `auth_events` and `prev_events`, reused
/// by every format from v2/v3 onwards.
#[derive(Serialize, Deserialize)]
pub struct SimpleAuthPrevEvents {
    pub auth_events: Vec<String>,
    pub prev_events: Vec<String>,
}

/// Version-specific fields for room versions 3-10.
#[derive(Serialize, Deserialize)]
pub struct EventFormatV2V3 {
    pub room_id: Box<str>,
    #[serde(flatten)]
    pub auth_prev_events: SimpleAuthPrevEvents,
}
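
To show how an enum like this can hide per-format differences behind one accessor, here is a small sketch. `FormatV1`, `FormatV2V3`, `FormatEnum` and the accessor names are hypothetical, trimmed-down stand-ins; in particular the V1 layout (pairs of event ID and hash) is an assumption for illustration and the real structs carry more fields and serde derives.

```rust
// Hypothetical, trimmed-down stand-ins. The V1 layout (ID plus hash
// pairs) and the accessor names are assumptions for this sketch.
struct FormatV1 {
    auth_events: Vec<(String, String)>, // (event ID, reference hash)
    prev_events: Vec<(String, String)>,
}

struct FormatV2V3 {
    // Flat lists of event IDs, as in `SimpleAuthPrevEvents`.
    auth_events: Vec<String>,
    prev_events: Vec<String>,
}

enum FormatEnum {
    V1(FormatV1),
    V2V3(FormatV2V3),
}

impl FormatEnum {
    /// Auth event IDs, whatever the underlying encoding is.
    fn auth_event_ids(&self) -> Vec<&str> {
        match self {
            FormatEnum::V1(f) => f.auth_events.iter().map(|(id, _)| id.as_str()).collect(),
            FormatEnum::V2V3(f) => f.auth_events.iter().map(String::as_str).collect(),
        }
    }

    /// Prev event IDs, whatever the underlying encoding is.
    fn prev_event_ids(&self) -> Vec<&str> {
        match self {
            FormatEnum::V1(f) => f.prev_events.iter().map(|(id, _)| id.as_str()).collect(),
            FormatEnum::V2V3(f) => f.prev_events.iter().map(String::as_str).collect(),
        }
    }
}

fn main() {
    let v1 = FormatEnum::V1(FormatV1 {
        auth_events: vec![("$a:example.org".to_string(), "hash-a".to_string())],
        prev_events: vec![("$p:example.org".to_string(), "hash-p".to_string())],
    });
    let v3 = FormatEnum::V2V3(FormatV2V3 {
        auth_events: vec!["$a".to_string()],
        prev_events: vec!["$p".to_string()],
    });

    // Callers see the same shape for both formats.
    assert_eq!(v1.auth_event_ids(), vec!["$a:example.org"]);
    assert_eq!(v3.auth_event_ids(), vec!["$a"]);
    assert_eq!(v1.prev_event_ids().len(), v3.prev_event_ids().len());
}
```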

Dev notes

As discussed in #element-backend-internal:matrix.org

@erikjohnston erikjohnston changed the base branch from develop to erikj/port_event_content May 8, 2026 12:01
Base automatically changed from erikj/port_event_content to develop May 8, 2026 13:19
Comment thread rust/src/events/mod.rs Outdated
@erikjohnston erikjohnston changed the base branch from develop to erikj/room_versions_rust May 8, 2026 14:39
@erikjohnston erikjohnston force-pushed the erikj/room_versions_rust branch from 48b0729 to f1705f2 Compare May 8, 2026 14:40
@erikjohnston erikjohnston force-pushed the erikj/events_rust branch 3 times, most recently from 68b0ec2 to 7c5c0b7 Compare May 12, 2026 12:08
Base automatically changed from erikj/room_versions_rust to develop May 14, 2026 10:21
@erikjohnston erikjohnston force-pushed the erikj/events_rust branch 2 times, most recently from db0b3f3 to 2b8272e Compare May 15, 2026 13:15
Small prerequisites for porting the Python EventBase hierarchy to Rust:

- duration: make `from_milliseconds` const and add an `IntoPyObject` impl
  for owned `SynapseDuration`, so the new Rust `Event.sticky_duration()`
  can return one directly to Python.
- internal_metadata: rename `copy()` to `deep_copy()` (matching the new
  naming used by the rest of the events module) and make `new()` callable
  from sibling modules.
- json_object: expose `object` as a `pub` field and add a `get_field`
  helper so the new Event class can read from it without going through
  Python.
- signatures, unsigned: add `deep_copy()` methods so the new Event class
  can implement its own deep-copy.

Adds a single `Event` Rust pyclass that replaces the Python EventBase /
FrozenEventV{1,2,3,4,VMSC4242} hierarchy. The class is added but not yet
wired into Python — callers continue to use the existing Python classes
in this commit; the migration follows in the next commit.

The internals use a `FormattedEvent` over
`EventFormatV{1,2V3,4,VMSC4242}` structs sharing an `EventCommonFields`.
Format-specific behaviour (prev_event_ids, auth_event_ids, room_id
derivation for v12 create events, etc) is encapsulated per variant.
Event IDs are computed in the constructor for v3+ formats; v1/v2 use the
`event_id` field as-is.

Two supporting Rust modules are added at the same time:

- `events::constants` — string constants for event types, top-level
  fields, and per-event-type content fields, used to keep the redaction
  rules and field accessors readable.
- `events::utils` — `redact()`, `compute_event_reference_hash()`, and
  `calculate_event_id()`, ported from `synapse.crypto.event_signing` /
  `synapse.events.utils`.

Replace the abstract `synapse.events.EventBase` and the concrete
`FrozenEvent`, `FrozenEventV2`, `FrozenEventV3`, `FrozenEventV4`, and
`FrozenEventVMSC4242` Python classes with a single Rust-backed
`Event`, exposed via `synapse.synapse_rust.events.Event`. `EventBase`
becomes a `TypeAlias` for `Event` so that the existing type annotations
across the codebase keep working.

Behavioural notes:

- `make_event_from_dict()` now constructs the Rust class. Event IDs for
  v3+ formats are computed in the constructor (instead of lazily on
  first access).
- `clone_event()` is now a single `event.deep_copy()` call. The old
  shallow copy of `unsigned` was effectively a deep copy in practice;
  `deep_copy()` matches that.
- The third-party event-rules callback no longer needs to call
  `event.freeze()` — Events are immutable from Python by construction.
- A small `assert_never` is added in `events_worker.py` to make the
  `redact_behaviour` switch exhaustive now that the type checker can
  see all branches.

All test fixtures that constructed `FrozenEventV3` etc. directly are
updated to construct `Event` instead.

Adapt tests that mutated Python event internals (`_event_id`, `_dict`,
direct attribute assignment, `FrozenEventV3(...)` construction) to work
with the new Rust-backed `Event` class:

- Rebuild events via `make_event_from_dict` / `make_test_event` instead
  of patching attributes in place.
- Plumb `rejected_reason` through `_join_rules_event` rather than
  assigning to `rejected_reason` after construction.
- Replace the hand-built event in `test_msc4242_state_dag` with a
  `Mock(spec=EventBase)` since the test only needs a handful of
  attributes.
- Add `# type: ignore` for the deprecated `event.user_id` / `event[key]`
  accessors and for assigning to `event.content`.
- In `make_test_event`, drop the default `room_id` for v11+ create
  events so each gets a distinct hash-derived room ID.

We have to take a slightly different approach here as we can't subclass
the native Event type.

Now that we do a bit more validation of events, it's possible that an
event persisted in the database may now not pass validation. This
shouldn't happen, but let's handle it correctly by logging and returning
that we couldn't find the event.

This is the same as what we do if we can't parse the JSON.
@erikjohnston erikjohnston marked this pull request as ready for review May 18, 2026 08:56
@erikjohnston erikjohnston requested a review from a team as a code owner May 18, 2026 08:56
Comment thread rust/src/events/mod.rs
/// The event ID. For format v1 this is read directly from the JSON;
/// for v2+ it is computed from the canonical-JSON hash at
/// construction time and cached here.
event_id: Arc<str>,
Contributor

What benefit are we getting from this being an Arc<str> vs an owned String?

Also a general question about the Arc/Box usage throughout these new types.

Member Author

Ah yeah, so Box<str> vs String is mostly about space: since String has to store a capacity, it takes up more memory. It also has the mild advantage that it clearly indicates the value is immutable.

In this particular case, we also use Arc<str> so that we can point to the same event_id which is defined in EventFormatV1. (Will add a comment to that effect).
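
For readers who want to check the space argument, here is a small sketch. The helper names `handle_sizes` and `shared_event_id` and the sample event ID are hypothetical. The three-word size of `String` reflects the current standard library layout (pointer, capacity, length) and is not a language guarantee.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Sizes in bytes of the three string handles: (String, Box<str>, Arc<str>).
fn handle_sizes() -> (usize, usize, usize) {
    (
        size_of::<String>(),
        size_of::<Box<str>>(),
        size_of::<Arc<str>>(),
    )
}

/// Returns two handles that point at one shared string allocation,
/// mimicking an ID stored in the format struct and cached on the event.
fn shared_event_id(id: &str) -> (Arc<str>, Arc<str>) {
    let in_format: Arc<str> = Arc::from(id);
    let cached = Arc::clone(&in_format);
    (in_format, cached)
}

fn main() {
    let word = size_of::<usize>();
    let (string, boxed, arc) = handle_sizes();

    // String carries pointer + capacity + length; the other two are
    // fat pointers carrying only pointer + length.
    assert_eq!(string, 3 * word);
    assert_eq!(boxed, 2 * word);
    assert_eq!(arc, 2 * word);

    // Cloning an Arc<str> copies the handle, not the string bytes.
    let (in_format, cached) = shared_event_id("$abc123:example.org");
    assert!(Arc::ptr_eq(&in_format, &cached));
    assert_eq!(Arc::strong_count(&cached), 2);
}
```

Neither `Box<str>` nor `Arc<str>` exposes a way to grow the string in place, which is the "indicates it's immutable" point made above.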

Comment thread rust/src/events/formats/mod.rs Outdated
Comment thread rust/src/events/formats/mod.rs
Comment thread rust/src/events/formats/mod.rs Outdated
Comment on lines +16 to +17
//! On-the-wire representations of Matrix events, parameterised by event format
//! version.
Contributor

We could explain this a little more plainly. Like "event JSON" and "parsed".

Reading far enough, I see a "Serialization and deserialization" section, but I think we could explain a bit here.

Comment on lines +54 to +55
//! don't need to be parsed up front. Generally, optional fields should be
//! handled via `other_fields`, as this saves space when they are not present.
Contributor

Generally, optional fields should be handled via other_fields, as this saves space when they are not present.

These details should be described next to other_fields.

Comment on lines -53 to -54
with self.assertRaises(NotImplementedError):
isinstance(object(), EventProtocol)
Contributor

We could still test this raises NotImplementedError right?

ev: EventBase, state: StateMap[EventBase]
) -> tuple[bool, JsonDict | None]:
ev.content = {"x": "y"}
ev.content = {"x": "y"} # type: ignore[misc]
Contributor

Explain type ignore (comment)

room_version=RoomVersions.MSC4242v12,
)
ev._event_id = id # type: ignore[attr-defined]
assert supports_msc4242_state_dag(ev)
Contributor

Why no longer using real events?

def __iter__(self) -> Iterator[str]: ...
def __eq__(self, other: object) -> bool: ...

class Event:
Contributor

Need to manually transfer docstrings

Comment thread rust/src/events/mod.rs
}

#[getter(state_key)]
fn state_key_attr(&self) -> PyResult<&str> {
Contributor

Why state_key_attr naming?

/// flat object matching the Matrix spec.
#[derive(Serialize, Deserialize)]
pub struct FormattedEvent<E = Arc<EventFormatEnum>> {
/// The event's signatures. This is a mutable field.
Contributor

Suggested change
/// The event's signatures. This is a mutable field.
/// The event's signatures. Kept separate from common/specific fields as this is a mutable field.

(also applies for unsigned)

Previous conversation

Comment on lines +94 to +96
/// The `signatures` and `unsigned` fields are kept as dedicated typed
/// wrappers because they round-trip between Rust and Python repeatedly
/// and benefit from caching. `common_fields` and `specific_fields` are
Contributor

I'm not following exactly. I would think common_fields/specific_fields are the ones that are shared/cached because they don't change.

How are signatures/unsigned being cached?

This also goes against what's written above AFAICT:

//! The `signatures` and `unsigned` fields are kept distinct from the
//! common/specific as they allow mutation. When copying an event they need to
//! be deep-copied, but the common/specific fields (which are immutable) can be
//! shared.

Comment on lines +123 to +124
specific_fields: Arc::clone(&self.specific_fields),
common_fields: Arc::clone(&self.common_fields),
Contributor

👌 Perfect. I feel like we should drop a note about the writable aspect protecting us here.

Comment thread synapse/events/utils.py

def clone_event(event: EventBase) -> EventBase:
"""Take a copy of the event.
"""Take a copy of the event."""
Contributor

With the newly learned details from #19701 (comment), we're luckily protected out of the box 👍

Perhaps more along these lines of a hint:

Suggested change
"""Take a copy of the event."""
"""
Take a copy of the event.
Only `unsigned`/`signatures` are mutable. Mutating other properties will result in `AttributeError` (not writable).
"""

Comment thread rust/src/events/formats/vmsc4242.rs Outdated
if event.fields.common_fields.type_state_key_tuple() != Some((M_ROOM_CREATE, ""))
&& auth_event_ids.is_empty()
{
return Err(PyRuntimeError::new_err(
Contributor

The "assert" language signals an invariant check. It better describes "this should never happen if code is correct" vs. "something went wrong during execution"

Comment thread rust/src/events/formats/vmsc4242.rs Outdated
Comment on lines +74 to +76
return Err(PyRuntimeError::new_err(
"auth_event_ids is unexpectedly empty for a non-create event",
));
Contributor

More:

Suggested change
return Err(PyRuntimeError::new_err(
"auth_event_ids is unexpectedly empty for a non-create event",
));
"auth_event_ids has not been calculated for event_id={}. All events (aside from the `m.room.create` event should have some `auth_event_ids` set.",

Even more:

Suggested change
return Err(PyRuntimeError::new_err(
"auth_event_ids is unexpectedly empty for a non-create event",
));
"Expected auth_event_ids for event_id={}. All events (aside from the `m.room.create` event) should have some `auth_event_ids` set. Either the `auth_event_ids` have not been calculated yet or this is a Synapse programming error.",

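For illustration, the last suggested wording could be wired into the check roughly as follows. This is only a sketch: the function `check_auth_event_ids` and its parameters are hypothetical, and a plain `String` error stands in for `PyRuntimeError` so the example needs nothing beyond the standard library.

```rust
/// Invariant check: every event except `m.room.create` is expected to
/// have at least one auth event ID by the time this is called.
fn check_auth_event_ids(
    event_id: &str,
    is_create_event: bool,
    auth_event_ids: &[String],
) -> Result<(), String> {
    if !is_create_event && auth_event_ids.is_empty() {
        // Name the event and say what was expected, so the log line is
        // actionable without a debugger.
        return Err(format!(
            "Expected auth_event_ids for event_id={}. All events (aside from the \
             `m.room.create` event) should have some `auth_event_ids` set. Either the \
             `auth_event_ids` have not been calculated yet or this is a Synapse \
             programming error.",
            event_id
        ));
    }
    Ok(())
}

fn main() {
    // A create event with no auth events is fine.
    assert!(check_auth_event_ids("$create", true, &[]).is_ok());

    // Any other event with an empty list trips the invariant.
    let err = check_auth_event_ids("$msg", false, &[]).unwrap_err();
    assert!(err.contains("event_id=$msg"));

    // A non-create event with auth events passes.
    let auth = vec!["$create".to_string()];
    assert!(check_auth_event_ids("$msg", false, &auth).is_ok());
}
```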