KeyError: 'quarantined_media' in on_POSITION causes all workers to crash repeatedly on 1.152.0

### Description

  After upgrading to 1.152.0, the `event_worker` crashes repeatedly with an unhandled `KeyError: 'quarantined_media'` in `on_POSITION`. This tears down the Redis
  replication connection every ~3 seconds, so the sync worker stops receiving events and bridges fail to deliver E2EE decryption keys.

### Steps to reproduce

### Root Cause

  `on_POSITION` in `synapse/replication/tcp/handler.py` line 635 does a direct dict lookup:

  ```python
  stream = self._streams[cmd.stream_name]
  ```

  If a worker receives a `POSITION` for a stream it doesn't own (e.g. `quarantined_media` on the event worker), this raises an unhandled `KeyError`, which tears down the
  Twisted connection.

  Note: adding `quarantined_media: ["media_worker"]` to `stream_writers` in `homeserver.yaml` does not help, because `WriterLocations.__init__()` rejects it as an unexpected keyword
  argument.
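
  The failure mode can be reproduced outside Synapse with a small stand-alone model. This is only a sketch, not Synapse code: `PositionCommand` and `ToyHandler` are hypothetical stand-ins, and the only part taken from the report is the direct dict lookup on the stream name.

```python
from dataclasses import dataclass


@dataclass
class PositionCommand:
    """Hypothetical stand-in for a replication POSITION command."""
    stream_name: str
    new_token: int


class ToyHandler:
    """Hypothetical handler that only knows the streams it was given."""

    def __init__(self, stream_names):
        # Maps stream name -> last seen token.
        self._streams = {name: 0 for name in stream_names}

    def on_POSITION(self, cmd: PositionCommand) -> None:
        # Direct lookup, as in the line quoted in the report: raises
        # KeyError when the stream name is not registered here.
        self._streams[cmd.stream_name]
        self._streams[cmd.stream_name] = cmd.new_token


handler = ToyHandler(["events", "receipts"])
handler.on_POSITION(PositionCommand("events", 5))  # known stream: fine

try:
    handler.on_POSITION(PositionCommand("quarantined_media", 1))
    raised = None
except KeyError as exc:
    # In the real worker nothing catches this, so it propagates upward.
    raised = exc.args[0]
```

  In this model `raised` ends up holding the unknown stream name, which matches the `KeyError: 'quarantined_media'` seen in the traceback.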

### Homeserver

homeserver

### Synapse Version

1.152.0

### Installation Method

Docker (matrixdotorg/synapse)

### Database

PostgreSQL 18

### Workers

Multiple workers

### Platform

OS         │ Ubuntu 24.04 (Oracle Cloud)
Kernel     │ 6.17.0
Arch       │ aarch64 (ARM64)
Docker     │ 27.4.1
Python     │ 3.13.13
Synapse    │ 1.152.0 (matrixdotorg/synapse:latest)
Database   │ PostgreSQL 18
Deployment │ Docker, worker setup with Redis replication


### Configuration

  homeserver.yaml (relevant excerpt):
  stream_writers:
    events: ["event_worker"]
    receipts: ["event_worker"]
    typing: ["event_worker"]
    presence: ["event_worker"]
    to_device: ["event_worker"]
    account_data: ["event_worker"]

  worker_event.yaml:
  worker_app: synapse.app.generic_worker
  worker_name: event_worker

  worker_listeners:
    - port: 8083
      bind_addresses: ['127.0.0.1']
      type: http
      resources:
        - names: [client, federation, replication]
          compress: true

  worker_media.yaml:
  worker_app: synapse.app.generic_worker
  worker_name: media_worker

  worker_listeners:
    - type: http
      port: 8085
      resources:
        - names: [media, replication]


### Relevant log output

```shell
CRITICAL - sentinel - Unhandled Error
  Traceback (most recent call last):
    File ".../twisted/internet/posixbase.py", line 491, in _doReadOrWrite
    File ".../twisted/internet/tcp.py", line 250, in doRead
    File ".../txredisapi.py", line 1858, in dataReceived
    File ".../synapse/replication/tcp/redis.py", line 178, in messageReceived
    File ".../synapse/replication/tcp/redis.py", line 219, in handle_command
    File ".../synapse/replication/tcp/handler.py", line 635, in on_POSITION
  builtins.KeyError: 'quarantined_media'
```

### Anything else that would be useful to know?

### Fix

  stream = self._streams.get(cmd.stream_name)
  if stream is None:
      logger.debug("Ignoring POSITION for unknown stream %s", cmd.stream_name)
      return


  The fix/workaround needs to be applied to all workers, not just the event worker.
  Every worker subscribes to Redis pub/sub and receives all POSITION broadcasts,
  including for streams it doesn't own.

  Affected workers in our setup: event_worker, sync_worker, media_worker,
  federation_worker, push_worker.
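
  To illustrate why every worker needs the guard, here is a toy broadcast model. It is a sketch under stated assumptions, not Synapse code: `GuardedHandler`, `broadcast`, the worker names `worker_a` to `worker_c`, and their stream assignments are all hypothetical; only the `.get()` guard mirrors the fix proposed in this issue.

```python
import logging

logger = logging.getLogger("toy_replication")


class GuardedHandler:
    """Hypothetical worker-side handler using the guarded lookup."""

    def __init__(self, name, stream_names):
        self.name = name
        # Maps stream name -> last seen token.
        self._streams = {s: 0 for s in stream_names}
        self.ignored = []  # stream names skipped by the guard

    def on_POSITION(self, stream_name, token):
        current = self._streams.get(stream_name)
        if current is None:
            # Unknown stream: log and return instead of raising.
            logger.debug("Ignoring POSITION for unknown stream %s", stream_name)
            self.ignored.append(stream_name)
            return
        self._streams[stream_name] = max(current, token)


def broadcast(workers, stream_name, token):
    """Deliver one POSITION to every worker, as a pub/sub channel would."""
    for worker in workers:
        worker.on_POSITION(stream_name, token)


# Illustrative stream assignments only; not taken from the report.
workers = [
    GuardedHandler("worker_a", ["events", "receipts"]),
    GuardedHandler("worker_b", ["events"]),
    GuardedHandler("worker_c", ["events", "quarantined_media"]),
]
broadcast(workers, "quarantined_media", 7)
broadcast(workers, "events", 3)
```

  In this model, every worker that lacks the stream takes the early return and keeps processing later commands, which is why patching a single worker would not be enough.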



KeyError: 'quarantined_media' in on_POSITION causes all workers to crash repeatedly on 1.152.0 #19750
