# False `same panId or extendedPanId already exists nearby` startup failure after recovery/firmware events in multi-coordinator Home Assistant deployments

## Summary

We are seeing repeatable Zigbee2MQTT startup failures with:

```text
network commissioning timed out - most likely network with the same panId or extendedPanId already exists nearby
(Error: AREQ - ZDO - stateChangeInd after 60000ms)
```

This has occurred in two independent Home Assistant households, each running multiple Zigbee2MQTT instances/coordinators. The symptom does **not** look like a second coordinator intentionally running with the same PAN/ExtPAN. It looks like zigbee-herdsman/Z-Stack enters a commissioning path in which the coordinator treats either its **own existing mesh routers** or stale adapter/NVRAM/backup state as a foreign PAN/ExtPAN collision.

The practical impact is severe: a previously working mesh refuses to start, and the error message steers users toward changing the PAN/network identity, which can force mass re-pairing. In our recoveries, preserving `pan_id`, `ext_pan_id`, and `network_key` and temporarily changing only the channel (a "channel wiggle") restored the original mesh.

## Why this may be structural

Multiple Zigbee2MQTT instances/coordinators are common in larger Home Assistant deployments. The current startup/restore diagnostics appear fragile when:

- multiple Zigbee2MQTT add-ons/coordinators exist in the same HA environment,
- adapters are SLZB/TCP based,
- firmware/core/radio was updated or rolled back,
- `coordinator_backup.json`, adapter NVRAM, and `configuration.yaml` temporarily disagree,
- routers from the previous/own mesh are still powered and beaconing.

In this state, the failure is reported as if an external same-PAN network exists. But the successful recoveries kept the same PAN/ExtPAN/key and restored the existing devices, suggesting the network identity itself was valid.

## Live sanitized evidence from one environment

Read-only inspection after recovery shows that config and backup currently agree on the critical identity:


### `/config/zigbee2mqtttt` (active/recovered instance)
- base topic: `zigbee2mqtttt`
- serial adapter: `zstack`, port: `<redacted private TCP SLZB address>`
- channel: `15`
- PAN config↔backup match: `True`
- ExtPAN config↔backup match: `True`
- channel config↔backup match: `True`
- network key fingerprint config↔backup match: `True`
- historical relevant log hits: `7`
- example log: `[2026-05-30 12:51:55] error: 	z2m: Error: network commissioning timed out - most likely network with the same panId or extendedPanId already exists nearby (Error: AREQ - ZDO - stateChangeInd after 60000ms)`

### `/config/zigbee2mqttttt` (active/recovered instance)
- base topic: `zigbee2mqttttt`
- serial adapter: `zstack`, port: `<redacted private TCP SLZB address>`
- channel: `20`
- PAN config↔backup match: `True`
- ExtPAN config↔backup match: `True`
- channel config↔backup match: `True`
- network key fingerprint config↔backup match: `True`
- historical relevant log hits: `5`
- example log: `[2026-05-30 11:48:36] error: 	z2m: Error: network commissioning timed out - most likely network with the same panId or extendedPanId already exists nearby (Error: AREQ - ZDO - stateChangeInd after 60000ms)`

### `/config/zigbeeEG2` (stale/manual instance, included because it exists in the same multi-instance environment)
- base topic: `ZigbeeEG2`
- serial adapter: `zstack`, port: `<redacted private TCP SLZB address>`
- channel: `11`
- PAN config↔backup match: `True`
- ExtPAN config↔backup match: `True`
- channel config↔backup match: `True`
- network key fingerprint config↔backup match: `True`
- historical relevant log hits: `0`
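
The match flags above can be produced by a small read-only comparison. The following is a minimal sketch, not the tool we used verbatim: the `NetworkIdentity` type and both function names are hypothetical, and parsing `configuration.yaml` and `coordinator_backup.json` into that type (including any byte-order handling for the ExtPAN) is left to the caller. The network key is compared by fingerprint so it never has to be printed.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkIdentity:
    """Normalized network identity (hypothetical helper type, parsing not shown)."""
    pan_id: int          # 16-bit PAN ID
    ext_pan_id: bytes    # 8-byte extended PAN ID
    channel: int         # 2.4 GHz channel, 11..26
    network_key: bytes   # 16-byte network key; never log this directly


def key_fingerprint(key: bytes) -> str:
    """Return a short SHA-256 prefix so keys can be compared without exposing them."""
    return hashlib.sha256(key).hexdigest()[:12]


def compare_identity(config: NetworkIdentity, backup: NetworkIdentity) -> dict[str, bool]:
    """Report, field by field, whether config and backup agree."""
    return {
        "pan_id": config.pan_id == backup.pan_id,
        "ext_pan_id": config.ext_pan_id == backup.ext_pan_id,
        "channel": config.channel == backup.channel,
        "network_key_fingerprint":
            key_fingerprint(config.network_key) == key_fingerprint(backup.network_key),
    }


# Placeholder values for illustration only; they are not from a real network.
config = NetworkIdentity(0x1A62, bytes.fromhex("dddddddddddddddd"), 15, bytes(range(16)))
backup = NetworkIdentity(0x1A62, bytes.fromhex("dddddddddddddddd"), 20, bytes(range(16)))

for field, matches in compare_identity(config, backup).items():
    print(f"{field} config<->backup match: {matches}")
```

In this placeholder example only the channel differs, which is the state one would expect midway through a channel-only wiggle.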

## Observed recovery patterns

### Case A: Household 1, one Z2M instance
- Original channel: 20
- Error: same `network commissioning timed out... same panId or extendedPanId...`
- Recovery that worked:
  1. stop Z2M cleanly
  2. back up `configuration.yaml`, `coordinator_backup.json`, `database.db`
  3. temporarily change only channel `20 -> 15`
  4. start Z2M and let it generate/repair coordinator backup
  5. stop Z2M
  6. set config and backup back to channel `20`
  7. verify `pan_id`, `ext_pan_id`, `network_key` unchanged
  8. restore pre-wiggle `database.db`
  9. start Z2M
- Result: expected mesh/device inventory returned; no mass re-pairing.
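
Step 2 is the one that makes the rest of the procedure reversible, so it is worth scripting. The sketch below is one possible way to do it, assuming all three files live in the instance's data directory and that Z2M is already stopped. The function name `snapshot_instance` and the snapshot folder naming are our own choices, not part of Zigbee2MQTT.

```python
import hashlib
import shutil
import time
from pathlib import Path

# The three files named in step 2 of the recovery.
FILES = ("configuration.yaml", "coordinator_backup.json", "database.db")


def snapshot_instance(data_dir: Path, dest_root: Path) -> dict[str, str]:
    """Copy the state files to a timestamped folder and return their SHA-256 digests."""
    missing = [name for name in FILES if not (data_dir / name).is_file()]
    if missing:
        # Refuse a partial snapshot: step 8 depends on the pre-wiggle database.db.
        raise FileNotFoundError(f"missing in {data_dir}: {', '.join(missing)}")

    dest = dest_root / time.strftime("snapshot-%Y%m%d-%H%M%S")
    dest.mkdir(parents=True, exist_ok=False)

    digests = {}
    for name in FILES:
        shutil.copy2(data_dir / name, dest / name)  # copy2 also preserves timestamps
        digests[name] = hashlib.sha256((dest / name).read_bytes()).hexdigest()
    return digests
```

Calling `snapshot_instance(Path("/config/zigbee2mqtt"), Path("/backup/z2m"))` (both paths are examples) before step 3 gives digests that can be checked again in step 8 to confirm the restored `database.db` is the pre-wiggle copy.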

### Case B: Household 1, second Z2M instance
- Original channel: 15
- Error: same `network commissioning timed out... same panId or extendedPanId...`
- First attempt, a temporary channel change `15 -> 20`, still failed.
- Second attempt, `15 -> 25 -> 15`, succeeded.
- PAN/ExtPAN/network key were preserved; `database.db` restored after clean stop.
- Result: existing mesh returned.

### Case C: Household 2, separate HA environment
- Original channel: 11
- Error: same `network commissioning timed out... same panId or extendedPanId...`
- Recovery: channel wiggle `11 -> 15 -> 11`, preserve PAN/ExtPAN/key, restore DB.
- Result: existing mesh returned.
- Sanitized live logs for this second household can be provided separately.

## Related upstream symptoms

- Koenkk/zigbee2mqtt#31519 — same error after HAOS backup restore / moving same adapter to VM; user asks why the same stick/same network is no longer considered part of the network.
- Koenkk/zigbee2mqtt#23730 — same error after coordinator firmware upgrade; changing channel/back restored function for some users.
- Koenkk/zigbee2mqtt#31329 — SLZB-06 firmware update/rollback reports with PAN/commissioning and connection failures.
- An SMLIGHT support article recommends temporarily changing `pan_id` in both the config and `coordinator_backup.json`, then changing it back. For large meshes this is riskier than a channel-only wiggle when PAN/ExtPAN/key are valid.

## Code path / diagnostic concern

In `zigbee-herdsman` `src/adapter/z-stack/adapter/manager.ts`, the failure appears during `beginCommissioning()`:

- `bdbStartCommissioning` is called with mode `0x04`
- code waits for `ZDO stateChangeInd` for 60s
- on timeout it throws the same-PAN/ExtPAN message

There is also a later, explicit `panId collision detected` check. This suggests the timeout message is a heuristic guess rather than a detected collision, and that it may hide other root causes.
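
To illustrate the concern, here is a toy model in Python. It is not zigbee-herdsman code, and every name in it is hypothetical. It only mirrors the shape described above: wait for a state-change signal, and map a timeout to one fixed message. The three simulated causes are examples chosen for illustration.

```python
import asyncio


class CommissioningTimeout(Exception):
    """Raised when the awaited state change does not arrive in time."""


async def begin_commissioning(state_changed: asyncio.Event, timeout_s: float) -> None:
    """Toy model: wait for a state-change signal; any timeout gets one fixed message."""
    try:
        await asyncio.wait_for(state_changed.wait(), timeout=timeout_s)
    except asyncio.TimeoutError as exc:
        raise CommissioningTimeout(
            "network commissioning timed out - most likely network with the same "
            "panId or extendedPanId already exists nearby"
        ) from exc


async def simulate(cause: str) -> str:
    # Whatever the (hypothetical) cause, the signal simply never arrives,
    # so the waiting code has nothing to tell the causes apart.
    event = asyncio.Event()
    try:
        await begin_commissioning(event, timeout_s=0.01)
    except CommissioningTimeout as err:
        return str(err)
    return "started"


causes = ("foreign PAN collision", "stale adapter NVRAM", "TCP link stalled")
messages = {cause: asyncio.run(simulate(cause)) for cause in causes}

# All three causes produce the same text.
print(len(set(messages.values())))
```

The point of the model is only that a timeout carries no information about why the state change never arrived; any attribution in the message is a guess.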

## Expected behavior

When `configuration.yaml` + `coordinator_backup.json` contain the same PAN/ExtPAN/network key as the existing mesh, and the same coordinator is being restarted/restored:

1. Zigbee2MQTT should reliably restore/start the existing network without falling into a fragile commissioning collision path.
2. If it cannot, the error should distinguish:
   - true external duplicate PAN/ExtPAN detected,
   - adapter NVRAM/config/backup mismatch,
   - own mesh routers beaconing while coordinator is re-forming,
   - firmware/baudrate/TCP adapter issue,
   - backup parse/restore failure.
3. The suggested recovery should avoid PAN/ExtPAN/network key changes unless the user explicitly accepts re-pairing.
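
As a sketch of what point 2 could look like, the categories above can be modeled as distinct failure reasons with a simple decision order. This is a hypothetical illustration in Python, not a proposal for the actual implementation. The four inputs are assumptions about signals the stack might be able to provide, and whether it can provide the beacon-related one is exactly what we ask the maintainers below.

```python
from enum import Enum
from typing import Optional


class StartupFailure(Enum):
    FOREIGN_DUPLICATE = "true external duplicate PAN/ExtPAN detected"
    STATE_MISMATCH = "adapter NVRAM/config/backup mismatch"
    OWN_ROUTERS = "own mesh routers beaconing while coordinator is re-forming"
    ADAPTER_LINK = "firmware/baudrate/TCP adapter issue"
    BACKUP_RESTORE = "backup parse/restore failure"
    UNDETERMINED = "commissioning timed out, cause undetermined"


def classify(
    backup_parsed: bool,
    adapter_responsive: bool,
    sources_agree: bool,
    matching_beacon_from_known_device: Optional[bool],
) -> StartupFailure:
    """Pick the most specific reason; check cheap local causes before radio ones.

    matching_beacon_from_known_device (hypothetical signal):
      None  -> no beacon with our PAN/ExtPAN was observed
      True  -> such a beacon came from a device already in our database
      False -> such a beacon came from a device we do not know
    """
    if not backup_parsed:
        return StartupFailure.BACKUP_RESTORE
    if not adapter_responsive:
        return StartupFailure.ADAPTER_LINK
    if not sources_agree:
        return StartupFailure.STATE_MISMATCH
    if matching_beacon_from_known_device is True:
        return StartupFailure.OWN_ROUTERS
    if matching_beacon_from_known_device is False:
        return StartupFailure.FOREIGN_DUPLICATE
    return StartupFailure.UNDETERMINED


print(classify(True, True, False, None).value)
```

Falling back to an explicit "undetermined" reason, instead of asserting a collision, would already keep the message from steering users toward a PAN change they may not need.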

## Questions for maintainers

1. Under what exact conditions does startup strategy choose `startCommissioning` instead of `startup` or `restoreBackup` when a `coordinator_backup.json` exists?
2. Can the code detect and report when config/backup/adapter NVRAM mismatch is the real cause before calling BDB commissioning?
3. During commissioning, can the coordinator distinguish "my own mesh routers are beaconing with the expected ExtPAN" from a true foreign PAN collision?
4. Would maintainers accept diagnostic improvements that print:
   - startup strategy,
   - config vs backup vs adapter channel/PAN/ExtPAN comparison,
   - whether `coordinator_backup.json` was accepted/rejected,
   - exact reason for entering `startCommissioning`?
5. Is a channel-only wiggle that preserves PAN/ExtPAN/key acceptable as a documented recovery path, in preference to PAN-ID wiggle workarounds?
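
To make question 4 concrete, the output we have in mind could look roughly like what the sketch below produces. It is written in Python purely as a mock-up of the log lines. The function name, the line format, and the example reason text are all hypothetical; only the strategy names are taken from question 1.

```python
def format_startup_diagnostics(
    strategy: str,
    reason: str,
    backup_accepted: bool,
    sources: dict[str, dict[str, str]],
) -> list[str]:
    """Render the four requested diagnostics as plain log lines."""
    lines = [
        f"startup strategy: {strategy} (reason: {reason})",
        f"coordinator_backup.json accepted: {backup_accepted}",
    ]
    for field in ("channel", "pan_id", "ext_pan_id"):
        # Collect this field from every source; "?" marks a value that is unavailable.
        values = {name: src.get(field, "?") for name, src in sources.items()}
        verdict = "match" if len(set(values.values())) == 1 else "MISMATCH"
        rendered = " ".join(f"{name}={value}" for name, value in values.items())
        lines.append(f"{field}: {rendered} -> {verdict}")
    return lines


# Placeholder values for illustration only.
example = format_startup_diagnostics(
    strategy="startCommissioning",
    reason="adapter channel differs from backup",
    backup_accepted=False,
    sources={
        "config": {"channel": "15", "pan_id": "0x1a62", "ext_pan_id": "dddddddddddddddd"},
        "backup": {"channel": "15", "pan_id": "0x1a62", "ext_pan_id": "dddddddddddddddd"},
        "adapter": {"channel": "11", "pan_id": "0x1a62", "ext_pan_id": "dddddddddddddddd"},
    },
)
print("\n".join(example))
```

The network key is deliberately left out of the printed comparison; a fingerprint match flag would be enough.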

## Safety note

Changing `pan_id`, `ext_pan_id`, or `network_key` is not acceptable as a general workaround for large production meshes, because it can strand devices or require mass re-pairing. In the successful recoveries above, those values were preserved.
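
A recovery script can enforce this rule mechanically. The following minimal sketch assumes the network settings before and after the wiggle have already been read into plain dictionaries; the function names and the dictionary layout are hypothetical.

```python
# Fields that must never change during a channel-only wiggle.
PROTECTED = ("pan_id", "ext_pan_id", "network_key")


def changed_fields(before: dict, after: dict) -> set[str]:
    """Return the names of all settings whose values differ."""
    return {key for key in before.keys() | after.keys() if before.get(key) != after.get(key)}


def check_wiggle_is_safe(before: dict, after: dict) -> None:
    """Raise if any protected identity field was touched."""
    touched = changed_fields(before, after) & set(PROTECTED)
    if touched:
        raise ValueError(
            f"refusing to continue, protected fields changed: {', '.join(sorted(touched))}"
        )


# Placeholder values for illustration only.
before = {
    "channel": 20,
    "pan_id": 0x1A62,
    "ext_pan_id": "dddddddddddddddd",
    "network_key": list(range(16)),
}
during_wiggle = {**before, "channel": 15}

check_wiggle_is_safe(before, during_wiggle)  # passes: only the channel differs
print(changed_fields(before, during_wiggle))
```

Running the same check again after setting the channel back should report no changed fields at all.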

