Offline worker after power outage

A physical worker node stays offline indefinitely after a brief power outage. Running the `apply-vpn-routes` command manually fixes the VPN configuration and restores the connection.

**Steps to reproduce**

- Physical Worker and Leader nodes are on the same network, without UPS
- Leader node has a Samba DC, bound to its LAN IP address 
- A power outage occurs

**Expected behavior**

When power comes back, VPN connectivity should be restored automatically after boot, and both nodes must be fully functional.

**Actual behavior**

When power comes back, the worker node stays offline because the WireGuard VPN is misconfigured.

The journal shows that `apply-vpn-routes` is executed early at boot time, when the network interface is still unconfigured. As a result, the script computes a wrong IP route for the leader node and sends its traffic through the VPN.
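The suspected failure mode can be illustrated with a minimal Python sketch. This is not the actual `apply-vpn-routes` code: the function name `route_via_vpn`, the way connected networks are passed in, and the `192.168.1.0/24` LAN prefix are assumptions made for illustration. The idea is that a peer address stays out of the VPN only if it belongs to a directly connected network, so an empty list of networks at boot sends the leader address to `wg0`.

```python
import ipaddress

def route_via_vpn(peer_ip, connected_networks):
    """Return True if peer_ip would be routed through wg0.

    Hypothetical reconstruction of the decision, not the real script:
    an address is kept outside the VPN only when it is on-link.
    """
    addr = ipaddress.ip_address(peer_ip)
    on_link = any(addr in ipaddress.ip_network(net) for net in connected_networks)
    return not on_link

# Early at boot: enp1s0 has no address yet, so no connected network is known.
print(route_via_vpn("192.168.1.3", []))

# After NetworkManager configured the LAN interface (assumed prefix).
print(route_via_vpn("192.168.1.3", ["192.168.1.0/24"]))
```

The first call returns `True` (the leader IP ends up in the VPN), the second returns `False`, which matches the observation that running the command manually after boot fixes the routes.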

The command is executed by the `redis.service` unit, which starts after `network-online.target`.
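To check from a script whether the network was actually usable at a given moment, one option is to look for an IPv4 default route in `/proc/net/route`. The sketch below is only a diagnostic aid under stated assumptions: the helper name `has_default_route` is hypothetical, and the sample gateway `192.168.1.1` and the `/24` mask are invented for the example. In that file, addresses are little-endian hexadecimal, and a default route has both destination and mask equal to `00000000`.

```python
def has_default_route(route_table_text):
    """Return True if text in /proc/net/route format holds an IPv4 default route."""
    lines = route_table_text.strip().splitlines()
    for line in lines[1:]:  # the first line is the column header
        fields = line.split()
        # Column 1 is Destination, column 7 is Mask.
        if len(fields) >= 8 and fields[1] == "00000000" and fields[7] == "00000000":
            return True
    return False

HEADER = "Iface\tDestination\tGateway\tFlags\tRefCnt\tUse\tMetric\tMask\tMTU\tWindow\tIRTT\n"

# Interface still unconfigured: only the header is present.
TABLE_AT_BOOT = HEADER

# Interface configured (assumed gateway 192.168.1.1, assumed /24 LAN).
TABLE_WHEN_READY = (
    HEADER
    + "enp1s0\t00000000\t0101A8C0\t0003\t0\t0\t100\t00000000\t0\t0\t0\n"
    + "enp1s0\t0001A8C0\t00000000\t0001\t0\t0\t100\t00FFFFFF\t0\t0\t0\n"
)

print(has_default_route(TABLE_AT_BOOT))
print(has_default_route(TABLE_WHEN_READY))
```

On a live system the function could be fed the contents of `/proc/net/route` read with `open("/proc/net/route").read()`.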

Leader node and Samba DC IP: 192.168.1.3

```text
Jan 09 09:30:46 ns8worker redis[2089]: ip route replace 192.168.1.3 nexthop dev wg0
...
Jan 09 09:32:03 ns8n6 kernel: igb 0000:01:00.0 enp1s0: igb: enp1s0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
...
Jan 09 09:32:03 ns8n6 NetworkManager[1557]: <info>  [1767947523.4155] policy: set 'enp1s0' (enp1s0) as default for IPv4 routing and DNS
```

Contents of `wg0.conf` on the worker. The IP 192.168.1.3 must not appear among `AllowedIPs`. However, because it did not match any link-scope (directly connected) interface route when `apply-vpn-routes` ran, the script pushed it into the VPN.

```text
[root@ns8worker ~]# cat /etc/wireguard/wg0.conf 
[Interface]
Address = 10.5.4.6/32
ListenPort = 55820
PrivateKey = ***

[Peer]
PublicKey = ***
AllowedIPs = 10.5.4.1/32, 10.5.4.2/32, 10.5.4.3/32, 10.5.4.4/32, 10.5.4.5/32, 192.168.1.3/32
Endpoint = 192.168.1.3:55820
PersistentKeepalive = 25
```
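The misconfiguration can be detected mechanically: the peer `Endpoint` address must never be covered by that peer's `AllowedIPs`, otherwise the tunnel's own transport packets are routed into the tunnel. The following Python sketch checks this for a single-peer, IPv4-only configuration like the one above. The function name `endpoint_in_allowed_ips` is hypothetical and the parser is deliberately simplistic; it is not part of NS8 or WireGuard.

```python
import ipaddress

def endpoint_in_allowed_ips(conf_text):
    """Return True if the Endpoint IP is covered by AllowedIPs.

    Simplified: assumes one [Peer] section and an IPv4 "host:port" endpoint.
    """
    allowed = []
    endpoint_ip = None
    for raw in conf_text.splitlines():
        key, sep, value = raw.partition("=")
        if not sep:
            continue  # section headers and blank lines
        key = key.strip().lower()
        value = value.strip()
        if key == "allowedips":
            allowed = [
                ipaddress.ip_network(item.strip(), strict=False)
                for item in value.split(",")
                if item.strip()
            ]
        elif key == "endpoint":
            endpoint_ip = ipaddress.ip_address(value.rsplit(":", 1)[0])
    return endpoint_ip is not None and any(endpoint_ip in net for net in allowed)

# Configuration taken from the report above (keys redacted as in the original).
WG0_CONF = """\
[Interface]
Address = 10.5.4.6/32
ListenPort = 55820
PrivateKey = ***

[Peer]
PublicKey = ***
AllowedIPs = 10.5.4.1/32, 10.5.4.2/32, 10.5.4.3/32, 10.5.4.4/32, 10.5.4.5/32, 192.168.1.3/32
Endpoint = 192.168.1.3:55820
PersistentKeepalive = 25
"""

print(endpoint_in_allowed_ips(WG0_CONF))
```

For the configuration in this report the check returns `True`, which flags the broken state; with `192.168.1.3/32` removed from `AllowedIPs` it returns `False`.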


**Components**

Core 3.16.0

**See also**

- Discussion (PVT)  https://mattermost.nethesis.it/nethesis/pl/jqr1156yy7y68qn5yy5erzkc5e

