Solving a tricky ECMP problem between two machines, where one system would stop balancing connections at random intervals.

Let’s imagine the following situation:


We have two Linux servers, each with two network cards, connected to each other through a pair of switches. Everything is on the same network, and each network card has its own IP from the same subnet. ECMP is configured in the kernel to load balance packets over the two interfaces towards the two IPs configured on the other machine.
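
To make the ECMP part concrete, here is a minimal sketch that builds the kind of multipath route this setup relies on. The addresses are hypothetical (taken from the documentation ranges), and the shape of the route is an assumption: a service address on the peer, reached through the peer's two IPs as next hops. The helper `ecmp_route_argv` is made up for illustration; only the `ip route ... nexthop via ... dev ... weight ...` syntax comes from iproute2.

```python
import shlex


def ecmp_route_argv(destination, nexthops):
    """Build `ip route replace` arguments for one multipath route.

    nexthops is a list of (gateway_ip, device) pairs. Each next hop gets
    weight 1, so all paths are treated as equal cost.
    """
    if len(nexthops) < 2:
        raise ValueError("ECMP needs at least two next hops")
    argv = ["ip", "route", "replace", destination]
    for gateway, device in nexthops:
        argv += ["nexthop", "via", gateway, "dev", device, "weight", "1"]
    return argv


if __name__ == "__main__":
    # Hypothetical addresses: 198.51.100.1 stands in for a service IP on the
    # peer, 192.0.2.21 and 192.0.2.22 for the peer's two NIC addresses.
    argv = ecmp_route_argv(
        "198.51.100.1/32",
        [("192.0.2.21", "ens2f0"), ("192.0.2.22", "ens2f1")],
    )
    # Print the command instead of running it; applying it needs root.
    print(shlex.join(argv))
```

The script only prints the command, so it is safe to run anywhere. Swap in your own addresses before applying anything on a real system.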

For this setup to work properly, a few ARP-related sysctl tweaks are needed:

net.ipv4.conf.ens2f0.arp_ignore = 1
net.ipv4.conf.ens2f1.arp_ignore = 1
net.ipv4.conf.ens2f0.arp_announce = 2
net.ipv4.conf.ens2f1.arp_announce = 2
net.ipv4.conf.ens2f0.arp_notify = 1
net.ipv4.conf.ens2f1.arp_notify = 1
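
If you want to double check that these values are actually in effect, something along these lines may help. It is only a sketch: the helper names are made up, and it assumes the interfaces are called ens2f0 and ens2f1 as above and that the live values can be read from under /proc/sys.

```python
from pathlib import Path

# Settings from the sysctl list above, applied per interface.
ARP_SETTINGS = {"arp_ignore": 1, "arp_announce": 2, "arp_notify": 1}


def render_arp_sysctls(interfaces, settings=ARP_SETTINGS):
    """Return sysctl.conf-style lines, grouped by setting like the list above."""
    return [
        f"net.ipv4.conf.{iface}.{key} = {value}"
        for key, value in settings.items()
        for iface in interfaces
    ]


def find_mismatches(interfaces, settings=ARP_SETTINGS, proc_root="/proc/sys"):
    """Compare the live values under proc_root with the wanted ones.

    Returns (interface, key, current_value) tuples. current_value is None
    when the file cannot be read (for example, an unknown interface).
    """
    mismatches = []
    for iface in interfaces:
        for key, wanted in settings.items():
            path = Path(proc_root) / "net/ipv4/conf" / iface / key
            try:
                current = path.read_text().strip()
            except OSError:
                current = None
            if current != str(wanted):
                mismatches.append((iface, key, current))
    return mismatches


if __name__ == "__main__":
    nics = ["ens2f0", "ens2f1"]
    # Lines suitable for a file under /etc/sysctl.d/.
    print("\n".join(render_arp_sysctls(nics)))
    for iface, key, current in find_mismatches(nics):
        print(f"# {iface}.{key} is currently {current!r}")
```

The rendered lines can be saved under /etc/sysctl.d/ so they survive a reboot. Any line starting with `#` in the output points at a value that still differs from the list above.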

So what do these settings do? The Linux kernel docs give the following information:

arp_ignore - INTEGER
    Define different modes for sending replies in response to
    received ARP requests that resolve local target IP addresses:
    0 - (default): reply for any local target IP address, configured
    on any interface
    1 - reply only if the target IP address is local address
    configured on the incoming interface

The issue with the default setting (0) is that when Linux receives an ARP request on ens2f0 for an IP belonging to ens2f1, it replies anyway, and the reply carries the MAC address of ens2f0. As a result, the other system sees both IPs behind the same NIC and ignores the other NIC completely.
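
The difference between the two modes can be illustrated with a small toy model. This is not kernel code, just a sketch of the two rules quoted above. It assumes that a broadcast ARP request from the peer reaches both NICs, and the IP and MAC addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nic:
    name: str
    mac: str
    ip: str


def arp_reply(nics, in_nic, target_ip, arp_ignore):
    """Return the MAC that in_nic answers with, or None if it stays silent."""
    if arp_ignore == 0:
        # Mode 0: answer for any local IP, whichever interface owns it.
        answers = target_ip in {n.ip for n in nics}
    elif arp_ignore == 1:
        # Mode 1: answer only for an IP configured on the receiving interface.
        answers = target_ip == in_nic.ip
    else:
        raise ValueError("only modes 0 and 1 are modelled here")
    return in_nic.mac if answers else None


def macs_seen_by_peer(nics, arp_ignore):
    """For each local IP, collect every MAC that answers a broadcast request."""
    seen = {}
    for target in nics:
        replies = (arp_reply(nics, n, target.ip, arp_ignore) for n in nics)
        seen[target.ip] = sorted(mac for mac in replies if mac)
    return seen


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    nics = [
        Nic("ens2f0", "02:00:00:00:00:01", "192.0.2.11"),
        Nic("ens2f1", "02:00:00:00:00:02", "192.0.2.12"),
    ]
    for mode in (0, 1):
        print(f"arp_ignore={mode}: {macs_seen_by_peer(nics, mode)}")
```

In this model, mode 0 lets both MACs answer for each IP, so which NIC the peer ends up using depends on which reply it happens to keep. With mode 1, each IP is answered by exactly one MAC.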

arp_notify - BOOLEAN
    Define mode for notification of address and device changes.
    0 - (default): do nothing
    1 - Generate gratuitous arp requests when device is brought up
        or hardware address changes.

This setting is pretty self-explanatory. It is useful for fast recovery when the link comes back up after an outage.