IPv6: Interface IP operations are stopped with floating IP in HA failover

When the main node fails, its IPv6 address floats to the standby node. The standby node is then supposed to provide service on that address.

Both nodes are on the same LAN, and the standby node often becomes unreachable after the failover. The interface is UP and RUNNING with the IPv6 address assigned, but all IP operations on it are stopped.

One possibility is that Duplicate Address Detection (DAD) kicks in when the IP is configured on the standby node and fails. The RFC says all IP operations must be stopped in that case.

My question is about the specifics of the Linux kernel IPv6 implementation. From reading the kernel code, I had assumed that the sysctl variable "disable_ipv6" was being set. But the kernel is not disabling IPv6; it just stops all IP operations on that interface.

Can anyone explain what the Linux kernel's IPv6 code does when it "disables these IP operations" on DAD failure? Can this be reset to normal without taking the interface DOWN and UP again? Any pointers to the relevant code would be very helpful.

Nara answered 6/7, 2015 at 22:26 Comment(5)
An OpenStack bug along similar lines: bugs.launchpad.net/nova/+bug/1011134 - Nara
Just a gentle reminder to bump the post: I would really appreciate any insights from the community. The bounty ends in 1 hour. Thanks! - Nara
Does dmesg show the duplicate address detected message? - Timoshenko
Yes, it does, similar to the one in net/ipv6/addrconf.c that reports DAD failure. - Nara
Does anyone know whether a fix for this bug has been proposed? bugs.launchpad.net/nova/+bug/1011134 - Nara

This article explains both the specification and what actually happens in the kernel's IPv6 implementation when a floating IP is configured. It also suggests a solution: http://criticalindirection.com/2015/06/30/ipv6_dad_floating_ips/

According to the article, for a "user-assigned link-local" address the IPv6 address gets stuck in the tentative state, marked by IFA_F_TENTATIVE in the kernel. This state means DAD is still in progress and the address has not yet been validated. For an "auto-assigned link-local" address, if DAD fails the kernel retries accept_dad times (with a new auto-generated address each time), and after that it disables IPv6 on that interface.
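One way to check whether an address is in this state is to decode the flags column of /proc/net/if_inet6. The following is a minimal sketch, assuming the flag values from the kernel header include/uapi/linux/if_addr.h (IFA_F_TENTATIVE = 0x40, IFA_F_DADFAILED = 0x08); the function name parse_if_inet6 is made up for illustration.

```python
# Flag bits as defined in include/uapi/linux/if_addr.h.
IFA_F_DADFAILED = 0x08
IFA_F_TENTATIVE = 0x40


def parse_if_inet6(text):
    """Parse the content of /proc/net/if_inet6 into a list of dicts.

    Each line has six whitespace-separated fields: address (32 hex digits),
    interface index, prefix length, scope, flags (all hex), device name.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        addr_hex, _ifindex, _plen, _scope, flags_hex, dev = fields
        flags = int(flags_hex, 16)
        entries.append({
            "dev": dev,
            # Regroup the 32 hex digits into eight colon-separated blocks.
            "addr": ":".join(addr_hex[i:i + 4] for i in range(0, 32, 4)),
            "tentative": bool(flags & IFA_F_TENTATIVE),
            "dadfailed": bool(flags & IFA_F_DADFAILED),
        })
    return entries
```

On a Linux host, passing open("/proc/net/if_inet6").read() to parse_if_inet6 lists every configured address; an entry with both tentative and dadfailed set would match the stuck state described above.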

The solution it suggests is to disable DAD before configuring the floating IP, and to re-enable it once the address is out of the tentative state.
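As a rough sketch of that sequence (not the article's own script), the steps could be driven from Python as shown here. The helper names floating_ip_steps and apply_steps are hypothetical, the interface name is assumed to contain no dots, and the per-interface sysctl key net.ipv6.conf.<iface>.accept_dad is used so that other interfaces are left alone.

```python
import subprocess


def floating_ip_steps(iface, addr_with_prefix):
    """Return the ordered commands: disable DAD, add the address, re-enable DAD."""
    # Assumption: iface has no dots, so it can be embedded in the sysctl key.
    sysctl_key = "net.ipv6.conf.%s.accept_dad" % iface
    return [
        ["sysctl", "-w", sysctl_key + "=0"],
        ["ip", "-6", "addr", "add", addr_with_prefix, "dev", iface],
        # Re-enable only after the address has left the tentative state.
        ["sysctl", "-w", sysctl_key + "=1"],
    ]


def apply_steps(steps, run=subprocess.check_call):
    """Run each command in order; the default runner needs root privileges."""
    for cmd in steps:
        run(cmd)
```

For example, apply_steps(floating_ip_steps("eth0", "2001:db8::10/64")) would run the three commands as root; passing a different run callable allows a dry run that only records the commands.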

For more details, refer to the article linked above.

Nara answered 17/7, 2015 at 6:14 Comment(3)
That article is a nice read! I think you could make your answer better if you took the relevant parts out of it and reposted them here, but I'm upvoting anyway. I would edit your post myself but I don't know what the etiquette for that is. - Timoshenko
Thanks ssnobody, I have updated the answer with some detail, although I have still left the full scenario to the article. - Nara
Doesn't it feel as if, by disabling IP operations on DAD failure, IPv6 deliberately imposes policy? Returning an error to the user would have been better; the user could then decide whether to disable IP operations or allow them. This definitely breaks a valid scenario like floating IP. Overall, a good reason for modifying the RFC. - Nara

This is related to a bug in nova, bug #1011134.

The documentation for accept_dad says:

accept_dad - INTEGER
    Whether to accept DAD (Duplicate Address Detection).
    0: Disable DAD
    1: Enable DAD (default)
    2: Enable DAD, and disable IPv6 operation if MAC-based duplicate link-local address has been found.

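So you can use sysctl -w net.ipv6.conf.default.accept_dad=0 to work around the bug by disabling DAD. Note that the default entry only applies to interfaces created afterwards; for an existing interface, set net.ipv6.conf.<interface>.accept_dad instead.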
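To see which of the documented modes an interface is currently in, the value can be read back from procfs. This is a small sketch, assuming the usual /proc/sys/net/ipv6/conf/<interface>/accept_dad layout; the helper names and the proc_root parameter (there only so the code can be pointed at a test directory) are made up for illustration.

```python
import os

# Meanings taken from the accept_dad documentation quoted above.
ACCEPT_DAD_MEANING = {
    0: "DAD disabled",
    1: "DAD enabled (default)",
    2: "DAD enabled; IPv6 operation disabled if a MAC-based "
       "duplicate link-local address is found",
}


def accept_dad_path(iface, proc_root="/proc/sys"):
    """Build the procfs path of the accept_dad setting for one interface."""
    return os.path.join(proc_root, "net", "ipv6", "conf", iface, "accept_dad")


def read_accept_dad(iface, proc_root="/proc/sys"):
    """Return the current accept_dad value of iface as an integer."""
    with open(accept_dad_path(iface, proc_root)) as f:
        return int(f.read().strip())


def describe_accept_dad(value):
    """Translate an accept_dad value into the documented meaning."""
    return ACCEPT_DAD_MEANING.get(value, "unknown value %d" % value)
```

Calling describe_accept_dad(read_accept_dad("eth0")) on the standby node before and after the sysctl change is a quick way to confirm the setting actually took effect on the interface that carries the floating IP.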

Alternatively, you can fix this bug by applying the proposed patches to nova/virt/libvirt/firewall.py from that same bug report.

If it is not already present in the NWFilterFirewall class, add the following method:

def nova_no_nd_reflection_filter(self):
    """This filter protects false positives on IPv6 Duplicate Address
    Detection(DAD).
    """
    uuid = self._get_filter_uuid('nova-no-nd-reflection')
    return '''<filter name='nova-no-nd-reflection' chain='ipv6'>
              <!-- no nd reflection -->
              <!-- drop if destination mac is v6 mcast mac addr and
                   we sent it. -->
              <uuid>%s</uuid>
              <rule action='drop' direction='in'>
                  <mac dstmacaddr='33:33:00:00:00:00'
                       dstmacmask='ff:ff:00:00:00:00' srcmacaddr='$MAC'/>
              </rule>
              </filter>''' % uuid

Then, add this filter to your filter lists in _ensure_static_filters() by adding:

self._define_filter(self.nova_no_nd_reflection_filter())

after filter_set is defined.
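To make the intent of the rule concrete, here is a small model of what it matches: a frame whose destination MAC falls in the IPv6 multicast range 33:33:xx:xx:xx:xx and whose source MAC is the guest's own. This is only an illustration of the match logic, with hypothetical helper names; the real filtering is done by libvirt's nwfilter, which substitutes $MAC itself.

```python
def mac_to_int(mac):
    """Convert a colon-separated MAC address string to an integer."""
    return int(mac.replace(":", ""), 16)


def is_nd_reflection(dst_mac, src_mac, own_mac,
                     match="33:33:00:00:00:00", mask="ff:ff:00:00:00:00"):
    """Return True if an incoming frame would be dropped by the rule.

    The frame matches when its destination MAC, under the mask, equals the
    IPv6 multicast prefix and its source MAC is our own, meaning that our
    own multicast frame was reflected back to us.
    """
    dst_is_v6_mcast = (mac_to_int(dst_mac) & mac_to_int(mask)) == mac_to_int(match)
    return dst_is_v6_mcast and mac_to_int(src_mac) == mac_to_int(own_mac)
```

A reflected neighbor solicitation sent during DAD would satisfy both conditions, which is why dropping such frames avoids the false duplicate detection described in the docstring of the filter.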

Timoshenko answered 17/7, 2015 at 23:41 Comment(2)
Doesn't dad_transmit need to be made zero? - Nara
I believe disabling DAD is what you want to accomplish. - Timoshenko
