vPC and LAG Convergence

Thursday November 16, 2017

Recently Cisco released NXOS 7.0(3)I7(1) for the Nexus 9000 series switches. This brings two new features, called vPC Fast Convergence and LACP Convergence. These are also available on the 7000 series switches.

There wasn’t a lot of information readily available, so I’m going to share what I’ve learned here. I’d like to take a moment to thank Amith Ronad from Cisco for helping me to understand these features.


LACP Convergence

In a normal etherchannel, LACP starts negotiating when a member link comes up. Around the same time, the switch starts to make VLANs available on the link.

LACP negotiation finishes first, but the VLANs may not all be available yet. This means the link can pass traffic while some of the VLANs that traffic is tagged with are still unavailable. For a brief moment, those VLANs are effectively pruned from the link, and their traffic is blackholed.

The LACP Convergence feature changes the order a little. LACP frames are not sent until all the relevant VLANs are available on the link. This prevents traffic blackholing.

Switch(config)# interface port-channel 10
Switch(config-if)# lacp vpc-convergence
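
To check that the setting is applied, something like the following should work. This is only a sketch that reuses port-channel 10 from the example above; the exact output varies by platform and release.

Switch# show running-config interface port-channel 10
Switch# show port-channel summary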

vPC Fast Convergence

vPC has several failure-handling features in place. When the peer-link fails, the vPC member ports of the secondary switch are shut down. This is done to prevent a split-brain scenario.

The normal behaviour in a case like this is for vPC to shut down any SVIs whose VLANs are on the peer-link. Sometimes these SVIs are down before the vPC member ports finish shutting down. The effect is that traffic may flow to a member port, only to find that the SVI is down. This is another case where traffic is briefly blackholed.

If you use the vPC Fast Convergence command, you enable a new feature called MCT Down Handler. This feature creates a list of member ports, layer-3 interfaces (including SVIs), and the VLANs they use. When the peer-link fails, it sends a ‘suspend’ message to them all at once. The practical benefit is that the SVIs do not shut down ahead of the member ports, which prevents the traffic loss.

Switch(config)# vpc domain 10
Switch(config-vpc-domain)# fast-convergence
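
To confirm what is configured under the vPC domain, and to check the general health of the vPC, commands along these lines can be used. This is a minimal sketch; the output format differs between releases, so it is not shown here.

Switch# show running-config vpc
Switch# show vpc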

This leads to an interesting question. Shouldn’t autostate keep the SVIs up until all the member ports are down?

In a non-vPC environment, this is true. vPC, however, is special and changes the rules. It will bring down any interfaces whose VLAN is on the peer-link. This includes orphan ports by the way, which is not ideal. There are two ways you can design around this.

The first option is to run a separate trunk link between your switch pair. This link would only carry non-vPC VLANs, which are manually pruned from the peer-link. These VLANs will then no longer be affected by vPC failures.
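
As a rough sketch of this first option, assume port-channel 1 is the peer-link, port-channel 2 is the separate trunk, and VLANs 100-110 are the non-vPC VLANs. All of these numbers are made up for illustration, so adjust them to suit your environment.

Switch(config)# interface port-channel 1
Switch(config-if)# switchport trunk allowed vlan remove 100-110
Switch(config-if)# interface port-channel 2
Switch(config-if)# switchport mode trunk
Switch(config-if)# switchport trunk allowed vlan 100-110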

The second option is to use the dual-active exclude interface-vlan command. This will separate the SVI status from the peer-link failure. Of the two, the first option would be preferable.
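
For reference, the second option might look something like this. The vPC domain number and the VLAN range 100-110 are assumptions for the sake of the example.

Switch(config)# vpc domain 10
Switch(config-vpc-domain)# dual-active exclude interface-vlan 100-110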


Conclusion

There are some small benefits to be gained from these new features. Without them, link failures may lead to 500ms of traffic loss. With the new features, this can be decreased to 50-250ms of loss. Whether this provides any practical benefit to you will depend on your environment.

There do not appear to be any downsides to enabling these features. It’s surprising really, that they’re commands and not just built into vPC. I can’t think of any reason you wouldn’t want to enable them.


More on vPC…

vPCs: https://networkdirection.net/Virtual+Port+Channels
Advanced vPCs: https://networkdirection.net/Advanced+vPC