Troubleshooting DMVPN

There are a few tricks that you can use when you need to troubleshoot DMVPN. 

Troubleshooting breaks down roughly into four parts:

  1. Transport
  2. Encryption
  3. Tunnels
  4. Routing

 

But before we start with each section, there are a few best practice items that can help you out:

  • Make sure the clocks on all your routers match (for example, by syncing them to NTP)
  • Enable msec timestamps for debugging and logging
  • Enable terminal exec prompt timestamp (useful when running debugs)

Starting with these will make it easier to match up any issues that you’re seeing across different devices.

 

 


Transport

The first place to start is with the underlying transport. The most basic test is to confirm you have connectivity between each endpoint using ping.

Be careful not to introduce recursive routing at this point. For example, don’t ping an NBMA address from a tunnel address.

This is also a good opportunity to confirm that there’s no recursive routing in general. Check that you’re not advertising NBMA addresses over the tunnel interface.
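A minimal sketch of that check in Python, assuming you have already collected the prefixes a router learns over the tunnel. The function name and the sample prefixes are illustrative and not part of any Cisco tool:

```python
import ipaddress

def nbma_covered_by_tunnel_routes(nbma_addresses, tunnel_prefixes):
    """Return (nbma, prefix) pairs where a tunnel-learned prefix covers an NBMA address."""
    hits = []
    for nbma in nbma_addresses:
        addr = ipaddress.ip_address(nbma)
        for prefix in tunnel_prefixes:
            net = ipaddress.ip_network(prefix, strict=False)
            if addr in net:
                hits.append((nbma, prefix))
    return hits

# NBMA addresses are the ones used in this article; the prefixes are made up.
nbma = ["200.1.1.21", "200.1.1.22"]
learned_via_tunnel = ["10.10.0.0/16", "200.1.1.0/24"]
for address, prefix in nbma_covered_by_tunnel_routes(nbma, learned_via_tunnel):
    print(f"{address} is covered by tunnel-learned prefix {prefix}")
```

A hit is only a candidate for recursive routing. The router still uses the longest match, so confirm on the device which route it actually selects for the NBMA address.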

 

If basic connectivity is OK, check that no firewall or IPS is blocking your traffic.

Depending on how you’re implementing your tunnels, this may be GRE traffic (IP protocol 47), or it may be IPSec (ESP, IP protocol 50, plus IKE on UDP 500, or UDP 4500 when NAT traversal is in use).

 

If this is good, you may want to start a packet capture, and see what’s going on. This is also useful if ICMP is blocked somewhere, preventing the previous testing.

A useful tool here is the Packet Capture Config Generator and Analyzer, which Cisco provides to help generate packet capture configurations and analyze the resulting captures.

 

Additional Tools

Run debug ip icmp to see whether pings are arriving but the replies are not returning.

Run debug ip packet [acl] [detail] to dig into the traffic further. There are two important considerations here. First, always use an ACL, so you and the router aren’t overwhelmed by the debug.

Second, this won’t work with CEF, as the debug only sees process-switched packets. To use it effectively, you need to disable CEF (at least for the duration of testing). This may impact performance, so consider doing this out of hours.

 

 


Encryption

If you feel that the transport is ok, it’s time to look at encryption (assuming you’re using it).

Once again, you should be sure that nothing’s blocking IPSec.

 

If that’s good, run show dmvpn detail. This command is useful, as it gives you a ton of IPSec info in a single command, along with some tunnel info.

In the example below, the tunnel’s up, but we can see that the Crypto Session Status is DOWN.

ASR-2#show dmvpn detail
Legend: Attrb --> S - Static, D - Dynamic, I - Incomplete
        N - NATed, L - Local, X - No Socket
        T1 - Route Installed, T2 - Nexthop-override
        C - CTS Capable, I2 - Temporary
        # Ent --> Number of NHRP entries with same NBMA peer
        NHS Status: E --> Expecting Replies, R --> Responding, W --> Waiting
        UpDn Time --> Up or Down Time for a Tunnel
==========================================================================

Interface Tunnel0 is up/up, Addr. is 192.168.250.2, VRF "vWAN" 
   Tunnel Src./Dest. addr: 200.1.1.22/Multipoint, Tunnel VRF "Edge"
   Protocol/Transport: "multi-GRE/IP", Protect "TUNNEL" 
   Interface State Control: Disabled
   nhrp event-publisher : Disabled

IPv4 NHS:
192.168.250.1   E priority = 0 cluster = 0
Type:Spoke, Total NBMA Peers (v4/v6): 1

# Ent  Peer NBMA Addr Peer Tunnel Add State  UpDn Tm Attrb    Target Network
----- --------------- --------------- ----- -------- ----- -----------------
    1 200.1.1.21        192.168.250.1   IKE 00:16:28     S   192.168.250.1/32 (vWAN)
          

Crypto Session Details: 
--------------------------------------------------------------------------------

Interface: Tunnel0
Session: [0x7EFB7B4468E8]
  Crypto Session Status: DOWN
  fvrf: Edge,     IPSEC FLOW: permit 47 host 200.1.1.22 host 200.1.1.21 
        Active SAs: 0, origin: crypto map
        Inbound:  #pkts dec'ed 0 drop 0 life (KB/Sec) 0/0
        Outbound: #pkts enc'ed 0 drop 22 life (KB/Sec) 0/0
   Outbound SPI : 0x       0, transform : 
    Socket State: Closed

Pending DMVPN Sessions:

 

When you see problems like the one above, you can use traditional IPSec troubleshooting tools to get to the bottom of the issue. For example:

  • show crypto isakmp sa
  • show crypto ikev2 sa
  • show crypto ipsec sa

The example above uses a front-door VRF (fvrf: Edge), which requires a different key configuration from a tunnel without one. That mismatch is what’s causing the session to fail.
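If you collect show dmvpn detail from many routers, the crypto status can be pulled out with a short script. This is a minimal Python sketch that assumes the output has the same layout as the example above. The function name is illustrative, and treating UP-ACTIVE as the only healthy status is an assumption:

```python
import re

def crypto_session_statuses(show_dmvpn_detail):
    """Return a list of (interface, status) pairs from 'show dmvpn detail' text."""
    results = []
    interface = None
    for line in show_dmvpn_detail.splitlines():
        line = line.strip()
        m = re.match(r"Interface:\s+(\S+)", line)
        if m:
            # Start of a crypto session block, e.g. "Interface: Tunnel0"
            interface = m.group(1)
            continue
        m = re.match(r"Crypto Session Status:\s+(\S+)", line)
        if m:
            results.append((interface, m.group(1)))
    return results

# Excerpt of the output shown earlier in this section.
sample = """\
Interface: Tunnel0
Session: [0x7EFB7B4468E8]
  Crypto Session Status: DOWN
  fvrf: Edge,     IPSEC FLOW: permit 47 host 200.1.1.22 host 200.1.1.21
        Active SAs: 0, origin: crypto map
"""
for interface, status in crypto_session_statuses(sample):
    # Assumption: anything other than UP-ACTIVE deserves a closer look.
    if status != "UP-ACTIVE":
        print(f"{interface}: crypto session is {status}")
```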

 

One thing to keep an eye out for is if the hub router is also running EZVPN with IKEv1. This can cause confusion, as the DMVPN may try to accept an EZVPN connection.

If this applies to you, try separating the two by using different ISAKMP policies.

 

Additional Tools

A handy debug you can run is debug dmvpn detail crypto. This contains all the relevant crypto debugs in one place.

On the hub, you can use a ‘condition’ to limit the amount of debug output. For example, use debug dmvpn condition peer {peer-ip}.

 

 


Tunnels

The next step is to look at the tunnels, and confirm if they’re coming up or not. If they’re not, we need to figure out why.

Start by checking the usual issues:

  • Do the NHRP IDs match?
  • Do the tunnel keys match?
  • Are the source and destination IP addresses correct?
  • Are the NHS entries using the right addresses?
  • Is IPSec working (see previous section)?

 

To help get some general information, start by using show ip nhrp. In particular, look for the ‘created’ timer. If this is a low value, this could indicate that the tunnel is flapping.

Hub#show ip nhrp
192.168.250.10/32 via 192.168.250.10
   Tunnel0 created 00:20:07, expire 00:07:17
   Type: dynamic, Flags: registered nhop 
   NBMA address: 110.1.1.2 
    (Claimed NBMA address: 172.16.16.2) 

Spoke#show ip nhrp
192.168.250.1/32 via 192.168.250.1
   Tunnel0 created 00:16:50, never expire 
   Type: static, Flags: 
   NBMA address: 200.1.1.21 
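The ‘created’ check can be scripted when there are many entries to look through. This is a minimal Python sketch that assumes the timer is printed as hh:mm:ss, as in the output above. Older entries may be printed in another format, which this sketch skips. The function name and the threshold are illustrative:

```python
import re

ENTRY = re.compile(r"^(\S+) via (\S+)")
CREATED = re.compile(r"(\S+) created (\d+):(\d{2}):(\d{2}),")

def recently_created(show_ip_nhrp, threshold_seconds=300):
    """Return (target, interface, age_seconds) for entries younger than the threshold."""
    young = []
    target = None
    for line in show_ip_nhrp.splitlines():
        line = line.strip()
        m = ENTRY.match(line)
        if m:
            # First line of an entry, e.g. "192.168.250.10/32 via 192.168.250.10"
            target = m.group(1)
            continue
        m = CREATED.match(line)
        if m:
            hours, minutes, seconds = (int(x) for x in m.group(2, 3, 4))
            age = hours * 3600 + minutes * 60 + seconds
            if age < threshold_seconds:
                young.append((target, m.group(1), age))
    return young

# Hub output from the example above.
hub_output = """\
192.168.250.10/32 via 192.168.250.10
   Tunnel0 created 00:20:07, expire 00:07:17
   Type: dynamic, Flags: registered nhop
   NBMA address: 110.1.1.2
    (Claimed NBMA address: 172.16.16.2)
"""
# A 30-minute threshold is used here only so the sample entry is reported.
for target, interface, age in recently_created(hub_output, threshold_seconds=1800):
    print(f"{target} on {interface} was created {age} seconds ago")
```

A young entry on its own proves nothing, since the router may simply have been reloaded. It becomes interesting when the same entry is still young every time you check.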

 

Next, use show ip nhrp nhs detail. This will show the number of requests and replies between the spoke and the NHS. If we’re not getting replies, then we’re not receiving traffic from the hub.

Spoke#show ip nhrp nhs detail
Legend: E=Expecting replies, R=Responding, W=Waiting
Tunnel0:
192.168.250.1  RE priority = 0 cluster = 0  req-sent 12  req-failed 2  repl-recv 7 (00:01:43 ago)
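To compare those counters over time, the NHS line can be parsed with a few lines of Python. This is a minimal sketch that assumes the line layout shown above. The function name and the dictionary keys are illustrative:

```python
import re

NHS_LINE = re.compile(
    r"^(?P<nhs>\S+)\s+(?P<flags>[ERW]+)\s+priority.*?"
    r"req-sent\s+(?P<sent>\d+)\s+req-failed\s+(?P<failed>\d+)\s+repl-recv\s+(?P<recv>\d+)"
)

def nhs_counters(show_nhs_detail):
    """Return one dict per NHS line found in 'show ip nhrp nhs detail' text."""
    rows = []
    for line in show_nhs_detail.splitlines():
        m = NHS_LINE.match(line.strip())
        if m:
            rows.append({
                "nhs": m.group("nhs"),
                "responding": "R" in m.group("flags"),  # R = Responding, per the legend
                "sent": int(m.group("sent")),
                "failed": int(m.group("failed")),
                "received": int(m.group("recv")),
            })
    return rows

# Spoke output from the example above.
spoke_output = """\
Legend: E=Expecting replies, R=Responding, W=Waiting
Tunnel0:
192.168.250.1  RE priority = 0 cluster = 0  req-sent 12  req-failed 2  repl-recv 7 (00:01:43 ago)
"""
for row in nhs_counters(spoke_output):
    print(f"{row['nhs']}: sent {row['sent']}, failed {row['failed']}, replies {row['received']}")
```

If the sent counter keeps climbing between runs while the received counter stays flat, replies from the hub are not making it back to the spoke.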

 

Consider as well whether the NBMA addresses are changing. This could happen if ISPs are assigning dynamic public IP addresses, or if we’re making NAT changes.

There are a few ways to check this:

  1. Start by checking what your NBMA (public IP) address really is
  2. Run show ip nhrp to see what the other routers are trying to connect to. Is this the correct NBMA address?
  3. Run show crypto socket. Does this show two entries, one for an old NBMA address and one for a new one?
  4. Run debug nhrp packet and look for ‘unique address registered already’ messages

 

If changing IP addresses is a regular and expected occurrence, fix it with ip nhrp registration no-unique.

This command is considered best practice. It allows the same tunnel IP to be registered again with a different NBMA address.

As a one-off fix, you can clear the NHRP database entries, which forces a re-registration with the new IP address.

 

Additional Tools

When digging deeper, start with show ip nhrp traffic. Look for messages sent and received, and pay attention to the registration requests and replies.

You can also use the debug dmvpn detail all command. Check that the registration messages look correct, and look at the reqid field. This identifies a particular request, so you can use it to match up debugs on the hub and spoke.

 

 


Routing

Once you get to this stage, you can use the general troubleshooting techniques that you know and love.

There are a few DMVPN specific things to be aware of, but for the most part, you’ve got this!

Aside from the changes that DMVPN makes to routing, the main thing to look out for is recursive routing. Remember to keep your tunnel and transport routes separate!

 

 


Sources

Cisco Live – Demystifying DMVPN – BRKSEC-3052
