We finally have spoke-to-spoke traffic working in our Phase 1 DMVPN! After fixing the BGP third-party next-hop issue described in the last post, we still had reachability issues between spokes. The spokes could reach the hub and the hub could reach the spokes just fine, but traffic transiting the hub from spoke to spoke was not working.
Before getting into the cause of the spoke-to-spoke reachability issues, I want to describe the IP addressing and BGP ASNs so that the rest of this article makes sense.
BGP ASNs and Tunnel Interfaces
Hub: 65000 172.16.0.1
Spoke 1: 65001 172.16.0.2
Spoke 2: 65003 172.16.0.3
Spoke 3: 65004 172.16.0.4
Relevant BGP Announced Prefixes
Spoke 1: 172.29.0.0/16
Spoke 2: 172.31.100.0/28
Spoke 3: 172.24.0.0/16
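For context, a Phase 1 spoke built to this addressing plan looks roughly like the sketch below. Only the tunnel IP, the ASNs, and the announced prefix come from the tables above; the interface numbers, NHRP settings, IPsec profile name, and the $Hub public IP placeholder are assumptions for illustration.
! Spoke 1 sketch - values not shown in the tables above are assumed
interface Tunnel10
 ip address 172.16.0.2 255.255.255.0
 ip nhrp network-id 10
 ip nhrp nhs 172.16.0.1
 ip nhrp map 172.16.0.1 $Hub public IP
 tunnel source GigabitEthernet1
 tunnel destination $Hub public IP
 tunnel protection ipsec profile DMVPN-PROFILE
!
router bgp 65001
 neighbor 172.16.0.1 remote-as 65000
 network 172.29.0.0 mask 255.255.0.0
Each spoke peers eBGP with the hub over the 172.16.0.0/24 tunnel subnet and announces its own LAN prefix; the hub is the BGP next hop for the other spokes' prefixes, as the spoke 1 table below shows.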
Spoke 1 BGP Table:
RoutingLoop_R1#show ip bgp
-- Lines omitted for brevity --
*> 10.10.69.0/24 172.16.0.1 0 0 65000 i
*> 10.55.10.0/24 172.16.0.1 0 0 65000 i
*> 10.55.20.0/24 172.16.0.1 0 0 65000 i
*> 10.55.30.0/24 172.16.0.1 0 0 65000 i
*> 172.24.0.0 172.16.0.1 0 65000 65004 i
*> 172.29.0.0 0.0.0.0 0 32768 i
*> 172.31.100.0/28 172.16.0.1 0 65000 65003 i
The Problem
The traceroute below is from spoke 2 to the spoke 1 management IP. As we can see, the trace reaches the hub and no further hops are displayed. This wasn't expected: the BGP and routing tables on all routers looked good, with the appropriate next hops. The source of this traceroute is 172.16.0.3. I SSH'd up to the hub to investigate.
DMVPN_SPOKE_2r#traceroute 172.29.254.1
Type escape sequence to abort.
Tracing the route to 172.29.254.1
VRF info: (vrf in name/id, vrf out name/id)
1 172.16.0.1 4 msec 8 msec 4 msec
2 * * *
3 * * *
I checked CEF for forwarding information between spoke 2's tunnel address and spoke 1's loopback using the "show ip cef exact-route" command. I had never used this command with any kind of tunnel in play, and I assumed the output interface and adjacency information would be tunnel related, effectively mirroring the IP routing table. That was not the case: the command displays the hub's internet-facing output interface and the default-route next hop. I had to think about this for a while to start to grasp why the routing table and CEF don't appear to be aligned. CEF must be reporting the forwarding decision for this traffic after GRE/IPsec encapsulation, when the packet is headed for the destination spoke's public address rather than a tunnel address.
DMVPN-HUB-01#show ip cef exact-route 172.16.0.3 172.29.254.1
172.16.0.3 -> 172.29.254.1 =>IP adj out of GigabitEthernet8, addr $Hub public gateway IP
DMVPN-HUB-01#show ip route | begin Gateway
Gateway of last resort is $Hub public gateway IP to network 0.0.0.0
S* 0.0.0.0/0 [1/0] via $Hub public gateway IP
10.0.0.0/8 is variably subnetted, 13 subnets, 3 masks
-- Lines omitted for brevity --
172.16.0.0/16 is variably subnetted, 2 subnets, 2 masks
C 172.16.0.0/24 is directly connected, Tunnel10
L 172.16.0.1/32 is directly connected, Tunnel10
B 172.24.0.0/16 [20/0] via 172.16.0.4, 01:55:54
B 172.29.0.0/16 [20/0] via 172.16.0.2, 01:47:55
172.31.0.0/28 is subnetted, 1 subnets
B 172.31.100.0 [20/0] via 172.16.0.3, 01:50:42
I recalled that there is an outbound ACL on the hub's internet interface (Gi8). I checked the entries and found only permit statements for ICMP and an explicit deny-any entry. I asked the operator of spoke 2 to try the traceroute again and noticed that the hit counter for the deny-any ACE incremented. Interesting. We are using IPsec transport mode to save on overhead, since GRE already gives us an outer IP header. I also recalled that outbound interface ACLs are not checked for locally originated traffic. My assumption was that when the hub receives a packet destined for another spoke, it strips off the outer (GRE delivery) IP header, the ESP header, and the GRE header, checks the original IP destination, and re-encapsulates the packet in GRE/ESP toward the destination spoke. After re-encapsulation I suspected the packet would be treated like locally originated traffic and therefore not be subject to the outbound ACL on the hub. This is not the behavior we found.
DMVPN-HUB-01#show access-lists SELF-INTERNET
Extended IP access list SELF-INTERNET
10 permit icmp any any echo (11 matches)
20 permit icmp any any echo-reply
30 deny ip any any (411 matches)
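To picture what this outbound ACL actually evaluates, here is roughly what a re-encapsulated spoke-to-spoke packet looks like on the wire with GRE over IPsec in transport mode (the public addresses are placeholders of mine):
Outer IP    src $Hub public IP -> dst $Spoke 1 public IP   (GRE delivery header, reused by transport mode)
  ESP       protects everything below it
    GRE
      Inner IP   src 172.16.0.3 -> dst 172.29.254.1
        Payload  (ICMP echoes, traceroute probes, ...)
From the ACL's point of view this is simply IP protocol 50 (ESP) between two public addresses, which is why the GRE and ESP permits described at the end of this post seemed like the obvious thing to try.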
I configured an extended ACL on the hub to match IP traffic from 172.16.0.0/24 (any tunnel address) to 172.29.254.1 (the spoke 1 mgmt IP) and used it to filter the output of "debug ip packet" while spokes 2 and 3 tried to ping 172.29.254.1. The pings were not successful, the debug returned no output, and the deny-any ACL entry hit counter on the hub kept increasing. As a test, the hub operator removed the outbound ACL from the internet-facing port. After the ACL was removed, spoke-to-spoke traffic was successful.
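For reference, the debug filter described above was along these lines; the ACL number is my own placeholder, not the one used on the hub:
access-list 150 permit ip 172.16.0.0 0.0.0.255 host 172.29.254.1
DMVPN-HUB-01#debug ip packet 150
One caveat worth remembering with this approach: "debug ip packet" only reports process-switched packets, so CEF-switched transit traffic will not show up in the debug even when it is flowing.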
Spoke 1 to Spoke 3 LAN:
RoutingLoop_R1# ping 172.24.127.250 re 150 size 1400 df-bit
Type escape sequence to abort.
Sending 150, 1400-byte ICMP Echos to 172.24.127.250, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!
Success rate is 100 percent (150/150), round-trip min/avg/max = 52/60/88 ms
Spoke 1 to Spoke 2 LAN:
RoutingLoop_R1#ping 172.31.100.3
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.31.100.3, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 44/44/44 ms
Spoke 2 to Spoke 1 Mgmt:
DMVPN_SPOKE_2#ping 172.29.254.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.29.254.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 36/46/72 ms
Traceroute from Spoke 2 to Spoke 1 Mgmt:
DMVPN_SPOKE_2#traceroute 172.29.254.1
Type escape sequence to abort.
Tracing the route to 172.29.254.1
VRF info: (vrf in name/id, vrf out name/id)
1 172.16.0.1 4 msec 8 msec 4 msec
2 172.16.0.2 48 msec * 40 msec
I tried adding an entry to the hub's outbound ACL to allow GRE and re-applied the ACL to the interface; spoke-to-spoke traffic still did not work. As a shot in the dark I then tried allowing ESP outbound as well, and still no dice. Once the outbound ACL was removed from the interface, spoke-to-spoke traffic started flowing again. Back to the drawing board to figure out how to make this work with an outbound ACL on the hub's internet interface.
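For completeness, the entries I tried were roughly the following; the sequence numbers and the broad "any any" matches are my reconstruction of the change, not a verbatim copy of the config:
ip access-list extended SELF-INTERNET
 10 permit icmp any any echo
 20 permit icmp any any echo-reply
 23 permit gre any any
 26 permit esp any any
 30 deny ip any any
With the ACL re-applied in this state, spoke-to-spoke traffic still failed, which suggests that whatever the hub actually sends out of Gi8 for this traffic isn't matching either permit.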