When I first learned about Dead Peer Detection (DPD), I assumed it was some generic IPsec keepalive and didn't put much thought into it. A few weeks ago I became more interested in DPD and started reading RFC 3706. I learned that DPD is a feature of IKE, not of IPsec itself, and started considering how the protocol may not guarantee bidirectional IPsec (ESP) functionality. This morning I verified this in my lab.
I built a GRE over IPsec tunnel between R1 and R3 and leveraged IKEv2 with periodic DPD. On R2 I used policy-based routing (PBR) to forward IKE traffic toward the multilayer switch (MLS) on the left and ESP traffic toward the MLS on the right. The segment between the multilayer switches and R3 is a broadcast network; the MLS interfaces are configured as routed ports in the same subnet as R3. This topology allowed me to disconnect the cable between MLS2 and the Ethernet switch to effectively "blackhole" ESP traffic between R1 and R3. R1's loopback IP is 1.1.1.1 and R3's loopback is 3.3.3.3; R1 and R3 have static routes to reach each other's loopback via the tunnel.
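For reference, here is a minimal sketch of what the R1 side of this setup might look like. The interface and profile names, the DPD timers, and the omitted crypto definitions are my assumptions; the addressing (tunnel source 172.16.0.1, peer 172.17.0.1, tunnel network 172.31.0.0/30) comes from the outputs below.
! Sketch only: names and timers are illustrative, not the lab's exact config.
! Periodic DPD: send a liveness query every 10 seconds, retry every 2 seconds.
crypto ikev2 dpd 10 2 periodic
!
! GRE tunnel protected by IPsec (IKEv2 profile and IPsec profile
! definitions omitted for brevity).
interface Tunnel0
 ip address 172.31.0.1 255.255.255.252
 tunnel source 172.16.0.1
 tunnel destination 172.17.0.1
 tunnel protection ipsec profile IPSEC_PROFILE
!
! Static route to R3's loopback via the tunnel.
ip route 3.3.3.3 255.255.255.255 Tunnel0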
Verifying Functionality:
With all cables connected, I verified reachability from R1 to R3's loopback and validated IPsec encapsulation and decapsulation. I used debug crypto ikev2 to confirm that DPD messages were being sent and received, and show crypto ipsec sa to confirm that packets were being encapsulated and decapsulated.
RoutingLoop_R1#ping 3.3.3.3
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 3.3.3.3, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/2/4 ms
RoutingLoop_R1#traceroute 3.3.3.3
Type escape sequence to abort.
Tracing the route to 3.3.3.3
VRF info: (vrf in name/id, vrf out name/id)
1 172.31.0.2 0 msec * 0 msec  <-- R3's tunnel IP
RoutingLoop_R1#
RoutingLoop_R1#show crypto ikev2 sa
IPv4 Crypto IKEv2 SA
Tunnel-id Local Remote fvrf/ivrf Status
5 172.16.0.1/500 172.17.0.1/500 none/none READY
Encr: AES-CBC, keysize: 256, PRF: SHA512, Hash: SHA512, DH Grp:5, Auth sign: PSK, Auth verify: PSK
Life/Active Time: 300/242 sec
IPv6 Crypto IKEv2 SA
RoutingLoop_R1#show crypto ipsec sa | inc cap
#pkts encaps: 10181, #pkts encrypt: 10181, #pkts digest: 10181
#pkts decaps: 10135, #pkts decrypt: 10135, #pkts verify: 10135
RoutingLoop_R1#
RoutingLoop_R1#debug crypto ikev2
IKEv2 default debugging is on
RoutingLoop_R1#
*Feb 22 12:22:27.355: IKEv2:(SESSION ID = 3,SA ID = 4):Sending DPD/liveness query
*Feb 22 12:22:27.355: IKEv2:(SESSION ID = 3,SA ID = 4):Building packet for encryption.
*Feb 22 12:22:27.355: IKEv2:(SESSION ID = 3,SA ID = 4):Checking if request will fit in peer window
<lines omitted for brevity>
*Feb 22 12:22:27.451: IKEv2:(SESSION ID = 3,SA ID = 4):Received DPD/liveness query
*Feb 22 12:22:27.451: IKEv2:(SESSION ID = 3,SA ID = 4):Building packet for encryption.
*Feb 22 12:22:27.451: IKEv2:(SESSION ID = 3,SA ID = 4):Sending ACK to informational exchange
RoutingLoop_R1#undebug all
Breaking Stuff:
With functionality verified, I disconnected the cable between MLS2 and the Ethernet switch. Reachability between R1's and R3's loopbacks was lost. Despite the loss of ESP/IPsec reachability, the IKEv2 SA remained in the READY state and DPD packets were still exchanged between the IKEv2 peers. R1 is not directly aware of the loss of reachability and continues to ESP-encapsulate packets and forward them toward R3.
RoutingLoop_R1#show crypto ipsec sa | inc cap
#pkts encaps: 10186, #pkts encrypt: 10186, #pkts digest: 10186
#pkts decaps: 10135, #pkts decrypt: 10135, #pkts verify: 10135
RoutingLoop_R1#ping 3.3.3.3
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 3.3.3.3, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)
RoutingLoop_R1#show crypto ipsec sa | inc cap
#pkts encaps: 10191, #pkts encrypt: 10191, #pkts digest: 10191
#pkts decaps: 10135, #pkts decrypt: 10135, #pkts verify: 10135
PBR Configuration:
RoutingLoop_R2#show ip access-lists
Extended IP access list MATCH_ESP
10 permit esp any any (10148 matches)
Extended IP access list MATCH_IKE
10 permit udp any any eq isakmp (913 matches)
RoutingLoop_R2#show route-map
route-map PBR, permit, sequence 10
Match clauses:
ip address (access-lists): MATCH_IKE
Set clauses:
ip next-hop 10.0.0.2
Policy routing matches: 915 packets, 127206 bytes
route-map PBR, permit, sequence 20
Match clauses:
ip address (access-lists): MATCH_ESP
Set clauses:
ip next-hop 10.0.0.6
Policy routing matches: 10148 packets, 1948224 bytes
route-map PBR, permit, sequence 30
Match clauses:
Set clauses:
Policy routing matches: 336 packets, 15043 bytes
RoutingLoop_R2#
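For completeness, here is a sketch of the R2 configuration that would produce the output above. The ACL entries, route-map sequences, and next-hops come directly from the show commands; the ingress interface name is an assumption.
! Reconstructed from the show output above; the interface name is assumed.
ip access-list extended MATCH_IKE
 permit udp any any eq isakmp
ip access-list extended MATCH_ESP
 permit esp any any
!
route-map PBR permit 10
 match ip address MATCH_IKE
 set ip next-hop 10.0.0.2
route-map PBR permit 20
 match ip address MATCH_ESP
 set ip next-hop 10.0.0.6
route-map PBR permit 30
!
! Apply the policy on the interface facing R1 (name assumed).
interface GigabitEthernet0/0
 ip policy route-map PBR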
Conclusions and Thoughts:
I suppose the condition described in this article could occur in equal-cost multipath networks without policy-based routing; PBR was used in this demonstration only to guarantee deterministic traffic forwarding. Relying on DPD to guarantee peer liveness and reachability may leave your network vulnerable to blackholed IPsec traffic and unidirectional (outbound-only) traffic flows. Adding another keepalive protocol inside IPsec, or using something like IP SLA, would consume yet more bandwidth, potentially increase state and interaction between IPsec peers, or open a new attack surface. It's always a tradeoff.
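To illustrate the IP SLA option, here is a rough sketch on R1 that probes R3's tunnel IP (172.31.0.2, taken from the traceroute above) across the protected tunnel and withdraws the static route to R3's loopback when the probe fails. The probe source, timers, and object numbers are all assumptions.
! Sketch only: probe R3's tunnel interface through the tunnel every 10 seconds.
ip sla 1
 icmp-echo 172.31.0.2 source-ip 172.31.0.1
 frequency 10
ip sla schedule 1 life forever start-time now
!
! Track the probe and tie the static route to it.
track 1 ip sla 1 reachability
ip route 3.3.3.3 255.255.255.255 Tunnel0 track 1
Because the probe rides the tunnel, it exercises the actual ESP path; when end-to-end reachability fails, the tracked route is withdrawn and traffic can fail over to a backup path (if one exists) instead of being silently blackholed. It also adds exactly the probe traffic and state that the tradeoff above describes.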