Faster EIGRP Feasible Successor Failover

A feasible successor is an alternate next hop that EIGRP precomputes for a destination so that failover can be fast. The feasible successor route is stored in the EIGRP topology table but is not installed in the RIB/FIB by default. The variance command can be used to promote the feasible successor to the routing table, and traffic is then load balanced across the paths in inverse proportion to their metrics. These features of the protocol are well taught and well understood by many, so I won’t get too deep into the details of how they work. If unequal cost load balancing is not desired but fast failover to a feasible successor route is, there may be a way to make failover faster using a lesser-known feature.

Demo Network

To test the theory, I set up a simple four-router topology: three Cisco 1921s in a triangle and a Raspberry Pi hanging off the top of R1. The Pi’s main job is acting as an NTP server so that logs across all routers have accurate timestamps. The Pi also participates in EIGRP thanks to Free Range Routing’s (FRR) eigrpd. All links are Gigabit Ethernet except the link between R1 and R3, which is a serial T1 line. The focus is on R3’s reachability and convergence toward R1’s loopback, 198.51.100.0/24. Diagram drawn on diagrams.net.

Routing From R3 to 198.51.100.0/24

R3 has two ways of reaching R1’s loopback prefix: the two-hop Gigabit path through R2 and the 1.544 Mbps T1 link directly to R1. By default, EIGRP installs the route via R2 in the RIB and elects the route via the T1 link as a feasible successor. Without additional configuration, the feasible successor is not installed in the routing table. It is kept in the topology table as an alternate path for faster failover if the successor route goes away. The output below confirms that the route via Serial0/0/0 meets the feasibility condition: its reported distance (128256) is less than the feasible distance (131072) of the route via GigabitEthernet0/0.

RoutingLoop_R3#show ip eigrp topology 198.51.100.0/24
EIGRP-IPv4 Topology Entry for AS(1)/ID(172.16.0.2) for 198.51.100.0/24
  State is Passive, Query origin flag is 1, 1 Successor(s), FD is 131072
  Descriptor Blocks:
  10.0.0.5 (GigabitEthernet0/0), from 10.0.0.5, Send flag is 0x0
      Composite metric is (131072/130816), route is Internal
      Vector metric:
        Minimum bandwidth is 1000000 Kbit
        Total delay is 5020 microseconds
        Reliability is 255/255
        Load is 1/255
        Minimum MTU is 1500
        Hop count is 2
        Originating router is 198.51.100.1
  172.16.0.1 (Serial0/0/0), from 172.16.0.1, Send flag is 0x0
      Composite metric is (2297856/128256), route is Internal
      Vector metric:
        Minimum bandwidth is 1544 Kbit
        Total delay is 25000 microseconds
        Reliability is 255/255
        Load is 1/255
        Minimum MTU is 1500
        Hop count is 1
        Originating router is 198.51.100.1
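
The numbers in this output can be reproduced by hand. Below is a minimal Python sketch, assuming the classic (non-wide) EIGRP metric with default K values, which is what the composite metrics above appear to use. The function name classic_metric is my own and not part of any tool; the bandwidth, delay, and reported distance values are copied from the output.

```python
def classic_metric(min_bw_kbit: int, total_delay_usec: int) -> int:
    """Classic EIGRP composite metric, assuming default K values (K1 = K3 = 1)."""
    bw_term = 10_000_000 // min_bw_kbit   # scaled inverse bandwidth, truncated
    delay_term = total_delay_usec // 10   # delay in tens of microseconds
    return 256 * (bw_term + delay_term)

successor_fd = classic_metric(1_000_000, 5020)  # path via GigabitEthernet0/0
serial_metric = classic_metric(1544, 25000)     # path via Serial0/0/0
serial_rd = 128256                              # reported distance from the output

print(successor_fd)              # 131072
print(serial_metric)             # 2297856
print(serial_rd < successor_fd)  # True: feasibility condition is met
```

The last line is the feasibility condition itself: the neighbor's reported distance has to be strictly lower than the current feasible distance.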

Promoting the Feasible Successor to the RIB for Unequal Cost Load Balancing

Because the T1 path is so much slower than the GigE path, a variance multiplier of 18 is required to install the feasible successor route in the routing table: its metric of 2297856 is roughly 17.5 times the successor’s metric of 131072. After issuing the “variance 18” command under the EIGRP configuration, both routes are installed in the routing table and traffic is load balanced across both paths according to their traffic share counts. Notice the traffic share counts (7 versus 120) and that the route via the serial link is allocated only 1 of the 16 hash buckets.

RoutingLoop_R3(config-router)#variance 18

RoutingLoop_R3#show ip route 198.51.100.0 255.255.255.0
Routing entry for 198.51.100.0/24
  Known via "eigrp 1", distance 90, metric 131072, type internal
  Redistributing via eigrp 1
  Last update from 172.16.0.1 on Serial0/0/0, 00:01:20 ago
  Routing Descriptor Blocks:
    172.16.0.1, from 172.16.0.1, 00:01:20 ago, via Serial0/0/0
      Route metric is 2297856, traffic share count is 7
      Total delay is 25000 microseconds, minimum bandwidth is 1544 Kbit
      Reliability 255/255, minimum MTU 1500 bytes
      Loading 1/255, Hops 1
  * 10.0.0.5, from 10.0.0.5, 00:01:20 ago, via GigabitEthernet0/0
      Route metric is 131072, traffic share count is 120
      Total delay is 5020 microseconds, minimum bandwidth is 1000000 Kbit
      Reliability 255/255, minimum MTU 1500 bytes
      Loading 1/255, Hops 2


RoutingLoop_R3#show ip cef 198.51.100.0/24 internal
198.51.100.0/24, epoch 0, RIB[I], refcnt 5, per-destination sharing
  sources: RIB
  ifnums:
    GigabitEthernet0/0(3): 10.0.0.5
    Serial0/0/0(5)
  path list 30CC9D98, 3 locks, per-destination, flags 0x49 [shble, rif, hwcn]
    path 2D185E5C, share 7/7, type attached nexthop, for IPv4
      nexthop 172.16.0.1 Serial0/0/0, IP adj out of Serial0/0/0 30FFD040
    path 2D185C40, share 120/120, type attached nexthop, for IPv4
      nexthop 10.0.0.5 GigabitEthernet0/0, IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0
  output chain:
    loadinfo 30E1DD18, per-session, 2 choices, flags 0003, 5 locks
      flags [Per-session, for-rx-IPv4]
      16 hash buckets
        < 0 > IP adj out of Serial0/0/0 30FFD040
        < 1 > IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0
        < 2 > IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0
        < 3 > IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0
        < 4 > IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0
        < 5 > IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0
        < 6 > IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0
        < 7 > IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0
        < 8 > IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0
        < 9 > IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0
        <10 > IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0
        <11 > IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0
        <12 > IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0
        <13 > IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0
        <14 > IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0
        <15 > IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0
      Subblocks:
        None
RoutingLoop_R3#
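
For reference, here is how the multiplier of 18 falls out of the two metrics. This is a rough Python sketch, assuming a candidate path qualifies when its metric is strictly less than variance times the best metric. The helper name min_variance is hypothetical; if the comparison is inclusive on your platform, a ceiling division would be the right substitute.

```python
def min_variance(best_metric: int, candidate_metric: int) -> int:
    """Smallest integer variance with candidate_metric < variance * best_metric.

    Assumes a strict comparison; this is an assumption, not confirmed behavior.
    """
    return candidate_metric // best_metric + 1

best = 131072       # metric via GigabitEthernet0/0
candidate = 2297856 # metric via Serial0/0/0

print(round(candidate / best, 2))     # 17.53
print(min_variance(best, candidate))  # 18
```

A variance of 17 only covers metrics below 2228224, which is why 18 is the smallest value that works in this topology.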

Feasible Successor Failover Without Load Balancing

Suppose that you do want to take advantage of the feasible successor feature but do not want unequal cost load balancing. The most common way of doing this is to simply not issue the variance command when one or more feasible successor routes exist. In this scenario, the feasible successor route will be stored in the EIGRP topology table but will not be installed in the RIB/FIB until the successor route is revoked.

RoutingLoop_R3#show ip eigrp topology 198.51.100.0/24
EIGRP-IPv4 Topology Entry for AS(1)/ID(172.16.0.2) for 198.51.100.0/24
  State is Passive, Query origin flag is 1, 1 Successor(s), FD is 131072
  Descriptor Blocks:
  10.0.0.5 (GigabitEthernet0/0), from 10.0.0.5, Send flag is 0x0
      Composite metric is (131072/130816), route is Internal
      Vector metric:
        Minimum bandwidth is 1000000 Kbit
        Total delay is 5020 microseconds
        Reliability is 255/255
        Load is 1/255
        Minimum MTU is 1500
        Hop count is 2
        Originating router is 198.51.100.1
  172.16.0.1 (Serial0/0/0), from 172.16.0.1, Send flag is 0x0
      Composite metric is (2297856/128256), route is Internal
      Vector metric:
        Minimum bandwidth is 1544 Kbit
        Total delay is 25000 microseconds
        Reliability is 255/255
        Load is 1/255
        Minimum MTU is 1500
        Hop count is 1
        Originating router is 198.51.100.1
RoutingLoop_R3#show ip cef 198.51.100.0/24 internal
198.51.100.0/24, epoch 0, RIB[I], refcnt 5, per-destination sharing
  sources: RIB
  ifnums:
    GigabitEthernet0/0(3): 10.0.0.5
  path list 30CC9D98, 7 locks, per-destination, flags 0x49 [shble, rif, hwcn]
    path 2D185C40, share 1/1, type attached nexthop, for IPv4
      nexthop 10.0.0.5 GigabitEthernet0/0, IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0
  output chain:
    IP adj out of GigabitEthernet0/0, addr 10.0.0.5 30FFCEE0

To test failover to the feasible successor during a direct link failure, I disconnected the cable between R2 and R3. This caused the route to R1’s loopback (198.51.100.0/24) via the GigE path to be revoked. Debugging of CEF events was enabled before bringing the link down, and the output for the interesting prefix is shown below. Elapsed time from the LINK down event to FIB reconvergence for 198.51.100.0/24 was 16 ms. This test was repeated several times with the same result.

RoutingLoop_R3#
Jul 29 19:18:59.705: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to down
Jul 29 19:19:00.705: %LINK-3-UPDOWN: Interface GigabitEthernet0/0, changed state to down

Jul 29 19:19:00.717: FIBfib: [Default:198.51.100.0/24] RIB: Route del, 198.51.100.0/24 (none) {}
Jul 29 19:19:00.717: FIBfib: [Default:198.51.100.0/24] Remove source RIB for prefix 198.51.100.0/24. (Last removal=<none>, depth=0)
Jul 29 19:19:00.717: FIBfib: [Default:198.51.100.0/24] Removing source RIB
Jul 29 19:19:00.717: FIBfib: [Default:198.51.100.0/24] Recomputing forwarding for entry with sources <none>, and cover 0.0.0.0/0
Jul 29 19:19:00.717: FIBfib: [Default:198.51.100.0/24] Invalidating
Jul 29 19:19:00.717: FIBfib: [Default:~198.51.100.0/24] Change path_list, new NULL old 1/0:v4-anh-10.0.0.5-Gi0/0 30CC9D98
Jul 29 19:19:00.717: FIBfib: [Default:~198.51.100.0/24] Prefix uses per-destination load balancing and contains no local and no outgoing labels fib_entry->path_list: 0x0
Jul 29 19:19:00.717: FIBfib: [Default:~198.51.100.0/24] Set output chain to unresolved (was IP adj out of GigabitEthernet0/0, addr 10.0.0.5 (incomplete))
Jul 29 19:19:00.717: FIBfib: [Default:~198.51.100.0/24] Recalculating dflt chains (flags 00000000):
        old: IP adj out of GigabitEthernet0/0, addr 10.0.0.5 (incomplete)/<NULL>/<NULL>
        new: unresolved/<NULL>/<NULL>
Jul 29 19:19:00.717: FIBfib: [Default:~198.51.100.0/24] Recalculating sr chains (flags 00000000):
        old: /<NULL>/<NULL>
        new: unresolved/<NULL>/<NULL>
Jul 29 19:19:00.717: FIBfib: [Default:~198.51.100.0/24] Removed last source (RIB), so deleting fib entry.
Jul 29 19:19:00.717: FIBfib: [Default:~198.51.100.0/24] Requesting delete of entry
Jul 29 19:19:00.717: FIBfib: [Default:~198.51.100.0/24] Performing delete
Jul 29 19:19:00.717: FIBfib: [Default:~198.51.100.0/24] Deleting fib
Jul 29 19:19:00.717: FIBfib: [Default:~198.51.100.0/24] Removing from tree
Jul 29 19:19:00.717: FIBfib: [Default:~198.51.100.0/24] Prefix deleted
Jul 29 19:19:00.717: FIBfib: [Default:~198.51.100.0/24] Uninitialising entry
Jul 29 19:19:00.717: FIBfib: [Default:~198.51.100.0/24] Set output chain to <NULL> (was unresolved)
Jul 29 19:19:00.717: FIBfib: [Default:~198.51.100.0/24] Uninitialising tree item data
Jul 29 19:19:00.717: FIBfib: [Default:198.51.100.0/24] RIB: Route ins, 198.51.100.0/24 (none) {172.16.0.1[0x0] Se0/0/0 (none) #1[lbl ]}
Jul 29 19:19:00.717: FIBfib: [Default:198.51.100.0/24] First source RIB supplied for new prefix
Jul 29 19:19:00.721: FIBfib: [Default:~198.51.100.0/24] Created fib
Jul 29 19:19:00.721: FIBfib: [Default:~198.51.100.0/24] Set output chain to unresolved (was <NULL>)
Jul 29 19:19:00.721: FIBfib: [Default:~198.51.100.0/24] Prefix inserted
Jul 29 19:19:00.721: FIBfib: [Default:~198.51.100.0/24] Source RIB being added
Jul 29 19:19:00.721: FIBfib: [Default:~198.51.100.0/24] FIB entry needs to be updated
Jul 29 19:19:00.721: FIBfib: [Default:~198.51.100.0/24] Recomputing forwarding for entry with sources RIB, and cover 0.0.0.0/0
Jul 29 19:19:00.721: FIBfib: [Default:~198.51.100.0/24] <src:RIB> contributing fwding
Jul 29 19:19:00.721: FIBfib: [Default:~198.51.100.0/24] Change path_list, new 1/0:v4-anh-172.16.0.1-Se0/0/0 30CC9CF8 old NULL
Jul 29 19:19:00.721: FIBfib: [Default:~198.51.100.0/24] <src:RIB> no mpls extensions needed all_rib:0, is_sr_path:0
Jul 29 19:19:00.721: FIBfib: [Default:~198.51.100.0/24] Prefix uses per-destination load balancing and contains no local and no outgoing labels fib_entry->path_list: 0x30CC9CF8
Jul 29 19:19:00.721: FIBfib: [Default:~198.51.100.0/24] Set output chain to IP adj out of Serial0/0/0 30FFD040 (was unresolved)
Jul 29 19:19:00.721: FIBfib: [Default:~198.51.100.0/24] Recalculating dflt chains (flags 00000000):
        old: unresolved/<NULL>/<NULL>
        new: IP adj out of Serial0/0/0 30FFD040/<NULL>/<NULL>
Jul 29 19:19:00.721: FIBfib: [Default:~198.51.100.0/24] Recalculating sr chains (flags 00000000):
        old: /<NULL>/<NULL>
        new: IP adj out of Serial0/0/0 30FFD040/<NULL>/<NULL>
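
The 16 ms figure is simply the gap between the LINK-3-UPDOWN timestamp and the last FIBfib timestamp. A small Python sketch of that subtraction follows, assuming the standard IOS "Mon DD HH:MM:SS.mmm" timestamp layout. The function names are mine, and the year is an arbitrary placeholder because the log lines do not carry one.

```python
from datetime import datetime, timedelta

PLACEHOLDER_YEAR = "2000"  # assumption: logs omit the year, any year works here

def parse_ts(stamp: str) -> datetime:
    """Parse an IOS log timestamp, ignoring a leading '.' or '*' clock marker."""
    cleaned = stamp.lstrip(".*")
    return datetime.strptime(f"{PLACEHOLDER_YEAR} {cleaned}", "%Y %b %d %H:%M:%S.%f")

def elapsed_ms(start: str, end: str) -> int:
    """Whole milliseconds between two log timestamps."""
    return (parse_ts(end) - parse_ts(start)) // timedelta(milliseconds=1)

link_down = "Jul 29 19:19:00.705"  # %LINK-3-UPDOWN
fib_done = "Jul 29 19:19:00.721"   # last FIBfib line for 198.51.100.0/24

print(elapsed_ms(link_down, fib_done))  # 16
```

Keep in mind that the debug timestamps only have millisecond resolution, so small differences between runs should be read with that granularity in mind.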

Faster Failover?

There is a combination of features that allows feasible successor routes to be installed in the RIB/FIB without being used for load balancing. The variance command promotes unequal cost paths to the RIB, and on its own it enables unequal cost multipath load balancing. Adding the traffic-share min across-interfaces command keeps the promoted paths installed but sends traffic only over the path with the lowest metric. Together, the two commands put the feasible successor in the RIB/FIB without using it for forwarding unless the successor route goes away. The output below confirms that the traffic share count for the serial path is 0 despite being installed in the routing table. Will this improve convergence time?

RoutingLoop_R3#show run | section eigrp
router eigrp 1
 traffic-share min across-interfaces
 variance 18
 network 0.0.0.0
RoutingLoop_R3#

RoutingLoop_R3#show ip route 198.51.100.0 255.255.255.0
Routing entry for 198.51.100.0/24
  Known via "eigrp 1", distance 90, metric 131072, type internal
  Redistributing via eigrp 1
  Last update from 10.0.0.5 on GigabitEthernet0/0, 00:00:29 ago
  Routing Descriptor Blocks:
    172.16.0.1, from 172.16.0.1, 00:00:29 ago, via Serial0/0/0
      Route metric is 2297856, traffic share count is 0
      Total delay is 25000 microseconds, minimum bandwidth is 1544 Kbit
      Reliability 255/255, minimum MTU 1500 bytes
      Loading 1/255, Hops 1
  * 10.0.0.5, from 10.0.0.5, 00:00:29 ago, via GigabitEthernet0/0
      Route metric is 131072, traffic share count is 1
      Total delay is 5020 microseconds, minimum bandwidth is 1000000 Kbit
      Reliability 255/255, minimum MTU 1500 bytes
      Loading 1/255, Hops 2
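
To make the interaction between the two commands concrete, here is a toy Python model of what the routing table shows above. It is only a sketch under my own assumptions: every path passed in has already met the feasibility condition, the variance comparison is strict, and the function name is made up for illustration. It does not attempt to reproduce how IOS computes share counts when traffic-share min is not configured.

```python
def share_counts_with_min(paths: dict[str, int], variance: int) -> dict[str, int]:
    """Toy model of variance plus traffic-share min across-interfaces.

    paths maps next hop -> metric for the successor and its feasible successors.
    Paths within the variance are installed; only the best metric gets a share.
    """
    best = min(paths.values())
    installed = {
        next_hop: metric
        for next_hop, metric in paths.items()
        if metric == best or metric < variance * best  # strict comparison assumed
    }
    return {next_hop: (1 if metric == best else 0) for next_hop, metric in installed.items()}

paths = {"10.0.0.5": 131072, "172.16.0.1": 2297856}

print(share_counts_with_min(paths, variance=18))  # {'10.0.0.5': 1, '172.16.0.1': 0}
print(share_counts_with_min(paths, variance=1))   # {'10.0.0.5': 1}
```

With variance 18 the serial path is present with a share of 0, matching the show ip route output. With the default variance of 1 it is not installed at all.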

I repeated the cable disconnect test on the link between R2 and R3 several times with CEF event debugging enabled. Not only is there less output, because the FIB entry is modified in place instead of being deleted and recreated, but convergence on this prefix is also consistently 12 ms. That is 4 ms (25%) faster than the standard feasible successor failover.

RoutingLoop_R3#
Jul 29 19:13:19.713: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to down
Jul 29 19:13:20.701: %LINK-3-UPDOWN: Interface GigabitEthernet0/0, changed state to down

.Jul 29 19:13:20.713: FIBfib: [Default:198.51.100.0/24] RIB: Route mod, 198.51.100.0/24 (none) {172.16.0.1[0x0] Se0/0/0 (none) #1[lbl ]}
.Jul 29 19:13:20.713: FIBfib: [Default:198.51.100.0/24] Source RIB being modified
.Jul 29 19:13:20.713: FIBfib: [Default:198.51.100.0/24] Modification: path list changed
.Jul 29 19:13:20.713: FIBfib: [Default:198.51.100.0/24] FIB entry needs to be updated
.Jul 29 19:13:20.713: FIBfib: [Default:198.51.100.0/24] Recomputing forwarding for entry with sources RIB, and cover 0.0.0.0/0
.Jul 29 19:13:20.713: FIBfib: [Default:198.51.100.0/24] <src:RIB> contributing fwding
.Jul 29 19:13:20.713: FIBfib: [Default:198.51.100.0/24] Change path_list, new 1/0:v4-anh-172.16.0.1-Se0/0/0 30CC9C58 old 1/0:v4-anh-10.0.0.5-Gi0/0 30CC9D48
.Jul 29 19:13:20.713: FIBfib: [Default:198.51.100.0/24] <src:RIB> no mpls extensions needed all_rib:0, is_sr_path:0
.Jul 29 19:13:20.713: FIBfib: [Default:198.51.100.0/24] Informing interested parties that recursion tree has changed for this entry
.Jul 29 19:13:20.713: FIBfib: [Default:198.51.100.0/24] Prefix uses per-destination load balancing and contains no local and no outgoing labels fib_entry->path_list: 0x30CC9C58
.Jul 29 19:13:20.713: FIBfib: [Default:198.51.100.0/24] Set output chain to IP adj out of Serial0/0/0 30FFD040 (was IP adj out of GigabitEthernet0/0, addr 10.0.0.5 (incomplete))
.Jul 29 19:13:20.713: FIBfib: [Default:198.51.100.0/24] Recalculating dflt chains (flags 00000000):
        old: IP adj out of GigabitEthernet0/0, addr 10.0.0.5 (incomplete)/<NULL>/<NULL>
        new: IP adj out of Serial0/0/0 30FFD040/<NULL>/<NULL>
.Jul 29 19:13:20.713: FIBfib: [Default:198.51.100.0/24] Recalculating sr chains (flags 00000000):
        old: /<NULL>/<NULL>
        new: IP adj out of Serial0/0/0 30FFD040/<NULL>/<NULL>

Keep in mind that I’m using old routers that are probably much slower than modern hardware so your mileage may vary. I also only have a small number of routes to converge on in this test topology. The major delay in convergence is interface carrier delay. There is a 1 second delay between the line protocol being declared down and the link being declared down. Routing protocol convergence seems to be triggered by the link down event. If my platform supports it, I will try enabling BFD and repeat this experiment. I’d also like to reuse this topology for indirect link failure convergence measurements.

Jul 29 19:33:14.715: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to down
Jul 29 19:33:15.715: %LINK-3-UPDOWN: Interface GigabitEthernet0/0, changed state to down
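
To put the carrier delay in perspective, the sketch below splits each of the two failover tests into the wait between the LINEPROTO and LINK messages and the FIB convergence that follows. The timestamps are copied from the debug output of the two tests earlier in the post. The helper to_ms and the run labels are my own, and the arithmetic assumes all three events of a run fall on the same day.

```python
def to_ms(clock: str) -> int:
    """Convert HH:MM:SS.mmm to milliseconds since midnight."""
    hours, minutes, seconds = clock.split(":")
    whole, millis = seconds.split(".")
    return ((int(hours) * 60 + int(minutes)) * 60 + int(whole)) * 1000 + int(millis)

# (LINEPROTO down, LINK down, last FIBfib event) for each test run
runs = {
    "feasible successor only": ("19:18:59.705", "19:19:00.705", "19:19:00.721"),
    "variance + traffic-share min": ("19:13:19.713", "19:13:20.701", "19:13:20.713"),
}

for name, (lineproto, link, fib_done) in runs.items():
    wait = to_ms(link) - to_ms(lineproto)
    converge = to_ms(fib_done) - to_ms(link)
    print(f"{name}: waited {wait} ms, converged in {converge} ms")
# feasible successor only: waited 1000 ms, converged in 16 ms
# variance + traffic-share min: waited 988 ms, converged in 12 ms
```

In both runs the wait before the LINK event is around a second, so the 4 ms saved in the FIB update is a small part of the total outage on this hardware.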

Bug?

During this experiment I may have encountered a bug that I suspect is in the version of FRR I’m using. I noticed that when I disconnected the link between R2-R3, the prefix for this link remained in the EIGRP topology table and that a routing loop was present. Interestingly, the topology table entry did not include an “originating router” declaration.

RoutingLoop_R3#show ip eigrp topology 10.0.0.4/30
EIGRP-IPv4 Topology Entry for AS(1)/ID(172.16.0.2) for 10.0.0.4/30
  State is Passive, Query origin flag is 1, 1 Successor(s), FD is 2173184
  Descriptor Blocks:
  172.16.0.1 (Serial0/0/0), from 172.16.0.1, Send flag is 0x0
      Composite metric is (2173184/5888), route is Internal
      Vector metric:
        Minimum bandwidth is 1544 Kbit
        Total delay is 20130 microseconds
        Reliability is 255/255
        Load is 1/255
        Minimum MTU is 1280
        Hop count is 3 
RoutingLoop_R3#

Traceroute to the prefix loops between R1 and the Raspberry Pi

RoutingLoop_R3#traceroute 10.0.0.4
Type escape sequence to abort.
Tracing the route to 10.0.0.4
VRF info: (vrf in name/id, vrf out name/id)
  1 172.16.0.1 4 msec 0 msec 0 msec
  2 172.30.51.6 4 msec 0 msec 0 msec
  3 172.30.51.5 4 msec 0 msec 0 msec
  4 172.30.51.6 4 msec 0 msec 0 msec
  5 172.30.51.5 4 msec 0 msec 0 msec
  6  *
    172.30.51.6 0 msec 0 msec
  7 172.30.51.5 4 msec 0 msec 4 msec
  8 172.30.51.6 0 msec *  0 msec
  9 172.30.51.5 0 msec 4 msec 0 msec
 10 172.30.51.6 4 msec 0 msec *

~ Lines omitted for brevity ~ 

I performed a soft reset of R1’s EIGRP neighbors and the “ghost prefix” disappeared from R3’s topology table. I found this behavior to be repeatable. In the future I may try to debug further to understand this issue.

RoutingLoop_R1#clear ip eigrp neighbors soft
RoutingLoop_R1#
RoutingLoop_R3#show ip eigrp topology 10.0.0.4/30
EIGRP-IPv4 Topology Entry for AS(1)/ID(172.16.0.2)
%Entry 10.0.0.4/30 not in topology table
