IT Week

IP networks benefit from many forms of built-in resilience but can fail to deliver that same resilience to IP hosts. Stuart Mark looks at a router-based solution to the ‘Default Router’ problem.

The IP protocol suite has been the workhorse of networking and the Internet for many years. The connectionless nature of IP allows traffic to flow dynamically across a network through the use of routing protocols such as RIP or OSPF. These give IP routers the ability to maintain and advertise more than one route to any given destination on a network. Load balancing and redundancy can therefore be designed into a network, resulting in a robust end-user service on an infrastructure that is more tolerant of failures.

However, because of the way IP works, it's not always easy to carry this level of resilience through to the areas of the network that accommodate those same end users. 

IP End to End

When an IP node wishes to communicate with another IP node, it performs a number of steps. First it identifies the destination address, either from a static configuration in an application or hosts file or from DNS. It then compares this address with its own, taking the configured netmask into consideration.

If the destination and source addresses belong to the same subnet, the node knows that the destination is local and will use Address Resolution Protocol (ARP) to identify the target MAC address and hence make contact.

However, if the source and destination addresses are on disparate subnets, the source node knows that traffic must pass through at least one IP router to reach its destination. Consequently, the source must identify a router to which it can send the traffic bound for the target network.
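
The decision a host makes here can be sketched in a few lines of Python. This is only an illustration using the standard ipaddress module, not the code of any real IP stack; the function name next_hop and the example addresses are assumptions made for the sketch.

```python
import ipaddress

def next_hop(source_ip, netmask, destination_ip, default_router):
    """Return the address the host must resolve with ARP to deliver a packet."""
    # Build the host's own subnet from its address and configured netmask.
    local_net = ipaddress.ip_network(f"{source_ip}/{netmask}", strict=False)
    if ipaddress.ip_address(destination_ip) in local_net:
        # Same subnet: ARP for the destination itself.
        return destination_ip
    # Different subnet: hand the packet to the configured default router.
    return default_router

# Hypothetical host 192.168.1.10/24 with default router 192.168.1.1.
print(next_hop("192.168.1.10", "255.255.255.0", "192.168.1.20", "192.168.1.1"))
print(next_hop("192.168.1.10", "255.255.255.0", "10.0.0.5", "192.168.1.1"))
```

The first call returns the destination itself because it is local; the second returns the default router because the destination lies on another subnet.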

This reliance on router identification is one of IP's major drawbacks because it requires every host on a local network to know the address of at least one router. Providing this information at setup time is not in itself much of a problem. However, when a router or interface fails and hosts must find a backup path, it can be very difficult to inform all hosts on a subnet of the failure efficiently.

There are a number of mechanisms by which hosts discover routers, each one reacting to failure differently.   

Router Discovery

The most common method of providing router information to a host is to configure it statically. All IP stacks have a 'Default Router' or 'Default Gateway' parameter that refers to the address of a router to be used for all non-local traffic. This can be configured manually along with an IP address and mask or can be supplied by a Dynamic Host Configuration Protocol (DHCP) server. The concept works fine until the default router fails. The loss of a router interface connecting to a different network will be handled by the ICMP redirect mechanism, but a local interface failure or a complete router failure will leave all the hosts on the subnet stranded.

Some stacks, including Microsoft's, provide the ability to configure more than one default router, the idea being that should the first router in the list fail, the second will be used, and so on. The problem here is that the host will only switch to the second default router if it can detect the failure of the first, for which it too relies on ICMP. Therefore, as with a single default gateway, backup only works with a remote interface failure. The only benefit over ICMP redirect with a single default gateway is that traffic does not have to travel via the original router to reach the redundant path; the host sends it directly to the second router in the list.

A different form of the default router is the static route where, rather than pointing all traffic towards the same router, a number of routes to specific networks can be defined via the most relevant routers. While this may make better use of routers, it suffers from the same problem: a host still cannot detect a router failure and so has no redundant path.
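
A host with static routes chooses its next hop by matching the destination against each configured network and preferring the most specific match. The following Python sketch illustrates this; the table contents, addresses and the function name lookup are hypothetical.

```python
import ipaddress

# Hypothetical host routing table: (destination network, next-hop router).
STATIC_ROUTES = [
    (ipaddress.ip_network("10.1.0.0/16"), "192.168.1.2"),
    (ipaddress.ip_network("10.1.5.0/24"), "192.168.1.3"),
    (ipaddress.ip_network("0.0.0.0/0"), "192.168.1.1"),  # default route
]

def lookup(destination, routes=STATIC_ROUTES):
    """Pick the next hop whose network matches with the longest prefix."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if dest in net]
    if not matches:
        return None
    # The most specific (longest) prefix wins.
    net, hop = max(matches, key=lambda entry: entry[0].prefixlen)
    return hop

print(lookup("10.1.5.9"))    # matched by the /24 route
print(lookup("172.16.0.1"))  # falls through to the default route
```

Nothing in such a table records whether a next-hop router is still alive, which is why static routes on their own offer no redundant path.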

Some hosts are able to run full or listen-only (snooping) versions of routing protocols such as RIP or OSPF, for example through a routing daemon such as GateD. This approach is often used on powerful midrange systems such as UNIX hosts (Sun workstations, RS/6000s and so on) but is rarely used on the PC because of the processing overhead or simple lack of availability on a particular platform.

The ICMP router discovery client (ICMP DISC) is a method whereby, as its name suggests, hosts can use the Internet Control Message Protocol to discover local routers. This has the benefit of not only identifying a router to a host but also enabling the host to detect a failure and locate a secondary router. However, every host on a subnet must use ICMP messages to monitor the status of all local routers, which can mean increased load on the network and the routers attached to it. To combat this, timers are generally increased, which has the knock-on effect of increasing failure detection times. Even though it provides a degree of resilience, ICMP DISC is less popular than default router configuration because of the increase in host configuration complexity and the availability of a router-based solution to the redundancy problem.
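
The trade-off between timers and failure detection can be shown with a small model. The Python sketch below is loosely based on the preference and lifetime values carried in ICMP router advertisements; the class name RouterCache, the addresses and the 1800-second lifetime are illustrative assumptions, and a real client does considerably more.

```python
class RouterCache:
    """Host-side list of discovered routers, aged out by advertisement lifetime."""

    def __init__(self):
        self._entries = {}  # router address -> (preference, expiry time)

    def advertisement(self, router, preference, lifetime, now):
        # Each advertisement refreshes the router's expiry time.
        self._entries[router] = (preference, now + lifetime)

    def default_router(self, now):
        # Only routers whose lifetime has not yet expired are usable.
        live = {r: pref for r, (pref, expiry) in self._entries.items() if expiry > now}
        if not live:
            return None
        # Choose the live router with the highest preference.
        return max(live, key=live.get)

cache = RouterCache()
cache.advertisement("192.168.1.1", preference=10, lifetime=1800, now=0)
cache.advertisement("192.168.1.2", preference=5, lifetime=1800, now=0)
# Router .2 keeps advertising; router .1 fails silently just after time 0.
cache.advertisement("192.168.1.2", preference=5, lifetime=1800, now=600)
cache.advertisement("192.168.1.2", preference=5, lifetime=1800, now=1200)

print(cache.default_router(now=1000))  # still the failed router
print(cache.default_router(now=1800))  # the backup, once the entry has expired
```

In this model the host keeps sending traffic to the failed router until its entry expires, so a longer lifetime reduces advertisement traffic but delays failover by the same amount.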

A Different Approach to Router Redundancy

A more common method of providing redundancy to IP hosts is to move the responsibility for finding a backup path away from the host to the local routers on a subnet. There are a number of ways of achieving this, most notably proprietary solutions such as Digital’s IP Standby Protocol (IPSTB) and Cisco’s Hot Standby Router Protocol (HSRP), and the standards-based Virtual Router Redundancy Protocol (VRRP), described in RFC 2338.

Each solution operates in almost the same way, by allowing routers to adopt a ‘virtual’ IP address that a PC can reference as its default router.

Two routers on a subnet will each have distinct interface IP and MAC addresses but they will also share a third IP address and a virtual MAC address. One router will adopt the virtual IP and MAC addresses, and all PCs on the subnet will be configured with that IP address as their default router. All traffic will flow through this router, known as the ‘active’ router in HSRP or the ‘master’ in VRRP. The second router will not participate in the forwarding of any host traffic but will communicate with the master using IP multicast.
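
In VRRP the shared MAC address is not chosen by the administrator but derived from the virtual router identifier (VRID): the protocol reserves the address 00-00-5E-00-01-{VRID} for this purpose. The short Python sketch below shows the derivation; the function name vrrp_virtual_mac is an assumption made for illustration.

```python
def vrrp_virtual_mac(vrid):
    """Return the VRRP virtual router MAC address, 00-00-5E-00-01-{VRID}."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be between 1 and 255")
    # The final octet is the VRID written as two hexadecimal digits.
    return f"00-00-5E-00-01-{vrid:02X}"

print(vrrp_virtual_mac(1))
print(vrrp_virtual_mac(10))
```

Because the address depends only on the VRID, whichever router is currently master answers ARP requests for the virtual IP address with the same MAC address, so host ARP caches stay valid across a failover.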

Should the master router fail, the secondary router will detect this and assume the virtual IP and MAC addresses. Hosts on the network do not have to reconfigure their default router and are largely unaware that a router has failed.
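
How quickly the secondary router reacts depends on its timers. In VRRP a backup declares the master down after three advertisement intervals plus a skew time of (256 - priority) / 256 seconds. The Python sketch below models only that timer; the class name BackupRouter and its methods are hypothetical, and preemption, elections between several backups and priority-zero advertisements are deliberately left out.

```python
def master_down_interval(priority, advertisement_interval=1.0):
    """Seconds of silence after which a VRRP backup assumes the master has failed."""
    skew_time = (256 - priority) / 256
    return 3 * advertisement_interval + skew_time

class BackupRouter:
    """Simplified backup router that takes over when advertisements stop."""

    def __init__(self, priority, advertisement_interval=1.0):
        self.state = "backup"
        self.timeout = master_down_interval(priority, advertisement_interval)
        self.last_heard = 0.0

    def advertisement_received(self, now):
        # Every advertisement from the master restarts the timer.
        self.last_heard = now

    def tick(self, now):
        # Promote to master once the master has been silent for too long.
        if self.state == "backup" and now - self.last_heard >= self.timeout:
            self.state = "master"
        return self.state

router = BackupRouter(priority=100)
router.advertisement_received(now=1.0)
router.advertisement_received(now=2.0)   # last advertisement before the failure
print(router.tick(now=5.0))              # 3.0 s of silence: still backup
print(router.tick(now=6.0))              # 4.0 s of silence: takes over as master
```

With a one-second advertisement interval and a priority of 100, takeover happens a little over 3.6 seconds after the last advertisement, without any host being involved.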

This approach works well and has become the standard solution to IP host router redundancy. Care should be taken in some environments, though. The source routing used on Token Ring can disrupt failover because, although the MAC address is the same, the routing information field (RIF) will be different between routers. Any implementation should use a Token Ring functional address to get around this problem, as is the case with both HSRP and VRRP.

Also, the physical relocation of the virtual MAC address that accompanies a router failover can cause problems in bridged or VLAN environments, where switches must learn paths to MAC addresses. If a router redundancy implementation doesn’t operate as expected, it may be due to the way installed bridges or switches learn and update MAC addresses. Always test thoroughly in a layer 2 environment.
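
The layer 2 issue can be seen in a simplified model of a learning switch. The Python sketch below is an illustration only: the class name LearningSwitch and the port numbers are assumptions, and real switches also age out entries and handle VLANs.

```python
class LearningSwitch:
    """Minimal learning bridge: remembers the port each source MAC was last seen on."""

    def __init__(self):
        self.mac_table = {}  # MAC address -> port number

    def frame_received(self, source_mac, port):
        # Learn (or relearn) where the source address lives.
        self.mac_table[source_mac] = port

    def output_port(self, destination_mac):
        # None means the address is unknown and the frame would be flooded.
        return self.mac_table.get(destination_mac)

VIRTUAL_MAC = "00-00-5E-00-01-01"

switch = LearningSwitch()
switch.frame_received(VIRTUAL_MAC, port=1)  # original master is on port 1
# The master fails; the switch still points at the dead router's port.
print(switch.output_port(VIRTUAL_MAC))
# The new master on port 2 sends a frame sourced from the virtual MAC.
switch.frame_received(VIRTUAL_MAC, port=2)
print(switch.output_port(VIRTUAL_MAC))
```

Until the switch sees a frame from the virtual MAC address on the new port, host traffic continues to be sent towards the failed router, which is why the new master needs to transmit promptly after taking over.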

Conclusion

Network resilience is key, but local router resilience is often overlooked. Using router-based redundancy such as HSRP or VRRP is the simplest and most popular method of improving network availability for end users. When choosing a network vendor, examine its particular offering. IPv6 uses an enhanced mode of ICMP discovery which may remove the need for protocols like VRRP, but the same issues could still exist, so don't be surprised to see router-based redundancy offerings for the next generation of IP.

www.ietf.org

www.cisco.com

  • IP hosts need to know a router address for inter-network communication

  • This can be statically configured or learned

  • Router-based redundancy negates host reconfiguration  

     

 

 
