GRE tunnel keepalives
IP-over-IP tunnels (usually GRE, commonly combined with IPSec to provide security) are frequently used when you want to transport private IP traffic over a public IP network that does not support layer 3 VPNs. If you use GRE tunnels in combination with default routing (or route summarization), you can get serious routing issues when the tunnel destination disappears but a default (or summary) route in the IP routing table still covers it: the tunnel interface stays up and the router keeps sending traffic into it. You could work around this issue by deploying a routing protocol over the GRE tunnel (which could lead to hard-to-diagnose routing loops if you're not careful) or by using GRE keepalives, introduced in IOS release 12.2(8)T.
The implementation of the GRE keepalives is amazing: the router sending the keepalive packet constructs a GRE packet that would be sent from the remote end back to itself (effectively building a GRE reply), sets the GRE protocol type to zero (to indicate the keepalive packet) and sends the whole packet through the tunnel (effectively encapsulating the GRE reply into another GRE envelope). The receiving router strips the GRE envelope and routes the inside packet, which is the properly formatted GRE keepalive reply.
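The nesting is easier to see when the packet is built byte by byte. The following Python sketch is only an illustration of the layering described above, not the IOS implementation: the function names, the documentation-range addresses and the TTL value are assumptions, and the GRE headers are the basic four-byte form with no optional fields.

```python
import socket
import struct

GRE_IP_PROTO = 47        # IP protocol number assigned to GRE
ETHERTYPE_IPV4 = 0x0800  # GRE protocol type for an IPv4 payload


def checksum(data: bytes) -> int:
    """Standard 16-bit one's-complement Internet checksum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src: str, dst: str, payload_len: int, ttl: int = 255) -> bytes:
    """Minimal 20-byte IPv4 header carrying GRE (TTL is an arbitrary choice)."""
    def pack(csum: int) -> bytes:
        return struct.pack(
            "!BBHHHBBH4s4s",
            0x45, 0, 20 + payload_len, 0, 0, ttl, GRE_IP_PROTO, csum,
            socket.inet_aton(src), socket.inet_aton(dst),
        )
    return pack(checksum(pack(0)))


def gre_header(protocol_type: int) -> bytes:
    """Basic 4-byte GRE header: no flags, version 0."""
    return struct.pack("!HH", 0, protocol_type)


def build_keepalive(local: str, remote: str) -> bytes:
    # Inner packet: the reply, addressed from the remote end back to us,
    # with GRE protocol type 0 marking it as a keepalive.
    reply = ipv4_header(remote, local, 4) + gre_header(0)
    # Outer envelope: an ordinary GRE packet sent through the tunnel.
    outer_payload_len = 4 + len(reply)
    return ipv4_header(local, remote, outer_payload_len) + gre_header(ETHERTYPE_IPV4) + reply


def decapsulate(packet: bytes) -> bytes:
    """What the receiving router does: strip outer IP (20) + GRE (4) bytes."""
    return packet[24:]
```

Calling `decapsulate(build_keepalive("192.0.2.1", "198.51.100.2"))` leaves a packet whose source is 198.51.100.2 and whose destination is 192.0.2.1. The remote router simply routes it back, without needing any keepalive-specific logic of its own.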
This trick allows you to implement different GRE keepalive timers on each end of the link, since each router generates and receives its own keepalives and the other end only has to route them back. For example, the remote site might use fast keepalive timers to detect the loss of the primary link and switch over to a backup link, while the central site would use less frequent keepalive tests to detect a failed remote site (if there is a single path to the remote site, you don't care too much when you detect it's down).
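To get a feel for the trade-off, here is a simplified Python model of a keepalive monitor with a send period and a retry count. It is a sketch under assumptions: the class name, the field names and the rule "declare the tunnel down after `retries` consecutive missed replies" are illustrative and do not claim to reproduce the exact IOS timing.

```python
from dataclasses import dataclass


@dataclass
class KeepaliveMonitor:
    period: int       # seconds between keepalives (hypothetical parameter)
    retries: int      # consecutive missed replies tolerated before "down"
    missed: int = 0
    up: bool = True

    def tick(self, reply_received: bool) -> bool:
        """Process the outcome of one keepalive interval; return tunnel state."""
        if reply_received:
            self.missed = 0
            self.up = True
        else:
            self.missed += 1
            if self.missed >= self.retries:
                self.up = False
        return self.up


def seconds_to_detect(monitor: KeepaliveMonitor) -> int:
    """Time until the monitor declares the tunnel down once replies stop."""
    elapsed = 0
    while monitor.up:
        elapsed += monitor.period
        monitor.tick(reply_received=False)
    return elapsed


remote_site = KeepaliveMonitor(period=1, retries=3)    # fast failover to backup
central_site = KeepaliveMonitor(period=10, retries=3)  # relaxed monitoring
```

In this model the remote site notices the failure after roughly `period * retries` = 3 seconds, while the central site takes about 30 seconds. Neither side has to know or care about the timers configured on the other end.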
Every ingenious solution has its drawbacks and this one is no exception: if the receiving router protects its own IP addresses (to stop spoofing attacks), it will drop the incoming GRE keepalive packet, because the inner packet arrives from the outside with the receiving router's own address as its source. Furthermore, a document available on Cisco's web site describes the issues of using GRE keepalives in an IPSec environment.
9 comments:
A good use of GRE keepalives is to monitor a metro Ethernet link between two routers. You set up a GRE tunnel with keepalives between the two Ethernet endpoints to monitor true end-to-end connectivity over the metro Ethernet link. Keep in mind, though, that you are not sending user traffic through the GRE tunnel; you are merely using the GRE keepalive as a health indicator of the metro Ethernet connection. Of course this will not be needed once Ethernet OAM, E-LMI, etc. have become widely available, but for the time being I find that GRE keepalives have other good uses besides tunneling traffic.
That's definitely an interesting suggestion. But when you know that the end-to-end link is down, what do you do with that information? I have a few crazy ideas, but would like to hear from you first.
We use standard NMS tools (HP OpenView, CA Spectrum, etc.) to monitor customer devices. The GRE tunnel is just another interface to these NMS systems, so if the tunnel goes down, the interface turns red and an alarm is triggered. Without this "indicator" tunnel interface we would have no way of knowing that the end-to-end path was actually down somewhere along the way. We have thought about using traps or monitoring routing neighbor logging, etc., but nothing beats the reliable tunnel interface up/down state. This method has allowed us to open tickets proactively with the metro Ethernet provider to resolve the issue. Keep in mind that the physical Ethernet interface itself could be up/up on the customer router, so it isn't a reliable indicator.
Hi ... very nice point. But how do you identify where the problem is when the tunnel is flapping but all interfaces along the circuit are up (and never go down)? Thanks!
The only tool that comes to my mind is the "traceroute" command.
Hi Ivan. The problem is that the circuit is L2-based and consists of many physical hops. I have checked all the logs and there's no sign of a physical problem. Thanks.
If you have L2 devices in the path that you don't control, there's no way to figure out where the problem is (in a few years, you might be able to use Ethernet OAM :).
Not to mention VRFs where the keepalive is inside the VRF and not in the transit VRF (or default table if that be the case)
:)