OSPF neighbors stuck in EXSTART

This problem is quite rare, but tantalizing enough to warrant mentioning: OSPF neighbors are forever stuck in EXSTART state (occasionally going to DOWN and back to EXSTART).

I've actually stumbled across it accidentally in my lab and have luckily seen it before, so I knew immediately what it was.

The moment you suspect that something is wrong with the OSPF adjacencies and run the debug ip ospf adj command, the problem becomes obvious. The Database Description (DBD) packet contains an Interface MTU field; if the value received from the neighbor is higher than the IP MTU configured on the inbound interface, the DBD packet is rejected (section 10.6 of RFC 2328). The router with the lower MTU complains that “Nbr x.x.x.x has larger interface MTU” and the other router moans about protocol violations (First DBD and we are not SLAVE).

As always, there are two ways to solve this problem:

  • The correct one: fix the MTU issues;
  • The other one: disable MTU checks with the ip ospf mtu-ignore interface configuration command (which might be OK as long as the hardware is able to receive oversized packets and the router is not using fixed-size input buffers).
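
The MTU check and the effect of the second option can be sketched in a few lines of Python. This is only an illustration of the rule in section 10.6 of RFC 2328, not how any router implements it; the Interface class, its fields and the accept_dbd function are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Interface:
    """Hypothetical model of an OSPF-enabled interface (not a real API)."""
    ip_mtu: int               # largest IP datagram accepted without fragmentation
    mtu_ignore: bool = False  # models the effect of `ip ospf mtu-ignore`


def accept_dbd(iface: Interface, neighbor_mtu: int) -> bool:
    """Return True if a received DBD packet passes the Interface MTU check."""
    if iface.mtu_ignore:
        # The check is skipped; the packet may still be too big for the hardware.
        return True
    # RFC 2328, section 10.6: reject the DBD packet if the neighbor advertises
    # a larger MTU than this interface can accept without fragmentation.
    return neighbor_mtu <= iface.ip_mtu


if __name__ == "__main__":
    low = Interface(ip_mtu=1500)
    print(accept_dbd(low, 1514))   # neighbor advertises a larger MTU
    print(accept_dbd(low, 1500))   # MTUs match
    print(accept_dbd(Interface(ip_mtu=1500, mtu_ignore=True), 1514))
```

Note that the check compares only the advertised values. It cannot tell whether oversized packets will actually make it across the link, which is why disabling it is the riskier of the two options.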

12 comments:

William Chu said...

I have got an interesting one.

A few years ago I got called to troubleshoot an OSPF Exstart problem. Both routers were connected over an international Frame Relay PVC. Both sides initially had an MTU of 1500 bytes set on their interfaces, but OSPF got stuck in Exstart. I knew about the OSPF MTU mismatch issue back then, but this didn't seem to be it because the MTU size matched on both ends. However, since I was told it was an international Frame Relay PVC, I asked how the PVC was built. It actually went through three providers, and the provider in the middle had the PVC MTU set at 1100 bytes for some reason; that was the culprit. The fix, as it turned out, was to lower the interface IP MTU on the customer routers (IP MTU = 1024), because the OSPF mtu-ignore option didn't solve it (the middle Frame Relay provider dropped the oversized frames at layer 2). It was a very unusual problem, so I wanted to pass it along. Nowadays Frame Relay is going away, so we may never encounter a problem like this again.

js said...

The place I've seen this several times is when running OSPF between an SVI on a 3550 switch and a router, which have different default MTUs.

Anonymous said...

What is the best way (not "ip ospf mtu-ignore") to resolve MTU mismatch between 3550 SVI and router's physical or BVI interface?

Without affecting other switch ports?

I know about "system mtu routing ..." on 3550, but it is system-wide.

Consider that the router has a BVI interface (which also produces a different MTU) and the switch has an SVI interface.

Router:

bridge 1 protocol ieee
bridge 1 route ip
bridge irb

interface GigabitEthernet0/0
 description trunk to 3750
 no ip address
!
interface GigabitEthernet0/0.1
 encapsulation dot1Q 100
 bridge-group 1

interface BVI1
 ip address 10.1.1.2 255.255.255.0

router ospf 1
 network 10.1.1.0 0.0.0.255 area 0

BVI1 is up, line protocol is up
MTU is 1514 bytes

Ivan Pepelnjak said...

According to this discussion, you can only set system-wide MTU on 3550, not per interface.

Once I get my hands on a Catalyst switch (and have time to spare), I'll run a few tests.

vladimir said...

Thank you.
So should I set "system mtu routing 1514" on the 3750 to match the bvi's mtu and forget about it?

Any negative consequences?

What about other routers on the same L2 segment with regular routed interfaces? They currently have "ip ospf mtu-ignore" :)

The bvi interface would not take mtu settings.

Thanks,
Vladimir

Ivan Pepelnjak said...

You should set the system MTU to 1500, not 1514 (unless I'm gravely mistaken, the MTU specifies the payload size, not the layer-2 frame size).

There SHOULD be no negative impact, unless the workstations in your LAN use jumbo frames (and let's assume that the switches are not MPLS PE routers :).

As for the BVI interface, I can set the MTU and IP MTU on a BVI interface on a router (using 12.4(15)T1), but as I said in a previous comment, you cannot set per-interface MTU on a Cat3550 at all.

Vladimir said...

Thanks

Nicolas said...

Google got me here with the magic words mtu + ospf while I was looking for info on this topic for a post on my new blog. I basically wrote the same thing (in Spanish), but added something I found pretty interesting: lowering the MTU back or removing ip ospf mtu-ignore to see what would happen. Just the latter would bring us back to the issue. The MTU would just be an issue again whenever the adjacency is rebuilt... just my two cents.

Clive said...

Yeh, got a strange issue.

If the MTU is set to 1500 or lower, full adjacency is achieved; with anything higher it stays in 2-way. Anyone got any ideas on that?

Set up is - Juniper -> Foundry -> SmartEdge

Set ups on Juniper and Smartedge as follows:-

Juniper
metric 65535;
retransmit-interval 5;
transit-delay 1;
hello-interval 10;
dead-interval 40;

SmartEdge:

transmit-delay 1
router-priority 0
hello-interval 10
router-dead-interval 40
cost 65534

The only difference I can see is the metric cost, but then why would it work with 1500 but not anything larger?

Ivan Pepelnjak said...

I would suspect the box in the middle is dropping jumbo frames. See also

http://blog.ioshints.info/2009/11/ip-ospf-mtu-ignore-is-dangerous-command.html

Robin M. said...

Funny enough I'm experiencing this issue right now on a Gigabit Ethernet link between two 7609s. Looks like the MTU on the transport network is wrong and the carrier is looking at it now.

New technology, same old problems. :)

jeff said...

Hi Robin, I'm experiencing it right now. I have two routers between two 7609s and sometimes OSPF goes down. How did you resolve the issue?

Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.