Ignoring STP? Be careful, be very careful

A while ago I described what it takes to integrate a TRILL backbone with legacy equipment running Spanning Tree Protocol (STP). Unfortunately, Brocade decided to use a non-standard approach to BPDU handling when implementing their TRILL-like VCS fabric. VDX switches running in fabric mode can either drop incoming BPDU frames or transport them transparently across the fabric to other edge ports. Although VDX switches support STP, RSTP and MSTP (as well as RootGuard and BPDUGuard) when configured as standalone switches, STP processing is disabled when you configure fabric mode; the VCS fabric looks like a huge shared LAN segment to the end hosts and core switches.

Update 2013-03-31: Network OS 4.0 and above supports Distributed Spanning Tree (DiST); for more details, read this blog post.

The problem

This approach to handling STP might make perfect sense from an architectural perspective, more so for pure VMware shops (vSwitch does not run STP and performs split-horizon bridging by design). Unfortunately, everyone else cannot ignore STP, as it’s way too easy to configure bridging between redundant NICs in servers running any other operating system. Robust data center networks thus use BPDU guard on edge ports to block any server port acting as a bridge. At the very minimum, you should use root guard on server ports to prevent STP topology changes triggered by a misconfigured STP process running on a server. It seems most vendors agree that BPDU Guard and RootGuard are crucial to stable L2 network operation; Brocade claims they are key features [...] to protect network spanning tree operation.

Never configure a PortFast port without BPDU guard or you’ll soon discover that it takes a single click in Hyper-V to melt down your network.
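To make the difference between the two guards concrete, here is a minimal Python sketch of how an edge port could react to a received BPDU. It is a toy model and not any vendor's implementation: the class and state names (EdgePort, Guard, PortState) are made up for illustration, and "superior BPDU" is reduced to comparing a single integer bridge ID against the current root ID, whereas real STP compares a full priority vector.

```python
from dataclasses import dataclass
from enum import Enum


class Guard(Enum):
    NONE = "none"
    BPDU_GUARD = "bpdu-guard"
    ROOT_GUARD = "root-guard"


class PortState(Enum):
    FORWARDING = "forwarding"
    ERR_DISABLED = "err-disabled"            # BPDU guard shut the port down
    ROOT_INCONSISTENT = "root-inconsistent"  # root guard is blocking the port


@dataclass
class EdgePort:
    """Hypothetical model of a server-facing switch port."""
    name: str
    guard: Guard
    state: PortState = PortState.FORWARDING

    def receive_bpdu(self, sender_bridge_id: int, current_root_id: int) -> PortState:
        """Apply the guard policy to one received BPDU.

        Assumption: a lower bridge ID is "better", as in the STP root election.
        """
        if self.guard is Guard.BPDU_GUARD:
            # Any BPDU at all on a guarded edge port is treated as a fault.
            self.state = PortState.ERR_DISABLED
        elif self.guard is Guard.ROOT_GUARD and sender_bridge_id < current_root_id:
            # Only a superior BPDU (a claim to be a better root) blocks the port.
            self.state = PortState.ROOT_INCONSISTENT
        # With Guard.NONE the model does nothing; a real switch would feed the
        # BPDU into its STP process and possibly change the topology.
        return self.state


if __name__ == "__main__":
    root_id = 4096  # assumed bridge ID of the legitimate root
    for guard in Guard:
        port = EdgePort("server-1", guard)
        # A bridging server that claims to be a better root than the real one.
        print(guard.value, "->", port.receive_bpdu(0, root_id).value)
```

The point of the sketch is the asymmetry: BPDU guard reacts to every BPDU, while root guard tolerates a bridging server as long as it does not try to become the root.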

VCS fabric as core transport

If you’re deploying Brocade’s VCS fabric as a Data Center network core or as a transport layer between top-of-rack switches (or between switches embedded in blade enclosures) you’re relatively safe: the access switches (ToR or embedded) should kick out the rogue servers/virtual machines; running STP across the VCS fabric is just an extra disaster-preventing precaution.

VCS fabric in access layer

You probably should not connect servers directly to VCS fabric due to the way it handles BPDUs. VCS fabric’s current implementation gives you only two dismal options:

Ignore BPDUs on the edge ports, risking the stability of the whole data center (see the above warning).

Transport BPDUs across VCS fabric to the core switches. Unfortunately, the core switches are not the right place to implement STP protection. You could decide to configure BPDU guard or root guard on the core switches, in which case you risk cutting off the whole VCS fabric (and all servers connected to it) if a single server starts sending BPDUs. Or you could do nothing, exposing the core switches to the whims of STP configurations of individual servers, allowing any rogue server to bring down the whole network with repetitive bogus topology changes.
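The trade-off between these options comes down to how many servers lose connectivity when one of them starts sending BPDUs. The following Python sketch is a deliberately simplified model of that blast radius; the names (GuardLocation, blast_radius) are invented for illustration, and the FABRIC_EDGE case is included only as a hypothetical comparison point, since per-port protection is exactly what fabric mode did not offer at the time of writing.

```python
from enum import Enum


class GuardLocation(Enum):
    FABRIC_EDGE = "fabric-edge"  # hypothetical: guard on each server-facing port
    CORE_UPLINK = "core-uplink"  # guard on the core switch port facing the fabric
    NOWHERE = "nowhere"          # no guard anywhere; BPDUs reach the core STP


def blast_radius(fabric_servers: set[str], rogue: str, location: GuardLocation) -> set[str]:
    """Return the servers that lose connectivity when `rogue` sends BPDUs.

    Assumption: the fabric tunnels BPDUs transparently, so the core switch
    sees the rogue BPDU on its single fabric-facing port.
    """
    if rogue not in fabric_servers:
        raise ValueError(f"{rogue!r} is not attached to the fabric")
    if location is GuardLocation.FABRIC_EDGE:
        # Only the offending server port is shut down.
        return {rogue}
    if location is GuardLocation.CORE_UPLINK:
        # The core port is shut down, taking the whole fabric with it.
        return set(fabric_servers)
    # Nobody is cut off, but the core STP topology is now exposed to the server.
    return set()


if __name__ == "__main__":
    servers = {"web-1", "web-2", "db-1"}
    for location in GuardLocation:
        print(location.value, "->", sorted(blast_radius(servers, "web-2", location)))
```

An empty result for NOWHERE is not good news: it only means no port was shut down, which leaves the core switches exposed to whatever topology changes the rogue server triggers.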

Summary

While the shortest-path bridging standards (802.1aq/SPB and TRILL) seem convoluted and overloaded with features, most of those features made it into the standard for a good reason. Cutting corners when implementing standards is always a long-term problem.

More information

You’ll learn more about modern data center architectures in my Data Center 3.0 for Networking Engineers webinar. The details of VMware networking (including the vSwitch behavior) are described in the VMware Networking Deep Dive webinar. Both webinars are also part of the yearly subscription package.

Update 2011-06-12: After a lengthy e-mail exchange with Jon Hudson, Global Solutions Architect @ Brocade Networks, I slightly reworded two sentences to ensure nobody would assume I "imply a willful choice" or worse. While editing the post, I also included information on STP support by VDX switches running in non-fabric mode and the fact that multiple vendors (including Brocade) support "a Cisco solution to a problem" (BPDU Guard and Root Guard). In fact, both features are supported by VDX switches running in non-fabric mode.

In my personal opinion (I hope I'm still entitled to one), I would wait for Brocade to implement Appointed Forwarder in their TRILL code and enable the full set of STP features (already present in the code) in VCS fabric mode before deploying VCS fabric in my network.

14 comments:

  1. Can you open this up a bit:

    "Never configure a PortFast port without BPDU guard or you’ll soon discover that it takes a single click in Hyper-V to melt down your network."

    Why is Hyper-V sending BPDUs?
  2. Hyper-V does "true bridging" if you configure bridging between two uplinks instead of teaming them. A nice recipe for disaster if you're not ready to catch&stop that.
  3. Wouldn't the switch then give up the edge mode and act normally on the BPDUs (= block one port)?
  4. See http://www.cisco.com/en/US/products/hw/switches/ps700/products_tech_note09186a00800b1500.shtml

    "If you turn on PortFast for a port that is part of a physical loop, there can be a window of time when packets are continuously forwarded (and can even multiply) in such a way that the network cannot recover."

    Tested in a live network O:-)
  5. But hey, on the good side, thanks for testing 8-)
  6. Want to make sure I understand this. So you are worried because the VCS switches pass STP instead of blocking it? Or are you saying they should offer a way to block it if you want? As I remember BPDU and root guard are Cisco inventions to protect people from doing dumb things correct?

    The argument being "as it’s way too easy to configure bridging between redundant NICs in servers running any other operating system" or, in picking on poor Hyper-V, "a single click in Hyper-V to melt down your network".

    While I agree it would be nice to protect people from putting their heads in ovens, getting upset because a Cisco competitor didn't implement a Cisco solution to a problem of "less than clever users doing dumb things" seems very odd to me.
  7. I am worried because Brocade decided to ignore the STP problem completely. They could have implemented a similar feature set that Cisco did (for a good reason - sometimes you just have to get over the fact that Cisco actually does a few things right) or at the very minimum implemented the solution specified in the standard (also linked from my blog post). Unfortunately they decided to cut corners and I'm just describing what the end results could be.

    As for "picking on poor Hyper-V" - I'm not. I'm positive there are other environments that allow you to configure bridging as easily as Windows does. It's just that I know Hyper-V can do that and it's a very nice graphic example of how quickly you can make a wrong choice that can melt down your network.

    Last but not least "protecting people from their own stupidity" is not a bad idea. Even seat belts and helmets have saved a few lives. What would you say if a car manufacturer would try to sell you a car without seat belts today (and claim that you can transparently mount them around the car)?
  8. The issue is your phrasing. While yes it's always nice to protect the ignorant. TRILL (even TRILLish) constructs absolutely MUST pass STP. It's one of the core principles of TRILL that Rbridges can be put between, around, or intermixed with standard bridges. This requires TRILL Rbridges to appear essentially transparent allowing all STP behavior to continue as if even multiple Rbridges appear simply as wire.

    What you call "ignoring" implies a willful choice to not provide STP protection, or possible, worse an ignorant networking malpractice.

    Where is if read the TRILL mailing list, or you talk to Mrs Perlman, Mr. Eastlake or others in the TRILL WG you see that this transparent behavior is desired and mandated to provide interoperability and backward compatibility.

    So while yes, I agree it would be nice or even polite to provide bumpers for bowlers that may get into trouble. To say they ignored STP or cut corners shows a misunderstanding of one of TRILL's core values. And implies a malice or ignorance on the part of hard-working engineers, and that's just not very nice.

    As to seat-belts and helmets, they are wonderful things that should be optional in my opinion.
  9. Ug, some rather poor typos there, my apologies

    "or _possibly_ worse, an ignorant networking malpractice."

    "Where _as_ if _you_ read the TRILL mailing list"
  10. Jon, you might want to read the standards before starting heated arguments. In this particular case, it would be https://tools.ietf.org/html/draft-ietf-trill-rbridge-protocol-16. Just search for "spanning tree".

    Or you might trust my summary of what the draft says http://blog.ioshints.info/2011/03/trillfabric-path-stp-integration.html. Although I don't follow too many mailing lists and don't spend time visiting industry events, I usually read the documents I write about.

    In any case, the way I read the TRILL draft, RBridges have to implement the concept of "appointed forwarder" which, to my understanding, Brocade decided not to do (otherwise they would have no STP issues). They should also terminate the STP domain (reporting their finding in LSP updates to check for unexpected domain merges) and not transparently forward the BPDU across TRILL domain (which Brocade chose to do).

    If after a careful analysis of the above-mentioned draft you still disagree with me, I will be more than willing to discuss the finer details of where I misunderstood the draft.
  11. I am very sorry you feel this is a heated argument.

    I suppose if you felt it was a heated argument it would explain your choice to attempt an insult by implying that I haven't read -16. I have, as I've read -15, -14 and so on as well as all the other bloody docs produced in the WG. Why would anyone spend time reading the TRILL WG mail list and attending IETF meetings on TRILL and then not spend the time to read the publicly available documents? Reading the documents is the peaceful relaxing part ;-)

    I read http://blog.ioshints.info/2011/03/trillfabric-path-stp-integration.html (though there is something wrong with the link in your response) and I see nothing wrong with your understanding or explanation of the topic.

    Ah! Now I understand what you mean! Before you were talking about BPDU and root guard. And I felt expecting those to be implemented was kinda odd due to the relationship between Cisco and Brocade.

    You are correct on the value of AFs. If your goal is to allow for interoperability with intermixed existing STP bridges, an AF is in TRILL how you make sure that only one forwarder is spreading the good word of STP. You did not mention AF's in your writeup so I did not realize this is what you meant, so I apologize for my misunderstanding.

    Do you know if FabricPath implemented AFs as TRILL suggests? Or do they just get around the problem of misconfigured NICs by using BPDU/Root guard?

    So then VCS does not allow the intermixing of STP bridges between Rbridges. They also use FSPF (an ISIS'ish link-state used in FC) instead of ISISL2. Cisco uses a q-n-q form in their egress frames instead of the standard TRILL header. Neither claim to be pure TRILL, and neither are. One needs a control plane change, the other a new chip. (I silently pray to the digital gods each night this will change sooner rather than later)

    However I really don't seem to be communicating well. So I'm very sorry I'm not being clear.

    I think you have reasonable concerns. It's your tone and language that bothered me.

    This for example "which, to my understanding, Brocade decided not to do" is a totally reasonable and even statement

    "Unfortunately they decided to cut corners" and "seem convoluted and overloaded with features" are statements that betray an emotion on your part that you do not approve, or that you think the implementers were stupid or had intentional malice.

    For you to say "hey, in company X's solution, they didn't implement all of TRILL and as a result if you do Y, Z may happen. So I advise you do not do Y until company X provides a way to prevent Z" is totally reasonable and very helpful.

    For you on the other hand to make assumptions about why something was done and to imply they were ignorant or cut corners seems intended to insult. This may result in heated conversations =)
  12. Ivan, what about Cisco NX-OS?
    They don't propose any virtualization technology like VCS or IRF; do N2K, N5K and N7K have the same problem?
  13. NX-OS has vPC, which is well integrated with LACP and STP.
  14. Ivan, it finally arrived yesterday:

    Distributed Spanning Tree Protocol (DiST/STPoVCS)
    Network OS v4.0.0 and later supports any version of STP to run in VCS mode and function correctly between interconnecting VCSs, or between VCS and other vendor’s switches. This feature is called Distributed Spanning Tree Protocol (DiST).
    The purpose of DiST is:
    • To support VCS to VCS connectivity and automatic loop detection and prevention.
    • To assist deployment plans for integrating with the legacy xSTP enabled switches in the network, for eventual replacement of such switches with fabrics.
    • Support following flavors of spanning-tree protocol: STP, RSTP, MSTP, PVST+, and RPVST+