Followup: All-I-can-eat

The “All-I-can-eat-mentality” article has triggered (as expected) numerous responses. Some of them provided useful data, links to more information or informative perspectives – many thanks to those readers. A few others were unfortunately following the “I-am-right” line without considering facts. Most of the readers from the Service Provider community decided to stay anonymous (when you read all the comments, it becomes obvious they made a wise decision) or respond off-line.

Whatever your position in this issue, I would like to ask you to keep your comments focused on the topic. Although you were all infinitely more polite than the usual forum/blog crowd and provided some really good arguments, writing angry replies does not help. What’s happening with Internet is (like it or not) our common problem … or you could take the blue pill and continue bashing the other side.

I particularly liked the summary of our discussion posted on Slashdot (where someone included the link to my blog):

Whoa, whoa, whoa, that article seems to be promoting a balanced viewpoint that denies a) that telcos are totally evil and b) that we should all be allowed to have as much bandwidth as we want and not have to pay for it. We'll have none of that nonsense on /.

It’s also painfully obvious that the current handling of ISP service offerings is often dismal. For example:

  • Publicly accessible service definitions don’t exist. What most large ISPs offer on their web sites is the X Mbit @ Y $/€/whatever per month. They “forget” to mention that the offered rate is PIR, with expected CIR being only a few percent of PIR.
  • When introducing the traffic caps, some providers started with ridiculously low and probably a bit overpriced figures.
  • A while ago some people decided to keep P2P traffic (which is the worst bandwidth hog of all) in line by resetting TCP sessions (a link to the recent state of affairs would be highly appreciated).

In the next few weeks, I’ll try to cover a few of the topics raised in the comments, including:

  • Why do we have to live with oversubscriptions?
  • Why are the Service Providers forced to use traffic management?
  • Is it possible to have a fair and consumer-friendly service definition?
  • Why will everyone have to invest in deep packet inspection (DPI)?

Another milestone reached :)

According to FeedBurner statistics, my RSS feed had exactly 5000 subscribes on June 30th.

Disclaimer: The Feedburner subscriber count is a notoriously unreliable measure, as it tries to estimate the actual number of subscribers for on-line services that cache the feeds (for example, Google Reader) … but the result is nonetheless quite nice.

What went wrong: end-to-end ATM

Red Pineapple was kind enough to share his 15-year-old ATM slides. They include interesting claims like:

ATM has the potential to displace all existing internetworking technologies
One single network handles all traffic types: Bursty data and Time-sensitive continuous traffic (voice/video).

All these claims are still true if you just replace »ATM« with »IP«. So what went wrong with ATM (and why did the underdog IP win)? I can see the following major issues:

  • ATM is a layer-2 technology that wanted to replace all other layer-2 technologies. Sometimes it made sense (ADSL), sometimes not so much (LAN … not to mention LANE). IP is a layer-3 technology that embraced all layer-2 technologies and unified them into a single network.
  • ATM is an end-to-end circuit-oriented technology, which made perfect sense in a world where a single session (voice call, terminal session to mainframes) lasted for minutes or hours and therefore the cost of session setup became negligible. In a Web 2.0 world where each host opens tens of sessions per minute to servers all across the globe, the session setup costs would be prohibitive.
  • Because of its circuit-oriented nature, ATM causes per-session overhead in each node in the network. Core IP routers don’t have to keep the session state as they forward individual IP datagrams independently. IP is thus inherently more scalable than ATM.

The shift that really made ATM obsolete was the changing data networking landscape: voice and long-lived low-bandwidth data sessions which dominated the world at the time when ATM was designed were dwarfed by the short-lived bursty high-bandwidth web requests. ATM was (in the end) a perfect solution to the wrong problem.

Not all interfaces are created equal

Two days ago I’ve managed to write aGenuineStupidity™ (OK, maybe I cannot get a trademark on this concept): the MQC shaping actions cannot be attached to a Dialer interface; they have to be specified on the underlying physical interface (in case of PPPoE link, the outside Ethernet interface).

The reason for my stupidity (apart from the obvious one: writing without testing) is the difference between true logical interfaces and dialer templates. A tunnel interface or a VLAN interface is a true logical interface; it behaves like any other interface (with a few exceptions; for example, tunnel interface does not have an output queue) and you can use most QoS actions (including shaping) on it. A dialer interface is even more “conceptual”. It can never be operational on its own – as soon as the link is established, it’s bound to a physical (for example, BRI0:1) or virtual access interface (which is yet again bound to a physical interface) and the shaping is performed on the final physical interface.

This behavior (on top of being unexpectedly inconsistent) results in interesting quirks. For example, you have to shape PPPoE packets (based on their IP characteristics) on the physical Ethernet interface which usually doesn’t even have an IP address.

… and let’s hope that the late hour hasn’t resulted in another blunder.

Service Provider “Services and Solutions” bootcamp

Last week I happened to be copied on an e-mail which included a PDF describing a “SP Services and Solutions” bootcamp I knew nothing about. As I was slowly reading the document, I’ve realized it’s a perfect course for Service Provider (or channel partner) engineers who need an overview of the various technologies used in modern (I hate to use the marketing phrase “next-generation”) SP networks and the ways they can be used to create customer-focused solutions.

My next question (to our training manager) was: “when is this going to be available, I want to blog about it” and the answer astonished me: “we’re doing the first customer teach this week”. Our SP gurus managed to develop a course they always wanted to teach almost undercover.

Read the rest of my thoughts about this fantastic course in Fragments

ADSL QoS basics

Based on the ADSL reference model we’ve discussed last week, let’s try to figure out how you can influence the quality of service over your ADSL link (for example, you’d like to prioritize VoIP packets over web download). To understand the QoS issues, we need to analyze the congestion points; these are the points where a queue might form when the network is overloaded and where you can reorder the packets to give some applications a preferential treatment.

Remember: QoS is always a zero-sum game. If you prioritize some applications, you’re automatically penalizing all others.

The primary congestion point in the downstream path is the PPPoE virtual interface on the NAS router (marked with a red arrow in the diagram below), where the Service Provider usually performs traffic policing. It’s better from the SP perspective to police the traffic @ NAS than to send all the traffic to DSLAM where it would be dropped in the ATM hardware. Secondary congestion points might arise in the backhaul network (if the network is heavily oversubscribed) and in DSLAM (if the NAS policing does not match the QoS parameters of the ATM virtual circuit).

In the upstream direction, the congestion occurs on the DSL modem – the path between the CPE and the modem (Ethernet or Fast Ethernet) is much faster than the upstream ATM virtual circuit. Secondary congestions might occur in DSLAM or the backhaul network. NAS usually does not police inbound traffic, as it’s assumed the DSL access network already limits the user traffic to its contractual upstream speed.

Based on the congestion analysis, it’s obvious you cannot use queuing on the CPE (marked “2” in the diagrams) to influence the ADSL QoS as you don’t control a single congestion point. You have to use traffic shaping on the CPE to introduce artificial congestion points in which the queues will form. You can then use the usual queuing mechanisms to prioritize the application traffic.

The shaping configured on the PPPoE interface on the CPE router neatly removes the congestion on the DSL modem. The backhaul network is rarely congested in the upstream direction (unless your friendly neighbors are devoted fans of P2P protocols).

When configuring the upstream shaping rate, you just have to take in account the extra overhead introduced by the PPPoE framing, which is not yet present in packets shaped on the Dialer interface, and reduce the upstream shaping speed to a value slightly below your DSL upstream speed.

If your DSL configuration uses PPPoE Dialer interface, you have to shape the traffic on the Dialer interface, not the outside Ethernet interface. The outside Ethernet interface transmits PPPoE-encapsulated IP traffic on which you cannot use the IP-based queuing classifiers.outside Ethernet interface.

Assuming most of your traffic is TCP-based (or that all non-TCP traffic is prioritized), the shaping on the inside LAN interface will cause enough TCP delays to slow down the downstream TCP transmission. However, it’s harder to determine the correct shaping rate and optimize the shaping behavior when the high-priority traffic is not present; we’ll cover these issues in an upcoming post.

There is no local command authorization

Shahid wrote me an e-mail asking about local command authorization. He would like to perform it within the AAA model, but while AAA local authorization works, it only allows you to specify user privilege level (and autocommand), not individual commands (like you can do on a TACACS+ server).

One of the reasons for this behavior is the difference between exec authorization (the authorization to start the interactive session, configured with aaa authorization exec) and command authorization (the authorization to execute a particular command, configured with aaa authorization commands). While the local method can be specified in the aaa authorization commands command, it’s essentially a no-op (it always succeeds). Using the local method in the aaa authorization commands is only meaningful if you want to provide a fallback mechanism where all commands are authorized if the router cannot contact a TACACS+ server.

You can use EEM applets, command privilege levels or parser views to limit the set of commands a user can execute on a router without using TACACS+ command authorization.

This article is part of You've asked for it series.

Help appreciated: touch-screen drawing

I’m looking for a touch screen device that would work (well) with PowerPoint. I’d like to start drawing my diagrams with a pen, not with a mouse; I have a completely unfounded irrational belief that drawing with a pen might be faster and easier than using a mouse. Any (tested) ideas?

IOS HTTP vulnerability

The Cisco Subnet RSS feed I’m receiving from Network World contained interesting information a few days ago: Cisco has reissued the HTTP security advisory from 2005. The 2005 bug was “trivial”: they forgot to quote the “<” character in the output HTML stream as “&lt;” and you could thus insert HTML code into the router’s output by sending pings to the router and inspecting the buffers with show buffers assigned dump (I found the original proof-of-concept exploit on the Wayback Machine). However, I’ve checked the behavior on 12.4(15)T1 and all dangerous characters (“<” and quotes) were properly quoted. So, I’m left with two explanations.

It’s real

Someone has discovered a really devious way of inserting HTML code that somehow bypasses the quoting process. It could be weird Unicode encoding of less-than character, similar to the IPS vulnerability I’ve been writing about two years ago. I couldn’t find a feasible approach to do it, as the original attack vector (show buffers command) drops the high-order bit from the dumped data and the IOS HTTP server properly quotes 7-bit characters, but then I’m not aware of every IOS command (including the hidden ones) that could dump buffer/memory data. I’ve even tested the 0xFF3C sequence produced by tclsh and it does not work (the 0xFF is emitted unchanged, but the 0x3C is quoted).

It’s an administrative blunder

The “Revision history” section of the advisory claims that they’ve revised the workaround section, which describes how to disable the HTTP WEB_EXEC service. If this is true, they might have updated the list of affected software and fixed IOS versions. Adding information on feature available in 12.3T four years after the original advisory without fixing other more relevant information is (in my opinion) pure paperwork shuffling, not to mention the scare caused by an advisory claiming there’s a security hole in all classic IOS releases.

What should you do?

To be on the safe side, you should:

  • Disable HTTP and HTTPS servers in Cisco IOS unless you absolutely need them (but you should do that anyway).

Protecting the HTTP server with an ACL does not help, as the exploit works through the administrator’s browser.

Disabling the WEB_EXEC service will break SDM.

  • Use dedicated browser sessions when accessing the router. Start a new copy of the browser (or even better, a different browser), go to the router, do what you have to do and close all browser windows before accessing anything else, including links in your e-mail.

Last but not least, you could disable individual commands with EEM applets (if only Cisco would provide a complete list of vulnerable commands). For example, the following command will disable all variants of show buffers command with the dump option

event manager applet WebDeny
 event cli pattern "show buffers.*dump" sync no skip yes

Internet anarchy: I’ll advertise whatever I like

We all know that the global BGP table is exploding (see the Active BGP entries graph) and that it will eventually reach a point where the router manufacturers will not be able to cope with it via constant memory/ASIC upgrades (Note: a layer-3 switch is just a fancy marketing name for a router). The engineering community is struggling with new protocol ideas (for example, LISP) that would reduce the burden on the core Internet routers, but did you know that we could reduce the overall BGP/FIB memory consumption by over 35% (rolling back the clock by two and a half years) if only the Internet Service Providers would get their act together.

Take a look at the weekly CIDR report (archived by WebCite on June 22nd), more specifically into its Aggregation summary section. The BGP table size could be reduced by over 35% if the ISPs would stop announcing superfluous more specific prefixes (as the report heading says, the algorithm checks for an exact match in AS path, so people using deaggregation for traffic engineering purposes are not even included in this table). You can also take a look at the worst offenders and form your own opinions. These organizations increase the cost of doing business for everyone on the Internet.

Why is this behavior tolerated? It’s very simple: advertising a prefix with BGP (and affecting everyone else on the globe) costs you nothing. There is no direct business benefit gained by reducing the number of your BGP entries (and who cares about other people’s costs anyway) and you don’t need an Internet driver’s license (there’s also no BGP police, although it would be badly needed).

Fortunately, there are some people who got their act together. The leader in the week of June 15th was JamboNet (AS report archived by Webcite on June 22nd) that went from 42 prefixes to 7 prefixes.

What can you do to help? Advertise the prefixes assigned to you by Internet Registry, not more specific ones. Check your BGP table and clean it. Don’t use more specific prefixes solely for primary/backup uplink selection.

Autocommands in AAA environment

A reader who prefers to remain anonymous has reported an interesting observation: autocommands configured on local usernames do not work after configuring aaa new-model.

I’ve immediately suspected that the problem lies in the granularity of the AAA mechanisms and a quick lab test proved it: the username/password check is configured with the aaa authentication login configuration commands, whereas the autocommand feature belongs to the EXEC authorization and has to be configured separately with the aaa authorization exec command.

The following configuration can be used if you want to use local usernames and autocommands within the AAA framework (add TACACS+/RADIUS servers as needed):

aaa new-model
!
aaa authentication login default local 
aaa authorization exec default local
!
username local password 0 local
username test password 0 test
username test autocommand show ip route

This article is part of You've asked for it series.

IS-IS is not running over CLNP

A while ago I’ve received an interesting question from someone studying for the CCNP certification: “I know it’s not necessary to configure clns routing if I’m running IS-IS for IP only, but isn’t IS-IS running over CLNS?”

I’ve always “known” that IS-IS uses a separate layer-3 protocol, not CLNP (unlike IP routing protocols that always ride on top of IP), but I wanted to confirm it. I took a few traces, inspected them with Wireshark and tried to figure out what’s going on.

You might be confused by the mixture of CLNS and CLNP acronyms. From the OSI perspective, a protocol (CLNP) is providing a service (CLNS) to upper layers. When a router is configured with clns routing it forwards CLNP datagrams and does not provide a CLNS service to a transport protocol. The IOS configuration syntax is clearly misleading.

It turns out the whole OSI protocol suite uses the same layer-2 protocol ID (unlike IP protocol suite where IP and ARP use different layer-2 ethertypes) and the first byte (NLPID) in the layer-3 header to indicate the actual layer-3 protocol. I was not able to find any table of layer-3 OSI protocol types, so I had to experiment with Wireshark to figure out the values for CLNP, ES-IS and IS-IS (yes, these three are distinct L3 protocols).

You can find all the details (including the comparison of OSI and IP protocol stacks) in the IS-IS in OSI protocol stack article in the CT3 wiki.

This article is part of You've asked for it series.

ADSL reference diagram

I’m getting lots of ADSL QoS questions lately, so it’s obviously time to cover this topic. Before going into the QoS details, I want to make sure my understanding of the implications of the baroque ADSL protocol stack is correct.

In the most complex case, a DSL service could have up to eight separate components (including the end-user’s workstation):

  1. End-user workstation sends IP datagrams to the local (CPE) router.
  2. CPE router runs PPPoE session with the NAS (Network Access Server) and sends Ethernet datagrams to the DSL modem.
  3. DSL modem encapsulates Ethernet frames in RFC 1483 framing, slices them in ATM cells and sends them over the physical DSL link to DSLAM.
  4. DSLAM performs physical level concentration and sends the ATM cells (one VC per subscriber) into the network.
  5. The backhaul network (DSLAM to NAS) could be partly ATM based. The ATM cells could thus pass through several ATM switches.
  6. Eventually the ATM cells have to be reassembled into PPPoE frames. In a worst-case scenario, an ATM-to-Ethernet switch would perform that function.
  7. The backhaul network could be extended with Ethernet switches.
  8. Finally, the bridged PPPoE frames arrive @ NAS which terminates the PPPoE session and emits the IP datagrams into the IP core network.

I sincerely hope no network is as complex as the above diagram. In most cases, the backhaul would be either completely ATM-based …

… or Ethernet based (when the DSLAM has Ethernet uplink interface):

The NAS could also be adjacent to DSLAM or even integrated in the same chassis.

Am I missing anything important? I know you could deploy numerous additional devices (for example, Cisco is promoting the Service Exchange Framework and Service Control Engine), but these devices would be placed deeper into the IP core.

ATM is like a duck

It was (around) 1995, everyone was talking about ATM, but very few people knew what they were talking about. I was at Networkers (way before they became overcrowded Cisco Live events) and decided to attend the ATM Executive Summary session, which started with (approximately) this slide …

… and the following explanation:

As you know, a duck can swim, but it's not as fast as a fish, walk, but not run as a cheetah, and fly, but it's far from being an eagle. And ATM can carry voice, data and video.

The session continued with a very concise overview of AAL types, permanent or switched virtual circuits and typical usages, but I’ve already got the summary I was looking for … and I’ll remember the duck analogy for the rest of my life. Whenever someone mentions ATM, the picture of the duck appears somewhere in the background.

If you’re trying to explain something very complex (like your new network design) to people who are not as embedded into the problem as you are, try to find the one core message, make it as simple as possible, and build around it.

Inter-VRF static routes

Swapnendu was trying to implement inter-VRF route leaking in multi-VRF environment without using route targets. He decided to use inter-VRF static routes, but got concerned after reading the following paragraph from Cisco’s documentation:

You can not configure two static routes to advertise each prefix between the VRFs, because this method is not supported. Packets will not be routed by the router. To achieve route leaking between VRFs, you must use the import functionality of route-target and enable Border Gateway Protocol (BGP) on the router. No BGP neighbor is required

There is no reason why inter-VRF static routes on point-to-point interfaces would not work. However … if Cisco's documentation states something is not supported, that's exactly what it is: not supported. It might work for you, it might not work on specific platforms and it might be broken in a future software release (like MPLS VPN on 1800 routers). You're using it at your own risk and if it stops working you can't even complain to the TAC (because they'll tell you it's unsupported).

Internet Socialism: All-I-can-eat mentality

Every few months, my good friend Jeremy finds a reason to write another post against bandwidth throttling and usage-based billing. Unfortunately, all the blog posts of this world will not change the basic fact (sometimes known as the first law of thermodynamics): there is no free lunch. Applied to this particular issue:

  • Any form of fixed Internet pricing is effectively an “all-you-can-eat” buffet. Such a buffet works as long as the visitors’ stomach sizes have comparable capacity. In the high-speed Internet world a torrent user can consume two or three orders of magnitude more resources than a regular user.
  • In an environment where a minority of users consumes most of the resources, you’re simply forced to treat the large consumers differently. Otherwise, you’re forcing the majority to pay for the excesses of the few and the majority will eventually revolt (which is why the big socialist experiments didn’t work).
  • Obviously you need to upgrade your network as the average use increases, but being forced to upgrade due to a few large consumers and distributing the costs across the whole customer base simply does not make sense (not to mention the fact that providing Internet connectivity is far away from being a lucrative business).

Unfortunately, the basic facts are usually obscured by controversies like companies choosing PR disasters over fixing their networks or Service Providers incompetent enough to call port scanning a DOS attack. It’s also highly unreasonable to expect the users to consume less than 5GB a month and charge any over-the-quota traffic without any safeguards.

There are technical solutions (for example, Cisco’s SCE) that allow the Service Providers to give each user a fair share of the bandwidth (or even limit the number of TCP/UDP sessions in a time period). However, without end-users and bloggers adopting a realistic view of the world, we’re facing a lose-lose scenario. Whatever the Service Providers do, however much they might invest in their network (and charge everyone), however reasonable their throttling/capping decisions might be, the all-I-can-eat zealots will cry foul … or am I yet again completely wrong?

Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or visit his page on Facebook.