It all started with a innocuous question: can you detect voice traffic with EEM? Looks simple enough: create a QoS class-map that matches voice calls and read the cbQosClassMapStats table in the CISCO-CLASS-BASED-QOS-MIB. The first obstacle was finding the correct indexes, but a Tcl script quickly solved that; I was ready to create the EEM applet. The applet failed to work correctly and after lots of debugging I figured out the counters in the cbQosClassMapStats table change only every 10 seconds.
I couldn’t believe my eyes and simply had to test other MIB variables as well. As expected, the IF-MIB (standard interface MIB) counters increase in real-time, but obviously someone had the bright idea that we need to detect changes in traffic profile only every now and then. Although I've received suggestions from my readers, none of them works on an 1800 or a 7200. Oh, well, Cisco developers from the days when I started working with routers would know better.
To test the MIB variable behavior I wrote a simple Tcl script to test the MIB variables. It reads the specified MIB variable at fixed intervals and prints the values, so you can monitor the changes in the MIB variable in real-time. I started low-bandwidth UDP flood across the router and monitored the output bytes interface counter. As expected the counter changed in real time and accurately tracked the amount of traffic sent through the router.
GW#pm ifOutOctets.3 public 10 1000 polling ifOutOctets.3 for 10 seconds (10 iterations) 0.000 ifOutOctets.3=42528679 1.000 ifOutOctets.3=42537767 2.000 ifOutOctets.3=42546713 3.000 ifOutOctets.3=42555719 4.000 ifOutOctets.3=42564665 5.000 ifOutOctets.3=42573611 6.000 ifOutOctets.3=42582699 7.000 ifOutOctets.3=42591645 8.000 ifOutOctets.3=42600591 9.000 ifOutOctets.3=42609537
Then I created a simple class-map and policy map …
GW#show policy-map interface
FastEthernet1/0
Service-policy output: LAN
Class-map: Voice (match-all)
20438 packets, 2902196 bytes
30 second offered rate 70000 bps
Match: access-group name Voice
Class-map: class-default (match-any)
41 packets, 3967 bytes
30 second offered rate 0 bps, drop rate 0 bps
Match: any … and monitored the pre-policy byte counter (cbQosCMPrePolicyByte64) for the Voice class. The value changed only once every ten seconds:
GW#pm cbQosCMPrePolicyByte64.50.10767521 public 10 1000 polling cbQosCMPrePolicyByte64.50.10767521 for 10 seconds (10 iterations) 0.000 cbQosCMPrePolicyByte64.50.10767521=0x002865366 1.000 cbQosCMPrePolicyByte64.50.10767521=0x002865366 2.000 cbQosCMPrePolicyByte64.50.10767521=0x002865366 3.000 cbQosCMPrePolicyByte64.50.10767521=0x002865366 4.000 cbQosCMPrePolicyByte64.50.10767521=0x002865366 5.000 cbQosCMPrePolicyByte64.50.10767521=0x002865366 6.000 cbQosCMPrePolicyByte64.50.10767521=0x00287acf8 7.000 cbQosCMPrePolicyByte64.50.10767521=0x00287acf8 8.000 cbQosCMPrePolicyByte64.50.10767521=0x00287acf8 9.000 cbQosCMPrePolicyByte64.50.10767521=0x00287acf8
What can I say ... apart from expressing my deepest disappointment :(

On distributed platforms its worse. On Cat6500/7600 even packet counters change every 10 seconds on most interfaces.
ReplyDeleteI would understand the delay on a distributed platform. After all, the counters are on a different CPU and have to be transferred every now and then. Of course one could implement a smarter on-demand scheme :)
ReplyDeleteWhat really annoys me is that they didn't even think about using real-time counters on single CPU non-distributed non-accelerated platforms. The actual counters are obviously implemented in software and reside in the same memory as the MIB structures ... and still someone had the dilbertesque idea of copying the values between the two structures every 10 seconds.
could that help by any chance?
ReplyDeleteR1(config)#event manager applet plop
R1(config-applet)#event snmp oid xxx poll-interval ?
<0-4294967295> Enter seconds[.milliseconds] value
@Xavier: Unfortunately, the problem lies in the actual values of the MIB variables, not the way EEM polls them. The values change only every 10 seconds ... and you have to be very careful with the 'event snmp poll-interval' that you don't get false positives.
ReplyDeleteCan the voice stream be detected by inspecting SIP transactions? When ip inspect sip and ip nat are both enabled the inspection engines is adding UDP translation entries for the RTP/UDP traffic.
ReplyDeleteThere are numerous ways to detect VoIP/SIP traffic. The original solution (follow the links in the article) uses access list logging and EEM applet triggering on the ACL ... but it would be so much cleaner to use the proper SNMP variables.
ReplyDeleteHi ivan
ReplyDeletei am new to cisco qos. I have a silly doubt about cbwfq. as i have read that we can only use 75% of the bandwidth for the custom queues as the rest 25% is reserved for the keepalives and routing protocol updates.
now if i want to set 50% of bandwidth for a particular queue should it 50% of the total bandwidth or 50% of the availble 75% of the bandwidth.
i couldnt find answer for this anywhere. i would be glad if you can help me out .
Regards
sebastan
Sebastan,
ReplyDeleteYou're correct, if you not put the command Max reserved-bandwith 100 percent. You have 75%. that is considered 100% and then the 50% of that.
Regards
Andy
Hello Ivan,
ReplyDeletefirst of all: thank you for sharing your so creative and precise work. It is a pleasure and I will always reference to you as the original source
I tried to rebuild a scenario and to check the values because I am also concerned about the correctness of the measurement tools. You write that the counter for a given cbQosCMPrePolicyByte64-counter of an active class i only updated every 10 seconds.
I found something different (better) which is perhaps due to the ios version I used for testing.
setting: 2811,
show policy-map int s0/0/0 | sec SSL
Class-map: SSL (match-all)
38614111 packets, 19113935500 bytes
5 minute offered rate 21000 bps, drop rate 0 bps
Match: protocol secure-http
Queueing
queue limit 64 packets
(queue depth/total drops/no-buffer drops) 2/11305151/0
(pkts output/bytes output) 27308959/12934044900
bandwidth 20% (200 kbps)
tcl-hack:
R1#more flash:showcounter3.tcl
set cnt 0
set maxcnt 20
set delay 100
puts "Gleich beginnt die Counter-Sammlung.\n"
after 1000
while {$cnt < $maxcnt } {
set zaehlarray($cnt) [exec "show policy-map interface s0/0/0 | i 191"]
set zaehlarray2($cnt) [snmp_getone INSQOSR cbQosCMPrePolicyByte64.50.13986465]
incr cnt
if { $cnt != $maxcnt } { after $delay }
}
puts "Counter-Sammlung abgeschlossen. Gleich beginnt die Ausgabe."
after 1000
set cnt 0
while {$cnt < $maxcnt } {
puts $zaehlarray($cnt)
puts $zaehlarray2($cnt)
incr cnt
}
Result:
R1#tclsh flash:showcounter3.tcl
Gleich beginnt die Counter-Sammlung.
Counter-Sammlung abgeschlossen. Gleich beginnt die Ausgabe.
38639819 packets, 19115220900 bytes
{<obj oid='cbQosCMPrePolicyByte64.50.13986465' val='0x4735b1fa4'/>}
38639826 packets, 19115221250 bytes
{<obj oid='cbQosCMPrePolicyByte64.50.13986465' val='0x4735b2102'/>}
38639834 packets, 19115221650 bytes
{<obj oid='cbQosCMPrePolicyByte64.50.13986465' val='0x4735b2292'/>}
38639844 packets, 19115222150 bytes
{<obj oid='cbQosCMPrePolicyByte64.50.13986465' val='0x4735b2486'/>}
38639853 packets, 19115222600 bytes
{<obj oid='cbQosCMPrePolicyByte64.50.13986465' val='0x4735b2648'/>}
38639862 packets, 19115223050 bytes
{<obj oid='cbQosCMPrePolicyByte64.50.13986465' val='0x4735b280a'/>}
38639868 packets, 19115223350 bytes
{<obj oid='cbQosCMPrePolicyByte64.50.13986465' val='0x4735b2936'/>}
38639874 packets, 19115223650 bytes
{<obj oid='cbQosCMPrePolicyByte64.50.13986465' val='0x4735b2a62'/>}
38639881 packets, 19115224000 bytes
{<obj oid='cbQosCMPrePolicyByte64.50.13986465' val='0x4735b2bc0'/>}
38639890 packets, 19115224450 bytes
When I compare the hex-values with the values from the show policy-map-command, they are exact the same.
What do you think? ios: c2800nm-adventerprisek9-mz.124-24.T1.bin
regards, William
Here's my wild guess: the "show policy-map" command might also update the SNMP counters.
ReplyDeleteCan you redo the tests without the "show policy-map"?
You are absolutely right! Using the same script, just taking out the show-policy-map line and the puts-line for the not used array:
ReplyDeleteseveral runs over 20 seconds:
I got 4 values: aa-bb-cc-dd where bb and cc where between 8.8 and 9 seconds long
I was wondering over the involved processes:
so the one with ONLY snmp-get showed from a second vty during the tcl-run:
R1#show proc cpu sorted 5sec
CPU utilization for five seconds: 21%/7%; one minute: 11%; five minutes: 10%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
304 1340 114 11754 11.90% 1.71% 0.36% 514 Tcl Serv - tty51
179 5436 245958062 0 1.11% 1.20% 1.20% 0 HQF Shaper Backg
2 724 197138 3 0.07% 0.04% 0.05% 0 Load Meter
65 188 3942710 0 0.07% 0.02% 0.00% 0 Netclock Backgro
56 1460 251635 5 0.07% 0.00% 0.00% 0 HC Counter Timer
180 908 9854235 0 0.07% 0.06% 0.07% 0 RBSCP Background
42 1108 985723 1 0.07% 0.03% 0.02% 0 Per-Second Jobs
7 0 2 0 0.00% 0.00% 0.00% 0 Timers
6 12 16 750 0.00% 0.00% 0.00% 0 Pool Manager
(second part in second post)
the one with snmp AND show-command (after a while waiting):
ReplyDeleteR1#show proc cpu sorted 5sec
CPU utilization for five seconds: 40%/8%; one minute: 14%; five minutes: 11%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
304 3592 84 42761 31.26% 4.76% 1.02% 514 Tcl Serv - tty51
179 5448 246005037 0 0.79% 1.15% 1.19% 0 HQF Shaper Backg
1 208 2419 85 0.15% 0.01% 0.00% 0 Chunk Manager
41 120 985852 0 0.07% 0.00% 0.00% 0 TTY Background
65 188 3943482 0 0.07% 0.02% 0.00% 0 Netclock Backgro
146 1632 1525441 1 0.07% 0.02% 0.00% 0 CEF: IPv4 proces
180 920 9856122 0 0.07% 0.06% 0.07% 0 RBSCP Background
38 10316 256052 40 0.07% 0.01% 0.00% 0 Net Background
33 60 985875 0 0.07% 0.00% 0.00% 0 GraphIt
190 144 1925419 0 0.07% 0.01% 0.00% 0 CCE DP URLF cach
258 424 4929352 0 0.07% 0.02% 0.00% 0 Atheros LED Ctro
11 0 1 0 0.00% 0.00% 0.00% 0 IPC Zone Manager
12 96 962743 0 0.00% 0.00% 0.00% 0 IPC Periodic Tim
Interesting is the double amount of total cpu-time. If the shown processes do really reveal anything about which process might be resonsible for the updating of the mib I cannot see.
Well I totally agree with you: 9-10 seconds of a dead counter in an important monitoring-mib for qos seems not to be good.
To get precise data what do you suggest? Perhaps all the netmon-vendors have to include a small script in future which executes a show-policy-map-interface-command first before they poll the counter . :-D
I thought I could get the info out of the device but know I come back to using taps and do the analysis separate.
regards, William
Ivan/William:
ReplyDeleteJust following up on this thread. It looks like William has a way to make the snmp update real-time. So my questions are: 1.) can we create a never-ending background task that runs 1 time per second that runs "show policy-map" so that our SNMP values are incrementing in real-time, in which case we could then create an EEM applet that accomplishes what I was originally trying to do?
Thanks!
-Jared
I've got a problem with my 2811 router.
ReplyDeleteI cannot use any of examples that checks snmp value. When typing "event snmp oid" commands I'm getting "Invalid input detected" pointing at snmp part.
Everything is ok with my 1841 device. I've checked all 12.4T and 15.1 versions with no luck.
I've compared "sh event manager version", and I see that in 2811 device there's no "snmp" and "snmp-notification" detectors.
Why? is this platform 'broken'? 'special'? is there something that I should enable on device to get snmp detectors working?
I'd be very glad for any help.
Thanks!