
95th explanation

Can someone please explain 95th % billing to me?

My understanding is that measurements of traffic are taken every so often (hourly??) and then placed into a list from highest to lowest. Then the top 5% is removed and the next remaining value is what the 95% rate is. Is that right?
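A minimal sketch of the procedure described above, in Python. The function name and the sample values are made up for illustration, and rounding the discard count down is an assumption, since providers may round differently.

```python
def percentile_95(samples_mbps):
    """Sort samples high to low, discard the top 5%, return the next value."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps, reverse=True)
    # Number of samples thrown away. Integer math avoids float surprises;
    # rounding down is an assumption, not a universal billing rule.
    discard = len(ordered) * 5 // 100
    return ordered[discard]


# 20 hypothetical samples in Mbps: the single highest one (5%) is dropped.
samples = [10, 12, 9, 11, 300, 10, 8, 9, 10, 11,
           12, 10, 9, 10, 11, 10, 9, 8, 10, 13]
print(percentile_95(samples))  # 13
```

The one 300 Mbps spike is ignored and the billed rate is the next highest sample.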

I've also heard people talking of ways to move traffic around on specific days to "work-around" 95% billing... not sure what this means.. any ideas?

level 1
12 points · 2 months ago · edited 2 months ago

You have the right idea, but most shops record traffic at intervals much shorter than 1 hour. I'd expect the intervals to be 3-5 minutes, over 30 days. Assuming the datapoints are just traffic utilization at an exact point in time, longer intervals would likely benefit the client, since frequent bursts would be easy to miss.

As for people moving traffic around to avoid going over their SLA, it comes down to the math. If the provider measures traffic every 5 minutes over 30 days, they have 8,640 datapoints. The top 5% of those (432 datapoints, or 36 hours' worth) get thrown away. I won't go too deep into the math, but that means you could theoretically burst a 10Mbps circuit to 1Gbps for 36 hours straight and, as long as you stay below 10Mbps for the rest of the 30 days, you wouldn't get hit with overages.
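The arithmetic can be checked with a short Python sketch. It assumes 5-minute samples over a 30-day month and that exactly the top 5% of samples are discarded (rounded down); the function names are made up for illustration.

```python
def burst_budget(sample_minutes=5, days=30):
    """Return (total samples, discarded samples, free burst hours)."""
    total = days * 24 * 60 // sample_minutes
    free = total * 5 // 100  # top 5% of samples are ignored
    return total, free, free * sample_minutes / 60


def billed_rate(samples_mbps):
    """95th percentile: highest sample left after dropping the top 5%."""
    ordered = sorted(samples_mbps, reverse=True)
    return ordered[len(ordered) * 5 // 100]


total, free, hours = burst_budget()
print(total, free, hours)  # 8640 432 36.0

# Burst at 1000 Mbps for exactly the free samples, 10 Mbps otherwise.
month_ok = [1000] * free + [10] * (total - free)
print(billed_rate(month_ok))  # 10

# One extra 5-minute sample at 1000 Mbps pushes the billed rate up.
month_over = [1000] * (free + 1) + [10] * (total - free - 1)
print(billed_rate(month_over))  # 1000
```

Note that the 36 hours do not have to be contiguous; any 432 samples in the month count the same.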

I wrote this quickly as I'm short on time, so don't judge me too harshly on the content.

level 2
3 points · 2 months ago

pretty much this.

It's a similar concept to demand charges with power. If your data peak is at noon, don't set your auto updates to go at noon. Set them for the lowest point in the day.

level 2

Just one quibble - any measurement of traffic has to be an average over some amount of time.

level 3
0 points · 2 months ago

I've always thought that it used utilization points (percent, maybe?) to determine the throughput at a certain time.

E.g.: 00:00:00 is 90 and 00:00:01 is 80. To calculate the amount of data transferred, you assume a line between the two points and calculate from there. An integration of the graph of the points would then give you the volume of data transferred.

level 4
CCNP · 3 points · 2 months ago · edited 2 months ago

90 what? 80 what? Also, assuming a line between two points would be desperately inaccurate. A fundamental characteristic of most TCP/IP traffic is that it's bursty by nature.

Even if your software is asking the switch for utilization numbers, those numbers are derived from averages over time. The fundamental operation here is the device incrementing counters every time it forwards a packet. As a result, most management tools will ask for the most fundamental (and accurate) description of the data, which is the raw counters, and then do their own math to derive the averages over the intervals they need. And notably, every time a network device forwards a packet, it does so at the line rate of the interface. There's no such thing as a 1G interface transmitting at anything less than 1Gbps. It's either transmitting at 1Gbps, or it's silent. So the difference between a port at 25% utilization and one at 75% utilization is that the latter has less "dead air", so to speak, between packets. Consequently, it's more likely that when a new packet arrives and is ready to be forwarded, the interface will already be busy forwarding one or more earlier packets, and the new packet will wind up in a queue for some amount of time.

EDIT: It's likely you get it already, but the visualization I use for new guys to get them to think about this correctly is that it's not a faucet whose knob can be adjusted up and down in an analog fashion from a trickle to full flow. It's an immediate and absolute transition between full flow and fully closed.
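To put a number on the "dead air" idea, here is a rough Python sketch that treats utilization as the fraction of the interval the interface spent transmitting at line rate. The function name and figures are hypothetical, and it ignores per-frame overhead such as preamble and inter-frame gap.

```python
def busy_fraction(bytes_sent, interval_s, line_rate_bps):
    """Fraction of the interval the link was actually transmitting."""
    # Every bit goes out at line rate, so time on the wire = bits / rate.
    busy_seconds = bytes_sent * 8 / line_rate_bps
    return busy_seconds / interval_s


# 937.5 MB sent in 30 s on a 1 Gbps port: busy 7.5 s, silent 22.5 s.
print(busy_fraction(937_500_000, 30, 1_000_000_000))  # 0.25
```

So "25% utilization" on a 1G port means the port was sending at 1Gbps for a quarter of the interval and silent for the rest.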

level 5
0 points · 2 months ago · edited 2 months ago

The "what" of the 80 and 90 is what I'm unsure about: whether it's measuring Mbit/s or percent utilization.

The linear line isn't really that inaccurate. That's the fundamental principle of calculus and integration. Take a sine wave for example. You can approximate it with straight lines. The more straight lines you get, the closer you get to the actual curve.

So when a device monitors throughput, I'm thinking that it just monitors the actual throughput at a set time interval, then uses that as a data point. SNMP, for example, may take a reading every minute. You can see that on the live graphs in a program like LibreNMS. The program then takes those data points and integrates them to find the data transferred.

The other way it could happen, though I would think it is less likely, is that it takes the packet count and multiplies it by the size. That would give you a total amount of data over a time interval, which you could then convert into a data flow rate. The tough and intensive part would be knowing the size of each packet and recording that over the time interval.

With either of the methods, it's pretty obvious that the measurement systems aren't really very accurate.

level 6
CCNP · 3 points · 2 months ago · edited 2 months ago

So, to be clear, we're talking about what actually happens with mainstream SNMP monitoring applications and common networking devices, right? This isn't a mystery. It's explicit and well defined. The device itself tracks a running total of the number of packets received and transmitted, as well as the number of octets (bytes) received and transmitted. There's no guesswork there. It knows precisely how many bits it's put on the wire. Note, these are totals since the last time the counters were cleared (often at reboot, can also be done manually) or rolled over (these are either 32-bit or 64-bit numbers, and especially with 32-bit counters on high-throughput interfaces, hitting the max value and rolling over to 0 is common).

The way these values are used in typical monitoring applications is as follows:

(for simplicity I'm only talking about bytes RX, but also applies for TX, as well as packets, or any other metric of this type)

Assuming a 30 second polling interval:

At t=0, poll and record total bytes RX. In isolation this number means nothing. Since the beginning of time, this interface has received t0_rx bytes.

At t=30, poll and record the new total bytes RX. So now we know that since the beginning of time, this interface has received t30_rx bytes. t0_30_Bps = (t30_rx - t0_rx)/30. Account for possible rollover, convert from bytes to bits, maybe do some other light error handling/metadata generation, and stick it in a database for later graphing. That's what's being done. Guaranteed, no calculus. Network traffic is not sine waves at any layer between the physical and metaphysical. If you're not building transceivers or doing macro-projections on entire networks, you really care about the difference between "burst of traffic right when I polled" and "sustained traffic for the duration of the polling interval".
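The polling math above can be sketched in Python. This is an illustration under assumptions, not any particular tool's code: the function name is made up, and it assumes the counter wrapped at most once between polls and was not reset.

```python
def rate_bps(prev_octets, curr_octets, interval_s, counter_bits=64):
    """Average bits per second between two polls of an octet counter."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    modulus = 1 << counter_bits
    # Modular subtraction handles a single rollover to 0. It cannot tell
    # a rollover from a counter reset (e.g. a reboot); real tools need
    # extra checks for that case.
    delta_octets = (curr_octets - prev_octets) % modulus
    return delta_octets * 8 / interval_s  # bytes -> bits, then per second


# Normal case: 3,750,000 bytes in 30 s is 1 Mbps on average.
print(rate_bps(1_000_000, 4_750_000, 30))  # 1000000.0

# 32-bit counter that wrapped: 1000 bytes before the wrap + 2000 after.
print(rate_bps(2**32 - 1000, 2000, 30, counter_bits=32))  # 800.0
```

The result is an average over the whole polling interval, which is why a short burst and sustained traffic can look identical at this granularity.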

Obviously, where this method breaks down is that you're only as granular as your polling interval, but for most gear and most situations, that's life. If you can afford to poll faster, then do it. Otherwise you have to infer the existence of microbursts and such from other metrics.

EDIT: It'd be dishonest in this context not to mention that you can, I think, ask switches for "percent utilization" or "load" in at least a few cases. I'm not sure how widespread this is, because it's generally less useful to me. The reason it's less useful is that those numbers are generated in exactly the same way. The switch has some interval of time it considers in generating those numbers. It could be 30 seconds, or 5 minutes, or 2 hours, or whatever. But there's an interval of time, and it just keeps track of the total it had at the beginning of that interval and at the end, then does subtraction and division to give an average. It's generally easier to tune the quantity/granularity of the data you get by adjusting polling intervals than it is to reconfigure the stats-generation intervals on a fleet of network devices.

(Also I forgot to divide by 30 earlier... whoops)

level 1

And to be clear, that measurement is the 95th percentile, not 95 percent. Percentile is a statistical concept that pertains to data sets, not individual points. This article explains the concept well, although not in terms of networking.

https://www.quora.com/What-is-the-difference-between-percentile-and-percentage-1

The way I think of it in my mind is, 95% of the time our traffic is below this threshold.

level 2
CCIE · 2 points · 2 months ago

This.

I'm not happy that I am having flashbacks to my college stats classes this early in the AM. Those sucked more than finance! But there are a number of uses for the crap, especially in monitoring and analytics.

My dad always said "Almost only matters in horseshoes and hand-grenades". He never warned me of statistics!

level 3

For whatever reason, stats was one of the only higher level math subjects that really stuck with me through the years.

I had an awesome prof, so that's probably why. His entire curriculum was based on gambling games and sports stats (I. E. What does a batting average really mean? What are your chances of winning the lottery?)

level 1

There are upsides and downsides to this. As /u/redwhere said, you basically get 36 hours of burst for free per month. So if you have backups running every night, you can get away with an hour of smashing the circuit and not get penalized for it.

The downside: if at 36 hours and 5 minutes you are still hitting that circuit at 1Gbps, you are getting billed for a 990Mbps overage, which is usually a couple of dollars (up to 10 dollars) per Mbps. That means you can get hit with a 10 grand overage. My last company would usually give you some slack if it happened one freak month, but we had customers that regularly burst and happily paid the fees instead of just locking in a higher base speed at a cheaper rate.
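As a quick check on that figure, here is a small Python sketch. It assumes the overage is simply the 95th percentile rate minus the committed rate, multiplied by a flat per-Mbps price; actual contracts may differ, and the function name is made up.

```python
def overage_charge(p95_mbps, commit_mbps, price_per_mbps):
    """Charge for the portion of the 95th percentile above the commit."""
    excess_mbps = max(0, p95_mbps - commit_mbps)
    return excess_mbps * price_per_mbps


# 1 Gbps billed rate on a 10 Mbps commit at $10 per Mbps.
print(overage_charge(1000, 10, 10))  # 9900

# Staying under the commit costs nothing extra.
print(overage_charge(8, 10, 10))  # 0
```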

level 1
1 point · 2 months ago

A workaround would be moving large data files off hours and using something like QoS to control how much bandwidth is used. You could also do workarounds like the one mentioned elsewhere in this thread, where you saturate the heck out of the link for slightly less than 5% of the time so you do not get billed for the higher usage.
