Can we agree on how to measure performance?
Recently there was a posting from an irrationally exuberant vendor employee about the Nexus 7000. Brad Reese from Network World commented on it a bit here. While engaging in a bit of a written riposte, it occurred to me that maybe we need a clearer idea, as an industry, of how to consistently measure performance.

I have heard for years of Cisco Math, Cabletron Math, etc… well, today I would like to propose that maybe we can all find some way to agree on ‘math’. After all, even though I am not that mathematically inclined, I heard once that math is an absolute truth. So here is my cut at ‘the rules’ for stating your bandwidth on a switch…

1) Do not double-count per-slot bandwidth, i.e. if a slot has 40Gbps of bandwidth and can support 4x10GbE interfaces’ worth of traffic, then it is 40Gb of bandwidth. I agree that this is 40Gb IN and 40Gb OUT, but alas it’s still 40Gb.

2) Double-count a switch fabric, i.e. a 10-slot chassis with 100Gb per slot to every slot and no blocking in the fabric would be a 2Tb switching fabric. I think the double-counting here has meaning: it is the actual capacity of the fabric, and it lets the casual observer get an indication of why a 10-slot 2Tb modular switch is better than 10 separate 20-port 10GbE switches. (Try building those 20-port configs into one multi-stage fabric offering 100 host-facing 10GbE ports and you’ll see what I am talking about…)

3) Don’t separately count the local switching capacity on a line card. Yes, most modular switches nowadays have distributed forwarding in some form or fashion at the high end. Don’t go counting that too. That’s like saying: “Well, ya see… I have 10 slots in that there chassis. I have 10 10GbE ports on each line card and 100Gb to the fabric. I can forward from port 1 to port 10 without crossing that fabric. So thus I have a 400Gb fabric on each line card and a 2Tb main fabric.
So I get 10 line cards at 400Gb each, or 4Tb, and a central 2Tb, so I have a 6Tb switch.” I don’t do this; please don’t do it either… it’s just poor form, not to mention pretty incomprehensible to most customers…

4) Packet-per-second forwarding rates. Again, single-count them. Just because the packet goes in and comes out of a forwarding engine (or the header does, if it’s not an inline FE) doesn’t mean you get to count it twice… If you can do wire rate at 10GbE, that’s roughly 15Mpps at 64-byte frames (14.88Mpps once you account for the preamble and inter-frame gap). Btw, that is a highly unrealistic model, as I have never seen a data flow of all 64-byte frames in any production network.

5) Aggregate PPS forwarding rates: if you have 100Mpps on each of 10 line cards, then you can do up to 1000Mpps, or 1Bpps. Simple…

6) Fabric redundancy: if your switch has all fabrics ACTIVE and all are utilized, count both, or all 5, or all 8, etc. If there is an N+1 or 1+1 model and the spare is NOT utilized unless there is an outage, don’t count it. Be prepared to prove this, btw, as I see this one getting messed around with a lot… Also, data sheets should state the performance of a single fabric and of a fully loaded fabric.

7) Oversubscription: I like oversubscription personally. I find that it lets you increase density and balance density and performance against cost, to maximize the number of devices connected to a single network node. But notwithstanding the philosophy of oversubscription, if you are purposefully oversubscribing something somewhere, i.e. you have a module that has 8 ports of 10GbE with a 40Gb switch fabric connection, call that out in the data sheet. Call it ‘Fabric Interconnect speed’ or something. It’s not a sin: it’s a way to increase the port count that accepts the reality that not all ports are active at the same time. I would recommend vendors put the right counters in, though, so customers can know when oversubscription is failing them…

8) Lifecycle capacity. Some switching platforms are designed with ‘headroom’, i.e.
you know you can add a higher-performance switching fabric or forwarding engine later in the platform’s lifecycle. Since customers need to know this in order to make a longer-term investment decision, it should be annotated using the same counting rules described above, but also flagged as something the platform WILL deliver in its LIFETIME, so it is understood as a future deliverable, but one the vendor is guaranteeing to customers and potential customers that they will absolutely deliver.

All right, I may come back and add some more thoughts to this today and tomorrow as they hit me. But does this make sense?

dg
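
For anyone who wants to check the arithmetic behind the rules above, here is a minimal Python sketch. The function names are my own invention for illustration, not anything from a vendor tool. The only number not taken from the rules themselves is the 20 bytes of per-frame overhead on the wire (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap), which is what turns ‘roughly 15Mpps’ into 14.88Mpps.

```python
# Illustrative sketch of the counting rules; all names here are hypothetical.

# Per-frame overhead on the wire: 7B preamble + 1B SFD + 12B inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 20


def slot_bandwidth_gbps(ports: int, port_gbps: float) -> float:
    """Rule 1: per-slot bandwidth is counted once (not in + out)."""
    return ports * port_gbps


def fabric_capacity_gbps(slots: int, per_slot_gbps: float) -> float:
    """Rule 2: a non-blocking fabric is double-counted."""
    return 2 * slots * per_slot_gbps


def wire_rate_pps(link_bps: float, frame_bytes: int = 64) -> float:
    """Rule 4: packets per second at line rate, counted once."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def aggregate_pps(line_cards: int, pps_per_card: float) -> float:
    """Rule 5: aggregate forwarding rate is the per-card rate times the cards."""
    return line_cards * pps_per_card


def oversubscription_ratio(ports: int, port_gbps: float, fabric_gbps: float) -> float:
    """Rule 7: front-panel bandwidth divided by the fabric connection."""
    return ports * port_gbps / fabric_gbps


if __name__ == "__main__":
    print(slot_bandwidth_gbps(4, 10))          # 4x10GbE slot
    print(fabric_capacity_gbps(10, 100))       # 10 slots at 100Gb each
    print(round(wire_rate_pps(10e9)))          # 10GbE at 64-byte frames
    print(aggregate_pps(10, 100e6))            # 10 cards at 100Mpps
    print(oversubscription_ratio(8, 10, 40))   # 8x10GbE on a 40Gb connection
```

With the numbers from the post, the 8-port module in rule 7 comes out at 2:1 oversubscribed, and the 10-slot chassis in rule 2 at 2000Gb, i.e. the 2Tb fabric.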