FCoE Fact or Foul? Fickle FUD Forces Frustration For Folks
In recent weeks, several articles about FCoE have appeared that try to distill some of the technologies for their readers, with mixed success. One of the articles was pretty far off the mark; it prompted me to write to the author to offer some corrections and clarification.
At first I thought that the article might have been simply a matter of laziness or FUD, but I didn’t want to jump to conclusions about motives, and I’m glad that I didn’t. The author sent me a very thorough email outlining where he got his information. Having read it, I can not only fathom how he came to understand things the way he did, but also empathize with his frustration.
In short, it’s not his fault. At all. He’s frustrated, and I feel his pain.
Sometimes I feel like Sisyphus, condemned to roll his stone up the hill only to be robbed at the summit by having it roll back to the bottom. I’ve been trying very hard to be as accurate as I can with respect to FCoE, but there are some pretty heavy hitters lined up against me who knock that stone back down to the bottom of the hill.
For his part, the author cited quotations from numerous white papers and vendor documentation that should have been 1) technically accurate and 2) marketing neutral. They were neither.
We’re not talking about little faux pas here; we’re talking about things that are just flat-out wrong. Quote after quote from several vendors (not just one) would have led any reasonable person to think the exact opposite of how FCoE actually works. If you can spare the time, let me share with you what I mean.
Muddying The Waters
A few examples are worth sharing to illustrate the point (vendor names have been removed to protect the guilty):
Vendor #1: “The enhanced transmission selection (ETS) algorithm will strengthen the ability of FCoE to reliably use Ethernet as a transport layer and minimize the chance of link congestion and frame loss.”
ETS does nothing of the kind. It has nothing to do with reliability, congestion, or frame loss. ETS has to do with bandwidth allocation and groupings.
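To make “bandwidth allocation and groupings” a little more concrete, here is a toy Python sketch of the outcome ETS is meant to produce: each traffic class group is configured with a percentage of the link, and bandwidth a group does not need is offered to groups that still want more. The function name `ets_share`, the group names, and the percentages are all made up for illustration, and the loop models the resulting split rather than the way any real switch schedules frames.

```python
def ets_share(link_gbps, percent_by_group, demand_gbps):
    """Toy model of an ETS-style bandwidth split (hypothetical helper).

    Each group is offered its configured percentage of the link. Bandwidth
    that a group does not need is re-offered to groups with unmet demand,
    in proportion to their configured percentages.
    """
    if sum(percent_by_group.values()) != 100:
        raise ValueError("group percentages must add up to 100")

    granted = {group: 0.0 for group in percent_by_group}
    active = {g for g in percent_by_group if demand_gbps.get(g, 0.0) > 0}
    spare = float(link_gbps)

    while active and spare > 1e-9:
        weight = sum(percent_by_group[g] for g in active)
        if weight == 0:
            break  # only zero-percent groups still want bandwidth
        pool = spare  # what is on offer in this round
        for g in list(active):
            offer = pool * percent_by_group[g] / weight
            take = min(offer, demand_gbps[g] - granted[g])
            granted[g] += take
            spare -= take
            if demand_gbps[g] - granted[g] <= 1e-9:
                active.discard(g)  # this group is satisfied
    return granted


# Assumed example: a 10 Gbps link split 50/40/10 between three groups.
shares = ets_share(
    10,
    {"lan": 50, "fcoe": 40, "ipc": 10},
    {"lan": 10.0, "fcoe": 2.0, "ipc": 0.0},
)
```

In this made-up case FCoE only asks for 2 Gbps of its 40%, so LAN ends up with roughly 8 Gbps instead of 5. Nothing in the calculation touches reliability, congestion, or frame loss; it is purely about who gets how much of the link.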
Vendor #2: “PFC allows Fibre Channel storage traffic encapsulated in FCoE frames to receive lossless service from a link that is being shared with traditional LAN traffic, which is loss-tolerant.”
While technically correct, the sentence implies that FCoE and LAN traffic are thrown willy-nilly onto the link and that one type can interrupt the other. This is misleading.
Vendor #3: “For example, with PFC, if storage traffic has a higher priority than LAN traffic and a large storage transfer causes congestion, PFC can be engaged to pause the storage transfer and let the LAN transfer proceed.”
In this case “priority” is used in the colloquial sense, i.e., there is a hierarchy of prioritization where some traffic is more important than others. In the case of DCB networks, “priority” is a misnomer because it relates to classes of service rather than how important they are. In either case, PFC does not “let LAN transfer proceed,” it focuses on making one type of traffic lossless – that’s all. It has nothing whatsoever to do with acting as a traffic cop for permitting which traffic should be transmitted and which should not.
Vendor #1 again: “Based on the priority information collected through PAUSE, the server stops sending any traffic for that specific application while the other applications continue to make progress without disruption on the shared link.”
Neither PFC nor PAUSE (these are two completely different mechanisms, though the terminology is reused) “collects priority information.” This is pure fantasy. As a result, neither PAUSE nor PFC has any role in whether or not “other applications continue to make progress without disruption on the shared link.”
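For readers who want to see how different the two mechanisms are on the wire, here is a minimal Python sketch of the MAC Control payloads as I understand them: a classic PAUSE frame carries one opcode and one timer for the whole link, while a PFC frame carries an enable bit and a timer for each of the eight priorities. The function names are hypothetical, and the sketch leaves out the Ethernet header, padding, and FCS.

```python
import struct

PAUSE_OPCODE = 0x0001  # classic link-level PAUSE
PFC_OPCODE = 0x0101    # Priority-based Flow Control (802.1Qbb)


def pause_payload(quanta):
    """Opcode plus a single pause timer that applies to the entire link."""
    return struct.pack("!HH", PAUSE_OPCODE, quanta)


def pfc_payload(quanta_by_priority):
    """Opcode, a priority-enable vector, and one timer per priority 0..7."""
    enable_vector = 0
    timers = [0] * 8
    for priority, quanta in quanta_by_priority.items():
        if not 0 <= priority <= 7:
            raise ValueError("priority must be between 0 and 7")
        enable_vector |= 1 << priority  # mark this priority as addressed
        timers[priority] = quanta       # how long that priority should pause
    return struct.pack("!HH8H", PFC_OPCODE, enable_vector, *timers)


# Assumed example: pause only priority 3 for the maximum time.
frame = pfc_payload({3: 0xFFFF})
```

Either way, the frame is just an instruction from a receiver to its link partner: stop sending (everything, or the named priorities) for this long. There is no field in which anything could be “collected.”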
Vendor #4 (ironically, from a company that does not currently offer any FCoE products): “Typically in the data center FCoE traffic will be assigned to the higher priority classes. This ensures that congestion due to other less sensitive traffic between servers will not cause loss of FCoE storage traffic.”
Again the definition of “priority” causes havoc here. Typically FCoE is assigned to CoS (or “priority”) 3. Now, this means that you will have to be careful if you’ve assigned other types of traffic to CoS/“priority” 3, but it does not mean that FCoE is any more or less important than traffic on, say, priority 2 or 5. These are non-hierarchical lanes.
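A small Python sketch may help show why the number is a label and not a rank. The “priority” is a 3-bit field in the 802.1Q VLAN tag, and whether a lane is lossless depends only on whether PFC has been enabled for that value. The helper names, the VLAN ID of 100, and the choice of enabling PFC on priority 3 alone are assumptions made for illustration.

```python
def make_tci(priority, vlan_id, dei=0):
    """Build the 16-bit Tag Control Information field of an 802.1Q tag."""
    if not 0 <= priority <= 7:
        raise ValueError("priority is a 3-bit value (0..7)")
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID is a 12-bit value")
    return (priority << 13) | ((dei & 1) << 12) | vlan_id


def priority_from_tci(tci):
    """Read the 3-bit priority (CoS) back out of the top bits of the TCI."""
    return (tci >> 13) & 0x7


def is_lossless(priority, pfc_enabled):
    """A lane is lossless if PFC is enabled for it; its number is irrelevant."""
    return priority in pfc_enabled


# Assumed configuration: PFC enabled on priority 3 only (the usual FCoE lane).
pfc_enabled = {3}
tci = make_tci(priority=3, vlan_id=100)
fcoe_is_lossless = is_lossless(priority_from_tci(tci), pfc_enabled)
```

With this configuration, priority 5 is numerically “higher” than 3 and priority 2 is “lower,” yet neither gets lossless treatment, because membership in the PFC-enabled set is the only thing being checked.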
Again, from Vendor #1 (different white paper): “In addition, 802.1Qbb can leverage prioritization to establish bandwidth allocation on a per-application basis. Time-sensitive applications such as inter-process communications (IPC) can be given a higher percentage of available bandwidth as needed while other applications are assured portions of the remaining available bandwidth.”
802.1Qbb (PFC) says no such thing. Priority Flow Control is, well, Priority Flow Control. Bandwidth allocation is part of 802.1Qaz, Enhanced Transmission Selection. That controls the transmission selection and enhances it (i.e., bandwidth management). (By the way, the Qaz document also defines how devices can communicate configuration information and establish correct settings, also called DCBX.)
This is not an exhaustive list of what he shared with me, or of what I have found for myself. Let’s not even get into the confusion surrounding QCN (Congestion Notification) and TRILL at this point. It’s sufficient to point out that all the errors mentioned above are compounded.
Is it any wonder this is frustrating? Remember, these come from technical white papers, not marketing material or press releases!
So Why Bother?
Despite some criticisms to the contrary, it might be easy to assume that this is a case of “the PM protests too much!” After all, it’s in my best interest to promote FCoE at all costs, no matter what: damn the torpedoes, full speed ahead!
Actually, not so much. It’s in my best interest to help customers and partners understand the role storage-over-Ethernet (whether it be FCoE or not) plays in their Data Center, even if that role is “not at all.”
I have tried to be consistent in my approach: I have never said (nor would I state) that FCoE is right for all customers on all occasions and in all situations.
I do think that for certain situations and for certain customers FCoE is a very cool technology that can allow for some very interesting (and cost effective!) solutions to long-term issues within the Data Center.
But how are those customers supposed to know if they are the ones who would benefit the most? How are they supposed to make decisions if all they get is crap like this? Pity the poor reporter whose job it is to break it down into plain English, stuck because he relies on the very documents that are completely unreliable!
I mean, really, can he (or anyone else) be blamed when public statements made by these vendors about aspects of FCoE cannot be trusted?
I don’t think so.
The practical upshot is that there is a gap between how things really work and what customers (and partners) are learning about the technology. For my part, while I may not have any control over what other vendors are writing about, I do have some influence over what my company presents (limited though it may be), and I have 100% control over the accuracy and accountability of my own tweets, blogs, and presentations.
As a result, I strive very hard to be as accurate as I can be. Things change, and sometimes I’m flat-out incorrect. But at least I can make the promise to remain accountable to what I write.
Obviously I can’t stop the misinformation single-handedly, but it doesn’t mean that I can’t get that rock up to the summit. If anyone wants to help, come on board.