Before I came to Cisco I still wrote about Fibre Channel over Ethernet (FCoE) and did my best to try to help edumificate people on how the technology works. One of the most popular things I’ve ever written, in fact, was a comparison between FCoE and another convergence technology, iSCSI. Since that time I’ve come to learn and understand a lot more about both technologies, how they relate to each other, and how storage networks are designed and implemented using them.
Since Demartek recently published a piece on multiprotocol connectivity, which included some latency comparisons among the protocols, I thought it might be a good time to revisit some of those questions.
The conclusion was, effectively, that given the right set of circumstances and with proper planning, iSCSI has the potential to be a great performance technology. Many of the real-world limitations, though, have little to do with the protocol (the speed of the arrays has a much greater impact than the network protocol itself, for instance, and poor topology architectures can have negative impacts on performance as well).
Because of this, the FCoE “versus” iSCSI debate continues to rage on, even after all this time. In general, as I’ve mentioned before, this is a false dichotomy: FCoE and iSCSI are different tools in the Data Center toolbox, and each is implemented in different ways and for different reasons.

When it comes to performance, though, there simply hasn’t been a side-by-side comparison of the protocols using the same set of multiple switches and the same interconnects. Demartek previously had an evaluation of Fibre Channel, iSCSI, and SAS host interfaces, but that evaluation
- focused on the host interfaces,
- did not include FCoE, and
- was not designed to measure across multiple topologies.
In this latest paper, however, Demartek addresses these questions. In an “all things being equal” kind of way, what do we find when we use the same equipment and the same topologies but change the protocol? To that end, with relation to each other, Demartek shows a couple of things of note:
- Each protocol -- FC, FCoE, and iSCSI -- shows consistent results in average latency and standard deviation whether we’re talking about one switch, two switches, or three switches (topology diagrams are included in the report).
- In relation to each other, there can be a marked difference in protocol latency. For example, in the tests, the average latency is lowest for FCoE, by a considerable margin over FC and iSCSI.
To me, the amazing thing about these results is that they were remarkably consistent across the board. Moreover, because these were Layer 2 networks, there was no routing going on for the iSCSI traffic, which means that the only time that the protocol stack became an issue was at the initiator and target.
This also puts to bed the notion that there is some sort of “encapsulation penalty” for FCoE.
Interpretation
The topologies are clearly laid out in Demartek’s report. While SQLIO is a test suite that is designed to mirror real-world applications, this wasn’t a “let’s try to break the protocol” test. In fact, to my knowledge there hasn’t been a comparison like this done to this extent, just to explore what some of the differences might be.
When I sit down to really think about this, it’s rather interesting that we can use the same 3 pieces of equipment -- Nexus 5500, Nexus 7000, and MDS Fibre Channel switches -- and (for the most part) run any protocol at any point over any box (the notable exception is iSCSI on the MDS, but the others were total mix-and-match). In that way, we were able to check the differences between the protocols regardless of the topology design.
In a nutshell, latency is only one of many variables that affect the decision to use a particular protocol, but for some customers it is extremely important (even if they don’t always understand its relative context to other network performance benchmarks, like IOPS). At the very least, this testing does bring some of these answers to light.
What does this mean? Essentially it means that when it comes to Ethernet-based block storage protocols, iSCSI may have specific use cases and advantages in certain deployment opportunities, but from a protocol performance perspective it appears that my initial estimate of the TCP/IP stack affecting performance was on target. Likewise, the elegance of the FCoE frame format encapsulation does, in fact, appear to have a significant advantage over iSCSI.
More than two years ago I hypothesized that this would be the case, and it’s good to know that we finally have the numbers to show it. Even so, this is only one data point, but it is a mystery (for me, at least) that has finally been solved.
Note: The twitter link on the profile is incorrect. I can be found on Twitter with the handle @drjmetz
Tags: FCoE, Fibre Channel, iSCSI

Hi
I have talked with several vendors about things such as latency, IOPS, and data payload, and everyone has their own version of the story. But at this moment, when we are seeing extreme speeds over Ethernet, low-latency standards, and network protocols that are widely used for middleware and applications, I have some questions. What is the real purpose of FCoE? Does it make sense to consider it anything more than a transition protocol that eases a company’s road from FC to a full stack of storage over Ethernet? Why is the protocol so much more important than ease of use or simplicity in the architectures, when vendors are accelerating storage to match memory numbers?
Thanks in advance, and excuse me if I misspelled something; I’m not a native English speaker.
There are a number of reasons why people might want to have consolidated I/O. Reduced cabling, fewer physical assets (switches, cards), lower power/cooling, and even more flexible designs. FCoE provides data centers the ability to run the Fibre Channel protocol using the same Ethernet physical structure as other types of traffic instead of needing to have two separate systems.
You may find some additional insight here regarding Cisco’s vision about convergence.
What about iSER?
I believe the next 2 years will be very important to the future of FCoE adoption.
Though devices like Violin Memory’s V6000 can prove their strengths a lot better via protocols like FCoE (compared to FC, not to mention iSCSI), there remains one big question:
In the end, will we need a special storage network protocol at all?
In mid-July, the major vendors at SNIA founded a working group to standardize direct access from processors to nonvolatile memory (NAND flash, RRAM).
Maybe in the future we will see more distributed processors with hundreds of terabytes of nonvolatile memory included, just connected via high-speed Ethernet, like the Nutanix cluster.
If that happens, FCoE may meet the same fate as Token Ring: quite successful for some time, but in the end of no practical use because of a technology shift.
Hi J, for me, the most important statement in the entire article was:
“Generally speaking, the latency for 10GbE switches is measured in microseconds, while storage latencies are generally measured in milliseconds.”
In other words, the reason why the latency was virtually identical over the set of topologies tested was because the amount of latency introduced by the network was practically a rounding error when compared to the latency introduced by either the host or storage array.
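To put rough numbers on that point, here is a minimal Python sketch. The figures are hypothetical placeholders chosen only to match the orders of magnitude in the quote (a couple of microseconds per switch hop, a few milliseconds at the end points); they are not taken from the Demartek report, and `network_share` is an illustrative helper name.

```python
def network_share(switch_latency_us, hops, endpoint_latency_ms):
    """Fraction of total response time contributed by the switches."""
    network_us = switch_latency_us * hops
    endpoint_us = endpoint_latency_ms * 1000.0
    return network_us / (network_us + endpoint_us)

# Assumed values, for illustration only (not measured data).
SWITCH_LATENCY_US = 2.0     # per-hop switch latency, microseconds
ENDPOINT_LATENCY_MS = 5.0   # host plus array response time, milliseconds

for hops in (1, 2, 3):
    share = network_share(SWITCH_LATENCY_US, hops, ENDPOINT_LATENCY_MS)
    print(f"{hops} switch(es): network is {share:.3%} of total latency")
```

Under these assumptions the switches stay well under one percent of the total even at three hops, which would be consistent with the averages barely moving across the topologies.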
I said either host or storage array because I don’t know enough about the application/end points to know exactly how much latency will be introduced by each component. That having been said, I suspect the VAST majority of the total latency was due to the storage array. Although I work for EMC and would love the chance to tweak my colleagues at NetApp, the simple fact is that (as of today) the response time from the storage array (any array including ours) will always be at least one or two orders of magnitude greater than the latency introduced by the Network.
With this in mind, since Demartek was kind enough to run these tests over a set of different topologies, I suspect I now know much more about NetApp’s average response time when using different protocols than about anything else. BTW, I’ll point out that the average response times from the arrays were much longer than I would have expected. I would recommend that they re-run the test using something other than SQLIO since that application seems specifically designed to stress the storage array.
“SQLIO is a utility application provided by Microsoft that sends database I/O workloads to a storage system for the purpose of stress testing a storage system.”
If they were to use something that generated large block sequential reads and writes, they may get much more reasonable response times, but again the network latency would probably fall into the “background noise” category since we’d still probably be talking about milliseconds versus microseconds.
Regards, Erik
Demartek’s write-up is short on details. We don’t know if iSCSI was using DCB, or just plain TCP/IP flow control. We don’t know if the iSCSI host adapter was configured for all supported iSCSI and TCP offload optimizations, or if jumbo frames were used. We don’t know the host CPU utilization under each test.
I’m surprised you wouldn’t seek out these details before coming to your conclusion. You don’t think such things matter?
Also, I find it interesting that you would make a conclusion under the premise of “all things being equal”, when in fact the FCoE host was using a faster processor than the FC and iSCSI hosts.
Cheers,
Brad
I can answer some of those questions.
First, this was not lossless iSCSI, but regular TCP/IP iSCSI with full TCP offload. Second, since that offload was done on the CNA, and not on the CPU, it seemed logical that ‘all things were equal’ from the hardware perspective.
You do raise some good points, though, and as this was an exploratory test (and not supposed to be definitive), I have some ideas and plans for a more comprehensive set of tests that I want to run. Thanks for reminding me about some of the variables.
J
Thanks for the additional details, J.
If you are comparing three protocols that are each capable of utilizing lossless forwarding in the network, providing that same service to all three protocols would certainly put you in a better position to set the ‘all things being equal’ premise.
It would also help to have the iSCSI, FC, and FCoE storage arrays using similar speed drives. In the Demartek test, the FCoE array utilized 15K rpm drives, while the FC and iSCSI arrays were equipped with 10K rpm drives. That alone discredits the “FCoE tests had lower latency than iSCSI” conclusion. No?
Looking forward to seeing the results and setup of your more comprehensive tests.
Cheers,
Brad
Given the latencies involved (milliseconds instead of microseconds), even if they used flash drives for all three protocols instead of 10k and 15k drives, they probably still wouldn’t be measuring anything meaningful about the different protocols. The natural variations in response time from the “disk” would probably mask any difference between the protocols.
In other words I think it’s like trying to measure millimeter distances using a yard stick…
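The yardstick problem can be made a little more concrete with a back-of-the-envelope calculation. The sketch below uses the standard error of a difference between two sample means to estimate how many I/O samples per protocol would be needed before a small protocol difference stands out from the variation in disk response time. The 1 ms spread and 10 µs difference are assumed values for illustration, not measurements from the report, and the two-standard-error threshold is a rough rule of thumb rather than a rigorous test.

```python
import math

def samples_needed(sigma_us, delta_us, z=2.0):
    """Samples per protocol so that delta_us is about z standard errors.

    The standard error of a difference between two independent sample
    means, each from n samples with standard deviation sigma, is
    sigma * sqrt(2 / n). Solving z * sigma * sqrt(2 / n) = delta for n
    gives n = 2 * (z * sigma / delta) ** 2.
    """
    return math.ceil(2.0 * (z * sigma_us / delta_us) ** 2)

# Assumed: 1000 us of response-time spread, 10 us protocol difference.
print(samples_needed(1000.0, 10.0))  # 80000
```

With those assumed numbers, each protocol would need on the order of tens of thousands of samples before a 10 µs difference could be distinguished from noise, and that still says nothing about systematic differences such as drive speed.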
Erik
What about AoE? It is cheap, fast, and simple, and it doesn’t require any change on the hardware side.
Hi Lorenzo.
Thanks for reading. AoE was not part of our exploratory testing parameters.