
VDI “The Missing Questions” #3: Realistic Virtual Desktop Limits

So this is the Million Dollar Question, right? You, along with the executives sponsoring your particular VDI project, wanna know: How many desktops can I run on that blade? It’s funny how such an “it depends” question becomes a benchmark for various vendors’ blades, including said vendor here.

Well, for the purpose of this discussion series, the goal here is not to reach some maximum number by spending hours in the lab tweaking various knobs and dials of the underlying infrastructure. The goal of this overall series is to see what happens to the number of sessions as we change various aspects of the compute: CPU speed and core count, memory speed and capacity. (You’ll find the full list of posts in the series introduction further down this page.)

 

You are Invited!  If you’ve been enjoying our blog series, please join us for a free webinar discussing the VDI Missing Questions, with Doron, Shawn and me (Jason)!  Access the webinar here!

But for the purpose of this question, let’s simply look at the scaling numbers with the appropriate amount of RAM for the VDI count we will achieve (i.e. no memory overcommit) and the maximum allowed memory speed (1600MHz).
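To put “no memory overcommit” in concrete terms, here is a quick back-of-the-envelope sizing sketch in PowerShell. The per-desktop memory size and hypervisor reserve below are illustrative assumptions, not the actual allocations from our test template:

```powershell
# Rough host RAM sizing for a no-overcommit VDI pool.
# $vmMemoryGB and $hypervisorGB are assumed values for illustration only.
$desktopCount = 130    # the VSImax we reached on the dual E5-2665 blade
$vmMemoryGB   = 1.5    # assumed per-desktop allocation (Windows 7 32-bit)
$hypervisorGB = 8      # assumed reserve for ESX and per-VM overhead

$requiredGB = ($desktopCount * $vmMemoryGB) + $hypervisorGB
"RAM needed for $desktopCount desktops with no overcommit: $requiredGB GB"
```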

As Doron already revealed in question 1, we did find some maximum numbers in our test environment. Other than the customized Cisco ESX build on the hosts, and tuning our Windows 7 template per VMware’s View Optimization Guide for Windows 7, the VMware View 5.1.1 environment was a fairly default build-out designed for simplicity of testing, not massive scale. We kept unlogged VMs in reserve like you would in the real world to let users log in quickly…yes, that may affect some theoretical maximum number you could get out of the system, but again…not the goal.

And the overall test results look a little something like this:

                     E5-2643 Virtual Desktops    E5-2665 Virtual Desktops
1 vCPU, 1600MHz      81                          130
2 vCPU, 1600MHz      54                          93

 

As explained in Question 1, cores really do matter…but even then, surprisingly, the two CPUs are neck and neck in the race until around the 40 VM mark. Then the 2 vCPU desktops on the quad-core CPU really take a turn for the worse:


Why?

Co-scheduling!

When a VM has two (or more) vCPUs, the hypervisor must find two (or more) physical cores to plant the VM on for execution within a fairly strict timeframe to keep that VM’s multiple vCPUs in sync.

MULTIPLE vCPU VMS ARE NOT FREE!

Multiple vCPUs create a constraint that takes time for the hypervisor to sort out every time it makes a scheduling decision, not to mention you simply have more cores allocated for the hypervisor to schedule for the same number of sessions: DOUBLE that of the one vCPU VM. The only way to fix this issue is with more cores.
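To make that packing constraint concrete, here is a toy PowerShell model. This is not how the ESX scheduler actually behaves (it uses relaxed co-scheduling and is far smarter than this); it simply illustrates why doubling the vCPUs per VM doubles the scheduling demand on the same physical cores:

```powershell
# Toy model of strict co-scheduling: a 2 vCPU VM can only run when TWO
# cores are free in the same slot; a 1 vCPU VM needs just one.
function Measure-ScheduleSlots {
    param([int]$Cores, [int]$VMs, [int]$vCPUsPerVM)
    $slots = 0
    $pending = $VMs
    while ($pending -gt 0) {
        $free = $Cores
        # pack as many VMs as fit into this scheduling slot
        while ($pending -gt 0 -and $free -ge $vCPUsPerVM) {
            $free    -= $vCPUsPerVM
            $pending -= 1
        }
        # any leftover cores too few for a full VM are wasted this slot
        $slots++
    }
    return $slots
}

# 8 physical cores (think one E5-2665 socket), 40 desktops needing a slice:
Measure-ScheduleSlots -Cores 8 -VMs 40 -vCPUsPerVM 1   # -> 5 slots
Measure-ScheduleSlots -Cores 8 -VMs 40 -vCPUsPerVM 2   # -> 10 slots
```

Same 40 desktops, twice the slots to get everyone a turn; and whenever only one core is free, a 2 vCPU VM must leave it idle where a 1 vCPU VM would have used it.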

That said, the 2 vCPU VMs continue to scale consistently on the E5-2665, with its double core count compared to the E5-2643. At around the 85 session mark, though, even the E5-2665 can no longer provide a consistent experience with 2 vCPU VDI sessions running. I’ll stop here and jump off that soap box…we’ll dig more into the multiple vCPU virtual desktop configuration in a later question (hint hint hint)…

Now let’s take a look at the more traditional VDI desktop: the 1 vCPU VM:


With the quad-core E5-2643, performance holds strong until around the 60 session mark; then latency quickly builds as the 4000ms threshold is hit at 81 sessions. But what a trooper the E5-2665 is! Follow its 1 vCPU scaling line in the chart: all those cores deliver a very consistent latency line up to around the 100 session mark, where it becomes somewhat less consistent on the way to the 4000ms VSImax of 130. That’s 130 responsive systems on a single server! I remember when it was awesome to get 15 or so systems going on a dual-socket box 10 or so years ago, and we are at 10x the quantity today!

Let’s say you want to impose harsher limits on your environment. You’ve got a pool of users that are a bit more sensitive to response time than others (like your executive sponsors!). 4000ms response time may be too much, and you may want to halve that to 2000ms. According to our test scenario, the E5-2665 can STILL sustain around 100 sessions before the scaling becomes a bit more erratic in this workload simulation.


Logic would suggest half the response time may mean half the sessions, but that simply isn’t the case, as shown here. We reach a Point of Chaos (POC!) where response times and behaviors become very inconsistent as we continue to add sessions. In other words: it does not take many more desktop sessions in a well-running environment that is close to the “compute cliff” before the latency doubles and your end users are not happy. But on the plus side, and assuming storage I/O latency isn’t an issue, our testing shows that you do not need to drop that many sessions from each individual server in your cluster to rapidly recover session response time.
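If you want to experiment with where a pool tips over at different thresholds, here is a simplified sketch of that cutoff logic in PowerShell. To be clear, Login VSI’s actual VSImax calculation is more sophisticated than this; the function below just finds where a smoothed response-time curve crosses whatever threshold you pick (4000ms, 2000ms, and so on), fed with a synthetic latency curve:

```powershell
# Find the first session count where smoothed response time exceeds a
# threshold. A simplified stand-in for VSImax, for illustration only.
function Get-SessionLimit {
    param([double[]]$ResponseTimesMs, [double]$ThresholdMs = 4000)
    for ($i = 0; $i -lt $ResponseTimesMs.Count; $i++) {
        # average a 5-sample sliding window to smooth single-sample spikes
        $start = [Math]::Max(0, $i - 4)
        $avg   = ($ResponseTimesMs[$start..$i] | Measure-Object -Average).Average
        if ($avg -gt $ThresholdMs) { return $i + 1 }   # sessions are 1-based
    }
    return $ResponseTimesMs.Count   # threshold never crossed
}

# Synthetic curve: flat around 1500ms, then climbing fast past the "cliff"
$samples = 1..130 | ForEach-Object { 1500 + [Math]::Pow([Math]::Max(0, $_ - 100), 2) * 6 }
Get-SessionLimit -ResponseTimesMs $samples                     # 4000ms limit
Get-SessionLimit -ResponseTimesMs $samples -ThresholdMs 2000   # stricter limit
```

Notice that on a curve this steep, halving the threshold trims the session count only modestly, which mirrors what we saw in our testing.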

So in conclusion, the E5-2643, with its high clock speed and lower core count, is best suited for smaller deployments of less than 80 desktops per blade. The E5-2665, with its moderate clock speed and higher core count, is best suited for larger deployments of greater than 100 desktops per blade.

 

Next up…what is the minimum amount of normalized CPU SPEC a virtual desktop needs?

 


VDI “The Missing Questions” #1: Core Count vs. Core Speed

January 31, 2013 at 8:40 am PST

Choosing the right compute platform for your VDI environment requires both science and art. You have to balance CPU and memory characteristics against your expected workload profile and your desired density. At the end of the day, VDI has to meet some cost criteria in order to go from a fun science project to a funded program in your company. That means you can’t just throw the top bin CPU at the problem; you have to pick the right CPU. This is further complicated by the fact that there is not one CPU that is ideal for all VDI workloads. There is no magical bill of materials at the end of this series of blogs, but we will attempt to help you make your VDI decisions based more on science than art.

Strength in numbers? Or strength in speed? As Tony said in his introduction, we had several involved questions related to VDI that we honestly couldn’t answer… so we decided to start testing. This will be a series of blogs that attempts to answer practical questions like “when is processor A better than processor B?” And of course you then have to ask “when is processor B better than processor A?” In this first installment in the series, I will tackle the question of whether the number of cores or the core speed is more important when the goal is to achieve the best desktop density per host. (A handy guide to the other posts in this series appears in the introduction post further down this page.)

The usual suspects. Throughout this series, we will focus on two processors. We picked them because they are popular and cost effective, yet quite different from each other. They are not top bin processors. Take a look at the table below for a comparison.

Note: Prices in this table are recommended prices published by Intel at http://ark.intel.com and may vary from actual prices you pay for each processor. The SPEC performance numbers are an average of SPEC results published by many OEMs (at http://www.spec.org/) across many platforms. These are not Cisco-specific SPEC numbers.



VDI – The Questions You Didn’t Ask (But Really Should)

There’s no shortage of content out there (a quick Google search easily confirms this) when it comes to looking for vendor-originated material touting the latest server performance benchmarks for hosted virtual desktops.  Being part of that community, I’m pretty sure I have my fingerprints on more than one such piece of collateral – and I’m constantly reminded of this when we run into questions along the lines of “yeah, {xxx} desktops on a blade is great, but c’mon, you and I both know we’d never do that in practice”.  It’s a balancing act of demonstrating solution performance, intersected with the practical reality of what IT managers would reasonably support in a production environment.

So what really matters?  If I’m implementing VDI for the first time, and I’m trying to make intelligent decisions around CPU, memory speed, IOPS, etc., where do I go?  VDI is unique in its consumption of compute, storage and network resources when compared to other workloads hosted in the data center.  Much of the performance benchmarking info put out by server manufacturers is not specific to VDI performance, or to how user experience might be impacted by simple decisions like choice of clock speed or # of vCPUs.

Thankfully, there are folks in my company that care a LOT about such questions.  So much so that a small, VDI-proficient group of them took it upon themselves to design and build an in-house lab environment with one express purpose: exhaustively exploring and documenting the performance and scalability impacts seen when configuring your compute platform for VDI.  No stone left unturned – things like CPU cores, clock speed, memory speed, vCPU, memory density and more – all fair game.

The findings are extremely valuable to anyone deploying VDI, and what this team discovered is a set of real-life “questions”.  The “Missing” questions if you will – those questions that are noticeably absent or never sufficiently exposed in marketing materials, when it comes to the practical choices you can make that most significantly impact the cost, scalability and performance of your virtual desktop implementation.

So let me start with an introduction.  Over the next few weeks, you’re going to hear from some peers of mine – Doron Chosnek, Jason Marchesano, and Shawn Kaiser.  They’re Cisco Consulting Systems Engineers, and they live and breathe VDI (I know, melodramatic), as implemented in their customers’ data centers around the world.

They undertook this journey with the express purpose of answering the “missing” questions, by assembling a test platform in their lab, built on Cisco Unified Computing System (UCS), using readily available components including:

  • Various UCS B200 M3 configurations
  • Login Virtual Session Indexer (Login VSI) 3.6.1 benchmark
  • Login VSI’s Medium with Flash workload
  • VMware View 5.1.1
  • Microsoft Windows 7 SP1 32-bit virtual desktops
  • Pure Storage FlashArray with Purity version 2.0.2.

Keep in mind that their goal was not to explore maximum scalability, prescribe a preferred design/architecture, or even tell you what kind of server blade or processor you should use for VDI.  Instead they relied on commonly available gear easily found in our customers’ data centers.  If you want prescriptive design guidance, Cisco CVDs are ideal for that, and you can find them here.

So let’s talk about their test environment.

Physical Lab

The physical environment shown below is a highly overprovisioned system.  Only one B200 M3 blade was tested at any one time, yet every logical link between elements shown consists of multiple 10-GbE links or multiple 8-Gb Fibre Channel links.

The storage array has 24 flash disks and is capable of substantially higher IOPS than used for this testing. All the infrastructure machines used for this test (Active Directory, VMware vCenter, VMware View, VSI Launchers) are virtual machines on the B230 M2 blade in the environment.

[Figure 1: Physical lab environment]

 

Note: At the time of testing, the Pure Storage array had not completed UCS certification testing.

 

Logical Server Environment

[Figure 2: Logical server environment]

The tests involved two UCS B200 M3 blades, one with dual E5-2665 processors and the other with dual E5-2643 processors.  The 2643 is a 4-core, high clock/burst speed processor, and the 2665 is an 8-core, medium-to-high clock/burst speed processor.  Here are the specs for the CPUs chosen:

[Figure 3: CPU specifications]

Now, you may wonder, are either of these THE processor you would choose for VDI?  Not necessarily! 

Keep in mind the goal we set out with: to expose the relative impacts of # of cores, clock speed, memory speed, # of vCPUs, etc.  What you’ll take away from the results is guidance on which parameters matter for specific types of VDI deployments.  You can then safely look at a VDI “workhorse” processor like the E5-2680 or E5-2690, apply what our CSEs have learned through this testing to that class of CPU, and make your best selection there.

The tests were conducted using Login VSI’s Medium with Flash workload generator.  As we explore the test results in this series, you’ll see reference to “VSImax”, which defines the threshold past which the user experience will be unacceptable.  The VSImax threshold will appear on supporting graphs that show the performance curve under various test scenarios.  You can learn more about how this threshold is derived here.

[Table 1: Test environment details]

So that’s the test environment.  Through this series – let’s call it VDI – the Questions You Didn’t Ask (But Really Should) – our CSE friends (Shawn, Doron, and Jason) will explore and expose the findings they’ve documented for us, dealing with a new “question” each time.  If you join us for this journey, it’ll be worth your while – you’ll come away with a better appreciation of the impact that some simple decisions in your data center compute configuration can make.

So are you ready for the journey?  You’ll find the questions (answered thus far) below:

  1. VDI “The Missing Questions” #1: Core Count vs. Core Speed
  2. VDI “The Missing Questions” #2: Core Speed Scaling (Burst)
  3. VDI “The Missing Questions” #3: Realistic Virtual Desktop limits
  4. VDI “The Missing Questions” #4: How much SPECint is enough
  5. VDI “The Missing Questions” #5: How does 1vCPU scale compared to 2vCPU’s?
  6. VDI “The Missing Questions” #6: What do you really gain from a 2vCPU virtual desktop?
  7. VDI “The Missing Questions” #7: How memory bus speed affects scale
  8. VDI “The Missing Questions” #8: How does memory density affect VDI scalability?
  9. VDI “The Missing Questions” #9: How many storage IOPs?
  10. VDI “The Missing Questions” Conclusion

Special Web Event -- You’re Invited!

If you’re enjoying our series, be sure to join our free webcast, where Shawn, Doron and Jason will discuss all the (Missing) VDI Questions Live + take your Q&A.  Access the webcast here.

Featured Whitepaper Now Available!

Need a convenient whitepaper-ized version of the discussion?  Download it now, here.


Announcing Cisco Jabber for Virtual Environments with the New Release of Cisco VXI

These are exciting times. Today Cisco announced the latest release of the Cisco Virtualization Experience Infrastructure (VXI) Smart Solution and I am very pleased to share this news with you. Cisco has unveiled a new software strategy to support Cisco Jabber for virtual environments as an integral part of Cisco VXI. Cisco has taken this path to innovation based on how our customers use the Cisco VXI Smart Solution today for desktop virtualization and from trends in the market. We continue to see strong growth in desktop virtualization and in new collaborative experiences, not to mention the ongoing demand for BYOD and mobility.

Cisco VXI was the first desktop virtualization architecture to eliminate the bottlenecks and overloads that often occur with rich media collaboration. Today we are evolving that architecture further by including Cisco Jabber for virtual environments which — thanks to Cisco Virtualization Experience Media Engine (VXME) — leverages the computing and processing power of the local environment to minimize the impact of rich media on network performance and data center resources.

Cisco VXME enables virtual desktop users to take advantage of Cisco Jabber’s suite of collaboration features like voice calling, high-definition video calling, presence and instant messaging. Meanwhile, virtual desktops, applications and collaboration services are centrally hosted on the Cisco Unified Data Center and delivered to a broad array of devices resulting in a seamless user experience. It’s just like using a traditional local desktop.

With today’s announcement, Cisco VXI becomes the first desktop virtualization solution to integrate network-based Quality of Service. The Cisco VXME software makes the network aware of voice and video traffic and automatically prioritizes it, reducing jitter and delays. The result? IT managers are now able to easily deliver a high quality collaboration experience to their virtual desktop user communities.

Not only do these innovations create a stellar user experience, they also meet security needs. Virtual desktops become a mirror of traditional workspaces, and as such provide the same level of secure access to documents, corporate applications, and a full suite of collaboration tools via Cisco Jabber.

Additionally, users are now able to personalize their virtual workspace experience with our new desktop accessories from Jabra and Logitech. You really have to check them out.

We continue to work closely with our partners, who are fully enabled to implement an end-to-end VXI Smart Solution. Find out what this new release means to them.

Right now, Cisco VXME is designed to work with the Cisco Virtualization Experience Client (VXC) 6215 and will be globally available in March of this year. Support for 3rd party thin clients and Windows PCs will follow during the first half of 2013. Cisco Jabber for virtual environments is compatible with Cisco VXI solutions running Citrix XenDesktop, Citrix XenApp, or VMware View 5.1.  Read Citrix and VMware perspectives on these innovations.

To learn more about Cisco’s desktop virtualization strategy and see a demonstration of  Cisco Jabber for virtual environments and the new UC accessories, I invite you to join me for the Cisco Collaboration Announcement Webcast with live Q&A on Jan 17 from 9-9:30 a.m. Pacific Time (replay available after 11 a.m. Pacific Time).

Phil


Automate Migrating ESX Host Interfaces to Nexus 1000V

January 16, 2013 at 8:45 am PST

“We’ve tried, it can’t be automated!” I’ve heard this more times than I can keep track of, and if you read my previous blog you will know that I just do not agree. I have written about automation with Linux utilities, UCS PowerTool, AutoHotKey, Excel, etc… 99.999% of operations can be automated. So when a customer tells me that something cannot be automated, I usually respond with “Have you tried …?”

Here is the scenario: the customer has an automated build process for ESX hosts. At the point where the host is ready to be connected to the Nexus 1000V, the process becomes manual. The customer would like to use VMware PowerCLI to migrate the host interface, but the cmdlet that retrieves Distributed Virtual Switches, Get-VirtualSwitch, just returns the DVS objects; there isn’t a cmdlet to migrate the ESX vmnic interface.

Hold on a second: I know that vCenter knows about the Nexus 1000V because I see it in the interface. I know that vCenter can manipulate the Nexus 1000V because vCenter is where interface migration is done. I am fairly certain at this point that ESX interface migration from the vCenter vSwitch to the Nexus 1000V can be automated. But what do we use to do it? There is no PowerCLI command like Set-ESXHostInterfaceToN1kv. This is typically where automation ends for many; sometimes you have to dive deep into the objects that the system manages and figure out what to do. And sometimes someone has already done a deep dive into something like what you are trying to do, and maybe you can build off of their work.
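The deep dive in question is dropping from PowerCLI cmdlets to the raw vSphere API objects that Get-View exposes. Here is a sketch of the shape that takes; treat it as illustrative rather than a finished script, and note that the switch name, host name, and vmnic below are placeholders for your environment:

```powershell
# Sketch: add an ESX host uplink (vmnic) to a Nexus 1000V DVS by calling
# the vSphere API directly, since no cmdlet does this for us.
$dvsName  = "N1KV-DVS"             # placeholder names -- substitute your own
$hostName = "esx01.example.com"

# Get-VirtualSwitch returns the DVS object; Get-View exposes its API methods
$dvsView = Get-VirtualSwitch -Distributed -Name $dvsName | Get-View
$hostRef = (Get-VMHost -Name $hostName | Get-View).MoRef

# Build a reconfig spec that joins the host and assigns vmnic1 as an uplink
$spec = New-Object VMware.Vim.DVSConfigSpec
$spec.ConfigVersion = $dvsView.Config.ConfigVersion

$member = New-Object VMware.Vim.DistributedVirtualSwitchHostMemberConfigSpec
$member.Operation = "add"          # use "edit" if the host is already a member
$member.Host = $hostRef

$pnic = New-Object VMware.Vim.DistributedVirtualSwitchHostMemberPnicSpec
$pnic.PnicDevice = "vmnic1"

$backing = New-Object VMware.Vim.DistributedVirtualSwitchHostMemberPnicBacking
$backing.PnicSpec = @($pnic)
$member.Backing = $backing

$spec.Host = @($member)
$dvsView.ReconfigureDvs($spec)     # or ReconfigureDvs_Task to poll the task
```

The point stands: where the cmdlet coverage ends, the API underneath it does not.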
