VDI “The Missing Questions” #2: Core Speed scaling (Burst)
Welcome back as we continue to dive deeper into advanced CPU (Central Processing Unit; I once had a “tech writer” change a document on me because he assumed that, in this day and age, people still needed the CPU acronym spelled out... but I digress) and memory concepts in the land of VDI. Last week Doron answered our first question and told us about Core Count vs. Core Speed for scalable VDI. This week we will focus specifically on Core Speed and bursting, and introduce a potentially new subject called “SPEC Blend/Core” for high performance VDI. If you are finding this blog post for the first time, I encourage you to check out the Introduction from Tony, as it will help set the stage for our discussion. Here is the full table of contents:
- Introduction – VDI – The Questions you didn’t ask (but really should)
- VDI “The Missing Questions” #1: Core Count vs. Core Speed
- VDI “The Missing Questions” #2: Core Speed Scaling (Burst) YOU ARE HERE!
- VDI “The Missing Questions” #3: Realistic Virtual Desktop Limits
- VDI “The Missing Questions” #4: How much SPECint is enough
- VDI “The Missing Questions” #5: How does 1vCPU scale compared to 2vCPU’s?
- VDI “The Missing Questions” #6: What do you really gain from a 2vCPU virtual desktop?
- VDI “The Missing Questions” #7: How memory bus speed affects scale
- VDI “The Missing Questions” #8: How does memory density affect VDI scalability?
- VDI “The Missing Questions” #9: How many storage IOPs?
VMs are only as fast as their individual cores! Let’s look at what this statement means. Example: Assume we have a 1GHz x 4 core processor (hey, it makes the math easy for me). When we carve up a server VM, or in this case a VM to be used for VDI, we can’t just give it 2 vCPUs and say it’s got a 2GHz processor. The reality is that it has a dual 1GHz processor. This becomes an important concept in VDI when you are considering the quantity and QUALITY of the vCPUs you allocate to a virtual machine, and ultimately the efficiency of the end user’s applications and the overall scalability of the server platform. This is not a uni-processor vs. multi-processor application discussion. We could easily have a very long discussion and debate on the ins and outs of application level efficiencies and the operating system’s ability (and sometimes inability) to properly manage multiple CPUs. Instead, we are going to expand upon the two CPUs we tested and dig into per core performance.
CPU Burst vs. CPU Reservation. Let’s play around with our example 1GHz x 4 core processor a bit more. If we take this single processor and deploy 8 single vCPU desktops on it, we will have a 500MHz CPU Reservation per VM. The calculation is simple: 1GHz x 4 cores = 4,000MHz total, and 4,000MHz / 8 VMs = 500MHz per VM. So the Reservation is simply the average amount of CPU that is available to each VM (assuming everything is prioritized equally). But our Burst is different. Our Burst represents the maximum amount of a CPU core that any one VM could ever utilize. In this example, each VM has a single vCPU and can never use more than one core, so the Burst per VM is 1GHz.
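The Reservation and Burst arithmetic above can be sketched in a few lines of Python. This is only an illustration of the example numbers: the function name `cpu_shares` is made up for this post, and it assumes single vCPU VMs that are all prioritized equally.

```python
def cpu_shares(core_mhz, cores, vms):
    """Return (reservation, burst) in MHz per VM.

    Assumes every VM has a single vCPU and equal priority
    (an illustrative simplification, not a hypervisor model).
    """
    total_mhz = core_mhz * cores      # total CPU capacity of the processor
    reservation = total_mhz / vms     # average share available to each VM
    burst = core_mhz                  # a 1 vCPU VM can never exceed one core
    return reservation, burst


# The made-up 1GHz x 4 core processor hosting 8 single vCPU desktops
reservation, burst = cpu_shares(core_mhz=1000, cores=4, vms=8)
print(f"Reservation: {reservation:.0f} MHz/VM, Burst: {burst:.0f} MHz/VM")
# Reservation: 500 MHz/VM, Burst: 1000 MHz/VM
```

Note that adding more VMs lowers the Reservation but leaves the Burst untouched, which is why the two numbers need to be considered separately.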
SPEC CINT2006… SPEC CFP2006… SPEC Blend/Core? Huh? You are probably familiar with SPEC.org, but if you are not, visit the site to learn about SPEC and what they do. One thing you will not find at SPEC.org is this concept of SPEC Blend/Core, or even Blend for that matter. Blend is simply a combination of the CINT2006 Rate and the CFP2006 Rate that gives a more rounded view of the processor’s performance capabilities. To understand SPEC Blend/Core, we need to calculate Blend for our processor first.
The formula is simple: (CINT2006 Rate + CFP2006 Rate) / 2 = Blend
Blend/Core is just a slight variation of the formula, where we take the Blend value and divide it by the total number of cores that our processor has. Let’s ditch the made up 1GHz x 4 core processor and look at a real processor, the E5-2643.
Intel E5-2643: (187.5 + 167.5) / 2 = 177.5 Blend. This 177.5 Blend is for a single E5-2643 processor, so to determine Blend/Core, we will divide it by 4 total cores. 177.5 Blend / 4 Total cores = 44.38 SPEC Blend/Core.
Likewise we can calculate this for the E5-2665 processor as well.
Intel E5-2665: (305 + 233.5) / 2 = 269.25 Blend. This 269.25 Blend is for a single E5-2665 processor, so to determine Blend/Core, we will divide it by 8 total cores (since the E5-2665 is an 8 core CPU). 269.25 Blend / 8 Total cores = 33.66 SPEC Blend/Core.
As you can see the E5-2665 has significantly more CINT2006/CFP2006 capacity when compared to the E5-2643, but the E5-2643 has a much higher Blend/Core number due to its higher bursting frequency.
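For anyone who wants to run the same numbers for their own processors, here is a minimal Python sketch of the Blend and Blend/Core calculations. The function names are just illustrative, and the CINT2006 Rate and CFP2006 Rate inputs are the per-processor values quoted above.

```python
def blend(cint_rate, cfp_rate):
    """Blend = average of the CINT2006 Rate and CFP2006 Rate scores."""
    return (cint_rate + cfp_rate) / 2


def blend_per_core(cint_rate, cfp_rate, cores):
    """Blend/Core = Blend divided by the processor's total core count."""
    return blend(cint_rate, cfp_rate) / cores


# (CINT2006 Rate, CFP2006 Rate, cores) for a single processor, as quoted above
processors = {
    "E5-2643": (187.5, 167.5, 4),
    "E5-2665": (305.0, 233.5, 8),
}

for name, (cint, cfp, cores) in processors.items():
    print(f"{name}: Blend = {blend(cint, cfp):.2f}, "
          f"Blend/Core = {blend_per_core(cint, cfp, cores):.2f}")
# E5-2643: Blend = 177.50, Blend/Core = 44.38
# E5-2665: Blend = 269.25, Blend/Core = 33.66
```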
What does it mean? So we have calculated SPEC Blend/Core; here is how it relates to our question regarding Core Speed and Burst. Blend/Core is the combined SPEC performance delivered by each individual core. Why don’t we just use the GHz of the core to model from? Well, not all CPU cores are created equal. Example:
You can see from these two different Intel processors that we would be foolish to treat them as “Equal” just because they are both 4 core x 3.3GHz CPUs. The Intel E5-2643 is clock for clock almost 3x faster! Virtual machines running in these environments may have the same CPU Reservation and Burst in MHz, but they will perform substantially differently due to the SPEC horsepower difference between these processors. Let’s bring all of this back to the E5-2643 processor vs. the E5-2665 processor that we tested with.
The SPEC Blend/Core of the E5-2643 is 32% higher than that of the E5-2665, as you can see in the left chart. This translates into roughly a 25% increase in VMs/Core, as seen in the right chart. After all, a more powerful core should be able to run more VMs per core.
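As a quick sanity check on the Blend/Core gap, here is a small Python sketch using the Blend/Core values calculated earlier. The helper name `percent_higher` is made up for illustration. One thing worth noticing is that the percentage depends on which processor you treat as the baseline.

```python
def percent_higher(a, b):
    """How much higher a is than b, as a percentage of b."""
    return (a / b - 1) * 100


# Blend/Core values derived from the SPEC numbers quoted earlier
e5_2643 = 177.5 / 4    # 44.375
e5_2665 = 269.25 / 8   # 33.65625

print(round(percent_higher(e5_2643, e5_2665)))  # 32  (E5-2643 vs. E5-2665)
print(round(percent_higher(e5_2665, e5_2643)))  # -24 (E5-2665 vs. E5-2643)
```

So the E5-2643 core is about 32% stronger than an E5-2665 core, while the E5-2665 core is about 24% weaker than an E5-2643 core. Same gap, different baseline.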
Applying this to the test scenarios, we can start seeing the impact of both 1vCPU and 2vCPU deployments at both 1600MHz and 1066MHz speeds between the 2 processors.
Answer: More Burst, More better! Doron answered our first question with Core Count being the king for a scalable VDI deployment. That answer is undeniably true after the barrage of testing we did on these two platforms. However, you should not dismiss the quality of your cores so quickly. If you are designing a high performance VDI environment where user responsiveness is key, it may be better to consider higher bursting processors at lower user counts per server to keep users happy. As an example, consider this: If we were to place 2vCPU VMs on an E5-2665 system as our “high performance” VDI deployment, we tested up to about 93 virtual desktops. Compare this to a 1vCPU E5-2643 deployment, where we get 81 virtual desktops. Sure, you can scale a handful of desktops higher on the E5-2665 system, but each of its cores delivers about 24% less SPEC Blend than an E5-2643 core (put the other way, the E5-2643’s Blend/Core is 32% higher). Depending on the multi-threaded efficiency of your applications, you may actually see significantly better desktop performance from the 1vCPU E5-2643 than from the 2vCPU E5-2665. This will especially be true if your applications and operating system are not well optimized for multiple CPUs. Let’s be real, this is a DESKTOP environment, not a server environment where applications are built to scale on large SMP (Symmetric Multi-Processing; that tech writer might be spying on me) machines. Could a 1vCPU VM on a high burst system be the best solution for your demanding users?
What’s next? Two questions down, 7 more to go over the next few weeks. Stay tuned as our next question will go into realistic deployment expectations on these 2 platforms.