Understanding CPU Over Commitment


Over commitment, in its simplest terms, means allocating more resources to virtual workloads than are available at the physical level. The most commonly over committed resources are memory and CPU.

A simple example of over commitment is running 3 VMs, each with 4 GB of RAM, on an ESXi host that has only 8 GB of RAM. Here we have allocated 12 GB of RAM to the VMs collectively, but at the physical level (the ESXi host) only 8 GB is available.
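To make the arithmetic in this example concrete, here is a minimal Python sketch. The function name overcommit_ratio is made up for illustration and is not part of any VMware tool; the numbers are the ones from the example above.

```python
def overcommit_ratio(allocated, physical):
    """Return allocated / physical; anything above 1.0 means over commitment."""
    if physical <= 0:
        raise ValueError("physical capacity must be positive")
    return allocated / physical


# Three VMs with 4 GB each on a host with 8 GB of RAM (the example above).
vm_ram_gb = [4, 4, 4]
host_ram_gb = 8

total_allocated = sum(vm_ram_gb)
ratio = overcommit_ratio(total_allocated, host_ram_gb)
print(f"{total_allocated} GB allocated on an {host_ram_gb} GB host, ratio {ratio:.2f}")
```

The same helper works for CPU if you pass total vCPUs and total pCPUs instead of gigabytes.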

It is a general belief among novice VMware admins that allocating more resources to virtual machines means better performance. When I started working with VMware, I used to think the same way.

It was in the vSphere Optimize and Scale training that I learned this is not true, and how over commitment can badly affect VM performance.

In this post I will demonstrate the negative effects of CPU over commitment.

Before diving into the demonstration, let's recall a few terms that will help in understanding the concepts more clearly.

  • pCPU-    Number of physical CPUs (sockets) X number of cores per CPU. For example, a host with 2 CPUs of 2 cores each has a total of 4 pCPUs.
  • vCPU-    CPUs allocated to a virtual machine.
  • esxtop-  Performance analysis utility, usually run from the ESXi console or over SSH.
  • %RDY-    CPU ready time: the percentage of time a virtual machine was ready to run but could not be scheduled because the physical CPUs were busy.
  • %CSTP-   The percentage of time a virtual machine was ready to run but was held back (co-stopped) while waiting for physical CPUs to become available for its other vCPUs. It applies only to virtual machines configured with multiple vCPUs.
  • %WAIT-   The percentage of time the virtual machine was waiting for some VMkernel activity (such as I/O) to complete before it could continue.
  • %IDLE-   The percentage of time a world is in the idle loop.

For the purposes of this demonstration, I created 2 instances of Windows Server 2003, a RHEL 6 64-bit VM (with vCloud Director installed), and the vShield Manager VM.

The host on which these VMs are running has 6 GB of RAM and 8 CPUs X 2.4 GHz.

CPU allocation for the VMs is as follows:

Server 2003-1: 1 vCPU

Server 2003-2: 1 vCPU

RHEL Server:   1 vCPU

vShield Manager: 2 vCPU

Let's begin with the demonstration:

1: Power on all VMs

cpu-1

2: Run esxtop

cpu-2

In general:

  • %RDY (CPU Ready) should be low
  • %CSTP should be 0.00
  • %IDLE should be high, indicating that a large share of resources is sitting idle, waiting for something to do
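"Low" is easier to judge per vCPU. As far as I know, esxtop shows %RDY and %CSTP for a multi-vCPU VM as a sum across the worlds in its group, so dividing by the vCPU count gives a rough per-vCPU figure. The sketch below is only that: the function names are hypothetical, the 10% and 3% limits are commonly quoted rules of thumb rather than anything measured in this post, and the sample readings are invented to exercise the code.

```python
def per_vcpu_percent(group_percent, num_vcpus):
    """Approximate per-vCPU value from a group-level esxtop percentage."""
    if num_vcpus < 1:
        raise ValueError("a VM needs at least one vCPU")
    return group_percent / num_vcpus


def check_vm(name, rdy_percent, cstp_percent, num_vcpus,
             rdy_limit=10.0, cstp_limit=3.0):
    """Return a list of warnings; the limits are assumed rules of thumb."""
    warnings = []
    rdy = per_vcpu_percent(rdy_percent, num_vcpus)
    cstp = per_vcpu_percent(cstp_percent, num_vcpus)
    if rdy > rdy_limit:
        warnings.append(f"{name}: %RDY per vCPU is {rdy:.1f} (limit {rdy_limit})")
    if cstp > cstp_limit:
        warnings.append(f"{name}: %CSTP per vCPU is {cstp:.1f} (limit {cstp_limit})")
    return warnings


# Invented readings for two hypothetical 8-vCPU VMs, not taken from the screenshots.
quiet = check_vm("example-vm-a", rdy_percent=48.0, cstp_percent=12.0, num_vcpus=8)
busy = check_vm("example-vm-b", rdy_percent=96.0, cstp_percent=40.0, num_vcpus=8)
for line in quiet + busy:
    print(line)
```

Treat the limits as a starting point for investigation, not as hard pass/fail values.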

3: Now power off the RHEL Server and Server 2003-1 VMs

4: Change the number of cores per socket to 8 for both VMs

Select 1 socket and 8 cores per socket to make the total vCPU count 8
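It helps to put numbers on what this change does to the host. The sketch below adds up the vCPUs before and after step 4 and compares them with the host's 8 CPUs. It is plain arithmetic on the allocations listed earlier; the variable and function names are made up for illustration.

```python
HOST_PCPU = 8  # "8 CPU X 2.4 GHz" from the host description

# vCPU allocation at the start of the demonstration.
before = {
    "Server 2003-1": 1,
    "Server 2003-2": 1,
    "RHEL Server": 1,
    "vShield Manager": 2,
}

# After step 4, two VMs are reconfigured to 1 socket X 8 cores.
after = dict(before)
after["Server 2003-1"] = 8
after["RHEL Server"] = 8


def vcpu_to_pcpu_ratio(allocation, pcpu):
    """Total vCPUs divided by physical CPUs."""
    return sum(allocation.values()) / pcpu


ratio_before = vcpu_to_pcpu_ratio(before, HOST_PCPU)
ratio_after = vcpu_to_pcpu_ratio(after, HOST_PCPU)
print(f"before: {sum(before.values())} vCPU on {HOST_PCPU} pCPU, ratio {ratio_before}")
print(f"after:  {sum(after.values())} vCPU on {HOST_PCPU} pCPU, ratio {ratio_after}")
```

So the host goes from 5 vCPUs on 8 pCPUs (ratio 0.625) to 19 vCPUs on 8 pCPUs (ratio 2.375), which is where the over commitment comes from.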

cpu-3

cpu-4

5: Power on the 2 VMs and look at the esxtop stats again

cpu-5

Geez, look at the %RDY and %CSTP times.

  • A high %RDY value indicates that vCPUs are waiting for physical CPUs to become available so that their threads can be scheduled.
  • A high %CSTP value indicates that ESXi is making the vCPUs wait, a.k.a. co-stopping them, for scheduling purposes.

Now we will run a script inside these VMs that drives the CPU to 100% utilization.

For the Linux VM there is a script available on GitHub.

Note: Before running the cpuload.sh script, make sure the Linux VM has the 'stress' and 'cpulimit' RPMs installed.

Copy the code from GitHub, save it as cpuload.sh, and run the following commands:

#  chmod +x cpuload.sh
# ./cpuload.sh [cpu load in percent] [duration in seconds]

For example, to force CPU utilization to 75% for 50 seconds, run the command as shown below:
# ./cpuload.sh 75 50
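
If you cannot install stress and cpulimit, the idea of "a given load percentage for a given duration" can be approximated with a simple duty cycle: spin for part of each short time slice and sleep for the rest. The Python sketch below is my own illustration of that idea, not the cpuload.sh script from GitHub, and it loads only a single core; burn_cpu and slice_s are hypothetical names.

```python
import time


def burn_cpu(load_percent, duration_s, slice_s=0.1):
    """Keep one core busy for roughly load_percent of each time slice."""
    if not 0 <= load_percent <= 100:
        raise ValueError("load_percent must be between 0 and 100")
    busy_s = slice_s * load_percent / 100.0
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        slice_start = time.monotonic()
        # Busy phase: spin until this slice's busy budget is used up.
        while time.monotonic() - slice_start < busy_s:
            pass
        # Idle phase: sleep for the rest of the slice, but never past the end.
        remaining = slice_s - (time.monotonic() - slice_start)
        if remaining > 0:
            time.sleep(min(remaining, max(0.0, end - time.monotonic())))


# Short demo run: about 75% load on one core for half a second.
start = time.monotonic()
burn_cpu(75, 0.5)
print(f"ran for about {time.monotonic() - start:.2f} s")
```

Calling burn_cpu(75, 50) would roughly mirror the command above for one core. To load every vCPU in the VM you would need to start one such worker per vCPU, for example in separate processes.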

With cpuload.sh running on the Linux VM, the top command shows CPU stats like those below:

cpu-6

If we look at the esxtop stats, we will see something like this:

cpu-7

Holy crap. Look at the jump in the %RDY and %CSTP values.

For Windows there is a nice utility called Load Storm. This utility can be downloaded from here.

Note: Load Storm requires .NET Framework 3.5 to be pre-installed.

cpu-8

To generate CPU load, adjust the Number of threads and Approx core load per thread values. You can also tick the Generate load for checkbox if you wish to keep the CPU busy for longer than the default one minute.

On generating a 100% workload, Windows reported CPU utilization as 100%.

cpu-9

Let's see the esxtop stats

cpu-10

As expected, you can see the rise in the %RDY and %CSTP values for the Windows VM.

If we check the ESXi host CPU stats in the GUI, we can see the CPU is almost maxed out.

cpu-11

The performance chart also shows a sudden rise in CPU utilization from the point where the CPU load was running on the Windows and Linux VMs together.

cpu-12

In my case the GUI was reporting almost full utilization of the physical CPUs. But in production environments it is very common to see ESXi host utilization somewhere near 50-60% while VMs still report high %RDY and %CSTP values in esxtop.
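
One thing to keep in mind when comparing the GUI with esxtop: to my knowledge, the vSphere performance charts report CPU ready as a summation in milliseconds per sampling interval, not as a percentage. The commonly documented conversion divides the summation by the interval length. The sketch below assumes the 20-second real-time chart interval; the function name and the 1000 ms sample value are made up for illustration.

```python
def ready_ms_to_percent(ready_ms, interval_s=20):
    """Convert a CPU ready summation (ms) into a percentage of the interval.

    interval_s defaults to 20, the real-time chart interval (an assumption
    here; historical charts use longer intervals).
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return ready_ms * 100.0 / (interval_s * 1000.0)


# Illustrative value only: 1000 ms of ready time in a 20 s sample.
sample_percent = ready_ms_to_percent(1000)
print(f"1000 ms ready in a 20 s interval is {sample_percent:.1f}%")
```

For a multi-vCPU VM the chart value for the whole VM is again a sum across vCPUs, so divide by the vCPU count if you want a per-vCPU figure.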

High %RDY and high %CSTP indicate an over-allocation of CPU resources, i.e. too many vCPUs for the job at hand.

Now I am going to power off the RHEL and Windows VMs, set each to 2 vCPUs, and re-run the CPU load tools to see if there is any difference in the %RDY and %CSTP values.

I chose to run 4 threads on 2 vCPUs in the Windows VM.

cpu-13

cpu-14

After changing the vCPU count, esxtop shows the following readings:

cpu-15

You can see there is a significant drop in the %RDY and %CSTP values for both VMs.

Conclusion

Assigning a VM fewer vCPUs than it has threads to run is more efficient than assigning it more vCPUs than it needs. Even if we match the number of vCPUs exactly to the number of threads, we still risk raising %RDY and %CSTP if the total number of vCPUs in the environment is over committed.


About Alex Hunt

Hi all, I am Manish Kumar Jha, aka Alex Hunt. I currently work at VMware Software India Pvt Ltd as an Operations System Engineer (vCloud Air Operations). I have around 5 years of IT experience, with exposure to VMware vSphere, vCloud Director, RHEL and modern data center technologies such as Cisco UCS, Cisco Nexus 1000v and NSX. If you find any post informative, please like it, share it on social media, and leave a comment if you want to discuss it further. Disclaimer: All the information on this website is published in good faith and for general information purposes only. I don't make any warranties about the completeness, reliability and accuracy of this information. Any action you take upon the information you find on this blog is strictly at your own risk. The views and opinions published on this blog are my own and not the opinions of my employer or any of the vendors of the products discussed.

5 Responses to Understanding CPU Over Commitment

  1. Shankar Shashi says:

    Great post! It's always a challenge to get Application teams to understand this.
    Personally I have seen the impact of compute resource (CPU & memory) over commitment in our vSphere environments.
    When the physical fleet gets virtualised, we also right size compute resources where required. When the right sizing is done, some of the teams feel they have lost the capability which they had in the physical world.
    Reality is that this unused capability will eliminate issues in the virtual environment and drive the organisation cost down in terms of power and cooling.
    Thanks for the good post.


    • Alex Hunt says:

      With the advent of virtualization, it has become more challenging for administrators to right size the workloads to get optimal performance. And yes this fact is very true that it is always challenging to convince application owners that assigning more resources to their application would not guarantee best performance.


  2. Mordock says:

    Your post does a good job of showing that putting too much load on a physical host is bad. But that is DUH! situation. If you put 16 vCPUs of load on an 8pCPU box, you will see high ready and co-stop.

    But that is not the problem that most people fall into. Now take your two 8-way VMs and put a 3 vCPU load on them. That way you are putting a total of 6 vCPUs of load on your 8 pCPU box (75%). Then look at your ready and co-stop as well as the thru-put within the VMs. Then drop your VMs down to 3 vCPUs each with the exact same load and see what happens.


  3. TQuinnelly says:

    Great post Alex. I think I talk about this topic at least weekly. Nice to see it broken down a bit more than I usually explain it.
