
Monday, February 20, 2012

About the VMware Tools of ESXi 5.0 and why you should install them on vSphere 4.x

There is a rather new VMware KB article available that describes an interesting problem with the VMware Tools version of ESX(i) 4.1 Update 2: If the clock resolution of a Windows VM has been changed from the default, the VMware Tools service will continually consume about 15% CPU (in a 1-vCPU VM; about 7% in a 2-vCPU VM, and so on).
We have seen this problem on a few of our VMs; it looks like there are certain Windows applications around that change the clock resolution and thus cause the problem. Detailed background information about the Windows clock resolution (and why it is not a good idea to change it) is available from Microsoft.
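If you want to check whether a given VM is affected, the Sysinternals ClockRes utility displays the current Windows timer resolution. A minimal check, assuming you have downloaded clockres.exe into the VM:

   REM Shows the maximum, minimum and current timer interval.
   REM A current interval well below the usual ~15.6 ms default suggests
   REM that some application has raised the clock resolution.
   clockres.exe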

The resolution documented in the KB article is to downgrade the VMware Tools to an earlier version or - and this is probably surprising for most of us - to install the VMware Tools version of ESXi 5.0 instead.
This reminds us of the fact that VMware has changed their VMware Tools support policy with the introduction of vSphere 5.0: The VMware Product Interoperability Matrixes now include a selection for the VMware Tools, and it shows that the Tools of ESXi 5.0 are "interoperable" not only with ESXi 5.0 but also with ESX(i) 4.1 and even ESX(i) 4.0:

VMware Tools Interoperability Matrix
 ... whereas earlier versions were only interoperable with the corresponding ESX(i) version.

So, if you are still on vSphere 4.1 or 4.0 and are planning to upgrade to vSphere 5 sooner or later, you can start deploying the VMware Tools of ESXi 5.0 now and avoid the effort of future Tools upgrades.
You can download the latest version of the ESXi 5.0 tools here at packages.vmware.com.

If you run a manual custom installation of the ESXi 5.0 tools in a Windows VM you will notice that there are some new components included:
VMware Tools 5.0 components default selection
The default selection of components (this is what you get when doing an automatic install or upgrade) is now more suitable for VMware ESXi than it was with earlier versions of the Tools, but it still includes two components that are useful for VMware Workstation and completely useless when running on ESXi: the Record/Replay driver and the Audio driver. Earlier versions of the Tools would also install the Shared Folders component by default, although it too is only useful with VMware Workstation.
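If you roll out the Tools with an automated install, you can deselect such components via standard MSI properties. A hedged sketch only - the /S /v"/qn" syntax is VMware's documented silent install, but the feature names behind REMOVE are assumptions, so verify them against your Tools version before relying on this:

   REM "Hgfs" (Shared Folders) and "Audio" are assumed feature names:
   setup.exe /S /v"/qn REBOOT=R ADDLOCAL=ALL REMOVE=Hgfs,Audio"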

One last hint: There is still another "feature" in the VMware Tools package for Windows that I personally find very annoying: Once you have installed the Tools, you are by default not able to modify or repair the installation through the "Add or Remove Programs" control panel applet. To fix this, find the GUID key for the VMware Tools package in the registry under
   HKLM\Software\Microsoft\Windows\CurrentVersion\Uninstall
and change the NoModify and NoRepair values there to 0.
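From an elevated command prompt, the whole fix could look like this ({GUID} is a placeholder for the key you find in the first step):

   REM Find the uninstall key of the VMware Tools package (the GUID differs per version):
   reg query "HKLM\Software\Microsoft\Windows\CurrentVersion\Uninstall" /s /f "VMware Tools" /d
   REM Re-enable Modify and Repair (replace {GUID} with the key found above):
   reg add "HKLM\Software\Microsoft\Windows\CurrentVersion\Uninstall\{GUID}" /v NoModify /t REG_DWORD /d 0 /f
   reg add "HKLM\Software\Microsoft\Windows\CurrentVersion\Uninstall\{GUID}" /v NoRepair /t REG_DWORD /d 0 /f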

Sunday, October 30, 2011

vSphere 4.1 Update 2 released - What's in it for me (and you)

VMware released Update 2 for vSphere 4.1 on Oct 27th. It includes numerous bug fixes for the vCenter server and client (see VC resolved issues) and ESXi (see ESXi resolved issues).

 I will list some of the fixes here, because I personally welcome them very much, and I'm sure that others will feel the same:
  • The vSphere client performed badly under Windows 7 because of frequent screen redraws when the Windows desktop composition feature was enabled. The only workaround was to disable desktop composition while running the vSphere client. This should be fixed now.
  • There are multiple fixes and enhancements to ESXi syslogging:
    • If ESXi fails to reach the syslog server while booting, it now retries every 10 minutes.
    • Very long syslog messages (like those produced by the vpxa agent ...) are no longer truncated or split into multiple lines. If you are using a third-party solution like Splunk for collecting syslog messages, you will certainly welcome this, because it is nearly impossible to handle split messages correctly there.
  • But the most important issue that is resolved in Update 2 is this: "Virtual machine with large amounts of RAM (32GB and higher) loses pings during vMotion". Uh, what? VMs losing pings while being vMotioned? Yes, this can really happen (without Update 2), and I have personally experienced it: When one of two clustered Microsoft Exchange 2010 VMs with 48GB RAM was vMotioned, it lost network connectivity for more than 15 seconds (at between 20 and 30% of the vMotion progress), which triggered a cluster failover. We have not yet verified that this particular issue is really resolved with Update 2, but VMware Support had pointed us to it when we complained about the problem, so there is a good chance ...

Tuesday, August 23, 2011

How to throttle that disk I/O hog

We are in the middle of a large server virtualization project and are utilizing two Clariion CX-400 arrays as target storage systems. The load on these arrays is increasing as we put more and more VMs on them. This is expected, but recently we noticed an unusual and unexpected drop in performance on one of the CX-400s. The load on its storage processors went way up, and its cache was quickly and repeatedly filled up to 100%, causing so-called forced flushes: the array has to briefly stop accepting incoming I/O while it destages the cache contents to the hard disks in order to free the cache up again. As a result, overall latency went up and throughput went down, and this affected every VM on every LUN of this array!

As the root cause we identified a single VM that fired up to 50,000(!) write I/Os per second. It was an MS SQL Server machine that we had recently virtualized. When it was on physical hardware, it used locally attached hard disks that were never able to provide this amount of I/O capacity, but now - being a VM on high-performance SAN storage - it took every I/O it could get, monopolizing the storage array's cache and bringing it to its knees.

We found that we urgently needed to throttle that disk I/O hog, or it would severely impact the whole environment's performance. There are several means to prioritize disk I/O in a vSphere environment: You can use disk shares to distribute available I/Os among the VMs running on the same host. This did not help here: the host that ran the VM had no reason to throttle it, because its other VMs did not require lots of I/Os at the same time. So, from the host's perspective there was no need to fairly distribute the available resources.
Storage I/O Control (SIOC) is a rather new feature that allows for I/O prioritization at the datastore level. It utilizes the vCenter server's view of datastore performance (rather than a single host's view) and kicks in when a datastore's latency rises above a defined threshold (30ms by default). It will then adapt the I/O queue depths of all VMs that are on this datastore according to the shares you have defined for them. Nice feature, but it did not help here either, because the I/O hog had a datastore of its own and was not competing with other VMs from a SIOC perspective ...

We needed a way to throttle the VM's I/O absolutely, not relative to other VMs. Luckily there is a way to do exactly this: It is documented in KB1038241 "Limiting disk I/O from a specific virtual machine". The article describes VM advanced-configuration parameters that allow you to set absolute throughput caps and bandwidth caps on a VM's virtual disks. We did this, and it really helped to throttle the VM and restore overall system performance!
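As a sketch, the .vmx entries for capping a VM's first virtual disk could look like this (parameter names as documented in KB1038241; the values are just examples):

   sched.scsi0:0.throughputCap = "500IOps"
   sched.scsi0:0.bandwidthCap = "10MBps"

The throughputCap limits the number of I/Os per second, the bandwidthCap the number of bytes per second; scsi0:0 refers to the first disk on the first virtual SCSI controller.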

By the way, the KB article describes how to change the VM's advanced configuration by using the vSphere client, which requires the VM to be powered off. However, there is a way to do this without powering the VM off. Since this can be handy in a lot of situations, I added a description of how to do it on the HowTo page.

Update (2011-08-30): In the comments of this post, Didier Pironet pointed out that there are some oddities with using this feature and refers to his blog post Limiting Disk I/O From A Specific Virtual Machine, which features a nice video demonstrating the effect of disk throttling. Let me summarize his findings and add another interesting piece of information that was clarified and confirmed by VMware Support:
  • Contrary to what KB1038241 states, you can also specify plain IOps or Bps values for the caps (e.g. "500IOps"), not only KIOps, MIOps or GIOps and KBps, MBps or GBps. If you do not specify a unit at all, IOps or Bps is assumed, not KIOps/KBps as stated in the article.

  • The throughput cap can also be specified through the vSphere client (see VM properties / Resources / Disk), but the bandwidth cap cannot. This can even be done while the machine is powered on, and the change becomes effective immediately.

  • And now the part that is the least intuitive: Although you specify the limits per virtual disk, the scheduler will manage and enforce them on a per-datastore(!) basis. That means:

    • If the VM has multiple virtual disks on the same datastore and you want to limit one of them, then you must specify limits (of the same type, throughput or bandwidth) for all the virtual disks that are on that datastore. If you don't, no limit will be enforced at all.

    • The scheduler will add up the limits of all virtual disks that are on the same datastore and will then limit them altogether by this sum. This explains Didier's finding that a single disk was limited to 150IOps although he had defined a limit of 100IOps for it, but another limit of 50IOps for a second disk on the same datastore (see the sketch after this list).

    • So, if you want to enforce a specific limit on only a single virtual disk, you need to put that disk on a datastore where no other disks of the VM are stored.
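To illustrate the summing behavior with a sketch (hypothetical values, both disks on the same datastore): with the following entries, scsi0:0 and scsi0:1 are throttled to 150IOps combined, not to 100IOps and 50IOps individually:

   sched.scsi0:0.throughputCap = "100IOps"
   sched.scsi0:1.throughputCap = "50IOps"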


Thursday, July 28, 2011

[Update] ESXi-Customizer 1.1 - bugfix release

I published an updated version of my ESXi-Customizer script. There was an annoying bug in the "Advanced edit" mode causing the oem.tgz file to become corrupted during re-packaging. This has been fixed, and I also added an update-check feature that lets the script check for newer versions of itself.

Download it from the project page.

Saturday, July 16, 2011

Improve your vSphere client's performance

Are you tired of staring at this window?
vSphere Client taking ages to load a VM view
If you manage a vSphere environment with several hundred VMs, you might notice a disturbing slowness in screen refreshes when you initially look at lists of many VMs, refresh such views or re-sort them by clicking on the attribute columns.

We have been struggling with this for a long time (in fact, since we upgraded to vSphere 4) without ever finding out how to improve or resolve this.
Now I got the tip to look at VMware's KB1029665. It describes exactly this symptom and recommends tuning the Java memory pool of the Tomcat installation that is used on the vCenter server.

And yes, it got better after implementing this! Don't expect miracles - the first load of the complete VM view will still be slow, but subsequent viewing, sorting and scrolling is faster than without this modification.
However, you need to be aware that this actually changes the memory footprint of the vCenter server. So you might want to review its RAM configuration. Easy, if you have it running as a VM ...
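For reference, on a Windows-based vCenter server the Tomcat JVM memory is set in the Java Service Wrapper configuration. This is a hedged sketch only - the file location and the appropriate heap size depend on your vCenter version and inventory size, so take the actual values from KB1029665:

   # Example wrapper.conf entries (values in MB; placeholders, not recommendations):
   wrapper.java.initmemory=256
   wrapper.java.maxmemory=1024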

Friday, July 8, 2011

Using hardware-assisted virtualization in Windows Server 2003 32-bit virtual machines

This is the title of a VMware KB article (KB2001372) that was recently posted, and it includes very interesting information for anyone running virtualized Windows 2003 servers on vSphere (so, probably all of us).

ESX(i) is able to use different methods for virtualizing the CPU and the associated MMU (memory management unit) instruction sets. You can configure this for a VM under its Advanced Options / CPU/MMU Virtualization:

CPU/MMU virtualization settings
In Binary Translation (BT) mode, software emulation is used for both CPU and MMU instructions (the second choice in the picture). For a long time this was the only option, until the CPU vendors Intel and AMD started building virtualization functions into their processors.
Choosing the third option will enable these hardware functions for the CPU instruction set virtualization (if available), but will remain using software virtualization for MMU instructions.
The fourth option will enable hardware virtualization for both types of instructions if available.
Whether none, only the first, or both of the hardware virtualization options are available depends on the CPU generation. For quite a few years now, Intel and AMD processors have supported both CPU and MMU virtualization.

The default in the above dialog is "Automatic". This means that ESX(i) will choose what it considers to be the best option for the type of operating system that you have selected for the VM.
With Windows 2003 this is the "Software" mode. The reason is that Windows 2003 with SP1 actually performs better with software emulation than with hardware virtualization. However, this changed with the code changes Microsoft introduced in SP2: Windows 2003 with SP2 performs better with hardware virtualization in almost every case.
Today, most Windows 2003 servers should have been updated to SP2. So, to ensure the best performance, you should change the virtualization mode of these VMs to one of the hardware-assisted ones.
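By the way, the UI choices correspond to VM advanced-configuration parameters, so the change can also be scripted. A sketch, assuming the commonly documented parameter names (verify them against your vSphere version):

   monitor.virtual_exec = "hardware"
   monitor.virtual_mmu = "hardware"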

For more details see the KB-article mentioned above.

Friday, July 1, 2011

Mysterious port 903

I recently investigated what network ports are used by ESXi 4.1, because I had to compile the firewall requirements for a new deployment of ESXi hosts in a DMZ. There is a detailed source available for that in the VMware KB:
  • KB1012382: TCP and UDP Ports required to access vCenter Server, ESX hosts, and other network components
And there are numerous other sources available (even nice diagrams like this one). In most cases it is obvious that their authors referred to and relied on the above-mentioned official VMware KB source.

I'm usually not paranoid, but maybe I talked too much with the IT security guys (who tend to be extremely paranoid ;-)). Anyway, following the rule "Trust no one" I started looking at the network ports that are really used in our current production environment and compared them to the list in the KB article.

So I stumbled over port 903 ... According to the list, both the vCenter server and any vSphere Client connect to an ESXi 4.1 host on that port to access VM remote consoles. However, when I checked the network connections on the vCenter server and on my Windows desktop running the vSphere Client (with "netstat -an"), I was not able to see any connection to an ESXi host's port 903, even with multiple VM consoles open. Instead it was obvious that port 902 is used for console connections.

This made me really curious, so I logged on to an ESXi host (in Tech Support Mode) and checked the open network connections there. In ESXi you use the command "esxcli network connection list" for that, which produces output quite similar to netstat's (with classic ESX, the netstat command is still available in the service console).
This command also lists all ports that are open in LISTEN mode, meaning some process is waiting for connections on that port. But there was no listening process for port 903, and that means that no one and nothing would be able to connect to that port!
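If you want to reproduce the check, the following should do. On the ESXi 4.1 host (in Tech Support Mode):

   esxcli network connection list | grep 903

And on the vCenter server or the vSphere Client machine:

   netstat -an | findstr "903"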

I opened a support request with VMware asking for clarification on the mysterious port 903 and was very curious about their answer. Of course, they first quoted their own KB article and insisted that the port was actually used for this and that, but finally - after raising the issue to engineering - they admitted that "ESXi does not use port 903".
A request was also made to update the KB article accordingly. So, by the time you read this, it might already have been corrected to no longer include port 903, but the numerous third-party documents based on KB1012382 will take some more time to be updated ...

Bottom line: Information is good. Correct information is better. Try to verify it if it is really important to you.

Saturday, June 25, 2011

A quick primer on Changed Block Tracking (CBT)

We are about to implement a new backup solution that is based on Symantec NetBackup 7, and - like any modern VMware backup solution - it leverages a very cool feature named Changed Block Tracking (CBT) that was introduced in vSphere 4.0 to enable efficient block-level incremental backups.

Since it has been around for a while, there are numerous good articles about the topic (see the references below). I will not just reproduce them here, but summarize the most important facts you need to know if you encounter CBT for the first time.

1. How does CBT work and what is it good for?
If CBT is enabled for a virtual disk, the VMkernel creates an additional file (named ...-ctk.vmdk) in the same directory, in which it stores a map of the virtual disk's blocks. Whenever a block is changed, this is recorded in the map file. This way the VMkernel can easily tell a backup application which blocks of a file have changed since a certain point in time, and the application can then perform an incremental backup by saving only these changed blocks.
CBT is also used by Storage VMotion, which is able to move a virtual machine's disk files from one datastore to another while the VM is running.

2. How do you enable CBT?
CBT is enabled per virtual disk, and VMware's KB1031873 describes how to do this by editing a VM's advanced configuration parameters through the VI client. Unfortunately this requires the VM to be powered off. However, you can also change the setting while the VM is running by using an appropriate script like the one published here. To make the change effective you then need to perform a so-called stun/unstun cycle on the VM (i.e. power on/off, suspend/resume, or create/delete a snapshot).
It is important to know that CBT is not enabled by default, because it introduces a small overhead in virtual disk processing.
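For reference, the advanced parameters from KB1031873 boil down to these .vmx entries - one global switch plus one per virtual disk (scsi0:0 is just an example):

   ctkEnabled = "TRUE"
   scsi0:0.ctkEnabled = "TRUE"

After setting them on a running VM, remember the stun/unstun cycle mentioned above; creating and deleting a snapshot is usually the least disruptive way.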

3. How do CBT and snapshots play together?
When you create a snapshot of a VM's virtual disk, an additional ctk file is created for the snapshot's delta-disk file. Once the snapshot is deleted, the delta ctk is merged with the base ctk, just like the delta disk is merged with the base disk.

4. Important notes and references

  • KB1020128: Changed Block Tracking (CBT) on virtual machines
  • KB1031873: Enabling Changed Block Tracking (CBT) on virtual machines
  • While an application backs up a VM using CBT the VM cannot be vMotioned: KB2001004
  • Inconsistency resolved in vSphere 4.0 U3 and vSphere 4.1: KB1021607
  • KB1031106: Virtual machine freezes temporarily during snapshot removal on an NFS datastore in a ESX/ESXi 4.1 host
  • Eric Siebert on CBT: A detailed introduction
  • Additions by Duncan Epping: Even more details...

Saturday, June 18, 2011

How to hide unused FlexNICs

When I configured an HP Blade Enclosure with VirtualConnect modules for the first time, I stumbled over an issue that has probably bothered most people doing this, especially if they run ESX(i) on the blade servers:

The BL620c G7 blade servers we are using have four built-in 10Gbit ports, and each of them can be partitioned into up to four so-called FlexNICs (or FlexHBAs for FCoE if you use them together with FlexFabric VirtualConnect modules like we do). The overall 10Gbit bandwidth of a port is split among its FlexNICs in a configurable way: You could e.g. have four FlexNICs with 2.5 Gbit each, two with 6 and 4 Gbit, or any combination of one to four FlexNICs with their bandwidths adding up to 10Gbit.
To the OS (e.g. VMware ESXi) that is installed on the blade server, each FlexNIC appears as a separate PCI device. So an ESX(i) host installed on a BL620c G7 can have up to 16 NICs. Cool, eh?

However, we did not really want to use that much of this feature and divided the first two 10Gbit ports into a 4Gbit FlexHBA and a 6Gbit FlexNIC each. The third and fourth ports we even configured as single 10Gbit FlexNICs.

Now, the problem is that every 10Gbit port shows up as four PCI devices even if you have configured fewer than four FlexNICs for it. Even if you have not partitioned it at all, but use it as a single 10Gbit NIC, it will show up as four NICs, with the unconfigured ones being displayed as disconnected!
In our case we ended up with ESXi seeing (and complaining about) 10 disconnected NICs. Since we monitor the blades with HP Insight Manager it also constantly warned us about the disconnected NICs.

So, we thought about a method to get rid of the unused FlexNICs. If we had Windows running directly on the blades, this would have been easy: We would just disable the devices, and Windows (and also HP Insight Manager) would not be bothered by them. However, in ESX(i) you cannot simply disable a device ... but you can configure it for "VMDirectPath":

PCI Passthrough configuration of a BL620c G7
This dialog can be found in the Advanced Hardware Settings of a host's configuration. What does it do?
With VMDirectPath you can make a host's PCI device available to a single VM. It will be passed through to the VM, and the guest OS will then be able to see and use that device in addition to its virtual devices.
This way it is possible to present a physical device to a VM that you normally would not be able to add.

In the dialog shown above you configure which devices are available for VMDirectPath (also called PCI Passthrough). You can then add all the selected devices to the hardware of individual VMs.
We really did not want to do the latter ... but there is one desirable side effect of this configuration: A device that is configured for VMDirectPath becomes invisible to the VMkernel. And this is exactly what we wanted to achieve for the unused FlexNICs!

So we configured all unused FlexNICs for VMDirectPath, and they were no longer displayed as (disconnected) vmnics. If you want to do the same, you need to know which PCI device a vmnic corresponds to. In the screenshot I posted you will notice that for some of the PCI devices the vmnic name is displayed in brackets, but not for all. So it can be hard to figure out which devices need to be selected, but it's worth it!
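One way to map vmnics to PCI devices before ticking the checkboxes is the host's Tech Support Mode shell:

   # Lists all vmnics with their PCI addresses, drivers and link state;
   # the disconnected ones are the candidates:
   esxcfg-nics -l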

Sunday, June 5, 2011

Two things to know about VMware Converter

VMware Converter is a tool to convert physical machines (either online or via an existing backup image) to VMware virtual machines. It is available as a free stand-alone version and in a vCenter-integrated version.
Like others, we use it a lot for virtualizing existing physical servers, simply because it's free and/or comes pre-installed with vCenter. It also does a pretty good job and is well supported by VMware, but ...
you should be aware of two issues when using Converter more than occasionally.

1. Windows 2000 support
With the latest versions (stand-alone Converter 4.3 and the vCenter 4.1-integrated one) VMware dropped support for converting Windows 2000 machines (see the notes about supported guest OSs in the Release Notes). The really bad thing about this is that it does not just tell you so when you try to convert a Windows 2000 machine, but throws an error message about not being able to install the Converter agent on the target computer. It looks like it tries to install the Windows XP version of the agent, which fails.
At first this does not look like a big problem, because older versions of the Converter still support Windows 2000. If you run vSphere 4.1, you can use the stand-alone Converter 4.0.1 to convert Windows 2000 machines by connecting to the vCenter 4.1 server or directly to an ESX(i) 4.1 host. We have done this a lot, and it has always worked. However, if you look carefully at the Release Notes of Converter 4.0.1, you will notice that it only supports vSphere 4.0 as a virtualization platform, not vSphere 4.1.
We asked VMware support how we - as a vSphere 4.1 customer - are supposed to convert a Windows 2000 machine using Converter in a way that is fully supported by VMware. Here are the instructions (it's only one possible way, but you will get the idea):
a) Install an ESX(i) 4.0 host and add it to an existing vCenter 4.1 instance
b) Use the Stand-alone Converter 4.0.1 to connect to this ESX(i) 4.0 host and convert the Windows 2000 machine
c) Migrate the virtualized Windows 2000 machine to an ESX(i) 4.1 host (either cold or by VMotion)
That's a bit cumbersome, isn't it? Anyway, as stated above, you can also use the stand-alone Converter 4.0.1 to connect directly to vSphere 4.1. It is not officially supported, but seems to work quite well.

2. Disk alignment
If you care about storage performance, you want your VMFS volumes and your guest OS partitions to be aligned. There are a lot of good explanations of what disk alignment is and why it is important. My personal favorite is on Duncan Epping's blog.
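A quick way to check whether a Windows VM is misaligned, as a sketch using standard Windows tooling - the partition starting offset tells the story:

   REM 32256 (= 63 sectors * 512 bytes) is the misaligned Windows 2003 default;
   REM aligned partitions typically start at 65536 (64KB) or 1048576 (1MB).
   wmic partition get Name,StartingOffset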
Now, the big issue is that VMware Converter does not align the guest OS partitions in the target virtual machine. Although VMware itself has been pointing out the importance of disk alignment for a long time (see e.g. this ESX3 white paper), they have still - as of version 4.3 - not built this capability into their own Converter product.
So, if you are serious about disk performance and are planning for a large virtualization project you may want to consider alternatives to VMware Converter. There are other commercial products available that do proper disk alignment. One example is Quest vConverter.

Update (2011-09-01): Good news. Today VMware released Converter 5.0 which is now able to do disk alignments!

Thursday, May 26, 2011

Network troubleshooting, Part I: What physical NIC does the VM use?

If you encounter a network issue in a VM (like bad performance or packet drops), a good first question to ask yourself is: Is the issue limited to the VM, or can it be pinned to one of the host's physical NICs?
So, you need to find out which physical NIC (pNIC) the VM is actually using. In most environments this is not obvious, because the virtual switch that the VM connects to typically has multiple physical uplinks (for redundancy) that are all active (to maximize bandwidth).

Unfortunately, it is not possible to find this out using the VI client. It does not reveal this information, regardless of whether you use standard or distributed virtual switches.
You need to log in to the host that runs the VM (see the HowTos section for instructions) and run esxtop.
Press n to switch to the network view, and you will see a picture like this one:

Network view of esxtop
Find the VM's display name in the USED-BY column and look to the corresponding TEAM-PNIC column then. In this example the VM FRASINT215 uses vmnic1.

Updated be2net driver fixes issues with G7 blades

When we started to deploy our HP ProLiant BL620c G7 blade servers, we stumbled over some issues with the driver (be2net) for the built-in FlexNIC adapters that are documented in the VMware KB.
We followed the recommendation in those articles and updated the be2net driver to version 2.102.554.0. However, we still experienced hangs of the ESXi host and network outages whenever the host was rebooted or had its dvS connections reconfigured.
These hangs were accompanied by VMkernel log messages like this one:

... vmkernel: 10:06:11:06.193 cpu0:4153)WARNING: CpuSched: 939: world 4153(helper11-0) did not yield PCPU 0 for 2993 msec, refCharge=5975 msec, coreCharge=6374 msec,

After opening a support call with VMware, we finally found out that these problems were caused by improper handling of VLAN hardware offloading in the be2net driver, and that they only occur when you are using distributed virtual switches (dvS) like we did.
So, after reconfiguring the blade hosts with virtual standard switches (vSS), the problem went away.

Since then we had been waiting for a fixed be2net driver (from Emulex) to be able to return to the dvS. We really did not want to abandon this option, because it offers some benefits over the standard switch (load-based teaming of the physical uplinks and Network I/O Control).

Today, the waiting finally ended. Emulex has finished the fixed driver, and it is available here:
VMware ESX/ESXi 4.x Driver CD for Emulex OneConnect 10Gb Ethernet Controller

Update (18. Jul 2011): In the meantime, VMware has made two new KB articles available that reference the problems described here and the new driver.
In the latter one, it is also recommended to update the NIC's firmware. The current version (as of today) is available from HP as a bootable ISO file. Thanks to makö for pointing this out in this post's comments.