Wednesday, September 28, 2011

VIB files, Offline-Bundles and ESXi-Customizer 2.5

The current version 2.0 of my ESXi-Customizer script is able to add OEM.tgz-style driver packages to ESXi 4.1 and ESXi 5.0 installation ISOs. Tgz (short for tar.gz) is the format that is used to distribute all community-developed drivers for ESXi 4.1 (and earlier versions), so these can already be added to ESXi 4.1 with my script.

However, for ESXi 5.0 you cannot use the driver packages made for ESXi 4.1: they need to be re-engineered and re-compiled, starting from the stock Linux driver code that is the basis for ESXi drivers. It looks like - so far - nobody in the user community has figured out how to compile drivers for ESXi 5.0. I hope that this will change in the near future, but for now the current version of the ESXi-Customizer script is pretty much useless for ESXi 5.0 ...

On the other hand, there are new and updated driver packages for ESXi 5.0 already available directly from VMware or 3rd party hardware vendors. These drivers are distributed in VIB format and as so-called Offline-Bundles (in zip-format). I wondered if my script could also support these "official" VMware package formats and had a closer look at the structure of these files.

(Note: To fully understand the following, it is helpful to read my Anatomy post about the structure of the ESXi 5.0 ISO first!)

What is a VIB file?
VIB stands for "VMware Installation Bundle". If you open a vib-file in a text editor you will see that it starts with the (somewhat well known) header "!<arch>". This means that the file is in the Unix ar format. Wikipedia has a detailed description of this file format, and there you can learn that VIB files use the common ar format and that you can use the Unix ar command to handle this kind of archive file.
Fortunately, 7-zip - the Swiss army knife of packaging tools - is also able to handle ar-files, and it is available for Windows. Since I already use it with ESXi-Customizer to unpack ISO-files, I was delighted to realize that it is also able to unpack a vib-file ...
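For example, listing and extracting the contents of a VIB works like this with 7-zip (the file name is just an example; on Linux you could use "ar tv" and "ar x" instead):

  7z l net-e1000.vib    (lists the archive members)
  7z x net-e1000.vib    (extracts them to the current directory)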

So, what's in the VIB file? Exactly three files:
  1. A file named "descriptor.xml": The name says it all. This file describes the contents and dependencies of the driver package. You will later find it again in the IMGDB.TGZ file. For more details see section 4 of my anatomy-post, where you will find the descriptor.xml file of the e1000-driver as an example.
  2. A file named "sig.pkcs7": VMware certified drivers need to be electronically signed, and this file contains the uuencoded PKCS7 signature.
  3. The payload file: You will find the exact name of this file in the name-attribute of the <payload> tag in the descriptor.xml file. It does not have any file extension.
By design a VIB file can contain multiple payload files. However, each VIB file I looked at contained exactly one. The payload file's type was always VGZ (short for vmtar.gz), but TGZ-format should also be possible. In fact the payload file is just the archive that makes up the actual driver package. Section 3 of my anatomy post describes that in more detail.
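To give you an idea, here is a schematic descriptor.xml (shortened and simplified, not a verbatim copy - see section 4 of the anatomy post for the real e1000 example):

  <vib version="5.0">
    <type>bootbank</type>
    <name>net-e1000</name>
    <version>8.0.3.1-2vmw.0.0.383646</version>
    <vendor>VMware</vendor>
    <acceptancelevel>certified</acceptancelevel>
    <payloads>
      <payload name="net-e100" type="vgz"/>
    </payloads>
  </vib>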

What is an Offline-Bundle?
Offline-Bundles come in ZIP-format and are just a collection of one or more VIB files. An Offline-Bundle contains additional metadata in an included archive named metadata.zip; the VIB-file(s) are stored in the archive's sub-directory vib20\.

Is every ZIP-file an Offline-Bundle?
At the VMware site you can download new and updated drivers for ESXi 5.0 from the Drivers & Tools / Driver CDs section of the vSphere 5 download page. These downloads are also in ZIP-format. However, if you look at the contents of such a downloaded file you will notice that it is not an Offline-Bundle itself, but includes another ZIP-file that actually is the Offline-Bundle!

So, please be careful... The real Offline-Bundle ZIP files have the string offline_bundle or just bundle in their names (e.g. LSI_5_34-offline_bundle-455140.zip). If you are unsure, look into the ZIP file and check whether it includes a metadata.zip file and a vib20\ sub-directory.
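With 7-zip this check is a one-liner (using the example file name from above):

  7z l LSI_5_34-offline_bundle-455140.zip

If the listing shows metadata.zip and entries under vib20\, you are looking at the real Offline-Bundle; if it shows just another ZIP file, that inner file is the bundle you want.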

ESXi-Customizer 2.5
... will be out soon, and it will support adding VIB files and Offline-Bundles to ESXi 5.0 installation media! Watch this space for the announcement.

Sunday, August 28, 2011

How ESXi-Customizer supports ESXi 5.0 - FAQ

I got a lot of feedback after posting the new ESXi-Customizer (with support for ESXi 5.0) and the "anatomy"-article explaining its technical background. It looks like I haven't been clear enough on some points and need to provide some additional information. So here is a list of frequently asked questions (FAQ). I might update it from time to time, so stay tuned.

1. Can I use existing drivers (made for ESXi 4.x) for customizing ESXi 5.0?

No, you can't. Driver binaries compiled for ESXi 4.x are not compatible with ESXi 5.0. They just won't be loaded; instead, vmkload_mod will throw the error message "Module does not provide a license tag".

2. What input does ESXi-Customizer expect for customizing ESXi 5.0?

It expects a gzip-compressed tar-file (with extension .tgz) that includes exactly three files:

  • /usr/lib/vmware/vmkmod/<driver-module> (the binary driver module)

  • /etc/vmware/driver.map.d/<driver-name>.map (maps PCI device IDs to the binary module)

  • /usr/share/hwdata/driver.pciids.d/<driver-name>.ids (maps PCI device IDs to display names)

Nothing more is needed. All other steps outlined in the "anatomy"-post will be done by ESXi-Customizer.
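As an illustration, on a Linux box you would build such a file like this (the driver name "mydriver" is made up):

  mkdir -p usr/lib/vmware/vmkmod etc/vmware/driver.map.d usr/share/hwdata/driver.pciids.d
  cp mydriver usr/lib/vmware/vmkmod/
  cp mydriver.map etc/vmware/driver.map.d/
  cp mydriver.ids usr/share/hwdata/driver.pciids.d/
  tar -czvf mydriver.tgz usr etc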

3. When and where will ESXi 5.0 compatible community drivers be available?

ESXi device drivers are derived from device drivers written for the Linux kernel. However, it is necessary to make specific changes to the source code of a stock Linux driver to turn it into an ESXi driver. An experienced Linux developer can find out what changes are necessary by studying the complete source of the existing ESXi drivers that are shipped with ESXi 5.0.

The source code of these drivers has not yet been published by VMware. However, they are obliged to do this (sooner or later), because most of the original Linux drivers are licensed under the GNU GPL, which requires that the source code of derived works also be made publicly available.
So, we need to wait for VMware to publish the open source code of its drivers (we can expect it here), and then for some knowledgeable people to compile new ESXi 5.0 compatible drivers.

I am confident that this will happen in the near future. And I expect the new drivers to become available at Dave Mishchenko's vm-help.com, the home of the ESXi Whitebox HCL.

4. Does ESXi-Customizer support creating a bootable USB-key with ESXi 5.0?

No, it does not. If the machine that you want to install ESXi on does not have a CD-ROM drive, you can help yourself by using any other machine (that has a CD-ROM drive) to install ESXi 5.0 onto a USB key drive. Once you have a bootable USB key you can use it to boot any other machine, too!
The easiest and safest method is to do the initial install in a virtual machine provided by VMware Workstation or VMware Player. Yes, ESXi 5.0 can be installed in a VMware Workstation VM - just select "ESX Server 4" as the guest OS type.

Thursday, August 25, 2011

The anatomy of the ESXi 5.0 installation CD - and how to customize it

1. Introduction

With vSphere 5 VMware introduced the Auto Deploy Server and the Image Builder that allow you to customize the ESXi installation ISO with partner-supplied driver and tools packages.
The Image Builder is a Powershell snapin that comes with the latest version of the PowerCLI package. It lets you add software packages to a pre-defined set of packages (a so-called ImageProfile) and even lets you create an installation ISO from such a baseline, making it easier than ever to customize the ESXi installation.

However, doing this is not a straight-forward task. It requires a working installation of Powershell plus the PowerCLI software, access to the offline-bundle that makes up the base installation (which is not included with the free version of ESXi!), a custom driver in VIB format, and some guidance on which Powershell-cmdlets you need to use to add the custom driver package and build an ISO from it.
Developers of custom drivers in turn need to supply their packages in VIB format, and building such a package is not trivial and costs extra effort (compared to a simple OEM.TGZ file).
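For reference, the Image Builder cmdlet sequence looks roughly like this (a sketch with made-up file and profile names - check the PowerCLI documentation for the exact parameters of each cmdlet):

  Add-EsxSoftwareDepot .\ESXi-5.0.0-XXXXXX-depot.zip
  Add-EsxSoftwareDepot .\driver-offline_bundle.zip
  New-EsxImageProfile -CloneProfile "ESXi-5.0.0-XXXXXX-standard" -Name "MyProfile"
  Add-EsxSoftwarePackage -ImageProfile "MyProfile" -SoftwarePackage "net-mydriver"
  Export-EsxImageProfile -ImageProfile "MyProfile" -ExportToIso -FilePath .\ESXi-custom.iso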

I wondered if it is still possible to customize the ESXi 5.0 install ISO with a simple OEM.TGZ file like you can do with ESXi 4.1, e.g. with my ESXi-Customizer script. And yes, it is possible - but it's very different now! I want to provide some background information here on how this works:

2. The contents of the ESXi 5.0 installation ISO

First let's have a look at the root directory of the ESXi 5.0 install ISO:

Contents of the ESXi 5.0 install CD root directory
Unlike the ESXi 4.1 ISO you can see lots of ISO9660-compatible file names here (all capitals and 8.3-format). You can guess from their names that the files with the V00 (and V01, V02, etc.) extensions are device driver archives. The original type of these files is VGZ, the short form of VMTAR.GZ. That means that they are gzip'ed vmtar-files.

vmtar is a VMware proprietary variant of tar, and you need the vmtar-tool to pack and unpack vmtar archives. It is part of ESXi 5.0 and also ESXi 4.x. Other files have the extensions TGZ and T00 (like TOOLS.T00). These files are gzip'ed standard tar files that the boot loader can also handle. Good.

Comparing with the ESXi 4.1 media you will notice that there is no ddimage.bz2 file any more. In earlier versions of ESXi this was a compressed image that was written to the installation target disk and contained the whole installed ESXi system. You could actually write this image to a USB key drive to produce a bootable ESXi system without ever booting the install CD. You cannot do this with ESXi 5.0 any more. However, customizing the install CD has become easier this way, because you no longer need to add a second copy of your oem.tgz file to this system image.

There are also files named ISOLINUX.BIN and ISOLINUX.CFG in the ISO root. That means that ESXi 5.0 still uses the isolinux boot loader to make the installation CD bootable. If you look into ISOLINUX.CFG it includes a reference to the file BOOT.CFG, and in BOOT.CFG you find references to all the VGZ and TGZ files:
Contents of the BOOT.CFG file
A second copy of the BOOT.CFG file is in the directory \EFI\BOOT. The ESXi 5.0 install ISO (and ESXi 5.0 itself) was built to boot not only on a standard x86 BIOS, but also on new (U)EFI-enabled BIOS versions. Just one thing to remember: If you change the one BOOT.CFG, be sure to make the same change to the other.
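Schematically (shortened and simplified, not a verbatim copy) BOOT.CFG looks like this:

  bootstate=0
  title=Loading ESXi installer
  kernel=/TBOOT.B00
  kernelopt=runweasel
  modules=/B.B00 --- /A.B00 --- /S.V00 --- /NET-E100.V00 --- ... --- /IMGDB.TGZ --- /TOOLS.T00

The modules entry is a single long line with all archive names separated by " --- ".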

Now let's have a closer look at a driver VGZ package.

3. What's in a driver's vgz-file?

As mentioned before you need the vmtar-tool to look into a VGZ-file. Since it is only part of ESXi itself you need to have access to an installed copy of ESXi (either 4.1 or 5.0). Luckily you are able to install ESXi 4.1 (and also 5.0!) inside a VMware Workstation 7 VM.
I did this by creating a VM of type "ESX Server 4" with typical settings except for the size of the virtual disk (2GB is enough for ESXi) and installing ESXi 5.0 in it. During installation the driver files from the CD root are uncompressed and copied to the directory /tardisks, so here is where you can find them again. After enabling the local shell (luckily still available with 5.0) I logged in and was finally able to look inside and unpack such a driver archive using the vmtar tool:
Unpacking NET-E100.V00 with the vmtar tool
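The commands were essentially these (entered in the local shell; vmtar's option syntax quoted from memory - run vmtar without arguments to see its usage):

  cd /tardisks
  vmtar -x net-e100.v00 -o /tmp/net-e100.tar
  mkdir /tmp/net-e100
  tar -xf /tmp/net-e100.tar -C /tmp/net-e100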
So there are basically three files in the archive:

1. The driver binary module (with no file name extension, e1000 in this example) that will be unpacked to the well known location /usr/lib/vmware/vmkmod.

2. A text file that maps PCI device IDs to the included driver:
Contents of /etc/vmware/driver.map.d/e1000.map
3. Another text file that maps PCI IDs to vendor and device descriptive names:
Contents of /usr/share/hwdata/driver.pciids.d/e1000.ids
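Schematic examples of these two files (not verbatim copies - the .map line follows the format known from the 4.1 simple.map, and the .ids file follows the well-known pci.ids convention):

  e1000.map:
    8086:100e 0000:0000 network e1000

  e1000.ids:
    8086  Intel Corporation
          100e  82540EM Gigabit Ethernet Controller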

It is good to know that the PCI ID mapping files are now separated by driver. In ESXi 4.1 there is a single pci.ids file and a single simple.map file for all drivers, which raised the risk of having conflicting copies of these files if you merged multiple OEM drivers into the image.

It looks easy now to add a custom driver to the install CD: Just create a tgz-file containing the three files mentioned above, copy it to the ISO root directory and add its name to the two BOOT.CFG files (see below). And yes, this will indeed work for the CD boot! The custom driver will be loaded and you will be able to install ESXi, ... but the installation routine will not copy the tgz-file to the install media, and when you boot the installed system for the first time it will behave like a regular install without the custom driver.
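The BOOT.CFG change mentioned above is just appending the new archive to the modules line (schematic):

  modules=/B.B00 --- ... --- /IMGDB.TGZ --- /TOOLS.T00 --- /OEM.TGZ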

So, there is more to it...

4. The image database IMGDB.TGZ

There is a file named IMGDB.TGZ in the root directory of the CD that is also listed in the BOOT.CFG files and has the following contents:
Unpacking the IMGDB.TGZ file
It contains files that will be unpacked to the directory /var/db/esximg. For each driver (or other software package) an XML-file is created under the vibs sub-directory. There are a lot more of these files than shown here (I truncated the output with "..."); one example is net-e1000--925314997.xml for the e1000 driver. Let's look into this file:
The contents of net-e1000--925314997.xml
The xml-file contains information about the package including possible dependencies on other packages and a list of all included files. Its file name ("net-e1000--925314997.xml") consists of the name element plus a (probably) unique number with 9 or 10 digits. The list of payloads is the list of included archive files (either of type vgz or tgz); in most cases it's just one. The name of the payload is limited to 8 characters ("net-e100" in this case) and is the name of the corresponding file in the CD's root directory. The extension of this file is expected to be ".v00" if the file is of type vgz and ".t00" if the file is of type tgz. If there are name conflicts with other packages, the number in the extension is incremented: e.g. the payload file for the e1000e driver is "net-e100.v01".

Then there is the host image profile XML file in the directory /var/db/esximg/profiles. In our example this is the file ESXi-5.0.0-381646-standard1293795055. Let's look into this one:

... ... ... (many more <vib></vib> entries omitted) ... ... ...
Contents of the host image profile XML file
Here we find a list of all vib-packages that make up the currently installed system. Please note that the vib-id of a package strictly corresponds to the element values in the associated vib xml file (see the picture above); it is composed the following way:
<vendor>_<type>_<name>_<version>
So the vib-id element of the net-e1000 driver e.g. is
VMware_bootbank_net-e1000_8.0.3.1-2vmw.0.0.383646

The payload names that are listed in the image profile file are the same as in the individual vib xml files, with the exception that here the exact file names (e.g. "net-e100.v00") are listed rather than just the file type (vgz or tgz).

Conclusion: If we want to add a custom driver to the install CD we need to do the following (in addition to the steps described in section 3): modify the contents of IMGDB.TGZ, i.e. add a vib xml file for the driver (similar to net-e1000...xml) and update the contained image profile file to include the driver as an additional <vib>-entry.
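In shell terms the IMGDB.TGZ surgery looks like this (schematic, with a made-up xml file name):

  mkdir imgdb && cd imgdb
  tar -xzf ../IMGDB.TGZ
  cp ../my-driver.xml var/db/esximg/vibs/
  vi var/db/esximg/profiles/ESXi-5.0.0-*-standard*    (add the new <vib> entry here)
  tar -czf ../IMGDB.TGZ var
  cd ..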

There is another particular XML element in both the vib files and the image profile file that we need to take care of: the <acceptancelevel>. VMware distinguishes four different acceptance levels: VMwareCertified, VMwareAccepted, PartnerSupported and CommunitySupported; in the XML files they are coded as certified, vmware, partner and community. The names are pretty self-explanatory, and one can easily guess that certified is stricter than vmware, which is stricter than partner, which in turn is stricter than community. In other words: If the host image profile is of acceptance level certified, only packages of the same acceptance level can be part of it. If it is of acceptance level vmware, only VMware certified and VMware accepted packages can be installed. If it is of acceptance level partner (and this is the default!), partner supported packages can be installed in addition to that. The least restrictive level is community, which accepts all four types of packages.
My expectation is that custom drivers for whitebox hardware will be community supported (unless they are published by a hardware vendor). However, if the driver's vib file contains the acceptance level community, the image profile's acceptance level must also be changed to community. Otherwise the installation of the package will fail.

5. Can we automate it?

Yes, we can! The latest version of ESXi-Customizer does automate all the steps described here to add custom drivers in tgz-format to an ESXi 5.0 install ISO. You only need to feed it with a tgz-file that contains the three files listed in section 3 of this post.

Please note: Packages made for earlier ESXi versions will not work with ESXi 5.0, not only because the directory structure has changed, but also because the earlier versions' driver modules won't be loaded by the new version! And - at the time of this writing - there are probably no oem.tgz-style driver packages available that are compatible with ESXi 5.0!
Hopefully, this will soon change. If you are looking for a driver of a device that does not work out-of-the-box with ESXi 5.0 check the Unofficial Whitebox HCL at vm-help.com.


Tuesday, August 23, 2011

How to throttle that disk I/O hog

We are in the middle of a large server virtualization project and are utilizing two Clariion CX-400 arrays as target storage systems. The load on these arrays is increasing as we put more and more VMs on them. This is somewhat expected, but recently we noticed an unusual and unexpected drop in performance on one of the CX-400s. The load on its storage processors went way up, and its cache was quickly and repeatedly filled up to 100%, causing so-called forced flushes: the array must briefly stop all incoming I/O while it destages the cache contents to the hard disks in order to free the cache up again. As a result overall latency went up and throughput went down, and this affected every VM on every LUN of this array!

As the root cause of this we identified a single VM that fired up to 50,000(!) write I/Os per second. It was an MS SQL Server machine that we had recently virtualized. When it was on physical hardware it used locally attached hard disks that were never able to provide this amount of I/O capacity, but now - being a VM on high-performance SAN storage - it took every I/O it could get, monopolizing the storage array's cache and bringing it to its knees.

We found that we urgently needed to throttle that disk I/O hog, or it would severely impact the whole environment's performance. There are several means to prioritize disk I/O in a vSphere environment: You can use disk shares to distribute available I/Os among VMs running on the same host. This did not help here: the host that ran the VM had no reason to throttle it, because the other VMs it was running did not require lots of I/Os at the same time. So, for the host there was no real need to fairly distribute the available resources.
Storage I/O Control (SIOC) is a rather new feature that allows for I/O prioritization at the datastore level. It utilizes the vCenter server's view on datastore performance (rather than a single host's view) and kicks in when a datastore's latency rises over a defined threshold (30ms by default). It will then adapt the I/O queue depths of all VMs that are on this datastore according to the shares you have defined for them. Nice feature, but it did not help here either, because the I/O hog had a datastore of its own and was not competing with other VMs from a SIOC perspective ...

We needed a way to throttle the VM's I/O absolutely, not relative to other VMs. Luckily there really is a way to do exactly this: It is documented in KB1038241 "Limiting disk I/O from a specific virtual machine". It describes VM advanced-configuration parameters that let you set absolute throughput caps and bandwidth caps on a VM's virtual disks. We did this and it really helped to throttle the VM and restore overall system performance!
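The parameters follow this pattern (the values are examples, and scsi0:0 stands for the virtual disk you want to cap - see the KB article for the authoritative syntax):

  sched.scsi0:0.throughputCap = "500IOps"
  sched.scsi0:0.bandwidthCap = "10MBps"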

By the way, the KB article describes how to change the VM's advanced configuration by using the vSphere client which requires that the VM is powered off. However, there is a way to do this without powering the VM off. Since this can be handy in a lot of situations I added a description of how to do this on the HowTo page.

Update (2011-08-30): In the comments of this post Didier Pironet pointed out that there are some oddities with using this feature and refers to his blog post Limiting Disk I/O From A Specific Virtual Machine. It features a nice video demonstrating the effect of disk throttling. Let me summarize his findings and add another interesting piece of information that was clarified and confirmed by VMware Support:
  • Contrary to what is stated in KB1038241, you can also specify plain IOps or Bps values (not only KIOps/MIOps/GIOps or KBps/MBps/GBps) for the caps, e.g. "500IOps". If you do not specify a unit at all, IOps or Bps is assumed, not KIOps/KBps as stated in the article.

  • The throughput cap can also be specified through the vSphere client (see VM properties / Resources / Disk), but not the bandwidth cap. This can even be done while the machine is powered on, and the change will become immediately effective.

  • And now the part that is the least intuitive: Although you specify the limits per virtual disk, the scheduler will manage and enforce the limits on a per-datastore(!) basis. That means:

    • If the VM has multiple virtual disks on the same datastore, and you want to limit one of them, then you must specify limits (of the same type, throughput or bandwidth) for all the virtual disks that are on the same datastore. If you don't do this, no limit will be enforced.

    • The scheduler will add up the limits of all virtual disks that are on the same datastore and will then limit them altogether by the sum of their limits. This explains Didier's finding that a single disk was limited to 150IOps although he had defined a limit of 100IOps for it, plus a limit of 50IOps for a second disk on the same datastore (see the parameter sketch after this list).

    • So, if you want to enforce a specific limit to only a single virtual disk then you need to put that disk on a datastore where no other disks of the VM are stored.
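Didier's scenario expressed in these parameters (both disks on the same datastore; schematic):

  sched.scsi0:0.throughputCap = "100IOps"
  sched.scsi0:1.throughputCap = "50IOps"

The scheduler enforces a single combined cap of 150IOps across both disks rather than 100IOps and 50IOps individually.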


Wednesday, August 10, 2011

[Update] ESXi-Customizer 1.2 - another bugfix release

If you used the Advanced edit mode with ESXi-Customizer 1.0 or 1.1 and got a "Corrupt boot image" message in ESXi (either when booting the customized ISO or after having installed with it) ... this was caused by a corruption of the OEM.tgz file during re-packaging.

It was very hard to find a Windows version of tar that produces tar archives which are fully compatible with ESXi. But (I hope) I have finally found one: a Windows port of busybox. Since ESXi uses busybox, too, this should guarantee maximum compatibility. If you ever wondered what a Windows port of busybox could be good for ... now you know ;-)

Please update to version 1.2, which incorporates this fix, and let me know if you are still struck by this bug! You can download it from the project page!

Saturday, August 6, 2011

vSphere 5: release date rumors and licensing changes

From what I have heard, the originally targeted release date for VMware's vSphere 5 was August 5th. That date has now passed, and the release did not happen. Rumors are now circulating that it will be released on August 22nd (see source)...
I don't know why it is being delayed. One possible reason is the change in licensing that was announced on August 3rd (see VMware's Power of Partnership Blog). With the unveiling of vSphere 5 on July 12th VMware introduced a new licensing model based on vRAM (the amount of RAM allocated to running VMs), which led to a storm of protest among customers and partners, especially because of the low amount of vRAM per physical CPU that was originally communicated. With the announcement above VMware has doubled this entitlement for most vSphere editions and also capped the accountable vRAM for a single VM at 96GB (even if it has more RAM than that).
This will definitely help to speed up the adoption of vSphere 5 ... once it is released.

Update (2011-08-23): Okay, nothing again ... So it will probably happen on Friday (August 26th), just before VMworld 2011 (starting on Monday 29th).

Update (2011-08-25): It is out now, the official release date was August 24th. Customers with subscription go here to download. The free ESXi version is available here.