
Thursday, October 27, 2011

Update: ESXi 5.0 on HP G7 blades, now a Go!

About three weeks ago I reported on Emulex firmware problems that prevented the use of ESXi 5.0 on HP G7 blade hardware. This has now been fixed, sort of ...

HP has now updated the advisory that describes the issue and published a firmware update that fixes the VLAN handling problems with ESXi 5.0 when it is used together with the be2net driver 4.0.355.1.

Be sure to read the firmware's release notes! It looks like an emergency/workaround release that leaves many issues unresolved. A firmware version that you can really trust in production will probably be available in mid-November.

Update (2011-12-09): HP and Emulex published the final version of the OneConnect firmware (4.0.360.15a) on November 19th. VMware's KB2007397 also lists the recommended drivers to use with this firmware for both ESXi 4.1 and 5.0.
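If you want to check which be2net driver package is actually installed on an ESXi 5.0 host before comparing it against the KB, something like this from the ESXi shell should do (just a sketch, the exact output will vary):

# List the installed driver packages and filter for the Emulex NIC driver
esxcli software vib list | grep -i be2net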

Update (2012-03-09): HP has published yet another firmware update on March 5th. Download version 4.0.360.15b. The previous link has become invalid.

Update (2012-04-16): Please refer to my HP & VMware links page to find the download for the latest version of the firmware.

Saturday, October 1, 2011

Currently a No-Go: ESXi 5.0 on HP G7 blades

Back in May I reported on problems with ESXi 4.1 and the Emulex OneConnect CNA that is built into HP's G7 blade servers.
If you now try to install ESXi 5.0 on such hardware you will have a strong déjà vu: the be2net driver that is currently available for ESXi 5.0 does not really work due to "VLAN tagging issues". HP has published an advisory on this, stating that an updated driver (which should fix these issues) is "currently in the certification process" and will be made available in "Q4 2011".

Okay, I won't update our production hosts to ESXi 5.0 that soon anyway, but I just wanted to install it on some spare blades for testing and evaluation. Too bad ... waiting for a fix again ...

Update (2011-10-27):
HP has now updated the advisory and published a firmware update that fixes the VLAN handling problems with ESXi 5.0 when it is used together with the be2net driver 4.0.355.1.
Be sure to read the firmware's release notes! It looks like an emergency/workaround release that leaves many issues unresolved. A firmware version that you can really trust in production will probably be available in mid-November.

Update (2012-04-16):
In the meantime it looks like all problems have been fixed with newer firmware and driver versions. Please refer to this newer post of mine!

Saturday, June 18, 2011

How to hide unused FlexNICs

When I configured an HP Blade Enclosure with VirtualConnect modules for the first time I stumbled over an issue that has probably bothered most people doing this, especially if they run ESX(i) on the blade servers:

The BL620c G7 blade servers we are using have four built-in 10 Gbit ports, and each of them can be partitioned into up to four so-called FlexNICs (or FlexHBAs for FCoE if you use them together with FlexFabric VirtualConnect modules like we do). The overall 10 Gbit bandwidth of a port is split among its FlexNICs in a configurable way: you could e.g. have four FlexNICs with 2.5 Gbit each, two with 6 and 4 Gbit, or any combination of one to four FlexNICs whose bandwidths add up to 10 Gbit.
To the OS (e.g. VMware ESXi) that is installed on the blade server each FlexNIC appears as a separate PCI device, so an ESX(i) host installed on a BL620c G7 can have up to 16 NICs. Cool, eh?
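You can see this enumeration from the ESXi shell as well; something like the following should show it (treat this as a sketch, the output format differs between versions):

# One line per FlexNIC: vmnic name, PCI address, driver, link state and speed
esxcfg-nics -l

# The PCI devices themselves; on ESXi the associated vmnic name should appear in brackets
lspci | grep -i emulex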

However, we did not really want to use too much of that feature and divided each of the first two 10 Gbit ports into a 4 Gbit FlexHBA and a 6 Gbit FlexNIC. The third and fourth ports we even configured as single 10 Gbit FlexNICs.

Now, the problem is that every 10 Gbit port will show up as four PCI devices even if you have configured fewer than four FlexNICs for it. Even if you have not partitioned it at all, but use it as a single 10 Gbit NIC, it will show up as four NICs, with the unconfigured ones being displayed as disconnected!
In our case we ended up with ESXi seeing (and complaining about) 10 disconnected NICs. Since we monitor the blades with HP Insight Manager, it also constantly warned us about them.

So, we thought about a method to get rid of the unused FlexNICs. If we had Windows running directly on the blades this would have been easy: We would just disable the devices and Windows (and also HP Insight Manager) would not be bothered by them. However, in ESX(i) you cannot just disable a device ... but you can configure it for "VMDirectPath":

PCI Passthrough configuration of a BL620c G7
This dialog can be found in the Advanced Hardware Settings of a host's configuration. What does it do?
With VMDirectPath you can make a host's PCI device available to a single VM. It will be passed through to the VM, and the guest OS will then be able to see and use that device in addition to its virtual devices.
This way it is possible to present a physical device to a VM that you normally would not be able to add.

In the dialog shown above you configure which devices are available for VMDirectPath (also called PCI Passthrough). You can then add all the selected devices to the hardware of individual VMs.
We really did not want to do the latter ... but there is one desirable side effect of this configuration: a device that is configured for VMDirectPath becomes invisible to the VMkernel. And this is exactly what we wanted to achieve for the unused FlexNICs!

So we configured all unused FlexNICs for VMDirectPath, and they were no longer displayed as (disconnected) vmnics. If you want to do the same you need to know which PCI device a vmnic corresponds to. In the screenshot I posted you will notice that the vmnic name is displayed in brackets for some of the PCI devices, but not for all of them. So it can be hard to figure out which devices need to be selected, but it's worth it!
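One way to map a vmnic to its PCI address from the command line is ethtool; the bus-info value should match the PCI address shown in the passthrough dialog (the vmnic number below is just an example):

# The bus-info field is the PCI address of this particular vmnic
ethtool -i vmnic6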

Thursday, May 26, 2011

Updated be2net driver fixes issues with G7 blades

When we started to deploy our HP ProLiant BL620c G7 blade servers we stumbled over some issues with the driver (be2net) for the built-in FlexNIC adapters. They are documented in the VMware KB:
We followed the recommendations in these articles and updated the be2net driver to version 2.102.554.0. However, we still experienced hangs of the ESXi host and network outages whenever the host was rebooted or had its dvS connections reconfigured.
These hangs were accompanied by VMkernel.log messages like this one:

... vmkernel: 10:06:11:06.193 cpu0:4153)WARNING: CpuSched: 939: world 4153(helper11-0) did not yield PCPU 0 for 2993 msec, refCharge=5975 msec, coreCharge=6374 msec,

After opening a support call with VMware we finally found out that these problems were caused by improper handling of VLAN hardware offloading in the be2net driver, and that they only occur when you are using distributed virtual switches (dvS), as we did.
So, after reconfiguring the blade hosts with virtual standard switches (vSS), the problem went away.
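For reference, this is roughly how the fallback to a standard vSwitch can be scripted from the host's (Tech Support Mode) shell; the vSwitch, port group, VLAN and vmnic names below are just placeholders, not our actual setup:

# Create a standard vSwitch and link the Flex-10 uplinks to it (names are examples)
esxcfg-vswitch -a vSwitch1
esxcfg-vswitch -L vmnic2 vSwitch1
esxcfg-vswitch -L vmnic3 vSwitch1

# Recreate a VM port group and tag it with its VLAN ID (placeholder values)
esxcfg-vswitch -A "VM Network 100" vSwitch1
esxcfg-vswitch -v 100 -p "VM Network 100" vSwitch1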

Since then we had been waiting for a fixed be2net driver (from Emulex) to be able to return to the dvS. We really did not want to abandon this option, because it offers some benefits over the standard switch (load-based teaming of the physical uplinks and Network I/O Control).

Today, the waiting finally ended. Emulex has finished the fixed driver; it is available here:
VMware ESX/ESXi 4.x Driver CD for Emulex OneConnect 10Gb Ethernet Controller
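In case it helps: on ESXi 4.1 the offline bundle contained on that driver CD can be installed with the vSphere CLI, roughly like this (host name and bundle file name are just placeholders, and the host should be in maintenance mode first):

# Install the driver offline bundle on the host, then reboot it (placeholder names)
vihostupdate.pl --server esxhost01 --username root --install --bundle offline-bundle.zip

# On classic ESX 4.x the same bundle can be installed locally in the service console:
# esxupdate --bundle offline-bundle.zip update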

Update (2011-07-18): In the meantime VMware has made two new KB articles available that reference the problems described here and the new driver:
The latter one also recommends updating the NIC's firmware. The current firmware (as of today) is available from HP as a bootable ISO file. Thanks to makö for pointing this out in this post's comments.
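After booting the firmware ISO and installing the new driver, the result can be verified directly on the host (the vmnic number is just an example):

# The driver, version and firmware-version fields show what the host is actually running now
ethtool -i vmnic0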