Troubleshooting: Problems connecting HP Insight Control Storage Module to StoreServ 7200 (3Par)

A customer of mine, who runs a pure HP environment based on c7000 and StoreServ 7200, wanted to get the HP Insight Control Storage Module for vCenter up and running. The problem was that while we were able to connect to the older MSA array they run for non-production workloads, we were unable to connect to the newer StoreServ 7200. There is full IP connectivity between the application server that the HP Insight components run on and the storage controllers/VSP (no firewalls between them, they are located in the same subnet).

The only error message we got was “unable to connect”, even though we used the same credentials and IP address that work for the 3Par Management Console. After reaching out to quite a few people, including on Twitter, we finally found the solution. It turns out that the CIM service on the array was not responding; in fact it was disabled, which naturally meant we could not connect.

A quick SSH session to the array revealed that the CIM service was disabled:
[cc lang=”bash” width=”100%” theme=”blackboard” nowrap=”1″]
login as: username
Password: *************

3par-array cli% showcim
-Service- -State-- --SLP-- SLPPort -HTTP-- HTTPPort -HTTPS- HTTPSPort PGVer CIMVer
Disabled Inactive Enabled 427 Enabled 5988 Enabled 5989 2.9.1 3.1.2
3par-array cli% stopcim
Are you sure you want to stop CIM server?
select q=quit y=yes n=no: y
CIM server stopped successfully.
3par-array cli% startcim
CIM server will start in about 90 seconds
3par-array cli%
[/cc]

Restarting it fixed the issue, and we now have StoreServ data available directly in the vSphere Web (and C#) client. This also fixed the connection problem we had with vCenter Operations Manager and the HP StoreFront Analytics adapter.

So, if you are unable to connect to your StoreServ, check the CIM service. It might just be disabled.
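
The check can also be scripted. What follows is a minimal sketch, not an HP-supplied tool: it reads `showcim` output and reports the service and state columns. The function names `cim_status` and `cim_is_healthy` are made up for this example, and it assumes the two-line layout shown in the session above and that a working array reports `Enabled Active`.

```shell
#!/bin/sh
# Print the -Service- and -State- values from `showcim` output read on stdin.
# Assumes one header line followed by one line of whitespace-separated values.
cim_status() {
    awk '
        /Service/ { header_seen = 1; next }              # skip the header row
        header_seen && NF >= 2 { print $1, $2; exit }    # first data row only
    '
}

# Succeed only when the service is Enabled and the state is Active
# (assumed to be what a healthy array reports).
cim_is_healthy() {
    [ "$(cim_status)" = "Enabled Active" ]
}

# Demo using the output from the session above.
printf '%s\n' \
    '-Service- -State-- --SLP-- SLPPort -HTTP-- HTTPPort -HTTPS- HTTPSPort PGVer CIMVer' \
    'Disabled Inactive Enabled 427 Enabled 5988 Enabled 5989 2.9.1 3.1.2' | cim_status
# prints: Disabled Inactive
```

If the array accepts CLI commands over SSH, something like `ssh username@3par-array showcim | cim_is_healthy || echo "CIM service is not running"` would turn this into a quick health check; the user and host names here are placeholders.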

Virtual Connect FlexFabric and Direct-Attach FC 3Par Caveat

When configuring a new c7000 Blade Enclosure with a couple of FlexFabric 10Gb/24-port modules, I ran into a rather annoying issue during setup.

HP Virtual Connect 3.70 introduced support for Direct-Attach setups of HP 3Par StoreServ 7000 storage systems, where you can eliminate the need for dedicated FC switches. For full details, have a look at Implementing HP Virtual Connect Direct-Attach Fibre Channel with HP 3PAR StoreServ Systems.

This is excellent for setups where all your hosts are HP Blades, and you have a Virtual Connect FlexFabric setup. After all, fewer components means less complexity, right?

The problem I ran into is a bit strange though, and it took some time to figure out what was wrong. The HP 3Par StoreServ 7200 was racked, stacked and configured when the FC SFP+ modules were installed in the FlexFabric module, and I pretty much thought it would be plug and play from there to get the blades to talk FC to the HP 3Par after going through the Virtual Connect setup.

Sadly, that was not the case. It seems there is a bug in the web GUI for VC 3.70 that prevents getting a working setup. I know 3.75 has been released, but nothing in the release notes seems to indicate that this has been fixed in that release.

For some reason, the “Fabric Type” dropdown where you should be able to select either “FabricAttach” or “DirectAttach” is greyed out, thus preventing the proper configuration of the SAN Fabric in “DirectAttach” mode. It defaults to “FabricAttach”, and in a Direct-Attach scenario that simply does not work. You will not be able to get an FC link and the SFP+ module will be listed as “unsupported”.

The solution was to create the SAN Fabric manually by using the Virtual Connect CLI. The following commands created the two fabrics required for redundancy (VC module in Bay 1 and in Bay 2):

[cc lang=”bash” width=”100%” theme=”blackboard” nowrap=”0″]
add fabric Fabric-1-3PAR Bay=1 Ports=1 Type=DirectAttach
add fabric Fabric-2-3PAR Bay=2 Ports=1 Type=DirectAttach
[/cc]

As you can see, the add fabric command made it possible to define the correct Fabric Type, and I could then proceed to add Port 2 from Bay 2 to Fabric-1-3PAR and vice versa to create a fully redundant setup.
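
If you have more than one enclosure to set up, the commands can be generated rather than typed. This is only a sketch: the helper name `vc_direct_attach_fabrics` is made up, it reuses the naming scheme and parameters from the two commands above, and the result still has to be pasted into (or piped to) a Virtual Connect CLI session.

```shell
#!/bin/sh
# Print one `add fabric` command per Virtual Connect bay.
# $1 = uplink port(s), remaining arguments = bay numbers.
vc_direct_attach_fabrics() {
    ports=$1
    shift
    n=1
    for bay in "$@"; do
        printf 'add fabric Fabric-%s-3PAR Bay=%s Ports=%s Type=DirectAttach\n' \
            "$n" "$bay" "$ports"
        n=$((n + 1))
    done
}

# Port 1 on the modules in Bay 1 and Bay 2, as in the setup described above.
vc_direct_attach_fabrics 1 1 2
# prints:
# add fabric Fabric-1-3PAR Bay=1 Ports=1 Type=DirectAttach
# add fabric Fabric-2-3PAR Bay=2 Ports=1 Type=DirectAttach
```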

Why the VC GUI prevented me from setting the correct fabric type is beyond me, but it simply did not allow me to change this rather important setting, and the setup could not be made to work without using the CLI.

Why Did the HP BL460c G8 Lose Its Datastore?

After installing a couple of brand new HP ProLiant BL460c G8 blades with HP Smart Array P220i controllers at a customer site, I decided that I should upgrade from the VMware-ESXi-5.0.0-Update1-623860-HP-5.20.43.iso Build 623860 used to install the blades, to the latest 821926 build offered by VMware.

Normally this is a really easy process using VMware Update Manager, but since this is a new installation, not all the prerequisites for that are in place yet, so I decided to use esxcli to perform the update.

After placing the downloaded ESXi500-201209001.zip offline bundle on the local datastore of the first host, I ran the update with the following command (output abbreviated for legibility):

[cc lang=”bash” width=”95%” theme=”blackboard” nowrap=”0″]
~ # esxcli software vib install -d /vmfs/volumes/datastore1/ESXi500-201209001.zip
Installation Result
Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
Reboot Required: true
VIBs Installed: […]
VIBs Removed: […]
~ #
[/cc]

So far so good! One quick reboot later (as quick as an HP blade can manage), my host was updated to ESXi 5.0.0 Build 821926 and everything looked fine. That is, until I wanted to put some new files onto the blade’s local storage. Much to my surprise, there was no local storage to be seen at all. Thinking that something must have gone wrong while booting, I decided to try another restart, and imagine my surprise when the blade booted up again, but this time with Build 623860.

Thinking that I must have messed things up pretty badly, I decided to try the same procedure on the second HP ProLiant BL460c G8 blade installed in the same manner. And yet again, I got the same results.

While this is somewhat comforting when contemplating Albert Einstein’s definition of insanity, it’s not comforting when it comes to the procedure I followed for updating the hosts.

This time around I recorded the output the esxcli command gave me, and found this little gem hidden inside it (unrelated VIBs removed):

[cc lang=”bash” width=”95%” theme=”blackboard” nowrap=”0″]
~ # esxcli software vib install -d /vmfs/volumes/esx5local/ESXi500-201209001.zip
Installation Result
Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
Reboot Required: true
VIBs Installed: […], VMware_bootbank_scsi-hpsa_5.0.0-17vmw.500.0.0.469512, […]
VIBs Removed: […], Hewlett-Packard_bootbank_scsi-hpsa_5.0.0-28OEM.500.0.0.472560, […]
~ #
[/cc]

So, in short, the update from VMware removes the HP bootbank VIB for local SCSI controllers and replaces it with a VMware one. Effectively, this removes the ESXi host’s ability to read the local datastore from the HP Smart Array P220i Controller.
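
This kind of swap is easy to miss in a long list of VIBs, so it can help to have a script point it out. The sketch below reads the `esxcli software vib install` output and lists every VIB where a non-VMware build was removed and a VMware build of the same name was installed. The function name `replaced_oem_vibs` is made up, and the parsing assumes VIB IDs of the form `<vendor>_bootbank_<name>_<version>`, as seen in the output above.

```shell
#!/bin/sh
# Read `esxcli software vib install` output on stdin and report OEM VIBs that
# were replaced by a VMware VIB with the same name.
replaced_oem_vibs() {
    awk '
        /VIBs Installed:/ { kind = "installed" }
        /VIBs Removed:/   { kind = "removed" }
        /VIBs (Installed|Removed):/ {
            sub(/^.*VIBs (Installed|Removed):[ ]*/, "")
            n = split($0, ids, ", *")
            for (i = 1; i <= n; i++) {
                # vendor, "bootbank", name, version; skip elided or malformed entries
                if (split(ids[i], part, "_") != 4) continue
                if (kind == "installed" && part[1] == "VMware") vmware[part[3]] = ids[i]
                if (kind == "removed" && part[1] != "VMware") oem[part[3]] = ids[i]
            }
        }
        END {
            for (name in oem)
                if (name in vmware)
                    print name ": " oem[name] " -> " vmware[name]
        }
    '
}

# Demo with the two relevant lines from the output above.
replaced_oem_vibs <<'EOF'
VIBs Installed: VMware_bootbank_scsi-hpsa_5.0.0-17vmw.500.0.0.469512
VIBs Removed: Hewlett-Packard_bootbank_scsi-hpsa_5.0.0-28OEM.500.0.0.472560
EOF
# prints: scsi-hpsa: Hewlett-Packard_bootbank_scsi-hpsa_5.0.0-28OEM.500.0.0.472560 -> VMware_bootbank_scsi-hpsa_5.0.0-17vmw.500.0.0.469512
```

`esxcli software vib install` also accepts a `--dry-run` flag that reports what would change without touching the system. Assuming its report uses the same layout, piping a dry run through this function before the real update should flag the problem in advance.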

What still baffles me, though, is that the first boot after applying the update boots Build 821926, but subsequent boots are on Build 623860.

In the end, to rectify the issue, I found the scsi-hpsa-5.0.0-28OEM.500.0.0.472560.x86_64.vib file on hp.com, downloaded it and placed it on the host.

I then ran the update again:
[cc lang=”bash” width=”95%” theme=”blackboard” nowrap=”0″]
~ # esxcli software vib install -d /vmfs/volumes/datastore1/ESXi500-201209001.zip
Installation Result
Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
Reboot Required: true
VIBs Installed: […]
~ #
[/cc]

After installing the update, I then installed the HP ProLiant Smart Array Controller Driver:
[cc lang=”bash” width=”95%” theme=”blackboard” nowrap=”0″]
~ # esxcli software vib install -v file:/tmp/scsi-hpsa-5.0.0-28OEM.500.0.0.472560.x86_64.vib
Installation Result
Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
Reboot Required: true
VIBs Installed: Hewlett-Packard_bootbank_scsi-hpsa_5.0.0-28OEM.500.0.0.472560
VIBs Removed: VMware_bootbank_scsi-hpsa_5.0.0-17vmw.500.0.0.469512
VIBs Skipped:
~ #
[/cc]

This reverses the removal of the Hewlett-Packard_bootbank_scsi-hpsa_5.0.0-28OEM.500.0.0.472560 VIB by the VMware Update itself.

Another (quick) reboot, and finally my host was upgraded to the correct build and the local datastore was available too.

[cc lang=”bash” width=”95%” theme=”blackboard” nowrap=”0″]
~ # cat /proc/driver/hpsa/hpsa0
hpsa0: HP Smart Array P220i Controller
Board ID: 0x3355103c
Firmware Version: 3.04
Driver Version: HP HPSA Driver (v 5.0.0-28OEM)
Driver Build: 2
IRQ: 217
Logical drives: 1
Current Q depth: 0
Current # commands on controller: 0
Max Q depth since init: 0
Max # commands on controller since init: 4
Max SG entries since init: 129
Max Commands supported: 1020
SCSI host number: 0

hpsa0/C0:B0:T0:L1 Direct-Access LOGICAL VOLUME 3.04 RAID 1(1+0)
~ #
[/cc]
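
With more than a couple of hosts, the same verification can be done without reading the whole dump. A minimal sketch, with made-up function names, that pulls the driver version out of the `/proc/driver/hpsa/hpsa0` text and checks for the `OEM` marker seen in the HP driver version above:

```shell
#!/bin/sh
# Print the driver version string from /proc/driver/hpsa/hpsa0-style text on stdin.
hpsa_driver_version() {
    sed -n 's/^Driver Version:[[:space:]]*//p'
}

# Succeed if the version string carries the OEM marker used by the HP driver.
hpsa_is_oem_driver() {
    case "$(hpsa_driver_version)" in
        *OEM*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Demo with the line from the output above.
printf 'Driver Version: HP HPSA Driver (v 5.0.0-28OEM)\n' | hpsa_driver_version
# prints: HP HPSA Driver (v 5.0.0-28OEM)
```

On the host itself this would be used as `hpsa_is_oem_driver < /proc/driver/hpsa/hpsa0 || echo "HP hpsa driver missing"`.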

Thankfully these blades had not yet been put into production, and I was free to wrestle with them as much as I wanted to make this work, without it affecting anything other than my troubleshooting genes.

I have not tested if the same scenario unfolds if the hosts are updated via VMware Update Manager, but I suspect that the results would have been the same.

Of course, the hosts could have been installed with the newer HP ESXi 5.0 U1 Sept 2012 refresh in the first place, and I probably would not have run into this issue, at least not until someone decided to apply a later patch to the host.

All in all, I guess the lesson here is that you need to be careful when updating your hosts and make sure you have a real retreat option ready if you need to quickly roll back to the previous build. The luxury of having the time and opportunity to really troubleshoot the issue might not be available to you if you are upgrading production systems.

Test your updates, every time.