Automatically Name Datastores in vSphere?

William Lam posted “Why you should rename the default VSAN Datastore name”, where he outlines why the default name for VSAN datastores should be changed. Of course, I completely agree with his views on this; leaving it at the default might cause confusion down the line.

At the end of the post, William asks the following:

I wonder if it would be a useful to have a feature in VSAN to automatically append the vSphere Cluster name to the default VSAN Datastore name? What do you think?

The answer to that is quite simple too: yes. It would be great to be able to append the cluster name automatically.

But this got me thinking: wouldn’t it be even better to use the same kind of naming pattern scheme we get when provisioning Horizon View desktops when we provision datastores? In fact, this should also be an option for other datastores, not just when using VSAN.

Imagine the possibilities if you could define datastore naming schemes in your vCenter, and add a few variables like this, for instance: {datastoretype}-{datacentername}-{clustername/hostname}-{fixed:03}.

Then you could get automatic, and perhaps even sensible, datastore naming like this:

local-hqdc-esxi001-001
iscsi-hqdc-cluster01-001
nfs-hqdc-cluster01-001
fc-hqdc-cluster01-001
vsan-hqdc-cluster01-001 
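
No such feature exists in vCenter as far as this post is concerned, so the following is only a minimal sketch of how a pattern like the one above could be expanded. The function name render_datastore_name and the rules it applies (a "/" meaning "use the first alternative that has a value", and {fixed:NN} meaning a zero-padded sequence number) are my own assumptions about how such a scheme might behave.

```python
import re


def render_datastore_name(pattern, values, sequence):
    """Expand {placeholders} in a hypothetical datastore naming pattern.

    Assumed rules (not an existing vCenter feature):
      {name}      -> looked up in `values`
      {a/b}       -> first of a, b that has a value in `values`
      {fixed:NN}  -> `sequence`, zero-padded to NN digits
    """
    def expand(match):
        token = match.group(1)
        if token.startswith("fixed:"):
            width = int(token.split(":", 1)[1])
            return str(sequence).zfill(width)
        for key in token.split("/"):
            if values.get(key):
                return values[key]
        raise KeyError(token)

    return re.sub(r"\{([^{}]+)\}", expand, pattern)


PATTERN = "{datastoretype}-{datacentername}-{clustername/hostname}-{fixed:03}"

# A cluster-wide VSAN datastore and a host-local datastore.
vsan_name = render_datastore_name(
    PATTERN,
    {"datastoretype": "vsan", "datacentername": "hqdc", "clustername": "cluster01"},
    1,
)
local_name = render_datastore_name(
    PATTERN,
    {"datastoretype": "local", "datacentername": "hqdc", "hostname": "esxi001"},
    1,
)
print(vsan_name)
print(local_name)
```

With these inputs the sketch produces the vsan and local names from the list above.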

And so on… I’m sure there are other, potentially even more useful, variables that could be used here, perhaps even incorporating something about tiering and SLAs (platinum/gold/silver etc.), but that would require that you knew the storage characteristics and how they map to your naming scheme when it gets defined. But yes, we do need to be able to automatically name our datastores in a coherent manner, regardless of storage type.

After all, we’re moving to a model of policy-based computing. Shouldn’t the naming of objects like datastores also be ruled by policy, defined at a Datacenter level in vCenter?
(Wait a minute, why not do the same for hosts joined to a datacenter or cluster?)

 

Fun and Games with VDR and Snapshots

One of the smaller improvements in vSphere 5 was the introduction of the “Virtual machine disks consolidation is needed” configuration alert if vSphere determines that there are orphaned snapshots for a given VM.

Consolidation Needed Warning

Previous versions do not show this warning message, and datastore usage could potentially skyrocket for no apparent reason if something continues to create snapshots that are not properly cleaned up when they are no longer in use. Unless there is active space monitoring for your datastores, and there should be, it could go unnoticed for some time.

Running a snapshot consolidation attempts to clean up this situation and remove the orphaned snapshots, reclaiming the space they occupy.

This video from VMware shows how this is done, and how it should work:

For more info, have a look at KB2003638 Consolidating snapshots in vSphere 5.x

While it is great that you get alerted when this happens, and that there is an option to clean it up directly in the vSphere Client, what do you do if consolidation doesn't work for some reason?

I recently visited a client who had problems with their VDR appliance, and every attempted backup left orphaned snapshots behind. By default, VDR has a retry interval of 30 minutes for failed backups (BackupRetryInterval=30 in datarecovery.ini) before it times out. In the space of 30 minutes, VDR made 30 backup attempts, effectively creating 30 orphaned snapshots each time a backup was attempted. One of the affected VMs had over 300 delta files accumulated over a fairly small timeframe.

There clearly were a lot of snapshots in the datastore, but for some reason the vSphere Client Snapshot Manager did not show any snapshots for the VM. Clearly there was an inconsistency here, and after investigating the VM.vmsd file it became fairly apparent what was going on. The VM.vmsd file is responsible for keeping track of the snapshot delta-vmdk files and is the source that the Snapshot Manager uses to display and manage them.

The case here was that when a snapshot is removed, the snapshot entity in the VM.vmsd is removed before the changes are made to the child disks. When VDR subsequently had problems removing the child disks, they were being left behind.

Combine this with the fact that the maximum number of redo logs supported is 255, and you can quickly run into a situation where there are snapshots left behind that you can't easily get rid of with the consolidate command.
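
To get a feel for how close a VM is to that limit, something like the following sketch could be run against a listing of the VM's folder on the datastore. It assumes the delta files follow the usual <disk>-NNNNNN-delta.vmdk naming; the function name, the sample listing and the VM name web01 are made up for illustration.

```python
import re
from collections import Counter

# Assumed naming convention for snapshot delta files: <disk>-NNNNNN-delta.vmdk
DELTA_RE = re.compile(r"^(?P<disk>.+)-(?P<seq>\d{6})-delta\.vmdk$")
MAX_REDO_LOGS = 255  # limit mentioned in the text above


def count_delta_files(filenames):
    """Count snapshot delta files per base disk in a folder listing."""
    counts = Counter()
    for name in filenames:
        match = DELTA_RE.match(name)
        if match:
            counts[match.group("disk")] += 1
    return counts


# Hypothetical folder listing for a VM called web01.
listing = [
    "web01.vmx",
    "web01.vmsd",
    "web01.vmdk",
    "web01-flat.vmdk",
    "web01-000001.vmdk",
    "web01-000001-delta.vmdk",
    "web01-000002.vmdk",
    "web01-000002-delta.vmdk",
]

for disk, count in count_delta_files(listing).items():
    status = "over the limit" if count > MAX_REDO_LOGS else "ok"
    print(f"{disk}: {count} delta files ({status})")
```

Comparing that count with what the Snapshot Manager shows is a quick way to spot the kind of inconsistency described here.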

Cloning the VM was not really an option: since a clone operation consolidates snapshots as part of the cloning process, it would fail with the same error.

In the end, the solution was to fire up VMware vCenter Converter and use that to perform a V2V “conversion” of the VMs. Why does this work when a “native” vSphere clone operation does not? The answer is surprisingly simple: vCenter Converter does not know the virtual disk structure at all. It sees only what the operating system sees, and the OS inside the VM has no concept or knowledge of the snapshots created in vCenter.

While this fixed the immediate issue of getting rid of the orphaned snapshots and reclaiming the space wasted by them, it does not fix the underlying root issue that causes VDR not to clean up after itself.

For some reason it looks like VDR, after mounting the snapshot files to the appliance, does not release them, thus retaining a lock on the snapshot files. This in turn means that the VM.vmsd file is cleared of VDR snapshots, but the files are still present on the datastore.

VMFS-5: Block Size Me

The upcoming release of VMware vSphere 5 comes with an upgraded version of the VMware vStorage VMFS volume file system. One of the problems with VMFS-3 and earlier is that the block size you define when you format the datastore determines the maximum size of the VMDK files stored on it. This means that when planning your datastore infrastructure you must have an idea of how large your VMDK files will potentially be during the lifecycle of the datastore.

For VMFS-2 and VMFS-3, the block sizes and their impact on VMDK files look as follows:

Block Size   Largest virtual disk on VMFS-2   Largest virtual disk on VMFS-3
1MB          456GB                            256GB
2MB          912GB                            512GB
4MB          1.78TB                           1TB
8MB          2TB                              2TB minus 512B
16MB         2TB                              Invalid block size
32MB         2TB                              Invalid block size
64MB         2TB                              Invalid block size

In other words, if you format your datastore with a 1MB block size, with VMFS-3, you are limited to a maximum VMDK file size of 256GB. Of course, you can work around this by adding more VMDK files to your VM, and then extending the disks inside the installed OS in the VM, but over time that might get a bit messy. The only way to change the block size, is to migrate all the VMs stored on that particular datastore to a different one, and reformat your original datastore with a new block size. For environments with limited storage space, this can be a real headache.
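
As a small planning aid, the VMFS-3 column of the table can be turned into a lookup. This is just a sketch using the figures from the table above; the function name is my own.

```python
GB = 1024 ** 3

# VMFS-3 block size (MB) -> largest VMDK in bytes, taken from the table above.
VMFS3_MAX_VMDK_BYTES = {
    1: 256 * GB,
    2: 512 * GB,
    4: 1024 * GB,
    8: 2048 * GB - 512,  # 2TB minus 512B
}


def min_vmfs3_block_size_mb(vmdk_bytes):
    """Return the smallest VMFS-3 block size (in MB) that can hold a VMDK
    of the given size, or None if it does not fit on VMFS-3 at all."""
    for block_mb in sorted(VMFS3_MAX_VMDK_BYTES):
        if vmdk_bytes <= VMFS3_MAX_VMDK_BYTES[block_mb]:
            return block_mb
    return None


# A 300GB VMDK does not fit with a 1MB block size, so 2MB is the minimum.
print(min_vmfs3_block_size_mb(300 * GB))
```

If you expect a disk to grow during the lifetime of the datastore, feed the function the largest size you expect rather than the current size.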

Thankfully, VMware has made this a thing of the past in VMware vSphere 5 and VMFS-5. VMFS-5 has a new unified block size of 1MB, which no longer limits you to 256GB VMDK files. In fact, the block size no longer really matters, as the block-size-dependent limits are removed completely.

The table for vSphere 5 and VMFS-5 looks a bit simpler:

Block Size   Largest virtual disk on VMFS-5
1MB          2TB minus 512B

Upgrading from VMFS-3 to VMFS-5 is an online and non-disruptive operation, meaning you can do it while your VMs are running on the datastore.

Thankfully, once a VMFS-3 datastore has been upgraded to VMFS-5, you can also extend the VMDK files on it up to the new limits.

Note: Remember to update all your hosts to vSphere 5 before upgrading your datastores, since vSphere 4 (and earlier) can’t read the new VMFS-5 filesystem.
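
A pre-upgrade check along these lines could look like the sketch below. It assumes you already have a mapping of host names to version strings from your inventory; the function name and the sample hosts are hypothetical.

```python
def hosts_blocking_vmfs5_upgrade(host_versions):
    """Return the hosts (sorted by name) whose major version is below 5."""
    blocking = []
    for host, version in host_versions.items():
        major = int(version.split(".")[0])
        if major < 5:
            blocking.append(host)
    return sorted(blocking)


# Hypothetical inventory: host name -> ESX/ESXi version string.
inventory = {
    "esxi001": "5.0.0",
    "esxi002": "4.1.0",
    "esxi003": "5.0.0",
}

blocking = hosts_blocking_vmfs5_upgrade(inventory)
if blocking:
    print("Upgrade these hosts first:", ", ".join(blocking))
else:
    print("All hosts can read VMFS-5.")
```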

This is great news for vAdmins, since we no longer have to worry about the block size as a limiting factor for our VMs. Simplification is always welcome!

Of course, there are other improvements in VMFS-5 as well, but we’ll save those for a later post or two.