
Fixing “Could not create vFlash cache” error when powering on a VM

During some way overdue housekeeping in my HP Microserver-based “Homelab” today, I ran into a rather annoying issue that prevented me from starting one of my more important VMs; namely my home Domain Controller.

In short, I removed an old SSD drive that I’ve used for vFlash Read Cache (vFRC) testing and installed a new 1TB drive instead. Since I have a rather beefy work lab now, I need space more than speed at home, so this seemed like a good idea at the time.

A good idea that is, until I tried starting my DC VM, which also is my DNS and DHCP server, and got greeted with this little gem:

[Screenshot: “Could not create vFlash cache” error dialog shown when powering on the VM]

The “Could not create vFlash cache: msg.vflashcache.error.VFC_FAILURE” error is understandable, since the SSD drive was removed, but I honestly did not think that its absence would prevent a power-on operation on a VM configured to use it. I would have expected a warning that the cache location was missing and that acceleration was not happening, not a flat-out “cannot power on”.

Normally the fix for this would be quick and easy: edit the VM and remove the vFRC configuration. But since my host has a whopping total of 8GB of memory, I don’t have vCenter running at all times. Editing the VM settings through the vCenter Legacy Client (C#) does not work either, since vFRC requires Virtual Hardware Version 10, which the legacy client cannot edit.

Once I got the vCenter Server Appliance (vCSA) fired up, I realised that I had somehow forgotten both the admin AND root passwords, and was completely unable to log on. How that happened is beyond me, but for the life of me I was not able to get in.

The next step was to try and edit the VM from VMware Workstation 10, installed on one of the Windows boxes on my network. Sadly, Workstation has no concept of vFRC, so it is not possible to edit that particular VM setting, even if you connect it to the vSphere host. I later realized that VMware Workstation 10 is also unable to connect host USB peripherals, like printers, to a VM, but that’s beside the point right now.

So, short of trying to hack the VM Hardware Version down to a value that the vSphere Client can handle, or deploying a new vCSA instance, I was left with editing the VM’s .vmx file directly.

Thankfully this turned out to be an easy way to fix the problem and get the VM powered on. For each vmdk file that is configured to use vFRC, there is a corresponding entry in the .vmx file that controls vFRC.

In order to turn off vFRC acceleration for a given disk, download the .vmx file and change the value for sched.<device>.vFlash.enabled from

sched.scsi0:2.vFlash.enabled = "TRUE"

to

sched.scsi0:2.vFlash.enabled = "FALSE"

Re-upload the .vmx file and try to power on. In my case, this fixed the power-on problem.
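
If you have several disks accelerated by vFRC, flipping those entries by hand gets tedious. A small Python sketch along these lines could be run against the downloaded .vmx file before re-uploading it; it simply assumes the entries follow the sched.<device>.vFlash.enabled pattern shown above, the script name and in-place rewrite are my own illustration, and it keeps a .bak copy of the original just in case.

# flip_vflash.py - minimal sketch: set every sched.*.vFlash.enabled entry
# in a downloaded .vmx file to FALSE. Assumes the file is small enough to
# read into memory; a .bak copy is written before the file is rewritten.
import re
import shutil
import sys

def disable_vflash(vmx_path):
    shutil.copyfile(vmx_path, vmx_path + ".bak")   # keep a backup copy
    with open(vmx_path, "r") as f:
        lines = f.readlines()
    # Matches e.g. sched.scsi0:2.vFlash.enabled = "TRUE"
    pattern = re.compile(r'^(sched\..*\.vFlash\.enabled\s*=\s*)"TRUE"', re.IGNORECASE)
    with open(vmx_path, "w") as f:
        for line in lines:
            f.write(pattern.sub(r'\1"FALSE"', line))

if __name__ == "__main__":
    disable_vflash(sys.argv[1])   # e.g. python flip_vflash.py dc01.vmx

Run it against the local copy of the .vmx file, check the result, then re-upload it to the datastore as described above.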

This problem again highlights one of the issues with the dependency on the new vSphere Web Client. In a real production environment, vCenter would be up and running at all times, and editing the VM would have been a small task; but what if you had used vFRC to speed up the vCSA itself, and you had a failed SSD drive?

Of course, in this case the fault is mostly mine. I removed a “production SSD” without first removing the vFRC configuration, I did not have a working vCSA with the Web Client up and running when this was done, and I had forgotten my passwords. A pretty specific error-generating procedure, if there ever was one.

It’s an easy fix to edit the .vmx file, but it does at times feel a little bit like the new vSphere Web Client lets you cut off the branch you are sitting on. In simpler days, you could pretty much fix anything by connecting the vSphere Client directly to the host, but now the dependency on the vCenter Server is stronger than ever.

Before upgrading your VMs to Hardware Version 10, make sure you understand the implications of going all-in with dependencies on the Web Client and vCenter. It might just come back and bite you if you haven’t thought your design through.


8 thoughts on “Fixing “Could not create vFlash cache” error when powering on a VM”

  1. Chris, vFRC resource allocations are set and defined as reserved resources. As you know, the reservation of resources has to be satisfied in order for VM power-on operations to work. If the SSDs fail, running VMs won’t suffer any outages, but they lose the performance value. VMs that are not powered on and are configured to use vFRC will not be able to power on, as the reservation of one of the resources (vFRC) can’t be satisfied. The issue you experienced is the expected behavior. Good workaround article.

  2. Thanks for your comment, Rawlinson! I do see where the power-on limitation comes from, but I would expect that I would be able to power it on again without having to resort to vmx file editing. A simple “reservations could not be met, want to power on regardless” popup would have sorted that part, but it would have to be supported in the vSphere Client, something that is currently not possible. I suspect it never will be, since the vSphere Client is currently being phased out.

    I do appreciate that this is an edge case, as I explained in the post, a good few things have to go wrong at the same time for this to occur. I’m glad I found a workaround, besides deploying a new vCSA (something I will have to do anyway).

    The main problem here is the inability to edit a VM that runs on vHW10 without the Web Client. If there were a built-in, minimal web client in ESXi itself, this would not have been a problem. Right now, that is not the case, and you can cut off the branch you are sitting on without realizing it before it’s too late.

  3. I haven’t tried vFRC or VSAN, but honestly the more I read around, the more they seem to me overly complicated to configure and manage.
    Other competitors have dynamic management of the cache, so it can be allocated on the fly to several VMs, and you can run caching as “set and forget”.
    Fixed allocation is really ugly imho, and I see two major problems: how can I estimate how much cache would be good for my VM, and all those manual configurations can become impossible to manage as the number of VMs increases…

    1. Comparing vFRC and VSAN doesn’t really make sense in this case, even though vFRC is in a way part of VSAN (+ write caching). I agree on the fixed caching part of vFRC though, a dynamic allocation solution would make it easier to manage, and would even be a work-around for issues like this when the cache disappears for some reason.

      Of course, comparative and competitive solutions come with a price tag attached. If you have the right licensing, vFRC is a “free” solution available when you upgrade your vSphere infrastructure. VSAN is another matter; we simply do not know what its pricing model will be when it’s released.

  4. With dynamic allocation the loss of the cache would be solved immediately by allocating 0 amount.
    But more than the loss of the cache, I’m more doubtful about the initial allocation: I’m almost sure I’m not able to find out what the right amount of cache is. And even if I am, I don’t want to waste time setting it on every VM and coming back regularly to check it, because what is ok today can be a wrong value tomorrow…

    Hopefully this is really a v1.0 product, so there is for sure room for improvement…

    1. Exactly, no cache found – no cache allocated. Also, with vFRC the amount you set is fine when you create the VM, but what happens after you have deployed 20 new VMs to the same host/cluster? Does the initial value still make sense?
