VMware vSphere 6.5 PSOD: GP Exception 13

While I was migrating an old vSphere 5.5 environment to 6.5 at a customer site, several hosts suddenly crashed with a PSOD. Long story short, we got hit by VMware KB 2147958: ESXi 6.5 host fails with PSOD: GP Exception 13 in multiple VMM world at VmAnon_AllocVmmPages.

It turned out that a bunch of the VMs we were vMotioning from the old environment had the cpuid.coresPerSocket advanced setting in their .vmx files. This can cause ESXi 6.5 to panic, and in our case it certainly did.
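To find out which VMs carry the setting before moving them, something like the following rough sketch could be used. It simply searches .vmx files for the per-socket core key; the function name is made up for illustration, and the search root is an assumption (on an ESXi host the datastores normally live under /vmfs/volumes).

```shell
#!/bin/sh
# List .vmx files below a directory that contain the cpuid per-socket core setting.
# find_cores_per_socket is a hypothetical helper name, not a VMware tool.
find_cores_per_socket() {
    root="$1"
    # -i: match the key case-insensitively; -l: print only the names of matching files.
    # "cores*" tolerates both the "corePerSocket" and "coresPerSocket" spellings.
    find "$root" -type f -name '*.vmx' -exec grep -il '^cpuid\.cores*PerSocket' {} \;
}
```

Running find_cores_per_socket /vmfs/volumes on a host would then print one .vmx path per affected VM, assuming the shell user can read the datastores.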

Upgrading the hosts to 6.5a, as the knowledge base article states, resolved the issue, and we did not experience any more PSODs while migrating the 100+ VMs from the old environment to the new one.

ESXi Snapshot Problems: msg.snapshot.error-QUIESCINGERROR


Just a quick post about something I experienced at a client, with ESXi 6.0 hosts, today:

If you have trouble taking VMware snapshots and see a msg.snapshot.error-QUIESCINGERROR error, check the host's time settings and NTP configuration.

In this case, snapshots of VMs located on other hosts in the cluster were fine, but once a VM was moved to the new host, snapshot operations failed after an hour or so.

It turns out a new host in the cluster was not properly set up to use NTP, and time drift between the host and the vCenter caused the snapshot failures. Correcting the time on the host and configuring NTP resolved the issue.
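A quick way to spot this kind of drift is to compare the host's clock with a reference clock. The snippet below is only a sketch: the function names and the threshold are my own choices, and it assumes you can obtain both clocks as Unix epoch seconds (for example with date -u +%s).

```shell
#!/bin/sh
# Absolute difference, in seconds, between two Unix epoch timestamps.
drift_seconds() {
    d=$(( $1 - $2 ))
    [ "$d" -lt 0 ] && d=$(( 0 - d ))
    echo "$d"
}

# Compare two timestamps against a threshold given in seconds.
# Prints a one-line verdict and returns non-zero when the threshold is exceeded.
check_drift() {
    d=$(drift_seconds "$1" "$2")
    if [ "$d" -gt "$3" ]; then
        echo "DRIFT ${d}s exceeds ${3}s"
        return 1
    fi
    echo "OK ${d}s"
}
```

With SSH enabled on the host, a call could look like check_drift "$(ssh root@esxi-host date -u +%s)" "$(date -u +%s)" 60, where esxi-host is a placeholder name and 60 seconds is an arbitrary threshold. The SSH round trip adds a second or two of noise, so this only catches gross drift.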

Always remember: If the problem isn’t DNS, it almost certainly is NTP.

vCenter / SSO unable to retrieve AD-information | Error while extracting local SSO users

After deploying a new VCSA 6.0u1, I was seeing some weird errors while trying to retrieve AD users/groups (or anything else from the esod.local domain).

After some serious head scratching, the cause dawned on me when I checked the DNS records for the domain's DC from the vCenter Appliance itself:

dig +noall +answer +search dc1.esod.local
dc1.esod.local. 3600 IN A 10.0.1.201

So far so good, the DNS lookup works as expected.

dig +noall +answer +search -x 10.0.1.201

That’s right, the reverse lookup returns exactly zilch, zero, zippo, nil, nada and null.

The Solution

Add a reverse lookup zone to DNS and update the DC's PTR record.

Once that is done, it works as expected:

dig +noall +answer +search -x 10.0.1.201
201.1.0.10.in-addr.arpa. 3600 IN PTR dc1.esod.local.

Re-checking the domain in the vCenter Web Client shows that AD information is now retrieved correctly.

It turns out that in vCenter 6.0u1, PTR (reverse lookup) records are required for SSO and Active Directory authentication to function properly.
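Since the forward lookup alone told me nothing, it can be handy to check both directions in one go. Here is a minimal sketch built on the same dig tool used above. The function names are invented for illustration, and it assumes an IPv4 address, a name with a direct A record (no CNAME chain), and a reachable DNS server.

```shell
#!/bin/sh
# Build the in-addr.arpa name that a reverse (PTR) lookup queries for an IPv4 address.
ptr_name() {
    # Split the dotted quad on "." and print the octets in reverse order.
    echo "$1" | awk -F. '{ printf "%s.%s.%s.%s.in-addr.arpa.\n", $4, $3, $2, $1 }'
}

# Resolve a host name to an address, then check that the address maps back to a name.
check_forward_reverse() {
    host="$1"
    # +short prints only the record data; take the first answer.
    addr=$(dig +short "$host" A | head -n 1)
    [ -n "$addr" ] || { echo "no A record for $host"; return 1; }
    ptr=$(dig +short -x "$addr" | head -n 1)
    [ -n "$ptr" ] || { echo "no PTR record for $addr ($(ptr_name "$addr"))"; return 1; }
    echo "$host -> $addr -> $ptr"
}
```

Run from the vCenter Appliance as check_forward_reverse dc1.esod.local, this would have flagged the missing PTR record right away instead of leaving me scratching my head.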