Wednesday, 17 December 2014

Solaris Live Upgrade, ZFS and Zones

I've been working on this problem for a few days and have only just solved it, so I thought it might be worth sharing...

Solaris is a very powerful operating system with some great features. Zones brought Docker-like containers to Solaris back in 2005, ZFS is one of the most advanced filesystems currently available, and the Live Upgrade capability is highly underrated: it is a great way to patch a server while ensuring you have a back-out plan.

All good stuff, but when you put Live Upgrade into a mix of Zones and ZFS, things get a bit flaky.

The issue I was having was that when I ran the "lupc -S" (Live Upgrade Preflight Check) script on my zone, I'd get the following message:

# lupc -S
This system has Patch level/IDR  of 121430-92.
Please check MOS (My Oracle Support) to verify that the latest Live Upgrade patch is installed -- lupc does not verify patch versions.

Zonepath of zone is the mountpoint of top level dataset.
This configuration is unsupported

Oracle has a document on My Oracle Support: "List of currently unsupported Live Upgrade (LU) configurations (Doc ID 1396382.1)" which lists a lot of ways in which Live Upgrade won't work(!). On checking this document for the top level dataset issue, it gives the following text:

If ZFS root pool resides on one pool (say rpool) with zone residing on toplevel dataset of a different pool (say newpool) mounted on /newpool i.e. zonepath=/newpool, the lucreate would fail.

Okay, except that's not what I've got. My zone, s1, has a zonepath set to /zones/s1. The zpool is called "zones" and "s1" is a separate ZFS filesystem in this pool.

What the system is actually complaining about is that the zpool is called "zones" and is mounted as "/zones". The workaround is to set the ZFS mountpoint to be something different from the pool name.
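
As a rough illustration, the layout described above can be spotted from the output of "zfs list -H -o name,mountpoint". The sketch below is only one reading of the problem (the zonepath's parent directory is the mountpoint of a pool's top-level dataset). It is not what lupc does internally, and the function name is made up for this example:

```shell
#!/bin/sh
# Hypothetical helper, not part of Live Upgrade.
# Reads "name<TAB>mountpoint" lines on stdin (the format printed by
# `zfs list -H -o name,mountpoint`) and succeeds if the parent directory
# of the given zonepath is the mountpoint of a pool's top-level dataset.
zonepath_parent_is_toplevel() {
    parent=$(dirname "$1")
    awk -F '\t' -v parent="$parent" '
        # A top-level dataset is the pool itself: its name contains no "/".
        $1 !~ /\// && $2 == parent { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example use on a host with ZFS (assumes a shell and awk that accept
# $(...) and -v, which the old Solaris /bin/sh and /usr/bin/awk may not):
#   zfs list -H -o name,mountpoint | zonepath_parent_is_toplevel /zones/s1 \
#       && echo "zonepath sits directly under a top-level dataset"
```

With a pool "zones" mounted at /zones, the check succeeds for /zones/s1 and fails for a zonepath under a child dataset that is mounted somewhere else.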

For example, I created a new ZFS filesystem under zones called "zoneroot":

# zfs create zones/zoneroot

Then (and this is the important bit), I set the mountpoint to something else:

# zfs set mountpoint=/zoneroot zones/zoneroot

Running zfs list for this dataset shows:

zones/zoneroot                  1.80G   122G    32K  /zoneroot

Now, I can create a zone, let's call it "s2":

# zonecfg -z s2
s2: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:s2> create
zonecfg:s2> set zonepath=/zoneroot/s2

zonecfg:s2> verify
zonecfg:s2> commit
zonecfg:s2> exit

On installing this zone, a new ZFS file system is created, /zoneroot/s2.
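
The same zone can also be configured non-interactively by feeding zonecfg a command file. The following is a minimal sketch under that assumption; the helper name and the /tmp path are invented for illustration, and the commands it prints are the ones typed in the session above:

```shell
#!/bin/sh
# Hypothetical helper: print a zonecfg command file for a zone whose
# zonepath lives under a given parent directory (e.g. /zoneroot).
zone_cfg_commands() {
    zone=$1
    parent=$2
    printf 'create\n'
    printf 'set zonepath=%s/%s\n' "$parent" "$zone"
    printf 'verify\n'
    printf 'commit\n'
}

# Example (as root on the Solaris host; the file path is arbitrary):
#   zone_cfg_commands s2 /zoneroot > /tmp/s2.cfg
#   zonecfg -z s2 -f /tmp/s2.cfg
#   zoneadm -z s2 install
```

Keeping the parent directory as a parameter makes it harder to accidentally place a new zone directly under the pool's own mountpoint again.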

Now, when running the "lupc -S" command, Live Upgrade doesn't complain about unsupported configurations!

Saturday, 15 February 2014

N36L and N54L in the same vSphere cluster

In the home lab I have a two-node cluster built around the HP Microserver N36L. Recently I took delivery of an N54L which I wanted to add to the cluster.

Although it was straightforward to add the new host, I was unable to vMotion VMs between the two types of CPU because of a CPU incompatibility.

Okay, so the CPUs are different between the two servers, but they are both AMD, so I figured I could make it work by enabling Enhanced vMotion Compatibility (EVC). Unfortunately, this complained that the CPUs were either missing features, or that the VMs were using these CPU features.

As it turned out, the problem was due to the virtual machines in the cluster.

To work around the error, I powered off each VM in the cluster, moved it to a host outside of the cluster, edited its settings and selected Options > CPUID Mask > Advanced > Reset all to Default.

This cleared a bunch of flags that had been set at some point (not by me; must have been an automatic change). Once that was done, I was able to configure the cluster to use EVC for AMD "Generation 2" processors.
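
To see which VMs carry such flags before resetting them, one option is to search each VM's .vmx file. This is only a sketch: it assumes the masks are stored as per-register keys of the form cpuid.<leaf>.<register> in the VMX file, and the function name is made up:

```shell
#!/bin/sh
# Hypothetical helper: read a .vmx file on stdin and print any
# per-register CPUID mask lines. Assumed key form:
#   cpuid.<hex leaf>.<eax|ebx|ecx|edx>[.amd] = "..."
# Other cpuid.* settings (for example cpuid.coresPerSocket) are skipped.
list_cpuid_masks() {
    # grep exits non-zero when nothing matches; treat that as "no masks".
    grep -E -i '^cpuid\.[0-9a-f]+\.(eax|ebx|ecx|edx)' || true
}

# Example (the path is a placeholder for the VM's real VMX file):
#   list_cpuid_masks < /vmfs/volumes/<datastore>/<vm>/<vm>.vmx
```

An empty result suggests the VM has no custom masks under that assumption; the vSphere Client dialog remains the place to actually reset them.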

I was then able to cold migrate the VMs back to the new cluster and power them on. One problem remained, though: how to move the vCenter Server and its SQL Server database VM into the cluster, since they couldn't be powered off and moved through vCenter like the others. I tried to vMotion them while they were on, but got the same error as above.

The answer to this, courtesy of VMware knowledge base article 1013111, is to open a vSphere Client connection directly to the host running the vCenter and SQL Server VMs, power off the VMs, right-click on them and select Remove from Inventory. Then open another vSphere Client connection directly to a host in the EVC-enabled cluster, and browse the datastore containing these VMs. Once located, right-click each VMX file and select Add to Inventory. This registers the VMs with the host, and you can power them on (be sure to say you moved the VM when prompted). Once vCenter has started (which can take some time on low-end servers), open a new vSphere Client session to the vCenter Server and you should be able to see the VMs in the correct cluster.