Wednesday, 17 December 2014

Solaris Live Upgrade, ZFS and Zones

I've been working on this problem for a few days and have only just solved it, so thought it might be worth sharing...

Solaris is a very powerful operating system with some great features. Zones brought Docker-like containers to Solaris back in 2005, ZFS is one of the most advanced filesystems currently available, and the Live Upgrade capability is highly underrated: it's a great way to patch a server while ensuring you have a back-out plan.

All good stuff, but when you put Live Upgrade into a mix of Zones and ZFS, things get a bit flakey.

The issue I was having was that when I ran the "lupc -S" (Live Upgrade Preflight Check) script on my zone, I'd get the following message:

# lupc -S
This system has Patch level/IDR  of 121430-92.
Please check MOS (My Oracle Support) to verify that the latest Live Upgrade patch is installed -- lupc does not verify patch versions.

Zonepath of zone is the mountpoint of top level dataset.
This configuration is unsupported


Oracle has a document on My Oracle Support, "List of currently unsupported Live Upgrade (LU) configurations (Doc ID 1396382.1)", which lists a lot of ways in which Live Upgrade won't work(!). For the top-level dataset issue, the document gives the following text:

If ZFS root pool resides on one pool (say rpool) with zone residing on toplevel dataset of a different pool (say newpool) mounted on /newpool i.e. zonepath=/newpool, the lucreate would fail.

Okay, except that's not what I've got. My zone, s1, has a zonepath set to /zones/s1. The zpool is called "zones" and "s1" is a separate ZFS filesystem in this pool.

What the system is actually complaining about is that the zpool is called "zones" and is mounted as "/zones". The workaround is to set the ZFS mountpoint to be something different from the pool name.
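Put another way, lupc rejects any zonepath that sits at (or directly under) the pool's own mountpoint. Here is a rough sketch of that check in shell; this is my reading of the rule, not lupc's actual logic, and the pool/zonepath names are taken from my setup:

```shell
# Flag a zonepath that lives at or under the pool's top-level mountpoint
# (assumes the pool is mounted at /<poolname>, the ZFS default).
POOL=zones
ZONEPATH=/zones/s1   # the problem layout described above

case "$ZONEPATH" in
  "/$POOL"|"/$POOL"/*) echo "unsupported: zonepath is under the pool's top-level mountpoint" ;;
  *)                   echo "ok" ;;
esac
```

With the workaround below, a zonepath of /zoneroot/s2 passes this check even though the dataset still lives in the "zones" pool, because the mountpoint no longer matches the pool name.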

For example, I created a new ZFS filesystem under "zones" called "zoneroot":

# zfs create zones/zoneroot

Then (and this is the important bit), I set the mountpoint to something else:

# zfs set mountpoint=/zoneroot zones/zoneroot

Running zfs list for this dataset shows:

zones/zoneroot                  1.80G   122G    32K  /zoneroot

Now, I can create a zone, let's call it "s2":

# zonecfg -z s2
s2: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:s2> create
zonecfg:s2> set zonepath=/zoneroot/s2

zonecfg:s2> verify
zonecfg:s2> commit
zonecfg:s2> exit


On installing this zone, a new ZFS filesystem is created: /zoneroot/s2.

Now, when running the "lupc -S" command, Live Upgrade no longer complains about an unsupported configuration!



Saturday, 15 February 2014

N36L and N54L in the same vSphere cluster

In the home lab I have a two-node cluster built around the HP Microserver N36L. Recently I took delivery of an N54L which I wanted to add to the cluster.

Although it was straightforward to add the new host, I was unable to vMotion VMs between the two types of CPU as there was a CPU incompatibility:



Okay, so the CPUs are different between the two servers, but they are both AMD, so I figured I could make it work by enabling Enhanced vMotion Compatibility (EVC). Unfortunately, this complained that the CPUs were either missing features, or that the VMs were using these CPU features:


As it turned out, the problem was due to the virtual machines in the cluster.

In order to work around the error, I powered off each VM in the cluster, moved it to a host outside of the cluster, edited the VM's settings and selected Options > CPUID Mask > Advanced > Reset all to Default.

This cleared a bunch of flags that had been set at some point (not by me; it must have been an automatic change). Once that was done, I was able to configure the cluster to use EVC for AMD "Generation 2" processors.

I was then able to cold migrate the VMs back to the new cluster and power them on. One problem remained though: how to move the vCenter Server and its SQL Server database VM into the cluster. I tried to vMotion them while they were powered on, but got the same error as above.


The answer to this, courtesy of VMware knowledge base article 1013111, is as follows:
  1. Open a vSphere Client connection directly to the host running the vCenter and SQL Server VMs, power off the VMs, then right-click on them and select Remove from Inventory.
  2. Open another vSphere Client connection directly to a host in the EVC-enabled cluster and browse the datastore containing these VMs.
  3. Once located, right-click the VMX file and add it to inventory. This registers the VMs with the host.
  4. Power on the VMs (be sure to say you moved the VM when prompted).
Once loaded (which can take some time on low-end servers), open a new vSphere Client session to the vCenter Server and you should be able to see the VMs in the correct cluster.

Monday, 12 August 2013

VNXe: Using Unisphere remotely

I'm not sure if this has always been in Unisphere or was added as a recent release, but I've only just discovered this feature...

The VNXe is managed by a web-based Flash application called Unisphere. This application makes use of smooth fades for menus and screen information. All very impressive, until you're on an RDP session at which point it's mildly annoying.

Fortunately there is an option to turn the fading off. To enable it, go to Settings and then Preferences, tick the box labelled "Optimize for remote management access" and click "Apply Changes".


The application will now feel a lot snappier.

Tuesday, 6 August 2013

EMC VNXe volume layout

Because of the way the VNXe "simplifies" the allocation and management of storage, it can be difficult to work out just what's happening under the hood. This can result in some unexpected behaviour...

When we originally purchased the VNXe, we only had six NL-SAS disks which were automatically assigned to the "Capacity Pool". The six disks were configured by the VNXe to be part of a RAID Group in a RAID 6 (4+2) configuration (i.e., four data disks, two parity disks). From this pool, we carved out a volume. This volume is striped across all six disks:
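As a quick sketch of the RAID 6 (4+2) arithmetic: four disks' worth of every six go to data, two to parity. The drive size below is an assumption purely for illustration:

```shell
# RAID 6 (4+2): of every 6 disks, 4 hold data and 2 hold parity.
DISKS=6
PARITY=2
DISK_TB=3   # assumed 3TB NL-SAS drives, for illustration only
echo "usable: $(( (DISKS - PARITY) * DISK_TB )) TB of $(( DISKS * DISK_TB )) TB raw"
```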

Figure 1 - single RAID Group


Additional volumes could be created, assuming there was capacity in the pool, and would also be striped across the disks in the RAID Group.

Since the original installation, we have added a couple of expansion trays and now have a total of 24 NL-SAS disks. The VNXe has configured three extra RAID6 (4+2) Groups, but does not automatically rebalance existing volumes across all the RAID Groups in the pool.

When creating a new volume, the VNXe will now stripe across all four RAID Groups (in fact, four RAID Groups is the maximum that the VNXe will stripe over, regardless of the number of RAID Groups in a pool). This means that although two volumes are created from the same pool, and may be of equal size, they may have different performance characteristics. Volume B should have four times the IOPS of Volume A:
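Back-of-envelope numbers make the difference concrete. Assuming roughly 75 random IOPS per 7.2K NL-SAS spindle (a common rule of thumb, not a VNXe figure):

```shell
SPINDLE_IOPS=75      # assumed per-disk random IOPS for 7.2K NL-SAS
DISKS_PER_RG=6       # each RAID Group here is RAID 6 (4+2)
for RGS in 1 4; do
  echo "striped over $RGS RAID Group(s): ~$(( RGS * DISKS_PER_RG * SPINDLE_IOPS )) IOPS"
done
```

So Volume A, on one RAID Group, tops out around 450 IOPS while Volume B, striped over four, can reach around 1800, despite both coming from the same pool.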

Figure 2 - multiple RAID Groups



A further complication may arise if the first volume created completely filled the RAID Group. In this instance, the VNXe will not be able to use the first six disks and the second volume will be striped across 18 disks.

Figure 3 - Surprise!


Unfortunately, because the VNXe is "simplified", it's not possible to see this from the Unisphere web interface. EMC seem to assume that the only metric non-storage specialists care about is capacity. Not true!

Fortunately, there is a way to identify how many disks are actually assigned to a volume, even if it's not something that can be tweaked. To do this, open an SSH session (see my previous blog post on how to do this) and run the svc_neo_map command:


svc_neo_map --fs=volume_name

The list of filesystems configured on the VNXe can be viewed in Unisphere under System > Storage Resource Health. The svc_neo_map command runs a bunch of internal commands and outputs a lot of data, but it's the last part, the output of "c4admintool -c enum_disks", that interests us here. This shows the number and identity of the disks that have been assigned to the filesystem. In the following example, the volume is spread over six disks:

# Show disk info, based on wwns from RG.
root> c4admintool -c enum_disks

Disk #    0  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) Flag ()  Phys Cap: 3846921026(0xe54b5b42) Disk WWID      wwn = 06:00:00:00:05:00:00:00:0f:00:00:00:00:01:00:03
Disk #    1  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) Flag ()  Phys Cap: 3846921026(0xe54b5b42) Disk WWID      wwn = 06:00:00:00:05:00:00:00:10:00:00:00:01:01:00:03
Disk #    2  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) Flag ()  Phys Cap: 3846921026(0xe54b5b42) Disk WWID      wwn = 06:00:00:00:05:00:00:00:11:00:00:00:02:01:00:03
Disk #    3  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) Flag ()  Phys Cap: 3846921026(0xe54b5b42) Disk WWID      wwn = 06:00:00:00:05:00:00:00:12:00:00:00:03:01:00:03
Disk #    4  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) Flag ()  Phys Cap: 3846921026(0xe54b5b42) Disk WWID      wwn = 06:00:00:00:05:00:00:00:13:00:00:00:04:01:00:03
Disk #    5  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) Flag ()  Phys Cap: 3846921026(0xe54b5b42) Disk WWID      wwn = 06:00:00:00:05:00:00:00:14:00:00:00:05:01:00:03
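To just count spindles rather than eyeball the listing, the enum_disks output can be captured to a file and piped through grep. A sketch, with the sample lines abbreviated copies of the output above:

```shell
# Save a captured enum_disks listing (abbreviated here) and count the disks
cat > /tmp/enum_disks.txt <<'EOF'
Disk #    0  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) ...
Disk #    1  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) ...
Disk #    2  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) ...
Disk #    3  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) ...
Disk #    4  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) ...
Disk #    5  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) ...
EOF
grep -c '^Disk #' /tmp/enum_disks.txt
```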


What do you do if you find yourself unable to take advantage of all your RAID Groups? As mentioned above, the VNXe does not auto balance existing volumes when new disks are added. The only way to take advantage of the new disks is to create a new volume, copy everything across and delete the original. Not ideal. Even the VNXe's bigger sibling, the VNX, had this limitation until the recent "Inyo" release. Hopefully, the new auto balance feature will trickle down and appear in a future VNXe update.

I think as it stands, the current VNXe Unisphere is too simple. While I'm glad that the VNXe hides the FLARE LUNs and DART dvols, slices, stripes, metas etc from the user, it needs to go further in showing where a volume is going to be placed and what performance can be expected. This is something that even non-storage admins care about!

Any comments/corrections welcome.

Monday, 15 July 2013

Veeam Management Suite Wish List


We've been running Veeam Backup & Replication (B&R) for a couple of years after I saw it demoed at the London VMUG. It's a great product that works well and has proven itself many times when we've had to restore a VM. It's not perfect (what is?) but overall, it's definitely our go-to backup product.

The Veeam Management Suite is a package that combines B&R with Veeam ONE, itself a suite of three products: Veeam Monitor, Reporter and Business View.

Veeam ONE competes with other VMware monitoring and reporting solutions such as VMware's vCenter Operations (VCOPS) and VMTurbo Operations Manager.

We opted for Veeam ONE as it provides a good balance between cost and features. Personally, I prefer the VCOPS user interface (especially the way it now integrates with the vSphere Web Client) and it's a seriously powerful product, but the cost is prohibitive in our environment.

We did look seriously at VMTurbo, and while I think its "Intelligent Workload Management" and "Economic Scheduling Engine" are very clever, I didn't think that the interface was accessible enough for our IT operations support team. Very whizzy, but too complicated.

In contrast, although Veeam ONE is less sophisticated than some of the above products in certain areas, the Veeam ONE Monitor interface is very functional and gives clear remediation steps for alarms. We needed to get a product that could help us with our proactive monitoring and ongoing capacity planning, and Veeam ONE meets these needs.

With both products due a version 7 refresh this year, these are the highlights that I'm excited about:

vCloud Director support


Without a doubt, this is the "must-have" feature for us and the arrival of version 7 will coincide with the deployment of vCloud Director (vCD) into our environment.

Having the ability to back up all the vCloud metadata and restore vApps back into the vCD infrastructure is an essential component that has, until now, been unsatisfactory. Veeam should be rightly proud that they have the first vCD backup/restore solution.

vSphere Web Client plugin


From vSphere 5.1, the Web Client has become the primary interface to vCenter. Veeam B&R 7 will introduce a plugin that allows visibility into backup jobs through the Web Client.

It looks pretty decent and this kind of integration reduces the number of applications vSphere admins have to open in order to keep track of their infrastructure.

Tape support


Tape is dead. Except for all the places where it isn't!

For many organisations, bandwidth costs mean off-site backups can only be effectively done using tape. Apparently any tape drive or library that can be seen by Windows will be usable and this will help many organisations overcome a clunky backup approach that currently consists of multiple backup applications.

WAN acceleration


I'm tentatively excited about this because although the compression ratios look impressive, I'll need to do some testing when B&R 7 is released to see how effective it is in our environment. Certainly having another option alongside the much wanted tape support would be good for those really important, mission-critical VMs.

The Wishlist


Although the above additions will make Veeam 7 a compelling upgrade in itself, there are some features I'd like to see:

A vCenter specific backup job


There has been a long-running "challenge" in backing up vCenter using VSS. The provided solution is to manually add the ESXi host on which vCenter is located and back up that way. Which is fine until the vCenter VM is moved to another host (via automated DRS, manual vMotion or the result of an HA failover).

Given that computers are designed to automate repetitive tasks for us, I'd like to see a Veeam backup job type designed specifically for vCenter which, when run, queries vCenter to find out which host it's on and then connects directly to that host to perform the backup.

I appreciate that there is likely to be more complexity under the hood to solve this problem, but the current method is unsatisfactory and requires too much manual intervention.

NetApp snapshot support


Veeam has obviously got a good relationship with HP and has introduced the ability to back up LeftHand SAN snapshots. As a NetApp customer, I'd love to see the same functionality for the NetApp filers.

SRM-lite


VMware Site Recovery Manager (SRM) is very impressive and allows an admin to define a "run book", essentially a series of steps to perform should DR be invoked. This includes the order in which VMs should be started, any IP address changes or network settings that need to be modified for the remote site, etc. With it, a site can be failed over (and back) by clicking a button.

Although Veeam B&R is currently focused on replication of VMs to another site, there appears to be a huge gap in the market for something that sits above the replication functionality and provides SRM functionality at a lower price point.

Backup and Restore Self Service


vSphere 5.1 is bundled with the VMware Data Protection (VDP) appliance. Based on EMC Avamar technology, this basic (and limited) backup product has a couple of nice features. Firstly, it integrates seamlessly with the vSphere Web Client (as will Avamar 7). Secondly, it provides a self-service interface that allows end users to restore their own files (as demoed on Chad's blog). I know that Veeam Enterprise Manager has similar functionality, but I'd like to take it a step further.

It would be useful to give users the ability to specify their own VM backup and restore jobs. As vCloud Director enables end users to provision their own VMs, there needs to be a way to allow these VMs to be scheduled for backup and then restore them in the future, without the need to log a call with IT.

I appreciate that there are lots of factors to consider when allowing end users to run their own backups such as how to quota backup repository space or prevent developers from backing up their VMs every hour ("just in case"), but in a multi-tenant environment, having the ability to delegate this functionality becomes important.

It's possible that this functionality may be in version 7 (I don't know, I'm not a beta tester).

Veeam ONE UI integration


There is some UI inconsistency across the Veeam ONE products. Monitor is a traditional Windows application, while Reporter and Business View are web applications. None of the above look like the Backup & Replication user interface.

Similarly, the Veeam Enterprise Manager is a web application, but has a different look and feel to the other products.

Although it's not the end of the world, I'd like to see some tighter integration between the three ONE products, Enterprise Manager and B&R to enable the oft-promised "single pane of glass". At the moment, the multiple separate icons on my desktop highlight how there is more integration to do at the front end.

As I've previously commented, I like it when applications smoothly integrate with vSphere Web Client. VMware have done this with VCOPS Foundation, and Veeam themselves will with B&R 7. This would be the logical conclusion of a UI merge for Veeam ONE.

And while the existing interfaces are functional, they're not beautiful and lack the "wow" factor when compared with vCenter Operations Manager.

Veeam ONE appliance


Okay, so this is unlikely to happen, but many of the competing solutions come in the form of virtual appliances on Linux. They don't require a separate database install or Windows licence. With VMware moving towards a Linux based appliance for vCenter Server, this appears to be the future.

I'm not really expecting this one, since Veeam support both VMware and Hyper-V, but hey, it's a wish list.

Chargeback Reporting


I've not had the opportunity to dig into all the reports thoroughly, so this functionality may already exist (but I've not found it after a quick look).

Veeam Business View provides a way to group VMs logically, such as by business unit. I'd like the ability to define costs for vCPU, vRAM, datastores and operating systems and to assign these properties to specific business units (or Business View entities). With the addition of a report pack, we would then be able to generate custom chargeback reports for our vCloud environment.

If this sounds like VMware Chargeback Manager, that's because Veeam seem to be almost there with it. Most of the pieces are already in place and it would be a useful addition.

Final thoughts


The Veeam Management Suite, and especially the ONE products, provides a lower-cost alternative to the expensive, do-everything VMware solutions. Veeam could expand its existing products to compete in other areas where VMware is not affordable to many companies, specifically disaster recovery and VM chargeback.

Last year's "free" upgrade for vSphere Enterprise Plus customers to the vCloud Suite Standard Edition has given many organisations access to vCloud Director and vCloud Networking and Security. But the vCloud Suite Standard Edition lacks chargeback and SRM functionality.

This has to be an opportunity for third parties such as Veeam...

Saturday, 29 June 2013

vSphere 6.0 wish list

In recent history, VMware have adopted the Intel-like "tick-tock" approach to software releases. In other words, each "tick" is a major release (e.g., 5.0) and each "tock" is a minor release (e.g., 5.1). This has occurred on an annual basis for the last few years.

If this approach continues, then vSphere 6.0 is presumably due out later in 2013. With that in mind, here are some of the things I would like to see in the next release... (I'm not a beta tester, so this is all pure conjecture).

Make vCenter Server easier to install


vSphere 5.1 changed the architecture of vCenter Server and introduced separate Single Sign On (SSO) and Inventory Services. My experience of installing all these components on a single server has been fairly easy, but a quick look around the forums and blogs shows that some people have had real problems updating existing installations or deploying more complex architectures.

One way to make this easier would be to...

Make the VCSA a first class citizen


The vCenter Server Appliance has been an alternative to the "traditional" Windows-based vCenter Server for the last couple of releases. The advantage of the VCSA is that it is extremely easy to deploy and update. The downside is that it does not have all the features of the Windows vCenter Server, and the built-in database is only suitable for small deployments (five hosts and 50 VMs). Bigger installations require an external database.

A VCSA that fully supports all vCenter functions (including things like linked mode) and can handle large workloads without requiring SQL Server or Oracle would be a step in the right direction. The dependency on third party databases could be resolved if VMware were to...

Bundle vFabric Postgres


VMware products support a number of different databases:
  • Traditional vCenter Server supports SQL Server, Oracle and DB2
  • The vCenter Server Appliance supports its own internal Postgres database or external Oracle database
  • vShield Manager bundles its own Postgres database
  • vCloud Director supports an external Oracle or SQL Server database. The vCloud Director appliance bundles an internal Postgres database
  • vCenter Operations Manager supports Postgres, Oracle or SQL Server
While the internal databases are suitable for lab work and test environments, for production workloads there is currently a requirement to use a third-party database, with the licensing overheads this incurs.

VMware have their own vFabric Postgres database (itself a virtual appliance) and bundling this could allow VMware users to point all their application servers to a single, scalable database server, without needing to buy database licences. Bundling a cut-down version of vFabric Data Director would make management easier and also expose VMware admins to the vFabric product suite, gaining mindshare in the process.

Make vCloud Director easier to install


vCloud Director runs on a Linux VM and has a dependency on either an Oracle or SQL Server database. There are a number of manual steps to get a vCloud Director cell installed, requiring some messing around at the command line to configure things, followed by running SQL scripts to set up the database. I'm not averse to the Linux command line, but would prefer it if the install was more straightforward.

VMware provide a vCloud Director appliance for testing, but it's not currently supported for production workloads. It would make sense for VMware to enhance the appliance and make it a fully supported option for production use, removing unnecessary complexity in the setup process.

This theme of turning all VMware products into appliances extends to the...

VUM appliance


Please can we have a Linux based VMware Update Manager appliance? If everything else is moving to appliances, having a dedicated Windows host for patching seems overkill. Alternatively, bundle the functionality in with the VCSA.

Update vCNS Manager


The vCloud Networking and Security (vCNS) Manager is a rebranding of what was previously called vShield Manager. The web UI of this product is looking pretty dated and provides a completely different user experience to the new vSphere Web Client. It would be good to see this UI refreshed in the next release and integrated more closely with vSphere and vCloud Director.

Make the vSphere Web Client faster


The vSphere Web Client looks very nice and instantly makes the old .NET client look dated. However, for day to day operations I still find myself using the old client, simply because it's faster for most operations.

I want to use the new client, but it seems really sluggish a lot of the time, or perhaps it's simply that my Windows PC isn't quad core with 16GB RAM. VMware need to spend some time refactoring and improving the performance of the web client.


Infrastructure Overhead

This is more to do with those of us with home labs, so I'm not seriously expecting VMware to do anything about this...

The system requirements for VMware management servers are getting ridiculous and, while I appreciate the new functionality, there doesn't appear to be much effort going into minimising system resources. For example:
  • vCenter Server with Inventory and SSO services on the same VM: 10GB RAM recommended
  • vShield Manager: 8GB RAM required
  • vCenter Operations Manager: 10GB RAM required

While these requirements might be fine in large, production environments, for those of us wanting to build a decent lab, it's possible to use all your resources just building the infrastructure, with no capacity left over to actually run VMs!


Make vCenter Operations Manager easier to buy!


The current way of buying VCOPS is missing some obvious options:

There is a free version (VCOPS Foundation) which is missing loads of features.

vCenter Operations Manager Suite can be bought standalone in Standard, Advanced and Enterprise editions, licensed based on the number of VMs or physical hosts monitored, but not using a per-processor licensing model.

It's possible to buy vSphere with Operations Manager (VSOM) which provides per-processor licensing. Regardless of which vSphere edition (Standard, Enterprise or Enterprise Plus), you get a copy of vCenter Operations Manager Standard.

The vCloud Suite Advanced bundles what looks like Operations Manager Advanced per-processor licensing.

The vCloud Suite Enterprise bundles what looks like Operations Manager Enterprise per-processor licensing.

But what about those of us who had vSphere Enterprise Plus (pre-VSOM) and took advantage of the offer to upgrade to vCloud Suite Standard? Can we get a per-processor licensed version of VCOPS added?

Not that I can see. We're stuck with VCOPS Foundation.

The point is moot for me. We recently bought Veeam ONE for our monitoring and reporting solution. While it's not as sexy as VCOPS, it's a lot more affordable and we were able to buy it with per-processor licensing.

VCOPS is a great product but its licensing is too complicated and not flexible enough. And it's still too expensive.

Roll on VMworld...


Although I won't be attending VMworld this year (boo!), I'll be watching the keynote, blogs and Twitter to see what new products are announced. I think the core products are pretty much done, so at this point I think it's about improving ease of deployment and management. Whether any of the above wishes are actually implemented will become clear in a few months...

Saturday, 6 April 2013

Home Lab: Storage Upgrade

I've been quiet over the last few months due to a very busy work project. With that now out of the way (for now!), I can update on a few of the things I've been doing with the home lab...

As detailed in a previous post, I've been running my NAS/SAN on Nexenta. This was originally running as a VSA under ESXi. While this configuration worked very well for me, I did notice that on the Microserver, the Nexenta VM was being CPU and RAM constrained. When my old ML110 G5 was replaced by the G7 servers, I decided to reprovision the Nexenta VM onto dedicated hardware.

The ML110 G5 has six SATA ports and I configured them as follows:
  1. 250GB SATA
  2. 60GB SSD
  3. 1TB SATA
  4. 1TB SATA
  5. 1TB SATA
  6. 1TB SATA
The Nexenta operating system is installed on the 250GB disk. While this is not mirrored, there is nothing particularly important on this disk and if it dies then it can be replaced, Nexenta re-installed and the important zpool reimported.

The 60GB SSD is configured as a L2ARC (read) cache device.

The four 1TB hard drives are configured as RAID10. This gives better performance than my original configuration, which used RAID-Z (similar to RAID 5), and is used for both NFS and iSCSI shares. It's where the "important stuff" is stored.
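The tradeoff between the two layouts for four 1TB drives looks roughly like this (capacities approximate, ignoring formatting overheads):

```shell
DISK_TB=1
echo "RAID-Z (3 data + 1 parity): $(( 3 * DISK_TB )) TB usable, better capacity"
echo "RAID10 (2 mirrored pairs):  $(( 2 * DISK_TB )) TB usable, better random IOPS"
```

In other words, moving to RAID10 cost me a third of the usable space in exchange for the improved performance.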

I added a lights-out board for the ML110 G5 which I found on eBay. All my other servers have lights-out management and it means I don't need to have a monitor attached.

In addition to the capacity and improved performance due to the faster RAID, the other advantage of going physical is that I can dedicate the entire 8GB of RAM in the server to Nexenta (up from 4GB on the VSA). Nexenta also benefits from having two Xeon cores dedicated to it, a significant increase over the cores in the Microserver.

I've not had the chance to benchmark the new build, but it "feels" faster than running under a VM and it also frees up CPU and RAM on my Microservers for more VMs.