Tuesday, 17 February 2015

Thoughts on migrating from vCloud Director

A couple of years ago, VMware provided a "free" upgrade from vSphere Enterprise Plus to the "vCloud Suite" standard edition. This gave enterprises access to the vCloud Networking and Security (vCNS) and vCloud Director (vCD) products, enabling vApp firewalling and routing, self service provisioning and multi-tenancy support. Third party companies such as Veeam and VMTurbo started adding vCloud Director support into their products and the future seemed bright. We had the tools to build private clouds.

Then VMware bought Dynamic Ops and decided to refocus enterprise customers on what was now called vCloud Automation Center (vCAC). vCloud Director would continue as a Service Provider tool only. As mild compensation, a cut down version of vCAC was added to the vCloud Suite for standard edition users.

With the release of vCloud Suite 6.0, vCD and vCNS appear to have been dropped. While VMware are continuing support for these products through to 2017, it is obvious that they are not the future if you are in the enterprise space.

So what should vCD and vCNS users do?

The answer VMware gave back in 2013 when this happened was to look to vCAC (now vRealize Automation) to replace the portal aspects of vCD. That blog post gave a suggestion that some vCloud Director functionality would move "up" to vCAC and other functionality would move "down" to vCenter Server:

VMware has been pretty much silent on the subject ever since.

For vCAC/vRealize Automation to successfully replace vCD, it needs to:
  • Support multiple organisations/tenants
  • Enable delegation of organisation VMs to non-IT end users
  • Provide IT with tools to easily assign compute, memory and storage resources to specific organisations
  • Allow for the creation of standard images through a service catalogue
  • Allow for the creation and dynamic implementation of networks and complex vApps
  • Allow for firewall/routing/VPN between vApp networks
  • Provide integration points for third party backup and monitoring tools

At this stage, I'm not sure if vCAC can do this or not. My limited exposure to the product (thanks to a presentation at the South West VMUG) left me with a feeling that to do anything with vCAC required a fair amount of development work and integration with vCenter Orchestrator.

So with vCD's migration path unclear, what about vCNS?

In the knowledge base article, End of Availability (EOA) of vCloud Networking and Security (vCNS) in vCloud Suite 6.0 (2107201), VMware recommends that customers migrate to NSX at a "discounted price". Hmm, so if customers don't pay more, what do they lose? Edge and App firewalls? VPN into vApps? Load balancing? So how will more complex vApps with private networks utilising network pools work in this situation? Will any of this even be possible without NSX?

Again, more questions than answers.

In the past, some customers were burnt when VMware deprecated Lab Manager in preference to vCloud Director, and they've done it again now with vCloud Director to vRealize Automation and vCNS to NSX. This creates a lot of work for customers, for little apparent gain, and does nothing to instil a sense of confidence that the "new" solution is going to be around in five years.

To VMware: you need to improve communication in this area. Customers need to make plans, and the silence regarding on-premises private cloud is creating uncertainty. At the moment, there seems to be no like-for-like migration plan that doesn't cost the customer more, both in terms of effort required and additional SKUs.

And the "discounted price" for NSX is frankly insulting. Don't sell enterprises the dream of private cloud, provide the tools to build it, then pull the rug from under us because you have a new product you want to sell. Providing a discount that expires in a year is useless to organisations who have already submitted their budget requests.

For me, I guess I need to schedule some time in to see what vRealize Automation is actually capable of. But I'll also be watching closely to see what others in our position are doing and if there are any alternative options.

[Update - 4th March 2015:  The VMware knowledge base article referenced above has gone offline. Perhaps VMware are re-evaluating???]

Monday, 16 February 2015

Passing the VCP550D exam

Last year VMware announced that the VMware Certified Professional (VCP) certification would only be valid for two years, ostensibly to ensure that candidates didn't become out of date. Now, I have no problems with recertifying when the certification isn't version specific (e.g., CCNA), but because the VCP is tied to a release of software (VCP4, VCP5 etc.), forcing a recertification does seem a bit like a cash-grab by VMware Education.

With my VCP scheduled to expire next month, I spent a couple of weeks revising and took the exam today. Fortunately I passed with a score of 340 (the passing score is 300). To be honest, I'm a bit disappointed that I didn't score higher, but a pass is a pass and it got the job done.

The exam I took was the VCP550D "delta", which focuses on the differences between vSphere 5.0/5.1 and 5.5. However, it would be worth revising the standard VCP material too as there are a lot of generic questions. The exam blueprint for the 550D is the same as for the 550, which didn't help much.

For revision, I did the following:

  • Took the free VMware vSphere What's New Fundamentals [v5.5] course
  • Took the free VMware VSAN 101 course, which has subsequently been replaced by the VSAN 6.0 course
  • Signed up for the Pluralsight 10 day trial subscription and took the VMware vSphere 5.5 New Features course
  • Built a nested home lab environment to test a bunch of new features. William Lam's OVF template for creating Nested ESXi VSAN clusters was very helpful in getting an environment up and running quickly (as was using the vCenter Server Appliance).

There are a number of features that I specifically focussed on when revising because I don't use them day-to-day, including: vSphere Data Protection (we use Veeam), vSphere Replication (we use Veeam), VSAN (we have a SAN/NAS) and VCOPS. Getting hands on with these features in the lab was extremely helpful, although make sure you're not too rusty on the "basic" VCP questions covering network, storage, DRS/HA, update manager etc.

The exam itself is online and open book, but this doesn't make passing it a foregone conclusion. You still need to know your stuff! I found it helpful to have my home lab powered up and logged in, along with the VCOPS dashboard in case I needed to quickly cross-reference something. I made sure I had access to the VMware PDFs (but didn't actually use them). Having access to Google was very useful too(!).

With 65 questions in 75 minutes, there was plenty of time to go through the exam and then have time to review "marked" questions. I did use all my time and didn't finish the review, but obviously did enough to pass.

If you are a VCP5 holder, you only have until the 10th March 2015 to recertify. Doing the VCP550D is the most efficient and easiest way to stay certified.

Wednesday, 17 December 2014

Solaris Live Upgrade, ZFS and Zones

I've been working on this problem for a few days and have only just solved it, so thought it might be worth sharing...

Solaris is a very powerful operating system with some great features. Zones brought Docker-like containers to Solaris back in 2005, ZFS is one of the most advanced filesystems currently available, and the Live Upgrade capability is highly underrated and is a great way to patch a server while ensuring you have a back-out plan.

All good stuff, but when you put Live Upgrade into a mix of Zones and ZFS, things get a bit flakey.

The issue I was having was that when I ran the "lupc -S" (Live Upgrade Preflight Check) script on my zone, I'd get the following message:

# lupc -S
This system has Patch level/IDR  of 121430-92.
Please check MOS (My Oracle Support) to verify that the latest Live Upgrade patch is installed -- lupc does not verify patch versions.

Zonepath of zone is the mountpoint of top level dataset.
This configuration is unsupported

Oracle has a document on My Oracle Support: "List of currently unsupported Live Upgrade (LU) configurations (Doc ID 1396382.1)" which lists a lot of ways in which Live Upgrade won't work(!). On checking this document for the top level dataset issue, it gives the following text:

If ZFS root pool resides on one pool (say rpool) with zone residing on toplevel dataset of a different pool (say newpool) mounted on /newpool i.e. zonepath=/newpool, the lucreate would fail.

Okay, except that's not what I've got. My zone, s1, has a zonepath set to /zones/s1. The zpool is called "zones" and "s1" is a separate ZFS filesystem in this pool.

What the system is actually complaining about is that the zpool is called "zones" and is mounted as "/zones". The workaround is to set the ZFS mountpoint to be something different from the pool name.

For example, I created a new ZFS filesystem under zones called "zoneroot":

# zfs create zones/zoneroot

Then (and this is the important bit), I set the mountpoint to something else:

# zfs set mountpoint=/zoneroot zones/zoneroot

Running zfs list for this dataset shows:

zones/zoneroot                  1.80G   122G    32K  /zoneroot

Now, I can create a zone, let's call it "s2":

# zonecfg -z s2
s2: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:s2> create
zonecfg:s2> set zonepath=/zoneroot/s2

zonecfg:s2> verify
zonecfg:s2> commit
zonecfg:s2> exit

On installing this zone, a new ZFS file system is created, /zoneroot/s2.

Now, when running the "lupc -S" command, Live Upgrade doesn't complain about unsupported configurations!
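The condition described above can be sketched as a small check. This is only a heuristic that reproduces the two cases in this post; it is not how lupc itself works, and the function names are my own. It assumes the dataset list comes from `zfs list -H -o name,mountpoint` (one name and one mountpoint per line).

```python
import posixpath


def parse_zfs_list(text):
    """Parse `zfs list -H -o name,mountpoint` output into (name, mountpoint) pairs."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            pairs.append((fields[0], fields[1]))
    return pairs


def zonepath_under_top_level_dataset(zonepath, datasets):
    """Return True if the zonepath's parent directory is the mountpoint of a
    top-level dataset (a dataset name without '/', i.e. the pool itself).

    Assumption: this mirrors the behaviour observed in this post, not the
    real logic inside lupc.
    """
    parent = posixpath.dirname(zonepath.rstrip("/"))
    return any(
        "/" not in name and mountpoint == parent
        for name, mountpoint in datasets
    )
```

With the original layout (pool "zones" mounted at /zones, zonepath /zones/s1) the check returns True; with zones/zoneroot mounted at /zoneroot and a zonepath of /zoneroot/s2 it returns False.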

Saturday, 15 February 2014

N36L and N54L in the same vSphere cluster

In the home lab I have a two-node cluster built around the HP Microserver N36L. Recently I took delivery of an N54L which I wanted to add to the cluster.

Although it was straightforward to add the new host, I was unable to vMotion VMs between the two types of CPU as there was a CPU incompatibility:

Okay, so the CPUs are different between the two servers, but they are both AMD, so I figured I could make it work by enabling Enhanced vMotion Compatibility (EVC). Unfortunately, this complained that the CPUs were either missing features, or that the VMs were using these CPU features:

As it turned out, the problem was due to the virtual machines in the cluster.

To work around the error, I powered off each VM in the cluster, moved it to a host outside of the cluster, edited the settings of the VM and selected Options > CPUID Mask > Advanced > Reset all to Default.

This cleared a bunch of flags that had been set at one time (not by me; must have been an automatic change). Once that was done, I was able to configure the cluster to use EVC for AMD "Generation 2" processors.
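As a rough mental model of why the VMs blocked EVC, here is a minimal sketch using plain sets. It is a simplification: real EVC uses predefined baselines (such as AMD "Generation 2") rather than a computed intersection, and the feature names in the usage note are placeholders, not real CPUID flags.

```python
def evc_baseline(host_features):
    """Features common to every host: the most a cluster-wide baseline could expose.

    `host_features` maps a host name to an iterable of feature names.
    """
    hosts = [set(features) for features in host_features.values()]
    return set.intersection(*hosts) if hosts else set()


def blocking_features(vm_features, baseline):
    """Features a VM has been exposed to that the baseline does not offer.

    A non-empty result is the kind of situation that stops the VM moving
    to, or running in, a cluster limited to that baseline.
    """
    return set(vm_features) - set(baseline)
```

For example, with hosts {"n36l": {"base", "f1"}, "n54l": {"base", "f1", "f2"}} the baseline is {"base", "f1"}, and a VM that has been exposed to "f2" is reported as blocked by that feature.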

I was then able to cold migrate the VMs back to the new cluster and power them on. One problem though: how to move the vCenter Server and its SQL Server database VM into the cluster without powering them off. I tried to vMotion them while they were on, but got the same error as above.

The answer to this, courtesy of VMware knowledge base article 1013111, is to open a vSphere Client connection directly to the host running the vCenter and SQL Server VMs, power off the VMs, right-click on them and Remove from Inventory. Then open another vSphere Client connection directly to a host in the EVC-enabled cluster, and then browse the datastore containing these VMs. Once located, right-click the VMX file and add to inventory. This will register the VMs with the host and you can power them on (be sure to say you moved the VM when prompted). Once loaded (which can take some time on low end servers), open a new vSphere Client session to the vCenter Server and you should be able to see the VMs in the correct cluster.

Monday, 12 August 2013

VNXe: Using Unisphere remotely

I'm not sure if this has always been in Unisphere or was added in a recent release, but I've only just discovered this feature...

The VNXe is managed by a web-based Flash application called Unisphere. This application makes use of smooth fades for menus and screen information. All very impressive, until you're on an RDP session at which point it's mildly annoying.

Fortunately there is an option to allow the fading to be turned off. To enable, go to Settings and then Preferences. Tick the box labelled "Optimize for remote management access" and click "Apply Changes".

The application will now feel a lot snappier.

Tuesday, 6 August 2013

EMC VNXe volume layout

Because of the way the VNXe "simplifies" the allocation and management of storage, it can be difficult to work out just what's happening under the hood. This can result in some unexpected behaviour...

When we originally purchased the VNXe, we only had six NL-SAS disks which were automatically assigned to the "Capacity Pool". The six disks were configured by the VNXe to be part of a RAID Group in a RAID 6 (4+2) configuration (i.e., four data disks, two parity disks). From this pool, we carved out a volume. This volume is striped across all six disks:

Figure 1 - single RAID Group

Additional volumes could be created assuming there is capacity in the pool and would also be striped across the disks in the RAID Group.

Since the original installation, we have added a couple of expansion trays and now have a total of 24 NL-SAS disks. The VNXe has configured three extra RAID 6 (4+2) Groups, but does not automatically rebalance existing volumes across all the RAID Groups in the pool.

When creating a new volume, the VNXe will now stripe across all four RAID Groups (in fact, four RAID Groups is the maximum that the VNXe will stripe over, regardless of the number of RAID Groups in a pool). This means that although two volumes are created from the same pool, and may be of equal size, there may be different performance characteristics. Volume B should have four times the IOPS of Volume A:

Figure 2 - multiple RAID Groups

A further complication may arise if the first volume created completely filled the RAID Group. In this instance, the VNXe will not be able to use the first six disks and the second volume will be striped across 18 disks.

Figure 3 - Surprise!
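The three scenarios above come down to simple arithmetic, sketched here. The constants are taken from this post (RAID 6 4+2 groups, a maximum stripe width of four RAID Groups). The assumption that IOPS scale linearly with the number of disks is mine and is only a rough guide.

```python
DISKS_PER_RAID_GROUP = 6       # RAID 6 (4+2): four data disks plus two parity
DATA_DISKS_PER_RAID_GROUP = 4
MAX_STRIPE_RAID_GROUPS = 4     # widest stripe the VNXe will create, per this post


def stripe_layout(raid_groups_with_free_space):
    """Work out how widely a new volume would be striped."""
    groups = min(raid_groups_with_free_space, MAX_STRIPE_RAID_GROUPS)
    return {
        "raid_groups": groups,
        "disks": groups * DISKS_PER_RAID_GROUP,
        "data_disks": groups * DATA_DISKS_PER_RAID_GROUP,
    }


def relative_iops(volume_groups, baseline_groups):
    """Ratio of disks behind two volumes (assumes IOPS scale with disk count)."""
    volume = stripe_layout(volume_groups)["disks"]
    baseline = stripe_layout(baseline_groups)["disks"]
    return volume / baseline
```

So stripe_layout(1) gives 6 disks (Figure 1), stripe_layout(4) gives 24 disks (Figure 2), and stripe_layout(3) gives 18 disks (Figure 3, where the first RAID Group is already full).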

Unfortunately, because the VNXe is "simplified", it's not possible to see this from the Unisphere web interface. EMC seem to assume that the only metric non-storage specialists care about is capacity. Not true!

Fortunately, there is a way to identify how many disks are actually assigned to a volume, even if it's not something that can be tweaked. To do this, open an SSH session (see my previous blog post on how to do this) and run the svc_neo_map command:

svc_neo_map --fs=volume_name

The list of filesystems configured on the VNXe can be viewed in Unisphere under System > Storage Resource Health. The svc_neo_map command runs a bunch of internal commands and outputs a lot of data, but it's the last part, the output of "c4admintool -c enum_disks", that interests us here. This shows the number and identity of the disks that have been assigned to the filesystem. In the following example, the volume is spread over six disks:

# Show disk info, based on wwns from RG.
root> c4admintool -c enum_disks

Disk #    0  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) Flag ()  Phys Cap: 3846921026(0xe54b5b42) Disk WWID      wwn = 06:00:00:00:05:00:00:00:0f:00:00:00:00:01:00:03
Disk #    1  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) Flag ()  Phys Cap: 3846921026(0xe54b5b42) Disk WWID      wwn = 06:00:00:00:05:00:00:00:10:00:00:00:01:01:00:03
Disk #    2  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) Flag ()  Phys Cap: 3846921026(0xe54b5b42) Disk WWID      wwn = 06:00:00:00:05:00:00:00:11:00:00:00:02:01:00:03
Disk #    3  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) Flag ()  Phys Cap: 3846921026(0xe54b5b42) Disk WWID      wwn = 06:00:00:00:05:00:00:00:12:00:00:00:03:01:00:03
Disk #    4  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) Flag ()  Phys Cap: 3846921026(0xe54b5b42) Disk WWID      wwn = 06:00:00:00:05:00:00:00:13:00:00:00:04:01:00:03
Disk #    5  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) Flag ()  Phys Cap: 3846921026(0xe54b5b42) Disk WWID      wwn = 06:00:00:00:05:00:00:00:14:00:00:00:05:01:00:03
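If you capture this output to a file, counting the disks can be automated. The following is a minimal sketch that assumes the line format shown above. The regular expression and field names are my own and may need adjusting for other firmware versions.

```python
import re

# Matches lines such as:
# Disk #    0  Enc #    1  Type 10(DRIVE_CLASS_SAS_NL, RPM 7202) ... wwn = 06:00:...
DISK_LINE = re.compile(
    r"Disk #\s*(?P<disk>\d+)\s+Enc #\s*(?P<enclosure>\d+)\s+"
    r"Type \d+\((?P<drive_class>\w+), RPM (?P<rpm>\d+)\)"
    r".*?wwn = (?P<wwn>[0-9a-f:]+)"
)


def parse_enum_disks(output):
    """Return one dict per disk line found in `c4admintool -c enum_disks` output."""
    disks = []
    for line in output.splitlines():
        match = DISK_LINE.search(line)
        if match:
            disks.append({
                "disk": int(match["disk"]),
                "enclosure": int(match["enclosure"]),
                "drive_class": match["drive_class"],
                "rpm": int(match["rpm"]),
                "wwn": match["wwn"],
            })
    return disks
```

Calling len(parse_enum_disks(text)) on the example above gives the number of disks behind the filesystem, which is six in this case.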

What do you do if you find yourself unable to take advantage of all your RAID Groups? As mentioned above, the VNXe does not auto balance existing volumes when new disks are added. The only way to take advantage of the new disks is to create a new volume, copy everything across and delete the original. Not ideal. Even the VNXe's bigger sibling, the VNX, had this limitation until the recent "Inyo" release. Hopefully, the new auto balance feature will trickle down and appear in a future VNXe update.

I think as it stands, the current VNXe Unisphere is too simple. While I'm glad that the VNXe hides the FLARE LUNs and DART dvols, slices, stripes, metas etc from the user, it needs to go further in showing where a volume is going to be placed and what performance can be expected. This is something that even non-storage admins care about!

Any comments/corrections welcome.

Monday, 15 July 2013

Veeam Management Suite Wish List

We've been running Veeam Backup & Replication (B&R) for a couple of years after I saw it demoed at the London VMUG. It's a great product that works well and has proven itself many times when we've had to restore a VM. It's not perfect (what is?) but overall, it's definitely our go-to backup product.

The Veeam Management Suite is a package that combines B&R with Veeam ONE, itself a suite of three products: Veeam Monitor, Reporter and Business View.

Veeam ONE competes with other VMware monitoring and reporting solutions such as VMware's vCenter Operations (VCOPS) and VMTurbo Operations Manager.

We opted for Veeam ONE as it provides a good balance between cost and features. Personally, I prefer the VCOPS user interface (especially the way it now integrates with the vSphere Web Client) and it's a seriously powerful product, but the cost is prohibitive in our environment.

We did look seriously at VMTurbo, and while I think its "Intelligent Workload Management" and "Economic Scheduling Engine" are very clever, I didn't think that the interface was accessible enough for our IT operations support team. Very whizzy, but too complicated.

In contrast, although Veeam ONE is less sophisticated than some of the above products in certain areas, the Veeam ONE Monitor interface is very functional and gives clear remediation steps for alarms. We needed to get a product that could help us with our proactive monitoring and ongoing capacity planning, and Veeam ONE meets these needs.

With both products due a version 7 refresh this year, these are the highlights that I'm excited about:

vCloud Director support

Without a doubt, this is the "must-have" feature for us and the arrival of version 7 will coincide with the deployment of vCloud Director (vCD) into our environment.

Having the ability to back up all the vCloud metadata and restore vApps back into the vCD infrastructure is an essential component that has, until now, been unsatisfactory. Veeam should be rightly proud that they have the first vCD backup/restore solution.

vSphere Web Client plugin

From vSphere 5.1, the Web Client has become the primary interface to vCenter. Veeam B&R 7 will introduce a plugin that allows visibility into backup jobs through the Web Client.

It looks pretty decent and this kind of integration reduces the number of applications vSphere admins have to open in order to keep track of their infrastructure.

Tape support

Tape is dead. Except for all the places where it isn't!

For many organisations, bandwidth costs mean off-site backups can only be effectively done using tape. Apparently any tape drive or library that can be seen by Windows will be usable and this will help many organisations overcome a clunky backup approach that currently consists of multiple backup applications.

WAN acceleration

I'm tentatively excited about this because although the compression ratios look impressive, I'll need to do some testing when B&R 7 is released to see how effective it is in our environment. Certainly having another option alongside the much wanted tape support would be good for those really important, mission-critical VMs.

The Wishlist

Although the above additions will make Veeam 7 a compelling upgrade in itself, there are some features I'd like to see:

A vCenter specific backup job

There has been a long running "challenge" in backing up vCenter using VSS. The provided solution is to manually add the ESXi host on which vCenter is located and back up that way. Which is fine until the vCenter VM is moved to another host (via automated DRS, manual vMotion or the result of an HA failover).

Given that computers are designed to automate repetitive tasks for us, I'd like to see a Veeam backup job type that is designed specifically for vCenter and, when run, queries vCenter to find out which host it's on and then directly connects to that host to perform the backup.
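The flow I have in mind is roughly the following. This is purely a sketch of the idea, not anything Veeam provides: `lookup_host` and `backup_via_host` are hypothetical callables standing in for "ask vCenter where the VM lives" and "run the backup through a direct host connection".

```python
def resolve_backup_target(vm_name, lookup_host):
    """Ask the (hypothetical) lookup which ESXi host currently runs the VM."""
    host = lookup_host(vm_name)
    if not host:
        raise LookupError(f"no host found for VM {vm_name!r}")
    return host


def run_vcenter_backup(vm_name, lookup_host, backup_via_host):
    """Resolve the current host at run time, then back up through that host.

    Resolving on every run is the point: the job keeps working after DRS,
    vMotion or an HA failover moves the vCenter VM.
    """
    host = resolve_backup_target(vm_name, lookup_host)
    return backup_via_host(host, vm_name)
```

Because the host is looked up each time the job runs, nothing needs to be edited by hand when the vCenter VM moves.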

I appreciate that there is likely to be more complexity under the hood to solve this problem, but the current method is unsatisfactory and requires too much manual intervention.

NetApp snapshot support

Veeam has obviously got a good relationship with HP and has introduced the ability to back up LeftHand SAN snapshots. As a NetApp customer, I'd love to see the same functionality for the NetApp filers.


VMware Site Recovery Manager (SRM) is very impressive and allows an admin to define a "run book", essentially a series of steps to perform should DR be invoked. This includes the order in which VMs should be started, any IP address changes or network settings that need to be modified for the remote site, etc. With it, a site can be failed over (and back) by clicking a button.

Although Veeam B&R is currently focused on replication of VMs to another site, there appears to be a huge gap in the market for something that sits above the replication functionality and provides SRM functionality at a lower price point.

Backup and Restore Self Service

vSphere 5.1 is bundled with the VMware Data Protection (VDP) appliance. Based on EMC Avamar technology, this basic (and limited) backup product has a couple of nice features. Firstly, it integrates seamlessly with the vSphere Web Client (as will Avamar 7). Secondly, it provides a self-service interface that allows end users to restore their own files (as demoed on Chad's blog). I know that Veeam Enterprise Manager has similar functionality, but I'd like to take it a step further.

It would be useful to give users the ability to specify their own VM backup and restore jobs. As vCloud Director enables end users to provision their own VMs, there needs to be a way to allow these VMs to be scheduled for backup and then restore them in the future, without the need to log a call with IT.

I appreciate that there are lots of factors to consider when allowing end users to run their own backups such as how to quota backup repository space or prevent developers from backing up their VMs every hour ("just in case"), but in a multi-tenant environment, having the ability to delegate this functionality becomes important.

It's possible that this functionality may be in version 7 (I don't know, I'm not a beta tester).

Veeam ONE UI integration

There is some UI inconsistency across the Veeam ONE products. Monitor is a traditional Windows application, while Reporter and Business View are web applications. None of the above look like the Backup & Replication user interface.

Similarly, the Veeam Enterprise Manager is a web application, but has a different look and feel to the other products.

Although it's not the end of the world, I'd like to see some tighter integration between the three ONE products, Enterprise Manager and B&R to enable the oft-promised "single pane of glass". At the moment, the multiple separate icons on my desktop highlight how there is more integration to do at the front end.

As I've previously commented, I like it when applications smoothly integrate with vSphere Web Client. VMware have done this with VCOPS Foundation, and Veeam themselves will with B&R 7. This would be the logical conclusion of a UI merge for Veeam ONE.

And while the existing interfaces are functional, they're not beautiful and lack the "wow" factor when compared with vCenter Operations Manager.

Veeam ONE appliance

Okay, so this is unlikely to happen, but many of the competing solutions come in the form of virtual appliances on Linux. They don't require a separate database install or Windows licence. With VMware moving towards a Linux based appliance for vCenter Server, this appears to be the future.

I'm not really expecting this one, since Veeam support both VMware and Hyper-V, but hey, it's a wish list.

Chargeback Reporting

I've not had the opportunity to dig into all the reports thoroughly, so this functionality may already exist (but I've not found it after a quick look).

Veeam Business View provides a way to group VMs logically, such as by business unit. I'd like to see the ability to be able to define costs for vCPU, vRAM, datastores and operating systems and be able to assign these properties to specific business units (or Business View entities). With the addition of a report pack, we would then have the ability to generate custom chargeback reports for our vCloud environment.

If this sounds like VMware Chargeback Manager then it's because Veeam seem to be almost there with it. Most of the pieces are already in place and it would be a useful addition.
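To make the idea concrete, here is a minimal sketch of the calculation such a report would perform. Everything in it is hypothetical: the rate names, the per-OS charge and the VM fields are illustrative and are not taken from any Veeam or VMware product.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    business_unit: str
    vcpus: int
    ram_gb: int
    storage_gb: int
    os: str


def monthly_chargeback(vms, rates, os_rates):
    """Total the monthly cost per business unit.

    `rates` holds per-unit prices for "vcpu", "ram_gb" and "storage_gb";
    `os_rates` holds a flat per-VM charge keyed by operating system
    (an OS with no entry is charged nothing).
    """
    totals = {}
    for vm in vms:
        cost = (
            vm.vcpus * rates["vcpu"]
            + vm.ram_gb * rates["ram_gb"]
            + vm.storage_gb * rates["storage_gb"]
            + os_rates.get(vm.os, 0)
        )
        totals[vm.business_unit] = totals.get(vm.business_unit, 0) + cost
    return totals
```

The grouping key here is a plain business unit string; in practice it would be whatever grouping Business View already maintains.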

Final thoughts

The Veeam Management Suite, especially the ONE products, provides a lower cost alternative to the expensive, do-everything VMware solutions. Veeam could expand its existing products to compete in other areas where VMware is not affordable to many companies, specifically in the areas of disaster recovery and VM chargeback.

Last year's "free" upgrade for vSphere Enterprise Plus customers to the vCloud Suite Standard Edition has given many organisations access to vCloud Director and vCloud Networking and Security. But the vCloud Suite Standard Edition lacks chargeback and SRM functionality.

This has to be an opportunity for third parties such as Veeam...