Thursday, 5 April 2012

EMC VNXe - diving under the hood (Part 4: CSX)

After the last post, I was pointed in the direction of the "VNXe Theory of Operations" online training (just do a search for it). This free course provides some interesting details about the VNXe architecture.

With knowledge gained from the course in mind, let's see if we can get a better understanding of what's happening under the hood...

C4LX and CSX

When the VNXe was announced, Chad Sakac at EMC referred to it as "using a completely homegrown EMC innovation called C4LX and CSX to virtualize, encapsulate whole kernels and other multiple high performance storage services into a tight, integrated package."

In the same blog post, Chad also illustrated the operating system stack, showing that the C4LX and CSX components are built on a 64-bit Linux kernel.

CSX (short for "Common Software eXecution") is designed to provide a common API layer for EMC software that is not tied to the underlying operating system kernel. As a portable execution environment, CSX can run on many platforms in either kernel or user space. So when some functionality is written within the CSX framework (e.g., data compression), it can be easily ported to all CSX supporting platforms, regardless of whether the underlying operating system is DART, FLARE or something else. Steve Todd has some more details about CSX on his blog.

So if you read that CSX instances are similar to Virtual Machines, think of them more like a Java Virtual Machine than a VMware Virtual Machine. CSX is an API abstraction and runtime environment, not a virtualisation of physical resources such as CPU and memory.
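To make the JVM analogy concrete, here's a minimal Python sketch of the idea: a module written once against an abstract runtime API, then hosted unchanged on different platforms. All of the names below are invented for illustration; they are not actual CSX interfaces.

```python
# Hypothetical sketch of a CSX-style portable execution environment.
# None of these names come from EMC; they only illustrate writing a
# service once against an abstract runtime API.

class Runtime:
    """Abstract services a hosted module may rely on."""
    def allocate(self, nbytes):
        raise NotImplementedError
    def log(self, msg):
        raise NotImplementedError

class DartLikeRuntime(Runtime):
    """Stand-in for hosting the module on a DART-style platform."""
    def allocate(self, nbytes):
        return bytearray(nbytes)
    def log(self, msg):
        print(f"[dart] {msg}")

class FlareLikeRuntime(Runtime):
    """Stand-in for hosting the same module on a FLARE-style platform."""
    def allocate(self, nbytes):
        return bytearray(nbytes)
    def log(self, msg):
        print(f"[flare] {msg}")

def compress(runtime, data):
    """A 'module' (e.g. data compression) coded only against Runtime."""
    buf = runtime.allocate(len(data))  # memory comes from the runtime
    runtime.log(f"compressing {len(data)} bytes")
    # Real compression omitted; counting distinct bytes stands in for it.
    return len(set(data))

# The same module code runs on either platform:
for rt in (DartLikeRuntime(), FlareLikeRuntime()):
    compress(rt, b"aaabbb")
```

The point of the sketch is that `compress` never touches the host OS directly, so porting it means porting the runtime, not the module.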

There aren't many details on what C4LX is, but here's my conjecture: There are some functions that CSX needs the underlying operating system to perform that may not be easily possible "out of the box". If that's true, then C4LX is the Linux kernel along with a bunch of kernel modules and additional software that provide this functionality. Or another way to describe it might be to call it EMC's own internal Linux distribution...

Data Path

Like the data plane and control plane in a network switch, software in the VNXe appears to be designed to operate on the "Data Path" or the "Control Path".

CSX creates various "Containers" that are populated with "CSX Modules". A Container is either a user space application or a kernel module. CSX Containers implement functionality within the Data Path.

The FLARE functionality described in part 2 of this series is implemented as a CSX Module, as is the DART functionality described in part 3. Both these modules run in the Linux user space. There is a degree of isolation between Containers, in that they can be terminated and restarted without interfering with other Containers. However, some Containers have dependencies on others (DART, for example, depends on FLARE).

In addition to the FLARE and DART Containers, a Global Memory Services (GMS) Container provides memory management functionality and services the other Containers. As an example, the FLARE Container takes 500MB of memory, while the DART Container takes 2.5GB, all allocated by GMS.

A kernel space Container is responsible for allocating resources on behalf of user space Containers. The Linux Upstart software provides a means to start, stop and restart Containers.
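As a sketch of what Upstart-managed Containers could look like (the job name, paths and dependency wiring below are my own invention, not taken from a real VNXe), a user space Container might be defined with a job file along these lines:

```
# /etc/init/dart-container.conf -- hypothetical example, not from a VNXe
description "DART CSX Container"

start on started flare-container   # respect the dependency on FLARE
stop on stopping flare-container

respawn                            # restart the Container if it dies

exec /opt/c4/bin/csx_container --module dart
```

The `start on started` / `stop on stopping` stanzas are how Upstart expresses ordering between jobs, which fits the FLARE-before-DART dependency described above.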

Control Path

The Control Path is implemented using technology derived from the Celerra Control Station (itself a Linux-based server) and the CLARiiON NaviSphere software. The Control Path is also where the Common Security Toolkit (CST) is found. The CST appears to be RSA technology and is used in multiple EMC products for security-related functions. In contrast to the Data Path, which consists of functionality directly related to the transfer of data, the Control Path is concerned with management functionality.

Within the Control Path of the VNXe is the EMC CIM (Common Information Model) Object Manager (ECOM) management server. ECOM interfaces with "Providers", which are essentially plug-ins. Within the VNXe, ECOM runs on the master SP.

There are a number of different Providers. These include Application Providers for Exchange, iSCSI, Shared Folders and VMware provisioning, a Virtual Server Provider, a Pools Provider, a CLARiiON Provider and a Celerra Provider. There are also Providers for registration, scripting, scheduling, replication etc. As plug-ins to the ECOM server, additional services can be written to extend the functionality within the VNXe.
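The plug-in pattern ECOM uses can be illustrated with a small sketch (pure illustration; these are not ECOM's actual interfaces): a management server that dispatches requests to whichever Providers have registered with it.

```python
# Illustrative plug-in registry, loosely modelled on the Provider idea.
# All names are invented for the example, not taken from ECOM.

class ManagementServer:
    def __init__(self):
        self._providers = {}

    def register(self, name, provider):
        """Plug a Provider in; new services extend the server this way."""
        self._providers[name] = provider

    def handle(self, name, request):
        """Dispatch a management request to the matching Provider."""
        return self._providers[name](request)

server = ManagementServer()
server.register("pools", lambda req: f"pool status for {req}")
server.register("replication", lambda req: f"replication status for {req}")

print(server.handle("pools", "SP-A"))
```

Because Providers are looked up by name at dispatch time, adding a new service is just another `register` call; nothing in the server itself changes.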

With the use of Providers, ECOM implements a middleware subsystem that can be called by front-end applications such as Unisphere or the VNXe command line.

In addition to running Providers, ECOM also provides basic web server functionality, used by the Unisphere GUI and CLI, via the Apache web server.

Pulling it together

The VNXe uses some additional Linux software along with the custom CSX and ECOM components. High availability is implemented through the open source Pacemaker cluster resource manager and the Softdog software watchdog timer kernel driver. CSX components are resource-managed using the cgroups feature of the Linux kernel. The logging system uses the PostgreSQL database. Although this is covered in the EMC training, it's also possible to see this by checking the output of "ps" from an SSH session.
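To illustrate the cgroups idea, a cgconfig-style fragment along these lines could cap each Container's memory. The group names and figures below simply mirror the numbers mentioned earlier in this post; they are not taken from an actual VNXe configuration.

```
# Hypothetical cgconfig.conf fragment -- illustrative only, not from a VNXe
group dart_container {
    memory {
        memory.limit_in_bytes = 2500M;   # ~2.5GB for the DART Container
    }
}
group flare_container {
    memory {
        memory.limit_in_bytes = 500M;    # 500MB for the FLARE Container
    }
}
```

With limits like these in place, a misbehaving Container hits its own memory ceiling rather than starving its neighbours, which complements the restart isolation described in the Data Path section.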

To understand how the various components hang together, the boot sequence looks a bit like this:

  1. Linux boots and initiates run level 3
  2. The "C4" stack is loaded by the Linux Upstart software:
    • CSX infra
    • Log daemon
    • GMS Container
    • FLARE Container
    • admin
  3. Pacemaker is loaded and automatically starts:
    • Logging
    • DART Container
    • Control Path software (ECOM on the master SP based on mgmt network status)
At which point, all the components are initialised and ready to go. Obviously there are additional details that I'm not covering (check the training if you want some more information on functionality such as replication, vault-to-flash, more on HA, licensing, logging and deduplication).

Hopefully this gives some insight into the complexity that underpins the VNXe. We're going to look at one more topic to conclude this mini series, and it's a subject that is the source of many questions on the EMC VNXe Community forum. In the next post we'll have a look at VNXe networking...
