Wednesday, 4 April 2012

EMC VNXe - diving under the hood (Part 3: DART)

In the previous post, we looked at the parts of the VNXe that are derived from the FLARE (CLARiiON) code. The result is a number of LUNs that are presented up the stack to the DART (Celerra) part of the system.

Using the "svc_storagecheck -l" command, we can see that a total of 20 disks are found. These map to the two FLARE LUNs from the 300GB SAS RAID5 RAID Group and the sixteen FLARE LUNs from the 2TB NL-SAS RAID6 RAID Group, plus two other disks: root_disk and root_ldisk.
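As a quick sanity check on that count, the arithmetic can be written out. This is only a tally of the figures quoted above; the dictionary keys are my own labels, not anything the VNXe reports.

```python
# Tally of the disks DART sees, using the counts quoted in the text.
# The key names are illustrative labels, not output from svc_storagecheck.
disks_by_source = {
    "FLARE LUNs from 300GB SAS RAID5 group": 2,
    "FLARE LUNs from 2TB NL-SAS RAID6 group": 16,
    "internal disks (root_disk, root_ldisk)": 2,
}

total_disks = sum(disks_by_source.values())
print(total_disks)  # 20
```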

root_disk and root_ldisk appear to map to the internal SSD on the Service Processors and are not visible to the end user for configuration. These disks appear to hold the root filesystems, the panic reservation and the UFS log filesystems.

DART sees the FLARE LUNs as disks, which are commonly referred to as "dvols".

The dvols are grouped into Storage Pools. The following are defined by the system, along with a subset of their parameters:

Name               Description             In use   Members           Volume Profile
clarsas_archive    CLARiiON RAID5 on SAS   False
clarsas_r6         CLARiiON RAID6 on SAS   False
clar_r1_3d_sas     3 disk RAID-1           False
clar_r3_3P1_SAS    RAID-3 (3+1)            False
performance_dart0  performance             True     d18,d19           N/A
capacity_dart1     capacity                True     d23,d24,d25,d26

As the above table shows, the LUNs presented from the FLARE side of the VNXe are assigned to the performance_dart0 and capacity_dart1 pools.

The Volume Profile should be familiar to Celerra administrators: it is the set of rules that defines how a set of disks should be configured.

On a Celerra, disks could be configured manually (if you knew exactly what you wanted) or automatically using the "Automatic Volume Manager" (AVM). Because the VNXe is designed to be simple, AVM does all the work.

An AVM group called "root_avm_vol_group_63" (the svc_neo_map command refers to this as the "Internal FS name") has been created. It consists of two dvols, d18 and d19, the members of the performance_dart0 storage pool. These two dvols map to the two LUNs presented from the 300GB SAS disk RAID Group. It appears that when a filesystem is created, the first disk is partitioned into a number of slices (sixteen on d18). Each slice then has a volume created on it and, finally, another volume is created that spans all of the per-slice volumes. It's this top-level volume, called v139 in the diagram below, on which the filesystem is created:

Note that d19 in the above diagram isn't used. If the filesystem is expanded beyond the capacity of the single disk, then presumably the next disk is used. For some reason, slice 68 doesn't have a corresponding volume. I would welcome any explanation as to why this is.
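To make the layering easier to follow, here is a minimal Python sketch of how I understand it: a dvol is carved into slices, each slice backs one volume, and a top-level volume spans those volumes. The class names, the per-slice volume names, the 512GB dvol size and the equal-sized slices are all assumptions for illustration. Only d18, the sixteen slices and the name v139 come from the system, and this is a model of the structure, not how DART implements it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Slice:
    dvol: str      # dvol the slice is carved from, e.g. "d18"
    index: int     # hypothetical slice number, not the real DART slice ID
    size_gb: int


@dataclass
class Volume:
    name: str      # hypothetical per-slice volume name
    backing: Slice


@dataclass
class SpanningVolume:
    name: str
    members: List[Volume]

    @property
    def size_gb(self) -> int:
        # The top-level volume is as large as all its member volumes combined.
        return sum(v.backing.size_gb for v in self.members)


def carve_slices(dvol: str, dvol_size_gb: int, count: int) -> List[Slice]:
    """Split a dvol into `count` equal slices (equal sizing is an assumption)."""
    slice_gb = dvol_size_gb // count
    return [Slice(dvol, i, slice_gb) for i in range(count)]


# Sixteen slices on d18, as observed; the 512GB size is made up.
d18_slices = carve_slices("d18", dvol_size_gb=512, count=16)
slice_volumes = [Volume(f"vol{i}", s) for i, s in enumerate(d18_slices)]
top = SpanningVolume("v139", slice_volumes)

print(top.name, len(top.members), top.size_gb)  # v139 16 512
```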

The configuration for the capacity_dart1 pool is very similar, albeit with many more disks (sixteen instead of two) and many more slices. Unfortunately it's too big to show here. As an example, the first disk, d23, has 40 slices of its own that form part of the pool.

The use of all these smaller slices presumably means that a filesystem can grow incrementally from the pool (and possibly shrink?).
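The incremental growth idea can be sketched as a pool that hands out free slices in order, spilling onto the next dvol once the first is exhausted. This reflects my guess at the behaviour, not confirmed DART logic: the SlicePool class, the 32GB slice size and the sixteen slices per dvol are hypothetical.

```python
from collections import deque


class SlicePool:
    """Hypothetical pool of fixed-size slices spread across several dvols."""

    def __init__(self, dvols, slices_per_dvol, slice_gb):
        self.slice_gb = slice_gb
        # Free slices are ordered so the first dvol is consumed before the next.
        self.free = deque((d, i) for d in dvols for i in range(slices_per_dvol))

    def extend(self, fs_slices, extra_gb):
        """Grow a filesystem by at least extra_gb, one whole slice at a time."""
        needed = -(-extra_gb // self.slice_gb)  # ceiling division
        if needed > len(self.free):
            raise ValueError("pool exhausted")
        for _ in range(needed):
            fs_slices.append(self.free.popleft())
        return fs_slices


pool = SlicePool(["d18", "d19"], slices_per_dvol=16, slice_gb=32)
fs = []
pool.extend(fs, 100)  # 4 slices, all from d18
pool.extend(fs, 450)  # 15 more: the last 12 on d18, then 3 from d19

used_per_dvol = {d: sum(1 for dv, _ in fs if dv == d) for d in ("d18", "d19")}
print(used_per_dvol)  # {'d18': 16, 'd19': 3}
```

Shrinking would be the reverse operation (returning slices to the free list), but I have not seen evidence either way that the VNXe does this.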

When the filesystem is created, it isn't visible to an external host. On a Celerra or VNX, presenting it to hosts would be handled by a physical data mover. The VNXe instead uses a software "Shared Folder Server" (SFS), which acts as the server to the other hosts on the network.

Multiple Shared Folder Servers can be created; apparently up to 12 Shared Folder Servers (file) and/or iSCSI Servers (block) are supported. Each has its own network settings and shares its own filesystems out over NFS or CIFS. Note that while an SFS can handle both NFS and CIFS, a single filesystem within an SFS can support either NFS or CIFS, but not both at the same time.
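The protocol rule can be expressed as a small model: an SFS may hold a mix of NFS and CIFS filesystems, but each filesystem is bound to exactly one protocol. The class and method names below are hypothetical and exist only to illustrate the constraint; they are not a VNXe API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Protocol(Enum):
    NFS = "nfs"
    CIFS = "cifs"


@dataclass
class SharedFolderServer:
    """Hypothetical model of an SFS and the filesystems it shares."""
    name: str
    filesystems: Dict[str, Protocol] = field(default_factory=dict)

    def share(self, fs_name: str, protocol: Protocol) -> None:
        existing = self.filesystems.get(fs_name)
        if existing is not None and existing is not protocol:
            # A filesystem is either NFS or CIFS, never both at once.
            raise ValueError(
                f"{fs_name} is already shared over {existing.name}"
            )
        self.filesystems[fs_name] = protocol


sfs = SharedFolderServer("sfs0")
sfs.share("fs_unix", Protocol.NFS)      # one SFS can serve NFS...
sfs.share("fs_windows", Protocol.CIFS)  # ...and CIFS, on separate filesystems

try:
    sfs.share("fs_unix", Protocol.CIFS)  # same filesystem, second protocol
    conflict_rejected = False
except ValueError:
    conflict_rejected = True
```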

From a disk perspective, EMC have done well to hide a lot of legacy cruft from the user. The encapsulation of FLARE and DART, along with the software implementation of the data mover idea, is a neat evolution of an aging architecture.

There is more to look into, such as networking (which has provoked a significant number of questions on the EMC forums), and I'd like to find out more about the CSX "execution environment" that underpins much of the new design. I'll be sure to post more if and when I get more information, but hopefully you've found this a useful dive under the hood of the VNXe.
