
EMC VNX – MirrorView configuration

Last week I wrote a short post introducing MirrorView. This week I would like to write a bit more about MirrorView configuration, terminology, and usage. Let's start with the terminology.

VNX MirrorView Key Terminology

Primary Image – The LUN containing production data, the contents of which are replicated to the secondary image.
Secondary Image – The LUN containing a mirror of the primary image, residing on a different VNX (the secondary site).
Image Condition – Provides additional information about the status of updates for a secondary image.
State – The state of the remote mirror and of its images.
Consistency Group – A set of mirrors that are managed as a single entity and whose secondary images always remain in a consistent and recoverable state with respect to the primary images and each other.
Consistency Group State – Indicates the current state of the consistency group.
Fracture – A condition in which I/O is not mirrored to the secondary image; initiated manually by the administrator, or by the system when it determines the secondary image is unreachable.
Promote – Changes an image's role from secondary to primary.

Basic MirrorView Configuration

MirrorView allows for a wide range of topologies and configurations. The primary and secondary images must have the same server-visible capacity (user capacity), because they are allowed to reverse roles for failover and failback (see the Promote definition above). The primary and secondary images can, however, reside on different RAID configurations. To use MirrorView, the software needs to be installed on both the primary and the secondary VNX array. Secondary LUNs are not accessible to hosts while they are being mirrored. Bi-directional mirroring (a VNX array acting as both a primary and a secondary site) is supported, as long as the primary and secondary images within a given mirror reside on different storage systems.
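For reference, this is roughly what creating a MirrorView/S mirror and adding its secondary image looks like from naviseccli. The mirror name, LUN numbers, and SP addresses (spa_primary, spa_secondary) are made up for illustration, and the exact switches can differ between VNX OE releases, so treat this as a sketch and verify it against the CLI reference for your release:

naviseccli -h spa_primary mirror -sync -create -name app_lun_mirror -lun 12 -o
naviseccli -h spa_primary mirror -sync -addimage -name app_lun_mirror -arrayhost spa_secondary -lun 12

The same workflow exists for MirrorView/A under mirror -async.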

Consistency Groups

Consistency Groups allow all LUNs belonging to a given application to be treated as a single entity and managed as a whole. This helps to ensure that the remote images are consistent with each other. As a result, the remote images are always restartable copies of the local images. When a mirror is part of a Consistency Group, most operations on individual members are prohibited (for example, fracture or synchronize can only be executed against the Consistency Group as a whole).
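As a rough sketch (the group and mirror names are hypothetical, and switch names may vary by release), a consistency group is typically created and populated from the CLI like this:

naviseccli -h spa_primary mirror -sync -creategroup -name app_cg
naviseccli -h spa_primary mirror -sync -addtogroup -name app_cg -mirrorname app_lun_mirror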

Site Level Fan-In

MirrorView supports a 4:1 fan-in ratio, which means that one VNX array can act as the destination (secondary site) for up to four different primary VNX arrays. This is a common configuration when the remote VNX array is used for consolidated backups, simplified failover, or consolidated remote processing. The 4:1 fan-in ratio applies to both MirrorView/S and MirrorView/A.

LUN Level Fan-Out

Fan-out mirroring may be used to replicate data from one primary LUN to up to two secondary LUNs residing on different arrays (MirrorView/S 1:2 fan-out ratio). This configuration enables an administrator to synchronously mirror one primary image to two different secondary images. With MirrorView/A, one primary image can be mirrored to only a single secondary image (MirrorView/A 1:1 fan-out ratio).

Port Configuration

MirrorView ports are assigned automatically when the system is initialized. All MirrorView traffic goes through one dedicated port of each connection type (FC or iSCSI) per Storage Processor. On VNX systems that have both FC and iSCSI front-end ports, one FC port and one iSCSI port per SP are available for MirrorView traffic.

A path must exist between the MirrorView ports of SP-A on the primary system and SP-A on the secondary system; the same relationship must exist for SP-B. MirrorView ports may be shared with host I/O, but that can cause performance issues.
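The connection between the two arrays is established from the CLI with something along the lines of the command below. The SP address is hypothetical and I am quoting the switches from memory, so double-check them in the naviseccli reference before using this:

naviseccli -h spa_primary mirror -enablepath spa_secondary -connectiontype fibre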

Mirrored Image States

Once an image has been mirrored, the image may be in one of three availability states:

  • Inactive – mirroring has been stopped by the administrator.
  • Active – the normal state; all I/O to the image is allowed.
  • Attention – indicates that something has happened to the mirrored image and action by an administrator is required.

In terms of the mirrored image's consistency and its relationship with the primary image, MirrorView defines five data states (the CLI sketch after this list shows how to inspect them):

  • Out-of-Sync – a full synchronization from the primary is required.
  • In-Sync – the primary and secondary images contain identical data.
  • Rolling Back – the secondary image is being returned to its last known consistent point in time.
  • Consistent – the secondary contains a restartable, point-in-time consistent copy of the data, which may be older than the primary (for example while fractured); the fracture log or write intent log is used to bring it back in sync.
  • Synchronizing – a synchronization from the primary to the secondary is in progress.
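To check which state and condition an image is currently in, the mirror can be listed from the CLI. A minimal sketch, with a hypothetical mirror name (use mirror -async -list for MirrorView/A mirrors):

naviseccli -h spa_primary mirror -sync -list -name app_lun_mirror

The output includes, among other things, the image state and image condition of each secondary image.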

MirrorView Common Operations

Synchronization

Synchronization is the copy operation MirrorView performs for newly created mirrors, or to re-establish existing mirrors after an interruption. The initial synchronization creates a baseline copy of the primary image on the secondary. The primary image remains online during the synchronization, and the secondary image is unusable until the synchronization completes.

Promote

A secondary image is promoted to the role of primary when it is necessary to run production applications at the disaster recovery site. A promotion can only occur if the secondary image is in the consistent or synchronized state.

Fracture

A fracture stops MirrorView replication from the primary image to the secondary image. Administrative fractures are usually initiated to suspend replication, as opposed to a system fracture, which is initiated by the MirrorView software. A system fracture typically means there has been a communication failure between the primary and secondary systems.

With MirrorView/S writes continue to the primary image but are not replicated to the secondary during a fracture. Replication can resume when the user issues a synchronize command.

With MirrorView/A, the current update stops during a fracture and no further updates will start until a synchronize request is issued.
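These operations map onto the CLI roughly as follows. This is a sketch only: the mirror name and the image UID are placeholders (the UID of the secondary image appears in the mirror -sync -list output), and switch names may differ between releases:

naviseccli -h spa_primary mirror -sync -fractureimage -name app_lun_mirror -imageuid 50:06:01:60:xx:xx:xx:xx
naviseccli -h spa_primary mirror -sync -syncimage -name app_lun_mirror -imageuid 50:06:01:60:xx:xx:xx:xx
naviseccli -h spa_secondary mirror -sync -promoteimage -name app_lun_mirror -imageuid 50:06:01:60:xx:xx:xx:xx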

 

SnapMirror – set it up!

What is SnapMirror?

SnapMirror is a feature that lets us replicate data. You can replicate data from a specified source volume or qtree to a destination. The destination can be on the same filer, or in a completely different location, as long as there is a connection between the source and the destination.

There are three modes available:

  • SnapMirror Sync – replicates data to the destination as soon as it is written to the source volume.
  • SnapMirror Semi-Sync – the lag between source and destination is at most 10 seconds. This mode gives better performance than sync mode while still keeping the RPO (Recovery Point Objective) close to zero.
  • SnapMirror Async – the mode you will probably encounter most often. The mirror is updated on a schedule, which can run as often as every minute or as rarely as once a month. This is the mode I focus on in this post (see the snapmirror.conf examples right after this list).
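The mode is selected in the /etc/snapmirror.conf entry on the destination: for async you put a cron-like schedule in the last field (as shown in the setup steps below), while sync and semi-sync are requested with keywords instead of a schedule. A sketch, using the volume names from later in this post and quoting the keywords from memory, so verify them against the Data Protection guide for your ONTAP release:

filerA:sourcevol filerB:destvol - sync
filerA:sourcevol filerB:destvol - semi-sync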

How does SnapMirror work?

SnapMirror's task is to replicate data from a source volume (or qtree!) to a partner destination volume (or qtree). Before using SnapMirror you have to establish a relationship between the source and the destination.
In the case of SnapMirror Async you set up a schedule in the /etc/snapmirror.conf file on the destination filer/vfiler. Strictly speaking, the schedule is optional, but without it the relationship will never be updated unless the storage admin triggers an update manually.
So, this is how SnapMirror works when the relationship is initialized:

  1. Creates a Snapshot copy of the data on the source volume
  2. Copies it to the destination, which can be a read-only volume or qtree
  3. The source and destination now share a common Snapshot copy.

As you can see, when the relationship is initialized for the first time, step 2 transfers all the data; in other words, it is the baseline copy.

This is how SnapMirror works when the relationship is already initialized and an update is executed (a manual update example follows the list):

  1. Creates a Snapshot copy of the data on the source volume
  2. Compares the new Snapshot copy with the last common Snapshot copy shared by the source and the destination
  3. Transfers to the destination only the data that has changed since the last update
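Scheduled updates follow exactly these steps, but an incremental transfer can also be triggered manually from the destination at any time. A minimal sketch, assuming the relationship and volume names used in the setup steps below:

filerB> snapmirror update -S filerA:sourcevol filerB:destvol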

Let's set up a Volume SnapMirror async relationship

Step 1. Add the proper license on both the source and the destination filer:

filerA> license add xxxyyy
filerB> license add xxxyyy

Step 2. Turn on SnapMirror

filerA> options snapmirror.enable on
filerB> options snapmirror.enable on

Step 3. Allow access on the source (snapmirror.allow vs. snapmirror.access – see the last section of this post)

filerA> options snapmirror.access host=filerB

Step 4. Create a source and a destination volume

filerA> vol create sourcevol aggr1 50g
filerB> vol create destvol aggr1 50g

Step 5. Restrict the destination volume (the destination volume has to be restricted)

filerB> vol restrict destvol

Step 6. Initialize the SnapMirror relationship

filerB> snapmirror initialize -S filerA:sourcevol filerB:destvol
Transfer started.
Monitor progress with ‘snapmirror status’ or the snapmirror log.

Step 7. Check the status. If the volume is empty, the initialization should go really fast, so after a minute or two you should see:

filerB> snapmirror status
Snapmirror is on.
Source              Destination      State         Lag    Status
filerA:sourcevol    filerB:destvol   Snapmirrored   00:00:45   Idle

Step 8. Set up the SnapMirror schedule

The SnapMirror schedule has to be set up on the destination filer, in /etc/snapmirror.conf.

The syntax of the schedule is:

src_system:/vol/src_vol[/src_qtree] dest_system:/vol/dest_vol[/dest_qtree] arguments schedule

A simple example, which updates the SnapMirror relationship at 10 a.m. every Monday, Wednesday, and Friday, would be:

filerB>rdfile /etc/snapmirror.conf

filerA:sourcevol  filerB:destvol - 0 10 * 1,3,5
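The four schedule fields are, in order: minute, hour, day of month, and day of week (with 1 = Monday), so "0 10 * 1,3,5" means 10:00 on Mondays, Wednesdays, and Fridays; the dash in the arguments field means no extra arguments. As another illustrative example, an update every 15 minutes for the same relationship would look like this:

filerA:sourcevol  filerB:destvol - 0,15,30,45 * * *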

Summary

SnapMirror is a complex technology. In this post I presented only the simplest setup of an asynchronous Volume SnapMirror. If you would like to go a little deeper into how the transfer works, how to set up QSM (Qtree SnapMirror), or what arguments you can specify, try this book:
Data Protection Online Backup and Recovery Guide

 

snapmirror.allow and snapmirror.access

To set up a SnapMirror relationship between a source filer and a destination filer, you have to allow the destination to pull from the source. In other words, the source filer has to allow the destination filer to replicate data from the volume/qtree. There are basically two ways to do it:

snapmirror.access

snapmirror.access is an option that lets us provide the list of filers that have permission to pull data from the source filer. To print the current setting, just run:


filerA>options snapmirror.access
snapmirror.access   host=filerB,filerC AND if=vif-10,vif-11

What does it mean? It means that filerB and filerC have access (as SnapMirror destinations) to pull data from the SnapMirror source volumes/qtrees, and that the data can be accessed only over the network interfaces vif-10 and vif-11 (again, this is just an example).

If you would like to set it yourself, you can just run:


filerA>options snapmirror.access host=filerC,10.12.12.13
filerA>
filerA>options snapmirror.access
snapmirror.access host=filerC,10.12.12.13
 

snapmirror.allow

snapmirror.allow is a file. The location of the file is /etc/snapmirror.allow and it can be edited with your favorite wrfile command :). The syntax of the file is pretty simple:

filerA>rdfile /etc/snapmirror.allow

filerB
filerC

But there is one trick: if you would like to use snapmirror.allow, you have to set the snapmirror.access option to legacy, because snapmirror.access is the first thing that is checked.

filerA>options snapmirror.access legacy

If snapmirror.access is not set to the legacy option, the filer will not check the snapmirror.allow file at all.

Troubleshoot the access

The first thing would be to check that the proper license is installed on both the source and the destination, but I'm sure you already checked that.
If you use a hostname instead of an IP address, make sure that the filer can resolve the name and that the host is reachable; the easiest way is to ping it:

filerA>ping filerC
filerC is alive

If you can ping the host by IP but not by hostname, make sure the filer can resolve the name (check /etc/nsswitch.conf and optionally /etc/hosts).
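If licensing and name resolution look fine but transfers still fail or stay idle, the SnapMirror log on the filer is usually the quickest way to see why. A sketch of what I would check, assuming 7-mode defaults (the log lives in /etc/log/snapmirror on the root volume, with rotated copies such as snapmirror.0):

filerB> snapmirror status -l
filerB> rdfile /etc/log/snapmirror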