In my last post I explained Journal and Repository Volumes a little. If you know what those volumes are used for, you are aware that Journal Volumes hold the point-in-time history of the data. In this short entry I would like to dig a little deeper into that point-in-time history.
Snapshot – as you can imagine, it’s a point-in-time snap, marked by the RecoverPoint system for recovery purposes. Within EMC RecoverPoint a snapshot includes only the data that has changed since the previous snapshot. A snapshot therefore holds all the changes between the moment of its creation and:
- the current state – if it is the only snapshot
- the next snapshot – if more snapshots are created
In other words – a snapshot is the difference between one consistent image of storage data and the next. In synchronous replication every write is a single snapshot. In asynchronous replication the RPA gathers several writes into a single snapshot (you can actually adjust that within the configuration).
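The difference between the two modes can be pictured with a toy model. This is purely an illustration with made-up names – it is not the RecoverPoint API – but it shows how sync mode turns every write into its own snapshot while async mode batches writes up to a configurable size:

```python
# Toy model of snapshot creation - illustration only, not RecoverPoint code.

def group_writes(writes, mode, batch_size=3):
    """Return a list of snapshots, each snapshot being a list of writes."""
    if mode == "sync":
        # Synchronous replication: every write is a single snapshot.
        return [[w] for w in writes]
    # Asynchronous replication: several writes are gathered into one snapshot.
    return [writes[i:i + batch_size] for i in range(0, len(writes), batch_size)]

writes = ["w1", "w2", "w3", "w4", "w5"]
print(group_writes(writes, "sync"))   # 5 snapshots, one per write
print(group_writes(writes, "async"))  # 2 snapshots: [w1..w3] and [w4, w5]
```

The hypothetical `batch_size` parameter stands in for whatever grouping policy you configure in the real product.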
Now, let’s get back for a second to EMC RecoverPoint – Introduction. As you know, the replication can be local or remote (or both!). Each Replica has its own Journal, so if you have the same customer data replicated both locally and remotely, you can have different policies for the two – for example synchronous replication for local protection and asynchronous replication for the remote one.
Bookmark – a bookmark is a text label applied to a snapshot. Put another way: if you manually create a snapshot and name it – boom, that’s it – you have a bookmark. You can create one whenever you need a specific point in time, e.g. right before an application upgrade or right before a maintenance break.
As you can see in the screenshot above, we have an example list of Snapshots; some of them have names, which makes them Bookmarks. Simple as that.
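Conceptually a bookmark is nothing more than an optional label on a snapshot, which is easy to model. Again, a sketch with invented field names, not RecoverPoint internals:

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative model only - the field names are made up for this sketch.
@dataclass
class Snapshot:
    timestamp: str
    name: Optional[str] = None  # a snapshot with a name is a Bookmark

    @property
    def is_bookmark(self) -> bool:
        return self.name is not None

snaps = [
    Snapshot("2014-05-01 10:00:00"),
    Snapshot("2014-05-01 10:05:00", name="before-app-upgrade"),
]
print([s.is_bookmark for s in snaps])  # [False, True]
```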
Last week I wrote a post describing the EMC RecoverPoint Architecture – mostly from the hardware point of view. This time I would like to go a little bit into the software. As you already know, an RPA (RecoverPoint Appliance) manages all aspects of data protection for a storage group, such as maintaining the images in the Journal Volumes. But what are Journal Volumes? What is the Repository Volume? Let me answer those questions in this short post.
The Repository Volume is a special type of volume which is dedicated on the SAN-attached storage for each RPA Cluster. Important to remember – one Repository Volume per RPA Cluster. This volume stores configuration information about the RPAs and Consistency Groups. It is used, for example, when one of the RPAs in a cluster fails and the remaining RPAs (within the same cluster) take over its job. Apart from this function, the Repository Volume is a normal volume that can be provisioned, for example, from a VNX. It should not, however, be a thin LUN (use a thick or traditional RAID LUN instead). In addition – the volume cannot be located on a VPLEX distributed device.
Journal Volumes are closely connected with the Consistency Group concept, which I will describe in more detail later. For now you can think of a Consistency Group as a group of volumes; the CG ensures that updates to the production volumes are also written to the copies in a consistent and correct write order. The consequence: copies are always consistent. Now – back to the Journal Volumes. Each copy of data in a Consistency Group must contain at least one volume dedicated to holding the point-in-time history of the data. Journal volumes hold snapshots of the data to be replicated – keeping as many point-in-time images as the capacity allows. There are two types of journal volumes:
- Replica (Copy) Journal(s)
- Production Journal(s) – this one is more-or-less not used during normal operation; however, it is necessary, and it comes into use when the secondary side is promoted and the replication relationship is reversed.
A Journal Volume cannot be masked to hosts – only the RPAs in the cluster should have access to it.
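“Holding as many point-in-time images as the capacity allows” means a journal behaves roughly like a bounded buffer: once it fills up, the oldest images are dropped to make room for new ones. A minimal sketch of that idea (invented names, not the real implementation):

```python
from collections import deque

# Sketch of a capacity-bound journal: the oldest image ages out when full.
class Journal:
    def __init__(self, capacity: int):
        self.images = deque(maxlen=capacity)

    def add_snapshot(self, image: str) -> None:
        # deque with maxlen silently discards the oldest entry when full
        self.images.append(image)

journal = Journal(capacity=3)
for ts in ["10:00", "10:05", "10:10", "10:15"]:
    journal.add_snapshot(ts)
print(list(journal.images))  # ['10:05', '10:10', '10:15'] - '10:00' aged out
```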
In synchronous replication, every write is retained in the replica journal, so you can recover to any point in time. In asynchronous replication, several writes are grouped into a single snapshot. The granularity for async replication can be set in seconds or in MBs.
RecoverPoint Consistency Group
Above you can see an example of both CDP and CRR replication. As you can see, we have three Journal Volumes:
- two Replica (Copy) Journals
- one Production Journal
As I mentioned above, normal operation doesn’t involve the Production Journal Volume; both the CDP Replica Journal and the CRR Replica Journal are used for communication between the RPA (in this case a vRPA) and the storage.
RecoverPoint architecture consists of components such as the RecoverPoint software, the Appliance, write splitters etc. In this post I would like to give you a quick introduction to those components. To understand how RecoverPoint actually works, you first have to understand all the small building blocks of the RecoverPoint System.
RecoverPoint Appliance (RPA)
EMC RecoverPoint is appliance-based – which makes the solution highly scalable. An RPA can be a physical appliance (RPA) or a virtualized one (vRPA).
RecoverPoint RPAs manage aspects of data protection and replication.
The virtual RPA is a purely software-based instance of the RPA appliance and utilizes services on an existing ESX platform – a vRPA requires no hardware components beyond the ESX server cluster it is deployed on.
RPA (RecoverPoint Appliance) has dedicated Fibre Channel, WAN and LAN interfaces.
- Fibre Channel is used for data exchange with local host applications and storage subsystems
- WAN is used to transfer data to other RPAs
- LAN (management) is used to manage the RecoverPoint System.
Array Based Write Splitter
The RecoverPoint Write Splitter is used to split (duplicate) writes: each write is sent to the RecoverPoint appliance and a duplicate is sent to the primary storage volume. It is important to understand that the Write Splitter is array-based. This means you don’t have to physically connect the RecoverPoint Appliance between the host and the storage array – the host communicates with the storage array directly and is not even aware of the RecoverPoint instance. The Write Splitter is already built into EMC VNX, VMAX and VPLEX storage systems.
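The splitting itself can be pictured as a tiny fan-out: every host write ends up both on the array and at the RPA. A hypothetical sketch of the idea, not the actual splitter code:

```python
# Conceptual write splitter - illustrative names, not RecoverPoint internals.
class WriteSplitter:
    def __init__(self, array, rpa):
        self.array = array  # stands in for the primary storage volume
        self.rpa = rpa      # stands in for the RecoverPoint appliance

    def write(self, data: str) -> None:
        # Duplicate every host write: one copy to the appliance,
        # one copy to the primary storage volume.
        self.rpa.append(data)
        self.array.append(data)

array, rpa = [], []
splitter = WriteSplitter(array, rpa)
splitter.write("block-42")
print(array, rpa)  # ['block-42'] ['block-42'] - both sides got the write
```

The point of the sketch: the host only ever calls `write()` once; the duplication happens below it, which is why the host never needs to know RecoverPoint exists.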
A cluster of two or more active RPAs is deployed at each RecoverPoint site. This provides high availability – if one RPA in a cluster fails, RecoverPoint immediately switches over to the remaining RPA(s) in the cluster. A RecoverPoint Cluster is a logical entity – a group of 2 to 8 physical (or virtual) RPAs that work together to replicate and protect data. Important fact: the number of RPAs in an RPA cluster must be the same in all RPA clusters in a RecoverPoint System. What is a RecoverPoint System? Keep reading 🙂
A RecoverPoint System is logically a single entity which replicates and protects data between all sites in one RecoverPoint installation. You can manage all RPAs through a floating cluster management IP address. A system is a set of up to 5* interconnected RecoverPoint Clusters managed via a single RP management console (*5 clusters for RecoverPoint/EX and RecoverPoint/CL licenses; RecoverPoint/SE supports up to 2 clusters).
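The sizing rules above (2 to 8 RPAs per cluster, the same RPA count in every cluster, and at most 5 clusters per system for the /EX and /CL licenses) can be expressed as a quick sanity check. A sketch for illustration, not an actual RecoverPoint tool:

```python
# Sanity check for the sizing rules described above - illustration only.
def validate_system(cluster_sizes, max_clusters=5):
    """cluster_sizes: number of RPAs in each cluster of the system."""
    if not 1 <= len(cluster_sizes) <= max_clusters:
        return False  # up to 5 clusters per system (/EX and /CL licenses)
    if any(not 2 <= n <= 8 for n in cluster_sizes):
        return False  # each cluster has 2 to 8 RPAs
    return len(set(cluster_sizes)) == 1  # same RPA count in every cluster

print(validate_system([2, 2, 2]))  # True  - three clusters of 2 RPAs each
print(validate_system([2, 4]))     # False - cluster sizes differ
```

For a RecoverPoint/SE system you would pass `max_clusters=2` to reflect the stricter license limit.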
RecoverPoint System of three RPA clusters.
The number of RPA clusters included in a RecoverPoint System usually depends on how RecoverPoint is used (local protection only / local and remote protection / remote protection only).