Design samples for Veeam + NetApp with different SLAs

Over the last few weeks I have received a lot of requests about what Veeam and NetApp designs could look like for customers of different sizes. The combination of Veeam and NetApp is all about minimizing RPO and RTO in the modern datacenter. It combines NetApp’s storage-based snapshot and replication features, such as Snapshot, SnapVault and SnapMirror, with the orchestration and granular restore capabilities of Veeam. Additionally, the combination minimizes management overhead and eliminates dependencies.
If we talk about different designs, we should first define what the main goal of each design should be. That’s why I defined the following requirements for this article:
– SMB design (RTO > 24 hrs. and RPO > 24 hrs.)
– MID design (RTO < 24 hrs. and RPO < 24 hrs.)
– ENT design (RTO 0-2 hrs. and RPO 0-1 hrs.)
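
To make the three tiers easier to apply, here is a minimal Python sketch that maps a given RTO and RPO (in hours) to the design that would be needed. It rests on a few assumptions that are not part of the list above: the stricter of the two objectives decides the tier, the ENT band takes priority where the ranges overlap, and a value of exactly 24 hrs. is treated as SMB. The names `Tier`, `tier_for` and `required_design` are made up for this illustration.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Design tiers from the list above, ordered from least to most demanding."""
    SMB = 0
    MID = 1
    ENT = 2


def tier_for(hours: float, ent_limit: float) -> Tier:
    """Map a single objective (RTO or RPO, in hours) to the tier it calls for."""
    if hours <= ent_limit:
        return Tier.ENT
    if hours < 24:
        return Tier.MID
    # Assumption: exactly 24 hrs. counts as SMB; the list leaves this case open.
    return Tier.SMB


def required_design(rto_hours: float, rpo_hours: float) -> Tier:
    """Pick the design that satisfies both objectives (the stricter one wins)."""
    # ENT upper bounds from the list: RTO up to 2 hrs., RPO up to 1 hr.
    return max(tier_for(rto_hours, ent_limit=2), tier_for(rpo_hours, ent_limit=1))
```

For example, `required_design(8, 12)` gives `Tier.MID`, while a relaxed RTO combined with a tight RPO, such as `required_design(48, 0.5)`, still ends up at `Tier.ENT`.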

Looking at IT today, my feeling is that more than 70% of customers need at least the MID design and more than 50% require an enterprise solution, as it is unacceptable for them to be offline or even lose any data in case of a disaster.

Now let’s look into the SMB solution:

At an SMB level, where you don’t need a low RPO or RTO, a combination of NetApp E-Series as the source and E-Series as the repository for the Veeam server is a valid design. This gives you a high-performing and very stable primary storage, for example with SAS drives, for your VMware environment, and a Veeam server that takes care of all backup and restore tasks. As the target, it is best to use a high-density storage system such as an E-Series with a bunch of NL-SAS drives as the Veeam repository. This design is mostly for SMB customers or branch offices where there is no need for high availability. In this scenario the restore capabilities in case of a disaster are also limited, and it will take a lot of time until you are fully operational again. Veeam provides single-item restore capabilities and can restore several VMs instantly by booting them directly from the repository, but in case of a disaster you still need to restore all VMs to new hardware, which takes a lot of time. Another disadvantage is that there is no optimized snapshot handling or backup from storage snapshot available with NetApp E-Series and Veeam.

To improve RTO and RPO, let’s look at what a MID solution could look like:

The MID solution is a combination of NetApp FAS and Veeam. In this scenario you have a primary NetApp FAS as the storage for VMware and a Veeam server with a direct-attached E-Series as the backup target. The benefit is that you can now leverage the Veeam integration into NetApp’s Data ONTAP. The first benefit is that you can back up the data directly from a NetApp storage snapshot. This takes a lot of load off the VMware environment and greatly reduces the VMware snapshot handling problem by minimizing the time a VMware snapshot needs to stay open during the backup. You can find more about backup from storage snapshot in my previous posts. Secondly, you can combine the agentless, consistent storage snapshots created by Veeam with NetApp crash-consistent snapshots to minimize the RPO. As Veeam can restore from snapshots even if it did not create them itself, this is a great way to improve your RPO.
If you need to restore files or even VMs, you can use the NetApp snapshots as the source for the restore. This optimizes the RTO of VMs and files, as there is no performance issue with NetApp snapshots. But as in the SMB design, in case of a disaster you still need to restore the whole environment to new systems and servers before you are back online. As there is no replication in this design, the disaster RTO is still very high.
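
To illustrate why mixing both kinds of snapshots improves the RPO, here is a small Python sketch that merges two snapshot schedules and measures the largest gap between restore points. The schedules (consistent snapshots every 4 hours, crash-consistent snapshots every hour), the date and the function names are assumptions chosen purely for the example; real intervals depend on your job configuration.

```python
from datetime import datetime, timedelta


def restore_points(start, end, interval):
    """Return evenly spaced restore-point times between start and end (inclusive)."""
    points = []
    current = start
    while current <= end:
        points.append(current)
        current += interval
    return points


def worst_case_gap(points):
    """Largest gap between consecutive restore points, i.e. the worst-case RPO."""
    ordered = sorted(set(points))
    return max(later - earlier for earlier, later in zip(ordered, ordered[1:]))


day_start = datetime(2015, 11, 16, 0, 0)  # arbitrary example day
day_end = day_start + timedelta(hours=24)

# Hypothetical schedules: Veeam-orchestrated consistent snapshots every 4 hours,
# NetApp crash-consistent snapshots every hour.
veeam = restore_points(day_start, day_end, timedelta(hours=4))
netapp = restore_points(day_start, day_end, timedelta(hours=1))

veeam_only_gap = worst_case_gap(veeam)        # gap with consistent snapshots only
combined_gap = worst_case_gap(veeam + netapp)  # gap when both sources are usable
```

With these example schedules the worst-case gap shrinks from 4 hours to 1 hour once the crash-consistent snapshots are counted as restore points too.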

Last but not least, let’s have a look at an ENT design:

In today’s always-on business, a modern datacenter requires more than just a bit of storage and a few servers. It’s all about having availability capabilities in several layers. Looking at the design above, it’s all about combining the functionality of NetApp storage systems with the features of Veeam. At the primary datacenter a NetApp MetroCluster can make sure that in case of a disaster no data is lost (RPO=0) and the applications can access the data with no outage (RTO=0), as the data is synchronously mirrored between two sites. MetroCluster is certainly the solution that provides the highest level of availability in case of a disaster, but you can also use a regular NetApp clustered Data ONTAP (cDOT) cluster on the primary side if MetroCluster is not possible. In a secondary location you have another NetApp FAS system that is used as the SnapVault and/or SnapMirror destination for your primary NetApp. And then there is a Veeam Backup & Replication server, located either at your secondary site or at a third site.
The Veeam server is the central orchestration tool for any kind of backup, restore, Snapshot, SnapMirror or SnapVault activity within the whole design. It is used to create an application-consistent VMware snapshot followed by a volume Snapshot on ONTAP. Right after this the VMware snapshot is deleted, as you now have the application-consistent state at the ONTAP level. As soon as the VMware snapshot is committed, a SnapVault or SnapMirror update can be triggered to transfer the data from the MetroCluster directly to the secondary NetApp. There it can either be kept as a Snapshot or, in Veeam Backup & Replication v9, be used as the source for a backup from storage snapshot. In this scenario the primary storage is completely unaffected from a performance perspective during the backup, as everything is transferred from the secondary NetApp ONTAP system.
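
The order of operations described above can be sketched in a few lines of Python. This is only a simulation under stated assumptions: nothing here talks to vSphere, ONTAP or Veeam, and `ProtectionRun`, `run_ent_workflow` and the volume name are hypothetical names used to make the sequence explicit, not part of any product API.

```python
from dataclasses import dataclass, field


@dataclass
class ProtectionRun:
    """Records the order of operations for one protected volume (simulation only)."""
    volume: str
    log: list = field(default_factory=list)

    def step(self, description: str) -> None:
        self.log.append(description)


def run_ent_workflow(volume: str, replicate_with: str = "SnapVault") -> list:
    """Walk through the ENT sequence and return the ordered list of steps.

    Each step is only recorded; no storage or hypervisor call is made.
    """
    if replicate_with not in ("SnapVault", "SnapMirror"):
        raise ValueError("replicate_with must be 'SnapVault' or 'SnapMirror'")

    run = ProtectionRun(volume)
    run.step("create application-consistent VMware snapshot")
    run.step(f"create ONTAP volume snapshot of {volume}")
    # The VMware snapshot can go as soon as the consistent state is held on ONTAP.
    run.step("delete VMware snapshot")
    run.step(f"trigger {replicate_with} update to secondary NetApp")
    run.step("back up from the storage snapshot on the secondary NetApp")
    return run.log
```

Calling `run_ent_workflow("vol_vmware_01")` returns the five steps in order. The point of the ordering is that the VMware snapshot only lives for the short time between the first and third step, and the backup traffic itself is read from the secondary system.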
Furthermore, you can then use the data stored on your repository to perform a copy job to the cloud via Cloud Connect or to write it to tape for long-term retention. The data can of course also be used for all other Veeam restore capabilities such as Instant Recovery or Veeam Explorer.
Besides that, you can also use the NetApp Snapshots as the source for your restore. By leveraging the Snapshots you will see the same performance during restores as in your production environment, as the Snapshots are mounted directly to vSphere. The combination of NetApp SnapVault/SnapMirror and Veeam can bring the RTO in case of a disaster of your primary system (MetroCluster or cluster offline) down to ~ 1 hr.
The RPO can be brought down to 15 min., depending on the configuration you use in the jobs.

I hope this post answered some of your questions. Feel free to comment and share.


NetApp Insight 2015 – Must see sessions

Next week, from the 16th to the 19th, NetApp Insight 2015 takes place in Berlin. NetApp Insight is the major partner and customer event of the year, and the US edition already took place in Las Vegas a couple of weeks ago.
Now the conference is moving to EMEA and will bring tons of news and technology updates with it.
This will actually be my 7th Insight, so I can say that I’m now more or less an old fogey.
As I’ve seen a lot of sessions and demos over the last years, I will try to give you some advice on what might be really interesting to see.
I’m a technical guy, that’s why I usually only go to tech deep dive sessions.
Anyway let’s start with the highlights:

As it is NetApp’s major conference, there are lots of sessions around the NetApp story and the journey to the Data Fabric.

1836-3 – Clustered Data ONTAP 8.3.1® Storage Operating System Networking Deep Dive
1961-3-TT – Deep Dive–StorageGRID® Webscale Performance and Sizing
2096-3-TT – Clustered Data ONTAP® Transition: Complex SnapMirror® Environments
2305-2 – VMware® on Clustered Data ONTAP® – New Tricks and Best Practices Update v5.0
1688-4-TT – Advanced Storage Bottleneck Analysis with Perfstat
2260-4 – Advanced NetApp® Storage Management with Microsoft® PowerShell®
1672-4 – WFA3.1, Ontapi®, .NET–Getting the Full Power of Automation
1935-3 – Deep Dive on NetApp® AltaVault® Integration with Amazon Web Services™
1904-2 – NetApp® SnapMirror®–Clustered Data ONTAP 8.3® Storage Operating System Deep Dive
2068-2 – Deep Dive on Advanced Disk Partitioning (ADP)

One of my favorite vendors is certainly Cisco. That’s why I can highly recommend the following sessions.

2194-4 – FlexPod® with Cisco® ACI Deep-Dive
2276-4 – FlexPod® with Clustered Data ONTAP® Deep Dive
2197-2 – FlexPod® Solutions with UCS® Mini and FAS25xx for Small Data Centers and ROBO

Of course I can also highly recommend seeing one of our three sessions on our joint solutions and integration.
BTW, these sessions are about to be fully booked, so hurry up and try to get a free seat.

Almost Full: 3122-3 – Veeam®Software–Veeam Availability Suite, Deep Dive on NetApp® Integration by Luca Dell’Oca
FULLY Booked: 3144-4 – MTE: Veeam – Availability in Your Data Center with Veeam® and NetApp® Solutions by Andreas Neufert
FULLY Booked: 3142-3 – Veeam®Software: Designing a Veeam® + NetApp® Data Protection Architecture by ME 🙂

Presentation by NetApp:
2342-2 – Veeam Backup and Replication Best Practices with NetApp® E-Series by Eric Kemp

Looking forward to seeing you there.
Feel free to contact me at our Veeam booth.