How I Solved the Backup and Restore to AWS Challenge
Let me start by declaring that I’ve been dabbling in backup & disaster recovery for some 10 years now and have been a cloud advocate pretty much since the “cloud” became cloud. I’m also a bit of an evangelist for certain backup & DR products, one of which is Arcserve Unified Data Protection (UDP).
One question I get asked more than anything is about Recovery to Amazon AWS… Can it be done?
The short answer is YES… but with a catch. The process is not exactly clear and simple in the current release of Arcserve UDP v6.0: there is a certain amount of manual involvement, and some AWS skills are needed. Luckily, I’ve developed an app to simplify the process immensely. You can check out a demo of the process using the link below.
But why would you want to use the public cloud for backup & DR? Well, the cloud is an affordable, pay-as-you-go, off-site, globally available datacentre, open to every company regardless of size. For me, it eliminates two major shortfalls of small-business backup strategy: 1) getting backups (tapes) out of the building, and 2) having somewhere to recover the backups to.
For those of you not familiar with Arcserve UDP, it is a next-generation backup & DR solution that offers impressive global deduplication and replication capabilities, and rapid system recovery using Virtual Standby, Instant Virtual Standby, or Bare Metal Recovery.
The key reason to use Arcserve UDP to back up your servers to the cloud is the immense bandwidth benefit you gain from global deduplication: you can back up an entire datacentre while sending only a fraction of the data over the network compared with any other backup solution.
So how does it work?
I have to admit that I did a bit of digging around and researching other products to come up with a solution. Amazon EC2 has an import option that allows you to upload VHD disks and create EBS volumes and EC2 instances. Arcserve UDP has a “Restore as VHD” capability. The recovery process involves restoring the server as a VHD, uploading the VHD file to Amazon S3, and then converting the image into an EC2 instance or EBS volume.
Upload VHD/VMDK to S3? How big? How long? How much bandwidth do you have?
Those are the key questions! If you are backing up a 500GB server and uploading a 500GB VHD file to S3 over a 4 Mbps link, it is going to take time. Lots of time. It will also saturate your bandwidth while it runs.
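To put some numbers on that, here is a quick back-of-the-envelope calculation (plain Python; decimal units assumed, i.e. 1 GB = 10^9 bytes and 1 Mbps = 10^6 bits/s, and an ideal link with no protocol overhead):

```python
# Transfer-time estimate for the 500 GB / 4 Mbps scenario above.
def transfer_days(size_gb: float, link_mbps: float) -> float:
    bits = size_gb * 1e9 * 8            # payload size in bits
    seconds = bits / (link_mbps * 1e6)  # ideal time at full line rate
    return seconds / 86400              # seconds -> days

# 500 GB over a fully dedicated 4 Mbps link:
print(round(transfer_days(500, 4), 1))  # prints 11.6
```

Nearly twelve days of a fully saturated link, before any overhead or retries, which is why shipping the full VHD over the WAN is a non-starter.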
However, with Arcserve UDP I can drastically reduce the WAN requirement by placing a Recovery Point Server (RPS) in the cloud and synchronising only the minimal, de-duplicated incremental backups. The WAN footprint is minimised dramatically. It is then possible to use the cloud-based RPS server to extract the VHD, then perform a high-speed import to S3.
In this diagram we are backing up Physical & Virtual Servers locally, synchronising the backups to AWS, then restoring as AWS instances.
Local Backup – with Offsite Storage
In this solution we take local, infinite incremental backups of Physical and Virtual Servers to a local Arcserve UDP Recovery Point Server. The datastore is highly de-duplicated to reduce storage and also reduce the amount of data sent over the network to a second RPS server running in AWS. This topology eliminates the need for high bandwidth and long backup duration. The AWS based RPS also has a de-duplicated datastore and it could well be a target datastore for multiple remote sites and for the backup of instances running in AWS.
Without going into too much detail on the process: it is fairly simple to extract a server backup from the AWS-based RPS server in the form of VHD files to a temporary disk location. You can then use the documented AWS import process to import the VHD files.
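For the curious, that documented import flow can be sketched with the AWS SDK for Python (boto3). The bucket, file, and key names below are hypothetical, and the sketch assumes AWS credentials and the VM Import service role are already configured:

```python
# Sketch: upload a restored VHD to S3, then start an EC2 image-import task.
def disk_container(bucket: str, key: str, fmt: str = "VHD") -> dict:
    """Build the DiskContainers entry expected by EC2 import_image."""
    return {
        "Description": key,
        "Format": fmt,
        "UserBucket": {"S3Bucket": bucket, "S3Key": key},
    }

def import_vhd(bucket: str, vhd_path: str, key: str) -> str:
    import boto3  # AWS SDK for Python (third-party)
    boto3.client("s3").upload_file(vhd_path, bucket, key)
    resp = boto3.client("ec2").import_image(
        Description="Arcserve UDP restore",
        DiskContainers=[disk_container(bucket, key)],
    )
    # Progress can be polled with describe_import_image_tasks.
    return resp["ImportTaskId"]

# e.g. import_vhd("my-restore-bucket", "C:/temp/server-c.vhd", "server-c.vhd")
```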
However, it isn’t always that simple. It is actually very difficult to determine which VHD file is the system disk and which are data disks. It is easy enough for single-disk restores, but tricky for servers with multiple disks of similar sizes.
Luckily the application I developed figures this out for you and even gives you a choice of importing only the system or only the data disks.
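To give a flavour of the problem, here is one possible heuristic (not necessarily how my app does it): a fixed-size VHD begins with the raw disk image, so its first 512 bytes are the MBR, and a partition entry whose status byte is 0x80 is marked bootable. Dynamic VHDs and GPT disks would need different handling:

```python
# Heuristic: a fixed-size VHD whose MBR carries a boot flag is likely
# the system disk.
MBR_SIZE = 512
ENTRY_OFFSETS = (446, 462, 478, 494)  # the four MBR partition entries

def has_bootable_partition(mbr: bytes) -> bool:
    """True if the sector is a valid MBR with a 0x80 boot flag set."""
    if len(mbr) < MBR_SIZE or mbr[510:512] != b"\x55\xaa":
        return False  # missing MBR signature
    return any(mbr[off] == 0x80 for off in ENTRY_OFFSETS)

def looks_like_system_disk(vhd_path: str) -> bool:
    with open(vhd_path, "rb") as f:
        return has_bootable_partition(f.read(MBR_SIZE))
```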
I’ve built some more smarts into the App:
• Attach the instance to selected subnet
• Tag the instance name
• Attach data volumes
• Queue conversion tasks (AWS has a 5 simultaneous conversion limit)
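The queueing point deserves a word: because AWS caps simultaneous conversion tasks at 5, the imports have to be started in waves. A minimal batching sketch (the wave logic only; starting and polling each import is omitted):

```python
# Split pending conversions into waves no larger than the AWS limit.
from typing import Iterable, Iterator, List

MAX_CONCURRENT_IMPORTS = 5  # AWS simultaneous-conversion limit

def batches(tasks: Iterable[str],
            limit: int = MAX_CONCURRENT_IMPORTS) -> Iterator[List[str]]:
    """Yield task batches; each wave would run to completion before
    the next wave of imports is started."""
    batch: List[str] = []
    for task in tasks:
        batch.append(task)
        if len(batch) == limit:
            yield batch
            batch = []
    if batch:
        yield batch

# e.g. 7 VHDs -> two waves: 5 imports, then the remaining 2.
```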
I’ve recorded a short video of the process, which can be viewed here… If you’d like to test the app for yourself, please let me know here…