Every dedupe rose has some thorns

I recently had a customer ask me about migrating backup data out of an old Data Domain appliance.  As this sort of question has come up before, I will share my thoughts with the rest of you as well.

First of all, I want to be very clear that I believe backups are the ‘killer app’ for deduplication, and that file sharing will be one as well going forward.  That being said, there are a couple of ugly truths that nobody in the dedupe business talks about.  The one we will discuss here is Exit Strategy.

When considering deduplication as part of an upgrade or change in your current backup design, there are several options to weigh: software or hardware, integrated or stand-alone, and then the various offerings for each type.  Each choice comes with its own set of factors to consider.  I could go deeper into each of them, but let’s save that for another day.

When it comes down to your choice, you will most likely have a simple disk-to-disk (D2D) target for your backup vendor of choice, where the save sets are stored in a compressed and deduplicated manner.  Whether this is done by the backup software or by the target device, there is metadata associated with the backup that records the compression algorithm used and the deduplicated block index.  From an external ‘point of view’ the data looks normal, but on the file share it is stored in the compressed/deduplicated format and can only be reassembled using that metadata.  This is pretty much standard across the board, and is the magic behind the curtain.
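To make the ‘magic behind the curtain’ a little more concrete, here is a minimal sketch of a deduplicating store in Python.  It is a toy, not how Data Domain or any backup product actually works: the fixed 4 KB chunk size, the SHA-256 fingerprints, and the class and method names are all assumptions made for illustration.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; real products may chunk differently


class ToyDedupeStore:
    """Illustrative toy only, not any vendor's implementation."""

    def __init__(self):
        self.blocks = {}   # fingerprint -> compressed unique chunk
        self.recipes = {}  # save set name -> ordered fingerprints (the metadata)

    def write(self, name, data):
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            fp = hashlib.sha256(chunk).hexdigest()
            if fp not in self.blocks:
                # Only chunks we have never seen consume new space.
                self.blocks[fp] = zlib.compress(chunk)
            recipe.append(fp)
        self.recipes[name] = recipe

    def read(self, name):
        # The external 'point of view': the recipe turns blocks back into a normal file.
        return b"".join(zlib.decompress(self.blocks[fp]) for fp in self.recipes[name])

    def physical_bytes(self):
        return sum(len(b) for b in self.blocks.values())


store = ToyDedupeStore()
backup = b"A" * CHUNK_SIZE * 10 + b"B" * CHUNK_SIZE
store.write("monday", backup)
store.write("tuesday", backup)  # identical data adds no new blocks
```

Both save sets read back in full, yet only two unique chunks sit on ‘disk’.  Without the recipes, the stored blocks are just an unordered pile of compressed fragments.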

Here is where it gets tricky.  You can have multiple dedupe engines working on the same piece of data.  For example, if your backup application can dedupe and compress while writing to a D2D target, it has a ‘point of view’.  If your backup application doesn’t do dedupe, the D2D target could be on an appliance such as Data Domain, which natively performs deduplication inline on anything being stored within, creating another ‘point of view’.  You probably aren’t going to dedupe at multiple layers, but it is possible due to the separate ‘points of view’ established.  You will need to be careful when extracting data: follow the same chain back out, undoing each layer in the reverse of the order it was applied, to get back to the flat files that you backed up originally.

As an example, you could use your backup software to write your save sets to a D2D target, and it will write that data in a way that is compatible with the backup application.  If that D2D target happens to be a volume on a Data Domain appliance, the data written into the device will be deduplicated inline as it is written to disk, and therefore reduced in size significantly.  Those save sets, and others added afterward, are contained in the same ‘point of view’, allowing you to store significantly more data than the raw capacity of the disks would otherwise hold.
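The layering described above can be sketched in a few lines of Python.  This is a rough illustration under loud assumptions: the `SAVESET1` and `APPL` headers are made up, and plain zlib compression stands in for both the backup application’s save set format and the appliance’s inline reduction.  The only point is the ordering of the wrap and unwrap steps.

```python
import zlib

APP_MAGIC = b"SAVESET1"   # hypothetical backup-application header
APPLIANCE_MAGIC = b"APPL"  # hypothetical appliance header


def app_wrap(payload: bytes) -> bytes:
    # Layer 1: the backup application writes a save set in its own format.
    return APP_MAGIC + zlib.compress(payload)


def app_unwrap(blob: bytes) -> bytes:
    if not blob.startswith(APP_MAGIC):
        raise ValueError("not a save set in the backup application's format")
    return zlib.decompress(blob[len(APP_MAGIC):])


def appliance_wrap(blob: bytes) -> bytes:
    # Layer 2: stand-in for the appliance reducing whatever it is handed.
    return APPLIANCE_MAGIC + zlib.compress(blob)


def appliance_unwrap(stored: bytes) -> bytes:
    if not stored.startswith(APPLIANCE_MAGIC):
        raise ValueError("not in the appliance's on-disk format")
    return zlib.decompress(stored[len(APPLIANCE_MAGIC):])


original = b"flat file contents " * 100
on_disk = appliance_wrap(app_wrap(original))

# Unwrap in the reverse order of writing: appliance first, then the application.
restored = app_unwrap(appliance_unwrap(on_disk))

# Skipping a layer fails: the application cannot read the appliance's raw format.
try:
    app_unwrap(on_disk)
    skipped_layer_failed = False
except ValueError:
    skipped_layer_failed = True
```

Each layer only understands its own ‘point of view’, which is why the data has to come back out through the same chain it went in.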

Now we come to the ugly side of dedupe, which is what to do when the honeymoon is over.  How do you break up with a dedupe device?  The customer in question was a long-term DD customer, and was facing a large increase in maintenance fees post-EMC acquisition.  The cost/benefit had fallen to the point where they wanted to go in a new direction.  They wanted to know if they could just export the data on the Data Domain box to another location.  I wish it were that easy, but alas… it was not meant to be.

As I mentioned earlier, to get out of a dedupe data set, you need to unwrap the data in the same manner that you put it into the store.  The customer needs to open up their backup application, browse to the save sets, and clone them off to another location.  This location could be tape, another dedupe device, or a simple D2D target.  In their case, they had been cloning out monthly sets to tape for archiving purposes, and simply needed to step up the process for weekly and daily tapes until their migration to the new backup application was completed.
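The clone-out process above boils down to a loop like the following Python sketch.  Everything here is hypothetical: `read_save_set` and `write_save_set` stand in for whatever clone or restore operation your backup application provides, and the dictionaries stand in for the old appliance and the new target.  It is not any vendor’s API.

```python
import hashlib


def clone_save_sets(read_save_set, write_save_set, names):
    """Copy save sets through the backup application's logical view.

    read_save_set and write_save_set are hypothetical callables supplied
    by the caller; nothing here talks to a real backup product.
    """
    report = {}
    for name in names:
        data = read_save_set(name)   # rehydrated by the application, not raw blocks
        write_save_set(name, data)   # tape, another dedupe device, or plain D2D
        report[name] = hashlib.sha256(data).hexdigest()  # keep a checksum for later verification
    return report


# Stand-ins for the old and new locations.
old_target = {"daily-01": b"daily data", "weekly-01": b"weekly data"}
new_target = {}

report = clone_save_sets(
    read_save_set=old_target.__getitem__,
    write_save_set=new_target.__setitem__,
    names=sorted(old_target),
)
```

The important design point is that the copy reads each save set through the application, so the data arrives at the new location in a form the application can still restore from.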

In conclusion, deduplication is a great thing for storage and backup vendors, but it can create additional complexities when you try to migrate away from a specific technology.  Make sure you ask about exit and upgrade strategies while shopping around for your next backup appliance.  It could save you headaches down the road.

About timantz

I am a Solutions Architect at SimpliVity, helping people around the country with their virtualization, storage, backup and recovery projects.