6 posts tagged “avamar”
Today I looked on horrified as I read how "shocked" some delusional people were at the fact that drive failure rates do not magically stabilize with age.
Here comes the Cap'n:
Okay, so maybe not everyone reading this spent a week working with QA guys from Seagate, but I did, so I'm rather underwhelmed at all of this. One humorous highlight amongst the charge of the tin foil hat brigade came from the Zerowait blog, where Mike wonders why we haven't seen people present this stuff at storage conferences (Maybe they were busy finishing their research first?) before asking if it's because vendors have something to hide. Well, FAST '07 was a storage conference (File and Storage Technologies. FAST. Get it?) and if the vendors were doing their utmost to suppress the information they've done a really bad job, since most of them sponsored the event if not the very award the paper in question won. Hell, the vendors are even thanked in the acknowledgments, if anyone could be bothered to read that far:
I'll make this quite clear: when it comes to overthrowing governments and suppressing information, the PDL Consortium wouldn't be the folks I'd call. You're talking socks with sandals & beards worth remembering.
Moving off that I was busy today building an EMC Avamar system. I dutifully ordered equipment from DELL which met the requirements and instead of packing things with FC disks I went with SAS drives, thereby leaving me in a situation with no spares to hand when inevitably a drive fails (Newsflash: Your drives will fail eventually. Don't panic.), but it's not for production usage and I wanted to mix things up and see what that SAS magic was like.
What it was like was just like any other bunch of disks with a single LUN created on it before it left the factory, which means they're probably all still spinning away while they bind the LUNs I created after I blew away the factory shipped ones. Even with things binding in the background as I was loading the code eventually you get to a point where your wrist watch overrides your need to see what's over the next hill and you go home for the night.
I suppose it's best not to start these things fifteen minutes before quitting time.
Interested in EMC Avamar?
There's a free hour-long webcast covering EMC's Global De-Duplication technology running on the 24th of January, the details of which you can get here.
Hu has a post up about HDS's ProtecTier VTL offering which caught my eye. Let's watch...
I recently saw an article by Beth Pariseau about reasons why people are not buying deduplication products. The main reason given was the slow speed of backup compared to non deduplication VTL solutions. Compared to tape libraries and VTLs which top out around 500GB/s, the Diligent ProtecTier solution is cited at 220 MB/s. While that is better than the 100MB/s that is cited for Data Domain, it is judged to be too slow.
That might be because for Enterprise customers it is too slow. I'm not saying that it's too slow for all customers, just customers who have large environments and require a higher transfer rate than what's effectively performing on par with four LTO 3 drives. (LTO 3 having a native drive transfer rate of 80MB/s.)
I have another take on this. The speed of backup was a concern when backup was done during backup windows. Today most users will create a snapshot or clone copy and do their backup in the background without the need for backup windows.
Backup windows haven't gone away, they're alive and well and still causing people pain. Indeed VTLs evolved from the need to help customers get their backups done within their window by leveraging the speed of backup to disk but without having to drastically overhaul their existing backup environment.
As is mentioned further down, while the backup is fast, the restore is just so much faster as it's coming off disk and you're not dealing with multiplexing or mount operations. The other issue is that customers might not have chosen, or just can't deploy, array-based instant copy (Snap/Clone/BCV/Whatever) technologies across all the systems they back up.
For some folks it's a mission critical thing only, and involves taking numerous significant point in time backups during the day to ensure a fast restart. Ideally customers doing that should be looking at CDP, but that's another discussion. There's also a lot of DAS and NAS out there which needs to be backed up, the DAS pouring across the LAN while the NAS devices might be streaming off via NDMP over FC. The NAS devices might be using snapshots, they might not. So while backups do happen in the background a lot of backups don't, and people are still playing beat the clock with their backup window.
The main concern today is the recovery speed of backups. For recovery, we achieve 400 to 500 MB/s using one ProtecTier in front of an AMS 1000 modular array. So while the backup time does take longer today, it can be done using a shadow image copy, without impact to the application, and the recovery times are still comparable to other VTLs. The major advantage of Diligent’s deduplication is to reduce the amount of data backed up by a factor of 25 to 1, and that is money.
Spot on with the money thing but it's my understanding that ProtecTier doesn't reduce the amount of data backed up, it reduces the amount of backup data retained. Yes I'm nit-picking and that isn't what Hu meant I know, but it's an important point to mention about VTL de-dup. Everything still gets backed up even if only the unique pieces are kept. Avamar on the other hand does reduce the amount of data backed up as it de-dups globally, both at the client during backup and at the storage target. So you could see a de-dup ratio in the order of hundreds to one before it leaves the client, as it's only sending subfile changes and not the entire file containing the change, and up to 30:1 when all data from all the clients lands on the back end. So you've reduced your storage requirement when the data is at rest as well as drastically reducing the bandwidth requirement when the data is in motion.
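Here's a toy sketch of that difference in Python. This is not how Avamar or ProtecTier are actually implemented; the fixed 4-byte chunks, the SHA-256 hashes, and every class and function name below are assumptions I've picked to keep the example readable. All it shows is where the de-dup decision gets made and what that does to the bytes on the wire.

```python
import hashlib

CHUNK_SIZE = 4  # assumption: tiny fixed-size chunks, purely for readability


def split(data, size=CHUNK_SIZE):
    """Cut the data into fixed-size chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def digest(chunk):
    """Fingerprint a chunk so identical chunks can be recognised."""
    return hashlib.sha256(chunk).hexdigest()


class ChunkStore:
    """Hypothetical back-end that keeps one copy of each unique chunk."""

    def __init__(self):
        self.blobs = {}

    def missing(self, digests):
        return [d for d in digests if d not in self.blobs]

    def put(self, d, chunk):
        self.blobs.setdefault(d, chunk)

    def stored_bytes(self):
        return sum(len(c) for c in self.blobs.values())


def client_side_backup(data, store):
    """De-dup at the client: hash locally, send only chunks the store lacks.

    Returns the number of payload bytes that crossed the wire.
    """
    by_digest = {digest(c): c for c in split(data)}
    sent = 0
    for d in store.missing(list(by_digest)):
        store.put(d, by_digest[d])
        sent += len(by_digest[d])
    return sent


def target_side_backup(data, store):
    """De-dup at the target: everything is sent, only unique chunks are kept.

    Returns the number of payload bytes that crossed the wire.
    """
    for c in split(data):
        store.put(digest(c), c)
    return len(data)


original = b"AAAABBBBCCCCDDDD"
edited = b"AAAABBBBXXXXDDDD"  # one chunk changed since the last backup

source, target = ChunkStore(), ChunkStore()
print(client_side_backup(original, source), client_side_backup(edited, source))  # 16 4
print(target_side_backup(original, target), target_side_backup(edited, target))  # 16 16
print(source.stored_bytes(), target.stored_bytes())                              # 20 20
```

Both stores end up holding the same 20 bytes at rest. The only difference is that the second client-side backup moved 4 bytes where the target-side one moved all 16 again.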
By writing this post I haven't set out to bash anyone's solution. I know that there are people out there quite happy with ProtecTier, but I would like to point out that the performance levels we're seeing when de-dup happens at the VTL could be significantly better than they are now. We're starting to see a number of different approaches spring up in the market: Sepaton with DeltaStor, FalconStor with their Single Instance Repository, Quantum with the Rocksoft Blocklets technology now shipping in their DXi series, Data Domain with their new DDX array, Diligent/HDS adding clustering, and so on. But when I look at the EMC DL 4000 series and see that the throughput of something like the DL 4400 is 2200MB/s, you kind of realize that for some approaches there's a very long road to get to that level of performance, and all the while customers keep asking for even faster VTLs.
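To put those throughput figures in perspective, here's a back-of-the-envelope sketch in Python. The 10 TB data set is a number I've made up for illustration, the function name is mine, and I'm assuming the quoted rates are sustained end to end with decimal units (1 TB = 1,000,000 MB), which real backups rarely manage.

```python
def hours_to_move(data_tb, rate_mb_s):
    """Hours needed to move data_tb terabytes at a sustained rate_mb_s MB/s."""
    megabytes = data_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return megabytes / rate_mb_s / 3600


DATA_TB = 10  # hypothetical nightly backup size, not a figure from any vendor

# Throughput figures as quoted in this post, in MB/s.
rates = {
    "Data Domain (as cited)": 100,
    "ProtecTier (as cited)": 220,
    "EMC DL 4400": 2200,
}

for name, rate in rates.items():
    print(f"{name}: {hours_to_move(DATA_TB, rate):.1f} hours")
```

On those assumptions 10 TB takes a little over a day at 100MB/s, roughly half a day at 220MB/s, and a bit over an hour at 2200MB/s, which is the gap a backup window cares about.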
The question I suppose this raises is where is EMC's VTL de-dup offering? I'd imagine EMC will answer that question soon enough.
You can read that here.
I like startup people as they're always up for a fight. Jed strikes me as one of those people, and he's been more than gracious about taking questions about the new age of that thing we currently call backup and recovery.
The long version met with an untimely demise in editing so here's the short one.
The more I learn about Avamar's tech the more I want it backing up my computers at home. The days of editing a few frames in a home movie or changing a letter or a word in a file and then watching as the entire movie or file was backed up in the next incremental/full backup are over. It's more than just de-duplicating data, it's being smart about de-duplicating data.
So, the problem has been solved, and all that's left to do is wait until the technology trickles down into the consumer market. That's how I know that Avamar was a good buy for EMC. Its technology solves a problem which technology buyers of all shapes and sizes have, be they large Enterprise customers or individual consumers. Now it's EMC's job to work out how best to deliver that technology to all the different types of folks who could do with it.
Not going to say too much about this as this is only the beginning, but what it's the beginning of is a drastic change in the backup and recovery market. Yes my friends, de-duplication (I prefer the term redundant data elimination myself) is one of those technologies which is going to screw a lot of vendors up.
Like some technological angel of death it'll mortally wound or kill some other vendors outright, turn their value prop to dust, but what's going to be really interesting is to watch how quickly it becomes ubiquitous. How quickly it spreads everywhere.
Let's see how quickly it spreads when a couple of thousand EMC sales folks are out there showing Avamar to customers. Backup less, transmit less, store less, and do so securely?
Gimme.