Category Archives: Falconstor

Primary Dedupe where are you?

I am a bit surprised that primary storage deduplication has not taken off in a big way, compared to the buzz deduplication generated when it first came into being about 4 years ago.

When the first deduplication solutions came out, they were aimed particularly at the backup data space. Now more popularly known as secondary data deduplication, the technology reduced the inefficiencies of backup and helped spark the frenzy of adulation around companies like Data Domain, Exagrid, Sepaton and Quantum a few years ago. The software vendors were not left out either: Symantec, Commvault, and everyone else in town had data deduplication for backup and archiving.

It was no surprise that EMC battled NetApp and finally won the rights to acquire Data Domain for USD$2.4 billion in 2009. Today, in my opinion, the landscape of secondary data deduplication has pretty much settled and matured. Practically everyone has some sort of secondary data deduplication technology or solution in place.

But the talk of primary data deduplication hardly causes a ripple compared to a few years ago, especially here in Malaysia. Yeah, the IT crowd is pretty fickle that way, because most tend to follow the trend of the moment. Last year it was Cloud Computing, and now the big buzzword is Big Data.

We are here to look at technologies to solve problems, folks, and primary data deduplication should be considered in any IT planning. It is our job as storage networking professionals to continue advising customers on what is relevant to their business and what addresses their pain points.

I get a bit cheesed off that companies like EMC or HDS continue to spend their marketing dollars on hyping the trends of the moment rather than using some of those funds to promote good technologies, such as primary data deduplication, that solve real-life problems. The same goes for most IT magazines, publications and other media, which rarely give space to technologies that solve problems on the ground and just keep harping on hype, fuzz and buzz. It gets a bit too ordinary (and mundane) when they try too hard to be extraordinary, because everyone is basically talking about the same freaking thing at the same time, over and over again. (Hmmm … I think I am going off topic now … I better shut up!)

We are facing an avalanche of data. The other day, the CEO of Nexenta used the phrase “data tsunami”, but whatever term is used does not matter. There is too much data. Secondary data deduplication solved one part of the problem, and now it is time to talk about the other part: data in primary storage, hence primary data deduplication.

What is out there? Who’s doing what in terms of primary data deduplication?

NetApp has had A-SIS (now NetApp Dedupe) for years, and they are good in my books. They talk to customers about the benefits of deduplication on their FAS filers. (Side note: I am seeing more benefits from using data compression in primary storage, but I am not going there in this entry.) EMC had primary data deduplication in their Celerra years ago, but they hardly talked much about it. It is on their VNX as well, but again, nobody in EMC ever speaks about their primary deduplication feature.

I have always loved Ocarina Networks’ ECO technology, but Dell hasn’t given much of a hoot about Ocarina since the acquisition in 2010. The technology surfaced a few months ago in the Dell DX6000G Storage Compression Node for its Object Storage Platform, but then again, all Dell talks about is the Fluid Data Architecture from its Compellent division. Hey Dell, you guys are so one-dimensional! Ocarina is a wonderful gem in your jewel case, and yet all your storage guys talk about are Compellent and EqualLogic.

Moving on … I ought to knock Oracle on the head too. ZFS has great data deduplication technology meant for primary data, and a couple of years back, Greenbytes took that and made a solution out of it. I don’t follow what Greenbytes is doing nowadays, but I do hope the big wave of primary data deduplication will rise for companies such as Greenbytes to take off in a big way. No thanks to Oracle for ignoring another gem in ZFS and wasting their resources on pre-sales (in Malaysia) and partners (in Malaysia) that hardly know much about the immense power of ZFS.

But an unexpected source coming from Microsoft could help trigger greater interest in primary data deduplication. I have just read that the next version of Windows Server OS will have primary data deduplication integrated into NTFS. The feature will be available in Windows 8 and the architectural view is shown below:

The primary data deduplication in NTFS will be a feature add-on for Windows Server users. It is implemented as a filter driver on a per-volume basis, with each volume a complete, self-describing unit. It is cluster-aware and fully crash-consistent on all operations.

The technology is Microsoft’s own, built from scratch, and it will work to position Hyper-V as a strong enterprise choice in the battle with VMware for the server virtualization space. Mind you, VMware already has a big, big lead, and this is something Microsoft must do, or die, to keep Hyper-V in the catch-up game. Otherwise, the gap between Microsoft and VMware in server virtualization will grow even greater.

I don’t have the full details of this, but I have read that the NTFS primary deduplication chunk size will be between 32KB and 128KB, and that deduplication will run as a post-process.
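To picture the idea (and only the idea; this is a generic sketch with hypothetical names, not Microsoft’s actual implementation), a post-process deduplicator splits stored data into chunks, hashes each chunk, and keeps only the unique ones. The toy below uses the 32KB–128KB bounds mentioned above; the boundary-picking hash is deliberately simplistic:

```python
import hashlib

MIN_CHUNK = 32 * 1024    # 32KB, matching the reported lower bound
MAX_CHUNK = 128 * 1024   # 128KB, matching the reported upper bound
MASK = 0xFFFF            # toy boundary condition

def chunk(data: bytes):
    """Split data into content-defined chunks between MIN_CHUNK and MAX_CHUNK."""
    out, start = [], 0
    while start < len(data):
        end = min(start + MAX_CHUNK, len(data))
        cut, h = end, 0
        for i in range(start + MIN_CHUNK, end):
            h = ((h << 1) + data[i]) & 0xFFFFFFFF  # toy rolling-style hash
            if h & MASK == MASK:                   # declare a chunk boundary
                cut = i + 1
                break
        out.append(data[start:cut])
        start = cut
    return out

def dedupe(data: bytes):
    """Store each unique chunk once; return the store and a rebuild recipe."""
    store, recipe = {}, []
    for c in chunk(data):
        key = hashlib.sha256(c).hexdigest()
        store.setdefault(key, c)   # identical chunks are stored only once
        recipe.append(key)
    return store, recipe

def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the chunk store and the recipe."""
    return b"".join(store[k] for k in recipe)
```

Running something like `dedupe` over a volume after the writes have landed, rather than in the write path itself, is what makes a design “post-processing”.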

With Microsoft introducing their technology soon, I hope primary data deduplication will get some deserving accolades, because I think most companies are really not doing justice to the great technologies they have in their jewel cases. And I hope Microsoft, with all its marketing savviness and adeptness, will do justice to a technology that solves real-life data problems.

I bid you good luck – Primary Data Deduplication! You deserve better.

Falconstor – soaring to 7th heaven

I was invited to the Falconstor version 7.0 media launch this morning at Sunway Resort Hotel.

I must admit that I am a fan of Falconstor from a business perspective because they have nifty solutions. Many big boys OEMed Falconstor’s VTL technology, such as EMC with its CDL (CLARiiON Disk Library) and Sun Microsystems with its virtual tape library solutions. Things have been changing. There are still OEM partnerships with HDS (Falconstor VTL and FDS solutions), HP (Falconstor NSS solution) and a few others, but Falconstor has been taking a more aggressive stance with its new business model. They are definitely more direct in their approach, and hence it is high time we in the industry recognize Falconstor’s prowess.

The launch today is of the Falconstor version 7.0 suite of data recovery and storage enhancement solutions. Note that while the topic of their solutions was data protection, I use the term data recovery, simply because the true objective of their solutions is doing what matters most to business – RECOVERY.

The Falconstor version 7.0 family of products is divided into three pillars:

  • Storage Virtualization – with Falconstor Network Storage Server (NSS)
  • Backup & Recovery – with Falconstor Continuous Data Protector (CDP)
  • Deduplication – with Falconstor Virtual Tape Library (VTL) and File-Interface Deduplication System (FDS)
NSS virtualizes heterogeneous storage platforms and sits between them and the application servers, physical or virtualized. It simplifies disparate storage platforms by consolidating volumes, and provides features such as thin provisioning and snapshots. In the new version, NSS supports up to 1,000 snapshots per volume, up from the previous limit of 255. That is roughly a 4x increase, as the demand for data protection is greater than ever. This allows protection granularity in the minutes, meeting the RPO (Recovery Point Objective) requirements of the most demanding customers.

NSS also replicates the snapshots to a secondary NSS platform at a DR site, extending the company’s data resiliency and improving the organization’s business continuance.
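As a rough back-of-the-envelope (the 15-minute snapshot interval here is my assumption, not Falconstor’s), the jump from 255 to 1,000 snapshots per volume changes how long that minute-level granularity can be retained:

```python
SNAPS_PER_DAY = 24 * 60 // 15   # one snapshot every 15 minutes = 96 per day

old_days = 255 // SNAPS_PER_DAY    # old limit covers about 2 days
new_days = 1000 // SNAPS_PER_DAY   # new limit covers about 10 days
print(old_days, new_days)          # prints: 2 10
```

In other words, at the same granularity the retention window stretches from a couple of days to well over a week, or alternatively the interval can be tightened without shrinking the window.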
With a revamped algorithm in version 7.0, the MicroScan technology used in replication is now more potent and delivers higher performance. For the uninformed, MicroScan, as quoted in the datasheet, is:
    MicroScan™, a patented FalconStor technology, minimizes the amount of data transmitted by eliminating redundancies at the application and file system layers. Rather than arbitrarily transmitting entire blocks or pages (as is typical of other replication solutions), MicroScan technology maps, identifies, and transmits only unique disk drive sectors (512 bytes), reducing network traffic by as much as 95%, in turn reducing remote bandwidth requirements.
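To picture what sector-level replication buys (a toy sketch of the principle, not FalconStor’s actual MicroScan code), compare the primary and replica images sector by sector and ship only the 512-byte sectors that differ:

```python
SECTOR = 512  # disk drive sector size, as cited in the datasheet

def sector_delta(old: bytes, new: bytes) -> dict:
    """Map sector index -> new 512-byte sector, for sectors that changed."""
    delta = {}
    for off in range(0, len(new), SECTOR):
        if old[off:off + SECTOR] != new[off:off + SECTOR]:
            delta[off // SECTOR] = new[off:off + SECTOR]
    return delta

def apply_delta(old: bytes, delta: dict) -> bytes:
    """Patch the replica image with only the changed sectors."""
    buf = bytearray(old)
    for idx, sec in delta.items():
        buf[idx * SECTOR:idx * SECTOR + len(sec)] = sec
    return bytes(buf)
```

If an application dirties a single sector inside a 64KB block, a block-level replicator ships all 65,536 bytes while a sector-level one ships 512, which is the spirit behind the “up to 95%” traffic reduction claim.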

Another very strong feature of NSS is RecoverTrac, an automated DR technology. In business, business continuity and disaster recovery usually go hand in hand. Unfortunately, triggering either BC or DR, or both, is an expensive and resource-consuming exercise. But organizations have to prepare, and therefore a proper DR process must be tested, and tested again.

I am a certified Business Continuity Planner, so I am fully aware of the beauty RecoverTrac brings to an organization. The ability to run non-intrusive, simulated DR tests and find the weak points of recovery is crucial, and RecoverTrac brings that confidence in DR testing to the table. Furthermore, well-tested, automated DR processes also eliminate human error in recovery. RecoverTrac can also track the logical relationships between different applications and computing resources, making this technology an invaluable tool in the DR coordinator’s arsenal.

The diagram below shows the NSS solution:

NSS is touted as the one true any-storage-platform-to-any-storage-platform, any-protocol replication solution. Most vendors offer either FC, iSCSI or NAS protocols, but I believe that so far only Falconstor offers all protocols in one solution.

Item #2 on the upgrade list is Falconstor’s CDP solution. Continuous Data Protection (CDP) is a very interesting area of data protection. CDP provides a near-zero RTO/RPO solution on disk, and yet not many people are aware of its power.

About 5-6 years ago, CDP was hot and there were many start-ups in this area. Companies such as Kashya (bought by EMC to become RecoverPoint), Mendocino, Revivio (gobbled up by Symantec) and StoneFly have either gone belly up or been gobbled up by the bigger boys in the industry. Only a few remain, and Falconstor CDP is one of the true survivors in this area.

CDP should be given more credit, because there is always demand for very granular data protection. In fact, I sincerely believe that CDP, snapshots and snapshot replication are the real flagships of data protection today and in the future, because data protection using the traditional backup method, periodic and infrequent, is no longer adequate. And the fact that backup generates more and more data to keep is truly not helping.

Falconstor CDP includes the HyperTrac™ Backup Accelerator (HyperTrac), which works in conjunction with FalconStor Continuous Data Protector (CDP) and FalconStor Network Storage Server (NSS) to increase tape backup speed, eliminate backup windows, and offload processing from application servers. A quick glimpse of the HyperTrac technology is shown below:

In the Deduplication pillar, there are upgrades to both Falconstor VTL and Falconstor FDS. As I said earlier, CDP, snapshots and snapshot replication are already becoming the data protection of this new generation of storage solutions. Coupled with deduplication, data protection becomes even more significant, because it no longer makes sense to keep copy after copy of the same old files.

Falconstor File-Interface Deduplication System (FDS) addresses the requirement to store data more effectively, efficiently and economically. Its Single Instance Repository (SIR) technology has now been enhanced into a global deduplication repository, giving it the ability to truly store a single copy of each object. Previously, FDS was not able to recognize duplicated objects residing on a different controller. FDS has also improved its algorithms, driving performance up to 30TB/hour and delivering a higher deduplication ratio.
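The idea behind a global single-instance repository can be sketched as a content-addressed store: the key is a hash of the content, so identical objects ingested through any controller resolve to the same key and are stored once. (A toy illustration with hypothetical names, not Falconstor’s SIR code.)

```python
import hashlib

class SingleInstanceRepo:
    """Toy content-addressed store: identical payloads are kept exactly once."""

    def __init__(self):
        self._blobs = {}  # hash -> payload, shared globally across controllers

    def put(self, payload: bytes) -> str:
        """Ingest an object; return its content-derived key."""
        key = hashlib.sha256(payload).hexdigest()
        self._blobs.setdefault(key, payload)  # no-op if already stored
        return key

    def get(self, key: str) -> bytes:
        """Retrieve an object by its content key."""
        return self._blobs[key]

    def unique_bytes(self) -> int:
        """Total bytes actually stored after deduplication."""
        return sum(len(b) for b in self._blobs.values())
```

With a per-controller repository, the same object landing on two controllers is stored twice; with one global repository, the second `put` is a no-op, which is the gist of the enhancement described above.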

In addition to the NAS interface, the FDS solution now has a tighter integration with the Symantec Open Storage Technology (OST) protocol.

The Falconstor VTL is widely OEMed by many partners and remains one of the most popular VTL solutions in the market. The VTL is also enhanced significantly in this upgrade, and not surprisingly, it is strengthened by its near-seamless integration with the other solutions in the Falconstor stable. The VTL solution now supports up to 1 petabyte of usable capacity.

Falconstor has always been very focused on the backup and data recovery space and has fared favourably with Gartner. In January 2011, Gartner released its Magic Quadrant report for Enterprise Disk-based Backup and Recovery, and Falconstor was positioned as one of the Visionaries in this space. Below is the magic quadrant:

As their business model shifts to a more direct approach, it won’t be long before you see Falconstor move into the Leaders quadrant. They will be soaring, like a falcon.