Category Archives: Dell

The last bastion – Memory

I have been in this industry for almost 20 years. March 2, 2012 will be my 20th year, to be exact. I have never been in the mainframe era, dabbled a bit in the mini computers era during my university days and managed to ride the wave of client-server, Internet explosion in the beginning WWW days, the dot-com crash, and now Cloud Computing.

In that 20 years, I have seen the networking wars (in which TCP/IP and Cisco prevailed), the OS wars and the Balkanization of Unix (Windows NT came out the winner), the CPU wars (SPARC, PowerPC, in which x86 came out tops) and now data and storage. Yet, the last piece of the IT industry has yet to begun or has it?

In the storage wars, it was pretty much the competition between NAS and SAN and religious groups of storage in the early 2000s but now that I have been in the storage networking industry for a while, every storage vendor are beginning to look pretty much the same for me, albeit some slight differentiating factors once in a while.

In the wars that I described, there is a vendor for the product(s) that are peddled but what about memory? We never question what brand of memory we put in our servers and storage, do we? In the enterprise world, it has got to be ECC, DDR2/3 memory DIMMs and that’s about it. Why????

Even in server virtualization, the RAM and the physical or virtual memory are exactly just that – memory! Sure VMware differentiates them with a cool name called vRAM, but the logical and virtual memory is pretty much confined to what’s inside the physical server.

In clustering, architectures such as SMP and NUMA, do use shared memory. Oracle RAC shares its hosts memory for the purpose of Oracle database scalability and performance. Such aggregated memory architectures in one way or another, serves the purpose of the specific applications’ functionality rather than having the memory shared in a pool for all general applications.

What if some innovative company came along, and decided to do just that? Pool all the physical memory of all servers into a single, cohesive and integrated memory pool and every application of each of the server can use the “extended” memory in an instance, without some sort of clustering software or parallel database. One company has done it using RDMA (Remote Direct Memory Access) and their concept is shown below:

I am a big fan of RDMA ever since NetApp came out with DAFS some years ago, and I only know a little bit about RDMA because I didn’t spend a lot of time on it. But I know RDMA’s key strength in networking and when this little company called RNA Networks news came up using RDMA to create a Memory Cloud, it piqued my interest.

RNA innovated with their Memory Virtualization Acceleration (MVX) and this is layered on top of 10Gigabit Ethernet or Infiniband networks with RDMA. Within the MVX, there are 2 components of interest – RNAcache and RNAmessenger. This memory virtualization technology allows hundreds of server nodes to lend their memory into the Memory Cloud, thus creating a very large and very scalable memory pool.

As quoted:

RNA Networks then plunks a messaging engine, an API layer, and a pointer updating algorithm
on top of the global shred memory infrastructure, with the net effect that all nodes in the
cluster see the global shared memory as their own main memory.

The RNA code keeps the memory coherent across the server, giving all the benefits of an SMP
or NUMA server without actually lashing the CPUs on each machine together tightly so they
can run one copy of the operating system.

The performance gains, as claimed by RNA Networks, was enormous. In a test  published, running MVX had a significant performance gain over SSDs, as shown in the results below:

This test was done in 2009/2010, so there were no comparisons with present day server-side PCIe Flash cards such as FusionIO. But even without these newer technologies, the performance gains were quite impressive.

In a previous version of 2.5, the MVX technology introduced 3 key features:

  • Memory Cache
  • Memory Motion
  • Memory Store
The Memory Cache, as the name implied, turned the memory pool into a cache for NAS and file systems that are linked to the server. At the time, the NAS protocol supported was only NFS. The cache stored frequently accessed data sets used by the servers. Each server could have simultaneous access to the data set in the pool and MVX would be handling the contention issues.
The Memory Motion feature gives OSes and physical servers (including hypervisors) access to shared pools of memory that acts as a giant swap device during page out/swap out scenarios.
Lastly, the Memory Store was the most interesting for me. It turned the memory pool into a collection of virtual block device and was similar to the concept of RAMdisks. These RAMdisks extended very fast disks to the server nodes and the OSes, and one server node can mount multiple instances of these virtual RAMdisks. Similarly multiple server nodes can mount a single virtual RAMdisk for shared disk reasons.
The RNA Networks MVX scales hundreds of server nodes and supported architectures such as 32/64 bit x86, PowerPC, SPARC and Itanium. At the time, the MVX was available for Unix and Linux only.

The technology that RNA Networks was doing was a perfect example of how RDMA can be implemented. Before this, memory was just memory but this technology takes the last bastion of IT – the memory – out into the open. As the Cloud Computing matures, memory is going to THE component that defines the next leap forward, which is to make the Cloud work like one giant computer. Extending the memory and incorporating memory both on-premise, on the host side as well as memory in the cloud, into a fast, low latency memory pool would complete the holy grail of Cloud Computing as one giant computer.

RNA Networks was quietly acquired by Dell in July 2011 for an undisclosed sum and got absorbed into Dell Fluid Architecture’s grand scheme of things. One blog, Juku, captured an event from Dell Field Tech Day back in 2011, and it posted:

The leitmotiv here is "Fluid Data". This tagline, that originally was used by Compellent
(the term was coined by one of the earlier Italian Compellent customer), has been adopted
for all the storage lineup, bringing the fluid concept to the whole Dell storage ecosystem,
by integrating all the acquired tech in a single common platform: Ocarina will be the
dedupe engine, Exanet will be the scale-out NAS engine, RNA networks will provide an interesting
cache coherency technology to the mix while both Equallogic and Compellent have a different
targeted automatic tiering solution for traditional block storage.

Dell is definitely quietly building something and this could go on for some years. But for the author to quote – “Ocarina will be the dedupe engine, Exanet will be the scale-out NAS engine; RNA Networks will provide cache coherency technology … ” mean that Dell is determined to out-innovate some of the storage players out there.

How does it all play in Dell’s Fluid Architecture? Here’s a peek:

It will be interesting how to see how RNA Networks technology gels the Dell storage technologies together but one thing’s for sure. Memory will be the last bastion that will cement Cloud Computing into an IT foundation of the next generation.

Oracle Bested the Best in Quality

I have been an avid reader of SearchStorage Storage magazine for many years now and have been downloading their free PDF copy every month. Quietly snugged at the end of January 2012’s issue, there it was, the Storage magazine 6th annual Quality Awards for NAS.

I was pleasantly surprised with the results because in the previous annual awards, it would dominated by NetApp and EMC but this time around, a dark horse has emerged. It is Oracle who took top honours in both the Enterprise and the Mid-range categories.

The awards are the result of Storage Magazine’s survey and below is an excerpt about the survey:

In both categories covering the Enterprise and the Mid-Range, the overall ratings are shown below:

Surprised? You bet because I was.

The survey does not focus on speeds and feeds or comparing scalability or performance. Rather, the survey focuses on the qualitative aspects of the NAS products. There were many storage vendors who were part of the participation lists but many did not qualify to be make a dent of what the top 6 did. Here’s a list of the vendors surveyed:

The qualitative aspects of the survey focused on 5 main areas:

  • Sales force competency
  • Initial Quality
  • Product Features
  • Product Reliability
  • Technical Support

In each of the 5 main areas, customers were asked a series of questions. Here is a breakdown of those questions of each area.

Sales Force Competency

  1. Are the sales force knowledgeable about their products and their customer’s industries?
  2. How flexible are their sales effort?
  3. How good are they keeping the customer’s interest levels up?

Initial Product Quality

  1. Does the product need little or no vendor intervention?
  2. Ease of installation and ease of use
  3. Good value for money
  4. Reasonable requirement from Professional Service or needing little Professional Service
  5. Installation without defects
  6. Getting it right the first time

Product Features

  1. Storage management features
  2. Mirroring features
  3. Capacity scaling features
  4. Interoperable with other vendor’s products
  5. Remote replication features
  6. Snapshotting features

Product Reliability

  1. Vendor provide comprehensive upgrading procedures
  2. Ability to meet Service Level Agreement (SLA)
  3. Experiences very little downtime
  4. Patches applied non-disruptively

Technical Support

  1. Taking ownership of the customer’s problem
  2. Timely problem resolution and technical advice
  3. Documentation
  4. Vendor supplies support contractually as specified
  5. Vendor’s 3rd party partners are knowledgeable
  6. Vendor provide adequate training

These are some of the intangibles that customers are looking for when they qualify the NAS solutions from vendors. And the surprising was Oracle just became something to be reckoned with, backed by the strong legacy of customer-centric focus of Sun and StorageTek. If this is truly happening in the US, then kudos to Oracle for maximizing the Sun-Storagetek enterprise genes to put their NAS products to be best-of-breed.

However, on the local front, it seems to me that Oracle isn’t doing much justice to the human potential they have inherited from Sun. A little bird has told me that they got rid of some good customer service people in Malaysia and Singapore just last month and more could be on the way in 2012. All this for the sake of meeting some silly key performance indices (KPIs) of being measured by tasks per day.

The Sun people that I know here in Malaysia and Singapore are gurus who has gone through the fire and thrived and there is no substitute for quality. Unfortunately, in Oracle, it’s all about numbers, whether it is sales or tasks per day.

Well, back to the survey. And of course, the final question would be, “Is the product good enough that you would buy it again?” And the results are …

Good for Oracle in the US but the results do not fully reflect what’s on the ground here in Malaysia, which is more likely dominated by NetApp, HP, EMC and IBM.

Is Dell Fluid Enough?

Dell made a huge splash 2 weeks ago in London in their inaugural Dell Storage Forum. They dubbed their storage and management lineup as “Fluid Data Architecture” offering the ability for customers to quickly adapt and automate their business when it comes to storage networking and more importantly, data management.

In the London show, they showcased several key innovations and product development. Here’s a list of their jewels:

  • DR4000 – an inline, content optimized backup deduplication appliance (based on the acquired technology of Ocarina Networks)
  • Compellent Storage Center 6.0 – a major software release
  • Compellent key technology integration with VMware
  • Optimized object storage for Microsoft Sharepoint with the DX6000 Object Storage Platform – DX6000 is an OEM from Caringo
  • Broader support for Dell Force10, PowerConnect and their partner’s Brocade

The technology from Ocarina Networks is fantastic technology and I have always admired Ocarina. I have written about Ocarina in the past in my previous blog. But I was a bit perplexed why Dell chose to enter the secondary dedupe market with a backup dedupe appliance in the DR4000. They are already a latecomer into the secondary deduplication game and I thought HP was already late with their StoreOnce.

They could have used Ocarina’s technology to trailblaze the primary deduplication market. In my previous blog, I mentioned that primary deduplication hasn’t really taken off in a big way, and Dell with the technology from Ocarina could set the standard and establish themselves as the leader of the primary deduplication market space. I was disappointed that they didn’t, not just yet.

The Compellent Storage Center 6.0 release was a major release and it was, for better or for worse, coincided with the departure of Phil Soran, the founder and CEO of Compellent. Phil felt that he can let his baby go and Dell is certainly making the best of what they can do with Compellent as their flagship data storage product.

The major release included 64-bit support for greater performance and scalability and also include several key VMware technologies that other vendors already have. The technologies included:

  • VMware vStorage API for Array Integration (VAAI)
  • Storage Replication Adapter plug-in for VMware Site Recovery Manager (SRM)
  • VSphere 5 client plug-in
  • Integration of Enterprise Manager and VSphere

Other storage related releases (I am not going to talk about Force10 or their PowerConnect solutions here) included Dell offering 16Gbps FibreChannel switches from Brocade and also their DX6000 Object Storage Platform optimized for Microsoft Sharepoint.

I think it is fantastic that Dell is adapting and evolving into a business-oriented, enterprise solution provider and their acquisitions in the past 3 years – EqualLogic, Exanet, Ocarina Networks, Force10 and Compellent – proves that Dell aims to take market share in the storage networking and data management market. They have key initiatives with CommVault, Symantec, VMware and Microsoft as well. And Michael Dell is becoming quite a celebrity lately, giving Dell the boost it needs to battle in this market.

But the question is, “Is their Fluid Data Architecture” fluid enough?” If I were a customer, would I bite?

As a customer, I look for completeness in the total solution, and I cannot fault Dell for having most of the pieces in the solution stack. They have networking in their PowerConnect, Force10 and Brocade. They have SAN in both Compellent and EqualLogic but their unified storage story is still a bit lacking. That’s because we have not seen Dell’s NAS storage yet. Exanet was a scale-out NAS and we have seen little rah-rah about this product.

From a data management perspective, their data protection story gels well with the Commvault and Symantec partnership, but I feel that Dell sales and SEs (at least in Malaysia) spends too much time touting the Compellent Automated Storage Tiering. I have spoken to folks who have listened to Dell guys’ pitches and it’s too one-dimensional. It’s always about storage tiering and little else about other Compellent technology.

At this point of time, the story that Dell sells here in Malaysia is still disjointed, but they are getting better. And eventually, the fluidity (pun intended ;-)) of their Fluid Data Architecture will soon improve.

How will Dell fare in 2012? They had taken a beating in the past 2 IDC’s quarter storage market tracker, losing some percentage points in market share but I think Dell will continue to tinker to get it right.

2012 will be their watershed year.

Primary Dedupe where are you?

I am a bit surprised that primary storage deduplication has not taken off in a big way, unlike the times when the buzz of deduplication first came into being about 4 years ago.

When the first deduplication solutions first came out, it was particularly aimed at the backup data space. It is now more popularly known as secondary data deduplication, the technology has reduced the inefficiencies of backup and helped sparked the frenzy of adulation of companies like Data Domain, Exagrid, Sepaton and Quantum a few years ago. The software vendors were not left out either. Symantec, Commvault, and everyone else in town had data deduplication for backup and archiving.

It was no surprise that EMC battled NetApp and finally won the rights to acquire Data Domain for USD$2.4 billion in 2009. Today, in my opinion, the landscape of secondary data deduplication has pretty much settled and matured. Practically everyone has some sort of secondary data deduplication technology or solution in place.

But then the talk of primary data deduplication hardly cause a ripple when compared a few years ago, especially here in Malaysia. Yeah, the IT crowd is pretty fickle that way because most tend to follow the trend of the moment. Last year was Cloud Computing and now the big buzz word is Big Data.

We are here to look at technologies to solve problems, folks, and primary data deduplication technology solutions should be considered in any IT planning. And it is our job as storage networking professionals to continue to advise customers about what is relevant to their business and addressing their pain points.

I get a bit cheesed off that companies like EMC, or HDS continue to spend their marketing dollars on hyping the trends of the moment rather than using some of their funds to promote good technologies such as primary data deduplication that solve real life problems. The same goes for most IT magazines, publications and other communications mediums, rarely giving space to technologies that solves problems on the ground, and just harping on hypes, fuzz and buzz. It gets a bit too ordinary (and mundane) when they are trying too hard to be extraordinary because everyone is basically talking about the same freaking thing at the same time, over and over again. (Hmmm … I think I am speaking off topic now .. I better shut up!)

We are facing an avalanche of data. The other day, the CEO of Nexenta used the word “data tsunami” but whatever terms used do not matter. There is too much data. Secondary data deduplication solved one part of the problem and now it’s time to talk about the other part, which is data in primary storage, hence primary data deduplication.

What is out there?  Who’s doing what in term of primary data deduplication?

NetApp has their A-SIS (now NetApp Dedupe) for years and they are good in my books. They talk to customers about the benefits of deduplication on their FAS filers. (Side note: I am seeing more benefits of using data compression in primary storage but I am not going to there in this entry). EMC has primary data deduplication in their Celerra years ago but they hardly talk much about it. It’s on their VNX as well but again, nobody in EMC ever speak about their primary deduplication feature.

I have always loved Ocarina Networks ECO technology and Dell don’t give much hoot about Ocarina since the acquisition in  2010. The technology surfaced a few months ago in Dell DX6000G Storage Compression Node for its Object Storage Platform, but then again, all Dell talks about is their Fluid Data Architecture from the Compellent division. Hey Dell, you guys are so one-dimensional! Ocarina is a wonderful gem in their jewel case, and yet all their storage guys talk about are Compellent  and EqualLogic.

Moving on … I ought to knock Oracle on the head too. ZFS has great data deduplication technology that is meant for primary data and a couple of years back, Greenbytes took that and made a solution out of it. I don’t follow what Greenbytes is doing nowadays but I do hope that the big wave of primary data deduplication will rise for companies such as Greenbytes to take off in a big way. No thanks to Oracle for ignoring another gem in ZFS and wasting their resources on pre-sales (in Malaysia) and partners (in Malaysia) that hardly know much about the immense power of ZFS.

But an unexpected source coming from Microsoft could help trigger greater interest in primary data deduplication. I have just read that the next version of Windows Server OS will have primary data deduplication integrated into NTFS. The feature will be available in Windows 8 and the architectural view is shown below:

The primary data deduplication in NTFS will be a feature add-on for Windows Server users. It is implemented as a filter driver on a per volume basis, with each volume a complete, self describing unit. It is cluster aware, and fully crash consistent on all operations.

The technology is Microsoft’s own technology, built from scratch and will be working to position Hyper-V as an strong enterprise choice in its battle for the server virtualization space with VMware. Mind you, VMware already has a big, big lead and this is just something that Microsoft must do-or-die to keep Hyper-V playing catch-up. Otherwise, the gap between Microsoft and VMware in the server virtualization space will be even greater.

I don’t have the full details of this but I read that the NTFS primary deduplication chunk sizes will be between 32KB to 128KB and it will be post-processing.

With Microsoft introducing their technology soon, I hope primary data deduplication will get some deserving accolades because I think most companies are really not doing justice to the great technologies that they have in their jewel cases. And I hope Microsoft, with all its marketing savviness and adeptness, will do some justice to a technology that solves real life’s data problems.

I bid you good luck – Primary Data Deduplication! You deserved better.

Gartner 3Q2011 WW ECB Disk Storage Market

Just after IDC released their numbers of their worldwide Disk Storage System Tracker (Read my blog) 10 days ago, Gartner released their Worldwide External Controller Based (ECB) Disk Storage Market report for Q3 of 2011.

The storage market remains resilient (for now) and growing 10.4% in terms of revenue, despite the hard economic conditions. The table below shows the top 7 storage vendors and their relation to their Q2 numbers.

EMC remained at the top and gained a massive 3.6% jump in market share. Looks like they are firing all cylinders and chugging like an unstoppable steam train. IBM gained 0.1% in second place as its stable of DS8000, XIV and Storewize V7000 is taking shape. Even though IBM has been holding steadily, I still think that their present storage lineup is staggered and lacks that seamless upgrade path for their customers.

NetApp, which I always terms as the “little engine that could”, is slowing down. They were badly hit in the last quarter, delivering lower than expected revenue numbers according to the analysts. Their stock took a tumble too. As quoted by Gartner, “NetApp’s third-quarter results reflect an overdependence on a few large customers, limited geographic coverage in high-growth countries and increased competition from Dell, EMC, HP and IBM in the midrange modular ECB disk array market segment.

I wrote in my recent blog, that NetApp has to start evolving from a pure-play storage vendor into a total storage and data management solution vendor. The recent rumours of NetApp’s interests in Commvault and Quantum should make a lot of sense if NetApp decides to make that move. Come on, NetApp! What are you waiting for?

HP came back strong in this report. They are in 4th place with 10.4% market share and hot on NetApp’s heels. After many months of nonsensical madness – Leo Apotheker firing, trying to ditch the PC business, the killing of WebOS tablet, the very public Oracle-HP spat – things are beginning to settle a bit under their new CEO, Meg Whitman. In a recent HP Discover conference in Vienna, it was reported that the HP storage team is gung-ho of what they have in their arsenal right now. They called it “The 4 Jewels of HP Storage Crown” which includes 3PAR, Ibrix, StoreOnce and LeftHand. They also leap-frogged over HDS and Dell in the recent Gartner Magic Quadrant (See below).

Kudos to HP and team.

HDS seems to be doing well, and so is Dell. But the Gartner numbers tell a different story. HDS, lost market share and now shares 7.8% market share with Dell. Dell, despite its strong marketing on Compellent, could not make up its loss after breaking off with EMC.

Fujitsu and Oracle completes the line up.

My conclusion: HP and IBM are coming back; EMC is well and far ahead of everyone else; NetApp has to evolve; Dell still lacking in enterprise storage savviness despite having good technology; No comments about HDS. 

Magic on storage players

It’s that time of the year again where Gartner releases it Magic Quadrant for the block-access, external controller-based, mid-range and high-end modular disk arrays market. This particular is very important because it represents the mainstay of the overall storage industry, viewed from a more qualitative angle. Whereas the other charts and reports work with statistics and numbers, this is the chart that everyone in the industry flock to. Gartner Magic Quadrant (MQ) is the storage industry indicator of who’s are the leaders; who are the visionaries; who are the executive wizards and who are the laggards (also known as niche players).

So, this time around, who’s in the Leaders Quadrant?

The perennial players in the Leader’s Quadrant are EMC, IBM, NetApp, HP, Dell, and HDS. In my previous blog, I shared with you the IDC figures about market shares but the Gartner MQ shows are more subtle side, and one that perhaps carry more weight to organizations.

From the IDC numbers announced previously, we have seen Dell taking a beating. They have lost market share and similarly in this latest Gartner MQ, they have lost their significance of their influence as well. Everyone expected their Compellent solution to be robust and having EqualLogic, Ocarina and Exanet in its stable would strengthen their presence in the storage industry. Surprisingly, Dell lost on both IDC statistically charged market numbers and this Gartner MQ as well. Perhaps they were too hasty to dump EMC a few months ago?

Gartner also reported that HP has made significant leap in the Leader’s Quadrant. It has leapfrogged over HDS and IBM when comparing their position in Gartner’s MQ chart. This could be coming from their concerted effort to pitch their Converged Infrastructure, a vision that in my opinion, simplified computing. HP Malaysia shared with me their vision a few months ago, and I was impressed. What I was not very impressed then and even now, is that their storage solutions story is still staggered, lacking the gel. Perhaps it is work in progress for HP, the 3PAR, the IBRIX and the EVA. But one things for sure. They are slowly but surely getting the StoreOnce story right and that’s good news for customers. I did a review of HP StoreOnce technology a few months ago.

Perhaps it’s time for HP to ditch their VLS deduplication, which to me, confuses customers. By the way, HP VLS is an OEM from Sepaton. (Sepaton is “No tapes” spelled backwards)

Here’s a glimpse of last year’s Magic Quadrant.

 

In the Niche Quadrant, there are a few players making waves as well. 2 companies to watch out for are Huawei (they dropped Symantec 2 weeks ago) and Nexsan. Nexsan has been beefing up its marketing of late, and I often see them in mailing lists and ads on some websites I went to.

But the one to watch will be Huawei. This is a company with deep pockets, hiring the best in the storage industry and also has a very strong domestic market in China. In the next 2-3 years, Huawei could emerge as a strong contender to the big boys. So watch out!

Gartner Magic Quadrant is indeed weaving its magic and this time around the magic is good to HP.

Data Deduplication – Dell is first and last

A very interesting report surfaced in front of me today. It is Information Week’s IT Pro ranking of Data Deduplication vendors, just made available a few weeks ago, and it is the overview of the dedupe market so far.

It surveyed over 400 IT professionals from various industries with companies ranging from less than 50 employees to over 10,000 employees and revenues of less than USD5 million to USD1 billion. Overall, it had a good mix of respondents. But the results were quite interesting.

It surveyed 2 segments

  1. Overall performance – product reliability, product performance, acquisition costs, operations costs etc.
  2. Technical features – replication, VTL, encryption, iSCSI and FCoE support etc.
When I saw the results (shown below), surprise, surprise! Here’s the overall performance survey chart:
Dell/Compellent scored the highest in this survey while EMC/Data Domain ranked the lowest. However, the difference between the first place and the last place vendor is only 4%, and this is to suggest that EMC/Data Domain was about just as good as the Dell/Compellent solution, but it scored poorly in the areas that matters most to the customer. In fact, as we drill down into the requirements of the overall performance one-by-one, as shown below,
there is little difference among the 7 vendors.
However, when it comes to Technical Features, Dell/Compellent is ranked last, the complete opposite. As you can see from the survey chart below, IBM ProtecTier, NetApp and HP are all ranked #1.
The details, as per the technical requirements of the customers, are shown below:
These figures show that the competition between the vendors is very, very stiff, with little edge difference from one to another. But what I was more interested were the following findings, because these figures tell a story.
In the survey, only 34% of the respondents say they have implemented some data deduplication solutions, while the rest are evaluating and plan to evaluation. This means that the overall market is not saturated and there is still a window of opportunity for the vendors. However, the speed of the a maturing data deduplication market, from early adopters perhaps 4-5 years ago to overall market adoption, surprised many, because the storage industry tend to be a bit less trendy than most areas of IT. With the way the rate of data deduplication is going, it will be very much a standard feature of all storage vendors in the very near future.
The second figures that is probably not-so-surprising is, for most of the customers who have already implemented the data deduplication solution, almost 99% are satisfied or somewhat satisfied with their solutions. Therefore, the likelihood of these customer switching vendors and replacing their gear is very low, perhaps partly because of the reliability of the solution as well as those products performing as they should.
The Information Week’s IT Pro survey probably reflected well of where the deduplication market is going and there isn’t much difference in terms of technical and technology features from vendor to vendor. Customer will have to choose beyond the usual technology pitch, and look for other (and perhaps more important) subtleties such as customer service, price and flexibility of doing business with. EMC/Data Domain, being king-of-the-hill, has not been the best of vendor when it comes to price, quality of post-sales support and service innovation. Let’s hope they are not like the EMC sales folks of the past, carrying the “Take it or leave it” tag when they develop their relationship with their future customers. And it will not help if word-of-mouth goes around the industry about EMC’s arrogance of their dominance. It may not be true, and let’s hope it is not true because the EMC of today has changed plenty compared to the Symmetrix days. EMC/Data Domain is now part of their Backup Recovery Service (BRS) team, and I have good friends there at EMC Malaysia and Singapore. They are good guys but remember guys, customer is still king!
Dell, new with their acquisition of Compellent and Ocarina Networks, seems very eager to win the business and kudos to them as well. In fact, I heard from a little birdie that Dell is “giving away” several units of Compellents to selected customers in Malaysia. I did not and cannot ascertain if this is true or not but if it is, that’s what I call thinking-out-of-the-box, given Dell as a late comer into the storage game. Well done!
One thing to note is that the survey took in 17 vendors, including Exagrid, Falconstor, Quantum, Sepaton and so on, but only the top-7 shown in the charts qualified.
In the end, I believe the deduplication vendors had better scramble to grab as much as they can in the coming months, because this market will be going, going, gone pretty soon with nothing much to grab after that, unless there is a disruptive innovation to the deduplication technology

Ocarina rising

After more than a year since Dell acquired Ocarina Networks, it has finally surfaced last week in the form of Dell DX Object Storage 6000G SCN (Storage Compression Node).

Ocarina is a content-aware storage optimization engine, and their solution is one of the best I have seen out there. Its unique ECOsystem technology, as described in the diagram below, is impressive.

Unlike most deduplication and compression solutions out there, Ocarina Networks solution takes storage optimization a step further.  Ocarina works at the file level and given the rise and crazy, crazy growth of unstructured files in the NAS space, the web and the clouds, storage optimization is one priority that has to be addressed immediately. It takes a 3-step process – Extract, Correlate and Optimize.

Today’s files are no longer a flat structure of a single object but more of a compounded file where many objects are amalgamated from different sources. Microsoft Office is a perfect example of this. An Excel file would consists of objects from Windows Metafile Formats, XML objects, OLE (Object Linking and Embedding) Compound Storage Objects and so on. (Note: That’s just Microsoft way of retaining monopolistic control). Similarly, a web page is a compound of XML, HTML, Flash, ASP, PHP object codes.

In Step 1, the technology takes files and breaks it down to its basic components. It is kind of like breaking apart every part of a car down to its nuts and bolt and layout every bit on the gravel porch. That is the “Extraction” process and it decodes each file to get the fundamental components of the files.

Once the compounded file object is “extracted”, identified and indexed, each fundamental object is Correlated in Step 2. The correlation is executed with the file and across files under the purview of Ocarina. Matching and duplicated objects are flagged and deduplicated. The deduplication is done at the byte-level, unlike most deduplication solutions that operate at the block-level. This deeper and more granular approach further reduces the capacity of the storage required, making Ocarina one of the most efficient storage optimization solutions currently available. That is why Ocarina can efficiently reduce the size of even zipped and highly encoded files.

It takes this storage optimization even further in Step 3. It applies content-aware compactors for each fundamental object type, uniquely compressing each object further. That means that there are specialized compactors for PDF objects, ZIP objects and so on. They even have compactors for Oil & Gas seismic files. At the time I was exposed to Ocarina Networks and evaluating it, it had about 600+ unique compactors.

After Dell bought Ocarina in July 2010, the whole Ocarina went into a stealth mode. Many already predicted that the Ocarina technology would be integrated and embedded into Dell’s primary storage solutions of Compellent and EqualLogic. It is not there yet, but will likely be soon.

Meanwhile, the first glimpse of Ocarina will be integrated as a gateway solution to Dell DX6000 Object Storage. DX Object Storage is a technology which Dell has OEMed from Caringo. DX6000 Object Storage (I did not read in depth) has the concept of the old EMC Centera, but with a much newer, and more approach based on XML and HTTP REST. It has published an open API and Dell is getting ISV partners to develop their applications to interact with the DX6000 including Commvault, EMC, Symantec, StoredIQ are some of the ISV partners working closely with Dell.

(24/10/2011: Editor note: Previously I associated Dell DX6000 Object Storage with Exanet. I was wrong and I would like to thank Jim Dtuton of Caringo for pointing out my mistake)

Ocarina’s first mission is to reduce the big, big capacities in Big Data space of the DX6000 Object Storage, and the Ocarina ECOsystem technology looks a good bet for Dell as a key technology differentiator.

Mr. Black divorces Miss Purple

The writing’s on the wall and the relationship has been on the rocks since Mr. Black decided to take on 2 new wives (one in 2007 and one in 2010) and Miss Purple had a good run when things were hot.

Why Black and Purple? For a while within the local circle of EMC Malaysia, Dell’s EMC CLARiiONs were known as “Black” while EMC’s own CLARiiON was “Purple”. They were the colours of the bezels of each respective storage box. And the relationship, which Dell signed with EMC in 2003, was supposed to last 10 years but today, Dell has decided to end that relationship 2 years early. Here is one of the news at eWeek.com.

The “divorce” was inevitable. Gaps started showing up in the relationship when Dell acquired EqualLogic in 2007 and this relationship went to a point of no return when Dell started pursuing 3PAR back in 2010. Dell eventually lost 3PAR to HP and got Compellent instead. It was bound to happen, sooner or later.

Storage is becoming a very important strategy for Dell. As server virtualization grows, the demand for Dell servers wanes but storage demand kept growing. That is why it makes sense for Dell to have their own storage techonology. In addition to Compellent and EqualLogic, Dell has also acquired Exanet and Ocarina Networks in 2010.

It has been a good run for both companies, especially EMC, who was able to make use of Dell’s aggressive sales force to increase their market penetration for CLARiiON. And given the market dynamics, it is crucial that a company like Dell, with little innovation in the past, change their approach of reselling other people’s products and start owning and developing their own technology.

Storage Tiering – Responsible and Prudent

Does your IT have bottomless budget? If not, storage tiering is likely to be considered as one of IT’s weapons to combat the ever growing need for storage capacity.

Storage tiering is not new and in the past, features such as HSM (Hierarchical Storage Management) and ILM (Information Lifecycle Management) addresses storage tiering in different capacities, ranging for simple aging files movement and migration, to data objects being moved within the data infrastructure of an organization with some kind of workflow and searching capabilities.

Lately, storage tiering, and especially automated storage tiering, has been gaining prominence, thanks to the 2 high profile acquisitions – HP 3PAR and Dell Compellent. According to Wikibon,

Tiered storage is a system of assigning applications to different 
types of storage media based on application requirements. Factors 
considered in the allocation of storage type include the level of 
protection needed, performance requirements, speed of recovery, 
and many other considerations.Since assigning application data to 
specific media may be complex, some vendors provide software for 
automatically managing the process.

For the sake of simplicity, this blog talks about automated storage tiering within the storage array itself, where different data blocks are moved within several tiers to achieve just-right storage provisioning. Why do we need to achieve this “just-right provisioning”? Rather than discussing this from an IT, technical angle, the just-right storage provisioning should be addressed from a business and operational angle, and more rightly so, costs and benefits.

Business and operations are about managing costs and increasing profits. In the past, many storage administrators employ a single storage tier architecture. Using the same type of disks, for example, 146G 10,000RPM Fibre Channel disks, there was usually 1 or 2 RAID levels for the entire data storage requirement. Usually RAID 1+0 volumes/LUNs are for the applications that require the highest performance and availability but they come with a big cost. So, the rest of the data are kept in RAID-5 volumes/LUNs. The introduction of enterprise SATA hard disk drives basically changed the rules of the ball game, giving storage administrators another option, a cheaper alternative to store their data. Obviously, storage vendors saw the great need to address this requirement, and hence created automated storage tiering as part of their offerings.

There are quite a few storage solutions that offers the storage tiering feature, and most of them are automated as well, meaning that the data blocks are moved between the different tiers of storage within the array itself automatically. 3PAR, long before they were acquired by HP, had their Dynamic Autonomic Tiering. Today, with HP, 3PAR offers 2 key strengths in their Autonomic Tiering offering.

  • Adaptive Optimization
  • Dynamic Optimization

As HP puts it,

Not to be outdone, Compellent (also long before its acquisition by Dell) had the Data Progression feature as part of the Automated Storage Tiering offering. In a nutshell, their solution (which is basically similar from a 10,000 feet view with most of the competitors) is shown below.

The idea is to put the most frequently accessed data blocks to the most expensive, fastest, storage tier and then dynamically move the lesser accessed data block to the least expensive, most economical tier.

I have had the privilege to learn more about Compellent (before Dell) technology about 2.5 years ago, thanks to my friends Chyr and Winston, the bosses at Impact Business Solutions. And what Compellent has was pretty cool stuff and I would like to share what I have picked up about Dell Compellent storage solution. But some of the information could be a little out of date.

The foundation of Dell Compellent automated storage tiering feature, called Data Progression, is their Dynamic Block Architecture (as shown below)

From a high level, all data blocks are bunched together into a logical data structure called a page. A page is by default 2MB but can be configured between 512KB and 4MB. The page is the granular unit required to initiate and implement the Data Progression feature in Compellent’s automated storage tiering solution. Every page comes with attached metadata about the page such as

  • When was this page created
  • When was this page last accessed
  • Which RAID level is it currently in (RAID 1+0, RAID-59, RAID-55 and so on)
  • Which Tier does it currently reside (Tier 1, 2 or 3)
  • Which kind of disk track does it live in (Fast or Standard)

Meanwhile, there are different storage Tiers and notably, Tier 1, 2 or 3 where different disk profiles reside. Typically, the SSDs or the 15K RPM disk drives will be in Tier 1, the 10K RPM disk drives will be in Tier 2 and the slowest 7200 RPM disks will be in Tier 3.  Each of the 3 tiers are further divided into the outer Fast disk cylinders (where the platters spin the fastest) and the Standard disk cylinders (running in the inner tracks and slower).

As data chunks or blocks are accessed, their frequency of access and their data movement statistics are gathered in real-time, giving the Compellent solution a fairly good intelligence of how the pages should be laid out on the most relevant tiers. As the pages become more stale, and less relevant, the pages of data chunks are progressively relegated to the lower tiers, while the more active, and most relevant pages relative to importance of access, is progressively promoted to the higher tiers.

Different policies can also be configured to ensure that some important pages stay where they are regardless of their frequency of access or their relevance.

There is a very nice whitepaper from Dell detailing their Data Progression technology.

Another big automated storage tiering player is HP 3PAR. I admit that I don’t know the inner details of the HP 3PAR Dynamic Tiering solution, though I had some glossy lessons from a 3PAR Systems Engineer called Nathan Boeger (thanks to my friends at PTC Singapore, the 3PAR distributor back then) about the same time I learned about Dell Compellent. I hope HP can offer to introduce more in depth of how the 3PAR technology works, now that I have gotten cosy with some of the HP Malaysia’s folks.

Similarly, the other big boys are offering the automated storage tiering solution as well. IBM has been offering Easy Tier for almost 18 months and EMC has its FAST2 for about the same time.

Funnily, the odd one out in this automated storage tiering game is NetApp. I was in a partner conference call about 1 year ago and there were questions asking NetApp about their views of automated storage tiering. At that time of the concall, NetApp did not believe in automated storage tiering, preferring to market their FlashCache PCIe (previously called the PAM card) solution. Take note that the FlashCache is a Read-Only “extension” to their NVRAM, and used to accelerate read operations of WAFL. And also take note that NetApp, at the time of writing, does not have an “engine” that performs automated storage tiering, regardless of how they spin it.

There are also host-based file tiering solutions as well.Since I am familiar with the NetApp universe, Arkivio and Enigma Data Solutions are 2 of the main partners that NetApp works with. Recently NetApp also resells StorNext from Quantum. But note that these host-based solutions are file-based, making them less granular, less dynamic and less efficient. They are usually marketed as file archiving solutions, and the host-based license are usually charged by per TB. In large enterprises, this might make sense but for the everyday Joes (with tight IT budgets), host-based file archiving solutions are expensive. And it is nowhere close to the efficiencies of automated storage tiering.

Overall, automated storage tiering, when applied, should help the IT operations and the organization’s business reduce costs. There is no longer a one-size-fit-all model and associating the right storage tier to the relevance and importance of the data at a very granular sub-LUN/sub-volume level will help any organization define a more prudent approach in managing their data actively and more importantly their cost of operations.

This is called Responsible IT. 😀