Daily Archives: October 4, 2011

RedHat to acquire Gluster

This is breaking news. RedHat is to acquire Gluster!

What is Gluster? Gluster is a clustering Linux distribution started by Z Research under the direction of Anand Babu (who is currently Gluster’s CEO) aiming to commoditize supercomputing and supercomputing clustered storage. Gluster is open source but there is a commercial version as well. It runs on commodity 64-bit x86 hardware. The Gluster File System (GlusterFS) aggregates disks and memory resources into a pool of storage thru a single global namespace and accessed through multiple file-level protocols. The scale-out architecture is where storage resources can be added as a storage node in a building block fashion to meet performance and capacity demands, rather like what HP P4000 is doing to the block-level environment for SAN.

Gluster can integrated with most 64-bit Linux distros. This is done at the Linux user space but it can also be crafted at the Linux kernel space, where it is a software appliance, easily integrated into off-the-shelf 64-bit x86-64 platforms. This means that you can build a scale-out NAS pretty easily using your own hardware.

From an architecture standpoint, GlusterFS and its integration to a storage appliance looks like this:

Because it works in a modular add-on fashion, this architecture is distribution and extended by replicating the same architecture across additional x86-64 platforms (which is a storage node) as shown below.

It’s really easy to install Gluster and build the Scale Out NAS. I have been saving a couple videos about how Gluster is installed and I must say that it’s pretty easy. In less than 30 minutes, you can install your first Gluster storage node and then add additional nodes on the fly.

Enjoy the videos.

Video #1 (Gluster Installation)

(I have difficulty uploading the videos because WordPress requires me to purchase one of their solutions)

Video #2 (Creating and adding Storage Node in Gluster)

(I have difficulty uploading the videos because WordPress requires me to purchase one of their solutions)

Note: If you are interested to see the videos, please email to me at chin-fah.heoh@storagenetworking-academy.com.

This news gets me very excited because this is the perfect endorsement of what I have been saying all along. Storage networking and data management are the foundations of CLOUD and VIRTUALIZATION. Without data being stored and managed well, everything falls apart. And as I have mentioned many times before, this is a fantastic time to become an extra-ordinary storage engineer/consultant/architect/sales (maybe not!)


HP P4000 – Pretty impressive

After being in the storage networking industry for so long, I have seen most of the new storage solutions out there. Most of them don’t really differ much from what already out there, and it gets a little boring. But once in a while, a little gem is unearthed and my excitement bubbles up again.

Today, I was at the HP P4000 G2 SAN workshop and the LeftHand Networks SAN/iQ storage solution which HP acquired in 2008 left me with 3 words – Interesting, Innovative and Impressive – from a technology standpoint.

I must admit that this is a little gem that got past my radar and now it’s HP’s gain. I have heard about LeftHand Networks in the past, and at the same time, I was also looking at another storage solution called Intransa. Unfortunately, Intransa went on to differentiate themselves and today, they are focused more as a storage solution for videos and CCTVs, seldom surfacing with innovative technology. LeftHand Networks was and is different and I can understand why HP bought them, because the technology that they bring with them to HP is really cool!

Now rebranded and renamed as HP P4000 G2 SAN, the storage solution no longer sits on proprietary hardware. As part of HP’s Converged Infrastructure strategy, the SAN/iQ has been fully integrated into the HP Proliant x86 platform (I heard there’s a blade version as well), making it simple to procure and probably helps simplify operational resource planning and logistics as well. At the same, there is also a P4000 VSA (Virtual Storage Appliance) as well, which HP guys have been using for demo for several years now. There is a 60-day trial available at the HP P4000 VSA Download site, for organizations to have a try-and-buy and if they do, they can turn some of their old x86 platforms into a storage appliance by just adding more hard disk drives. That’s saves money too!

So, what’s cool, you say?

2 key technologies stands out

  • Storage Clustering
  • Network RAID

As I was well informed at the workshop today, the Storage Clustering technology is not exclusive to the P4000. In fact, Dell EqualLogic employs something similar as well. But it was something that impressed me and it is different from the traditional storage SANs that we usually see.

You see, in the traditional SAN setup, the LUNs or volumes are either loosely or tightly linked to 2 active/active storage processors/controllers. And the way most of the storage vendors do, when a customer runs out of capacity or performance or both, they would have to do a forklift upgrade of the controllers. This is something that is disruptive and also does not allow CPU, memory or I/O channels upgrade to the existing controller. Today, most storage vendors do not allow you to break open the storage processor chassis and change the CPU, add more RAM or add more I/O paths to support more disk drives or increase throughput. Mind you, this is something that I have been questioning for a long time but as the storage networking industry has it, you got to upgrade the entire storage processor or controller in order to get more power and capacity.

The P4000 (as well as the Dell EqualLogic) approaches this from another angle where instead of doing a forklift upgrade of the storage processor/controller, just add another node of the same CPU and RAM profile, and have the P4000 SAN/iQ software group the new node together with the existing node(s) to form a storage cluster group. As best practice, the Storage Cluster feature should have 16 nodes or less, but in one of the war stories shared, one customer in the US actually had 32 nodes in a Storage Cluster group, for storage capacity reasons.

As more nodes are added to the Storage Cluster group, the LUNs/volumes can be extended or spanned to the other nodes as long as they are physically connected in a Gigabit network and the entire LUN or volume is been seen as ONEĀ  irregardless of which physical nodes it may be sitting. Typically you will see this sort of thing of single “Global Namespace” concept at the file system level but this is the first time I have seen it implemented at the SAN level. (Ok, I have to admit that I am a little behind times with this technology)

Here’s a little diagram I dug up from LeftHand before it was acquired by HP which I hope will enlightened the readers about this Storage Cluster feature.

But the best is yet to come as the HP Solution Architect (Timothy Chua) mentioned that the Network RAID feature was uniquely LeftHand’s and way cooler. And I couldn’t agree more because this lighted me up like a spark plug!

Since Storage Clustering could span LUNs/volumes across nodes, it was only natural that the RAID capability be extended across nodes as well. RAID-10, RAID-5, RAID-6 could all be spanned across all nodes, spread the data blocks and its mirrored/parity data blocks across the nodes in the network. And the nodes does not have to at a single site. With Gigabit networks, the nodes can be separated into multiple sites as well, giving the entire solution quite a comprehensive campus-wide storage high availability. And since this is Network RAID, it gives an entirely new meaning to the word Disaster Recovery because this will eliminate the need for data replication. Primary data in a Network RAID-10 in Node 1/Site 2 could be mirrored in Node 2/Site 2, which can be further mirrored to Node 3/Site 3 and Node 4/Site 4 for a 4-way mirror. This is the P4000 Multi-site SAN solution.

The diagram below shows how Network RAID is implemented with VMware ESX.

And since replication is no longer a requirement, VMware’s SRM (Site Recovery Manager) is also not required as well.

It is no surprise that synchronous replication in the P4000 solution is equivalent to Network RAID. Though the concept of separating the storage controllers/nodes into multiple sites for true long-distance mirroring exists, they usually don’t exist at this level. NetApp has their Fabric and Stretch MetroCluster and EMC has their VPlex, but they usually are proposed at the higher end of the spectrum. Looks to me that HP P4000 is the only one that has this concept at the entry level iSCSI SAN level. Kudos!

They have an asynchronous replication as well for longer distance networks.

I did not stay for the demo today but I am already tickled pink about the HP P4000 technology. It had a good impression on me and I can’t wait to know more of how it works internally. Looking forward to a deeper dive of the P4000 and hope to stay for the demo next time.