The next all-Flash product in my review list is SolidFire. Immediately, the niche that SolidFire is trying to carve out is obvious. It’s not for regular commercial customers. It is meant for Cloud Service Providers, because the features and the technology that they have innovated are quite cloud-intended.
Are they solid (pun intended)? Well, if they have managed to secure a Series B funding of USD$25 million (total of USD$37 million overall) from VCs such as NEA and Valhalla, and also angel investors such as Frank Slootman (ex-Data Domain CEO) and Greg Papadopoulus(ex-Sun Microsystems CTO), then obviously there is something more than meets the eye.
The one thing I got while looking up SolidFire is there is probably a lot of technology and innovation behind their Nodes and their Element OS. They hold their cards very, very close to their chest, and I couldn’t not get much good technology related information from their website or in Google. But here’s a look of how the SolidFire is like:
The SolidFire only has one product model, and that is the 1U SF3010. The SF3010 has 10 x 2.5″ 300GB SSDs giving it a raw total of 3TB per 1U. The minimum configuration is 3 nodes, and it scales to 100 nodes. The reason for starting with 3 nodes is of course, for redundancy. Each SF3010 node has 8GB NVRAM and 72GB RAM and sports 2 x 10GbE ports for iSCSI connectivity, especially when the core engineering talents were from LeftHand Networks. LeftHand Networks product is now HP P4000. There is no Fibre Channel or NAS front end to the applications.
Each node runs 2 x Intel Xeon 2.4GHz 6-core CPUs. The 1U height is important to the cloud provider, as the price of floor space is an important consideration.
Aside from the SF3010 storage nodes, the other important ingredient is their SolidFire Element OS.
Cloud storage needs to be available. The SolidFire Helix Self-Healing data protection is a feature that is capable of handling multiple concurrent failures across all levels of their storage. Data blocks are replicated randomly but intelligently across all storage nodes to ensure that the failure or disruption of access to a particular data block is circumvented with another copy of the data block somewhere else within the cluster. The idea is not new, but effective because solutions such as EMC Centera and IBM XIV employ this idea in their data availability. But still, the ability for self-healing ensures a very highly available storage where data is always available.
To address the efficiency of storage, having 3TB raw in the SF3010 is definitely not sufficient. Therefore, the Element OS always have thin provision, real-time compression and in-line deduplication turned on. These features cannot be turned off and operate at a fine-grained 4K blocks. Also important is the intelligence to reclaim of zeroed blocks, no-reservation, and no data movement in these innovations. This means that there will be no I/O impact, as claimed by SolidFire.
But the one feature that differentiates SolidFire when targeting storage for Cloud Service Providers is their guaranteed volume level Quality of Service (QOS). This is important and SolidFire has positioned their QOS settings into an advantage. As best practice, Cloud Service Providers should always leverage the QOS functionality to improve their storage utilization
The QOS has:
- Minimum IOPS – Lower IOPS means lower performance priority (makes good sense)
- Maximum IOPS
- Burst IOPS – for those performance spikes moments
- Maximum and Burst MB/sec
Together with the Storage Nodes and the Element OS, the whole package is aimed towards a more significant storage platform for Cloud Service Providers(CSPs). Storage has always been a tricky component in Cloud Computing (despite what all the storage vendors might claim), but SolidFire touts that their solution focuses on what matters most for CSPs.
CSPs would want to maximize their investment without losing their edge in the cloud offerings to their customers. SolidFire lists their benefits in these 3 areas:
The edge in cloud storage is definitely solid for SolidFire. Their ability to leverage on their position and steering away from other all-Flash vendors’ battlezone could all make sense, as they aim to gain market share in the Cloud Service Provider space. I only wish they can share more about their technology online.
Fortunately, I found a video by SolidFire’s CEO, Dave Wright which gives a great insight about SolidFire’s technology. Have a look (it’s almost 2 hour long):
[2 hours later]: Phew, I just finished the video above and the technology is solid. Just to summarize,
- No RAID (which is a Godsend for service providers)
- Aiming for USD5.00 or less per Gigabyte (a good number!)
- General availability in Q1 2012
Lots of confidence about the superiority of their technology, as portrayed by their CEO, Dave Wright.
Solid? Yes, Solid!
I like the way Amazon is building their Cloud Computing services. Amazon Web Services (AWS) is certainly on track to become the most powerful Cloud Computing company in the world. In fact, AWS might already is. But they are certainly not resting on their laurels when they launched 2 new services in as many weeks – Amazon DynamoDB (last week) and Amazon Storage Gateway (this week).
I am particularly interested in the Amazon Storage Gateway, because it is addressing one of the biggest fears of Cloud Computing head-on. A lot of large corporations are still adamant to keep their data on-premise where it is private and secure. Many large corporations are still very skeptical about it even though Cloud Computing is changing the IT landscape in a massive way. The barrier to entry for large corporations is not something easy, but Amazon is adapting to get more IT divisions and departments to try out Cloud Computing in a less disruptive way.
The new service, is really about data storage and data backup for large corporations. This is important because large corporations have plenty of requirements for data storage and data to be backed up. And as we know, a large portion of the data stored does not need to be transactional or to be accessed frequently. This set of data is usually less frequently used, for archiving or regulatory compliance reasons, particular in the banking and healthcare industry.
In the data backup operations, the reason data is backed up is to provide a data recovery mechanism when a disaster strikes. Large corporations back up tons of data every day, weeks or month and this data only has value when there is a situation that requires data relevance, data immediacy or data recovery. Otherwise, it is just plenty of data taking up storage space, be it on disk or on tape.
Both data storage and data backup cost a lot of money, both CAPEX and OPEX. In CAPEX, you are constantly pressured to buy more storage to store the ever growing data. This leads to greater management and administration costs, both contributing heavily into OPEX costs. And I have not included the OPEX costs of floor space, power and cooling, people (training, salary, time and so on) typically adding up to 3-5x the operations costs relative to the capital investments. Such a model of IT operations related to storage cannot continue forever, and storage in the Cloud offers an alternative.
These 2 scenarios – data storage and data backup – are exactly the type of market AWS is targeting. In order to simplify and pacify large corporations, AWS introduced the Amazon Storage Gateway, that eases the large corporations to take some of their IT storage operations to the Cloud in the form of Amazon S3.
The video below shows the Amazon Storage Gateway:
The Amazon Storage Gateway is a piece of software “appliance” that is installed on-premise in the large corporation’s data center. It seamlessly integrates into the LAN and provides a SSL (Secure Socket Layer) connection to the Amazon S3. The data being transferred to the S3 is also encrypted with AES (Advanced Encryption Standard) 256-bit. Both SSL and AES-256 can give customers a sense of security and AWS claims that the implementation meets the data storage and data recovery standards used in the banking and healthcare industries.
The data storage and backup service regularly protects the customer’s data in snapshots, and giving the customer a rapid recovery platform should the customer experienced on-premise data corruption or data disruption. At the same time, the snapshot copies in the Amazon S3 can also be uploaded into Amazon EBS (Elastic Block Store) and testing or development environments can be evaluated and testing with Amazon EC2 (Elastic Compute Cloud). The simplicity of sharing and combining different Amazon services will no doubt, give customers a peace of mind, easing their adoption of Cloud Computing with AWS.
This new service starts with a 60-day free trial and moving on to a USD$125.00 (about Malaysian Ringgit $400.00) per gateway per month subscription fee. The data storage (inclusive of the backup service), costs only 14 cents per gigabyte per month. For 1TB of data, that is approximately MYR$450 per month. Therefore, minus the initial setup costs, that comes to a total of MYR$850 per month, slightly over MYR$10,000 per year.
At this point, I like to relate an experience I had a year ago when implementing a so-called private cloud for an oil-and-gas customers in KL. They were using the HP EVS (Electronic Vaulting Service) to an undisclosed HP data center hosting site in the Klang Valley. The HP EVS, which was an OEM of Asigra, was not an easy solution to implement but what was more perplexing was the fact that the customer had a poor understanding of what would be the objectives and their 5-year plan in keeping with the data protected.
When the first 3-4TB data storage and backup were almost used up, the customer asked for a quotation for an additional 1TB of the EVS solution. The subscription for 1TB was MYR$70,000 per year. That is 7x time more than the AWS MYR$10,000 per year cost! I have to salute the HP sales rep. It must have been a damn good convincing sell!
In the long run, the customer could be better off running their storage and backup on-premise with their HP EVA4400 and adding an additional of 1TB (and hiring another IT administrator) would have cost a whole lot less.
Amazon Web Services has already operating in Singapore for the past 2 years, and I am sure they are eyeing Malaysia as their regional market. Unless and until Malaysian companies offering Cloud Services know to use economies-of-scale to capitalize the Cloud Computing market, AWS is always going to be a big threat to CSP companies in Malaysia and a boon of any companies seeking cloud computing services anywhere in the world.
I urge customers in Malaysia to start questioning their so-called Cloud Service Providers if they can do what AWS is doing. I have low confidence of what the most local “cloud computing” companies can deliver right now. I hope they stop window dressing their service offerings and start giving real cloud computing services to customers. And for customers, you must continue to research and find out more which cloud services meet your business objectives. Don’t be flashed by the fancy jargons or technical idealism thrown at you. Always, always find out more because your business cost is at stake. Don’t be like the customer who paid MYR$70,000 for 1TB per year.
AWS is always innovating and the Amazon Storage Gateway is just another easy-to-adopt step in their quest for world domination.
I was in Singapore last week attending the Cloud Infrastructure Services course.
In the class, one of the foundation components of Cloud Computing is of course, storage. As the students and the instructor talked about Storage, one very interesting argument surfaced. It revolved around the storage, if it was offered on the cloud. A lot of people assumed that Cloud Storage would be for their databases, and their virtual machines, which of course, is true when the communication between the applications, virtual machines and databases are in the local area network of the Cloud Service Provider (CSP).
However, if the storage is offered through the cloud to applications that are sitting on-premise in the customer’s server room, then we have to think twice of how we perceive Cloud Storage. In this aspect, the Cloud Storage offered by the CSP is a Infrastructure-as-a-Service (IaaS), where the key service is Storage. We have to differentiate that this Storage functions as a data container, and usually not for I/O performance reasons.
Though this concept probably will be easily understood by storage professionals like us, this can cause a bit confusion for someone new to the concept of Cloud Computing and Cloud Storage. This confusion, unfortunately, is caused by many of us who are vendors or solution providers, or even publications and magazines. We are responsible to disseminate correct information to customers, but due to our lack of knowledge and experience in this extremely new market of Cloud Storage, we have created the FUDs (Fear, Uncertainty and Doubt) and hype.
Therefore, it is the duty of this blogger to clear the vapourware, and hopefully pass on the right information to accelerate the adoption of Cloud Storage in the near future. At this moment, given the various factors such as network costs, high network latency and lack of key network technologies similar to LAN in Cloud Computing, Cloud Storage is, most of the time, for data storage containership and archiving only. And there are no IOPS or any performance related statistics related to Cloud Storage. If any engineer or vendor tells you that they have the fastest Cloud Storage in the industry, do me a favour. Give him/her a knock on the head for me!
Of course, as technologies evolve, this could change in the near future. For now, Cloud Storage is a container, NOT a high performance storage in the cloud. It is usually not meant for transactional data. There are many vendors in the Cloud Storage space from real CSPs to storage companies offering re-packaged storage boxes that are “cloud-ready”. A good example of a CSP offering Cloud Storage is Amazon S3 (Simple Storage Service). And storage vendors such as EMC and HDS are repackaging and rebranding their storage technologies as object storage, ready for the cloud. EMC Atmos is really a repackaged and rebranded Centera, with some slight modifications, while HDS , using their Archiving solution, has HCP (aka HCAP). There’s nothing wrong with what EMC and HDS have done, but before the overhyping of the world of Cloud Computing, these platforms were meant for immutable data archiving reasons. Just thought you should know.
One particular company that captured my imagination and addresses the storage performance portion is Nasuni. Of course, they are quite inventive with the Cloud Storage Gateway approach. Nasuni comes up with a Cloud Storage Gateway filer appliance, which can be either a physical 1U server or as a VMware or Hyper-V virtual appliance sitting on-premise at the customer’s site.
The key to this is “on-premise”, which allows access to data much faster because they are locally-cached in the Nasuni filer appliance itself. This Nasuni filer piece addresses the Cloud Storage “performance” piece but Nasuni do not claim any performance statistics with such implementation. The clever bit is that this addresses data or files that are transactional in nature, i.e. NFS or CIFS, to serve data or files “locally”. (I wonder if Nasuni filer has iSCSI as well. Hmmmm….)
In the Nasuni architecture, they “break up” their “Cloud Storage” into 2 pieces. Piece #1 sits on-premise, at the customer site, and acts as a bridge to the Piece #2, that is sitting in a Cloud Storage. From a simplified view, have a look at the diagram below:
Piece #1 is the component that handles some of the transactional traffic related to files. In a more technical diagram below, you can see that the Nasuni filer addresses the file sharing portion, using the local disks on the filer appliance as a local caching mechanism.
Furthermore, older file pieces are whiffed away to the any Cloud Storage using the Cloud Connector interface, hence giving the customer a sense that their storage capacity needs can be limitless if they want to (for a fee, of course). At the same time, the Nasuni filer support thin provisioning and snapshots. How cool is that!
The Cloud Storage piece (Piece #2) is used for the data container and archiving reasons. This component can be sitting and hosted at Amazon S3, Microsoft Azure, Rackspace Cloud Files, Nirvanix Storage Delivery Network and Iron Mountain Archive Services Platform.
The data communication and transfer between the Nasuni filer is secure, encrypted, deduplication and compressed, giving it the efficiency and security that most customers would be concerned about. The diagram below explains the dat communication and data transfer bit.
In this manner, the Nasuni filer can replace traditional NAS platforms and can potentially provide a much lower total cost of ownership (TCO) in the long run. Nasuni does not pretend to be a NAS replacement. To me, this concept is very inventive and could potentially change the way we perceive file sharing and file server, obscuring and blurring concept of NAS.
Again, I would like to reiterate that Nasuni does not attempt to say their solution is a NAS or a performance-based Cloud Storage but what they have cleverly packaged seems to be appealing to customers. Their customer base has grown 78% in Q2 of 2011. It’s just too bad they are not here in Malaysia or this part of the world (yet).
IOPS in Cloud Storage? Not yet.