Category Archives: EMC
Unless you are working with highly, parallelized access to files in a large scale-out NAS environment, you probably don’t get to work much with Parallel NFS (pNFS). pNFS is part of the NFSv4.1 (RFC 5661) standard, and was introduced in January 2010 to address NFS protocol in the clustered, scale-out NAS environment. It is to provide parallel file access across distributed servers.
pNFS is heavily driven by Panasas, NetApp, EMC, IBM, Sun (now Oracle) among others. And funnily enough, the company that sticks out from the bunch is one that used to tout block storage as the way to go, not files. That’s EMC, the company that more well known for its SAN solutions than its NAS (remember Celerra and IP4700?). And EMC has embraced pNFS in a big, big way. Read EMC’s CTO for Global Marketing, Chuck Hollis’ blog here and here.
However, unknown to many, a lot of the thinking that goes into pNFS are very similar to an EMC product some years ago. That product is EMC Highroad, which in the later years, renamed as Multi-path File System (MPFS).
Note: If you want to know more about the history of HighRoad/MPFS, read this blog.
The cornerstone of EMC MPFS is their File Mapping Protocol or FMP, which is a robust protocol that lines the mapping of files to their addressable blocks in storage. In a nutshell, when I was made responsible for this product during my time at EMC, I used to pitch to companies that MPFS was a file request is through NFS but respond to the requester can be in blocks (iSCSI or Fibre Channel). The beauty of this was, NFSv3 was chatty and heavy but the delivery of data through blocks via iSCSI or Fibre Channel has lesser overhead compared to NFSv3.
Hence the delivery is faster and EMC touted that the performance was 2-4x faster than NFS. Indeed, I have seen some lab tests results from EMC’s work with Schlumberger High Performance Lab in Houston, and the numbers were impressive. I still have them on Powerpoint somewhere.
In circa of 2003-2004, EMC donated the FMP code to IETF and as they say the rest was history.
The picture below basically summarizes what pNFS is all about.
A NFSv4 client will basically check with a Metadata server via the pNFS protocol about the layout of the distributed cluster of server. The file, could be striped across multiple nodes of the cluster, and it is the metadata server that provides a map to the client to access the blocks or files from these nodes. This is exactly what the EMC HighRoad/MPFS File Mapping Protocol (FMP) did, mapping the file requests to its respective corresponding blocks. See diagram below:
And one of the powerful feature of pNFS is that it is not just about NFS. The green arrow you see in the above diagram is the storage-access protocol. That access protocol can be NFSv4.1, CIFS, iSCSI, Fibre Channel, FCoE, Infiniband, and Object Storage Device (OSD).
In order to have pNFS working, the NFS client must be NFSv4.1 ready and that code has been made available in Linux and OpenSolaris. Other Unix vendors, no doubt, will be coming out with their NFSv4.1 implementation soon. Oooooh, there will be a Windows NFSv4.1 client coming as well!
But I want to dispel the notion that EMC is a SAN company. EMC is a very strong NAS company and if you have seen the IDC market share (ok, ok, many of you out there will argue about it), EMC is #1 in NAS. And their contribution to pNFS is immense.
A few days after I wrote about the performance benchmark bag of tricks, EMC was the first to fire the first salvo at NetApp’s SPECSfs2008 world records on NFS IOPS.
EMC is obviously using all its ammo to deflate NetApp chest thumping act, with Storagezilla‘s blog. Mark Twomey, who is the alter ego of Storagezilla posted several observations about NetApp apparent use of disk short stroking to artificially boost its performance numbers. This puts NetApp against the wall, with Alex MacDonald (who incidentally is SNIA NFSv4 co-chairman) of the office of the CTO responding hard to Storagezilla’s observation.
The news of this appeared in The Register. Read all about it.
With no letting up, the article also mentioned EMC Isilon’s CTO, Rob Pegler, adding more fuel to the fire.
I spoke about short stroking as some of the tricks used to gain better numbers in benchmark. And I also mentioned that these numbers have little use to the real work and I would like to add that these numbers are just there for marketing reasons. So, for you readers out there, benchmark is really not big of a deal.
Have a great weekend!
A very interesting report surfaced in front of me today. It is Information Week’s IT Pro ranking of Data Deduplication vendors, just made available a few weeks ago, and it is the overview of the dedupe market so far.
It surveyed over 400 IT professionals from various industries with companies ranging from less than 50 employees to over 10,000 employees and revenues of less than USD5 million to USD1 billion. Overall, it had a good mix of respondents. But the results were quite interesting.
It surveyed 2 segments
- Overall performance – product reliability, product performance, acquisition costs, operations costs etc.
- Technical features – replication, VTL, encryption, iSCSI and FCoE support etc.
I wanted to sign off early tonight but an article in ComputerWorld caught my tired eyes. It was titled “EMC to put hardware into servers, VMs into storage” and after I read it, I couldn’t help but to juxtapose the articles with what I said earlier in my blogs, here and here.
It is very interesting to note that “EMC runs vSphere directly on the storage controllers and then uses vMotion to migrate VMs from application servers onto the storage array, ..” since the storage boxes have enough compute power to run Virtual Machines on the storage. Traditionally and widely accepted, VMs should be running on servers. Contrary to beliefs, EMC has already demonstrated this running of VMs capability on their VNX, Isilon and Symmetrix.
And soon, with EMC’s Project Lightning (announced at EMC World in May 2011), they will be introducing server side PCIe-based SSDs, ala Fusion-IO. This is different from the NetApp PAM/FlashCache PCIe-based card, which sits on their arrays, not on hosts or servers. And it is also very interesting to note that this EMC server-side PCIe Flash SSD card will become a bridge to EMC’s FAST (Fully Automated Storage Tiering) architecture, enabling it to place hot, warm and cold data strategically on different storage tiers of the applications on VMware’s VMs (now on either the server or the storage), perhaps using vMotion as a data mover on top of the “specialized” link created by the server-side EMC PCIe card.
This also blurs the line between the servers and storage and creates a virtual architecture between servers and storage, because what used to be distinct data border of the servers is now being melded into the EMC storage array, virtually.
2 red alerts are flagging in my brain right now.
- The “bridge” has just linked the server back to the storage, after years of talking about networked storage. The server is ONE again with the storage. Doesn’t that look to you like a server with plenty of storage? It has come a full cycle. But more interesting and what I am eager to see is what more is this “bridge” capable of when it comes to data management. vMotion might be the first of many new “protocol” breeds to enhance data management and mobility with this “bridge”. I am salivating right now of this massive potential.
- What else can EMC do with the VMware API? This capability I am writing right now is made possible by EMC tweaking VMware’s API to maximize much, much more. As the VMware vStorage API is continually being enhanced, the potential is again, very massive and could change the entire landscape of cloud computing and subsequently, the entire IT landscape. This is another Pavlov’s dog moment (see figures below as part of my satirical joke on myself)
Sorry, the diagram below is not related to what my blog entry is. Just my way of describing myself right now.😉
I am extremely impressed with what EMC is doing. A lot of smarts and thinking go into this and this is definitely signs of things to come. The server and the storage are “merging again”. Think of it as Borg assimilation in Star Trek.
Resistance is futile!
The writing’s on the wall and the relationship has been on the rocks since Mr. Black decided to take on 2 new wives (one in 2007 and one in 2010) and Miss Purple had a good run when things were hot.
Why Black and Purple? For a while within the local circle of EMC Malaysia, Dell’s EMC CLARiiONs were known as “Black” while EMC’s own CLARiiON was “Purple”. They were the colours of the bezels of each respective storage box. And the relationship, which Dell signed with EMC in 2003, was supposed to last 10 years but today, Dell has decided to end that relationship 2 years early. Here is one of the news at eWeek.com.
The “divorce” was inevitable. Gaps started showing up in the relationship when Dell acquired EqualLogic in 2007 and this relationship went to a point of no return when Dell started pursuing 3PAR back in 2010. Dell eventually lost 3PAR to HP and got Compellent instead. It was bound to happen, sooner or later.
Storage is becoming a very important strategy for Dell. As server virtualization grows, the demand for Dell servers wanes but storage demand kept growing. That is why it makes sense for Dell to have their own storage techonology. In addition to Compellent and EqualLogic, Dell has also acquired Exanet and Ocarina Networks in 2010.
It has been a good run for both companies, especially EMC, who was able to make use of Dell’s aggressive sales force to increase their market penetration for CLARiiON. And given the market dynamics, it is crucial that a company like Dell, with little innovation in the past, change their approach of reselling other people’s products and start owning and developing their own technology.
News of Joe Tucci quitting EMC at the end of 2012 is abuzz tonight. Here’s one from The Register.
He is one of the longest serving CEO in the storage industry and since he took over the helm in 2001, he has brought EMC to where it is today. Like him or loathe him, you cannot deny that he is one of the best out there. Having gone thru at least 3 economic downturns, he has turned EMC into an industry giant, a juggernaut.
The next question is who will succeed him? There are many candidates from long-serving senior staff to the new ones that EMC has recruited in the recent years. It will only be end of 2012 when Joe finally leaves EMC but the search for his successor will be an interesting one. We shall soon know.