Daily Archives: October 28, 2011
I wanted to sign off early tonight but an article in ComputerWorld caught my tired eyes. It was titled “EMC to put hardware into servers, VMs into storage” and after I read it, I couldn’t help but to juxtapose the articles with what I said earlier in my blogs, here and here.
It is very interesting to note that “EMC runs vSphere directly on the storage controllers and then uses vMotion to migrate VMs from application servers onto the storage array, ..” since the storage boxes have enough compute power to run Virtual Machines on the storage. Traditionally and widely accepted, VMs should be running on servers. Contrary to beliefs, EMC has already demonstrated this running of VMs capability on their VNX, Isilon and Symmetrix.
And soon, with EMC’s Project Lightning (announced at EMC World in May 2011), they will be introducing server side PCIe-based SSDs, ala Fusion-IO. This is different from the NetApp PAM/FlashCache PCIe-based card, which sits on their arrays, not on hosts or servers. And it is also very interesting to note that this EMC server-side PCIe Flash SSD card will become a bridge to EMC’s FAST (Fully Automated Storage Tiering) architecture, enabling it to place hot, warm and cold data strategically on different storage tiers of the applications on VMware’s VMs (now on either the server or the storage), perhaps using vMotion as a data mover on top of the “specialized” link created by the server-side EMC PCIe card.
This also blurs the line between the servers and storage and creates a virtual architecture between servers and storage, because what used to be distinct data border of the servers is now being melded into the EMC storage array, virtually.
2 red alerts are flagging in my brain right now.
- The “bridge” has just linked the server back to the storage, after years of talking about networked storage. The server is ONE again with the storage. Doesn’t that look to you like a server with plenty of storage? It has come a full cycle. But more interesting and what I am eager to see is what more is this “bridge” capable of when it comes to data management. vMotion might be the first of many new “protocol” breeds to enhance data management and mobility with this “bridge”. I am salivating right now of this massive potential.
- What else can EMC do with the VMware API? This capability I am writing right now is made possible by EMC tweaking VMware’s API to maximize much, much more. As the VMware vStorage API is continually being enhanced, the potential is again, very massive and could change the entire landscape of cloud computing and subsequently, the entire IT landscape. This is another Pavlov’s dog moment (see figures below as part of my satirical joke on myself)
Sorry, the diagram below is not related to what my blog entry is. Just my way of describing myself right now.
I am extremely impressed with what EMC is doing. A lot of smarts and thinking go into this and this is definitely signs of things to come. The server and the storage are “merging again”. Think of it as Borg assimilation in Star Trek.
Resistance is futile!
IBM claims that we are responsible of for creating 2.5 quintillion bytes of data every day. How much is 1 quintilion?
According to the web,
1 quintillion = 1,000,000,000,000,000,000
After billion, it is trillion, then quadrillion, and then quintillion. That’s what 1 quintillion is, with 18 zeroes!
These data comes from everything from social networking updates, meteorology (weather reports), remote sensing maps (Google Maps, GPS, Geographical Information Systems), photos (Flickr), videos (YouTube), Internet search (Google) and so on. The big data terminology, according to Wikipedia, is data that are too large to be handled and processed by conventional data management tools. This presents a new set of difficulties when it comes to collected these data, storing them and sharing them. Indexing and searching big data would require special technologies to be able to mine and extract valuable information from big data datasets, within an acceptable period of time.
According to Wiki, “Technologies being applied to big data include massively parallel processing (MPP) databases, datamining grids, distributed file systems, distributed databases, cloud computing platforms, the Internet, and scalable storage systems.” That is why EMC has paid big money to acquire GreenPlum and IBM acquired Netezza. Traditional data warehousing players such Teradata, Oracle and Ingres are in the picture as well, setting a collision course between the storage and infrastructure companies and the data warehousing solutions companies.
The 2010 Gartner Magic Quadrant has seen non-traditional players such as IBM/Netezza and EMC/Greenplum, in its leaders quadrant.
And the key word that is already on everyone’s lips is “ANALYTICS“.
The ability to extract valuable information that helps determines what the next future trend is and personalized profiling will be something that may already arrived as companies are clamouring to get more and more out of our personalities so that they can sell you more of their wares.
Meteorological organizations are using big data analytics to find out about weather patterns and climate change. Space exploration becomes more acute and precise from the tons and tons of data collected from space explorations. Big data analytics are also helping pharmaceutical companies develop new biological and pharmaceutical breakthroughs. And the list goes on.
I am a new stranger into big data and I do not proclaim to know a lot. But terms such as scale-out NAS, distributed file systems, grid computing, massively parallel processing are certainly bringing the data storage world into a new frontier, and it is something we as storage professionals have to adapt to. I am eager to learn and know more about big data. It is a big headache but change is inevitable.