This week I went off the beaten track to get back to my first love – Solaris. Now that Oracle owns it, it shall be known as Oracle Solaris. I am working on a small project based on (Oracle) Solaris Containers and I must say, I am intrigued by it. And it felt good punching the good ol’ command lines in Solaris again.
Oracle actually offers a lot of virtualization technologies – Oracle VM, Oracle VM Dynamic Domains, Oracle Solaris Logical Domains (LDOMs), Oracle Solaris Containers (aka Zones) and Oracle VirtualBox. Other than VirtualBox, the other VE (Virtualized Environment) solutions are enterprise solutions but unfortunately, they lack the pizazz of VMware at this point in time. From my perspective, they are also very Oracle/Solaris-centric, making them less appealing to the industry at this moment.
Here’s an old Sun diagram of what Sun virtualization solutions are:
What I am working on this week is Solaris Containers, or Zones. The Containers solution is rather similar to VMware’s gamut of Type 2 virtualization solutions, the ones that are host-based. Solutions that fall into this category are VMware Server, VMware Workstation, VMware Player, VMware ACE and VMware Fusion for Mac OS. Like them, Solaris Containers require a host OS to run on.
I did not have the Solaris Resource Manager software to run the GUI stuff, so I had to get back to basics with the CLI, which is good for me. In fact, I liked it even more and with the CLI, I could pretty much create zones with ease. And given the fact that the host OS is Solaris 10, I could instantly feel the robustness, the performance, the stability and the power of Solaris 10, unlike the flaky Windows hosting VMware host-based virtualization solutions or the iffiness of Linux.
A more in-depth look at Solaris Containers/Zones is shown below.
At first touch, 2 things impressed me:
- The isolation between each Container and the global master domain is very well defined. What can be done and what cannot be done, what can be configured and what cannot, is very clear, and each parameter that is configurable is quickly acknowledged and controlled by the Solaris kernel. From what I read, Solaris Containers has achieved the highest level of security with its Trusted Extensions component, which is a re-implementation of Trusted Solaris. Solaris 10 has received the highest commercial level of Common Criteria Certification. This is known as EAL4+ and has been accepted by the U.S. DoD (Department of Defense).
- The simplicity of administering compute and memory resources for the Containers. I will share that in CLI with you later.
To start, we acknowledge that there is likely a global zone that has been created when Solaris 10 was first installed.
Creating a zone and configuring it with the CLI is pretty straightforward. Here’s a glimpse of what I did yesterday.
# zonecfg -z perf-rac1
Use 'create' to begin configuring a new zone.
zonecfg:perf-rac1> create
zonecfg:perf-rac1> set zonepath=/rpool/perfzones/perf-rac1
zonecfg:perf-rac1> set autoboot=true
zonecfg:perf-rac1> remove inherit-pkg-dir dir=/lib
zonecfg:perf-rac1> remove inherit-pkg-dir dir=/sbin
zonecfg:perf-rac1> remove inherit-pkg-dir dir=/usr
zonecfg:perf-rac1> remove inherit-pkg-dir dir=/usr/local
zonecfg:perf-rac1> add net
zonecfg:perf-rac1:net> set address=<input from parameter>
zonecfg:perf-rac1:net> set physical=<bge0|or correct Ethernet interface>
zonecfg:perf-rac1:net> end
zonecfg:perf-rac1> add dedicated-cpu
zonecfg:perf-rac1:dedicated-cpu> set ncpus=2-4 (or any potential cpus on sun box)
zonecfg:perf-rac1:dedicated-cpu> end
zonecfg:perf-rac1> add capped-memory
zonecfg:perf-rac1:capped-memory> set physical=4g
zonecfg:perf-rac1:capped-memory> set swap=1g
zonecfg:perf-rac1:capped-memory> set locked=1g
zonecfg:perf-rac1:capped-memory> end
zonecfg:perf-rac1> verify
zonecfg:perf-rac1> commit
zonecfg:perf-rac1> exit
The command zonecfg -z <zonename> opens a configuration prompt where I run create to create the zone. I set zonepath to specify where the zone files will live and set autoboot=true so that the zone starts automatically when the host reboots.
Solaris Containers is pretty cool in that a zone can either inherit (share) the common directories such as /usr, /lib, /sbin and others from the global zone, or have its own set of directories separate from the global root directory tree. Here I choose to remove the inheritance and allow the Solaris in the Container to have its own independent directories.
The command add net drops me into a sub-scope where I can configure the network interface as well as the network address. Nothing spectacular there. I end that scope and then do a couple of cool things related to resource management.
I have used add dedicated-cpu with set ncpus=2-4, and add capped-memory with physical=4g, swap=1g and locked=1g. What I have done is to allocate a minimum of 2 CPUs and a maximum of 4 CPUs (if resources permit) to the zone called perf-rac1. Additionally, I have capped its memory at 4GB of RAM, of which at most 1GB can be locked in memory. Swap space is capped at 1GB.
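The same settings can also be kept in a command file and fed to zonecfg with its -f option, which is handy when several similar zones are needed. What follows is only a minimal sketch of that idea: the IP address and the bge0 interface are hypothetical placeholders, and the zonepath is written as an absolute path on the assumption that the rpool/perfzones dataset is mounted at /rpool/perfzones.

```shell
#!/bin/sh
# Sketch: write the perf-rac1 settings from this post into a zonecfg command file.
# ZONE_ADDR and ZONE_NIC are hypothetical placeholders; substitute your own values.
ZONE=perf-rac1
ZONE_ADDR=192.168.1.50
ZONE_NIC=bge0
CFG_FILE="${TMPDIR:-/tmp}/${ZONE}.zonecfg"

# zonepath is an absolute path (assumed mount point of the rpool/perfzones dataset).
# Only /lib, /sbin and /usr are removed here; adjust to the inherit-pkg-dir
# entries your configuration actually lists ("zonecfg -z <zone> info" shows them).
cat > "$CFG_FILE" <<EOF
create
set zonepath=/rpool/perfzones/${ZONE}
set autoboot=true
remove inherit-pkg-dir dir=/lib
remove inherit-pkg-dir dir=/sbin
remove inherit-pkg-dir dir=/usr
add net
set address=${ZONE_ADDR}
set physical=${ZONE_NIC}
end
add dedicated-cpu
set ncpus=2-4
end
add capped-memory
set physical=4g
set swap=1g
set locked=1g
end
verify
commit
EOF

# Show the generated file and the command that would apply it.
cat "$CFG_FILE"
echo "Apply with: zonecfg -z $ZONE -f $CFG_FILE"
```

The script only generates and prints the file; nothing is changed on the host until the zonecfg command on the last line is run by hand in the global zone.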
This resource management allows me to build a high performance Solaris Container for Oracle 11g RAC. Of course, you are free to create as many containers as the system resources allow. Note that I did not include the shared memory and semaphore parameters required for Oracle 11g RAC, but go ahead and consult your favourite Oracle DBA (have fun doing so!)
After the perf-rac1 zone/container has been created (and configured), I just need to run the following:
# zoneadm -z perf-rac1 install
# zoneadm -z perf-rac1 boot
These 2 commands install the zone and then boot it. The install step copies all the packages from the global zone and proceeds like a normal installation. Once the “installation” is complete, there will be the usual Solaris configuration form where information such as timezone, IP address, root login/password and so on are input. That will take about 20-40 minutes, depending on the amount of things to be installed and of course, the power of the Sun system. I am running an old Sun V210 with 512MB, so it took a while.
When it’s done, we can just log in to the zone console with the command
# zlogin -C perf-rac1
and I get into another Solaris OS in the Solaris Container.
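Before logging in, it can be useful to check what state the zone is in. The sketch below parses the colon-separated output of zoneadm list -cp, whose first four fields are zone ID, zone name, state and zonepath (later releases append more fields). The sample lines are made up for illustration so that the snippet runs anywhere; on a real host you would pipe zoneadm list -cp into the function instead.

```shell
#!/bin/sh
# zone_state NAME: read "zoneadm list -cp" style lines on stdin
# and print the state field of the zone called NAME.
zone_state() {
    awk -F: -v name="$1" '$2 == name { print $3 }'
}

# Hypothetical sample of parseable zoneadm output (zoneid:zonename:state:zonepath).
SAMPLE='0:global:running:/
1:perf-rac1:running:/rpool/perfzones/perf-rac1'

STATE=$(printf '%s\n' "$SAMPLE" | zone_state perf-rac1)
echo "perf-rac1 is $STATE"

# On a Solaris host, the equivalent would be:
#   zoneadm list -cp | zone_state perf-rac1
```

A zone that has been configured but not yet installed would report configured, and one that is installed but not booted would report installed.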
What I liked was the fact that Solaris Containers is rather simple to understand, yet the flexibility to configure computing resources for it is pretty impressive. It’s fun working on this stuff again after years away from Solaris. (This was after I took my RedHat RHCE certification and I pretty much left Sun Solaris for quite a while).
More testing to be done, but overall I am quite happy to be back as a Solaris virgin again.
I picked up a new article this afternoon from SearchStorage – titled “Enterprise storage trends: SSDs, capacity optimization, auto tiering“. I cannot help but notice that it echoes some of the things I have been writing about: VMware being the storage killer, and the rise of Cloud Computing taking away our jobs.
I did receive some feedback about what I wrote in the past and after reading the SearchStorage article, I can’t help but feel justified. In the side bar, it said:
“The rise of virtual machine-specific and cloud storage suggest that other changes are imminent. In both cases …. and would no longer require storage architects and managers.”
Things are changing at an extremely fast pace and for those of us still languishing in the realms of NAS and SAN, our expertise could be rendered obsolete pretty quickly.
But all is not lost, because it would be easier for a storage engineer, who already has the foundation, to move into the virtualization space than for a server virtualization engineer to come down and learn the storage fundamentals. We can either choose to be dinosaurs or be the species of the next generation.
This is breaking news. RedHat is to acquire Gluster!
What is Gluster? Gluster is a clustering Linux distribution started by Z Research under the direction of Anand Babu (who is currently Gluster’s CEO), aiming to commoditize supercomputing and supercomputing clustered storage. Gluster is open source but there is a commercial version as well. It runs on commodity 64-bit x86 hardware. The Gluster File System (GlusterFS) aggregates disk and memory resources into a pool of storage under a single global namespace, accessed through multiple file-level protocols. In this scale-out architecture, storage resources can be added as storage nodes in a building block fashion to meet performance and capacity demands, rather like what HP P4000 is doing in the block-level environment for SAN.
Gluster can be integrated with most 64-bit Linux distros. This is done in the Linux user space but it can also be crafted in the Linux kernel space, where it is a software appliance, easily integrated into off-the-shelf x86-64 platforms. This means that you can build a scale-out NAS pretty easily using your own hardware.
From an architecture standpoint, GlusterFS and its integration to a storage appliance looks like this:
Because it works in a modular add-on fashion, this architecture is distributed and extended by replicating the same architecture across additional x86-64 platforms (each of which is a storage node) as shown below.
It’s really easy to install Gluster and build the Scale Out NAS. I have saved a couple of videos about how Gluster is installed and I must say that it’s pretty easy. In less than 30 minutes, you can install your first Gluster storage node and then add additional nodes on the fly.
Enjoy the videos.
Video #1 (Gluster Installation)
(I have difficulty uploading the videos because WordPress requires me to purchase one of their solutions)
Video #2 (Creating and adding Storage Node in Gluster)
(I have difficulty uploading the videos because WordPress requires me to purchase one of their solutions)
Note: If you are interested in seeing the videos, please email me at email@example.com.
This news gets me very excited because this is the perfect endorsement of what I have been saying all along. Storage networking and data management are the foundations of CLOUD and VIRTUALIZATION. Without data being stored and managed well, everything falls apart. And as I have mentioned many times before, this is a fantastic time to become an extra-ordinary storage engineer/consultant/architect/sales (maybe not!)
After being in the storage networking industry for so long, I have seen most of the new storage solutions out there. Most of them don’t really differ much from what is already out there, and it gets a little boring. But once in a while, a little gem is unearthed and my excitement bubbles up again.
Today, I was at the HP P4000 G2 SAN workshop and the LeftHand Networks SAN/iQ storage solution which HP acquired in 2008 left me with 3 words – Interesting, Innovative and Impressive – from a technology standpoint.
I must admit that this is a little gem that got past my radar and now it’s HP’s gain. I have heard about LeftHand Networks in the past, and at the same time, I was also looking at another storage solution called Intransa. Unfortunately, Intransa went on to differentiate themselves and today, they are focused more as a storage solution for videos and CCTVs, seldom surfacing with innovative technology. LeftHand Networks was and is different and I can understand why HP bought them, because the technology that they bring with them to HP is really cool!
Now rebranded and renamed as HP P4000 G2 SAN, the storage solution no longer sits on proprietary hardware. As part of HP’s Converged Infrastructure strategy, SAN/iQ has been fully integrated into the HP ProLiant x86 platform (I heard there’s a blade version as well), making it simple to procure and probably helping to simplify operational resource planning and logistics as well. At the same time, there is also a P4000 VSA (Virtual Storage Appliance), which HP guys have been using for demos for several years now. There is a 60-day trial available at the HP P4000 VSA Download site, for organizations to have a try-and-buy and if they do, they can turn some of their old x86 platforms into a storage appliance by just adding more hard disk drives. That saves money too!
So, what’s cool, you say?
2 key technologies stand out:
- Storage Clustering
- Network RAID
As I was well informed at the workshop today, the Storage Clustering technology is not exclusive to the P4000. In fact, Dell EqualLogic employs something similar as well. But it was something that impressed me and it is different from the traditional storage SANs that we usually see.
You see, in the traditional SAN setup, the LUNs or volumes are either loosely or tightly linked to 2 active/active storage processors/controllers. And with most storage vendors, when a customer runs out of capacity or performance or both, they have to do a forklift upgrade of the controllers. This is disruptive, and it also means the CPU, memory or I/O channels of the existing controller cannot be upgraded in place. Today, most storage vendors do not allow you to break open the storage processor chassis and change the CPU, add more RAM or add more I/O paths to support more disk drives or increase throughput. Mind you, this is something that I have been questioning for a long time but as the storage networking industry has it, you have got to upgrade the entire storage processor or controller in order to get more power and capacity.
The P4000 (as well as the Dell EqualLogic) approaches this from another angle where instead of doing a forklift upgrade of the storage processor/controller, just add another node of the same CPU and RAM profile, and have the P4000 SAN/iQ software group the new node together with the existing node(s) to form a storage cluster group. As best practice, the Storage Cluster feature should have 16 nodes or less, but in one of the war stories shared, one customer in the US actually had 32 nodes in a Storage Cluster group, for storage capacity reasons.
As more nodes are added to the Storage Cluster group, the LUNs/volumes can be extended or spanned to the other nodes as long as they are physically connected in a Gigabit network, and the entire LUN or volume is seen as ONE regardless of which physical nodes it may be sitting on. Typically you will see this sort of single “Global Namespace” concept at the file system level but this is the first time I have seen it implemented at the SAN level. (Ok, I have to admit that I am a little behind the times with this technology)
Here’s a little diagram I dug up from LeftHand before it was acquired by HP which I hope will enlighten the readers about this Storage Cluster feature.
But the best is yet to come, as the HP Solution Architect (Timothy Chua) mentioned that the Network RAID feature was uniquely LeftHand’s and way cooler. And I couldn’t agree more because this lit me up like a spark plug!
Since Storage Clustering could span LUNs/volumes across nodes, it was only natural that the RAID capability be extended across nodes as well. RAID-10, RAID-5 and RAID-6 could all be spanned across all nodes, spreading the data blocks and their mirrored/parity data blocks across the nodes in the network. And the nodes do not have to be at a single site. With Gigabit networks, the nodes can be separated into multiple sites as well, giving the entire solution quite a comprehensive campus-wide storage high availability. And since this is Network RAID, it gives an entirely new meaning to the words Disaster Recovery because this will eliminate the need for data replication. Primary data in a Network RAID-10 in Node 1/Site 1 could be mirrored in Node 2/Site 2, which can be further mirrored to Node 3/Site 3 and Node 4/Site 4 for a 4-way mirror. This is the P4000 Multi-site SAN solution.
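The price of mirroring across nodes is capacity: an n-way mirror stores every block n times, so usable capacity is roughly the raw capacity divided by the number of copies. Here is a back-of-the-envelope sketch of that arithmetic. The node count and per-node capacity are made-up numbers, the function name is hypothetical, and this is not HP’s sizing formula; it ignores local RAID inside each node and any metadata overhead.

```shell
#!/bin/sh
# usable_tb NODES TB_PER_NODE COPIES
# Rough usable capacity when every block is kept on COPIES different nodes.
usable_tb() {
    echo $(( $1 * $2 / $3 ))
}

# Hypothetical cluster: 4 nodes of 12 TB each (illustrative numbers only).
NODES=4
TB_PER_NODE=12
RAW_TB=$(( NODES * TB_PER_NODE ))

for COPIES in 2 3 4; do
    echo "${COPIES}-way mirror: $(usable_tb "$NODES" "$TB_PER_NODE" "$COPIES") TB usable of ${RAW_TB} TB raw"
done
```

With these numbers, the 4-way mirror across four sites leaves a quarter of the raw capacity usable, which is the trade-off for having a full copy at every site.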
The diagram below shows how Network RAID is implemented with VMware ESX.
And since replication is no longer a requirement, VMware’s SRM (Site Recovery Manager) is not required either.
It is no surprise that synchronous replication in the P4000 solution is equivalent to Network RAID. Though the concept of separating the storage controllers/nodes into multiple sites for true long-distance mirroring exists elsewhere, it usually doesn’t exist at this level. NetApp has their Fabric and Stretch MetroCluster and EMC has their VPlex, but they are usually proposed at the higher end of the spectrum. It looks to me that HP P4000 is the only one that has this concept at the entry-level iSCSI SAN level. Kudos!
They have asynchronous replication as well for longer distance networks.
I did not stay for the demo today but I am already tickled pink about the HP P4000 technology. It made a good impression on me and I can’t wait to know more of how it works internally. Looking forward to a deeper dive of the P4000 and hope to stay for the demo next time.
Compute and storage are 2 components within the IT infrastructure which are surely converging. SAN and NAS are facing their greatest adversary yet, and could be made insignificant if the cloud and virtualization game had their way. This is giving rise to a new breed of solution, a specialized appliance where both compute and storage are ONE. Rising from the ashes of shared storage (SAN and NAS, take note), we are beginning to see things going back to the way of direct, internal storage.
There were some scuffles in the bushes about 5 years ago, when Sun (now Oracle) was ahead of its game. The Sun Fire X4500 (aka Thumper) was one of the strong candidates to challenge the SAN/NAS duopoly in this networked storage period. The X4500 integrated both the server and the storage components together, using ZFS as a file system and volume manager to deliver very high throughput on all the JBOD disks very efficiently. ZFS acted as the RAID, so there was no need to have specialized RAID hardware. This proved that a very high performance storage solution can be easily integrated using standard off-the-shelf infrastructure components and the x86 architecture. By combining both compute and storage together, there were hints that the industry was about to rise up to Direct-Attached Storage (DAS) again, despite its perceived weakness against SAN and NAS.
Unfortunately, the applications were not ready for DAS then. Besides ZFS, applications such as databases, emails and file servers were not ready to jump into the DAS bandwagon and watch them ride into the sunset. But the fairy tale seems to be retold again, and this time, the evidence that DAS could rise again is much stronger.
The catalyst to this disruptive force? Virtualization!
I mentioned that VMware is the silent storage killer a few blogs ago. Needless to say, that ruffled a few feathers among the readers. I have no doubt that virtualization is changing how we storage guys look at SAN and NAS. In a traditional setup, the SAN or NAS is set up to provision LUNs or mount points as the data storage for VMFS volumes in the VMware environment. It is then up to the storage array to provide snapshots, replication, thin provisioning and so on.
Perhaps VMware is nitpicking when it says that managing storage arrays for VMFS volumes is difficult. From the VMware administrator’s view, they are right. They don’t want to know what’s going on below the VM level. All they want is storage, any kind of storage, and VMware will manage the volumes, snapshots, replication and thin provisioning. Indeed they have been doing that since the vStorage API was introduced. In the new release of VMware version 5.0, the ante has been upped even higher, making networked storage less and less significant.
If you want to know about vStorage API and stuff, below is a diagram of the integration of the various components at the VMware API level.
VMware can now make direct, internal storage look like shared storage. The Virtual Storage Appliance (VSA) does just that. VMware already has a thriving market from the community and hobbyists for VMware Appliances.
The appliance market has now evolved into new infrastructure too. Using x86 architecture, off-the-shelf infrastructure components (sounds familiar?), companies such as Nutanix and Tintri are taking advantage of this booming trend to introduce specialized VMware appliances as shown in their advertisements on their respective web sites.
Here’s the Nutanix Ad:
Here’s the Tintri Ad:
Both Tintri and Nutanix are a new breed of appliances – specialized appliances for VMware.
At the same time, other applications are building these specialized appliances as well. I have mentioned Oracle Exadata many times in the past and Oracle Exadata is the perfect example of a fine-tuned, hardcore database engine built to make the Oracle database run at the best performance possible.
Likewise HP has announced their E5000 Messaging System for Microsoft Exchange. The E5000 is a specialized appliance optimized and well-tuned for the Microsoft Exchange Server 2010. From the words of HP,
“HP E5000 Messaging System is the industry’s first fully self-contained platform built for the next-generation of Microsoft Exchange to deliver enterprise-class messaging to businesses of all sizes. Built as a turnkey solution that can be up and running in a few hours vs. days, the HP E5000 Messaging System gives business users the experience they want most: large mailboxes, centralized archiving of mailboxes files and 24×7 access from any device. IT staffs benefit the solutions simplicity to setup, scale and manage and to meet new demands affordably. Ideal for multi-site enterprises as well as branch office and remote office environments, each HP Messaging System delivers greater simplicity and accelerates deployment with preconfigured solutions starting at 500 mailboxes up to 3000 mailboxes, while delivering large, 1 to 2.5GB mailbox sizes. Clients can grow by adding storage capacity or more appliances within the environment up from hundreds to thousands of mailboxes.”
What are the specs of this E5000 box, you say? Here you go:
And look at Row#2 in the table above … Direct, Internal Disks! Look at Row #4, Xeon CPUs! Both Compute and Storage in the same appliance!
While the HP E5000 announcement was recent, Hitachi Data Systems was already in the game early with their Unified Compute Platform and their Converged Platform for Microsoft Exchange, with relatively the same idea – specialized appliances.
Perhaps the HDS solutions aren’t exactly direct, internal storage but the concept is still the same – specialized appliance. HDS Unified Compute Platform (UCP) has these components.
HDS Converged Platform for MS Exchange provides their specialized “appliance” with Reference Architectures that can support up to 68,000 Microsoft Exchange mailboxes. Here’s an architecture diagram of their “appliance”
There’s no denying that the networked storage landscape is changing. So are the computing platforms. We are already seeing the compute and storage components being integrated together, tighter than ever. The wave is rising for specialized appliances and it can only get more intense from now on.
No wonder HP’s Converged Infrastructure vision is betting on x86 architecture, simple storage platforms with SAS/SATA disks and Virtualization. Other vendors are doing the same as well – Cisco, NetApp and VMware with their FlexPod solution and EMC with their VBlocks of VMware, Cisco and EMC Storage.
Hail to the Rise of the Specialized Appliance!
I was chatting with a friend yesterday and we were discussing virtualization and cloud, the biggest things that are happening in the IT industry right now. We were talking about the VMware vSphere 5 arrival, the cool stuff VMware is bringing into the game, pushing the technology juggernaut farther and farther ahead of its rivals Hyper-V, Xen and VirtualBox.
And in the technology section of the newspaper yesterday, I saw news of the Jaring OneCloud offering, and that one of the local IT players just brought in Joyent. Fantastic stuff! But for us in IT, we have been inundated with cloud, cloud and more cloud. The hype, the buzz and the reality. It’s all there, but back to our conversation. We realized that virtualization and cloud aren’t much without storage, the cornerstone of virtualization and cloud. And in the storage networking layer, there are the data management piece, the information infrastructure piece and so on, and yet … why are there so few storage networking professionals out there in our IT scene?
I have been lamenting this for a long time because we have been facing this problem for a long time. We are facing a shortage of qualified and well experienced storage networking professionals. There are plenty of jobs out there but not enough resources to meet the demand. As SNIA Malaysia Chairman, it is my duty to work with my committee members of HP, IBM, EMC, NetApp, Symantec and Cisco to create the awareness, and more importantly the passion to get the local IT’s storage networking professional voice together. It has been challenging but my advice to all those people out there – “Why be ordinary when you can become extra-ordinary?”
We have to make others realize that storage networking is what makes virtualization and cloud happen. Join us at SNIA Malaysia and be part of something extra-ordinary. Storage networking IS the foundation of virtualization and cloud. You can’t exclude it.