Happy Lunar New Year! The Chinese around world has just ushered in the Year of the Water Dragon yesterday. To all my friends and family, and readers of my blog, I wish you a prosperous and auspicious Chinese New Year!
Over the holidays, I have been keeping up with the progress of Solid State Drives (SSDs). I am sure many of us are mesmerized by SSDs and the storage vendors are touting the best of SSDs have to offer. But let me tell you one thing – you are probably getting the least of what the best SSDs have to offer. You might be puzzled why I say things like this.
Let me share with a common sales pitch. Most (if not all) storage vendors will tout performance (usually IOPS) as the greatest benefits of SSDs. The performance numbers have to be compared to something, and that something is your regular spinning Hard Disk Drives (HDDs). The slowest SSDs in terms of IOPS is about 10-15x faster than the HDDs. A single SSD can at least churn 5,000 IOPS when compared to the fastest 15,000 RPM HDDs, which churns out about 200 IOPS (depending on HDD vendors). Therefore, the slowest SSDs can be 20-25x faster than the fastest HDDs, when measured in IOPS.
But the intent of this blogger is to share with you more about SSDs. There’s more to know because SSDs are not built the same. There are write-bias SSDs, read-bias SSDs; there are SLC (single level cell) and MLC (multi level cell) SSDs and so on. How do you differentiate them if Vendor A touts their SSDs and Vendor B touts their SSDs as well? You are not comparing SSDs and HDDs anymore. How do you know what questions to ask when they show you their performance statistics?
SNIA has recently released a set of methodology called “Solid State Storage (SSS) Performance Testing Specifications (PTS)” that helps customers evaluate and compare the SSD performance from a vendor-neutral perspective. There is also a whitepaper related to SSS PTS. This is something very important because we have to continue to educate the community about what is right and what is wrong.
In a recent webcast, the presenters from the SNIA SSS TWG (Technical Working Group) mentioned a few questions that I think we as vendors and customers should think about when working with an SSD sales pitch. I thought I share them with you.
- Was the performance testing done at the SSD device level or at the file system level?
- Was the SSD pre-conditioned before the testing? If so, how?
- Was the performance results taken at a steady state?
- How much data was written during the testing?
- Where was the data written to?
- What data pattern was tested?
- What was the test platform used to test the SSDs?
- What hardware or software package(s) used for the testing?
- Was the HBA bandwidth, queue depth and other parameters sufficient to test the SSDs?
- What type of NAND Flash was used?
- What is the target workload?
- What was the percentage weight of the mix of Reads and Writes?
- Are there warranty life design issue?
I thought that these questions were very relevant in understanding SSDs’ performance. And I also got to know that SSDs behave differently throughout the life stages of the device. From a performance point of view, there are 3 distinct performance life stages
- Fresh out of the box (FOB)
- Steady State
As you can see from the graph below, a SSD, fresh out of the box (FOB) displayed considerable performance numbers. Over a period of time (the graph shown minutes), it transitioned into a mezzanine stage of lower IOPS and finally, it normalized to the state called the Steady State. The Steady State is the desirable test range that will give the most accurate type of IOPS numbers. Therefore, it is important that your storage vendor’s performance numbers should be taken during this life stage.
Another consideration when understanding the SSDs’ performance numbers are what type of tests used? The test could be done at the file system level or at the device level. As shown in the diagram below, the test numbers could be taken from many different elements through the stack of the data path.
Performance for cached data would given impressive numbers but it is not accurate. File system performance will not be useful because the data travels through different layers, masking the true performance capability of the SSDs. Therefore, SNIA’s performance is based on a synthetic device level test to achieve consistency and a more accurate IOPS numbers.
There are many other factors used to determine the most relevant performance numbers. The SNIA PTS test has 4 main test suite that addresses different aspects of the SSD’s performance. They are:
- Write Saturation test
- Latency test
- IOPS test
- Throughput test
Good morning, afternoon, evening, Ladies & Gentlemen, wherever you are.
Today, we are going to learn how to bake, errr … I mean, make a storage performance model. Before we begin, allow me to set the stage.
Don’t you just hate it when you are asked to do storage performance sizing and you don’t have a freaking idea how to get started? A typical techie would probably say, “Aiya, just use the capacity lah!”, and usually, they will proceed to size the storage according to capacity. In fact, sizing by capacity is the worst way to do storage performance modeling.
Bear in mind that storage is not a black box, although some people wished it was. It is not black magic when it comes to performance sizing because things can be applied in a very scientific and logical manner.
SNIA (Storage Networking Industry Association) has made a storage performance modeling methodology (that’s quite a mouthful), and basically simplified it into these few key ingredients. This recipe is for storage performance modeling in general and I am advising you guys out there to engage your storage vendors professional services. They will know their storage solutions best.
And I am going to say to you – Don’t be cheap and not engage professional services – to get to the experts out there. I was having a chat with an consultant just now at McDonald’s. I have known this friend of mine for about 6-7 years now and his name is Sugen Sumoo, the Director of DBORA Consulting. They specialize in Oracle and database performance tuning and performance forecasting and it is something that a typical DBA can’t do, because DBORA Consulting is the Professional Service that brings expertise and value to Oracle customers. Likewise, you have to engage your respective storage professional services as well.
In a cook book or a cooking show, you are presented with the ingredients used and in this recipe for storage performance modeling, the ingredients (in no particular order) are:
- Application block size
- Read and Write ratio
- Application access patterns
- Working set size
- IOPS or throughput
- Demand intensity
Application Block Size
First of all, the storage is there to serve applications. We always have to look from the applications’ point of view, not storage’s point of view. Different applications have different block size. Databases typically range from 8K-64K and backup applications usually deal with larger block sizes. Video applications can have 256K block sizes or higher. It all depends.
The best way is to find out from the DBA, email administrator or application developers. The unfortunate thing is most so-called technical people or administrators in Malaysia doesn’t have a clue about the applications they manage. So, my advice to you storage professionals, do your research on the application and take the default value. These clueless fellas are likely to take the default.
Read and Write ratio
Applications behave differently at different times of the day, and at different times of the month (no, it’s not PMS). At the end of the financial year or calendar, there are some tasks that these applications do as well. But in a typical day, there are different weightage or percentage of read operations versus write operations.
Most OLTP (online transaction processing)-based applications tend to be read heavy and write light, but we need to find out the ratio. Typically, it can be a 2:1 ratio or 60%:40%, but it is best to speak to the application administrators about the ratio. DSS (Decision Support Systems) and data warehousing applications could have much higher reads than writes while a seismic-analysis applications can have multiple writes during the analysis periods. It all depends.
To counter the “clueless” administrators, ask lots of questions. Find out the workflow of several key tasks and ask what that particular tasks do at different checkpoints of the application’s processing. If you are lazy (please don’t be lazy, because it degrades your value as a storage professional), use a rule of thumb.
Application access patterns
Applications behave differently in general. They can be sequential, like backup or video streaming. They can be random like emails, databases at certain times of the day, and so on. All these behavioral patterns affect how we design and size the disks in the storage.
Some RAID levels tend to work well with sequential access and others, with random access. It is not difficult to find out about the applications’ pattern and if you read more about the different RAID-levels in storage, you can easily identify the type of RAID levels suitable for each type of behavioral patterns.
Working set size
This variable is a bit more difficult to determine. This means that a chunk of the application has to be loaded into a working area, usually memory and cache memory, to be used and abused by the application users.
Unless someone is well versed with the applications, one would not be able to determine how much of the applications would be placed in memory and in cache memory. Typically, this can only be determined after the application has been running for some time.
The flexibility of having SSDs, especially the DRAM-type of SSDs, are very useful to ensure that there is sufficient “working space” for these applications.
IOPS or Throughput
According to SNIA model, for I/O less than 64K, IOPS should be used as a yardstick to do storage performance modeling. Anything larger, use throughput, in which MB/sec is the measurement unit.
The application guy would be able to tell you what kind of IOPS their application is expecting or what kind of throughput they want. Again, ask a lot of questions, because this will help you determine the type of disks and the kind of performance you give to the application guys.
If the application guy is clueless again, ask someone more senior or ask the vendor. If the vendor engineers cannot give you an answer, then they should not be working for the vendor.
This part is usually overlooked when it comes to performance sizing. Demand intensity refers to how intense is the I/O requests. It could come from 1 channel or 1 part of the applications, or it could come from several parts of the applications in parallel. It is as if the storage is being ‘bombarded’ by applications and this is the part that is hard to determine as well.
In some applications, the degree of intensity or parallelism can be tuned and to find out, ask the application administrator or developer. If not, ask the vendor. Also do a lot of research on the application’s architecture.
And one last thing. What I have learned is to add buffers to the storage performance model. Typically I would add about 10-20% extra but you never know. As storage professionals, I would strongly encourage to engage professional services, because it is worthwhile, especially in the early stages of the sizing. It is usually a more expensive affair to size it after the applications have been installed and running.
“Failure to plan is planning to fail”. The recipe isn’t that difficult. Go figure it out.
We are used to block-based approach and also the file-based approach to data. The 2 diagrams below shows the basics of how we access data in both block-based and file-based data on the storage device.
For block-based , the storage of the blocks is merely in arrays of unrelated contiguous blocks. For file-based, as seen below,
there is another layer of abstraction, and this is called the file system. But if you seen both diagrams above, there are some random numbers in light blue and that is to represent the storage device, the hard disk drive’s export of “containers” to the file system or the application that is accessing the storage device. This is usually the LBA (Logical Block Addressing), which is basically set of schematics that defines the locations on the hard disk drives. LBA tells the location of where the data is stored. For more information about LBA, check out this Wikipedia definition. But the whole idea is LBA is dumb. It is pretty much static and exported to file systems and applications so that these guys can do something with it.
There’s something brewing in the background since 1994 and it is one of the many efforts to make intelligent storage devices. This new object-based interface was part of the research project done by Carnegie-Mellon University (CMU). Initially, it was known as Network Attached Secure Disk (NASD) but eventually made its way to the working group in SNIA, and developing it for ANSI T10 INCITS standard. ANSI T10 is the guardian of all SCSI standards. This is called Object Storage Device (OSD). The SCSI architecture diagram below shows the layer where OSD resides.
The motivation for this simple: To make storage devices of today to do more computational work, in particularly I/O, relieving the hosts and the local systems to concentrate other computational processing work. And the same time, the local systems must have some level of interactivity and management between the storage object and the computational hosts.
In the diagram below which compares both block-based and OSD,
you can see the separation of file system management interface that is at the kernel-space of the local host/system and this is replaced by the OSD Management interface at the storage device.
What does this all mean? This means that using LBA type of addressing that we are familiar with in the block-based and file-based storage is no longer the way to go, because as I mentioned before, LBA is dumb.
OSD, in some way, replaces the LBA with OIDs (Object IDs). The existing local system and/or its file system will interact with the storage devices with OIDs and the OIDs links to its respective objects storage. And the object will carry a lot of metadata, that represents the object, giving it the intelligent and management capability of the object.
The prominence of the metadata in the OSD would mean that we can build much more intelligent systems in the future. The OIDs and the objects can be grouped together in a flat design or can be organized and categorized in a virtual, hierarchical model.
Object storage is an intelligent evolution of disk drives that can store and serve objects rather than simply place data on tracks and sectors. And it can bring the following benefits:
- Intelligent space management in the storage layer
- Data aware pre-fetching and caching
- Robust shared access by multiple clients
- Scalable performance using off-loaded data path
- Reliability security
No, HP probably didn’t read my blogs and this isn’t a knee-jerk reaction from HP about things I have been writing. OK, I didn’t write about HP because I don’t know much about them. But this came as a coincidence as well as an apt title (my bad for the shameless plug for this entry’s title).
In my previous blog entry, I wrote about HP’s future in the latest IDC Q2 market share figures. I was not too enthusiastic about HP’s storage line up. Today, my old friend Mr. CC Chung, who is HP’s Country Manager for Storage, had tea with me at Bangsar Shopping Center. We were there to discuss about HP engagement with SNIA when the topic of HP’s storage came about (obviously). Chung said I lack the understanding of HP storage solutions, which I admit, is very true. And so, my friend with his kind gesture invited me to a series of HP Storage Solutions workshops, which I accept with glee and gratitude. Thank you very much, my friend.
Here’s a screenshot of their upcoming workshops:
I am seriously looking forward to the workshop and learn about the vibes of HP Storage Solutions. Too bad there aren’t workshops for HP 3PAR and HP X9000 IBRIX but I am sure this will be the start of my new friendship with HP.
Incidentally, as I was waiting for Chung, I was reading the HWM Magazine August 2011 issue, and lo behold, Chung was in the news announcing the HP X9000 IBRIX and X5000 G2 Network Storage System. I couldn’t find the HWM article online but I found the next best thing. A similar article (online, of course), appeared at CIO Asia. And with a nice picture of Mr. CC Chung too!
I was chatting with a friend yesterday and we were discussing about virtualization and cloud, the biggest things that are happening in the IT industry right now. We were talking about the VMware vSphere 5 arrival, the cool stuff VMware is bringing into the game, pushing the technology juggernaut farther and farther ahead of its rivals Hyper-V, Xen and Virtual Box.
And in the technology section of the newspaper yesterday, I saw news of Jaring OneCloud offering and one of the local IT players just brought in Joyent. Fantastic stuff! But for us in IT, we have been inundated with cloud, cloud and more cloud. The hype, the fuzz and the reality. It’s all there but back to our conversation. We realized that virtualization and cloud aren’t much without storage, the cornerstone of virtualization and cloud. And in the storage networking layer, there are the data management piece, the information infrastructure piece and so on and yet … why are there so few storage networking professional out there in our IT scene.
I have been lamenting this for a long time because we have been facing this problem for a long time. We are facing a shortage of qualified and well experienced storage networking professionals. There are plenty of jobs out there but not enough resources to meet the demand. As SNIA Malaysia Chairman, it is my duty to work with my committee members of HP, IBM, EMC, NetApp, Symantec and Cisco to create the awareness, and more importantly the passion to get the local IT’s storage networking professional voice together. It has been challenging but my advice to all those people out there – “Why be ordinary when you can become extra-ordinary?”
We have to make others realize that storage networking is what makes virtualization and cloud happen. Join us at SNIA Malaysia and be part of something extra-ordinary. Storage networking IS the foundation of virtualization and cloud. You can’t exclude it.