Category Archives: SNIA
Happy Lunar New Year! The Chinese around world has just ushered in the Year of the Water Dragon yesterday. To all my friends and family, and readers of my blog, I wish you a prosperous and auspicious Chinese New Year!
Over the holidays, I have been keeping up with the progress of Solid State Drives (SSDs). I am sure many of us are mesmerized by SSDs and the storage vendors are touting the best of SSDs have to offer. But let me tell you one thing – you are probably getting the least of what the best SSDs have to offer. You might be puzzled why I say things like this.
Let me share with a common sales pitch. Most (if not all) storage vendors will tout performance (usually IOPS) as the greatest benefits of SSDs. The performance numbers have to be compared to something, and that something is your regular spinning Hard Disk Drives (HDDs). The slowest SSDs in terms of IOPS is about 10-15x faster than the HDDs. A single SSD can at least churn 5,000 IOPS when compared to the fastest 15,000 RPM HDDs, which churns out about 200 IOPS (depending on HDD vendors). Therefore, the slowest SSDs can be 20-25x faster than the fastest HDDs, when measured in IOPS.
But the intent of this blogger is to share with you more about SSDs. There’s more to know because SSDs are not built the same. There are write-bias SSDs, read-bias SSDs; there are SLC (single level cell) and MLC (multi level cell) SSDs and so on. How do you differentiate them if Vendor A touts their SSDs and Vendor B touts their SSDs as well? You are not comparing SSDs and HDDs anymore. How do you know what questions to ask when they show you their performance statistics?
SNIA has recently released a set of methodology called “Solid State Storage (SSS) Performance Testing Specifications (PTS)” that helps customers evaluate and compare the SSD performance from a vendor-neutral perspective. There is also a whitepaper related to SSS PTS. This is something very important because we have to continue to educate the community about what is right and what is wrong.
In a recent webcast, the presenters from the SNIA SSS TWG (Technical Working Group) mentioned a few questions that I think we as vendors and customers should think about when working with an SSD sales pitch. I thought I share them with you.
- Was the performance testing done at the SSD device level or at the file system level?
- Was the SSD pre-conditioned before the testing? If so, how?
- Was the performance results taken at a steady state?
- How much data was written during the testing?
- Where was the data written to?
- What data pattern was tested?
- What was the test platform used to test the SSDs?
- What hardware or software package(s) used for the testing?
- Was the HBA bandwidth, queue depth and other parameters sufficient to test the SSDs?
- What type of NAND Flash was used?
- What is the target workload?
- What was the percentage weight of the mix of Reads and Writes?
- Are there warranty life design issue?
I thought that these questions were very relevant in understanding SSDs’ performance. And I also got to know that SSDs behave differently throughout the life stages of the device. From a performance point of view, there are 3 distinct performance life stages
- Fresh out of the box (FOB)
- Steady State
As you can see from the graph below, a SSD, fresh out of the box (FOB) displayed considerable performance numbers. Over a period of time (the graph shown minutes), it transitioned into a mezzanine stage of lower IOPS and finally, it normalized to the state called the Steady State. The Steady State is the desirable test range that will give the most accurate type of IOPS numbers. Therefore, it is important that your storage vendor’s performance numbers should be taken during this life stage.
Another consideration when understanding the SSDs’ performance numbers are what type of tests used? The test could be done at the file system level or at the device level. As shown in the diagram below, the test numbers could be taken from many different elements through the stack of the data path.
Performance for cached data would given impressive numbers but it is not accurate. File system performance will not be useful because the data travels through different layers, masking the true performance capability of the SSDs. Therefore, SNIA’s performance is based on a synthetic device level test to achieve consistency and a more accurate IOPS numbers.
There are many other factors used to determine the most relevant performance numbers. The SNIA PTS test has 4 main test suite that addresses different aspects of the SSD’s performance. They are:
- Write Saturation test
- Latency test
- IOPS test
- Throughput test
Good morning, afternoon, evening, Ladies & Gentlemen, wherever you are.
Today, we are going to learn how to bake, errr … I mean, make a storage performance model. Before we begin, allow me to set the stage.
Don’t you just hate it when you are asked to do storage performance sizing and you don’t have a freaking idea how to get started? A typical techie would probably say, “Aiya, just use the capacity lah!”, and usually, they will proceed to size the storage according to capacity. In fact, sizing by capacity is the worst way to do storage performance modeling.
Bear in mind that storage is not a black box, although some people wished it was. It is not black magic when it comes to performance sizing because things can be applied in a very scientific and logical manner.
SNIA (Storage Networking Industry Association) has made a storage performance modeling methodology (that’s quite a mouthful), and basically simplified it into these few key ingredients. This recipe is for storage performance modeling in general and I am advising you guys out there to engage your storage vendors professional services. They will know their storage solutions best.
And I am going to say to you – Don’t be cheap and not engage professional services – to get to the experts out there. I was having a chat with an consultant just now at McDonald’s. I have known this friend of mine for about 6-7 years now and his name is Sugen Sumoo, the Director of DBORA Consulting. They specialize in Oracle and database performance tuning and performance forecasting and it is something that a typical DBA can’t do, because DBORA Consulting is the Professional Service that brings expertise and value to Oracle customers. Likewise, you have to engage your respective storage professional services as well.
In a cook book or a cooking show, you are presented with the ingredients used and in this recipe for storage performance modeling, the ingredients (in no particular order) are:
- Application block size
- Read and Write ratio
- Application access patterns
- Working set size
- IOPS or throughput
- Demand intensity
Application Block Size
First of all, the storage is there to serve applications. We always have to look from the applications’ point of view, not storage’s point of view. Different applications have different block size. Databases typically range from 8K-64K and backup applications usually deal with larger block sizes. Video applications can have 256K block sizes or higher. It all depends.
The best way is to find out from the DBA, email administrator or application developers. The unfortunate thing is most so-called technical people or administrators in Malaysia doesn’t have a clue about the applications they manage. So, my advice to you storage professionals, do your research on the application and take the default value. These clueless fellas are likely to take the default.
Read and Write ratio
Applications behave differently at different times of the day, and at different times of the month (no, it’s not PMS). At the end of the financial year or calendar, there are some tasks that these applications do as well. But in a typical day, there are different weightage or percentage of read operations versus write operations.
Most OLTP (online transaction processing)-based applications tend to be read heavy and write light, but we need to find out the ratio. Typically, it can be a 2:1 ratio or 60%:40%, but it is best to speak to the application administrators about the ratio. DSS (Decision Support Systems) and data warehousing applications could have much higher reads than writes while a seismic-analysis applications can have multiple writes during the analysis periods. It all depends.
To counter the “clueless” administrators, ask lots of questions. Find out the workflow of several key tasks and ask what that particular tasks do at different checkpoints of the application’s processing. If you are lazy (please don’t be lazy, because it degrades your value as a storage professional), use a rule of thumb.
Application access patterns
Applications behave differently in general. They can be sequential, like backup or video streaming. They can be random like emails, databases at certain times of the day, and so on. All these behavioral patterns affect how we design and size the disks in the storage.
Some RAID levels tend to work well with sequential access and others, with random access. It is not difficult to find out about the applications’ pattern and if you read more about the different RAID-levels in storage, you can easily identify the type of RAID levels suitable for each type of behavioral patterns.
Working set size
This variable is a bit more difficult to determine. This means that a chunk of the application has to be loaded into a working area, usually memory and cache memory, to be used and abused by the application users.
Unless someone is well versed with the applications, one would not be able to determine how much of the applications would be placed in memory and in cache memory. Typically, this can only be determined after the application has been running for some time.
The flexibility of having SSDs, especially the DRAM-type of SSDs, are very useful to ensure that there is sufficient “working space” for these applications.
IOPS or Throughput
According to SNIA model, for I/O less than 64K, IOPS should be used as a yardstick to do storage performance modeling. Anything larger, use throughput, in which MB/sec is the measurement unit.
The application guy would be able to tell you what kind of IOPS their application is expecting or what kind of throughput they want. Again, ask a lot of questions, because this will help you determine the type of disks and the kind of performance you give to the application guys.
If the application guy is clueless again, ask someone more senior or ask the vendor. If the vendor engineers cannot give you an answer, then they should not be working for the vendor.
This part is usually overlooked when it comes to performance sizing. Demand intensity refers to how intense is the I/O requests. It could come from 1 channel or 1 part of the applications, or it could come from several parts of the applications in parallel. It is as if the storage is being ‘bombarded’ by applications and this is the part that is hard to determine as well.
In some applications, the degree of intensity or parallelism can be tuned and to find out, ask the application administrator or developer. If not, ask the vendor. Also do a lot of research on the application’s architecture.
And one last thing. What I have learned is to add buffers to the storage performance model. Typically I would add about 10-20% extra but you never know. As storage professionals, I would strongly encourage to engage professional services, because it is worthwhile, especially in the early stages of the sizing. It is usually a more expensive affair to size it after the applications have been installed and running.
“Failure to plan is planning to fail”. The recipe isn’t that difficult. Go figure it out.
Gartner, in its recent Symposium and IT/Expo, laid out the top 10 IT trends for 2012. Here’s a list of their top 10 (in no particular order)
- Big Data, patterns and analytics
- Energy efficiency and monitoring
- Context-aware applications
- Staff retention and retraining
- Social Networks
- Cloud Computing
- Compute per square foot
- Fabrics stacks
The testing criteria for all storage solutions are basically the same, as follows: 1. The System Under Test (SUT) is run through a SUT Conditioning Test to get it into a known and stable state. 2. The SUT Conditioning Test is followed by a defined series of Active Test phases that collect data for the active metrics, each with a method to assure stability of each metric value. 3. The Active Test is followed by the so-called Ready Idle Test that collects data for the capacity metric. 4. Lastly, the Capacity Optimization Test phases are executed which demonstrate the storage system’s ability to perform defined capacity optimization methods. For each of the categories of storage, there will be different workloads and run times depending on the category characteristics.