Does your data center really need an All-Flash array?

Most data centers that use spindle-based storage suffer from performance problems. Replacing existing storage technology with All-Flash arrays is an expensive and disruptive proposition. It requires data centers to sacrifice existing investments in storage technology, acquire new and expensive equipment, and bring existing systems down while existing data is being migrated.

Not only is an all-out replacement of spindle-based storage systems expensive, it may also be unwarranted, because it’s possible to achieve performance acceleration very close to that of All-Flash arrays by intelligently optimizing the Active I/O data set and putting it close to the server.

All I/Os are not equal

The Pareto Principle, or the 80/20 rule, states that for many events roughly 80% of the effects come from 20% of the causes. In economics, 80% of the land is owned by 20% of the population and in business, roughly 80% of revenues come from 20% of the clients. The numbers and distribution may vary, but interestingly this law applies across many domains. We have also seen it apply to I/O workload distributions in our lab.

We run regular experiments in our lab to understand I/O behavior in database applications and virtual machines in general. Our results show a pattern similar to the Pareto Principle. Even though it’s hard to generalize a specific number across different applications and workloads, we have often noticed that approximately 10% to 20% of the total data set is accessed about 80% to 90% of the time. We call this most frequently accessed data the Active I/O data set, or the Primary I/O data set.
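To make the skew described above concrete, here is a minimal Python sketch of how one might measure it on an access trace. The function name `hot_set_coverage`, the representation of a trace as a list of block IDs, and the synthetic trace itself are assumptions made for illustration; they are not taken from our lab tooling or from the product.

```python
from collections import Counter


def hot_set_coverage(trace, hot_percent):
    """Return the share of accesses that land on the hottest
    `hot_percent` percent of distinct blocks in `trace`.

    `trace` is a list of block IDs, one entry per I/O request
    (a hypothetical format chosen for this sketch).
    """
    if not trace:
        return 0.0
    counts = Counter(trace)
    # Number of distinct blocks in the hot set, rounded up, at least one.
    hot_blocks = max(1, (len(counts) * hot_percent + 99) // 100)
    hottest = counts.most_common(hot_blocks)
    return sum(n for _, n in hottest) / len(trace)


# Synthetic trace, not measured data: 10 distinct blocks, 100 accesses,
# with blocks 0 and 1 receiving 80 of them.
trace = [0] * 50 + [1] * 30 + list(range(2, 10)) * 2 + [2, 3, 4, 5]

print(hot_set_coverage(trace, 20))  # 0.8 for this synthetic trace
```

On a real workload the trace would come from whatever I/O tracing facility is available, and the resulting share will vary by application.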

Our product, PrimaryIO APA, accelerates I/O performance by as much as 600%, and in some cases more, by optimizing the Active I/O data set using a software-defined solution that works with an in-server SSD cache.

Primary I/O should be close to the server

Caching data to improve performance is not new. Many storage solutions cache data in the array to improve performance, and they do improve performance. However, this is not an optimal solution, because it disregards network latency. Even though array-based caches improve the array’s response time, that improvement may not result in an equivalent gain for the VM. Since storage arrays typically sit on the network, we have to account for network latency when we talk about data response time as seen by the VM.

We solve this problem by eliminating network latency for cached data. We put an SSD cache inside the server. Our software-defined solution, which works with VMware’s VAIO API, uses analytics to determine the Active I/O data set and caches it on the in-server SSD for maximum performance acceleration.
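The latency argument in this section can be sketched as a simple expected-value model. The Python below is only an illustration under stated assumptions: the two function names, the model itself (a fixed network round trip, a single hit ratio, no queuing), and every number in the example are placeholders, not measurements from our lab or from any product.

```python
def array_cache_latency_us(hit_ratio, network_us, array_cache_us, disk_us):
    """Average latency seen by the VM when the cache lives in the array.

    Every request crosses the network, whether it hits the cache or not.
    """
    return network_us + hit_ratio * array_cache_us + (1 - hit_ratio) * disk_us


def server_cache_latency_us(hit_ratio, local_ssd_us, network_us, disk_us):
    """Average latency seen by the VM when the cache lives in the server.

    Hits are served locally; only misses pay the network round trip.
    """
    return hit_ratio * local_ssd_us + (1 - hit_ratio) * (network_us + disk_us)


# Placeholder values in microseconds, chosen only to show the shape
# of the comparison. Substitute figures measured in your own environment.
hit_ratio = 0.8
network_us = 500
cache_us = 100
disk_us = 5000

print(array_cache_latency_us(hit_ratio, network_us, cache_us, disk_us))
print(server_cache_latency_us(hit_ratio, cache_us, network_us, disk_us))
```

In this model the gap between the two results is the hit ratio multiplied by the network round trip, so the benefit of an in-server cache grows with both the hit ratio and the network latency.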

Storage performance acceleration should not disrupt existing IT investments

All-Flash arrays are now being promoted as a cure-all for storage performance problems. But is it really necessary to throw away all the investments that have gone into spindle-based storage and replace the storage array with an All-Flash solution? Eventually the answer may be yes, but in our opinion technology upgrades, especially expensive ones, should be done in stages rather than disruptively. SSD technology is maturing rapidly in terms of speed, reliability, and cost. By deferring an All-Flash upgrade, you will be able to purchase better technology at a lower price when you really need it.

PrimaryIO APA gives data centers an alternative path. Our product can accelerate your data center’s performance by a factor equivalent to that of an All-Flash array, at a fraction of the cost, and without needing to replace any spindle-based storage. This allows data centers to upgrade to All-Flash once its cost and performance have matured enough.

Finally, our product accelerates not only spindle-based storage but also flash-based storage. Because it caches the Active I/O data set close to the server, you will continue to reap benefits after you upgrade to an All-Flash array, so your investment in PrimaryIO APA is not wasted.