During Storage Field Day, one of the most interesting sessions was with Western Digital. It was a 4-hour session, so I won’t tell you to watch all the segments (although some of them were particularly enlightening about the future of storage and infrastructure in general). If you have to pick only one, watch the part about the gaming industry. I’m not a gamer, but it was really fascinating. Now, let’s focus on the hard disk and the datacenter!
Why Did I Say That the Hard Disk is Dead?
The hard disk drive is slow, damn slow, and flash memory is being adopted for a growing number of use cases. Now, with 3D NAND and QLC, we are getting lower prices, and even if performance is not as good as that of other types of flash memory, the additional density and overall efficiency make up for it. In fact, the number of products adopting QLC is steadily growing. For example, Pure Storage launched FlashArray//C (where C stands for capacity-optimized) a few months ago, and startups like VAST Data are building entirely new architectures on this type of media.
The presence of flash memory is quickly growing in the enterprise data center, while the number of hard disks is slowly decreasing. Flash is faster, more power- and space-efficient, more reliable, and easier to manage. It has almost all of the pros and very few of the cons. You may think that this is the end of the story: in a few years, flash will win, and the HDD will die. Well, the reality is a little bit different.
It’s All About $/GB
The hard disk will disappear from the small datacenter. No more hard drives in homes, no more hard drives in small businesses, and no more hard drives in medium-sized enterprises either. These organizations will rely on flash and the cloud, or probably only the cloud!
Flash will be cheap enough for every type of active workload or application, but it is also highly likely that we will keep producing data at an accelerated rate, and we will need to store it somewhere when it becomes cold. There are three answers for that: cloud, disk, or tape.
Why did I mention tape? Am I not talking about disks here? Yes, but tape is still very much part of the game. Tape has the best $/GB you can find, and all hyperscalers are avid tape consumers, both for backup and cold storage. Very large enterprises still rely on tape as well, again for backups and archives that span decades. Tape is for cold and deep storage, then, while flash is for hot and active storage. How do you fill the gap between them? The hard disk is still the answer.
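The hot/cold/frozen split above is essentially a tiering policy driven by access recency. Here is a minimal sketch of that idea; the thresholds (30 days, one year) are purely illustrative assumptions of mine, not figures from any vendor:

```python
from datetime import datetime, timedelta

# Hypothetical thresholds -- purely illustrative, not from the article.
HOT_DAYS = 30       # accessed within a month -> flash
COLD_DAYS = 365     # untouched for more than a year -> tape

def pick_tier(last_access: datetime, now: datetime) -> str:
    """Map a dataset to a storage tier by how recently it was accessed."""
    age = now - last_access
    if age <= timedelta(days=HOT_DAYS):
        return "flash"   # hot, active data
    if age <= timedelta(days=COLD_DAYS):
        return "disk"    # cold but not frozen: still random access
    return "tape"        # deep archive: sequential, slow retrieval

now = datetime(2020, 2, 1)
print(pick_tier(datetime(2020, 1, 20), now))  # flash
print(pick_tier(datetime(2019, 6, 1), now))   # disk
print(pick_tier(datetime(2015, 1, 1), now))   # tape
```

Real lifecycle engines (object storage tiering, HSM systems) weigh cost and retrieval latency too, but the shape of the decision is the same: disk is the middle tier between flash and tape.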
The industry (Western Digital, in this case) has robust roadmaps for disks, with new technologies that will increase capacity up to 50TB per drive in a few years while keeping costs very low. The disk is still a random-access device, as opposed to tape, which reads and writes sequentially and therefore makes information retrieval very slow. So for all the data that is cold, but not frozen, the disk will be more than adequate.
This capacity won’t come with a price increase, but it will come at a cost anyway. In fact, the new technologies that improve density (Shingled Magnetic Recording, for example) do so at the expense of performance and usability. Furthermore, new techniques like dual actuators will make operations more complex than today, and, last but not least, larger capacities will also mean longer rebuild times in case of a failure. It can take a week to rebuild a 14TB HDD today; think about rebuilding a 50TB one!
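To see why rebuild times scale so badly, a back-of-envelope calculation helps. This sketch computes the theoretical floor (streaming the whole drive at an assumed ~200 MB/s sustained write speed, a figure I picked as a rough ballpark for large HDDs); real rebuilds compete with production I/O, which is why they stretch to days or a week:

```python
def sequential_rewrite_hours(capacity_tb: float, write_mb_s: float = 200.0) -> float:
    """Lower bound on rebuild time: capacity streamed at a sustained
    write speed (200 MB/s is an assumed, illustrative figure)."""
    seconds = capacity_tb * 1e12 / (write_mb_s * 1e6)
    return seconds / 3600

print(f"14TB: {sequential_rewrite_hours(14):.1f} h")  # ~19.4 h
print(f"50TB: {sequential_rewrite_hours(50):.1f} h")  # ~69.4 h, i.e. ~3 days
```

Even the best case is close to a day for 14TB and several days for 50TB, before accounting for the foreground workload the array must keep serving.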
So, to recap very quickly: the hard disk still wins on $/GB, but it is no longer made for you.
Hard Disks are for Hyperscalers
As happened with tape, the disk will meet a similar fate.
All the added complexity will make hard disk drives impractical for small organizations that don’t have enough data to store on them (and the bar will easily pass the 1PB mark). Think about it: 1PB equals 20 hard drives; with parity and spare drives, it becomes 24. Such a configuration gives you throughput, but very few IOPS, and the risk of data loss is high due to the rebuild times. Good luck with that!
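The arithmetic above can be made explicit. This sketch assumes hypothetical 50TB drives (matching the roadmap figure mentioned earlier), two parity drives and two spares, and a rough 100 random IOPS per 7200rpm HDD; all of these are illustrative assumptions:

```python
import math

# Illustrative sizing for 1PB of usable capacity on hypothetical 50TB drives.
USABLE_PB = 1
DRIVE_TB = 50
PARITY_DRIVES = 2   # assumed erasure-coding / RAID overhead
SPARE_DRIVES = 2    # assumed hot spares
IOPS_PER_HDD = 100  # rough figure for a 7200rpm drive

data_drives = math.ceil(USABLE_PB * 1000 / DRIVE_TB)       # 20 drives
total_drives = data_drives + PARITY_DRIVES + SPARE_DRIVES  # 24 drives
pool_iops = data_drives * IOPS_PER_HDD                     # ~2,000 random IOPS

print(total_drives, pool_iops)
```

Two thousand-odd random IOPS across a whole petabyte is the crux: plenty of sequential throughput, but nowhere near enough random performance for active workloads.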
If you are not big enough to build an HDD-based infrastructure, don’t do it. Use the cloud instead! At the end of the day, it will be cheaper, and there are plenty of options out there: AWS, GCP, Azure, Wasabi, you name it! And don’t worry, it is highly likely that your data will end up in an HDD-based infrastructure anyway; the only difference is that that infrastructure will be big enough to justify its existence in terms of $/GB, reliability, and availability.
If you look at the roadmap, SMR, dual actuators, higher capacities, and zoned storage APIs are all features designed without the needs of small infrastructures in mind; they take this technology to its limits to satisfy hyperscaler requirements. In fact, hyperscalers have complete control over the entire stack and can take advantage of all the functionality I mentioned by developing the necessary components and designing their infrastructures around them.
The hard disk will disappear from your datacenter unless you need several petabytes of cold storage installed locally on your premises. $/GB may look good on paper, but it will be impractical.
Scale-out storage vendors are not working to make their software efficient with next-generation hard drives (SMR, dual actuators, zones, etc.); they are concentrating on optimizing their solutions for flash, looking for balanced architectures instead of chasing pure $/GB.
The all-flash datacenter will become a reality, and you will store more and more of your cold data in the cloud. This means that your data will probably end up on disks and tapes anyway; it is just not your problem anymore.
Disks will reach impressive capacities (18/20TB now, 50TB in a few years), but again, it is unlikely that you will see one of these devices in an enterprise data center near you, let alone be able to use it efficiently. The primary customers for these devices will be hyperscalers and very large enterprises.
Long live the hard disk!
Disclaimer: I was invited to Storage Field Day 19 by GestaltIT, and they paid for travel and accommodation, I have not been compensated for my time and am not obliged to blog. Furthermore, the content is not reviewed, approved, or edited by any other person than the GigaOm team. Some of the vendors mentioned in this article are GigaOm clients.