Why should companies switch to an open environment ahead of the zettabyte era?

Data growth is on a steep trajectory: according to forecasts by the International Data Corporation (IDC), 103 zettabytes of information will be generated worldwide in 2023. As 5G and IoT devices spread further and video volumes grow significantly, companies will have to adapt their technologies for storing data and extracting valuable information from it, and so far we have barely begun this process. One thing can already be said with confidence: on the threshold of the zettabyte era, companies must rethink their approaches to data center architecture in order to keep up with these trends.



A New Approach to Storage Architecture in the Zettabyte Era


First of all, what is a zettabyte? A zettabyte is a trillion gigabytes. That is a lot of data, but unlike a gigabyte or even a terabyte, the word "zettabyte" is not widely known, perhaps because the need to store such a volume of information for commercial purposes is still rare. That will not always be the case.
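The unit arithmetic above is easy to sanity-check. Using decimal (SI) prefixes, where each step is a factor of 1000:

```python
# Decimal (SI) storage units: each prefix is a factor of 1000.
GB = 10**9    # gigabyte
TB = 10**12   # terabyte
ZB = 10**21   # zettabyte

print(ZB // GB)  # gigabytes in a zettabyte: 1000000000000 (a trillion)
print(ZB // TB)  # terabytes in a zettabyte: 1000000000 (a billion)
```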

Innovation, products and requirements during this new architectural shift will depend on several key factors.

First: the need to disaggregate compute, storage, and network resources in order to use each of these components as effectively and optimally as possible. Disaggregation is the only way to cope with the volume, velocity, and variety of data that the zettabyte era will bring.

Second: the storage infrastructure must be purpose-built, that is, specialized. Companies will no longer be able to rely on general-purpose solutions, since a single one-size-fits-all product simply cannot cover the whole range of large-scale tasks. In the zettabyte world, companies will have to work as productively as possible and focus all their attention on a single goal: striking the right balance between capacity, density, and cost.

Third: all the elements of the process must interoperate and process data intelligently. Hardware and software must be designed to work together, and developing both well requires deep expertise across the full technology stack; only then can the performance and functionality of the whole system be maximized.

Specialized solutions based on shingled magnetic recording (SMR)


When considering possible solutions that could meet the data-driven needs of the next decade, it is important to get feedback from the open source and Linux communities on the key technologies underlying shingled magnetic recording (SMR). With SMR, data tracks partially overlap on the disk like roof shingles, which lets manufacturers increase capacity by roughly 20%. This only works if tracks are written sequentially, so that writing a new track does not overwrite the data on the track it overlaps.

For many hyperscale deployments, sequential recording is a good fit, since large-scale workloads such as video streaming follow a write-once / read-many pattern. But getting the best performance from SMR requires redesigning the architecture on the host side: the operating system must be changed to issue writes sequentially, or the application itself must be made aware that data is written sequentially.
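The host-side constraint can be illustrated with a toy model (the class and its interface below are hypothetical, not a real driver API): each host-managed SMR zone accepts writes only at its current write pointer, and the only way to rewrite data is to reset the whole zone.

```python
class SMRZone:
    """Toy model of one host-managed SMR zone: writes are accepted
    only at the current write pointer, i.e. strictly in sequence."""

    def __init__(self, size_blocks):
        self.size = size_blocks
        self.write_pointer = 0
        self.blocks = [None] * size_blocks

    def write(self, lba, data):
        if lba != self.write_pointer:
            raise IOError(f"non-sequential write at block {lba}, "
                          f"write pointer is at {self.write_pointer}")
        self.blocks[lba] = data
        self.write_pointer += 1

    def reset(self):
        # The only way to rewrite a zone is to reset it and start over.
        self.write_pointer = 0
        self.blocks = [None] * self.size


zone = SMRZone(size_blocks=4)
zone.write(0, b"a")
zone.write(1, b"b")
try:
    zone.write(3, b"x")  # out-of-order write is rejected
except IOError as err:
    print("rejected:", err)
```

This is why the file system or application must be adapted: any code path that issues random writes to such a zone will fail until it is restructured to append sequentially.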

Some architectural changes will be required at the initial stage, but the huge gains in density and cost will clearly demonstrate the advantages of specialized hardware and of designs that take application behavior into account.

Using Zoned Namespaces Technology


Comparing SMR HDDs with SSDs may seem strange, because in many ways these technologies are conceptually far apart. However, if you look at SSDs and NAND flash in the context of their place in a disaggregated future, you will find a technology that parallels SMR on HDDs: Zoned Namespaces (ZNS).

NAND-based storage devices endure only a limited number of program/erase cycles and therefore need to be managed. The Flash Translation Layer (FTL) intelligently handles everything from caching to performance and evens out wear across the cells. On a zettabyte scale, however, such device-level management inserts an intermediate layer between the host and the drive itself, which hurts bandwidth, latency, and cost.

In the new era, companies will want to keep these metrics under control and maximize efficiency, so this management function should move from the device level to the host level, which is exactly the idea behind the host-managed SMR approach.

ZNS divides the flash drive into zones, and each zone becomes an isolated namespace. Cloud providers can, for example, direct different types of workloads or data to different zones, gaining the ability to identify predictable usage patterns for a given set of users. More importantly, data within a zone is written sequentially, just as with shingled magnetic recording. And suddenly the need for all that on-device drive management simply disappears. The result:

  • additional savings, since far less spare NAND capacity (over-provisioning) is required;
  • longer drive life thanks to reduced write amplification;
  • a significant reduction in latency;
  • a major increase in bandwidth.
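A minimal sketch of the zoning idea described above (an illustrative model only, with invented names, not the real NVMe ZNS command set): the host splits the device into zones, appends to each zone sequentially, and can dedicate zones to different workload types.

```python
class ZNSDevice:
    """Toy model of a ZNS SSD: the flash is divided into zones, each an
    isolated namespace that is written strictly sequentially."""

    def __init__(self, num_zones, zone_blocks):
        self.zone_blocks = zone_blocks
        self.zones = {z: [] for z in range(num_zones)}

    def append(self, zone_id, data):
        zone = self.zones[zone_id]
        if len(zone) >= self.zone_blocks:
            raise IOError(f"zone {zone_id} is full")
        zone.append(data)     # data always lands at the zone's write pointer
        return len(zone) - 1  # block offset within the zone


dev = ZNSDevice(num_zones=2, zone_blocks=100)
# The host separates workloads by zone, e.g. log entries vs. video segments:
dev.append(0, b"log-entry-1")
dev.append(0, b"log-entry-2")
dev.append(1, b"video-segment-1")
print(len(dev.zones[0]), len(dev.zones[1]))  # prints: 2 1
```

Because the host decides placement and writes each zone in order, the device no longer needs a heavyweight FTL to hide random writes, which is the source of the savings listed above.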



Zoned Storage - a unifying initiative supporting the SMR and ZNS technologies


As companies prepare for their growing data needs, an important role falls to initiatives such as Zoned Storage, which works with the professional community to establish ZNS as an open standard that can use the same interfaces and application programming interface (API) as SMR. This will let users access the entire storage tier through a single interface. As a result, data center architects will find it easier to move to zettabyte-scale architectures, because they will not have to change their applications regardless of which storage solution they choose. Disaggregated, specialized, and intelligent architectures will allow companies to find a new balance between performance, latency, and cost.
