Kubernetes Storage Patterns


Hello, Habr!

We remind you that we have released another extremely interesting and useful book on Kubernetes patterns. It all started with Brendan Burns's “Patterns”, and, by the way, work in this segment is in full swing. Today we invite you to read an article from the MinIO blog that summarizes the trends and specifics of data storage patterns in Kubernetes.


Kubernetes fundamentally changed the traditional patterns of application development and deployment. Now a team can develop, test, and deploy an application across different environments in a matter of days, all within Kubernetes clusters. With previous-generation technology, the same work usually took weeks, if not months.

This acceleration was made possible by the abstraction Kubernetes provides: Kubernetes itself handles the low-level details of physical or virtual machines, letting users declare, among other parameters, the desired CPU, the required memory, and the number of container instances. Backed by a huge community and an ever-expanding range of use cases, Kubernetes leads all container orchestration platforms by a wide margin.

As the use of Kubernetes expands, so does the confusion about the storage patterns used in it.

With everyone competing for a piece of the Kubernetes pie (that is, for its data storage), the signal drowns in noise whenever the conversation turns to storage.
Kubernetes embodies a modern model for developing, deploying, and managing applications. This model decouples data storage from compute. To fully understand this decoupling in the context of Kubernetes, you also need to understand what stateful and stateless applications are and how storage fits into the picture. This is where the REST API approach used by S3 has clear advantages over the POSIX/CSI approach typical of other solutions.

In this article we will talk about storage patterns in Kubernetes and separately take up the stateful-versus-stateless debate, so that the difference between the two, and why it matters, becomes clear. Later in the text, we will look at applications and their storage patterns in light of best practices for working with containers and Kubernetes.

Stateless Containers


Containers are inherently lightweight and ephemeral. They can be stopped, deleted, or redeployed on another node with ease, and all of this takes a matter of seconds. In a large container orchestration system such operations happen all the time, and users do not even notice the changes. However, such moves are possible only if the container has no dependencies on the node it runs on. Containers like these are said to be stateless.

Stateful Containers


If a container stores data on locally attached devices (or on a block device), then in case of a failure the storage it sits on must be moved to a new node together with the container itself. This matters because otherwise the application running in the container cannot function correctly, since it needs the data stored on local media. Containers like these are said to be stateful.

From a purely technical point of view, stateful containers can also be moved to other nodes. This is typically achieved with distributed file systems or networked block storage attached to all the nodes the containers run on. The containers thus gain access to volumes for persistent data storage, and the information is kept on disks across the network. I will call this method the “stateful container approach” and use that term throughout the rest of the article for consistency.



In a typical stateful container approach, all application pods attach to a single distributed file system, producing a kind of shared storage where all application data lives. Variations are possible, but this is the high-level picture.

Now let's look at why the stateful container approach is an antipattern in the cloud-native world.

Cloud-Native Application Design


Traditionally, applications used databases for structured information and local disks or distributed file systems as a dumping ground for all unstructured and even semi-structured data. As unstructured data volumes grew, developers realized that POSIX was too chatty, carried significant overhead, and ultimately held the application back when it moved to a really large scale.

This was a major driver behind a new storage standard: cloud-native storage that works primarily through a REST API, freeing the application from the burden of maintaining a local data store. The application then effectively operates statelessly (since its state lives in remote storage). Modern applications are built from scratch with this in mind. As a rule, any modern application that handles data of some kind (logs, metadata, blobs, and so on) follows the cloud-native paradigm, where state is handed off to a software system dedicated to storing it.
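To make the pattern concrete, here is a minimal, purely illustrative Python sketch (not a real S3 client; `ObjectStore` and `app_handler` are hypothetical names). The application container holds nothing between calls: every read and write goes to a remote, REST-style object store addressed by bucket and key, so the container itself stays stateless.

```python
class ObjectStore:
    """Stands in for an S3-compatible service reached over HTTP.

    An in-memory dict plays the role of the remote storage system here.
    """

    def __init__(self):
        self._objects = {}  # (bucket, key) -> bytes

    def put_object(self, bucket: str, key: str, body: bytes) -> None:
        self._objects[(bucket, key)] = body

    def get_object(self, bucket: str, key: str) -> bytes:
        # Raises KeyError when the object does not exist.
        return self._objects[(bucket, key)]


def app_handler(store: ObjectStore, user_id: str, event: str) -> int:
    """A stateless request handler: all state lives in the object store."""
    key = f"events/{user_id}"
    try:
        history = store.get_object("app-data", key).decode().splitlines()
    except KeyError:
        history = []  # first event for this user
    history.append(event)
    store.put_object("app-data", key, "\n".join(history).encode())
    return len(history)  # number of events recorded for this user
```

Because the handler derives its answer entirely from the store, any replica of the container on any node can serve the next request.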

A stateful container approach rolls this entire paradigm right back to where it started!

When POSIX interfaces are used for data storage, applications behave as though they were stateful. They thereby abandon the most important tenets of cloud-native design: the ability to scale an application's workers up and down with incoming load, to move to a new node as soon as the current one fails, and so on.

A closer look reveals that when choosing a data store we run into the “POSIX versus REST API” dilemma again and again, but with the POSIX problems additionally aggravated by the distributed nature of Kubernetes environments. In particular:

  • Chattiness : POSIX is a verbose protocol; even a simple operation involves a series of metadata calls, and in a distributed environment each of those calls costs a network round trip. With the S3 API, by contrast, a read or write is a single HTTP request.
  • Statefulness : a mounted POSIX volume ties the application to its storage, so pods behave as if they were stateful even when the application logic is not; with the S3 API, state lives entirely on the remote side and pods remain free to move.
  • Scalability : at a really large scale, the overhead of POSIX semantics starts to hold the application back, while REST-based object storage scales by simply adding more clients and more storage nodes.
  • Failover : if a node fails, a POSIX volume has to be reattached or moved to the new node together with the container, whereas an application that keeps its state behind the S3 API can be rescheduled to any node immediately.


While the Container Storage Interface (CSI) has been a great help in spreading the Kubernetes volume layer, handing parts of it over to third-party storage vendors, it has also inadvertently fostered the belief that the stateful container approach is the recommended way to store data in Kubernetes.

CSI was developed as a standard for exposing arbitrary block and file storage systems to legacy applications running on Kubernetes. And, as this article has argued, the only situation where the stateful container approach (and CSI in its current form) is appropriate is when the application itself is a legacy system to which support for an object storage API cannot be added.

It is important to understand that by using CSI in its current form, that is, by mounting volumes into modern applications, we run into roughly the same problems as systems whose storage is organized in the POSIX style.

A Better Approach


It is important to understand here that most applications are not inherently stateful or stateless. The behavior depends on the overall architecture of the system and on the specific choices made during design. Let's talk a bit about stateful applications.

In principle, all application data can be divided into several broad types:

  • Log Data
  • Time-series data
  • Transaction data
  • Metadata
  • Container images
  • Blob data (blobs)

All of these data types are well supported on modern storage platforms, and there are several cloud-native platforms tailored to each of these specific formats. For example, transaction data and metadata can live in a modern cloud-native database such as CockroachDB, YugaByte, and so on. Container images or blob data can be kept in a Docker registry backed by MinIO. Time-series data can be stored in a time-series database such as InfluxDB. We will not go into the details of each data type and its applications here; the general idea is to avoid persistent storage based on locally mounted disks.
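The mapping described above can be sketched as a simple routing table. The backend names follow the article's own examples; the `route` function and the type labels are hypothetical, illustrating only that each broad data type goes to a purpose-built system rather than a locally mounted disk.

```python
# Illustrative routing of broad data types to purpose-built storage backends.
# Backend choices mirror the article's examples; names are assumptions.
BACKEND_FOR = {
    "log": "object storage (e.g. MinIO)",
    "time-series": "time-series database (e.g. InfluxDB)",
    "transaction": "cloud-native database (e.g. CockroachDB)",
    "metadata": "cloud-native database (e.g. CockroachDB)",
    "container-image": "registry backed by object storage",
    "blob": "object storage (e.g. MinIO)",
}


def route(data_type: str) -> str:
    """Return the recommended backend for a given broad data type."""
    try:
        return BACKEND_FOR[data_type]
    except KeyError:
        raise ValueError(f"unknown data type: {data_type}")
```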



In addition, it is often useful to provide a temporary caching layer that serves as a kind of scratch file storage for applications, but applications should not depend on this layer as a source of truth.
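A minimal sketch of that idea, with hypothetical names: the cache speeds up reads, but the object store (an in-memory dict standing in for it here) remains the source of truth, so dropping the cache loses nothing.

```python
class CachedReader:
    """A disposable read cache in front of a durable store.

    The store is the source of truth; the cache may vanish at any time
    (e.g. when the pod is rescheduled) without any data loss.
    """

    def __init__(self, store: dict):
        self.store = store  # stands in for remote object storage
        self.cache = {}     # ephemeral, node-local

    def get(self, key: str) -> bytes:
        if key in self.cache:
            return self.cache[key]       # fast path: cache hit
        value = self.store[key]          # miss: go to the source of truth
        self.cache[key] = value
        return value

    def drop_cache(self) -> None:
        self.cache.clear()  # simulates losing the node-local cache
```

The application behaves identically before and after `drop_cache`, which is exactly the property the paragraph above asks for.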

Stateful Application Storage


While in most cases it is useful to keep applications stateless, applications that are designed to store data, such as databases, object stores, and key-value stores, must keep state. Let's look at how these applications are deployed on Kubernetes. We'll take MinIO as the example, but similar principles apply to any other large cloud-native storage system.

Cloud-native applications are designed to take full advantage of the flexibility inherent in containers. This means they make no assumptions about the environment they will be deployed into. For example, MinIO uses an internal erasure coding mechanism that gives the system enough resilience to remain operational even if half the drives fail. MinIO also manages data integrity and security with its own server-side hashing and encryption.

For cloud-native applications like these, local persistent volumes (PVs) are the most convenient backing storage. A local PV provides raw storage capacity, while the applications running on top of these PVs carry the intelligence needed to scale the data and manage growing data requirements themselves.
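For illustration, a local PV manifest looks roughly like this; the name, capacity, disk path, and hostname are placeholders to adapt to your cluster.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv            # placeholder name
spec:
  capacity:
    storage: 100Gi          # raw capacity exposed to the application
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1   # placeholder: a locally attached disk
  nodeAffinity:             # pins the volume to the node that owns the disk
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-1    # placeholder hostname
```

Note the required `nodeAffinity`: a local PV is deliberately bound to one node, and it is the storage application above it (MinIO in this example) that handles replication and failure tolerance across nodes.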

This approach is much simpler and scales significantly better than CSI-based PVs, which bring their own layers of data management and redundancy into the system; those layers usually conflict with applications that are designed to be stateful.

A Confident Move Toward Decoupling Data from Compute


In this article we have discussed how applications are being reoriented toward stateless operation, or, in other words, how data storage is being decoupled from the compute performed on it. To close, let's look at a few real-world examples of this trend.

Spark, the renowned data analytics platform, has traditionally been deployed statefully on the HDFS file system. As Spark moves into the cloud-native world, however, it is increasingly run statelessly via `s3a`: Spark uses s3a to hand its state off to other systems, while the Spark containers themselves operate entirely without state. Other major enterprise players in big-data analytics, in particular Vertica, Teradata, and Greenplum, are likewise moving toward separating data storage from the compute over it.
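As a hedged sketch of what that looks like in practice: the `fs.s3a.*` keys below are the standard Hadoop S3A configuration names, while the endpoint, credentials, and the `spark_submit_args` helper are placeholders for illustration. A Spark session configured this way reads and writes `s3a://` paths, keeping the executors themselves stateless.

```python
# Standard Hadoop S3A configuration keys (values here are placeholders),
# prefixed with spark.hadoop. so Spark forwards them to the Hadoop layer.
S3A_CONF = {
    "spark.hadoop.fs.s3a.endpoint": "http://minio.example:9000",  # placeholder
    "spark.hadoop.fs.s3a.access.key": "ACCESS_KEY",               # placeholder
    "spark.hadoop.fs.s3a.secret.key": "SECRET_KEY",               # placeholder
    "spark.hadoop.fs.s3a.path.style.access": "true",
}


def spark_submit_args(conf: dict) -> list:
    """Render the settings as `--conf key=value` flags for spark-submit."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args
```

With this configuration in place, a job can read `s3a://bucket/path` directly, and the cluster's pods can be scaled or rescheduled freely because no state lives on their local disks.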

Similar patterns can be seen on other large analytics platforms, including Presto, TensorFlow, R, and Jupyter. Offloading state to remote cloud storage systems makes an application far easier to manage and scale. It also helps keep the application portable across a variety of environments.
