How we solved the problem of three monoliths

Digitalization features ever more prominently in the strategies of most companies: some are adopting modern technologies (for example, Big Data, IoT, AI, blockchain), while others are automating their internal processes across the board. Yet despite growing efforts and investment in implementing these systems, many rate the results as mediocre. Ideally, a modern organization should be able to quickly create new digital products or integrate with popular third-party services; take its processes beyond the organization's boundaries; and interact effectively with partners while keeping its own processes isolated. It must also be able not only to collect data, but to access and manage it quickly. Yet even mature companies struggle with transforming and managing data amid constantly competing business priorities. What prevents them from getting this right?

Our DTG team's experience in creating digital products and services lets us state that these problems are rooted in the problem of three monoliths: the application monolith, the integration monolith, and the data monolith. They are the legacy of traditional architectural paradigms and culture, of relying on whatever data happens to be available, and of working in a “layered” system in which the isolation of the IT department from the business leads to the loss of data and of knowledge about it. As a solution, we see a transition from traditional development and management approaches to distributed ones, which implies serious technical and cultural changes in the organization.

But first things first. Let us briefly describe what these notorious monoliths are, and then move on to the solutions we propose for overcoming the difficulties they create.


Application Monolith


The first of the three architectural problems in building enterprise solutions is the application monolith, which emerges as more and more functions are added to an existing application. Over the years the application turns into a “monster” with masses of interwoven functionality and co-dependent components, which entails the following drawbacks:

  • a single point of failure (a failure in one application module brings down the whole application, and every employee working with it is forced to stop);
  • difficulty in assuring the required quality of the product and the need for extensive regression testing;
  • a single monolithic team, which it is impractical to grow, since adding people will neither speed up nor simplify development.

Microservices help overcome these problems. The idea of the approach is that a monolithic application is split into several small applications, each consisting of a group of services. 


This provides far greater scalability than the monolithic approach, since individual high-load services can be scaled as needed rather than the entire application. Microservices also allow several teams in an organization to work independently and release new features at their own pace.
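To make this concrete, here is a minimal sketch of what one such small service might look like, using Spring Boot (one of the technologies we mention below). The order domain, the endpoints, and the in-memory store are purely illustrative assumptions:

```java
// A minimal standalone microservice owning a single business function.
// Illustrative only: the "orders" domain and endpoints are invented here.
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.*;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

@SpringBootApplication
@RestController
@RequestMapping("/orders")
public class OrderServiceApplication {

    // An in-memory store keeps the sketch self-contained; a real service
    // would own its private datastore (one database per service).
    private final Map<String, String> orders = new ConcurrentHashMap<>();

    @PostMapping("/{id}")
    public String create(@PathVariable String id, @RequestBody String payload) {
        orders.put(id, payload);
        return "created";
    }

    @GetMapping("/{id}")
    public String get(@PathVariable String id) {
        return orders.getOrDefault(id, "not found");
    }

    public static void main(String[] args) {
        // Each such service is deployed and scaled independently of the rest.
        SpringApplication.run(OrderServiceApplication.class, args);
    }
}
```

Because the service owns one narrow function and its own data, a failure or redeployment here does not take the rest of the landscape down with it.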

Although the idea of modularity has existed for many years, microservice architecture provides much greater flexibility, allowing organizations to respond more quickly to changing market conditions.

But do not be naive enough to believe that microservices will rid your IT environment of complexity altogether. Microservices come with a trade-off: development flexibility grows, but so does the complexity of management, development, and support, because of their decentralization. Moreover, not every application in a corporate environment is suited to a microservice architecture.

Integration Monolith


The second architectural problem, the integration monolith, is connected with the use of an enterprise service bus (Enterprise Service Bus, ESB). This is an architectural pattern with a single enterprise-wide interaction layer that provides centralized, unified, event-oriented messaging.


In this traditional approach, integration is treated as an intermediate layer between the layer of data sources and the layer of their consumers. The ESB provides services that are used by many systems across different projects, yet it is managed by a single integration team, which has to be highly qualified. It is also difficult to scale. Because the ESB team becomes the project's bottleneck, shipping changes is hard and the queue of improvements keeps growing:

  • integration is possible only through the bus and only as part of the next release, so given the large flow of requests it is best to apply months in advance;
  • any change can be made only once it is agreed with the other consumers, since not everything is decomposed and isolated; technical debt accumulates and only grows over time.

In monolithic architectures, data is “at rest”. But business runs on streams of events and demands rapid change, and where everything changes very quickly, an ESB is a poor fit.

These problems are addressed by the Agile Integration approach, which assumes neither a single centralized integration solution for the whole company nor a single integration team. With it, several cross-functional development teams emerge, each knowing what data it needs and of what quality. Yes, under this approach some work may be duplicated, but it reduces dependencies between teams and allows different services to be developed largely in parallel.
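As an illustration, here is a minimal sketch of an integration route that one such cross-functional team might own end to end, written with Apache Camel (named among our technology choices below). The file endpoint, topic name, and broker address are assumptions made up for the example:

```java
// A small, team-owned Camel route: no central ESB team is involved.
// It picks up files from a source this team controls and publishes each
// one as an event to a Kafka topic this team also owns.
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.main.Main;

public class TeamOwnedIntegration {

    public static void main(String[] args) throws Exception {
        Main main = new Main();
        main.configure().addRoutesBuilder(new RouteBuilder() {
            @Override
            public void configure() {
                // Poll a directory for incoming order files (left in place),
                // log each one, and stream it on as a Kafka event.
                from("file:data/orders?noop=true")
                    .log("publishing ${file:name}")
                    .to("kafka:orders-events?brokers=localhost:9092");
            }
        });
        main.run();
    }
}
```

Another team can run a similar route of its own in parallel; the two deployments share nothing and can be released independently.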

Data Monolith


The third, no less important, architectural problem is the data monolith, associated with the use of a centralized enterprise data warehouse (Enterprise Data Warehouse, EDW). EDW solutions are expensive; they hold data in a canonical format that, because of the specialized knowledge required, is maintained and understood by only one team of specialists, which serves everyone. Data enters the EDW from various sources; the EDW team verifies it and converts it into the canonical format, which is supposed to satisfy the needs of the organization's various consumer groups, and so the team is overloaded. Besides, data converted to one fixed canonical format cannot be convenient for everyone all the time. The bottom line: working with the data takes too long, so a new digital product cannot be brought to market quickly.


This orientation toward a central component, and its dependence on changes in the surrounding systems, is a real problem for developing new digital processes and planning their improvement. Changes can conflict, and coordinating them with other teams slows the work down even further.

To solve the data monolith problem, the unstructured data repository known as the Data Lake was invented. Its main difference is that “raw” data is loaded into the Data Lake, and there is no single team working with it. If the business needs certain data to solve a problem, a team is formed that extracts the data required for that particular task. Alongside it, another team may be doing the same for a different task. The Data Lake was thus introduced so that several teams could work on their products simultaneously. The approach implies that data may be duplicated across domains as teams convert it into a form suitable for their own products. A problem arises here: each team needs the skills to work with a variety of data formats. Nevertheless, this approach, though it carries the risk of extra costs, gives the business a new level of quality and has a positive effect on the speed of creating new digital products.
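For illustration, here is a minimal sketch of one team's extraction job over raw lake data, in plain Java. The lake path, the record layout, and the filtering rule are all invented for the example; in practice such a job would more likely run on an engine like Flink or Spark over object storage:

```java
// One team's task-specific job: read raw events from the shared lake and
// reshape them into this team's own domain format. Paths and the CSV
// layout (userId,timestamp,device) are assumptions for the sketch.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

public class LakeExtractJob {

    public static void main(String[] args) throws IOException {
        // "Raw" CSV events land in the lake untouched; each team applies
        // its own parsing and filtering on the way out.
        try (var lines = Files.lines(Path.of("lake/raw/clicks.csv"))) {
            List<String> mobileClicks = lines
                    .skip(1)                                // header row
                    .map(line -> line.split(","))
                    .filter(f -> f.length > 2 && "mobile".equals(f[2]))
                    .map(f -> f[0] + ":" + f[1])            // userId:timestamp
                    .collect(Collectors.toList());

            // Another team may run a different job over the same raw files,
            // duplicating the data in its own domain-specific shape.
            Files.createDirectories(Path.of("domain/marketing"));
            Files.write(Path.of("domain/marketing/mobile-clicks.txt"), mobileClicks);
        }
    }
}
```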

Only a few of the most advanced organizations use an even more “mature” approach to working with data: Data Mesh, which inherits the principles of the two previous approaches while eliminating their shortcomings. The benefits of Data Mesh are real-time data analysis and lower costs of managing big-data infrastructure. The approach favors stream processing and implies that an external system provides a data stream that becomes part of the source solution's API. Responsibility for data quality lies with the team that owns the system generating the data. To get the most out of this approach, stricter control over how data is processed and applied is required, so as not to bury people in a heap of meaningless information. And this requires a change in how management and teams think about the way IT interacts with the business. The approach works well in a product-oriented model, not a project-oriented one.

Such a data infrastructure opens up an entirely different perspective and eases the transition from a state of “storing data” to a state of “responding to data”. Stream processing lets a digital business react to events the moment the data is generated, providing intuitive means of obtaining analytics and tuning products or services in real time, which helps the organization stay a step ahead of its competitors.
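As a sketch of the “data as a product” idea, the snippet below uses Kafka Streams (Kafka appears in our technology list below): the team that owns a source system filters out malformed events before publishing to the topic other teams consume, so responsibility for quality stays with the owner. The topic names and the validity check are assumptions:

```java
// A source team's stream: validate events at the origin, then expose them
// on a published topic that downstream teams treat as this domain's API.
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

import java.util.Properties;

public class OrdersDataProduct {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "orders-data-product");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        builder.<String, String>stream("orders-internal")
               // The owning team enforces quality at the source: malformed
               // events never reach the published topic.
               .filter((key, value) -> value != null && !value.isBlank())
               .to("orders-published");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```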

Distributed Approaches


To summarize, the solution to the problems of all the monoliths listed above consists of:

  • dividing the system into separate blocks focused on business functions;
  • allocating independent teams, each of which can create and operate a business function;
  • parallelizing work across these teams to increase scalability and speed.

There are no simple solutions in building the IT infrastructure of a modern organization. The transition from a traditional to a distributed architecture is not only a technical transformation but also a cultural one: it requires a change in thinking about how the business and its information systems interact. Where the organization once ran monolithic applications, there are now thousands of services that must be managed, maintained, and reconciled in terms of interfaces and data. This raises costs and raises the bar for people's skills and for project management. The IT department and the business must take on additional responsibilities, and if they learn to manage this complexity, the infrastructure will let the business respond to market challenges with a new, higher quality.

And now, what exactly do we at DTG use as a solution to the “problem of monoliths” when optimizing our customers' digital processes and integrating them into partner ecosystems? Our answer is a platform of the Digital Business Technology Platform class (see Gartner's classification). We called it GRANUM and, by tradition, built it on a combination of open-source technologies, which lets us quickly and easily create complex distributed systems in a corporate environment. We will touch on the technologies in more detail below. What has become easier and faster? Using the platform, we significantly accelerated the integration of customers' existing IT platforms, customer-interaction systems, data management, IoT, and analytics, and we were able to quickly connect customer systems with ecosystem partners to handle business events and make joint decisions that create shared value. The use of open-source technologies also helped us respond to customer requests to move away from licensed software.

From a technical point of view, by digitalizing processes with a distributed architecture (microservices and the Data Mesh approach), we reduced the interdependence of components and solved the problem of complex, lengthy development. In addition, we can process streaming events in real time while preserving data quality, and we have created a trusted environment for interacting with partners.


The platform can be divided into three logical layers. 

  1. The bottom layer is the infrastructure layer. It provides basic services: security, monitoring and log analysis, container management, network routing (load balancing), and DevOps tooling.
  2. The integration layer supports the distributed architecture (the Data Mesh approach, microservices, and streaming data processing).
  3. The top layer is the business layer, which hosts the applied digital services themselves, such as tracking and tracing (track&trace).

Let’s talk more specifically about the open-source technologies we have chosen, which leading Internet companies such as Netflix, LinkedIn, and Spotify use as best practice. Kubernetes, Jenkins, Keycloak, Spring Boot, Fluentd, Grafana, and Prometheus were chosen to fight the application monolith and to build and operate a microservice architecture, as well as in pursuit of flexibility and speed of change. To move away from a monolithic integration architecture, the Agile Integration approach typically relies on Apache Camel, NiFi, and WSO2 API Manager. Finally, Kafka, Flink, and Solace Event Portal are useful for solving the data monolith problem, partitioning the data, and moving to real-time data analysis with the Data Mesh approach.

The illustration below shows the set of technologies that, as a result of our experiments, we at DTG consider optimal for solving the problem of three monoliths.


We began applying the platform in practice about a year ago, and we can already conclude that, regardless of industry, such a solution is of interest to organizations thinking about reducing the cost of executing their business processes, making interaction with partners more efficient, and creating new value chains. Such companies aim at fast digital experiments (testing a hypothesis, integrating, launching quickly to market and, on local success, implementing globally), at opening new channels of communication with customers, and at building more intensive digital communication with them.

Interesting vacancies are always open in our group of companies. We are waiting for you!
