Checklist for an architect

From this article you will learn how to organize effective development in a distributed digital company through expert communication, using MTS as an example.

MTS, like many other modern companies, has gone through a so-called digital transformation. Put simply, launching digital processes and products has become our priority.

For me, as a techie, this means that the company's business depends entirely on the quality of its IT systems and their ability to evolve rapidly.

Of course, this is not a precise definition, and marketers can argue with me - and they probably will! But it is quite enough for everything you will read below.

Less bureaucracy - easier development

What has changed: first of all, the company's management model. Previously, the centralized enterprise architecture team reviewed every project; now it publishes a technical policy (a large and clever document) and trains architects in it. How to apply the policy is up to each product architect, and there are more than a hundred product teams.

On the one hand, this is good: less bureaucracy greatly simplifies development. On the other hand, all products interact with each other in one way or another, and an error in one of them can affect the others.

For example, in Software Systems Architecture: Working with Stakeholders Using Viewpoints and Perspectives, Eoin Woods and Nick Rozanski write about a basic principle of security: secure the weakest link. It means that if your IT landscape has at least one weakly protected system, then the entire landscape is at risk, because a hypothetical attacker can act with impunity on behalf of that system.

There are many more examples where it is useful to have guaranteed quality and consistency in the design and development of IT systems.

Introvert Experts

What we came up with: create a community to share knowledge and spread best practices. The idea is neither new nor revolutionary, but it fits the requirements and specifics of digital product development.

— DevOps- support-;
, . , , ;
- , — . . -, ; -, ; -, ;
, ;
Organization of rotation in a team of “auditors” so that as many team representatives as possible have the opportunity to share knowledge and experience.

To start the process, we assembled a team of enthusiasts, developed a list of discussion topics for each role, and trained our team of impromptu auditors. By the way, training was the most difficult stage, because very good specialists in our field are often also very good introverts :-)

What is the result?

The process of studying product teams is rather leisurely. On average, it takes about 31 days per team. During this time, we manage to talk to representatives of all areas of the team's work, draw up a report and explain it to the product owner so that they can plan follow-up actions;
The result depends heavily on the expert. Therefore, it is important to have several experts for each role (two analysts, two architects, and so on), where one has already conducted a series of interviews and the other is just getting involved;
The interview methodology also has to be adapted constantly, as some topics lose their relevance and are replaced by questions that no one had thought of before.

For example, let's look at the results of a study in the direction of "Architecture".

What we did:

Talked to 20 teams;
Spent an average of 31 days on each. Since we worked with several teams at the same time, the whole process took six months;
Identified 180 architecture-related risks.

Within our teams, the risks were divided as follows:

Risk 1: design

It is important to understand that all the software systems we examine go through strict quality control of some kind (for example, for telecom systems the monitoring period is longer than the development period), but there is no limit to perfection and efficiency.

To understand what we consider to be risks, let's look at the top three with examples.

For young product teams, it is quite normal for software architecture to be treated as an afterthought. At first everything seems simple, and project deadlines rarely leave time to think seriously about how the architecture is organized. That is when the bottom-up design method comes into play: we develop the individual components of the solution and then assemble them into a single whole.

For example, we decided to make a digital product for telemedicine. What is needed for this?

We probably need a component for video calls between the patient and the doctor - we make a component for calls;
Sometimes you need a regular chat - that means we make a component for the chat;
We need to take the medical history from automated medical systems - we create the appropriate component;
We need to keep a schedule of doctors on duty - we make a component for this as well.

Etc.

Everything seems simple until we start putting it all together. Then problems with duplicated functionality appear: for example, chat and video calls are very similar applications (at least in the context of doctor-patient interaction). That is, the risk is that we will have to rework the application significantly because of the large amount of duplicated code.

Or there are problems with the data model. By default, each component exposes interfaces in the model that is convenient for storing and processing data in that particular component, not in the application as a whole.

Therefore, it is worth remembering a number of simple rules:

The bottom-up design method is good for small projects with low technical complexity, small teams and volatile requirements;
For large projects and teams, the top-down design method works better: we first design the picture as a whole and only then proceed to coding.
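
As a minimal sketch of the top-down approach, assuming the telemedicine example above: the shared model is defined first, and chat and video calls become two implementations of one interface instead of two separately grown components. The names `Consultation`, `Channel`, `ChatChannel` and `VideoChannel` are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Consultation:
    """Shared doctor-patient context, designed once for the whole application."""
    doctor_id: str
    patient_id: str
    events: list[str] = field(default_factory=list)


class Channel:
    """Common base for every way a doctor and a patient can interact."""
    name = "channel"

    def open(self, consultation: Consultation) -> None:
        # Session bookkeeping lives in one place instead of being
        # duplicated in the chat and the video call components.
        consultation.events.append(f"{self.name} opened")


class ChatChannel(Channel):
    name = "chat"


class VideoChannel(Channel):
    name = "video"


consultation = Consultation(doctor_id="doc-1", patient_id="pat-1")
ChatChannel().open(consultation)
VideoChannel().open(consultation)
```

Both channels write into the same `Consultation`, so the data model serves the application as a whole rather than a single component.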

Therefore, before plunging headlong into a new project, ask yourself the question: what type does it belong to?

Risk 2: Security

It would seem that security is taken very seriously these days. Everyone remembers such basics as the need to:

authenticate users;
authorize them to carry out actions;
comply with the principle of least privilege;
maintain data confidentiality;
keep an audit log of user actions.
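
A minimal sketch of how authorization, least privilege and an audit log can fit together, assuming the user has already been authenticated elsewhere. The role map, the `requires` decorator and `delete_customer` are hypothetical names used only for illustration.

```python
import logging
from functools import wraps

audit_log = logging.getLogger("audit")

# Hypothetical role-to-permission map. Least privilege means each role
# gets only the permissions it actually needs.
ROLE_PERMISSIONS = {
    "operator": {"read_customer"},
    "admin": {"read_customer", "delete_customer"},
}


class AccessDenied(Exception):
    """Raised when a user lacks the required permission."""


def requires(permission):
    """Authorize the call and record the attempt in the audit log."""
    def decorator(func):
        @wraps(func)
        def wrapper(user, *args, **kwargs):
            allowed = permission in ROLE_PERMISSIONS.get(user["role"], set())
            # Both successful and refused attempts are logged.
            audit_log.info("user=%s action=%s allowed=%s",
                           user["name"], permission, allowed)
            if not allowed:
                raise AccessDenied(f"{user['name']} may not {permission}")
            return func(user, *args, **kwargs)
        return wrapper
    return decorator


@requires("delete_customer")
def delete_customer(user, customer_id):
    return f"customer {customer_id} deleted"
```

Calling `delete_customer` as an `admin` succeeds, while the same call as an `operator` raises `AccessDenied`; in both cases an audit record is written.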

But here is a surprise! For teams that build services for internal automation, this is not as obvious as for everyone else. If the application already runs on the internal corporate network, why protect it any further? In fact, it is necessary, especially if the data the application works with is classified as personal. Yes, the probability that an intruder has penetrated the internal network is very small, but there is no such thing as too much protection.

Nuances can arise with external applications as well. Consider a simple, purely hypothetical example: a web application that authenticates a user with a password. What problems can there be:

The application may accept passwords that are too simple and therefore easy to guess;
The application may not be protected against brute-forcing of passwords (there is no captcha or anything like that);
. , - ;
URL- HTTP- ;
-, . , MD5 ;
- ;
- , . , , -;
- : , ..;
- HTTP-:

- session tokens , ;
- session fixation- (. . session token );
- HttpOnly Secure browser cookies, session tokens;
- .

Thus, the risk here is that someone will gain access to data that is not intended for them, which can lead to problems in the application.
These are just examples of what you can talk about in the security field. Of course, the ideal option would be to implement a secure development lifecycle process, for example the Security Development Lifecycle recommended by Microsoft.
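
Some of the password problems listed above (passwords that are too simple, no protection against brute force, the MD5 hash mentioned in the list) can be illustrated with a minimal sketch. The policy values, the in-memory failure counter and all function names are assumptions for illustration; a real service would keep this state in shared storage and tune the numbers.

```python
import hashlib
import hmac
import os
import time

MIN_LENGTH = 12          # assumed policy value
MAX_FAILURES = 5         # assumed lockout threshold
LOCKOUT_SECONDS = 300    # assumed lockout window
ITERATIONS = 200_000     # assumed work factor, tune for your hardware

_failures = {}  # username -> (failure count, time of last failure)


def is_acceptable(password):
    """Reject passwords that are too short or use a single character class."""
    classes = [any(c.islower() for c in password),
               any(c.isupper() for c in password),
               any(c.isdigit() for c in password)]
    return len(password) >= MIN_LENGTH and sum(classes) >= 2


def hash_password(password, salt=None):
    """Salted PBKDF2-HMAC-SHA256 instead of a fast unsalted hash such as MD5."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def check_password(username, password, salt, expected, now=None):
    """Verify a password, refusing attempts while the account is locked out."""
    now = time.monotonic() if now is None else now
    count, last = _failures.get(username, (0, 0.0))
    if count >= MAX_FAILURES and now - last < LOCKOUT_SECONDS:
        return False
    _, digest = hash_password(password, salt)
    # Constant-time comparison avoids leaking how many bytes matched.
    if hmac.compare_digest(digest, expected):
        _failures.pop(username, None)
        return True
    _failures[username] = (count + 1, now)
    return False
```

The lockout is deliberately simple: after `MAX_FAILURES` wrong attempts even the correct password is refused until `LOCKOUT_SECONDS` have passed since the last failure.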

Risk 3: performance

One of the challenges for product teams that are created quickly is a three-letter word: MVP, or minimum viable product. Such teams strive to build an application that starts generating revenue for the company as soon as possible, and since the application will have very few users at first, performance is usually thought about at the last moment. But if the application suddenly becomes popular, you have to figure out what to do next.

The recommendations here are simple: application performance is inversely proportional to the number of requests to slow resources. Accordingly, all tactics aim either at reducing the number of requests or at speeding up the resources themselves. Here, resources means CPU, memory, network and disks; it is also sometimes convenient to treat a database or an application server as a resource.

First, we check whether a client cache is possible in a distributed application, so that we do not request or compute the data we need every time. If it is, we save on network requests, server load and everything the server does to answer them;
But we are rarely that lucky, so next we check whether a server cache is possible. The principle is the same as with the client cache, but the performance gain is somewhat smaller, because network requests are still made;
, . , , , , (load balancer);
, . — My SQL Cluster Grid Edition Apache Ignite (Gridgain).

Of course, we must remember that a cache solves the problem of data access but creates a new one: the algorithm for invalidating and preloading it. And in some systems, caching can be completely useless. For example, in CRM (Customer Relationship Management) systems it is very rarely possible to cache customer data effectively. A specialist working in an office moves from one customer to the next very quickly, so cached data is simply never reused.

Thus, the risk here is that if we do not think in advance about how we will speed up the application, we may face very high costs for rewriting it in the future.
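
The trade-off can be shown with a minimal sketch: a small in-process cache with a time-to-live and explicit invalidation that counts how often the slow resource is actually hit. `TTLCache`, `slow_lookup` and the keys are hypothetical and stand in for a real database or remote call.

```python
import time


class TTLCache:
    """Tiny in-process cache with time-based expiry and explicit invalidation."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # function that queries the slow resource
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}         # key -> (value, expiry time)
        self.slow_requests = 0     # how often the slow resource was hit

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]        # cache hit: no slow request
        self.slow_requests += 1
        value = self._loader(key)
        self._entries[key] = (value, self._clock() + self._ttl)
        return value

    def invalidate(self, key):
        """Call this when the underlying data changes."""
        self._entries.pop(key, None)


def slow_lookup(key):
    # Stand-in for a database query or a network call.
    return f"value for {key}"


# The same key is read repeatedly: the cache absorbs almost all requests.
hot = TTLCache(slow_lookup, ttl_seconds=60)
for _ in range(100):
    hot.get("tariff-1")

# CRM-like access: every request is for a different customer.
crm = TTLCache(slow_lookup, ttl_seconds=60)
for customer_id in range(100):
    crm.get(customer_id)

print(hot.slow_requests, crm.slow_requests)  # prints: 1 100
```

In the first case 100 reads cost one slow request; in the CRM-like case the cache saves nothing, and `invalidate` still has to be called correctly whenever the data changes.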

Summarizing

In this article I tried to describe how to organize effective development in a distributed digital company through expert communication. In our time of remote development, such processes become especially relevant. They help overcome Conway's law, or at least reduce its effect.

If you decide to create your own checklists, I would recommend not starting from scratch but borrowing from existing literature. For example, on architecture, Software Architect's Handbook by Joseph Ingeno (ISBN 9781788624060) is a very useful overview.

My report can be found here

Author of the article: Dmitry Dzyuba, Head of the R&D Center
