The Day the Standard Library Died

The other day in Prague, the C++ standardization committee held a series of polls on whether to break the ABI, and ultimately decided to change nothing. There was no applause in the hall.
I don't think we fully realized the consequences this decision will have, and I don't believe it can have any positive effect on the development of the language.



What is ABI?


An ABI (application binary interface) is a set of conventions that define how your program is represented in binary form: how names are mangled, how classes are laid out, and how functions are called. In effect it is a binary protocol, but an unversioned one.
To avoid getting lost in terminology, let's look at examples of changes that break the ABI of a program:

You cannot use an exported symbol in a new version of your compiled library if you did any of the following:

  • Added a new field to an existing class
  • Changed the template parameters of a class or function, turned a template function into a non-template one or vice versa, or made a template variadic
  • Applied the inline specifier to something
  • Added default parameters to a function declaration
  • Declared a new virtual method

And much, much more, but the examples above are the main ones mentioned by the committee, and the very ones that get most proposals killed on the spot. I omitted ABI breaks that also break source code (for example, removing or changing functions). Even that is not always true, though: removing a function does not always break source code. std::string has a conversion operator to std::string_view, which I would gladly get rid of, and although removing it would break the ABI, it would not cause significant problems in source code.
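To make the first item in the list above concrete, here is a minimal sketch of what adding a field does to a class layout. WidgetV1 and WidgetV2 are hypothetical names standing in for the same class in two releases of a library; they are not taken from any real code base.

```cpp
#include <cstddef>

// Hypothetical class as compiled into version 1 of a shared library.
struct WidgetV1 {
    int id;
    int count;
};

// The same class after a field was added in version 2.
struct WidgetV2 {
    int id;
    double weight;  // new field
    int count;
};

// Both the size of the object and the position of `count` have changed,
// so code compiled against V1 would read the wrong bytes from a V2 object.
static_assert(sizeof(WidgetV2) > sizeof(WidgetV1), "object size changed");
static_assert(offsetof(WidgetV2, count) != offsetof(WidgetV1, count),
              "member offset changed");
```

The source code of a caller that only touches id and count compiles unchanged against either version, which is exactly why this kind of break goes unnoticed until old and new binaries are mixed.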

Why we should break the ABI


First of all, there are several useful changes to standard library implementations that become possible if you break the current ABI:

  • Making associative containers (much) faster
  • Speeding up std::regex (at the moment, it is faster to launch PHP and run a regular expression search there than to use std::regex)
  • Some changes to std::string, std::vector, and the layout of other containers
  • Interface consistency: at the moment, some implementations intentionally deviate from the common interface for the sake of ABI stability

More importantly, there are changes to the design of the library that cannot be made without running into the ABI problem. Here is an incomplete list of what is currently not feasible for that reason:

  • std::scoped_lock was added so that the ABI would not be broken by changing lock_guard
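  • int128_t is held back by intmax_t, which cannot change without breaking the ABI (intmax_t should arguably be deprecated)
  • std::unique_ptr cannot be made zero-overhead without an ABI break
  • error_code: changes are held back by the ABI
  • status_code runs into ABI concerns as well
  • Changes to recursive_directory_iterator are held back by the ABI
  • Making <cstring> constexpr (including strlen) is held back by the ABI
  • Adding UTF-8 support to std::regex is an ABI break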
  • Adding support for realloc, or returning the size of the allocated memory, is an ABI break for polymorphic allocators
  • Making destructors implicitly virtual for polymorphic types
  • The return type of push_back could be changed if the current ABI were broken
  • For that matter, do we really need both push_back and emplace_back?
  • Changes to std::shared_ptr would also break the ABI
  • [[no_unique_address]] could be inferred by the compiler if we did not care about preserving the current ABI
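The push_back and emplace_back items can be illustrated with a short sketch, assuming a C++17 compiler: since C++17, emplace_back returns a reference to the new element, while push_back still returns void. Names and append_and_measure are made up for this example.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

using Names = std::vector<std::string>;

// push_back returns nothing...
static_assert(std::is_void_v<
    decltype(std::declval<Names&>().push_back(std::string{}))>);

// ...whereas emplace_back (since C++17) returns a reference to the new element.
static_assert(std::is_same_v<
    decltype(std::declval<Names&>().emplace_back("x")), std::string&>);

// Hypothetical helper: append an element and use it right away,
// without a second lookup through back().
inline std::size_t append_and_measure(Names& names, const char* text) {
    std::string& added = names.emplace_back(text);
    return added.size();
}
```

With push_back the same helper would need a separate call to names.back(), which is the kind of small interface inconsistency the list above refers to.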

And the list does not end there. I think WG21 should try to keep a list of such things up to date. For my part, I will take note every time someone in the same room as me says “it will break the ABI”.

What else do we want to change?


I don't know. And I don't know what I don't know. But if I had to guess, I would say the following:

  • C++23 ABI, - , ABI.
  • , , , , ABI
  • ABI,
  • , ABI
  • Tombstone ABI

ABI


During the latest ABI discussion in Prague, a number of polls were taken, which, however, tell us very little. Depending on whether you are a pessimist or an optimist, you can interpret the results differently.

Key facts:

  • WG21 does not want to break the ABI in C++23
  • WG21 considers it necessary to break the ABI in a future version of C++ (C++26 or later)
  • WG21 will take time to consider proposals that break the ABI
  • WG21 does not promise eternal stability
  • WG21 considers it important to prioritize performance, even at the expense of stability

There are many important nuances in these statements, but no consensus behind them. The committee, oddly enough, is split in half.

Fortune telling


Which version of C++ should we expect changes in?


The obvious drawback of all these polls is that we have not decided when exactly we want to break the ABI.

In C++23? No, definitely not anymore.

In C++26? Some people intend to vote for it, but others are likely to vote for C++41, or for whenever they retire and no longer have to support their current project. One of the poll questions just said “C++-something”; how convenient!

There is no reason to believe that if the ABI cannot be broken now, it can be broken later. People who need stability lag several years behind the standard. So if we don't break it now, people will keep relying on a never-promised ABI for maybe another ten or even twenty years. The simple fact that we held this poll and ended up voting not to break the ABI shows that the whole ecosystem is gradually ossifying: every day the problem gets worse and potentially more expensive.

I don't think a poll taken three years from now would turn out any differently. It's like global warming: everyone agrees that we need to tackle the problem someday. So let's ban diesel in 2070?

Anything that is not planned for the next five years is likely never to happen.

About proposals that break the ABI


WG21 voted to devote more time to proposals that break the current ABI. This means a few things:

  • We will spend more time in one of the noisiest rooms of the committee discussing these proposals, leaving less time for proposals with a better chance of adoption, and in the end we will reject them all the same
  • We will look for alternatives that do not break the ABI (discussed below)
  • Partial ABI changes may be introduced (also discussed below)

Performance is more important than ABI stability


It's like asking a five-year-old whether they want candy. Of course we voted that performance is important. Still, it worries me that some people voted against.

It seems to me that the committee wants to have its cake and eat it too. That is impossible:
Bryce Adelstein Lelbach @blebach
- Performance
- ABI stability
- Ability to change something

Pick two.

Language stability and the ABI, of course, collide with each other, forcing us to ask a fundamental question: what is C++, and what is its standard library?

The usual answers involve the terms “performance”, “zero-cost abstractions”, and “you don't pay for what you don't use”. ABI stability stands in the way of all of them.

Far-reaching consequences


I am deeply convinced that the decision not to break the ABI in C++23 is the biggest mistake the committee has ever made. And I know that some people believe the opposite. But here is what is likely to happen soon:

An education nightmare


Let's be honest. Any program that relies on the ABI most likely violates the ODR somewhere or uses incompatible flags, and only works by luck.

New programs should be built from source. We need tooling designed around building from source, not a collection of libraries obtained from somewhere and somehow wedged into the project.

Yes, building from source is not easy to achieve. But we need to encourage it, and to encourage updating compilers regularly so that people benefit from new features a month after release rather than ten years later. We need to encourage correct, reliable, scalable, and reproducible solutions, open source libraries, and a dependency system.

By refusing to break the ABI, the committee openly states that it will support poorly written programs for as long as they exist. Even if you do not link against libraries installed through apt install (which are really intended for the system), other people will, because the committee has given them its blessing.

This is a huge step backwards. How can we teach good practices if there is no incentive to follow them?

Loss of interest in the standard library


The loss in library performance caused by our unwillingness to break the ABI is estimated at 5-10%. This number will only grow with time. Let's look at some examples:

  • If you are a large company, you can buy a new data center or pay a team of programmers to maintain your own library
  • , 5% ,
  • , 5-10% , VR-
  • , —

I think this inevitably raises a choice between “I should definitely use C++ and its standard library!” and “Maybe I shouldn't use the standard library? Or maybe I shouldn't use C++ at all? Perhaps .NET, Julia, or Rust would be a better fit?” Of course, many factors influence the answer, but we can see what has been happening recently.

Many game developers are extremely skeptical of the standard library. They would rather develop their own alternative, such as EASTL, than use the STL. Facebook has folly, Google has abseil, and so on.

It's like a snowball. If people do not use the standard library, they have no interest in improving it. Performance is the key factor keeping the library afloat. Without a performance guarantee, much less effort will go into its development.
>> Jonathan Müller @foonathan
What is the point of using containers from the standard library if they do not provide better performance?

Titus Winters @TitusWinters
Perhaps because they are common and easily available? (Those two are not the same thing.)
Voting to preserve the ABI is like saying that the standard library should strive to be McDonald's: it is everywhere, it is consistent, and it technically solves the problem.

How can the committee consider proposals that break the ABI?


Several options have been suggested to ease the pain of being unable to accept proposals that break the ABI:

Adding New Names


This is the first and most obvious solution. If we cannot change std::unordered_map, can we just add std::fast_map? There are several reasons why this is bad. Adding types to the standard library is expensive, both in maintenance and in education. After a new class is introduced, thousands of articles will inevitably appear explaining which container to use. For example, should I use std::scoped_lock or std::lock_guard? I have no idea! I have to google it every time. There is also the problem that good names run out sooner or later. We also pay some runtime overhead, since containers must constantly be converted into one another, it becomes hard to manage a huge number of converting overloads in a class, and so on.
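For readers who, like the author, can never remember the difference, here is a small sketch of the two lock types side by side, assuming C++17. Account, deposit, and transfer are hypothetical names invented for the example.

```cpp
#include <mutex>

// Hypothetical type used only for this illustration.
struct Account {
    std::mutex m;
    long balance = 0;
};

// std::lock_guard (C++11) locks exactly one mutex.
inline void deposit(Account& account, long amount) {
    std::lock_guard<std::mutex> lock(account.m);
    account.balance += amount;
}

// std::scoped_lock (C++17) can lock several mutexes at once and uses a
// deadlock-avoidance algorithm to acquire them.
// Assumes `from` and `to` are distinct accounts: locking the same
// std::mutex twice is undefined behavior.
inline void transfer(Account& from, Account& to, long amount) {
    std::scoped_lock lock(from.m, to.m);
    from.balance -= amount;
    to.balance += amount;
}
```

The single-mutex case works with either type, so the older name survives next to the newer one, and every reader has to learn both.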

Ironically, the people who support this solution may also argue that C++ is too complex a language. Adding duplicates to the library will definitely not make it simpler.

But we could have accepted that proposal!


Some library developers claim that some proposals were rejected because of an ABI break even though they did not actually break anything, or could have been changed to avoid the break.

As a cynic, I find this a little hard to believe. The fact is that no such proposals have appeared so far, and the scenarios where they would apply are very limited. The ABI Review Group (ARG) could help here, but they are likely to just recommend a different name for the class or function again.

Partial ABI break


The main idea is not to break the entire ABI, but to change it only for a specific class or function. The problem with this approach is that instead of an error at link time, the problem shows up at run time, as an unpleasant surprise. The committee already tried this in C++11 when it changed the layout of std::string. Nothing good came of it. It went so badly that it is still used as an argument for preserving the current ABI.

Another level of indirection


One solution to some of the ABI problems would be to access a class's data through a pointer, so that the layout of the class is just that pointer. The idea is very close to the PIMPL idiom, which Qt uses heavily for the sake of its ABI. Yes, that would solve the problem for class members, but what about virtual methods?

Looked at more critically, we are talking about adding one more level of indirection (a pointer dereference) and an extra heap allocation for everything that sits on the ABI boundary. In the STL, practically everything sits on that boundary, because it is a collection of generic classes.

As a result, the cost of this approach would be huge.
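A minimal sketch of the PIMPL idiom shows where that cost comes from. Session and its members are hypothetical; in a real library the Impl definition and the member function bodies would live in a .cpp file rather than next to the class.

```cpp
#include <memory>
#include <string>
#include <utility>

// Public class: its layout is a single pointer, whatever Impl contains.
class Session {
public:
    explicit Session(std::string user);
    ~Session();
    Session(Session&&) noexcept;
    Session& operator=(Session&&) noexcept;

    const std::string& user() const;

private:
    struct Impl;                  // defined out of sight of the users
    std::unique_ptr<Impl> impl_;  // the extra level of indirection
};

// Fields can be added here later without changing sizeof(Session).
struct Session::Impl {
    std::string user;
    int login_count = 0;
};

// Every Session pays for one heap allocation...
Session::Session(std::string user) : impl_(std::make_unique<Impl>()) {
    impl_->user = std::move(user);
}
Session::~Session() = default;
Session::Session(Session&&) noexcept = default;
Session& Session::operator=(Session&&) noexcept = default;

// ...and every access pays for one pointer dereference.
const std::string& Session::user() const { return impl_->user; }
```

For one handle-like class this is a fair trade. Applied to every container and string in a program, the extra allocation and dereference are paid on every object and every access.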

Several proposals aimed at this problem already exist. Some of them want to make PIMPL a language feature, so that you can choose between ABI stability and high performance.

Ironically, turning library types into PIMPL types would require ... breaking the ABI.

Rebuild all code every three years


Just thinking out loud.

All current proposals for the standard must be destroyed


Paradoxically, C++ has never been as lively as it is now. In Prague, 250 people worked on many things for it, including:

  • Numerics
  • Linear algebra
  • Audio
  • Unicode
  • Asynchronous I/O
  • 2D and 3D Graphics
  • Many other features

What all these proposals have in common is that they are far more optional than what we have in the standard today. People are simply trying to standardize things from their own field of research and work, or things that are constantly evolving and changing.
In particular, Unicode algorithms are extremely unstable and change rapidly over time.

And then a horror like networking looms on the horizon. It is very, very irresponsible to standardize anything that might lead to security problems without being able to change it later (remember the ABI).

Since C++ has decided to be stable, all of these proposals must be destroyed and burned. I would not want them to be destroyed, but it must be done.

But that still won't happen.

In the best case, we will make no mistakes, standardize the current state of things in some new version of C++, and then let everything slowly decay, since it will be impossible to fix. (In the case of the Networking TS, we seem unable to change anything at all, so we would have to standardize what existed ten years ago. The library could of course still be significantly improved, but let's leave that story for another time.)

But of course, we will make many, many mistakes.

>> Ólafur Waage @olafurw
( , )

, !

. , ( : , , )?

Hyrum Wright @hyrumwright
, , . — , .

Some mistakes are made intentionally, as they are trade-offs, while others go unnoticed for many years.

Time passes, but the standard library stands still. Earlier trade-offs gradually start to bother us and later become real bottlenecks in existing code.

Some things really cannot be changed, because they are baked into the API. We all know how hard it is to change an existing API. But part of the code could still be fixed and improved if we could break the ABI.

C++ will still be around 40 years from now. If we cannot accept the need to change it in unpredictable ways at any time, then the only right move is not to play this game at all.

Everyone knows that the standard associative containers stayed relevant for less than ten years. So why do we think that larger proposals for the standard will fare any better?

Your proposal for the standard will be destroyed, and so will mine.

Can the committee break the ABI at all?


Many are convinced that the committee cannot make such a decision at all, because library developers would simply ignore it. All of this painfully resembles an arm-wrestling match that the committee decided not to enter.

But the fact is that the developers of any product have users. Users are the ones who most need to understand what trade-offs are being imposed on them.

Many people rely on the ABI purely by accident, without making an informed choice. Many people also rely on stability because, of course, everyone wants to rely on it. But like anything else, stability has a price, and right now the entire C++ ecosystem is paying it.
