How Deep the Rabbit Hole Goes, or C++ Job Interviews at PVS-Studio

Authors: Andrey Karpov, Phillip Khandeliants.

We would like to share an interesting case in which a question we ask at interviews turned out to be more complicated than its author intended. With C++ and its compilers you always have to stay alert. You will never be bored.

Like any other software company, we have sets of interview questions for C++, C#, and Java developer positions. Many of our questions have a second or even a third layer to them. We can't vouch for the C# and Java questions, since they have other authors. But many of the C++ questions, compiled by Andrey Karpov, were designed from the start to probe how deeply a candidate knows the language.

Each of these questions has a simple correct answer, but you can also dig deeper and deeper. How far the candidate goes tells us how familiar they are with the nuances of the language. This matters to us because we develop a code analyzer and have to understand the language's subtleties and quirks very well.

Today we'll tell a short story about how one of the first questions we ask at an interview turned out to go even deeper than we had planned. We show this code:

void F1()
{
  int i = 1;
  printf("%d, %d\n", i++, i++);
}

and ask: "What will be printed?"

It's a good question: the answer immediately says a lot about the candidate's knowledge. We won't consider cases where a person can't answer it at all, since such candidates are filtered out by preliminary testing on the HeadHunter website (hh.ru). Although no, that's not quite true. We do remember a couple of unique individuals who answered something like this:

First this code will print a percent sign, then d, then another percent sign, d, then a backslash, n, and then two ones.

Clearly, in such cases, the interview ends quickly.

So, back to normal interviews :). A common answer is:

It will print 1 and 2.

This is an intern-level answer. Yes, these values can indeed be printed, but the answer we are looking for goes roughly like this:

It is impossible to say exactly what this code will print. This is unspecified (or undefined) behavior. The order in which the arguments are evaluated is not defined. All arguments must be evaluated before the body of the called function is executed, but the order in which this happens is left to the compiler. Therefore, the code may well print either “1, 2” or, the other way around, “2, 1”. In general, writing such code is strongly discouraged: if it is going to be built with at least two compilers, you can easily shoot yourself in the foot. Many compilers will also issue a warning here.

Indeed, if you use Clang, you can get "1, 2".

And if you use GCC, you can get "2, 1".

A long time ago we tried the MSVC compiler, and it also produced "2, 1". Nothing foreshadowed trouble.

Recently, for an entirely unrelated reason, we had to compile this code again with a modern version of Visual C++ and run it. We built it in the Release configuration with /O2 optimization enabled. And, as the saying goes, we were asking for trouble :). What do you think happened? Ha! Here's what: “1, 1”.

Make of that what you will. The question turned out to be even deeper and more confusing than we thought. We did not expect this ourselves.

Since the C++ standard does not prescribe the order in which arguments are evaluated, the compiler interprets this kind of unspecified behavior in a rather peculiar way. We looked at the assembly code generated by the MSVC 19.25 compiler (Microsoft Visual Studio Community 2019, Version 16.5.1) with the language standard flag '/std:c++14'.


Formally, the optimizer turned the code above into the following:

void F1()
{
  int i = 1;
  int tmp = i;
  i += 2;
  printf("%d, %d\n", tmp, tmp);
}

From the compiler's point of view, this optimization does not change the observable behavior of the program. Looking at this, you begin to understand that the C++11 standard had good reason to add, alongside smart pointers, the “magic” function make_shared (and C++14 to add make_unique). Even a harmless-looking example like the following can get you into trouble:

#include <memory>

void foo(std::unique_ptr<int>, std::unique_ptr<double>);

int main()
{
  foo(std::unique_ptr<int> { new int { 0 } },
      std::unique_ptr<double> { new double { 0.0 } });
}

A crafty compiler (the same MSVC, for example) may arrange the evaluation in the following order:

new int { .... };
new double { .... };
std::unique_ptr<int>::unique_ptr
std::unique_ptr<double>::unique_ptr

If the second call to operator new throws an exception, the memory allocated by the first call leaks, because no unique_ptr owns it yet.
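As a minimal sketch of how make_unique sidesteps this problem: each allocation happens inside its own function call, and function calls cannot be interleaved with one another, so a raw pointer is never left without an owner. The body of foo and the name call_foo_safely are hypothetical and exist only to make the sketch self-contained. It requires C++14 or later.

```cpp
#include <memory>

// Hypothetical body for foo; the article only declares it.
double foo(std::unique_ptr<int> a, std::unique_ptr<double> b)
{
  return *a + *b;
}

// Hypothetical caller. Each make_unique call both allocates the object
// and wraps it in a unique_ptr before returning, so if the other
// allocation throws, the already constructed unique_ptr releases its memory.
double call_foo_safely()
{
  return foo(std::make_unique<int>(1), std::make_unique<double>(0.5));
}
```

Whichever argument the compiler evaluates first, the result of that evaluation is already an owning smart pointer.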

But back to the original topic. Even though everything was fine from the compiler's point of view, we were still sure that “1, 1” is not the output a developer would expect. So we tried compiling the source code with MSVC using the standard version flag '/std:c++17'. Now everything works as expected, and "2, 1" is printed. The generated assembly code confirms this.


Everything is in order: the compiler passes the values 2 and 1 as arguments. But why did things change so dramatically? It turns out that the following was added to the C++17 standard:

The postfix-expression is sequenced before each expression in the expression-list and any default argument. The initialization of a parameter, including every associated value computation and side effect, is indeterminately sequenced with respect to that of any other parameter.

The compiler is still free to evaluate the arguments in any order, but starting with C++17 it may begin evaluating the next argument and its side effects only after all value computations and side effects of the previous argument are complete.
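To make the rule concrete, here is a small sketch of the two evaluation orders that remain possible for the arguments i++, i++ under C++17, written out with explicit temporaries so that the code itself is well defined. The function names left_to_right and right_to_left are hypothetical. Since each argument is fully evaluated before the other one starts, the pair (1, 1) cannot be produced by either order.

```cpp
#include <utility>

// Hypothetical model: the first argument is evaluated completely,
// then the second one.
std::pair<int, int> left_to_right()
{
  int i = 1;
  const int first = i++;   // first == 1, i becomes 2
  const int second = i++;  // second == 2, i becomes 3
  return {first, second};
}

// Hypothetical model: the second argument is evaluated completely,
// then the first one.
std::pair<int, int> right_to_left()
{
  int i = 1;
  const int second = i++;  // second == 1, i becomes 2
  const int first = i++;   // first == 2, i becomes 3
  return {first, second};
}
```

The first model corresponds to the "1, 2" output and the second to "2, 1".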

By the way, if you compile the smart pointer example with the '/std:c++17' flag, everything is fine there too: using std::make_unique is now optional.

So the question turned out to have yet another level of depth. There is theory, and then there is practice in the form of a particular compiler or a different interpretation of the standard :). The world of C++ is always more complex and surprising than it seems.

If anyone can explain more precisely what is going on, please tell us in the comments. We really should get to the bottom of this question, so that at least we ourselves know the answer at the interview! :)

That's the whole instructive story. We hope you found it interesting and will share your opinion on the subject. We recommend using the most recent language standard whenever possible, so that you are less surprised by what modern optimizing compilers are capable of. Better yet, don't write code like this at all :).
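One possible way to avoid writing such code, sketched under the assumption that the intended output was "1, 2": put each increment into its own statement, so the result no longer depends on the compiler or on the language standard. The function name F1_sequenced is hypothetical, and it returns the formatted text instead of printing it so that the result is easy to check.

```cpp
#include <cstdio>
#include <string>

// Hypothetical rewrite of F1: each increment is a separate full
// expression, so the order of the two side effects is fixed.
std::string F1_sequenced()
{
  int i = 1;
  const int first = i++;   // first == 1, i becomes 2
  const int second = i++;  // second == 2, i becomes 3

  char buffer[32];
  std::snprintf(buffer, sizeof buffer, "%d, %d", first, second);
  return buffer;
}
```

With this version the formatted result is "1, 2" regardless of the order in which the compiler evaluates function arguments.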

P.S. One could say that we have “given the question away” and now have to remove it from our list. We don't see the point in doing that. If a candidate takes the trouble to study our publications before the interview, reads this article, and then uses it, good for them: they will have earned a plus :).


If you want to share this article with an English-speaking audience, please use the link to the translation: Andrey Karpov, Phillip Khandeliants. How Deep the Rabbit Hole Goes, or C++ Job Interviews at PVS-Studio.
