How the JIT inlines our C# code (heuristics)

Inlining is one of the most important compiler optimizations. It not only removes the overhead of the call but also opens up opportunities for other optimizations, such as constant folding and dead code elimination. Moreover, inlining sometimes even reduces the size of the calling function! I asked several people whether they knew the rules by which functions are inlined in C#, and most answered that the JIT looks at the size of the IL code and inlines only small functions, say, up to 32 bytes. So I decided to write this post and reveal some implementation details using an example that shows several heuristics in action at once:
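As a stand-in for the kind of code discussed here, the following is a minimal hypothetical sketch, assuming a Volume type whose constructor validates a drive-style letter and throws on bad input. The member names (Letter, VolumeDemo, CountValid) and the exception message are illustrative assumptions, and whether a particular runtime version inlines this exact code depends on the heuristics described below.

```csharp
using System;

public readonly struct Volume
{
    public char Letter { get; }

    public Volume(char value)
    {
        // Unsigned range check: a single compare covers both bounds.
        // unchecked(...) keeps the wrap-around even in a checked build.
        if (unchecked((uint)(value - 'A')) > (uint)('Z' - 'A'))
            throw new ArgumentOutOfRangeException(nameof(value), "Expected a letter in 'A'..'Z'.");

        Letter = value;
    }
}

public static class VolumeDemo
{
    public static int CountValid(int iterations)
    {
        int count = 0;
        for (int i = 0; i < iterations; i++)
        {
            // A constructor call with a constant argument, inside a loop.
            var v = new Volume('C');
            if (v.Letter == 'C')
                count++;
        }
        return count;
    }
}
```

Calling VolumeDemo.CountValid(10) exercises the constructor ten times with a constant argument, which is the shape of call site the rest of the post reasons about.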




Do you think the call to the Volume constructor will be inlined here? Obviously not: it is too large, especially because of the heavyweight throw new expressions, which produce rather bulky codegen. Let's check in Disasmo:



It is inlined! Moreover, all the exceptions and their branches have been removed. You might say something like “Ah, okay, the JIT is very smart: it did a full analysis of all inline candidates and looked at what would happen if specific arguments were passed” or “The JIT tries to inline everything it can, performs all the optimizations, and then decides whether it was profitable” (recall your combinatorics and estimate the complexity of that operation for, say, a call graph of a dozen or two methods).

Well... no, that is unrealistic, especially for a just-in-time compiler. Therefore most compilers use so-called observations and heuristics to solve this classic knapsack problem: they define a budget and try to fit into it as efficiently as possible (and no, PGO is not a panacea). RyuJIT has positive and negative observations. Positive ones increase the benefit multiplier: the higher the multiplier, the more code we can inline. Negative ones, on the contrary, lower it or even prohibit inlining altogether. Let's see which observations RyuJIT made for our example:



These observations can be seen in the logs from COMPlus_JitDump (for example, in Disasmo):



All these simple observations raised the multiplier from 1.0 to 11.5 and helped the call fit within the inliner's budget. For example, the fact that we pass a constant argument and that it is compared with another constant tells the JIT that, with high probability, one of the branches will be removed after constant folding and the code will become smaller. Likewise, the fact that this is a constructor and that it is called inside a loop hints to the JIT that it should relax its requirements for inlining.
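The mechanism can be illustrated with a toy model. This is only a sketch under stated assumptions: the observation names, the weights, and the "size must fit into budget times multiplier" rule are all hypothetical and are not RyuJIT's actual constants or code.

```csharp
using System.Collections.Generic;

public enum Observation
{
    IsInstanceCtor,
    ArgFeedsConstantTest,
    ConstantArgFeedsTest,
    CallSiteInLoop,
    TooManyBasicBlocks // negative observation that vetoes inlining
}

public static class ToyInliner
{
    // Illustrative weights only; NOT the values used by RyuJIT.
    private static readonly Dictionary<Observation, double> Bonus =
        new Dictionary<Observation, double>
        {
            { Observation.IsInstanceCtor, 1.5 },
            { Observation.ArgFeedsConstantTest, 1.0 },
            { Observation.ConstantArgFeedsTest, 3.0 },
            { Observation.CallSiteInLoop, 3.0 },
        };

    public static (bool Inline, double Multiplier) Decide(
        IEnumerable<Observation> observations, int estimatedCalleeSize, int baseBudget)
    {
        double multiplier = 1.0;
        foreach (Observation o in observations)
        {
            // A hard veto wins regardless of the multiplier collected so far.
            if (o == Observation.TooManyBasicBlocks)
                return (false, multiplier);

            multiplier += Bonus[o];
        }

        // Assumed rule: the callee must fit into the budget scaled by the multiplier.
        return (estimatedCalleeSize <= baseBudget * multiplier, multiplier);
    }
}
```

With these made-up weights, a callee of estimated size 200 does not fit a base budget of 30 on its own, but it does once the four positive observations raise the multiplier to 9.5.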

In addition to the benefit multiplier, RyuJIT also uses observations to predict the native code size of the function and its performance impact, using magic constants in EstimateCodeSize() and EstimatePerformanceImpact() that were obtained with ML.

By the way, did you notice this trick?

if ((uint)(value - 'A') > (uint)('Z' - 'A'))

This is an optimized version of:

if (value < 'A' || value > 'Z')

The two expressions are equivalent, provided the comparison is unsigned (a value below 'A' then wraps around to a large number), but the first one is a single basic block while the second takes three. It turns out that the inliner has a strict limit on the number of basic blocks in a function: if there are more than 5, inlining is rejected no matter how large the benefit multiplier is. So I applied this trick to fit within that limit. It would be great if Roslyn did it for me.
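The equivalence is easy to check exhaustively, since a char has only 65536 possible values. The sketch below compares the two-comparison form with the single-comparison form; note that the latter relies on an unsigned comparison, so the subtraction is cast to uint. The class and method names are illustrative.

```csharp
public static class RangeCheck
{
    // Straightforward form: two comparisons joined by a short-circuit OR.
    public static bool IsOutsideBranchy(char value) => value < 'A' || value > 'Z';

    // Single-comparison form: values below 'A' wrap around to a huge uint.
    public static bool IsOutsideUnsigned(char value) =>
        unchecked((uint)(value - 'A')) > (uint)('Z' - 'A');

    // Compares both forms for every possible char value.
    public static bool AgreeForAllChars()
    {
        for (int c = char.MinValue; c <= char.MaxValue; c++)
        {
            if (IsOutsideBranchy((char)c) != IsOutsideUnsigned((char)c))
                return false;
        }
        return true;
    }
}
```

Without the unsigned cast, the subtraction would produce a negative int for any value below 'A', and the single comparison would wrongly treat such values as being in range.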

Issue in Roslyn: github.com/dotnet/runtime/issues/13347
PR in RyuJIT (my awkward attempt): github.com/dotnet/coreclr/pull/27480

There I described an example of why it makes sense to do this not only in the JIT but also in the C# compiler.

Inlining and virtual methods


Everything is clear here: you cannot inline something about which nothing is known at compile time, although if the type or method is sealed, then why not.
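A minimal sketch of the difference, with hypothetical type names: through the base type the call target is unknown until run time, while through a sealed type there is only one possible target. Whether the JIT actually devirtualizes and inlines the second call depends on the runtime version and its heuristics.

```csharp
public abstract class Shape
{
    public abstract int Sides();
}

public sealed class Triangle : Shape
{
    public override int Sides() => 3;
}

public static class ShapeDemo
{
    // The static type is Shape: any subclass could be passed in,
    // so the target of Sides() is not known at compile time.
    public static int ViaBase(Shape s) => s.Sides();

    // The static type is sealed: no subclass can override Sides(),
    // so the call has exactly one possible target.
    public static int ViaSealed(Triangle t) => t.Sides();
}
```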

Inlining and throwing exceptions


If a method never returns (for example, it just does throw new ...), it is automatically marked as a throw helper and is not inlined. This is a way to sweep the complex codegen of throw new under the carpet and appease the inliner.
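The pattern looks roughly like this. It is a sketch with hypothetical names (ThrowHelper, Percent): the method that only throws is kept separate, so the caller contains just a compare and a call instead of the full exception-construction code.

```csharp
using System;

public static class ThrowHelper
{
    // Never returns normally: its whole body is a throw.
    public static void ThrowArgumentOutOfRange(string paramName) =>
        throw new ArgumentOutOfRangeException(paramName);
}

public static class Percent
{
    public static int Validate(int value)
    {
        // The heavy "throw new" codegen lives in the helper, not here.
        if (value < 0 || value > 100)
            ThrowHelper.ThrowArgumentOutOfRange(nameof(value));

        return value;
    }
}
```

The small body of Validate is what makes it a better inline candidate than a version with the throw written in place.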

Inlining and [AggressiveInlining] attribute


In this case you recommend that the inliner inline the method, but you have to be extremely careful for two reasons:

  • You may optimize one case (for example, constant arguments) and make all the others worse because of the larger codegen.
  • Inlining often produces a large number of temporary variables, which can exceed the limit on the number of variables whose lifetimes RyuJIT can track (512). After that the code starts to suffer from terrible stack spills and slows down greatly. Two good examples: here and here.
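For reference, the attribute is applied through MethodImplOptions in System.Runtime.CompilerServices. The method below is a hypothetical example of a small helper where the hint is unlikely to hurt; the attribute remains a request to the JIT, not a guarantee.

```csharp
using System.Runtime.CompilerServices;

public static class MathUtil
{
    // Asks the JIT to inline this method even where its own
    // heuristics would otherwise decline.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int ClampPercent(int value) =>
        value < 0 ? 0 : (value > 100 ? 100 : value);
}
```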

Inlining and dynamic methods


Currently such methods are not inlined, and nothing is inlined into them: github.com/dotnet/runtime/issues/34500
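Assuming that "dynamic methods" here means methods emitted at run time through System.Reflection.Emit.DynamicMethod, a minimal example of such a method looks like this. The class name DynamicAdd is illustrative.

```csharp
using System;
using System.Reflection.Emit;

public static class DynamicAdd
{
    public static Func<int, int, int> Build()
    {
        // Emits the IL equivalent of: static int Add(int a, int b) => a + b;
        var method = new DynamicMethod("Add", typeof(int), new[] { typeof(int), typeof(int) });
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Ret);

        return (Func<int, int, int>)method.CreateDelegate(typeof(Func<int, int, int>));
    }
}
```

Under that assumption, calls made from inside the emitted body and calls to the resulting delegate both stay as real calls.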

My attempt to write my heuristic


Recently I tried to write my own heuristic to help in a case like this:
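A minimal sketch of such a case, reconstructed from the description that follows: a Validate method that checks the length of its string argument, and a Test method that calls it with a constant string. The body of the branch (a throw) and the containing class name are assumptions made for illustration.

```csharp
using System;

public static class ValidateDemo
{
    public static void Validate(string s)
    {
        // With a constant argument, s.Length is known at JIT time.
        if (s.Length > 10)
            throw new ArgumentException("String is too long.", nameof(s));
    }

    public static void Test()
    {
        // If Validate is inlined here, the condition becomes
        // "hello".Length > 10, which can fold to false.
        Validate("hello");
    }
}
```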



In my previous post I mentioned that I recently optimized the computation of constant string lengths in RyuJIT ("Hello".Length -> 5). So, in the example above, we can see that if Validate is inlined into Test, we get if ("hello".Length > 10), which is optimized to if (5 > 10), which in turn leads to the removal of the entire condition and its branch. However, the inliner refused to inline Validate:



The main problem here is that there is not yet a heuristic telling the JIT that we are passing a constant string to System.String::get_Length, which means that the callvirt will most likely fold into a constant and the whole branch will be removed. My heuristic adds exactly this observation (the only downside is that it has to resolve all callvirts, which is not very fast).

There are other restrictions as well; a general list can be found here. You can also read the thoughts of one of the main JIT developers on the design of the inliner, as well as his article on using machine learning for it.
