Speeding up a project build on CMake + GCC: precompilation of header files

There are several reasons why a C++ project takes, on average, longer to build than a comparable project in another language such as Java or C#. Accordingly, there are several ways to reduce build time. One of the best known is the use of precompiled headers. Today I will describe how this technique let me significantly reduce the build time of my project.


A bit of history and theory


For several years now I have been working on a C++ project. The project is cross-platform, is built with CMake, and uses GCC as its main compiler on Linux. By now it has grown to hundreds of thousands of lines of code and makes heavy use of Boost and several other libraries. Over time the build took longer and longer, until a complete from-scratch build of the whole project on the integration server took almost 45 minutes.


It was time to think about optimizing the build process, and I decided to try adding header precompilation. Conveniently, CMake 3.16 had recently been released with built-in support for this technique.


I will not describe in detail how precompilation is implemented, since the details vary between compilers. In general terms, though, it works as follows. You create a header file (let's call it precompiled.h) that includes the headers to be precompiled. From this header the compiler generates a special pch file (.pch, .gch, .pchi, depending on the compiler) containing the result of compiling the headers included in precompiled.h. Then, when the compiler encounters an inclusion of precompiled.h while building a translation unit, it does not read and parse that file and all the headers it includes, but instead uses the precompiled result from the pch file.


, ( precompiled.h), . . pch- , . -, pch- β€” . -, pch- , . β€” . , . , , β€” .


. , . , . , Visual C++ :


//   
#include "stdafx.h"
#include "internal-header.h"
...

( stdafx.h β€” precompiled.h) , . . , stdafx.h . . .


Visual C++ :


//   
#include <vector>
#include <map>
#include "stdafx.h" // :    
                    //      
#include "internal-header.h"
...

, -, , β€” . , , , . stdafx.h #ifdef', .


, , GCC , stdafx.h . , #ifdef' stdafx.h, :


//   
#include "stdafx.h"
#include <vector>
#include <map>
#include "internal-header.h"
...

. , (#ifdef guard'), .


Finally, to avoid including precompiled.h or stdafx.h explicitly in every source file, you can use forced inclusion (force include): the -include option in GCC and /FI in Visual C++.


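Now about CMake. CMake 3.16 added the target_precompile_headers() command. It lets you list, for a given target in a CMake project, the headers that should be precompiled. CMake then generates the umbrella header itself, so no hand-written stdafx.h or precompiled.h is needed, builds the pch file from it, and force-includes it into the target's sources using -include or /FI.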
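A minimal sketch of how this might look in a CMakeLists.txt. The project name, the target name my_target, the source file names, and the choice of headers are all hypothetical; only the target_precompile_headers() command itself (available since CMake 3.16) comes from CMake.

```cmake
cmake_minimum_required(VERSION 3.16)
project(pch_example CXX)

# Hypothetical target and source files, for illustration only.
add_executable(my_target source1.cpp source2.cpp)

# The headers listed here are gathered into a header generated by CMake,
# precompiled once, and force-included into every source file of my_target.
# In a real project this list would hold the heavy, rarely changing headers
# (for example the Boost headers the project uses most).
target_precompile_headers(my_target PRIVATE
    <vector>
    <map>
    <string>
)
```

PRIVATE keeps the precompiled header local to my_target, so targets that link against it do not inherit the list.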


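In addition, target_precompile_headers(<target1> REUSE_FROM <target2>) makes target1 reuse the pch file built for target2. For this to work, target1 and target2 must be compiled with the same compiler options, including the same preprocessor defines.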
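A sketch of reusing one precompiled header across two targets. The target names core and core_tests and their source files are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.16)
project(pch_reuse_example CXX)

# Hypothetical library that owns the precompiled header.
add_library(core STATIC core.cpp)
target_precompile_headers(core PRIVATE <vector> <map> <string>)

# Hypothetical test executable that reuses the pch file built for core
# instead of compiling its own copy. Both targets must end up with the
# same compiler flags and preprocessor definitions.
add_executable(core_tests core_tests.cpp)
target_link_libraries(core_tests PRIVATE core)
target_precompile_headers(core_tests REUSE_FROM core)
```

If the flags of the two targets differ, -Winvalid-pch on GCC is one way to see that the reused pch file was rejected.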



CMake can report how long each compile and link step takes if you launch them through its built-in "time" command:


set_property(GLOBAL PROPERTY RULE_LAUNCH_COMPILE "${CMAKE_COMMAND} -E time")
set_property(GLOBAL PROPERTY RULE_LAUNCH_LINK "${CMAKE_COMMAND} -E time")

The build output then looks like this:


[ 60%] Building CXX object source1.cpp.o
Elapsed time: 3 s. (time), 0.002645 s. (clock)
[ 64%] Building CXX object source2.cpp.o
Elapsed time: 4 s. (time), 0.001367 s. (clock)
[ 67%] Linking C executable my_target
Elapsed time: 0 s. (time), 0.000672 s. (clock)

Two GCC options are also useful here:


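-Winvalid-pch - warn if a precompiled header (gch file) is found but cannot be used
-H - print the name of each header file used, including precompiled ones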

In CMake they can be enabled like this:


add_compile_options(-Winvalid-pch)
add_compile_options(-H)

For a per-phase breakdown of where the compiler spends its time, GCC offers the -ftime-report option:


add_compile_options(-ftime-report)

With it, the compiler prints a report like this for each translation unit:


Execution times (seconds)
 phase setup             :   0.01 ( 4%) usr   0.00 ( 0%) sys   0.01 ( 3%) wall    1223 kB ( 8%) ggc
 phase parsing           :   0.21 (81%) usr   0.10 (100%) sys   0.33 (87%) wall   13896 kB (88%) ggc
 phase opt and generate  :   0.03 (12%) usr   0.00 ( 0%) sys   0.03 ( 8%) wall     398 kB ( 3%) ggc
 phase last asm          :   0.01 ( 4%) usr   0.00 ( 0%) sys   0.01 ( 3%) wall     237 kB ( 2%) ggc
 |name lookup            :   0.05 (19%) usr   0.02 (20%) sys   0.03 ( 8%) wall     806 kB ( 5%) ggc
 |overload resolution    :   0.00 ( 0%) usr   0.01 (10%) sys   0.02 ( 5%) wall      68 kB ( 0%) ggc
 dump files              :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 3%) wall       0 kB ( 0%) ggc
 preprocessing           :   0.06 (23%) usr   0.04 (40%) sys   0.12 (32%) wall    1326 kB ( 8%) ggc
 parser (global)         :   0.06 (23%) usr   0.02 (20%) sys   0.11 (29%) wall    6783 kB (43%) ggc
 ...
 TOTAL                 :   0.26             0.10             0.38              15783 kB
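Because -Winvalid-pch, -H, and -ftime-report are GCC options and produce a lot of output, one might prefer to enable them only for diagnostic builds with GCC. A sketch, assuming a hypothetical cache option named BUILD_TIMING_DIAGNOSTICS that is not part of the original project:

```cmake
# Hypothetical switch; enable with: cmake -DBUILD_TIMING_DIAGNOSTICS=ON <source-dir>
option(BUILD_TIMING_DIAGNOSTICS "Print per-file timing and PCH diagnostics" OFF)

# Must come after project(), which sets CMAKE_CXX_COMPILER_ID.
# The flags are added only when the C++ compiler is GCC.
if(BUILD_TIMING_DIAGNOSTICS AND CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
    add_compile_options(-Winvalid-pch -H -ftime-report)
endif()
```

This keeps regular builds quiet and avoids passing GCC-only flags to other compilers in a cross-platform project.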

To aggregate these per-file reports over the whole build, I used a small Python script.


Here are the totals for the whole project (before ---> after enabling precompiled headers):


PHASES SUMMARY
   phase opt and generate                   : 1309.1 s. = 21.8 m. ( 50 %)  --->  1577.5 s. = 26.3 m. ( 74 %)
   deferred                                 :  135.0 s. =  2.3 m. (  5 %)  --->   221.4 s. =  3.7 m. ( 10 %)
   integration                              :   62.2 s. =  1.0 m. (  2 %)  --->    85.1 s. =  1.4 m. (  4 %)
   template instantiation                   :  224.3 s. =  3.7 m. (  9 %)  --->   246.5 s. =  4.1 m. ( 12 %)
   callgraph optimization                   :   32.9 s. =  0.5 m. (  1 %)  --->    48.5 s. =  0.8 m. (  2 %)
   unaccounted todo                         :   36.5 s. =  0.6 m. (  1 %)  --->    49.7 s. =  0.8 m. (  2 %)
   |overload resolution                     :   82.1 s. =  1.4 m. (  3 %)  --->    95.2 s. =  1.6 m. (  4 %)
                                                        ...
   parser enumerator list                   :    2.1 s. =  0.0 m. (  0 %)  --->     0.5 s. =  0.0 m. (  0 %)
   parser function body                     :   32.0 s. =  0.5 m. (  1 %)  --->     9.3 s. =  0.2 m. (  0 %)
   garbage collection                       :   55.3 s. =  0.9 m. (  2 %)  --->    16.7 s. =  0.3 m. (  1 %)
   |name lookup                             :  132.8 s. =  2.2 m. (  5 %)  --->    63.5 s. =  1.1 m. (  3 %)
   body                                     :   87.5 s. =  1.5 m. (  3 %)  --->    18.2 s. =  0.3 m. (  1 %)
   parser struct body                       :  113.4 s. =  1.9 m. (  4 %)  --->    21.1 s. =  0.4 m. (  1 %)
   parser (global)                          :  158.0 s. =  2.6 m. (  6 %)  --->    25.8 s. =  0.4 m. (  1 %)
   preprocessing                            :  548.1 s. =  9.1 m. ( 21 %)  --->    88.0 s. =  1.5 m. (  4 %)
   phase parsing                            : 1119.7 s. = 18.7 m. ( 43 %)  --->   228.3 s. =  3.8 m. ( 11 %)
  TOTAL : 2619.2 s. = 43.7 m.  --->  2118.4 s. = 35.3 m.

, (parsing, preprocessing). , . , , . , , .


. . Boost . , . , , Boost. . , Boost. β€” , Boost, , .


To avoid building a separate pch file for every target, I used target_precompile_headers(<target1> REUSE_FROM <target2>).



As a result, the full build time dropped from 43 to 35 minutes.


In addition to precompiling headers, there are other ways to speed up full or partial builds. Some require editing and reorganizing the source files (for example, removing unnecessary includes from headers and moving them into the .cpp source files). Others need no source changes at all (for example, ccache). ccache, for instance, cut the time of a complete project build from 35 to 3 minutes, but more about that perhaps next time.


As for header precompilation, it is a very effective way to reduce a project's build time.

