Understanding memory management in modern programming languages

Hello, Habr! I present to you the translation of the article " Demystifying memory management in modern programming languages " by Deepu K Sasidharan.

In this series of articles, I would like to dispel the veil of mysticism over memory management in software (hereinafter referred to as software) and consider in detail the possibilities offered by modern programming languages. I hope that my articles will help the reader to look under the hood of these languages ​​and learn something new for themselves.

An in-depth study of the concepts of memory management allows you to write more efficient software, because the style and practice of coding have a great influence on the principles of allocating memory for the needs of the program.

Part 1: Introduction to Memory Management


Memory management is a whole set of mechanisms that allow you to control program access to the computer's RAM. This topic is very important in software development and, at the same time, causes difficulties or even remains a black box for many programmers.

What is RAM used for?


When a program runs on a computer’s operating system, it needs access to random access memory (RAM) in order to:

  • upload your own bytecode to execute;
  • store the values ​​of variables and data structures that are used in the process;
  • load external modules that the program needs to complete tasks.

In addition to the space used to load its own bytecode, the program uses two areas of RAM when working - the stack (stack) and the heap (heap).

Stack


The stack is used for static allocation of memory. It is organized on the principle of "last come - first come out" ( LIFO ). You can imagine the stack as a stack of books - it is allowed to interact only with the topmost book: read it or put a new one on it.

  • thanks to the mentioned principle, the stack allows you to quickly perform operations with data - all manipulations are performed with the “top book in the stack”. The book is added to the very top if you need to save data, or taken from the top if the data needs to be read;
  • , , , — ;
  • — . , , — , . , , , . , , , — , ;
  • ;
  • ; ;
  • ;
  • (stack overflow), . , ;
  • , ;



JavaScript. , .


The heap is used to dynamically allocate memory, however, unlike the stack, the data in the heap must first be found using the "table of contents". One can imagine that a heap is such a large multi-level library, in which, following certain instructions, you can find the necessary book.

  • operations on the heap are performed somewhat slower than on the stack, since they require an additional step to search for data;
  • heap stores data of dynamic sizes, for example, a list into which you can add an arbitrary number of elements;
  • heap common to all application threads;
  • due to its dynamic nature, the heap is non-trivial in management and with it most of all problems and errors associated with memory arise. Ways to solve these problems are provided by programming languages;
  • , — ( , ), , , ;
  • (out of memory), , ;
  • , , , .


?


Unlike hard drives, RAM is very limited (although hard drives, of course, are also not unlimited). If a program consumes memory without freeing it, then, ultimately, it will absorb all available reserves and try to go beyond the memory limits. Then it will simply fall on its own, or, even more dramatic, will bring down the operating system. Therefore, it is highly undesirable to be frivolous with memory manipulations in software development.

Different approaches


Modern programming languages ​​try to simplify the work with memory as much as possible and remove part of the headache from developers. Although some venerable languages ​​still require manual control, most still provide more elegant automatic approaches. Sometimes a language uses several approaches to managing memory at once, and sometimes a developer can even choose which option will be more effective specifically for his tasks (a good example is C ++ ). Let's move on to a brief overview of the various approaches.

Manual memory management


The language does not provide mechanisms for automatic memory management. The allocation and freeing of memory for created objects remains entirely on the developer's conscience. An example of such a language - the C . It provides a number of methods ( malloc , realloc , calloc and free ) for managing memory - the developer must use them to allocate and free memory in his program. This approach requires great accuracy and care. It is also particularly difficult for beginners.

Garbage collector


Garbage collection is the process of automatic memory management on the heap, which consists in finding unused portions of memory that were previously occupied for the needs of the program. This is one of the most popular options for memory management in modern programming languages. The garbage collection routine usually starts at predetermined time intervals and it happens that its launch coincides with resource-consuming processes, resulting in a delay in the application. JVM ( Java / Scala / Groovy / Kotlin ), JavaScript , Python , C # , Golang , OCaml andRuby are examples of popular languages ​​that use the garbage collector.

  • Mark & ​​Sweep garbage collector : This is an algorithm that operates in two phases: first, it marks the objects in memory that are referenced, and then frees the memory from objects that have not been tagged. This approach is used, for example, in JVM, C #, Ruby, JavaScript, and Golang. There are several different garbage collection algorithms to choose from in the JVM, and JavaScript engines such as V8 use a tagging algorithm in addition to link counting. Such a garbage collector can be connected in C and C ++ as an external library.

    Visualization of the tagging algorithm: objects linked by links are marked, and then unreachable ones are deleted
  • : — , . . , , , , PHP, Perl Python. C++;


(RAII)


RAII is a software idiom in OOP, the meaning of which is that the memory area allocated for an object is strictly tied to its lifetime. Memory is allocated in the constructor and freed in the destructor. This approach was first implemented in C ++ , and is also used in Ada and Rust .

Automatic Link Counting (ARC)


This approach is very similar to garbage collection with reference counting, however, instead of starting the counting process at certain time intervals, instructions for allocating and freeing memory are inserted directly into the bytecode at the compilation stage. When the reference counter reaches zero, the memory is freed as part of the normal program flow.

Automatic reference counting still does not allow processing of circular links and requires the developer to use special keywords for additional processing of such situations. ARC is one of the features of the Clang translator , therefore it is present in the Objective-C and Swift languages . Also, automatic reference counting is available for use in Rust.and the new C ++ standards with smart pointers .

Possession


This is a combination of RAII with the concept of ownership, when each value in memory should have only one owner variable. When the owner leaves the execution area, the memory is immediately freed. We can say that this is approximately like counting links at the compilation stage. This approach is used in Rust and at the same time I could not find any other language that would use a similar mechanism.




This article examined the basic concepts in the field of memory management. Each programming language uses its own implementations of these approaches and algorithms optimized for various tasks. In the following parts, we will take a closer look at memory management solutions in popular languages.

Read also the other parts of the series:



References





If you liked the article, please put a plus or write a comment.

You can subscribe to the author of the article on Twitter and on LinkedIn .

Illustrations:
Stack visualization done using pythontutor .
Ownership concept illustration: Link Clark, The Rust team under Creative Commons Attribution Share-Alike License v3.0 .

For the proof of translation, special thanks to Alexander Maksimovsky and Katerina Shibakova

All Articles