Hello, Habr.Today we offer you a translation of an article that touches on a topic that isn’t the most discussed: compilation of code in Python, namely: work with abstract syntax tree (AST) and byte code. While Python is an interpreted language, such features in it are extremely important from an optimization point of view. We’ll talk about them today.Have you ever wondered how exactly the compiler optimizes your code so that it works faster? Want to know what an abstract syntax tree (AST) is and what it can be used for?This review article describes how Python code is converted to tree form (AST). Having built the AST of your program, you can go on to the search for opportunities to optimize and transform your code. However, keep in mind that optimizing Python programs in non-trivial ways is extremely difficult .Program code as a tree
How can a computer make sure that it evaluates expressions from your code in the correct order?To do this, he first remakes your program code into a tree structure called AST.When working with an interpreted programming language (such as Python), it is generally accepted that the interpreter passes through your code and does everything that it encounters, right on the spot, without converting Python code into machine code in any way. However, in practice, this execution scheme provokes a lot of problems, which make it very inconvenient.Take, for example, such a simple problem as the priority of operators. In a view expression 3 + 4 * x
, the part is first calculated4 * x
, and only then can 3 be added to the result of the multiplication. Perhaps you learned the priority of operators in mathematics lessons by drawing these trees under the expression:
Python uses the standard rules of mathematical notation (first, multiplication, then addition). In order not to confuse anything with the priority of the operators, in Python, at first it is built such a tree as in the previous picture. The general operation is addition (at the root of the tree), and while the left side of this sum is a regular number, on the right we have the product. The resulting data structure looks like this:BinOp(
left = Num(3),
op = Add(),
right = BinOp(
left = Num(4),
op = Mult(),
right = Name('x')
)
)
BinOp
means Binary Operation (Binary Operation) and indicates that in operations such as addition and multiplication - two operands. Naturally, you won’t get any addition if the right part of the expression doesn’t have the correct value - therefore, you must first multiply.In the theory of compilers and programming languages, such a tree is called Abstract Syntax Tree , or AST for short. The AST in the above example includes two nodes BinOp
, two nodes, Num
and one node Name
.There is a nice feature in Python - the ability to directly view and display AST for any particular Python program. All that is required is to import a standard moduleast
, parsing the program, and then displaying the result on the screen (by the way, parsing is the process of converting the program source code to the AST tree).import ast
my_tree = ast.parse("3 + 4*x")
print(ast.dump(my_tree))
However, you will notice that there will be additional nodes and fields in the AST generated by Python, and it will be displayed on one line, which makes it seem more complicated at first glance than it actually is.Module(body=[Expr(value=BinOp(left=Num(n=3), op=Add(), right=BinOp(left=Num(n=4), op=Mult(), right=Name(id='x', ctx=Load()))))])
Let's split it into separate nodes, like last time - and re-open the AST, already on top, as part of the whole tree:Module(body = [
Expr(
value = BinOp(
left = Num(n=3),
op = Add(),
right = BinOp(
left = Num(n=4),
op = Mult(),
right = Name(id='x', ctx=Load())
)
)
)
])
Obviously, Python “thinks” that the line we gave it for parsing is a whole module. The body of the module is a list of all the instructions contained in it. The only instruction in our example is an expression Expr
whose meaning is exactly what we discussed above.Note: the node Name
has an additional field ctx
(abbreviated as “context”), which has a value Load()
. So Python says that we use the value stored in the variable x
, and not (re) define or delete the name x
. Now your move, try to parse something like del x
or yourself x = 123
, and you will see how the field ctx
in the node Name
changes to Del()
or Store()
, respectively.By the way: if you install the moduleastunparse
, then the AST output to the screen can be made much more beautiful, and even convert the AST back into live Python code.The compilation process: the remainder
Having collected AST programs, it is in principle possible to complete the entire program by going through the AST and performing the operations in the order in which they are indicated. However, this approach has at least two drawbacks. Firstly, AST can occupy a relatively large amount of memory, especially if it contains redundant information. Secondly, AST traversal may take longer than necessary. In short: it can be done, but it is inefficient.The compiler does not process the AST directly, but prepares bytecode, which is then executed on the Python virtual machine. Although discussing the details of this process is beyond the scope of this article, the basic principle is that the compiler translates the AST into reverse Polish notation (RPN). Instead of putting an operator+
between the left and right operands, we put it after both operands. In the example 3 + 4*x
above, we get the sequence 3 4 x * +
(and this notation is especially good because you can immediately see from the sequence: first you need to perform the multiplication, and only then the addition). Since each of the five elements in this sequence can in principle be represented as a single byte, such a code is called a byte code. Python then uses the stacked virtual machine to efficiently execute this code.In other words, the process of compiling a program written in Python takes place in two stages. First, the program received by the input is parsed, and the result is an abstract syntax tree (AST). The compiler then passes through AST and generates bytecode. After that, the Python interpreter executes this bytecode. Having taken up optimization, it can be applied either at the AST level or at the bytecode level. Both of these options have their own advantages and disadvantages.Finally, keep in mind that although AST is common in any Python implementation, the process of translating AST into bytecode may be different, and in some Python implementations, say, JavaScript, rather than bytecode, may be generated at the intermediate stage.Paradigms from other programming languages
Not all programming languages use infix notation, as in Python. Two examples worth noting in this case are PostScript, where the program is written directly in reverse Polish notation, and Lisp, of course, where programs are usually written in Polish notation. So, our expression of the above example, in Lisp would take the following form: (+ 3 (* 4 x))
.Node conversion within AST
Having an AST program, how to convert individual parts of this tree? With the convenient built-in features of Python.If we take a look at AST and, for example, find that both fields left
and right
nodes BinOp
are numbers (nodes Num
), we can perform the corresponding calculations in advance, and then replace them with a BinOp
normal node Num
.Of course, you need to act very carefully so as not to change the behavior of the program, doing such transformations. For example, in len([a(), b(), c(), d()])
, it is clear that the result is 4. But we can not replace all the expression of the number 4 because four functions a
, b
, c
, d
still have properly invoked.Again, start with a simple optimization. Wherever a name appears in the source code of a program pi
, replace it with the value 3.14159265. The Python module ast
already provides the data structures necessary to do this: a converter class NodeTransformer
that goes through all the ASTs and checks for each node whether it can be replaced. By default, the transformation method simply returns the source node for each node, so that we get the same AST from which we started. But we can easily override the method for nodes Name
, say, so that it checks to pi
see if it is, and then returns the node Num
instead of the node with the original name ... import ast
class MyOptimizer(ast.NodeTransformer):
def visit_Name(self, node: ast.Name):
if node.id == 'pi':
return ast.Num(n=3.14159265)
return node
tree = ast.parse("y = 2 * pi")
optimizer = MyOptimizer()
tree = optimizer.visit(tree)
print(ast.dump(tree))
In order for the converter / optimizer to go through our tree, it is necessary to call its method visit
, which then will return a new, changed tree.Unfortunately, it is impossible to compile and run the resulting AST, the reason for this is one technical detail. This is not yet visible, but (almost) all nodes in the AST also have fields lineno
and col_offset
. They indicate the exact position of a particular node in the source code. If you do not install them properly, the compiler will swear and refuse to work.So, let's copy the appropriate fields from the source node Name
to the new node Num
. You can then compile and execute the resulting AST:import ast
class MyOptimizer(ast.NodeTransformer):
def visit_Name(self, node: ast.Name):
if node.id == 'pi':
result = ast.Num(n=3.14159265)
result.lineno = node.lineno
result.col_offset = node.col_offset
return result
return node
tree = ast.parse("print(2 * pi)")
optimizer = MyOptimizer()
tree = optimizer.visit(tree)
code = compile(tree, "<string>", "exec")
exec(code)
Note: function compile requires not only the source code (in which can be a program itself, or the AST line), but the file name (as we asked "<string>"
), as well as one of three: "exec"
, "eval"
or "single"
.The need to copy the fields describing the position of the node in the source code arises quite often. Therefore, the module ast
has a dedicated function copy_location
just for this purpose, and we can write:def visit_Name(self, node: ast.Name):
if node.id == 'pi':
result = ast.Num(n=3.14159265)
return ast.copy_location(result, node)
return node
Finally, you can extend the previous example so that it actually performs optimization, namely, on the node BinOp
. According to the transformation rule, first we must transform / optimize the left, and then the right node as part of BinOp. If as a result it turns out that both the left and right nodes are numbers, then the calculations can be performed right on the spot and replace the original with the BinOp
numerical result of the operation.class MyVisitor(ast.NodeTransformer):
def visit_BinOp(self, node: ast.BinOp):
node.left = self.visit(node.left)
node.right = self.visit(node.right)
if isinstance(node.left, ast.Num) and isinstance(node.right, ast.Num):
if isinstance(node.op, ast.Add):
result = ast.Num(n = node.left.n + node.right.n)
return ast.copy_location(result, node)
elif isinstance(node.op, ast.Mult):
result = ast.Num(n = node.left.n * node.right.n)
return ast.copy_location(result, node)
return node
def visit_Name(self, node: ast.Name):
if node.id == 'pi':
result = ast.Num(n=3.14159265)
return ast.copy_location(result, node)
return node
tree = ast.parse("y = 2 * pi + 1")
optimizer = MyOptimizer()
tree = optimizer.visit(tree)
print(ast.dump(tree))
By the way, the CPython compiler is already optimizing nodes BinOp
as shown here. The corresponding code is written in C and is given in Python / ast_opt.c . Please note: CPython optimizer is more universal and works not only with numbers, as in the example we are considering, but with different types of constant values.Testing nodes in AST
How to make sure that the transformations we made were correct? First you need to completely bypass AST and inspect the entire program.The optimizer presented above remains a serious flaw. What happens if you redefine somewhere in the program pi
? Just imagine something as simple and intelligible as pi = 4
. Our optimizer will simply replace pi on the left side of the expression with the numeric value 3.14159265, and Python will then refuse to compile because it cannot assign anything to a literal value.Perhaps this is precisely the behavior you sought, making pi a true constant, which is replaced during compilation and can never be reassigned, that is, it cannot get a different value. However, this definitely violates the semantics of Python.So what to do if we want to stick to the semantics of Python, but replace pi anyway wherever possible? In this case, you first need to go through the entire program and check if the value for is assigned somewhere pi
. Until we complicate it: we will not resort to replacing pi if at least one point in the program has any assignment of a value to pi
.Now we use the visitor node, similar to the converter node described above. Unlike the converter, the visitor is not intended to change any nodes, he simply goes through the AST and examines the nodes (visits them). Accordingly, visiting methods do not return anything.In our case, we check whether the node refers Name
to pi
and does anything other than loading the valuepi
(remember the context field ctx
).import ast
class MyVisitor(ast.NodeVisitor):
def __init__(self):
self.modify_pi = False
def visit_FunctionDef(self, node: ast.FunctionDef):
if node.name == 'pi':
self.modify_pi = True
self.generic_visit(node)
def visit_Name(self, node: ast.Name):
if node.id == 'pi' and not isinstance(node.ctx, ast.Load):
self.modify_pi = True
program = """
def pi():
return 3.1415
print(2 * pi())
"""
tree = ast.parse(program)
my_visitor = MyVisitor()
my_visitor.visit(tree)
print("Pi modified:", my_visitor.modify_pi)
The method generic_visit(node)
is called by the visitor for each node for which we do not provide a specialized visiting method. In other words: there is no such method visit_FunctionDef
in the class NodeVisitor
that we could call using super()
. Regarding function definitions, we must call a generic visitor to make sure that the entire body of the function is also processed correctly. Otherwise, we could hide the instruction in the function global pi
and globally change the value pi
, so that our optimizer would not notice anything.Local Values in Python
Our method, which allows us to determine if the programmer has changed pi, turned out to be rather rude. However, the Python compiler acts very similarly when it determines which names in the scope of a function correspond to local variables. If a variable changes somewhere in the scope of the function (and is not explicitly made global, for example, using the global instruction), then this variable is considered local in the entire scope of the function.The following example will execute fine without the fourth line. But, although x = 0
the fourth line is never executed, it is still considered an assignment tox
and therefore x becomes a local variable at the scale of the entire function, and even on line 3. That's why Python will swear that the variable x on the third line does not matter yet.x = 1
def print_x():
print(x)
if False: x = 0
print_x()
If you are interested in the details of exactly how Python works here, check out Python / symtable.c .Conclusion
In Python, as in most programming languages, a particular program is not executed directly from source code. In fact, the translation of the source code takes place in two stages: first, an abstract syntax tree (AST) is made from it, and then byte code for the stacked virtual machine. Python also provides a number of very nice features for analyzing and even transforming the AST of any particular Python program, after which the modified AST can be compiled and executed. Thus, we can easily implement our own optimizations.Of course, I simply omitted many details here. To make sure that your optimization will work correctly in all possible cases and circumstances is a very non-trivial matter. However, the purpose of this article is not to tell you about the optimization that is ready for use in production, but to give a basic idea of how Python analyzes your program code so that you can learn how to correctly convert it and then optimize it.