PEP 572 (Assignment Expressions in Python 3.8)

Hello, Habr. This time we will look at PEP 572, which describes assignment expressions. If you are still skeptical of the ":=" operator or do not fully understand the rules for its use, this article is for you. Here you will find many examples and answers to the question "Why is it this way?" The article turned out to be as complete as I could make it; if you are short on time, look at the section I wrote, which opens with the main "theses" for working comfortably with assignment expressions. Forgive me in advance if you find errors (write to me about them and I will fix them). Let's start:

PEP 572 - Assignment Expressions

PEP: 572
Title: Assignment Expressions
Authors: Chris Angelico <rosuav at gmail.com>, Tim Peters <tim.peters at gmail.com>, Guido van Rossum <guido at python.org>
Discussion: doc-sig at python.org
Status: Accepted
Type: Standards Track
Created: 28-Feb-2018
Python version: 3.8
Post-History: 28-Feb-2018, 02-Mar-2018, 23-Mar-2018, 04-Apr-2018, 17-Apr-2018, 25-Apr-2018, 09-Jul-2018, 05-Aug-2019
Resolution: mail.python.org/pipermail/python-dev/2018-July/154601.html (slow to load, even through a VPN, but it does load)


Abstract


This proposal introduces a way to assign to variables within an expression, using the new notation NAME := expr.

As part of this change, the evaluation order of dictionary comprehensions has also been updated. This ensures that the key expression is evaluated before the value expression (which lets you bind the key to a variable and then reuse that variable when computing the corresponding value).

During discussion of this PEP, the operator became unofficially known as the "walrus operator". The formal name of the construct is "Assignment Expression" (as per the PEP title), but it may also be referred to as a "Named Expression". The reference implementation in CPython, for example, uses that name internally.

Rationale


Naming the result of an expression is an important part of programming: it lets you use a descriptive name in place of a longer expression and makes it easy to reuse the value. Currently this can only be done with a statement, which makes it unavailable in list comprehensions and other expression contexts.

In addition, naming parts of a large expression can help with interactive debugging, making it easy to display intermediate results. Without a way to capture the results of nested expressions you have to restructure the source code; with assignment expressions you only need to insert a few "markers" of the form "name := expression". This removes the need for refactoring and so reduces the chance of unintentionally changing the code while debugging (a common cause of Heisenbugs, that is, bugs that change their behavior when you try to observe them and may then resurface unexpectedly in production). The resulting code is also easier for another programmer to follow.

The Importance of Real Code


During the development of this PEP, many people (both proponents and critics) tended to focus on toy examples on the one hand and on overly complex examples on the other.

The danger of toy examples is twofold: they are often too abstract to make anyone say "ooh, that's compelling", and they are easily dismissed with "I would never write it that way anyway". The danger of overly complex examples is that they give critics a convenient target for arguing that the feature should be dropped ("that's too confusing", they say).

Still, such examples have their use: they help clarify the intended semantics, so we give some of them below. To be convincing, however, examples must be based on real code that was written without any thought of this PEP, that is, code that is part of a genuinely useful application, whether large or small. Tim Peters helped us greatly by going through his personal code repository and picking examples of code he had written that (in his view) would be clearer if rewritten, sparingly, with assignment expressions. His conclusion: the proposed change would bring a modest but clear improvement to quite a few bits of his code.

Another kind of evidence from real code is indirect: how much programmers value compactness. Guido van Rossum searched through the Dropbox code base and found some evidence that programmers prefer writing fewer lines over writing shorter lines.

Case in point: Guido found several examples where a programmer repeated a subexpression, slowing down the program, in order to save one line of code. For example, instead of writing:

match = re.match(data)
group = match.group(1) if match else None

Programmers preferred this option:

group = re.match(data).group(1) if re.match(data) else None

Here is another example showing that programmers are sometimes willing to do more work to save an extra level of indentation:

match1 = pattern1.match(data)
match2 = pattern2.match(data)
if match1:
    result = match1.group(1)
elif match2:
    result = match2.group(2)
else:
    result = None

This code tries to match pattern2 even if pattern1 has already matched (in which case the match on pattern2 is never used). The more efficient rewrite looks like this, but is less attractive:

match1 = pattern1.match(data)
if match1:
    result = match1.group(1)
else:
    match2 = pattern2.match(data)
    if match2:
        result = match2.group(2)
    else:
        result = None
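
For comparison, here is a sketch of how the same logic might look with an assignment expression (Python 3.8+). The two patterns, the sample strings and the function name first_match are invented purely for illustration:

```python
import re

# Hypothetical patterns, chosen only to make the example runnable.
pattern1 = re.compile(r"id=(\d+)")
pattern2 = re.compile(r"(\w+):(\w+)")

def first_match(data):
    # pattern2 is tried only if pattern1 fails, and the logic stays flat.
    if match1 := pattern1.match(data):
        return match1.group(1)
    elif match2 := pattern2.match(data):
        return match2.group(2)
    return None

print(first_match("id=42"))      # 42
print(first_match("key:value"))  # value
print(first_match("???"))        # None
```

This keeps the efficiency of the nested version without the extra indentation.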

Syntax and semantics


In most contexts where arbitrary Python expressions can be used, an assignment expression can now appear. It has the form NAME := expr, where expr is any valid Python expression other than an unparenthesized tuple, and NAME is an identifier. The value of such an expression is the same as the value of expr, with the additional side effect that the target NAME is assigned that value:

# Handle a matched regex
if (match := pattern.search(data)) is not None:
    # Do something with match

# A loop that can't be trivially rewritten using 2-arg iter()
while chunk := file.read(8192):
    process(chunk)

# Reuse a value that's expensive to compute
[y := f(x), y**2, y**3]

# Share a subexpression between a comprehension filter clause and its output
filtered_data = [y for x in data if (y := f(x)) is not None]

Exceptional Cases


There are a few places where assignment expressions are not allowed, in order to avoid ambiguity or user confusion:

  • Unparenthesized assignment expressions are prohibited at the top level of an expression statement. Example:

    y := f(x)  # INVALID
    (y := f(x))  # Valid, though not recommended

    This rule makes the choice between an assignment statement and an assignment expression simpler for the programmer: there is no syntactic position where both are valid.
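  • Unparenthesized assignment expressions are prohibited at the top level of the right-hand side of an assignment statement. Example:

    y0 = y1 := f(x)  # INVALID
    y0 = (y1 := f(x))  # Valid, though discouraged

    The same applies to the value of a keyword argument in a call. Example:

    foo(x = y := f(x))  # INVALID
    foo(x=(y := f(x)))  # Valid, though probably confusing

    These rules avoid excessively confusing code, and parsing keyword arguments is complex enough already.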
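  • Unparenthesized assignment expressions are prohibited at the top level of a function default value. Example:

    def foo(answer = p := 42):  # INVALID
        ...
    def foo(answer=(p := 42)):  # Valid, though not great style
        ...

    This rule discourages side effects in a position whose exact semantics already confuse many users (compare the common recommendation against mutable default values), and it mirrors the similar prohibition in calls.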
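  • Unparenthesized assignment expressions are prohibited as annotations for arguments, return values and assignments. Example:

    def foo(answer: p := 42 = 5):  # INVALID
        ...
    def foo(answer: (p := 42) = 5):  # Valid, but probably never useful
        ...

    The reasoning is similar to the previous cases: an ungrouped mix of "=" and ":=" is hard to read correctly.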
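  • Unparenthesized assignment expressions are prohibited in lambda functions. Example:

    (lambda: x := 1) # INVALID
    lambda: (x := 1) # Valid, but unlikely to be useful
    (x := lambda: 1) # Valid
    lambda line: (m := re.match(pattern, line)) and m.group(1) # Valid

    This lets lambda always bind less tightly than ":=". A name binding at the top level of a lambda function is unlikely to be useful, since there is no way to make use of it. In cases where the name is used more than once, the expression is likely to need parentheses anyway, so this prohibition will rarely affect real code.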
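  • Assignment expressions inside f-strings require parentheses. Example:

    >>> f'{(x:=10)}'  # Valid, uses assignment expression
    '10'
    >>> x = 10
    >>> f'{x:=10}'    # Valid, passes '=10' to formatter
    '        10'

    This shows that what looks like an assignment operator in an f-string is not always one. The f-string parser uses ":" to introduce formatting options. To preserve backward compatibility, assignment expressions inside f-strings must be parenthesized. As noted above, this use of the assignment operator is not recommended.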


Scope of the Target


An assignment expression does not introduce a new scope. In most cases the scope in which the target will be bound is self-explanatory: it is the current scope. If this scope contains a nonlocal or global declaration for the target, the assignment expression honors that. A lambda (being an explicit, if anonymous, function definition) counts as a scope for this purpose.

There is one special case: an assignment expression occurring in a list, set or dict comprehension or in a generator expression (below collectively referred to as "comprehensions") binds the target in the containing scope, honoring a nonlocal or global declaration for the target in that scope, if one exists.

The motivation for this special case is twofold. First, it lets us conveniently capture a "witness" for an any() expression, or a counterexample for all(), for example:

if any((comment := line).startswith('#') for line in lines):
    print("First comment:", comment)
else:
    print("There are no comments")

if all((nonblank := line).strip() == '' for line in lines):
    print("All lines are blank")
else:
    print("First non-blank line:", nonblank)

Second, it provides a compact way to update mutable state from a comprehension, for example:

# Compute partial sums in a list comprehension
total = 0
partial_sums = [total := total + v for v in values]
print("Total:", total)
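
To make the scoping rule concrete, here is a minimal sketch that wraps the same idea in a function (the name partial_sums and the sample values are assumptions made for illustration). The target total is bound in the function's scope, not inside the comprehension, so it is still available after the comprehension finishes:

```python
from itertools import accumulate

def partial_sums(values):
    total = 0
    # total is rebound in the enclosing function scope on every iteration.
    sums = [total := total + v for v in values]
    return sums, total

sums, total = partial_sums([1, 2, 3, 4])
print(sums, total)  # [1, 3, 6, 10] 10

# The standard library offers the same running sums without name binding.
print(list(accumulate([1, 2, 3, 4])))  # [1, 3, 6, 10]
```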

However, an assignment expression target name cannot be the same as a for-target name appearing in any comprehension containing the assignment expression. The latter names are local to the comprehension in which they appear, so it would be contradictory for a contained use of the same name to refer to the scope containing the outermost comprehension instead.

For example, [i := i + 1 for i in range(5)] is invalid: the "for i" part establishes that i is local to the comprehension, but the "i := i + 1" part insists that i is a variable from the outer scope. For the same reason, the following examples are also invalid:

[[(j := j) for i in range(5)] for j in range(5)] # INVALID
[i := 0 for i, j in stuff]                       # INVALID
[i+1 for i in (i := stuff)]                      # INVALID

While it is technically possible to assign consistent semantics to these cases, it is difficult to determine whether those semantics actually make sense in the absence of real use cases. Accordingly, the reference implementation ensures that such cases raise SyntaxError rather than executing with implementation-defined behavior. This restriction applies even if the assignment expression is never executed:

[False and (i := 0) for i, j in stuff]     # INVALID
[i for i, j in stuff if True or (j := 1)]  # INVALID

For the comprehension body (the part before the first "for" keyword) and the filter expression (the part after "if" and before any nested "for"), this restriction applies only to target names that are also used as iteration variables in the comprehension. Lambda expressions appearing in these positions introduce a new explicit function scope and can therefore use assignment expressions without additional restrictions. [Translator's note: except in cases such as [i for i in range(2, (lambda: (s := 2))())], where the lambda sits in the iterable expression.]

Due to design constraints in the reference implementation (the symbol table analyzer cannot easily detect when names are reused between the leftmost comprehension iterable expression and the rest of the comprehension), assignment expressions are disallowed entirely as part of comprehension iterable expressions (the part after each "in" and before any subsequent "if" or "for" keyword). That is, all of these cases are invalid:

[i+1 for i in (j := stuff)]                    # INVALID
[i+1 for i in range(2) for j in (k := stuff)]  # INVALID
[i+1 for i in [j for j in (k := stuff)]]       # INVALID
[i+1 for i in (lambda: (j := stuff))()]        # INVALID

A further exception applies when an assignment expression occurs in a comprehension whose containing scope is a class scope. If the rules above would result in the target being bound in that class's scope, the assignment expression is invalid and raises SyntaxError:

class Example:
    [(j := i) for i in range(5)]  # INVALID

(The reason for this last exception is the implicit function scope created by the comprehension: there is currently no runtime mechanism for a function to refer to a variable in the enclosing class scope, and we do not want to add such a mechanism. If this problem is ever solved, this special case may be removed from the specification of assignment expressions. Note that the problem already exists for a variable defined in the class scope and used from inside the comprehension.)
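
The restrictions described in this section are enforced at compile time, so they can be checked without running the offending code. The following sketch uses the built-in compile() for that; the helper name is_valid and the list of cases are assumptions made for illustration:

```python
def is_valid(source):
    """Return True if the source compiles, False if it raises SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True

cases = [
    "[i := i + 1 for i in range(5)]",                 # rebinds the iteration variable
    "[i + 1 for i in (j := stuff)]",                  # used in the iterable expression
    "class Example:\n    [(j := i) for i in range(5)]",  # comprehension in a class body
    "[(j := i) for i in range(5)]",                   # fine at module or function level
]

results = [is_valid(source) for source in cases]
print(results)  # [False, False, False, True]
```

Nothing in cases is executed, which also illustrates that the restriction applies even when the assignment expression would never run.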

See Appendix B for examples of how assignment expressions found in comprehensions are translated into equivalent code.

Relative Precedence of :=


The := operator groups more tightly than a comma in all syntactic positions where it is legal, but less tightly than all other operators, including or, and, not, and conditional expressions (A if C else B). As follows from the "Exceptional Cases" section above, it is never allowed at the same level as =. If a different grouping is desired, use parentheses.

The := operator may be used directly in a positional function call argument; however, it is invalid directly in a keyword argument. Some examples to clarify what is technically valid or invalid:

# INVALID
x := 0

# Valid alternative
(x := 0)

# INVALID
x = y := 0

# Valid alternative
x = (y := 0)

# Valid
len(lines := f.readlines())

# Valid
foo(x := 3, cat='vector')

# INVALID
foo(cat=category := 'vector')

# Valid alternative
foo(cat=(category := 'vector'))

Most of the "valid" examples above are not recommended in practice, since people who skim your source code quickly may miss the distinction. But simple cases are not objectionable:

# Valid
if any(len(longline := line) >= 100 for line in lines):
    print("Extremely long line:", longline)

This PEP recommends always putting spaces around :=, similar to PEP 8's recommendation for = when used for assignment. (The difference is that PEP 8 disallows spaces around = when it is used for keyword arguments.)
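
A short sketch of what the precedence rules mean in practice; the variable names and the sample string are invented for illustration:

```python
line = "hello world"

# := binds less tightly than the comparison, so flag captures the bool.
(flag := len(line) > 5)
print(flag)  # True

# Parenthesize the assignment to capture the length itself.
if (n := len(line)) > 5:
    print(n)  # 11

# A conditional expression also ends up entirely on the right of :=.
(label := "long" if n > 5 else "short")
print(label)  # long
```

The bare parenthesized statements are used here only to show the grouping; as noted above, that style is valid but not recommended in real code.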

Change to Evaluation Order


In order to have precisely defined semantics, the proposal requires the evaluation order to be well defined. Technically this is not a new requirement: Python already has a rule that subexpressions are generally evaluated from left to right. However, assignment expressions make these side effects more visible, and we propose a single change to the current evaluation order:

  • In a dict comprehension {X: Y for ...}, Y is currently evaluated before X. We propose to change this so that X is evaluated before Y. (In a dict display like {X: Y} this is already the case, and also in dict((X, Y) for ...), which should clearly be equivalent to the dict comprehension.)
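
The practical effect of this change is that a name bound in the key expression can be reused in the value expression. A minimal sketch for Python 3.8+, with a hypothetical helper key_lengths and made-up input:

```python
def key_lengths(words):
    # The key expression runs first, so key is already bound
    # when the value expression len(key) is evaluated.
    return {(key := w.lower()): len(key) for w in words}

print(key_lengths(["Spam", "EGGS", "ham"]))  # {'spam': 4, 'eggs': 4, 'ham': 3}
```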


Differences Between Assignment Expressions and Assignment Statements


Most importantly, since := is an expression, it can be used in contexts where statements are not allowed, including lambda functions and comprehensions. Conversely, assignment expressions do not support the more advanced features found in assignment statements:

  • Multiple targets are not directly supported:

    x = y = z = 0  # Equivalent: (z := (y := (x := 0)))
  • Single assignment targets other than a plain NAME are not supported:

    # No equivalent
    a[i] = x
    self.rest = []
  • Precedence around commas is different:

    x = 1, 2  # Sets x to (1, 2)
    (x := 1, 2)  # Sets x to 1
  • Packing and unpacking of values either need extra parentheses or are not supported at all:

    # Equivalent needs extra parentheses
    loc = x, y  # Use (loc := (x, y))
    info = name, phone, *rest  # Use (info := (name, phone, *rest))
    
    # No equivalent
    px, py, pz = position
    name, phone, email, *other_info = contact
  • Inline type annotations are not supported:

    # Closest equivalent is "p: Optional[int]" as a separate declaration
    p: Optional[int] = None
  • Augmented assignment is not supported:

    total += tax  # Equivalent: (total := total + tax)
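
Several of the differences listed above can be checked directly. The sketch below uses only built-ins; the sample values and the two rejected source strings are chosen for illustration:

```python
# Multiple targets have to be nested explicitly.
(z := (y := (x := 0)))
print(x, y, z)  # 0 0 0

# A tuple on the right-hand side needs its own parentheses.
(loc := (1, 2))
print(loc)  # (1, 2)

# There is no augmented form: spell out the operation.
total, tax = 100, 8
(total := total + tax)
print(total)  # 108

# Only plain names can be targets; subscripts and attributes do not compile.
rejected = []
for source in ("(a[0] := 1)", "(obj.attr := 1)"):
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        rejected.append(source)
print(rejected)  # ['(a[0] := 1)', '(obj.attr := 1)']
```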

Specification changes during implementation


The following changes were made based on implementation experience and additional review after the PEP was first accepted and before Python 3.8 was released:

  • For consistency with other similar exceptions, and to avoid locking in an exception name that is not necessarily going to improve clarity for end users, the originally proposed TargetScopeError subclass of SyntaxError was dropped in favor of raising SyntaxError directly. [3]
  • Due to a limitation in CPython's symbol table analysis, the reference implementation raises SyntaxError for all uses of assignment expressions inside comprehension iterable expressions, rather than only when the assignment target conflicts with one of the iteration variables in the comprehension. This could be revisited given sufficiently compelling examples, but the extra complexity seems unwarranted for purely hypothetical use cases.

Examples


Python Standard Library Examples


site.py


env_base is used only on these lines, so placing its assignment on the if turns it into the "header" of the block.

  • Current Code:
    env_base = os.environ.get("PYTHONUSERBASE", None)
    if env_base:
        return env_base
  • Improved code:
    if env_base := os.environ.get("PYTHONUSERBASE", None):
        return env_base

_pydecimal.py


You can avoid nested ifs, thereby removing one level of indentation.

  • Current Code:
    if self._is_special:
        ans = self._check_nans(context=context)
        if ans:
            return ans
  • Improved code:
    if self._is_special and (ans := self._check_nans(context=context)):
        return ans

copy.py


The code looks more regular and avoids multiple levels of nested if statements. (See Appendix A for the origin of this example.)

  • Current Code:
    reductor = dispatch_table.get(cls)
    if reductor:
        rv = reductor(x)
    else:
        reductor = getattr(x, "__reduce_ex__", None)
        if reductor:
            rv = reductor(4)
        else:
            reductor = getattr(x, "__reduce__", None)
            if reductor:
                rv = reductor()
            else:
                raise Error(
                    "un(deep)copyable object of type %s" % cls)
  • Improved code:

    if reductor := dispatch_table.get(cls):
        rv = reductor(x)
    elif reductor := getattr(x, "__reduce_ex__", None):
        rv = reductor(4)
    elif reductor := getattr(x, "__reduce__", None):
        rv = reductor()
    else:
        raise Error("un(deep)copyable object of type %s" % cls)

datetime.py


tz is used only for s += tz. Moving its assignment inside the if helps to show its scope.

  • Current Code:

    s = _format_time(self._hour, self._minute,
                     self._second, self._microsecond,
                     timespec)
    tz = self._tzstr()
    if tz:
        s += tz
    return s
  • Improved code:

    s = _format_time(self._hour, self._minute,
                     self._second, self._microsecond,
                     timespec)
    if tz := self._tzstr():
        s += tz
    return s

sysconfig.py


Calling fp.readline() in the while condition and calling .match() on the if lines makes the code more compact without making it harder to understand.

  • Current Code:

    while True:
        line = fp.readline()
        if not line:
            break
        m = define_rx.match(line)
        if m:
            n, v = m.group(1, 2)
            try:
                v = int(v)
            except ValueError:
                pass
            vars[n] = v
        else:
            m = undef_rx.match(line)
            if m:
                vars[m.group(1)] = 0
  • Improved code:

    while line := fp.readline():
        if m := define_rx.match(line):
            n, v = m.group(1, 2)
            try:
                v = int(v)
            except ValueError:
                pass
            vars[n] = v
        elif m := undef_rx.match(line):
            vars[m.group(1)] = 0

Simplifying List Comprehensions


A list comprehension can map and filter efficiently by capturing the condition:

results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]

Similarly, a subexpression can be reused within the main expression by giving it a name on first use:

stuff = [[y := f(x), x/y] for x in range(5)]

Note that in both cases the variable y is bound in the containing scope (that is, at the same level as results or stuff).
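
Here is a runnable sketch of the filtering pattern. The function f below is a hypothetical stand-in for an expensive computation, and the call counter exists only to show that f runs once per element rather than twice:

```python
calls = []

def f(x):
    # Hypothetical stand-in for an expensive function.
    calls.append(x)
    return x - 2

def positive_ratios(input_data):
    # y is computed once in the filter and reused in the output tuple.
    return [(x, y, x / y) for x in input_data if (y := f(x)) > 0]

results = positive_ratios([0, 3, 4, 6])
print(results)     # [(3, 1, 3.0), (4, 2, 2.0), (6, 4, 1.5)]
print(len(calls))  # 4
```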

Capturing Condition Values


Assignment expressions can be effectively used in the conditions of an if or while statement:

# Loop-and-a-half
while (command := input("> ")) != "quit":
    print("You entered:", command)

# Capturing regular expression match objects
# See, for instance, Lib/pydoc.py, which uses a multiline spelling
# of this effect
if match := re.search(pat, text):
    print("Found:", match.group(0))
# The same syntax chains nicely into 'elif' statements, unlike the
# equivalent using assignment statements.
elif match := re.search(otherpat, text):
    print("Alternate found:", match.group(0))
elif match := re.search(third, text):
    print("Fallback found:", match.group(0))

# Reading socket data until an empty string is returned
while data := sock.recv(8192):
    print("Received data:", data)

Particularly with the while loop, this can remove the need for an infinite loop, an assignment, and a condition check. It also creates a smooth parallel between a loop that simply uses a function call as its condition and one that uses the call as its condition but also needs the returned value.
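
The loop-and-a-half pattern can be tried without a real socket or terminal. This sketch reads an in-memory stream in fixed-size chunks; the helper name read_chunks and the chunk size are assumptions made for illustration:

```python
import io

def read_chunks(stream, size=4):
    chunks = []
    # read() returns b"" at end of stream, which is falsy and ends the loop.
    while chunk := stream.read(size):
        chunks.append(chunk)
    return chunks

print(read_chunks(io.BytesIO(b"abcdefghij")))  # [b'abcd', b'efgh', b'ij']
```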

Fork


An example from the low-level UNIX world: [Translator's note: fork() is a system call on Unix-like operating systems that creates a new child process of the calling process.]

if pid := os.fork():
    # Parent code
else:
    # Child code

Rejected Alternatives


Proposals broadly similar to this one come up fairly often in the Python community. Below are a number of alternative syntaxes for assignment expressions, some of them specific to comprehensions, that have been rejected in favor of the one given above.

Changing the Scope Rules for Comprehensions


A previous version of this PEP proposed subtle changes to the scope rules for comprehensions, to make them more usable in class scope. However, these changes would have caused backward incompatibilities and were withdrawn, so that the PEP could focus entirely on assignment expressions.

Alternative spellings


These alternatives have broadly the same semantics as the current proposal but are spelled differently.

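  1. EXPR as NAME:

    stuff = [[f(x) as y, x/y] for x in range(5)]

    Since EXPR as NAME already has a meaning in import, except and with statements (with different semantics), this would create unnecessary confusion or require special-casing (for example, forbidding assignment within the headers of these statements).

    (Note that "with EXPR as VAR" does not simply assign the value of EXPR to VAR: it calls EXPR.__enter__() and assigns the result of that to VAR.)

    Additional reasons to prefer ":=" over this spelling:
    • In if f(x) as y the assignment target does not jump out at you; it reads like if f x blah-blah, and it looks too similar to if f(x) and y.
    • In all other situations where an as clause is allowed, even readers with intermediate skills are led to anticipate that clause by the keyword that starts the line, and the grammar ties that keyword closely to the as clause:
      • import foo as bar
      • except Exc as var
      • with ctxmgr() as var

      By contrast, the assignment expression does not belong to the if or while that starts the line, and assignment expressions are intentionally allowed in other contexts as well.
    • The parallel cadence between
      • NAME = EXPR
      • if NAME := EXPR

      reinforces the visual recognition of assignment expressions.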
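  2. EXPR -> NAME:

    stuff = [[f(x) -> y, x/y] for x in range(5)]

    This syntax is inspired by languages such as R and Haskell, and by some programmable calculators. (Note that a left-facing arrow y <- f(x) is not possible in Python, as it would be interpreted as less-than followed by unary minus.) It has a slight advantage over "as" in that it does not conflict with import, except and with, but is otherwise equivalent. However, it is entirely unrelated to Python's other use of -> (function return type annotations), and compared to ":=" (which dates back to Algol-58) it has a much weaker tradition.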
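  3. Adorning statement-local names with a leading dot:

    stuff = [[(f(x) as .y), x/.y] for x in range(5)] # with "as"
    stuff = [[(.y := f(x)), x/.y] for x in range(5)] # with ":="

    This has the advantage that leaked usage can be readily detected, which removes some forms of syntactic ambiguity. However, it would be the only place in Python where a variable's scope is encoded into its name, which would make refactoring harder.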
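  4. Adding a where: clause to any statement to create local name bindings:

    value = x**2 + 2*x where:
        x = spam(1, 4, 7, q)

    Execution order is inverted (the indented body runs first, followed by the "header"). This requires a new keyword, unless an existing keyword is repurposed (most likely with:). See PEP 3150 for prior discussion of this subject (where the proposed keyword is given:).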
  5. TARGET from EXPR:

    stuff = [[y from f(x), x/y] for x in range(5)]

    This syntax has fewer conflicts than as does (it conflicts only with the raise Exc from Exc notation), but is otherwise comparable to it. Instead of paralleling with expr as target: (which can be useful but can also be confusing), this option has no parallels at all, but it is evocative.


Special-Casing Conditional Statements


One of the most popular use cases for assignment expressions is in if and while statements. Instead of a more general solution, this proposal extends the syntax of just these two statements with "as", adding a means of capturing the value being tested:

if re.search(pat, text) as match:
    print("Found:", match.group(0))

This works beautifully, but ONLY when the desired condition is based on the truthiness of the captured value. It is thus effective for specific use cases (regex matches, socket reads that return an empty string when done) and completely useless in more complicated cases (for example, when the condition is f(x) < 0 and you want to capture the value of f(x)). It also brings no benefit to list comprehensions.

Advantages: no syntactic ambiguities. Disadvantages: it answers only a fraction of the possible use cases, even within if/while statements.
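
The "more complicated case" mentioned above is easy to express with the accepted syntax. In this sketch f and first_negative are hypothetical names, and the condition tests a comparison rather than the truthiness of the captured value:

```python
def f(x):
    # Hypothetical function whose result we want to both test and keep.
    return x * x - 10

def first_negative(xs):
    for x in xs:
        # The captured value y is compared with 0; an "if f(x) as y" form
        # could only have tested whether y itself is truthy.
        if (y := f(x)) < 0:
            return x, y
    return None

print(first_negative([5, 4, 3, 2]))  # (3, -1)
```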

Special-Casing Comprehensions


Another common use case for assignment expressions is comprehensions (list/set/dict comprehensions and generator expressions). As above, proposals were made for comprehension-specific solutions.

  1. where, let, or given:

    stuff = [(y, x/y) where y = f(x) for x in range(5)]
    stuff = [(y, x/y) let y = f(x) for x in range(5)]
    stuff = [(y, x/y) given y = f(x) for x in range(5)]

    This places the subexpression between the for loop and the main expression. It also introduces an additional language keyword, which creates conflicts. Of the three options, where reads the most cleanly, but it also has the greatest potential for conflict (for example, SQLAlchemy and numpy have where methods, as does tkinter.dnd.Icon in the standard library).
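  2. with NAME = EXPR:

    stuff = [(y, x/y) with y = f(x) for x in range(5)]

    As above, but reusing the with keyword. It does not read too badly and needs no additional language keyword. It is restricted to comprehensions, though, and cannot as easily be transformed into "longhand" for-loop syntax. It has the C problem that an equals sign in an expression can now create a name binding rather than perform a comparison. It would also raise the question of why "with NAME = EXPR:" cannot be used as a statement on its own.
  3. with EXPR as NAME:

    stuff = [(y, x/y) with f(x) as y for x in range(5)]

    As option 2, but using as rather than an equals sign. This aligns syntactically with other uses of as for name binding, but a simple transformation to for-loop longhand would create drastically different semantics: the meaning of with inside a comprehension would be completely different from its meaning as a stand-alone statement, while the syntax stays identical.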

Regardless of the spelling chosen, this introduces a stark difference between comprehensions and their equivalent unrolled for-loop form. It would no longer be possible to unwrap the loop into statement form without reworking the name bindings. The only keyword that could be repurposed for this task is with, which would give it subtly different semantics in a comprehension than in a statement; alternatively, a new keyword would be needed, with all the costs that entails.

Lowering Operator Precedence


There are two logical precedences for the := operator. Either it should bind as loosely as possible (on a par with the assignment statement), or it should bind more tightly than comparison operators. Placing its precedence between the comparison and arithmetic operators (to be precise: just below bitwise OR) would allow most uses in if and while conditions to be written without parentheses, since you most likely want to capture the value of something and then perform a comparison on it:

pos = -1
while pos := buffer.find(search_term, pos + 1) >= 0:
    ...

Once find() returns -1, the loop terminates. If := binds as loosely as = does, it instead captures the result of the comparison (generally True or False), which is less useful.

Although this behavior would be convenient in many situations, it would be harder to explain than "the := operator behaves just like the assignment statement". For that reason the precedence of := was made as close as possible to that of = (with the exception that := binds more tightly than a comma).
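
With the precedence that was actually adopted, the loop above needs parentheses around the assignment expression. A minimal sketch, with a hypothetical helper find_all and a made-up search string:

```python
def find_all(buffer, search_term):
    positions = []
    pos = -1
    # The parentheses make pos capture the index, not the comparison result.
    while (pos := buffer.find(search_term, pos + 1)) >= 0:
        positions.append(pos)
    return positions

print(find_all("abracadabra", "a"))  # [0, 3, 5, 7, 10]
```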

Allowing Commas to the Right


Some critics have argued that assignment expressions should accept unparenthesized tuples on the right, so that these two forms would be equivalent:

(point := (x, y))
(point := x, y)

(In the current version of the proposal, the latter is equivalent to ((point := x), y).)

However, adopting this stance would logically mean that an assignment expression used in a function call would also bind less tightly than the comma, which would give the following confusing equivalence:

foo(x := 1, y)
foo(x := (1, y))

The less confusing option is to make := bind more tightly than the comma.

Always Requiring Parentheses


It has been proposed to always require parentheses around an assignment expression. This would resolve many ambiguities, and parentheses will indeed often be needed to extract the desired subexpression. But in the following cases the extra parentheses seemed clearly redundant to us:

# Top level in if
if match := pattern.match(line):
    return match.group(1)

# Short call
len(lines := f.readlines())

Frequently Raised Objections


Why not just turn the assignment statements into expressions?


C and its derivatives define the = operator as an expression rather than a statement, as Python does. This allows assignment in more contexts, including places where comparisons are more common. The syntactic similarity between if (x == y) and if (x = y) belies their drastically different semantics. This PEP therefore uses := to make the distinction clear.

Why bother with assignment expressions if assignment statements exist?


The two forms differ in flexibility. The := operator can be used inside a larger expression, while the = statement can be augmented to the "+=" family of operators, can be chained, and can assign to attributes and subscripts.

Why not use a sublocal scope and prevent namespace pollution?


Previous revisions of this proposal included a true sublocal scope (restricted to a single statement) for assignment expressions, preventing name leakage and namespace pollution. Although this was an advantage in some situations, it added complexity in many others, and the costs were not justified by the benefits. In the interest of language simplicity, names bound by assignment expressions behave exactly like any other name binding. No longer need the variable? Handle it as usual: delete it with the del keyword, or prefix its name with an underscore.

(The author would like to thank Guido van Rossum and Christoph Groth for their suggestions to move the proposal in this direction. [2])

Style Guide Recommendations


Since assignment expressions can sometimes be used interchangeably with assignment statements, the question arises of which to prefer. For the benefit of style guides such as PEP 8, two recommendations are suggested:

  1. If either an assignment statement or an assignment expression can be used, prefer the statement: it is a clear declaration of intent.
  2. If using an assignment expression would lead to ambiguity about execution order, restructure the code to use statements instead.

Acknowledgements


The authors of this standard would like to thank Nick Coghlan and Steven D'Aprano for their significant contributions to this PEP, as well as Python Core Mentorship members for their help in implementing this.

Appendix A: Tim Peters's Findings


Here is a brief essay Tim Peters wrote on the topic.

I dislike "busy" lines of code, and I also dislike putting conceptually unrelated logic on a single line. So, for example, instead of:

i = j = count = nerrors = 0

I prefer to write:

i = j = 0
count = 0
nerrors = 0

So I suspected I would find few places where I wanted to use assignment expressions. I did not even consider them for lines that already stretched halfway across the screen. In other cases, "unrelated" ruled:

mylast = mylast[1]
yield mylast[0]

is a vast improvement over the briefer:

yield (mylast := mylast[1])[0]

The original two statements do entirely different conceptual things, and slamming them together is conceptually insane. In other cases, combining related logic made the code harder to understand, such as rewriting:

while True:
    old = total
    total += term
    if old == total:
        return total
    term *= mx2 / (i*(i+1))
    i += 2

as the briefer form below. The while test there is too subtle: it relies crucially on strict left-to-right evaluation. My brain isn't wired that way:

while total != (total := total + term):
    term *= mx2 / (i*(i+1))
    i += 2
return total
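
A runnable sketch can confirm that the two loops compute the same thing. The starting values used here (total = 0, term = 1, i = 1, mx2 = -x*x) are an assumption, chosen so that the loop sums the Taylor series of cos(x); the essay does not say what the original code computed, and both function names are hypothetical.

```python
import math

def cos_series_statement(x):
    """Loop-and-a-half form: stop once adding a term no longer changes total."""
    mx2 = -x * x
    total, term, i = 0.0, 1.0, 1
    while True:
        old = total
        total += term
        if old == total:
            return total
        term *= mx2 / (i * (i + 1))
        i += 2

def cos_series_walrus(x):
    """Walrus form: the left operand is read before := rebinds total."""
    mx2 = -x * x
    total, term, i = 0.0, 1.0, 1
    while total != (total := total + term):
        term *= mx2 / (i * (i + 1))
        i += 2
    return total

# Both forms perform the same floating-point operations in the same order.
print(cos_series_statement(1.0) == cos_series_walrus(1.0))  # True
```

The comparison in the walrus version works only because the old value of total is read on the left before the right-hand side rebinds it, which is exactly the subtlety the essay objects to.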

But cases like that were rare. Name binding is very frequent, and “sparse is better than dense” does not mean “almost empty is better than sparse” [translator's note: a reference to the Zen of Python]. For example, I have many functions that return None or 0 to say “I have nothing useful to return in this case, but since that is expected often, I am not going to annoy you with an exception.” This is essentially how regular expression search functions behave when they return None on no match. So there was a lot of code of this form:

result = solution(xs, n)
if result:
    # use result

I find the following version clearer, and certainly a little less to type and to read:

if result := solution(xs, n):
    # use result
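
The regular expression case mentioned above looks like this in practice. This is a minimal sketch; the function parse_version and its pattern are hypothetical, while re.search returning None on no match is standard library behavior.

```python
import re

def parse_version(text):
    """Return (major, minor) as ints, or None when text contains no version."""
    # re.search returns None when nothing matches, so the walrus both
    # stores the match object and tests it in a single condition.
    if match := re.search(r"(\d+)\.(\d+)", text):
        return int(match.group(1)), int(match.group(2))
    return None

print(parse_version("Python 3.8 adds the walrus"))  # (3, 8)
print(parse_version("no digits here"))              # None
```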

At first I did not give this much weight, but the pattern came up so often that it added up, and I soon became annoyed that I could not actually run the shorter code. That surprised me! [translator's note: apparently this was written before Python 3.8 was officially released.]

There are other cases where assignment expressions really shine. Rather than pick another one from my own code, here is a fine example that Kirill Balunov gave, from the copy() function in the standard library's copy.py:

reductor = dispatch_table.get(cls)
if reductor:
    rv = reductor(x)
else:
    reductor = getattr(x, "__reduce_ex__", None)
    if reductor:
        rv = reductor(4)
    else:
        reductor = getattr(x, "__reduce__", None)
        if reductor:
            rv = reductor()
        else:
            raise Error("un(shallow)copyable object of type %s" % cls)

The ever-increasing indentation is semantically misleading. The logic is conceptually flat: the first test that succeeds “wins”:

if reductor := dispatch_table.get(cls):
    rv = reductor(x)
elif reductor := getattr(x, "__reduce_ex__", None):
    rv = reductor(4)
elif reductor := getattr(x, "__reduce__", None):
    rv = reductor()
else:
    raise Error("un(shallow)copyable object of type %s" % cls)

Using simple assignment expressions lets the visual structure of the code emphasize the conceptual flatness of the logic, whereas the ever-increasing indentation obscured it.
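
The same “first successful test wins” shape can be tried out in a self-contained sketch. The function find_setting and its three mappings are hypothetical and merely stand in for the three lookups in copy.py:

```python
def find_setting(name, cli_args, env, defaults):
    """Return (value, source) from the first mapping holding a truthy value."""
    if value := cli_args.get(name):
        source = "command line"
    elif value := env.get(name):
        source = "environment"
    elif value := defaults.get(name):
        source = "defaults"
    else:
        raise KeyError(f"no value configured for {name!r}")
    return value, source

print(find_setting("port", {}, {"port": "8080"}, {"port": "80"}))
# ('8080', 'environment')
```

As in the copy.py code, each test relies on truthiness, so a falsy value such as 0 or an empty string falls through to the next branch.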

Here is a smaller example from my code that delighted me, because it let me put inherently related logic on a single line and remove an annoying “artificial” indentation level. The resulting if is about as long as I want my lines to get, but it remains easy to follow. The following code:

diff = x - x_base
if diff:
    g = gcd(diff, n)
    if g > 1:
        return g

Turned into:

if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
    return g
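
Wrapped in a function, the fragment becomes runnable. This is a sketch; the name shared_factor and the sample numbers are assumptions made for the illustration.

```python
from math import gcd

def shared_factor(x, x_base, n):
    """Return a common factor (> 1) of (x - x_base) and n, or None."""
    # `and` short-circuits, so gcd() runs only when diff is nonzero,
    # and `g` is bound only in that case.
    if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
        return g
    return None

print(shared_factor(10, 3, 91))  # 7, because gcd(7, 91) == 7
print(shared_factor(3, 3, 91))   # None, diff is 0 so gcd() is never called
print(shared_factor(5, 3, 91))   # None, gcd(2, 91) == 1
```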

So, all in all, I would not use assignment expressions in most lines that bind a name. But because name binding is so frequent, that still leaves many places where I would. In most of those, I found a small win that adds up because of how often it occurs, and in the rest I found a moderate to major improvement. I would certainly use assignment expressions more often than the ternary if, but significantly less often than augmented assignment [translator's note: the short forms *=, /=, +=, etc.].

Numerical example


I have another example that impressed me at the time.

If all the variables are positive integers, and a is at least as large as the nth root of x, then this algorithm returns the floor of the nth root of x (and roughly doubles the number of accurate bits per iteration):

while a > (d := x // a**(n-1)):
    a = ((n-1)*a + d) // n
return a

It is not obvious why this works, but it is no more obvious in the “loop and a half” form (an infinite loop with a conditional break). It is hard to prove the correctness of the algorithm without building on the right mathematical insight (the “arithmetic mean - geometric mean inequality”) and without knowing some non-trivial things about how nested floor functions behave. In other words, the difficulty lies in the mathematics, not in the programming.

And if you do know all that, the assignment expression form reads easily, like a simple sentence: “while the current guess is too large, get a smaller guess”, where the “too large?” test and the new guess share an expensive subexpression. To my eye, the classic form is harder to understand:

while True:
    d = x // a**(n-1)
    if a <= d:
        break
    a = ((n-1)*a + d) // n
return a
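
Both forms can be wrapped in functions and compared directly. In this sketch the function names are hypothetical, and x itself is used as the starting guess, which is always at least as large as the nth root for x >= 1:

```python
def iroot_walrus(x, n, a):
    """Floor of the n-th root of x, starting from a guess a >= that root."""
    while a > (d := x // a ** (n - 1)):
        a = ((n - 1) * a + d) // n
    return a

def iroot_loop(x, n, a):
    """The same algorithm in the classic loop-and-a-half form."""
    while True:
        d = x // a ** (n - 1)
        if a <= d:
            break
        a = ((n - 1) * a + d) // n
    return a

print(iroot_walrus(27, 3, 27))  # 3
print(iroot_walrus(26, 3, 26))  # 2
print(iroot_loop(27, 3, 27))    # 3
```

The loop stops as soon as a <= x // a**(n-1), which for integers is the same as a**n <= x, so the returned value is the largest integer whose nth power does not exceed x.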

Appendix B: Rough Code Translations for Comprehensions


This appendix attempts to clarify (though not specify) the rules that apply when a target occurs in a comprehension or a generator expression. For a number of illustrative examples we show the original code, containing a comprehension, and its translation, in which the comprehension is replaced by an equivalent generator function plus some “scaffolding”.

Since [x for ...] is equivalent to list(x for ...), these examples all use list comprehensions without loss of generality. And since they are meant only to clarify the general rules, they make no attempt to be realistic.

Note: comprehensions are already implemented by synthesizing nested generator functions like those shown in this appendix. The new part is the added declarations that establish the intended scope of assignment expression targets (the same scope they would resolve to if the assignment were performed in the block containing the outermost comprehension). For type inference purposes, these illustrative expansions do not imply that assignment expression targets are always Optional (but they do indicate the scope in which the target is bound).

Let's first recall what code is generated “under the hood” for a comprehension without assignment expressions:

  • Source code (EXPR most often uses the VAR variable):

    def f():
        a = [EXPR for VAR in ITERABLE]
  • The converted code (let's not worry about name conflicts):

    def f():
        def genexpr(iterator):
            for VAR in iterator:
                yield EXPR
        a = list(genexpr(iter(ITERABLE)))


Let's add a simple assignment expression.

  • Source:

    def f():
        a = [TARGET := EXPR for VAR in ITERABLE]
    
  • Converted Code:

    def f():
        if False:
            TARGET = None  # Dead code to ensure TARGET is a local variable
        def genexpr(iterator):
            nonlocal TARGET
            for VAR in iterator:
                TARGET = EXPR
                yield TARGET
        a = list(genexpr(iter(ITERABLE)))
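
The translation above can be checked on a concrete case. In this sketch the names running_totals and running_totals_translated are hypothetical; the second function is a hand-written expansion following the pattern shown above, not the code CPython actually emits.

```python
def running_totals(values):
    total = 0
    totals = [total := total + v for v in values]
    return totals, total  # total was rebound by the comprehension

def running_totals_translated(values):
    total = 0
    def genexpr(iterator):
        nonlocal total  # the declaration that the translation adds
        for v in iterator:
            total = total + v
            yield total
    totals = list(genexpr(iter(values)))
    return totals, total

print(running_totals([1, 2, 3]))             # ([1, 3, 6], 6)
print(running_totals_translated([1, 2, 3]))  # ([1, 3, 6], 6)
```

Because total is already assigned in the function body, no dead “if False” block is needed here to make it a local variable.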

Now let's add a global TARGET declaration to the function f().

  • Source:

    def f():
        global TARGET
        a = [TARGET := EXPR for VAR in ITERABLE]
    
  • Converted Code:

    def f():
        global TARGET
        def genexpr(iterator):
            global TARGET
            for VAR in iterator:
                TARGET = EXPR
                yield TARGET
        a = list(genexpr(iter(ITERABLE)))

Or, alternatively, let's add a nonlocal TARGET declaration to the function f().

  • Source:

    def g():
        TARGET = ...
        def f():
            nonlocal TARGET
            a = [TARGET := EXPR for VAR in ITERABLE]
    
  • Converted Code:

    def g():
        TARGET = ...
        def f():
            nonlocal TARGET
            def genexpr(iterator):
                nonlocal TARGET
                for VAR in iterator:
                    TARGET = EXPR
                    yield TARGET
            a = list(genexpr(iter(ITERABLE)))

Finally, let's nest two comprehensions.

  • Source:

    def f():
        a = [[TARGET := i for i in range(3)] for j in range(2)]
        # I.e., a = [[0, 1, 2], [0, 1, 2]]
        print(TARGET)  # prints 2
    
  • Converted Code:

    def f():
        if False:
            TARGET = None
        def outer_genexpr(outer_iterator):
            nonlocal TARGET
            def inner_generator(inner_iterator):
                nonlocal TARGET
                for i in inner_iterator:
                    TARGET = i
                    yield i
            for j in outer_iterator:
                yield list(inner_generator(range(3)))
        a = list(outer_genexpr(range(2)))
        print(TARGET)
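
The nested case is easy to run as written. In this sketch the function name nested and the variable names rows and last are hypothetical replacements for f, a and TARGET:

```python
def nested():
    # `last` is bound in the scope of nested(), not inside either
    # comprehension, so it is still available after both have finished.
    rows = [[last := i for i in range(3)] for j in range(2)]
    return rows, last

print(nested())  # ([[0, 1, 2], [0, 1, 2]], 2)
```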

Appendix C: No Changes in Scope Semantics


Note that nothing about Python's scoping semantics has changed. Function-local scopes are still resolved at compile time, and they still have indefinite temporal extent at run time (“full closures”). Example:

a = 42
def f():
    # `a` is local to `f`, but remains unbound
    # until the caller executes this genexp:
    yield ((a := i) for i in range(3))
    yield lambda: a + 100
    print("done")
    try:
        print(f"`a` is bound to {a}")
        assert False
    except UnboundLocalError:
        print("`a` is not yet bound")

Then:

>>> results = list(f()) # [genexp, lambda]
done
`a` is not yet bound
# The execution frame for f no longer exists in CPython,
# but f's locals live so long as they can still be referenced.
>>> list(map(type, results))
[<class 'generator'>, <class 'function'>]
>>> list(results[0])
[0, 1, 2]
>>> results[1]()
102
>>> a
42
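
A shorter sketch of the same point, without generators. The names counter and probe are hypothetical. The assignment expression further down makes counter local to the whole function at compile time, so the earlier read fails instead of seeing the global value:

```python
counter = 42

def probe():
    observations = []
    try:
        # `counter` is already local here because of the := below,
        # but it has not been bound yet.
        observations.append(counter)
    except UnboundLocalError:
        observations.append("unbound")
    squares = [(counter := i) ** 2 for i in range(3)]
    observations.append(counter)
    return observations, squares

print(probe())   # (['unbound', 2], [0, 1, 4])
print(counter)   # 42, the global name is untouched
```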

References


  1. Proof of concept implementation
  2. Discussion of the semantics of assignment expressions (loads slowly, even over a VPN)
  3. Discussion of TargetScopeError in PEP 572 (loads just as slowly as the previous link)

Copyright


This document has been placed in the public domain.

Source: github.com/python/peps/blob/master/pep-0572.rst

My part


To start, let's summarize:
  • To avoid ambiguity between the two forms, many “classic” places where both “=” and “:=” could be used carry restrictions, so the := operator often has to be enclosed in parentheses. These cases are worth reviewing in the section that describes the basic usage.
  • The precedence of assignment expressions is slightly higher than that of the comma. Because of this, an assignment expression does not build a tuple out of comma-separated values. It also makes it possible to use the := operator when passing arguments to a function.
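
The precedence rule from the summary can be seen in a short sketch. The helper show and the variable names are hypothetical and exist only to display what a call receives:

```python
def show(*args, **kwargs):
    """Return the positional and keyword arguments exactly as received."""
    return args, kwargs

# The comma binds more loosely than :=, so only 1 is bound to y,
# and the parentheses build the tuple (1, 2).
x = (y := 1, 2)
print(x, y)  # (1, 2) 1

# As a positional argument the expression needs no extra parentheses.
print(show(n := 5, n * 2))  # ((5, 10), {})

# A keyword argument still uses "=", and a walrus on its right-hand
# side has to be parenthesized.
print(show(limit=(m := 3)), m)  # ((), {'limit': 3}) 3
```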

In the end, I want to say that I like the new operator. It lets you write flatter code in conditions, “filter” lists, and (finally) get rid of that lonely line before an if. If people use assignment expressions for their intended purpose, they will be a very convenient tool that improves the readability and beauty of code (although the same can be said about any language feature).
