Wave Function Collapse Explained


Since releasing DeBroglie and Tessera, I have been asked many times to explain how they work. The generated output might look like magic, but the rules behind it are actually quite simple.

So what is the Wave Function Collapse (WFC) algorithm? It was developed by Maxim Gumin to generate tiled images from a simple configuration or from sample images. The algorithm is remarkably capable: browse the examples provided by Maxim and the Twitter hashtag #wavefunctioncollapse, or watch this video. The variety is amazing.


Maxim's README briefly explains how WFC works, but I think the subject deserves a more thorough treatment, starting from the very basics. Since the algorithm is a form of constraint programming, most of this article is devoted to explaining that concept; at the end we return to WFC.

Constraint programming is a way of instructing computers. Unlike imperative programming, where you specify a list of explicit instructions, or functional programming, where you specify mathematical functions, here you give the computer a rigorous description of the problem, and it uses built-in algorithms to find a solution.

Note: this guide covers concepts rather than methods and code. If you are more interested in implementation, you can use my open-source WFC library (github, documentation), although it is better to start by studying Maxim's implementation. If you use Unity, you can buy Tessera, my tool for applying WFC in that engine.

Mini Sudoku


As an illustrative problem, let's take a mini Sudoku: a puzzle played on a 4 × 4 grid with numbers pre-filled in some cells.


The goal is to fill every empty cell with a number from 1 to 4 according to the rules:

  • Each number from 1 to 4 must appear exactly once in each row.
  • Each number from 1 to 4 must appear exactly once in each column.
  • Each number from 1 to 4 must appear exactly once in each 2 × 2 corner square.

According to these rules, the solution will be as follows:


You may find this puzzle easy to solve, but we are interested in how a computer can do it. The problem splits into two parts: describing the problem to the computer, and then solving it with an algorithm.

Describing the problem


This is usually done with a constraint programming language. There are several such languages, and they all work along similar lines.

First we define the variables. These are not the same as variables in conventional programming: in the context of a constraint solver, a variable is an unknown value that the solver must determine. In our Sudoku example, we create a variable for every empty cell. For convenience, we can also create variables for the pre-filled cells.

Then, for each variable, we define a domain: the set of its possible values. In our Sudoku, the domain of every empty cell is {1, 2, 3, 4}, while the domain of a cell already containing a 1 is {1}.

Finally, we set the constraints: rules relating our variables. Most constraint programming languages come with a built-in all_distinct(...) constraint, which requires all the variables passed to it to take distinct values. The Sudoku rules therefore translate into 12 all_distinct constraints: one for each row, one for each column, and one for each 2 × 2 corner square. Only one kind of constraint is needed for this problem, but constraint solvers usually ship with a large library of different constraint types for describing your problem.

Note: in practice, constraint programming languages support arrays in constraints, so there are more concise ways of formulating this problem. There are many articles online about solving Sudoku this way.
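To make this concrete, here is a sketch in Python of how the mini Sudoku could be encoded as variables, domains, and constraint groups. The clue positions are hypothetical (they depend on the particular puzzle), and the representation is mine, not any particular solver's API:

```python
# Encode the 4x4 mini Sudoku as variables, domains, and constraints.
# Variables: one per cell, keyed by (row, col).
# Domains: {value} for a given clue, {1, 2, 3, 4} for an empty cell.
clues = {(0, 3): 4, (1, 0): 4, (2, 1): 2, (3, 2): 4}  # hypothetical puzzle
domains = {
    (r, c): {clues[(r, c)]} if (r, c) in clues else {1, 2, 3, 4}
    for r in range(4) for c in range(4)
}

# Constraints: 12 all_distinct groups -- 4 rows, 4 columns, 4 corner 2x2 squares.
rows = [[(r, c) for c in range(4)] for r in range(4)]
cols = [[(r, c) for r in range(4)] for c in range(4)]
squares = [
    [(r + dr, c + dc) for dr in range(2) for dc in range(2)]
    for r in (0, 2) for c in (0, 2)
]
all_distinct_groups = rows + cols + squares
assert len(all_distinct_groups) == 12
```

Each group simply lists the cells whose values must be pairwise distinct; a solver interprets these groups as all_distinct constraints.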

Algorithms for solving constraint satisfaction problems


There are several different solving techniques, but I will walk through the simplest one to demonstrate the principle. The domains for each cell are shown here:


These are all the possible values in each variable's domain.

Now take the first constraint. It requires all the values in the top row to be distinct. One cell already contains a 4, so no other cell in that row can hold a 4. We remove it from their domains.


Repeat this process for all 12 constraints. This is called constraint propagation, because information spreads from one variable to another through the constraints.


And look: some variables now have only one value left in their domain. That value must be the correct solution for that variable.


It seems we can get even more out of constraint propagation: the newly deduced values (shown in red) give us more variables with singleton domains, which propagate further in turn. Repeat the procedure until the puzzle is solved.
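The propagation step above can be sketched in a few lines of Python. This is a simplified version that only propagates singleton domains, which is all the mini Sudoku needs; the names are mine:

```python
def propagate(domains, groups):
    """Repeatedly apply the all_distinct constraints until nothing changes.

    Whenever a cell's domain shrinks to a single value, that value is
    removed from every other cell in the same constraint group.
    """
    changed = True
    while changed:
        changed = False
        for group in groups:
            for cell in group:
                if len(domains[cell]) == 1:
                    (value,) = domains[cell]
                    for other in group:
                        if other != cell and value in domains[other]:
                            domains[other].discard(value)
                            changed = True
    return domains

# Toy example: one row in which a 4 is already placed.
domains = {0: {1, 2, 3, 4}, 1: {1, 2, 3, 4}, 2: {1, 2, 3, 4}, 3: {4}}
propagate(domains, [[0, 1, 2, 3]])
print(domains[0])  # the 4 has been removed from the other cells' domains
```

The outer `while` loop is what makes deductions cascade: each newly created singleton can trigger further removals.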


Making the problem harder


Unfortunately, not all logic puzzles can be solved this easily. Here is a full-size Sudoku. It works the same way, except that now we have 9 different values, 9 rows, 9 columns, and 9 3 × 3 squares.


If we apply the technique described above, we get stuck:


All the constraints have now been fully propagated, but some variables are still undecided.

A human could solve this puzzle from here, but our computer algorithm cannot. Guides for human solvers say that at this point you need to look for increasingly sophisticated logical deductions. We don't want our algorithm to do that, not only because it is hard, but because we want a general-purpose algorithm that works with any set of constraints, not just the rules of Sudoku.

So computers do the dumbest possible thing: they guess. First, we record the current state of the puzzle. Then we pick an arbitrary variable and set it to one of its possible values. Say we assign the value 1 to the central cell.

Now we can propagate constraints a bit further. Here's what I got for the center column:


The blue values are the deductions made after the guess. As you can see, something has gone wrong. We filled in a few more variables, but look at the top middle cell. It is empty: no value in its domain satisfies the constraints (it cannot be 5, because 5 already appears in the same 3 × 3 square, and all the other digits already appear in its column).

A variable with an empty domain means there is no value we can assign to it, so the puzzle cannot be solved from this state. In other words, our guess was wrong.

Having hit such a contradiction, we backtrack. First, we restore the state saved before the guess. Then we remove the guessed value from the domain of the variable we guessed on: we now know it cannot be the correct answer.


That was a lot of work for little progress, but progress it is. Even after eliminating 1 from the central cell, though, we are still stuck. It's time to guess again, and again, and again.

There isn't much more to the algorithm. Every time we can't propagate any further, we make a guess and move on. And since we can get stuck several times before running into a contradiction, we may accumulate a whole stack of saved states and guesses.

Each backtracking step removes at least one value from some variable's domain, so despite the many rollbacks, the algorithm is guaranteed to terminate eventually.
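Putting guessing, propagation, and rollback together gives a complete, if naive, solver. This sketch uses the all_distinct groups of the mini Sudoku; the names and structure are illustrative, not taken from any solver library:

```python
import copy

def solve(domains, groups):
    """Backtracking search: propagate, guess, and roll back on contradiction."""
    # Propagate singleton domains to a fixed point (as in the earlier sketch).
    changed = True
    while changed:
        changed = False
        for group in groups:
            for cell in group:
                if len(domains[cell]) == 1:
                    (v,) = domains[cell]
                    for other in group:
                        if other != cell and v in domains[other]:
                            domains[other].discard(v)
                            changed = True
    if any(len(d) == 0 for d in domains.values()):
        return None  # empty domain: an earlier guess was wrong
    if all(len(d) == 1 for d in domains.values()):
        return domains  # every variable decided: solved
    # Guess: pick an undecided cell and try each of its values in turn.
    cell = min((c for c in domains if len(domains[c]) > 1),
               key=lambda c: len(domains[c]))
    for value in sorted(domains[cell]):
        trial = copy.deepcopy(domains)  # the saved state we roll back to
        trial[cell] = {value}
        result = solve(trial, groups)
        if result is not None:
            return result
    return None  # all guesses failed: contradiction at this level
```

The `copy.deepcopy` call plays the role of "recording the current state"; returning `None` and trying the next value is the rollback.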

This topic is far broader than I can cover here. In practice, optimizations such as how guesses are chosen, when to propagate, and when to attempt more sophisticated deductions can have a huge impact on running time. And since constraint problems generally scale exponentially, you might get your answer tomorrow, or in 5,000 years.

Back to Wave Function Collapse


Wave Function Collapse is a constraint problem with a twist: it has thousands of possible solutions. If we make our guesses at random, then instead of a solver we get a generator. That generator still obeys all the given constraints, which makes it far more controllable than most other procedural generation techniques.

The task of the WFC algorithm is to fill a grid of cells with tiles so that the images on adjacent tiles join up with each other. In the terminology used above, each tile is a value, each cell is a variable, and the tile placement rules are the constraints.

Maxim, WFC's author, found that with a sensible choice of tiles and suitable randomization you rarely need to backtrack at all, so backtracking can simply be left out. The essence of WFC is therefore:

  • For each cell, create a boolean array representing that variable's domain. The domains have one entry per tile, all initialized to true. A tile is in the domain if its entry is true.
  • While there are cells whose domains contain more than one element:
    • Pick a random cell using the “least entropy” heuristic.
    • Pick a random tile from that cell's domain and remove all the other tiles.
    • Update the domains of the other cells given this new information, i.e. propagate constraints. This must be done repeatedly, as changes to one cell can have knock-on effects on others.

Note: my terminology differs from Maxim's. He calls the array of domains a wave, and picking a random tile an observation, drawing an analogy with quantum mechanics. The comparison is superficial, so we will ignore it.
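Here is a minimal sketch of that loop in Python. Everything in it is illustrative: the `compatible` predicate stands in for the adjacency rules, the cell choice uses domain size as a stand-in for least entropy, and, as discussed above, there is no backtracking:

```python
import random

def wfc(width, height, tiles, compatible, seed=0):
    """Minimal WFC loop: set-valued domains, random collapse, propagation.

    `compatible(a, b)` says whether tile `b` may sit next to tile `a`;
    a symmetric, direction-free rule is assumed for brevity.
    """
    rng = random.Random(seed)
    cells = [(x, y) for x in range(width) for y in range(height)]
    domain = {c: set(tiles) for c in cells}

    def neighbours(x, y):
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if 0 <= x + dx < width and 0 <= y + dy < height:
                yield (x + dx, y + dy)

    def propagate(start):
        queue = [start]
        while queue:
            cell = queue.pop()
            for n in neighbours(*cell):
                # Keep only tiles of n compatible with something in cell.
                keep = {t for t in domain[n]
                        if any(compatible(s, t) for s in domain[cell])}
                if keep != domain[n]:
                    domain[n] = keep
                    queue.append(n)

    while any(len(domain[c]) > 1 for c in cells):
        # Stand-in for "least entropy": the smallest undecided domain.
        cell = min((c for c in cells if len(domain[c]) > 1),
                   key=lambda c: len(domain[c]))
        domain[cell] = {rng.choice(sorted(domain[cell]))}
        propagate(cell)
    return domain

# Toy usage: two tiles that may only sit next to an identical tile,
# so the finished grid is filled with a single tile type.
result = wfc(3, 3, ["A", "B"], lambda a, b: a == b)
```

The inner `propagate` already uses the queue-based optimization described later in this article; without it, you would recheck every pair of neighbours on every change.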

There are many refinements of these basic steps that add elegance and performance to the algorithm, but let's first walk through the steps as described.

Example


Let's say we need to fill a 3 × 3 grid with four types of tiles:


The only constraint: the colors of adjacent tiles must match.

As in the Sudoku example, I will show the contents of each domain as small monochrome images. When only one tile is left in a domain, I will draw it larger and in color.

First, in each cell, tiles of any kind can be located:


Run the main WFC loop. Randomly select a cell, say the top left. Now select a tile from its domain and remove all the others.


Propagate the constraints. Our only rule applies to adjacent tiles, so we need to update two cells:


Can we update any other cells given these shrunken domains? Yes! Even though we don't know which tile the first cell will get, we can see that every remaining option connects to the right. That means some tile types cannot be placed in the top right corner. The same goes for the bottom left corner.


No further propagation is possible, so we repeat the main loop: select a cell, select a tile, propagate. This time we take the top middle cell:


Another iteration: take the left middle cell. After propagating constraints, only one possible value remains for the central cell.


You get the idea by now; you can fill in the remaining cells yourself.

If you want to experiment with a more complex interactive example, I recommend this one.

Minimal entropy


When choosing the next cell to fill in, some choices are better than others. If you pick cells at random from anywhere in the grid, you can end up growing several separate filled regions at once. That can cause trouble: depending on the available tiles, the filled regions may turn out to be impossible to join up. So it's better to choose cells that are least likely to cause such problems later, namely cells with the fewest possible values in their domain (but no fewer than two):

  1. They are most likely adjacent to already-filled cells.
  2. With so few values left, leaving them for later risks running into trouble.

The smallest-domain approach works well when all tiles are equally likely. But if tiles are chosen from a weighted random distribution, something more is needed. Maxim recommends “minimal entropy”: choose the cell that minimizes:

entropy = −∑ᵢ pᵢ log(pᵢ)


where the sum runs over the tiles in the domain, and pᵢ is the probability of tile i.
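In code, the entropy of a cell's domain under tile weights might look like this (the tile names and weights are made up):

```python
import math

def entropy(domain, weights):
    """Shannon entropy of a cell's domain under normalized tile weights."""
    total = sum(weights[t] for t in domain)
    probs = [weights[t] / total for t in domain]
    return -sum(p * math.log(p) for p in probs)

weights = {"grass": 3.0, "road": 1.0}
# An undecided cell has higher entropy than a decided one.
print(entropy({"grass", "road"}, weights))  # ≈ 0.562
print(entropy({"grass"}, weights))          # 0.0
```

Note that with unequal weights, a two-tile domain containing one very likely tile has lower entropy than one with two equally likely tiles, which is exactly why this heuristic refines the smallest-domain rule.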

Efficient computation


I don't want to get too deep into implementation details, but there are two optimizations that speed things up so much that they can't go unmentioned.

Since our only rule concerns adjacent tiles, we only need to recheck constraints whose outcome could have changed. So whenever a cell is updated, we add it to a queue of cells awaiting processing. We then pop cells from the queue one at a time, checking each one's neighbours. If those checks update other cells, they go into the queue too. Repeating this until the queue is empty guarantees that all constraints have been checked, yet no constraint is rechecked unless the domain of one of its cells has changed.

The second optimization concerns the adjacency check itself. To check the constraint, we need to answer the question: “Given the tiles in the domain of cell A, which tiles are possible in the domain of an adjacent cell B?”

This is sometimes phrased in terms of the “support” provided by cell A. For a given tile b in B's domain, the answer is easy to compute with a loop over the tiles in A's domain: if A contains at least one tile that can be placed next to b, then b is still viable, at least as far as this pair of cells is concerned. If A contains no such tile, then b cannot be placed in cell B, and we can discard it.

Loops like this are easy to write but extremely slow. If we instead store the answer in an auxiliary data structure, we can answer the question quickly whenever we need it. The structure is a large array with an entry for each side of each cell and each tile. In our example there are 9 cells, each with 4 sides, and 4 tile types, so we need 9 × 4 × 4 entries. Each entry is a support counter: for each cell/tile/side, it counts the tiles in the adjacent cell's domain that can be placed next to the tile in question. If a counter drops to zero, that tile can no longer be placed, since nothing can sit next to it.
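As a sketch of the idea, here is support counting reduced to a single pair of adjacent cells, "A" and "B". A real implementation keeps one counter per cell, side, and tile, as described above; all names here are illustrative:

```python
# support[(cell, tile)] counts how many tiles in the *other* cell's
# domain can sit next to `tile`; when it hits zero, `tile` must go too.

def init_supports(domains, compatible):
    support = {}
    for s in domains["A"]:
        support[("A", s)] = sum(1 for t in domains["B"] if compatible(s, t))
    for t in domains["B"]:
        support[("B", t)] = sum(1 for s in domains["A"] if compatible(s, t))
    return support

def remove_tile(cell, tile, domains, support, compatible):
    """Remove `tile` from `cell` and cascade removals via the counters."""
    other = "B" if cell == "A" else "A"
    domains[cell].discard(tile)
    for t in list(domains[other]):
        # Did `tile` support `t`? (compatible() takes A's tile first.)
        supported = compatible(tile, t) if cell == "A" else compatible(t, tile)
        if supported:
            support[(other, t)] -= 1
            if support[(other, t)] == 0:
                remove_tile(other, t, domains, support, compatible)

# Toy example: tiles may only sit next to an identical tile.
domains = {"A": {1, 2, 3}, "B": {1, 2, 3}}
compatible = lambda a, b: a == b
support = init_supports(domains, compatible)
remove_tile("A", 1, domains, support, compatible)
# Tile 1 lost its only support in B, so it disappears there as well.
```

The point of the counters is that each removal costs a decrement instead of a full rescan of the neighbouring domain.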

Algorithm Extensions


Because WFC is grounded in the more general framework of constraint problems, there are many ways to extend the algorithm by varying the constraints used.

One obvious change: nothing forces us to use square cells. WFC works great on hexagonal grids, in three dimensions, and even on more unusual surfaces.




You can introduce additional constraints: fix certain tiles in place, or build “modules” spanning several cells.

Additional constraints can matter a great deal in practice. Since WFC only constrains adjacent tiles, it rarely generates large-scale structure, which gives its output a homogeneous, unnatural look. The algorithm is at its best for tile-level choices; for larger structures, it is better to use a different technique or to introduce further constraints.

In another article, I talked about how you can achieve the best results from WFC.

Overlapping WFC


One interesting extension of the algorithm is “overlapping” WFC. In the examples above, the key constraint concerned pairs of adjacent tiles. That is enough to make lines connect and to produce simple structures such as caves, rooms, and so on, but it throws away a lot of information. If we wanted, say, red tiles always to appear near blue tiles while never directly touching them, that would be hard to express in terms of adjacency alone.

Maxim proposed the concept of overlapping WFC: replace the adjacency constraint with a new constraint that spans several tiles at once, for example requiring every 3 × 3 group of cells in the output to match some 3 × 3 group in a sample grid. The patterns present in the sample are then repeated over and over, in different combinations, in the output:


This constraint is much more expressive than “simple” adjacency constraints, and since it is driven by a sample, it lends itself well to certain artistic goals. That said, I haven't yet come across anything that uses it as interestingly. Perhaps it is because the algorithm is harder to implement, or runs more slowly, or sometimes reproduces the sample too faithfully, which loses some of the organic feel.
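The sample-processing step of overlapping WFC can be sketched as collecting every n × n pattern that occurs in a sample grid. The sample below is made up, and real implementations (including Maxim's) typically also consider rotations, reflections, and wrap-around, which are omitted here:

```python
def extract_patterns(sample, n=3):
    """Collect all n x n patterns occurring in a sample grid (no wrapping)."""
    h, w = len(sample), len(sample[0])
    patterns = set()
    for y in range(h - n + 1):
        for x in range(w - n + 1):
            patterns.add(tuple(tuple(sample[y + dy][x + dx] for dx in range(n))
                               for dy in range(n)))
    return patterns

sample = [
    "....",
    ".##.",
    ".##.",
    "....",
]
print(len(extract_patterns(sample)))  # 4 distinct 3x3 patterns
```

The resulting pattern set then plays the role of the tile set: a cell's domain lists the patterns that could overlap it, and propagation removes patterns that disagree with their neighbours.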

What's next?


Constraint solving is a large and actively developing area of computer science, and I have only scratched the surface. WFC itself, like much of procedural generation, is still new. I recommend reading r/proceduralgeneration, #wavefunctioncollapse, @exutumno and @osksta to get an idea of recent developments.

You can also read my other article on WFC, experiment with my open-source library or Unity tool, and check out my other articles on procedural generation.
