Realistic combat AI for a 2D game


Although Close Quarters is predominantly a multiplayer game, it still needs sophisticated AI bots to keep players engaged when their Internet connection is poor or no other players are online. Bots also play an important supporting role in some game modes. They must therefore behave believably and exhibit a range of complex behaviors, including using cover, using items at the right moments, flanking, and throwing grenades and fleeing from them.

Environment and constraints


The game environment consists of polygons. Most polygons block movement, sight, and gunfire, but there are also "low" polygons that block only movement. The environment is densely packed with obstacles and cover.

The AI is also constrained by several technical factors. Most importantly, the server that runs the bots when few players are online must perform well on an inexpensive VPS with at least ten bots. In addition, CPU usage must stay low enough to run multiple server instances on the same VPS without exceeding the CPU limit and triggering penalties from the VPS provider.


Figure 1: environments


Figure 2: the player's view, with obstacles blocking sight (gray areas are invisible to the player)

Navigation and visibility mesh, tactical pathfinding


Bots use a dense navigation mesh composed of discrete points. The mesh is generated as follows: first, the polygons that make up the environment are expanded and merged. Additional nodes are placed near corners, because these spots are the most likely candidates for cover positions. The resulting walkable space is then triangulated to generate the mesh.


Figure 3: navigation mesh

Because bots must perform tactical pathfinding, the navigation mesh must contain visibility data that lets us quickly check whether a node provides cover from a given enemy. Each node therefore stores an array of 24 bytes, each representing the precomputed approximate distance from the node to the nearest sight-blocking obstacle in a particular direction.


Figure 4: Visibility data stored in the node (blue lines). Each line represents a single-byte value that defines visibility in a given direction.
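As a minimal sketch, the per-node precomputation described above might look like this in Python; the `Node` layout and the `raycast` helper are illustrative assumptions, not the game's actual code:

```python
import math

NUM_DIRECTIONS = 24   # one single-byte entry per direction, 15 degrees apart
MAX_DISTANCE = 255    # clamp so each distance fits in one byte

class Node:
    def __init__(self, pos):
        self.pos = pos
        self.vis = bytearray(NUM_DIRECTIONS)

def precompute_visibility(node, raycast):
    """Fill node.vis with the approximate distance from the node to the
    nearest sight-blocking obstacle in each of the 24 directions.
    raycast(origin, angle) is a hypothetical engine helper."""
    for i in range(NUM_DIRECTIONS):
        angle = 2 * math.pi * i / NUM_DIRECTIONS
        node.vis[i] = min(int(raycast(node.pos, angle)), MAX_DISTANCE)

# toy raycast: open for 100 units roughly to the east, 30 units elsewhere
def toy_raycast(origin, angle):
    return 100 if abs(angle) < 0.2 else 30

node = Node((0.0, 0.0))
precompute_visibility(node, toy_raycast)
```

Storing one byte per direction keeps the whole array at 24 bytes per node, which is what makes a dense mesh affordable in memory.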

With this data, we can run an A* graph search over the navigation mesh in which nodes exposed to known enemies are deprioritized without costly line-of-sight checks. To test whether a node is exposed to an enemy, we compute the direction and distance from the node to the enemy, then check whether that distance is less than the distance stored in the array element closest to that direction. We can also check whether the enemy is facing the node. We then apply a penalty factor to the cost of traversing nodes exposed to enemies, and the resulting path tends to avoid them.
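The cheap exposure test and the traversal-cost penalty could be sketched as follows, assuming the 24-byte per-node array described above (the names and the penalty value are illustrative):

```python
import math

NUM_DIRECTIONS = 24

def is_exposed(node_pos, node_vis, enemy_pos):
    """Check whether a node can be seen from an enemy position using only
    the node's precomputed 24-direction visibility distances, with no
    raycast.  node_vis is the per-node byte array."""
    dx = enemy_pos[0] - node_pos[0]
    dy = enemy_pos[1] - node_pos[1]
    dist = math.hypot(dx, dy)
    angle = math.atan2(dy, dx) % (2 * math.pi)
    # index of the stored direction closest to the node->enemy direction
    i = round(angle / (2 * math.pi / NUM_DIRECTIONS)) % NUM_DIRECTIONS
    return dist < node_vis[i]

# during A*, exposed nodes simply cost more to traverse,
# so paths tend to flow around them (penalty value is an assumption)
EXPOSURE_PENALTY = 5.0

def move_cost(base_cost, exposed):
    return base_cost * EXPOSURE_PENALTY if exposed else base_cost
```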

Beyond finding a path between two given points, the same visibility data supports other "tactical" actions. For example, a bot finds cover by performing a breadth-first search and stopping as soon as it finds a protected node. Line-of-sight tests then verify that the node actually provides cover. Similarly, we can generate flanking attack paths by running an A* search toward the target while applying heavy penalties to exposed nodes within the target's firing cone. The search stops as soon as it reaches an exposed node outside that firing cone. (One problem with this approach is that bots leaving the firing cone constantly try to close in on the target and therefore seem too aggressive; this could probably be fixed by adjusting the A* heuristic so that the bot moves not directly toward the target but toward any node at a chosen distance from it.)
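The cover search above can be sketched as a plain breadth-first search over the mesh; `exposed` stands for the cheap per-node visibility test and `has_line_of_sight` for the exact, costlier confirmation, and all names here are illustrative:

```python
from collections import deque

def find_cover(start, neighbours, exposed, has_line_of_sight):
    """Breadth-first search outward from `start`, stopping at the first
    node the enemy cannot see.  exposed(node) uses the cheap per-node
    visibility data; has_line_of_sight(node) is the exact check used
    only to confirm the candidate."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if not exposed(node) and not has_line_of_sight(node):
            return node
        for n in neighbours(node):
            if n not in seen:
                seen.add(n)
                queue.append(n)
    return None  # no reachable cover

# tiny example: nodes 0..3 in a line, the enemy sees nodes 0 and 1
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
cover = find_cover(0, lambda n: graph[n],
                   exposed=lambda n: n < 2,
                   has_line_of_sight=lambda n: n < 2)
```

Because BFS expands nodes in order of distance, the first protected node found is also among the nearest, which is exactly what a bot diving for cover wants.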

Bot senses and memory


For bots to behave convincingly, they must not appear to cheat. In other words, the information a bot acts on should be similar to the information available to a player. For example, an enemy behind an obstacle should be as invisible to a bot as it is to a player.


There are two ways a player can detect an enemy's position: seeing the enemy, or hearing the enemy move, shoot, or perform some other action.

Each bot keeps a list of known "facts" about enemy positions and facing directions. Without updates, these facts are deleted after ten seconds. A fact about a particular enemy is refreshed whenever the bot can see or hear that enemy. When a bot hears an enemy, the fact's position is offset from the enemy's true position by a random direction and distance, scaled by how far the bot is from the sound, to simulate uncertainty (see video, 1:28).


Figure 5: facts (pink circles) in the bot’s memory
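A minimal sketch of such a fact store, with the ten-second expiry from the text; the 0.2 error-scaling factor and all names are assumptions for illustration:

```python
import math
import random

FACT_LIFETIME = 10.0  # seconds without updates before a fact is dropped

class Fact:
    def __init__(self, enemy_id, position, facing, time):
        self.enemy_id = enemy_id
        self.position = position
        self.facing = facing
        self.last_updated = time

class BotMemory:
    def __init__(self):
        self.facts = {}  # enemy_id -> Fact

    def see(self, enemy_id, position, facing, now):
        # sight gives an exact position and facing
        self.facts[enemy_id] = Fact(enemy_id, position, facing, now)

    def hear(self, enemy_id, true_position, distance, now, rng=random):
        # the further the sound, the larger the random offset,
        # so heard positions are only approximate
        error = distance * 0.2  # assumed scaling factor
        angle = rng.uniform(0, 2 * math.pi)
        r = rng.uniform(0, error)
        pos = (true_position[0] + r * math.cos(angle),
               true_position[1] + r * math.sin(angle))
        self.facts[enemy_id] = Fact(enemy_id, pos, None, now)

    def forget_stale(self, now):
        self.facts = {eid: f for eid, f in self.facts.items()
                      if now - f.last_updated < FACT_LIFETIME}
```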

Behavior tree


An earlier version of Close Quarters used STRIPS, an approach popularized by F.E.A.R. in 2005. In STRIPS, the relationships between different AI behaviors are not predefined by the programmer. Instead, each behavior has a list of binary preconditions and effects. Each bot has a goal world state and uses A* graph search to find a sequence of behaviors that achieves it. This approach worked well, but I felt it was overkill for my application and better suited to AI that must devise complex plans involving many different behaviors. In most cases, I already knew under what circumstances a bot should perform a given behavior, so using A* to discover that was an unnecessary waste of CPU.

Bots therefore now use a simple decision and behavior tree. Each AI tick, the bot traverses the tree from the root until it reaches a behavior. If that behavior is the one already being performed, the bot simply continues it. If not, the bot initiates the behavior and begins executing it.

Some behaviors can "block", preventing the bot from re-traversing the tree until a certain condition is met. This is useful, for example, to make sure bots reach cover before deciding to attack. Behaviors can also be invalidated by events, forcing the bot to re-traverse the tree and re-initiate a behavior. This is useful, for example, when a bot is fleeing a grenade and a second grenade appears that compromises the chosen escape position.


Figure 6: The decision and behavior tree currently in use
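The traversal loop with blocking and invalidation could be sketched like this; it is a minimal illustration under assumed names, not the game's actual structure:

```python
class Behaviour:
    """A leaf of the decision tree.  A blocking behaviour prevents
    re-traversal until done(); `valid` can be cleared externally
    (e.g. by a second grenade) to force a re-traversal."""
    def __init__(self, name, blocking=False):
        self.name = name
        self.blocking = blocking
        self.valid = True

    def start(self, bot):
        self.valid = True

    def tick(self, bot):
        pass

    def done(self, bot):
        return True

class Bot:
    def __init__(self, decide):
        self.decide = decide   # function: bot -> Behaviour (the tree walk)
        self.current = None

    def think(self):
        blocked = (self.current is not None and self.current.blocking
                   and not self.current.done(self) and self.current.valid)
        if not blocked:
            chosen = self.decide(self)
            if chosen is not self.current:
                self.current = chosen
                self.current.start(self)
        self.current.tick(self)
```

The key property is that an unfinished blocking behaviour suppresses the tree walk entirely, while clearing `valid` immediately re-opens it.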

Some secondary behaviors are encoded within other, more general behaviors. For example, if a bot attacking an enemy discovers the enemy is not at the expected position, it guesses where the enemy might be now and computes a new attack path without leaving the attack behavior.

Load distribution


Not every bot has to be updated on every physics tick, i.e., 40 times per second. To reduce CPU cost, each bot "thinks" only 20 times per second (this rate can be lowered further if needed). Therefore, only half the bots are updated on each physics tick.
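The staggering can be as simple as matching a bot's index against the tick's parity; a sketch under assumed names:

```python
PHYSICS_HZ = 40   # physics ticks per second
THINK_EVERY = 2   # each bot thinks on every 2nd tick -> 20 times per second

def bots_to_update(bots, tick):
    """Return the half of the bots that should think on this physics tick.
    Bots with even indices think on even ticks, odd indices on odd ticks,
    so the AI load is spread evenly across ticks."""
    return [b for i, b in enumerate(bots)
            if i % THINK_EVERY == tick % THINK_EVERY]
```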

Handling grenades


One serious challenge was getting bots to use grenades sensibly. Grenades are much harder to handle than gunfire, because they can bounce off walls, have a blast radius, and have a fuse delay. Fortunately, bots are not required to use grenades perfectly; they only need to be convincing.

The traditional solution to this problem is to precompute grenade trajectories at the navigation nodes, but implementing that adds a few seconds to each map's load time, which conflicts with my goal: Close Quarters should throw players into battle within seconds of launching the game.

Instead, bots look for opportunities to use grenades by computing grenade trajectories on the fly. Each AI tick, an attacking bot tests several possible trajectories along one direction within 60 degrees of the direction to its chosen target. If a grenade thrown along a tested trajectory would kill the target, and the target is out of the bot's line of sight, the bot throws the grenade. The tested direction cycles with each AI tick.


Figure 7: directions tested by the bot over one second (pale pink lines) and the trajectories tested (blue circles) along the direction chosen in the current tick (bright pink line).
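Cycling one tested direction per tick across the 60-degree arc might look like this; the number of candidate directions and all names are assumptions, and the actual trajectory simulation (bounces, blast radius) is engine-specific and omitted:

```python
import math

ARC = math.radians(60)   # test directions within 60 degrees of the target
NUM_DIRECTIONS = 12      # candidate directions; one is tested per AI tick

def direction_for_tick(target_angle, tick):
    """Sweep the arc around the direction to the selected target,
    returning one candidate throw direction per AI tick."""
    step = 2 * ARC / (NUM_DIRECTIONS - 1)
    offset = -ARC + (tick % NUM_DIRECTIONS) * step
    return target_angle + offset

def should_throw(trajectory_kills_target, target_in_line_of_sight):
    # throw only when the simulated trajectory would kill the target
    # and the target cannot already be hit by direct fire
    return trajectory_kills_target and not target_in_line_of_sight
```

Spreading the directions across ticks keeps the per-tick cost constant, at the price of the missed bounce opportunities discussed below.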

In other words, bots use grenades opportunistically rather than moving into position specifically to throw one. However, the path a bot chooses to attack an enemy is often also a sensible path from which to throw a grenade.

A significant drawback of this system is that, because of the limited number of directions tested, bots miss chances to throw grenades that would bounce off small obstacles. This is most noticeable near doorways, where bots usually fail to recognize the opportunity to use a grenade. The problem could be solved by testing several directions per tick, thereby reducing the angle between one tested direction and the next.

Toward more human behavior


One problem quickly became apparent: the bots were too quick on the trigger, making them very hard to beat one-on-one. The average human reaction time to a visual stimulus is 250 milliseconds, but at 20 AI ticks per second, a bot's maximum reaction time is only 50 milliseconds!

To fix this, I deliberately added a delay between the moment a bot gains the opportunity to shoot and the shot itself, making its reaction time comparable to a human's.
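One simple way to implement such a delay, sketched under assumed names: require the firing opportunity to persist for roughly a human reaction time before the bot actually fires.

```python
REACTION_DELAY = 0.25  # seconds, roughly human visual reaction time

class TriggerControl:
    """Delay the shot: the bot must continuously have a firing
    opportunity for REACTION_DELAY seconds before it may fire.
    Losing sight of the target resets the timer."""
    def __init__(self):
        self.opportunity_since = None

    def update(self, can_shoot, now):
        if not can_shoot:
            self.opportunity_since = None
            return False
        if self.opportunity_since is None:
            self.opportunity_since = now
        return now - self.opportunity_since >= REACTION_DELAY
```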

Further improvements


The systems described above provide a solid foundational AI, but leave room for major improvements. For example, a bot's spatial reasoning is currently limited to its immediate surroundings, so although a bot usually tries to flank an enemy, it often misses flanking routes around large obstacles. In addition, bots have only a rough awareness of their teammates, so they sometimes cluster in one place instead of splitting up and flanking.
