Continuation of participation (and victory) in the Russian AI Cup 2019

Hello everyone, my name is Andrey Tokarev , and again I would like to share my experience of participation and victory in the Russian AI Cup.

If anyone does not know, the Russian AI Cup (hereinafter RAIK) is an artificial intelligence programming championship where we (or rather our program) must manage one or more characters (units) that compete with each other. This year the game was a 2-dimensional platformer resembling computer games of the early 90s, and the units were armed men who were supposed to kill the same men of the enemy. You can read more about this in the announcement.



This time there was no meeting when the agenda came for the start of the RAIK, so let's get right to the point.


First look


So, I ran over the rules, tried the game. The first impression was - what to do here? Since the units did not have inertia, and the initial levels were without closed cycles, the enemy could always stick to us if he wanted to, and defend himself against this was impossible. And in this situation, you understand, the possibilities are very limited. You won’t dodge bullets. But my first fears did not materialize, although the factor of chance was high, the rules turned out to be quite playable.

I did not immediately begin to develop, a couple of days fell a fooldecided to reflect. I didn’t come up with anything special. It is clear that you need to dodge bullets, run after first-aid kits, and if possible, do not let the enemy take a first-aid kit and all that. There was still some vague plan to find strategically advantageous positions, and revolves around them, but it was not entirely clear how to distinguish these positions.

Start


So here we go. Out of habit, I convert the data objects in the language pack into my own data structures. Yes, it takes some time, but not very much, but I get more flexibility, work in a more familiar environment, and while I convert, it’s better to begin to understand how the game world works. Of course, not all at once, but as needed. For example, I did not recreate the mines until the political situation at the championship forced me to resort to extreme measures. I approximately copied the logic of Smartgayev, adding only a check for the presence of a wall between me and the enemy. I looked, they seem to be running fine, so we can assume that we have a game world.

Simulation


Now we proceed to the most favorite pastime of all RAIK participants: we are writing a simulation! After reading the championship chat, it became clear that writing an accurate simulation is bad. So I immediately scored for accuracy. It turned out, as it happened, even with a simple rebound from the ceiling, there was a discrepancy within the teak. It was possible, of course, to tinker with it for several days and significantly increase the accuracy, but, firstly, it is not known what profit would be from this. Secondly, I was too impatient, and as always I wanted to see the result as quickly as possible. So I decided to start writing the strategy itself, and I thought I would correct my simulation curve, then, if necessary. In hindsight, I must say that perhaps I was wrong.
It would be worthwhile to immediately bring the simulation to normal, and so it was necessary to make corrections there until the end of the championship, and still it remained crooked.

Strategy basis


Then we proceed according to the standard scenario: we generate the sequence of actions of unit robots , simulate a change in the game world and submit it all to the input of the evaluation function (OF). If in football (last year’s tournament) it was ineffective to run after the ball with a complete random number of players (there is too little chance to hit the ball, especially from the right angle), then there is no such problem. Basically, we need a simulation for two things: dodging bullets and finding the way. A few completely random action scenarios are enough to evade bullets. Several dozens of scenarios will also work for search of a way more or less, at least, on not so difficult maps.

By the way, the search for the way - it’s clear that counting to the end is not always possible, so you need to be able to determine the distance. Using the same simulation, one could more or less accurately calculate distances in time. But here I also simplified it, just counted the distances along a rectangular grid. This simplification, of course, had some consequences on complex maps. Sometimes my units were still confused in the search for weapons, while the enemy was already starting firing. True, these incidents were quite rare.

The choice of target was quite simple: while there are no weapons, we go to weapons, if there are weapons, we go to the enemy, and if there is little "health", we go to the medicine cabinet. In the case of first-aid kits and weapons, those objects that were closer to the enemy as a target were excluded. For some reason, the priority on weapons for me was first on the machine.

The first evaluation function consisted of two components: it is health (not mine, but a unit) with a large coefficient, and the distance to the target with a small coefficient.
Those. the route was chosen in which it does not fall under the bullets, but at the same time approaches the target as quickly as possible. It is important to remember the best route found for the next tick, otherwise a situation may arise when we find a good option, but did not give it out at the next tick.

We go to the front, and the first improvements


Looking at the battles, it was clear that if the enemy had a normal dodge, then it was almost impossible to get into him from a long distance. This is a waste of bullets. So I added a simple condition for shooting: shoot only if the enemy is closer than 7 meters. In addition, if the distance is very large, then it is worth recharging, even if only one bullet is missing in the clip. These two simple conditions added quite a bit.

This was the first version I sent. On the same day, I realized that a gun is more profitable than a machine gun. Plus, he added to the OB another distance to the enemy, in case the shot is not soon. Thus, when reloading, he tried to keep his distance.

The start was quite successful. I don’t remember how high the strategy managed to rise in the ranking before the reset (somewhere around the first three-five), and after the reset it confidently went up.

The main improvement in the next version was due to an attempt to control first-aid kits. Here again, everything was simple: the distance from the first-aid kits to me and the enemy was calculated, and if they were approximately equal, then it was believed that the first-aid kit was draw, and so, the one to whom it is closer. This was taken into account by the evaluation function. Those. in theory, the unit tried to position itself so as to obscure as many first-aid kits as possible. In theory, of course, but not in practice.

After that, I did not send updates for a long time. To my surprise, the strategy first reached the 1st place, and then still came off by 100+ points. Probably everyone thought that I was cutting something cool, and I was shocked how this wretched strategy soared. Miserable, because, well, there is a clumsy control of first-aid kits, there are two ifs for shooting, but basically this is a simple dodge and movement based on a simulation curve.

Testing


I would like to talk about this separately, because proper testing is a very important part of development, and many participants seem to ignore this aspect. Accident is intuitively misunderstood by people.

Suppose we run a test and after 40 games we see a score of 27:13 in favor of the new one. That's all, we found the Grail, stop the test and accept the changes. In fact, even with equal forces, there is approximately 2% probability of such a situation.

Another example, we launch 100 games and accept the changes if the score is 55:45 or more. With a probability of 2.8%, a version can play this way, which in reality should lose with the same score. A couple percent of the error may not seem very large, but if 90% of our changes are unsuccessful, then this means that every 4th change accepted will be unsuccessful. So, even 100 games are still a so-so test, but less - nothing at all.

Of course, launching several hundred games can take an indecent amount of time. There are different options, you can run tests at night, drive in the background, rent server time, or make the strategy fast :) My first versions worked on average, somewhere, for 5 seconds. to the game, so lucky with that.

If the strategy is slow, then you can locally drive truncated versions - fewer searches, less micro-ticks, for example. I used this in football (previous RAIK), no problems were noticed. Perhaps the quality is lost somewhere, but, in any case, it is more profitable than checking for changes on 20 games.

Enhancements


This time I will not evaluate the improvements in numbers. Firstly, they were all small (less than 20%) and therefore inaccurate. Secondly, ambiguous, i.e. the new version could confidently beat the previous one, and play worse against others. And thirdly, I just don’t remember them.

  • , . , . , , .
  • . .. , , . , , -. , . , , . .
  • , , . , , , , , . , . , .
  • , ? , , . . , . .

-


Objectively, you need to shoot in the direction where the mat. damage expectation is highest. It is clear that analytically this cannot be calculated, and where mathematics does not help, as we know, simulation comes to the rescue. We shoot in all possible directions and try to dodge the enemy from each bullet individually. To each pool we assign the minimum damage value that it will inflict precisely. In the case of a conventional bullet, it is either 0 or its damage, and in the case of a rocket there are 3 options.

Now we know the mat. expecting damage in each direction separately, from this we can easily calculate the mat. wait for each direction of the shot and choose the best. It would be nice to use this function in enumeration so that the unit is correctly positioned, but it was too slow to call it for each scenario. It was called once per tick for each unit, and now the shooting was carried out not by distance, but if the expected damage exceeded a certain limit. The increase was significant, and in the case of the bazooka it looked pretty clear - now mine could even aim somewhere on the wall in order to hit the enemy with a blast wave.



But due to the fact that I could not insert this function into the search, there was a problem that the conditions for the shot in the search and in reality were different. I did not know how to solve this problem, I had to live with it.

After that, there were no very significant changes until the last day when I introduced self-blasting. He added a penalty if there was no jump left at the moment when the enemy fired, because falling would make it harder to dodge. I tried to predict the path of the enemy to the first-aid kit, if he has few lives, and thus slightly improved control over them. He himself began to run after first-aid kits from 60% of his health, instead of 50%. He added the distance between his in the PF so that they would not interfere with each other, and the bazooka would not immediately knock out two. In the end, he added an aggressive mod in case the situation does not change for a long time and I lose on points, otherwise mine sometimes get stuck.

results


As I said, pretty quickly got to first place in the ranking. With the exception of short periods of time, I stayed there until the end. In the rounds, the situation was a little worse: 4th place in the first, 3rd in the second. This indicated that I played well against strong players, but not stable enough against weak ones. This is quite clearly demonstrated by one of the games of the 2nd round, where kostya200300, who was then in the very last place, beat me at 0!


Self-blasting


Like most participants, I was not enthusiastic about the possibility of self-explosion. This greatly increased the already considerable random factor. But it was clear that one could not do without it. True, I did not really want to do this, and I put it off until the last day. This technique is quite simple, if we are standing on the ground, the enemy is in a radius of the explosion and we have enough (maximum two) mines, then we put mines and shoot at the bottom. I put an additional condition that after the explosion I get a clear advantage, i.e. either immediately win, or kill more opponents than their own. I thought that since I aspire to a high place, then changing one on one is unprofitable. Well, it's like in chess, if you play to win, then exchanging everything in a row is unprofitable. This is how I sent my warriors endowed with kamikaze skills to the final battle.

The final


As I already wrote, the rating only speaks of the balance of power with the closest competitors, and I did not have stability against weaker rivals. In this regard, the elimination of the participants to the finals played a mess for me. On the other hand, the massive spread of terrorist ideology on the last day introduced a further spread in the distribution of forces. In short, despite the confident leadership in the ranking, there wasn’t any confidence in the final, there was only hope. After the first half of the final, it became clear that the main struggle will be between me and Ivan Tyamgin. I led to a small number of points, while the rest were far behind.

During the break, I dawdledpicking up self-detonation - something was fixed (it turned out that mines should not be placed at the top of the stairs), he added some countermeasures against detonation. And repaired the simulation, because so far mine clung to the corner.

The beginning of the second part of the final round was successful for me, the gap began to grow slowly and reached a maximum of about 20 points. I even relaxed a little and instead of refreshing the results page every 3 minutes, I started playing chess on the phone. When I looked again, the gap was already about 7 points and then quickly rolled to the 1st. I wonder what they give for 2nd place? 1 hour to the end, again there is a gap. 30 minutes, the gap is catastrophically reduced. 10 minutes - just 3 points! One minute ... victory seems! Vanya lost to me only 3 points. This is within random scatter. The rest are 50 points or more behind.

About funny bugs


Already after the final, I noticed that in some games my units behave strangely - they do not avoid bullets when, it seemed, they could, and even, as if, specially catch them. It turned out that during the break in the final, I made a mistake - when it was determined whether the enemy could blow me up, it was necessary to calculate which of mine was in the radius of the explosion. Here I mixed up the coordinates of my own and another's unit and mistakenly assumed that the explosion would occur where I am, and thus guaranteed to fall into the radius of the explosion. The real explosion was already considered from the right point, so in the simulation the enemy itself could explode away from me. Usually it just added a big constant to the estimate, but it didn't really affect anything. However, if the enemy did not have enough mines to blow me up, it turned out that it was profitable for me to catch a bulletso that the enemy already had enough mines, and I got the very addition to the assessment. Perhaps it was due to lack of sleep, or a drunk can of beer, but in fact, mistakes always happen.

Conclusion


The game, although not as spectacular as football, was no less interesting from the point of view of strategy. The threshold entry was relatively low, and this favorably affected the number of participants. I received a lot of emotions during the month of the championship, both joy and emotions, and for this I am grateful to the organizers. I also thank all the participants, especially Vanya Tyamgina, for not making me bored in the finals.

All Articles