Developers Club geek daily blog

1 year, 10 months ago
I want to tell about the participation and a victory in annual competition in programming of AI of "Russian AI Cup 2015" from Mail.Ru Group. To look at detailed rules of competition and record of game persons interested can on the competition russianaicup.ru website.

image

This year competition was organized at absolutely new level. Changes happened as in scale of the game world in which AI, and acts on the competition website. Thanks to three-dimensional visualization, games looked much more fascinatingly. On staginess, in my opinion, competition considerably exceeded last year's hockey, and "tell-tales" of 2013.

The participant was offered to write AI for driving in races on a survival. As well as last year, the task was "with physics". But on it time source codes of "the physical engine" were open. Still, unlike last year, this time all accidental phenomena in the game world were evident — the accidental card, accidentally placed bonuses. At once it was visible — when good luck on your party and when it from you turned away. In last year's hockey, even watching game considerably the opponents differing on force, it was difficult to understand there was a prize thanks to a case or skill. I think, it positively affected staginess of competition.

Short description of rules


The purpose — to pass 2 circles on the closed route most quicker. It is necessary to score more than all points more precisely, but to arrive the first is the main method to score points. Still points give for collecting of bonuses on the road and drawing a loss to opponents. The route as the designer, gathers from square "tayl", these are straight sections of the route, corners (turn of the route by 90 degrees), or intersections (T figurative and normal). It is necessary to go on key points ("tayla") of the route in a certain order – sometimes it is necessary to do loops, sometimes in general to go back. Still machines have an opportunity to spill for themselves fuel oil pools, to shoot each other special shells (buses and washers), and to use the special nitro accelerator. Charges for all these devices are limited, and are replenished with picking up of the bonuses which are accidentally scattered according to the card.

I will tell how my AI is arranged and thanks to what (as I think) it was succeeded to win.

First version


I participate in competition not the first time, every year it was possible to achieve the best results, than in previous. From my experience, very most part of success is occupied by correctly organized development of the strategy. As time is strongly limited, it is necessary to think over from the very beginning — as in what order you will do. If it is necessary to throw out everything and to write again, it is not enough chances of a victory.
I do not think that in my strategy there were some "killer feature", rather a set of small successful solutions developed in one big benefit. The story will be quite long with detailed the description of all these moments.

Concerning testing and debugging. In last years happened that I was up against a blank wall trying to understand — why my algorithm does not give that benefit which I expected. And having spent plenty of time I, suddenly, found a bug in the heart of the algorithm — and everything flew up even above, than I calculated. Even this year I nearly missed a victory because of bugs, it was necessary to find for tests more time. This time visualization became for me the main method of debugging. I did not manage to think up short, but effective tests, and it was a pity to spend a lot of time for writing of long tests which then also constantly should be rewritten. There was only a test for the physical simulator which compared the predicted trajectory and real. Without it it would be impossible to debug physics of collisions in general.

The first version I just decided to teach to pass the route. But even this minimum if to implement it fairly, demanded quite large volume of work, were necessary: physical simulator of the free movement of the machine, detection of collision with boards, search of a way and scheduler of trajectories. The first three components it was clear — how to do, the fourth not really.

I began, as well as in last times, with writing of the physical simulator. Here problems did not arise — source codes of the physical engine were available. At once it became clear that if to do the 10th iteration on each tic as in the engine, then with a performance there can be problems. Remembering article of the last year's winner whether I had doubts absolute accuracy is necessary. Decided to make nevertheless precisely, and then if it is necessary, to remake on 1 iteration instead of 10. It was the right choice as it appeared. Further I was engaged in detection of collisions with barriers. To do detection of collisions as in the engine I was afraid, decided that will be too slowly. Having considered several options, I stopped on such algorithm:

Everyone "тайл" breaks a square grid into 9 sectors. Being in each sector, the machine can face only the certain type of barrier which is from one of 4kh the parties and, for some types, specularly displayed barrier.

These types of barrier:

  • Direct wall on the right/from below/at the left/on top
  • Interior angle
  • Exterior angle
  • Circle
  • The wall passing into a circle on the one hand behind repartitions of sector. Here 4 parties and for everyone still mirror display
  • The circle passing into a wall on the one hand outside sector. Here too 4 parties and mirror display

These 6 options cover all chances. Options 5 and 6 are necessary as the machine can be in one "tayl", and a corner to face barrier in other "tayl". Further the table for all card where in each of 9 sectors for each "tayl" the object which could check lay was under construction — whether the machine faced its barrier. Implementation turned out quite effective, consumed less than a third of time spent for iteration. It allowed to do check on collision on each iteration, and more than once in a tic without essential loss in performance.

I will tell about one more optimization. As angular speed participated in the movement, the first implementation calculated sine and cosines on each iteration — it was slowly. But as the turning angle of wheels (influencing angular speed) changed only once in a tic, and angular speed from collision appeared seldom, is normal on each iteration it was necessary just to add the same permanent corner to a machine orientation angle. I stored a machine orientation angle a vector and on each iteration rotated it on a constant - it is 4 multiplication and 2 additions. Sine and cosines needed to be considered only once in a tic and, in rare instances, after collisions on each iteration.

Search of a way made the easiest way — wave algorithm. Wanted to rewrite then on Dijkstra's algorithm somehow to consider existence of turns, bonuses, fuel oil pools. But did not return to it.

There was the last component — the scheduler of trajectories. It was necessary to tinker with it, I think, in joint account for all competition on it a half of time left. The second half on the physical simulator, on all the rest it was spent time minimum. At first I tried to solve a problem by the same method as earlier — search of trajectories of a certain format. The elementary format: we go N tics directly, then the M tics we twist a wheel in some party, we return a wheel in neutral provision further. Worked, but badly and very slowly, the miscalculation of all options on one hundred tics forward did not fit into time limits any more. The machine did not understand that it should be rebuilt in the next row before turn. Provozivshis couple of days with this idea, I understood that something else is necessary.

Decided to build a tree of trajectories, I think, many participants came up with this idea. But probably how I built it is of interest. I think, I managed to win in many respects thanks to successful algorithm of creation of this tree. The algorithm thought out itself, not based on something known. You do not judge strictly if invented the bicycle. The tree thus was under construction:

To construct several initial trajectories 450 tics long (length is measured in tics here and further, on 450 stopped not at once). Further many times (the selected time will not come to an end yet) the recursive method which in the middle of any of branching lines made "branching" was called. When "branching" trajectory section to a point of branching became a trunk of a new subtree, section after — one of branches of a new subtree, and also new branches were added. Length of all branches was selected so that the total length of a way from a root to any end state was permanent, or it is less and the branch terminates in collision. In the first version for "branching" 4 options of the movement were used. Gas always in a floor, and a wheel we twist in 4 provisions: to the right, to the left, in neutral provision, we keep in the current provision. Later options with braking and driving by the back were added. For work of this algorithm of creation of a tree 3 merit functions were used:

  1. To make "branching" of a trunk of the current tree (always in the middle) if the average length of branches of descendants is less than length of a trunk. Otherwise to pass to the choice of the descendant for a recursive call on it. I much here experimented — it the option was the best.
  2. To select the descendant of the current tree for a recursive call on it. The descendant with the maximum value is selected: <average length of branches in a tree> * <the third merit function> / sqrt (the sum of lengths of all branching lines). Division is necessary that the tree did not degenerate: the more time it is spent on research of a branch, the denominator less priority is more.
  3. The basic merit function on the basis of which decided on what branch all of us will go (what branch better). In the first version value of this function equaled to the assessment sum for a trunk and the maximum assessment of the descendant. The award was given for approach to the finish — "taking of a tayl". Later this function strongly changed.

Rather evenly branching tree with a priority of branching of those branches which conducted to the finish turned out. Unfortunately I did not have pictures of that tree, but it almost did not differ visually from the latests version.

Here so visualization of the latest version of my strategy on one of cards in the conditions of limited visibility looks. These are trajectories which are under construction for one tic. The best is selected from a set of such trees received for many tics (about it and other acceptances — is lower in article).

image

Here I want to thank organizers. This time the world in which AI worked turned out much more difficult, than in last, and it pushed participants to look for new algorithms and approaches. Such tree of trajectories, I think, perfectly would work both in last year's hockey and in tanks.

Approximately in 9 days prior to the first round the first version was ready. It already normally went and rose approximately to 200kh-300kh places. Further it became a framework of strategy and essentially did not change, the new functionality was only added.

Preparation for the first round


Though at that time cards was few, and they were simple, the machine often got stuck. Resting against a wall as the back was not able to go. Switching on "smartgy" in case of failure of the main algorithm was ineffective. I did not begin to write an abnormal heuristic algorithm as was going to teach everything the main and did not want to spend time in vain. Added driving by a backing to branching options for a root of a tree of trajectories. At this stage at me each tic the tree of trajectories again was under construction. Adding of options of branching in all branches strongly reduced branchiness of normal branches (resources are limited), the machine began to go worse. So I tried to add new branches only to a root and only in urgent cases (got stuck). Learned to leave simple situations — defeats became less. The rating spread up, reached the second hundred.

Here I want to notice that I tried to do all completions in such order first of all to solve a problem, the reason of the greatest number of defeats. This approach was repaid as some problems were not required to be corrected — that saved time.

The following improvement was very simple, but very effective. Why to throw out each tic the found trajectory and to look for it again on the following tic? As the physical simulator issued accuracy till 10-12 a sign after a comma, during tens of tics the machine remained on the selected trajectory with a pinpoint accuracy. Here I understood that ideally exact prediction of a trajectory will save to me many computing resources. Each tic the new constructed tree was compared with saved on the previous tic and the decision to leave old or to replace new was made. Considered that the machine "descended" from a trajectory if the predicted and actual coordinates or speed differed in 7 m a sign. As the tree was not symmetric, and extended towards the purpose, and length of each branch to 10 and more tics — the found tree "became outdated" quite slowly.

Thus, the trajectory was selected from the whole wood of the trees constructed during tens of tics. If I followed advice that absolute accuracy is not necessary — this acceptance, perhaps, it would be difficult to apply, it should control the saved-up error. Ten iterations reduced the speed of calculation of a trajectory by each tic by 10 times (actually much less) but gave the chance to calculate it tens of times longer. And it is difficult to make tricks by blow about the bus like a turn without absolute accuracy (it to a question of adequate accuracy).

After the previous optimization of a problem about performance receded: even reducing time for constructions of a tree by 3-4 times, I almost did not weaken strategy. The next completion permitted to use a brake. There were problems: a set of statuses almost distinguishable from each other with braking of already almost stopped machine spent all selected time. Solved very roughly, but worked: prohibited to brake if the longitudinal speed of the machine is less than set. Permanent defeats on maps with abrupt turns stopped — the rating was fixed in the first hundred.

Further I added use nitro. Just some more options of branching, but only for a root. Nitro it was allowed to use if length of the found way with activated nitro exceeded 7 "tayl". To plan use algorithm, nitro through some time, could not, I considered it not necessary and did not do for economy of resources. Governed bugs and added several "crutches" that jotas somehow to shoot, put pools.

The following that I decided to implement is collecting of bonuses. By then very few people brought together them and, having arrived the last, easily it was possible to take the second place in arrival. I was too lazy to do physics of collision with a bonus and did not make until the end of competition. So it was not difficult to add collecting of bonuses, each tic in the physical simulator checked intersection with a bonus. For taking of a bonus an award in 3y merit function. While gave a big award for points, average for "nitro" and for "ремкомплект", on zero for the rest (all the same was not able to use yet). This version participated in the first round and took the 14th place, and after a round was fixed in "top20".

Preparation for the second round


At the weekend, while there was a round, there was time to have a rest a little and once again to consider the general approach. Except that I was not able to shoot and spill pools yet, strangenesses in picking up of bonuses were observed, and the machine became worse to go by high-speed sections.

Decided to begin with the second problem. Adding of braking in branchings options of a tree of trajectories, even with restrictions, had all the same an adverse effect on driving on high-speed sections as the tree became less branching. Having spent a lot of time for experiments with merit functions of a tree, I so achieved nothing. And then decided that it is possible to build both trees (with braking and without) and to select the best. It was the excellent idea, the composite algorithm which in the future I still expanded turned out, having added to composition a tree with the permission of collisions. However it was necessary to reduce the number of iterations on each tree many times, here performance stock very much was useful. Now the algorithm combined benefits of both versions. On the stated above image of a tree of trajectories it is possible to see three green lines is the best trajectories in each of three different trees. The black line is section of braking on one of these three trajectories.

The problem with bonuses had more fundamental character. Studying behavior of the machine, it was visible that it passes by some very simple bonuses. The reason appeared that ahead there were 2 bonuses (closer and far away). Strategy did not manage to count a trajectory on which it could take both bonuses and selected the second as it at the same time held the route slightly better. Though having taken the neighbor, it easily would find a method to take and distant. To motivate to take first of all near bonuses — in 3yu merit function added exponential damping of an award for any events. The best result was yielded by quite slow attenuation 0.99 for a tic (twice approximately for 70 tics). I think, this acceptance positively affected also driving of the machine, and firing in the future though for certain it is difficult to tell.

Further the analysis of games showed that there are a lot of defeats because of zoom-in on fuel oil pools. Added a penalty for zoom-in on a fuel oil pool. Then decided that it is not enough, very much was afraid enough harmless pools. Added to the simulator falling of friction coefficient at contact with a pool. Picked in "drank up a raner", found a formula — as angular speed at the first contact with a fuel oil pool changes. It was "rand" there gives only 2 options: we twist to the right or to the left, and the speed of twisting can be calculated precisely. Made the "probabilistic" branch in the tree of trajectories having these 2 options in the descendants and averaging an assessment on them. Began to go round pools perfectly, and sometimes even to use in own favor.

The second round came nearer, it was time to master buses. Here essentially new I did nothing. Added to the physical simulator of the bus, their collision with boards and with the machine. On it a lot of time left, it was necessary to deal with physics of collisions in the engine, to test everything. For receipt of damages gave a penalty in merit function, and for death very heavy fine. After that I was already able to evade from buses and even to use collision with them to myself on advantage, but was not able to shoot yet. Now adding of firing became absolutely trivial task. It was necessary to provide that I am an opponent and to try to evade from my bus. If it was impossible or it turned out but at the price of incision in a wall — means it is necessary to shoot. I set up the opponent in the search algorithm of the best trajectory and if the general assessment of a trajectory (considering everything: the movement, a loss, bonuses) at my shot strongly decreased, then I shoot.

For the opponent of a trajectory were calculated on 150 tics, and time was selected 10-20 times less than for search of the trajectory. Tricks with a turn in blow about own bus turned out by itself — strategy simulated a shot on the opponent and saw that own trajectory which was recalculated taking into account the flying bus becomes worse not, and permitted a shot. Setting of pools was added similarly, any more I do not remember to or after firing. There were many misses because of the small volume of calculations, but I already finished the second round on the first place with a big separation. While there was the second round, I learned to shoot the buggy. I made it in the same way as well as firing buses. At first learned to evade from washers, then added firing.

Preparation for the Final


Time to prepare for features of the final came. Organizers promised unknown cards and limited visibility. What will have crucial importance in such conditions — it was unclear. With driving in the conditions of limited visibility I had problems, in it and decided to be engaged.

Likely, as well as all participants of the final, I just considered unknown "tayla" as intersections. Finished visualization — drew "war fog", all known to me "tayla". Instead of unknown drew intersections, now I saw the world eyes of the machine. The first viewing of arrival revealed a problem. The machine found the best trajectory through unknown "tayla" which considered as intersections, but after their disclosure did not reset it though now went straight to a wall. Wrote small algorithm which "cut" the saved tree of trajectories each tic in process of receipt of new information. If the branch passed through unknown "тайл", and after its "disclosure" it appeared not that type which was assumed — this branch was cut from a tree. It was also necessary to make processing and other events (emergence of pools, shells, bonuses) but was not in time, for them all tree was reset.

There was one more problem which was revealed by visualization. Some "tayla" were considered "passable" (not empty) though on next "tayla" it was visible that it not so. A little modified wave algorithm (which was repeatedly passing on "tayla" for which new data on existence of walls are found) corrected this a situation. Now the whole areas of the card could "collapse" sharply to the big empty area as soon as it became clear according to indirect data that there the road cannot be. I do not know whether it gave some gain of force of game, but it became much more pleasant to look in visualization.

What else strongly weakened strategy in the conditions of limited visibility? Unexpected turns created many problems, in panic being afraid to concern a wall, strategy wasted a lot of time. Or strongly braked on turns, or, having slightly concerned a wall, even at a small angle — passed to a backing. Decided to be engaged, at last, permission of collisions of the machine with walls. Here too especially there is nothing to tell — a lot of time for copying of logic and testing, attempts to make all this not really slow. There were also basic problems. If before the trajectory broke, having concerned a wall, then long (in tics) pushing trajectories at a wall appeared now. The machine practically did not move, upershis in a wall, and a tree of trajectories degenerated on these almost distinguishable statuses again.

It was necessary to get back to old idea with which I experimented much, trying to solve a similar problem when braking. Its sense was in replacement of "length in tics" by this length. Though it spoiled normal trajectories, for trajectories with collisions this the best that could think up. Added this option of driving to the "composite" search algorithm of a trajectory. It strengthened practically all parties of strategy. Unexpected intersections stopped being a problem, the detour of pools and deviation from buses became much surer (having dexterously hit against the bus, then about a wall "as though it is accidental" it was possible to continue the movement without big dead time). Firing and setting of pools became more dangerous by buses too, opportunities will evade, having slightly struck about a wall, left much less. And even in the course of rewriting of logic of processing of collision with walls I found a bug (collision with a corner incorrectly was defined), which correction strengthened strategy nearly stronger than all the rest.

On the available cards strategy began to play quite well, collision of machines with each other and firing at the was the only explicit problem. Though there was it seldom, but could spoil everything. Corrected firing at the so: if was going to shoot, we will check whether it "will spoil" an assessment of the best trajectory of one of the machines more than on the set value. Now it was necessary to teach not to face with each other machines. As I have a saved trajectory for each machine on which it is going to go, it was possible just to add a penalty for intersection of machines. So I also made, the penalty was proportional to a square of relative speed of machines as at big speeds of collision it is more dangerous. Worked surprisingly well, now machines with whistle rushed by each other at intersections, accurately went a column, and politely passed each other.

I expected surprises from unknown cards of the final. Deliberating that can be unexpected and very unpleasant in new cards, the output arose that anywhere driving by the back, only turns after collisions was not necessary so far. It was necessary to provide it, began experiments with driving by the back. Here I found "bug" which was then "feature" in the merit function. If to give an award for approach to the finish and a penalty for a distance — the machine was not deployed, very much was afraid to move away from the finish, even for a while. On it, I gave an award only for the first "taking of a tayl" if then we drive off back and again we come nearer – did not give an award, did not fine for a distance. In this logic there was a bug, for branching lines the award for repeated "taking of a tayl" nevertheless was given. Because of it when driving by the back the machine got stuck between "tayl" and twitched to and fro, repeatedly deriving pleasure an award. Correction of this bug considerably improved driving by the back, but on new "randomny" cards (it was the last day, they were already laid out) it unexpectedly sharply weakened strategy. It was necessary to return urgently old logic for driving the rehouse, and new to leave only for driving by the back. Here I made a mistake — sometimes nevertheless got stuck when driving by the back, and because of it nearly missed a victory in the final, corrected only in a break. But why that "bug" was "feature"? I think, a wrong award for repeated "taking of a tayl" at the same time with a reduced award for "taking of a tayl" by the back, motivated the machine to pass a little forward and to be unrolled, having made a loop (if there is such opportunity) instead of at once braking and going the back.

The last that I made before the final, in the evening on Friday, there was time monitoring system. It had not to allow surprises from "taymlimita", but everything turned out on the contrary, corrected only in a break. The corrected option carelessly spends processor time it much so far, and in process of exhaustion — the quantity iteration is exponential reduced by all simulations. When falls below a limit minimum — everything is disconnected, except the most universal option of creation of a tree of trajectories (with braking, without collisions, on the minimum quantity of iterations) that jotas somehow to reach to the finish. Initial rate of an expenditure of processor time selected so that was enough without output for emergency operation on the majority of cards.

Conclusion


The victory nearly escaped from my hands because of several hard errors. It needed to pay more attention and from the very beginning to develop the testing methodology and debuggings. For the rest, I think, approach was selected successful.

The source code of strategy is available here.

This article is a translation of the original post at habrahabr.ru/post/273649/
If you have any questions regarding the material covered in the article above, please, contact the original author of the post.
If you have any complaints about this article or you want this article to be deleted, please, drop an email here: sysmagazine.com@gmail.com.

We believe that the knowledge, which is available at the most popular Russian IT blog habrahabr.ru, should be accessed by everyone, even though it is poorly translated.
Shared knowledge makes the world better.
Best wishes.

comments powered by Disqus