How we predict the future in Yandex Search: from typo correction to discovery queries

People do not always formulate their queries accurately, so a search engine should help them. My name is Sergey Yudin, and I lead a search functionality analytics group at Yandex. Every day we improve something with machine learning. Over the past year we have been developing a technology that anticipates people's interests.

Together with Anastasia Gaidashenko (avgaydashenko), an expert on my team, I will tell Habr readers how this technology works and describe its architecture and the algorithms behind it. You will also learn how predicting the next query differs from predicting a person's future interests.



What does the user want?


Let's see how Yandex technologies help solve problems, using an imaginary user as an example. Suppose he types "save a ticket" into the search box. What does he want to find? To learn how to save up for a ticket, or did he simply make a typo?



Yes, it was a typo. He wants to buy a ticket, not save one. Yandex understood him thanks to the typo corrector, a system that fixes incorrectly entered queries. Mathematically, this system maximizes the probability of the correctly typed query given the text the user actually entered. Yandex has been solving this problem for more than ten years, and not only in search.
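In simplified notation this objective can be written as a noisy-channel model. The article only states what is being maximized, so the factorization below is a standard illustration rather than a description of the production model:

\[
q^{*} = \arg\max_{q} P(q \mid t) = \arg\max_{q} P(t \mid q)\,P(q),
\]

where \(t\) is the text the user typed and \(q\) ranges over candidate corrected queries.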

So, our imaginary user entered the query "buy a ticket". At this point the suggest (search suggestions) comes into play. The suggest helps the user formulate the query and complete it correctly.



Our suggest has come a long way. A couple of years ago we made the task harder: we wanted to show not just the most plausible completion of the query, but to predict which query this particular user would end up entering, and to start pre-rendering the results before the click. If you are curious how this works, you can read about it on Habr.

Our imaginary user picks the ending of the query from the list of suggestions: it turns out he was looking for tickets to the Tretyakov Gallery. Thus the recommendation system has completed its first task: it helped the user formulate the query.

That task is done, but the user still has questions. What will he search for next? Maybe he wants to know how to get to the gallery? Indeed, he types "Lavrushinsky per., 10" to check the address.



Can we predict this query? Yes, and we have been doing it for quite some time. There is a block called "Also asked" at the bottom of the search results page. It shows the queries that people usually enter after the one typed into the search box. That is where we would see our query with the address of the Tretyakov Gallery.



Here we maximize the probability of a query given the user's previous query. The system has completed its second task: it predicted the next query.
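In the same simplified notation, the objective of the "Also asked" block looks like this:

\[
q_{\text{next}}^{*} = \arg\max_{q} P(q \mid q_{\text{prev}}).
\]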

And then something very interesting happens. The user types the query "when can I visit the Tretyakov Gallery for free". This query is different from the others: it does not follow directly from the current user task and solves a somewhat orthogonal one.



But let's think: if we were looking for tickets to the gallery, what would we like to see as a recommendation? A huge number of people would like to know that they might not need to buy a ticket at all. This is the third, most difficult and most interesting part of the task: recommending something new and useful to the user, something he has not yet thought of himself.

We call such queries discovery queries. We learn to identify them in our search logs, store them, and recommend them at the bottom of the search results page. This is precisely the new and revolutionary task we are actively working on now. To a person buying Nordic walking poles, Yandex might recommend a query about how to choose them by height. If a person travels a lot, they might be interested in the query "where to go without a visa". And so on.



The mathematical formulation of the problem in this case looks like this: we maximize not the probability of the next query, but the probability of a click on the query that we recommend to the user based on their previous session.
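Side by side with the previous objective, the difference is in which event's probability we maximize (again, simplified notation rather than the exact production formula):

\[
\hat{q} = \arg\max_{q \in \mathcal{D}} P(\text{click on } q \mid \text{session}),
\]

where \(\mathcal{D}\) is the discovery database and the session is the user's previous queries.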

How does it work?


Let's look at how our recommendation system is implemented and what architecture is hidden behind it. But first, let's decide what we want from a recommendation system in general.

1. Useful recommendations. Of course, we want the queries we recommend to match the user's interests. They should be useful and relevant.

2. Scalability. We hope the system will do well: there will be more and more users, and the number of queries will grow. We also want to increase the coverage of queries for which we can make recommendations.

3. Ease of implementation. We assume that our system is here to stay, and we do not want to rewrite it over and over. The system should be easy to implement, so that later we can improve it by refining the current version rather than launching a new one.

Having decided on our wishes, let's see how they can be brought to life.

We have a discovery database: a database of queries that may be interesting and useful to our users. But if we tried to rank this entire database, we would not have enough computing power. Users issue many diverse queries, so first we need to filter the database.



Filtering can be done in different ways. At Yandex we use kNN (k-nearest neighbors), a basic machine learning classification algorithm also known as nearest neighbor search. With this algorithm we filter the database, selecting the queries closest to what might interest the user. To do this, we compare the user's query and the queries we are ready to recommend in a vector space.
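As a rough sketch of this filtering step (the candidate queries, the 64-dimensional vectors, and the use of scikit-learn below are assumptions for illustration; the production index is not described in the article):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical discovery base; in reality each query would have an embedding
# produced by a DSSM-like encoder (see the next section).
discovery_queries = [
    "tretyakov gallery free admission days",
    "tretyakov gallery skip the line",
    "moscow museums opening hours",
]
discovery_vectors = np.random.rand(len(discovery_queries), 64)  # placeholder vectors

# Build the nearest-neighbor index over the discovery base once.
index = NearestNeighbors(n_neighbors=2, metric="cosine").fit(discovery_vectors)

# At query time: embed the user's query and pull the closest candidates.
user_vector = np.random.rand(1, 64)  # placeholder for the embedded user query
distances, indices = index.kneighbors(user_vector)
candidates = [discovery_queries[i] for i in indices[0]]  # passed on to ranking
```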

But to bring these queries into a single vector space, we need something else. For example, we can use DSSM (Deep Structured Semantic Model), a kind of black box that can map different entities into one vector space. This approach was originally proposed in a paper from Microsoft, but Yandex has adapted it heavily for its own tasks and has moved far from the original idea. If you want to read more about this, see, for example, the article about the Palekh search algorithm.
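Here is a toy two-tower sketch in the spirit of DSSM, just to show the idea of mapping two kinds of text into one vector space; the feature hashing, layer sizes, and training objective are placeholders, not the Palekh model:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Tower(nn.Module):
    """Maps a bag-of-features text representation into the shared vector space."""
    def __init__(self, feature_dim: int, embed_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, 256), nn.ReLU(),
            nn.Linear(256, embed_dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)  # unit vectors, so dot product = cosine

feature_dim = 10_000                  # assumed hashed n-gram feature space
user_tower = Tower(feature_dim)       # encodes the user's query / session
candidate_tower = Tower(feature_dim)  # encodes queries from the discovery base

# A batch of (user query, recommended query) pairs that received a click.
user_feats = torch.rand(8, feature_dim)
cand_feats = torch.rand(8, feature_dim)

# Similarity matrix between every user query and every candidate in the batch;
# training pushes clicked pairs together and in-batch negatives apart.
logits = user_tower(user_feats) @ candidate_tower(cand_feats).T
loss = F.cross_entropy(logits, torch.arange(8))
```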



The next step is ranking. Once we have a list of queries close to what the user might be interested in, we want to understand which of them will be more interesting to him and which less.

For example, suppose we have selected 100 queries. The user is unlikely to scroll through all 100, so we need to pick the top 5 and recommend them. To do this, we assign scores to the queries: estimates of the probability of a click on a query that we recommend to the user based on their previous session.
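A minimal sketch of this step, assuming a trained click-probability model (the feature set and the toy scoring function below are placeholders, not the production ranker):

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    query: str
    features: list[float]  # e.g. kNN distance, embedding similarity, historical CTR

def predict_click_probability(features: list[float]) -> float:
    """Placeholder for a trained model (e.g. gradient-boosted trees or a neural net)."""
    return sum(features) / (1 + len(features))  # toy score standing in for the model

def top_recommendations(candidates: list[Candidate], top_k: int = 5) -> list[str]:
    scored = sorted(candidates,
                    key=lambda c: predict_click_probability(c.features),
                    reverse=True)  # highest estimated click probability first
    return [c.query for c in scored[:top_k]]

# Usage: take the ~100 filtered candidates and keep the 5 most promising ones.
top5 = top_recommendations([
    Candidate("tretyakov gallery free days", [0.8, 0.6]),
    Candidate("tretyakov gallery address", [0.5, 0.4]),
])
```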

How do we get the probability of the next click? Our system is already up and running, so we simply collect feedback from users and thereby gradually improve the recommendation system.



Now that we have gone over all the steps separately, let's return to the beginning and put everything together. In total: the user comes to us with some query, and we have a recommendations database. We take this database and filter it, obtaining the queries that we want to rank and recommend to the user.



Now recall the wishes we formulated for our recommendation system and let's see how they can be fulfilled in the resulting architecture.

For example, we wanted scalability in terms of growing the database. Our approach has all the properties needed for such scaling. We do not have to keep the entire database in memory: as soon as it grows too large to fit on one cluster, we can split it in two. Where previously we ran kNN over one cluster and selected the top 100 to rank, we can now go over two clusters separately and select, say, the top 50 from each. Or we can split the clusters by topic and run kNN only over the relevant topics.
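A sketch of what splitting the base across shards might look like (the shard sizes, the per-shard top 50, and the cosine metric are illustrative assumptions):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def build_shard(vectors: np.ndarray) -> NearestNeighbors:
    """One shard of the discovery base with its own kNN index."""
    return NearestNeighbors(metric="cosine").fit(vectors)

def search_shards(shards, user_vector, per_shard: int = 50):
    """Query every shard independently and merge the candidates for ranking."""
    merged = []
    for shard_id, shard in enumerate(shards):
        dist, idx = shard.kneighbors(user_vector, n_neighbors=per_shard)
        merged.extend((d, shard_id, i) for d, i in zip(dist[0], idx[0]))
    merged.sort()  # closest candidates first, regardless of which shard they came from
    return merged

# Two shards, e.g. after the base has outgrown a single cluster.
shards = [build_shard(np.random.rand(1000, 64)) for _ in range(2)]
candidates = search_shards(shards, np.random.rand(1, 64))
```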

To scale with the number of users, we can simply add computing power and process each user separately, because in our architecture there is no place where we would have to keep all users in memory at the same time.

Some other approaches do have such bottlenecks at the filtering stage, for example when factorizing a matrix. Matrix factorization is another approach used in recommender systems. In fact, Yandex uses it too, but not for filtering: its output serves as additional features for training, because it captures a lot of information that is still useful to analyze.
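To illustrate the "factorization as features" idea (the interaction matrix, its size, and the use of truncated SVD are assumptions for the sketch, not a description of the production pipeline):

```python
import numpy as np

# Hypothetical user x query interaction matrix (e.g. clicks); very sparse in practice.
interactions = np.random.binomial(1, 0.05, size=(500, 300)).astype(float)

# Low-rank factorization via truncated SVD: interactions ≈ U_k @ diag(s_k) @ Vt_k.
k = 16
U, s, Vt = np.linalg.svd(interactions, full_matrices=False)
user_factors = U[:, :k] * s[:k]   # one k-dimensional vector per user
query_factors = Vt[:k].T          # one k-dimensional vector per query

# These factors are not used for filtering; their dot product becomes one more
# feature for the click-probability ranker described above.
def factor_feature(user_id: int, query_id: int) -> float:
    return float(user_factors[user_id] @ query_factors[query_id])
```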

Additional features, new algorithms, and other methods can be applied on top of the rest of the architecture. Once the system is up and running, we can start improving it.



Where does it work?


This architecture is already deployed in Yandex. Compared with ordinary reformulations, where we would have advised the user to look up the address of the Tretyakov Gallery, we can now suggest how to get into the Tretyakov Gallery without waiting in line, or even for free.



This is a new level of interaction between the search engine and its users. We do not just correct mistakes and complete queries; we are learning to predict a person's interests and offer them something new. Perhaps this is what the search of the future will be. What do you think?
