How I circumvented the Messages API ban through the VKontakte documentation

Hello, Habr community. This is my first article, and I wrote it in a state of mild euphoria, so please don't judge the writing too strictly. But enough words; let's get down to business.

How it all began


We all know that VK has an API, and I'm sure most of us have tried to use it for our own purposes. Personally, I have a lot of projects built on it: about five capable bots, large-scale datasets compiled from group posts, and so on. So it is not surprising that friends asked me a couple of times to download songs or photos from a dialogue's attachments, or to save the text of a conversation with some person to a separate file.

But one day “it” happened, and from that moment fulfilling such small requests ceased to be a trivial task:

image

So, a couple of days ago, to get rid of this problem once and for all, I decided to write my own wrapper over HTTP requests that pretends to be a regular user, so that I would have a tool as powerful as the official API for the messages section.

Let's get down to business


So, I started with authorization. Armed with an HTTPS sniffer and Firefox, I walked through all the “steps” of authorization and obtained the final cookies. After that, all that remained was to understand how the requests were made. It turned out that most of the data is fetched with a POST request to https://vk.com/wkview.php , with only the parameters changing from case to case. I managed to write functions for downloading absolutely every type of attachment, but we won't go into the details, because at one point everything changed dramatically.
Link to the file for obtaining authorization cookies (I wrote it only for two-factor authentication, since most people have it enabled)
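
The download script later in this article reads its cookies from a file named cookies_vk_auth.pickle. As a minimal sketch of how such a file could be written and read back, assuming the cookies have already been collected into a plain dict: the helper names save_cookies and load_cookies, the example file name, and the placeholder cookie are all hypothetical and are not taken from the linked file.

```python
import os
import pickle
import tempfile


def save_cookies(cookies: dict, path: str) -> None:
    """Serialize a name -> value cookie mapping to disk."""
    with open(path, 'wb') as handle:
        pickle.dump(cookies, handle)


def load_cookies(path: str) -> dict:
    """Read the cookie mapping back from disk."""
    with open(path, 'rb') as handle:
        return pickle.load(handle)


# Placeholder values; real cookie names and values come from the login flow.
example_cookies = {'session_cookie': 'placeholder-value'}
cookie_path = os.path.join(tempfile.gettempdir(), 'cookies_vk_auth_example.pickle')

save_cookies(example_cookies, cookie_path)
restored_cookies = load_cookies(cookie_path)
```

A dict restored this way can be passed to requests through the cookies= argument. Pickle files should only be loaded if you created them yourself, since unpickling untrusted data can execute arbitrary code.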

Unexpected discovery


I was working on my laptop when a friend came up and asked what I was doing. Since I couldn't quickly explain the whole problem in simple terms, I opened the official documentation for the messages section, and was stunned when I saw what sits right under the main description of these “forbidden” methods:

image

No, don't get me wrong: this is not the first time I have seen this feature. I had used it many times with other methods, but it never occurred to me that the “example request” function would still be available for the methods of the messages section. I was even more surprised when I sniffed the traffic. These were ordinary API requests, just made through the site, with slightly different parameter names in the web form and some kind of hash ID.

image

Within a few minutes I realized that the hash ID is just a string stored in the data-hash attribute of the button tag, and a couple of minutes later I was already busy emulating the “test requests”, not fully believing it would work. Surely these requests had some kind of limit on their number, or something like that. But imagine my surprise when this 30-line script (not counting the code that obtains the cookies), written on my knee, downloaded half a thousand pictures from a dialogue's attachments in 4 minutes.
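
To illustrate the hash lookup described above, here is a minimal sketch that pulls the data-hash attribute out of a button tag with Python's standard html.parser module. The sample HTML, the hash value, and the class name HashFinder are made up for the example; the markup of the real page may differ.

```python
from html.parser import HTMLParser


class HashFinder(HTMLParser):
    """Collects the data-hash attribute of every <button> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hashes = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag == 'button':
            for name, value in attrs:
                if name == 'data-hash' and value:
                    self.hashes.append(value)


# Hypothetical fragment standing in for the documentation page.
sample_html = '<div><button class="run" data-hash="1234567890:abcdef">Run</button></div>'

finder = HashFinder()
finder.feed(sample_html)
hash_data = finder.hashes[0]
print(hash_data)
```

A regular expression over the page text also works for a quick script. A parser is simply less sensitive to attribute order and quoting.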

image

Here is the code I used:
import json
import pickle
import re

import requests

# Load the cookies saved after logging in
with open('cookies_vk_auth.pickle', 'rb') as handle:
    cookies_final = pickle.load(handle)

session = requests.Session()
peer_id = int(input('Enter peer_id: '))

# The hash is taken from the data-hash attribute on the method's documentation page
response = session.get('https://vk.com/dev/messages.getHistoryAttachments', cookies=cookies_final)
hash_data = re.findall(r'data-hash="(\S*)"', response.text)[0]

# Request the first batch of 20 photo attachments
response = session.post('https://vk.com/dev',
            data=f'act=a_run_method&al=1&hash={hash_data}&method=messages.getHistoryAttachments&param_count=20&param_max_forwards_level=45&param_media_type=photo&param_peer_id={peer_id}&param_photo_sizes=0&param_preserve_order=0&param_v=5.103', cookies=cookies_final)

for i in range(200):
    # Skip the first four characters of the body, then decode the JSON string nested in the payload
    items = json.loads(json.loads(response.text[4:])['payload'][1][0])['response']['items']
    if not items:
        break  # nothing left to download

    for photo in items:
        # The last entry in sizes is the one the script downloads
        ph = photo['attachment']['photo']['sizes'][-1]['url']
        r = session.get(ph, timeout=10)

        if r.status_code == 200:
            with open('D://dev/' + ph.split('/')[-1], 'wb') as f:
                f.write(r.content)

    # Continue from the last message seen in this batch
    m_id = photo['message_id']
    response = session.post('https://vk.com/dev',
            data=f'act=a_run_method&al=1&hash={hash_data}&method=messages.getHistoryAttachments&param_count=20&param_start_from={m_id}&param_max_forwards_level=45&param_media_type=photo&param_peer_id={peer_id}&param_photo_sizes=0&param_preserve_order=0&param_v=5.103', cookies=cookies_final)
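

The nested json.loads call in the script is easier to follow in isolation. The sketch below assumes a response body shaped the way the script's indexing implies: four leading characters, then a JSON object whose payload[1][0] element is itself a JSON string holding the API answer. The sample body, its '<!--' prefix, and the helper name unwrap_items are invented for illustration.

```python
import json


def unwrap_items(body: str) -> list:
    """Return response.items from a test-request body, following the script's indexing."""
    outer = json.loads(body[4:])                # skip the assumed 4-character prefix
    inner = json.loads(outer['payload'][1][0])  # the API answer is a JSON string inside the payload
    return inner['response']['items']


# Build a made-up body with the assumed shape.
api_answer = {'response': {'items': [{'message_id': 42, 'attachment': {'type': 'photo'}}]}}
sample_body = '<!--' + json.dumps({'payload': [0, [json.dumps(api_answer)]]})

items = unwrap_items(sample_body)
# The id of the last item is what the script passes on as param_start_from.
last_message_id = items[-1]['message_id'] if items else None
```

Checking for an empty items list before reading the last message id avoids an error once the dialogue has no more attachments.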


I was so impressed that at this point I decided to cool down and try some other method (what if I was simply mistaken?). I took the History method, and the result was the same. The only difference was that I had to add a delay of 0.1 seconds so that the server would not return a “too many requests” error. (If you repeat this, remember that when you change the method you also need to change the URL of the documentation page that the data-hash value is taken from.) In other words, this approach really does give access to the messages section through the official documentation, using nothing but the user's login and password. To be sure, I repeated the same steps on another account and got the same result.
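
The two points from this paragraph, a separate documentation URL per method and a pause between requests, can be sketched as follows. The helper names doc_url and build_body, the constant REQUEST_DELAY, and the method name messages.getHistory are assumptions made for the example. The form fields mirror those in the script above, and urllib.parse.urlencode replaces the hand-written string so that values are escaped.

```python
import time
from urllib.parse import urlencode

REQUEST_DELAY = 0.1  # seconds between requests, the value mentioned in the text


def doc_url(method: str) -> str:
    """Documentation page of a method; its data-hash is needed for that method's requests."""
    return f'https://vk.com/dev/{method}'


def build_body(method: str, hash_data: str, **params) -> str:
    """Form body for a test request: fixed fields plus each API parameter prefixed with param_."""
    fields = {'act': 'a_run_method', 'al': 1, 'hash': hash_data, 'method': method}
    fields.update({f'param_{name}': value for name, value in params.items()})
    return urlencode(fields)


# 'example-hash' is a placeholder for the value read from the documentation page.
body = build_body('messages.getHistory', 'example-hash', peer_id=1, count=20, v='5.103')
time.sleep(REQUEST_DELAY)  # pause before the next request would be sent
```

Keeping the method name as an argument makes it harder to forget that the hash must come from the same method's documentation page.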

To summarize


By now, I think, everyone has realized that this is a breach in the protection of our personal data. It has been sitting in the documentation for a year, and nobody knows how many people have already used it. The hole is very large and needs to be closed soon. To show once more that it should not work this way, I will quote the VK developers themselves:
If you plan to start developing a messenger, after February 15, 2019, you will need to obtain test access from Support, under which the methods of the Messages section work with the keys of the administrators of your Standalone application.

That is, even to obtain a token for an internal application that will have access to a user's correspondence, you need personal permission from VK, let alone access with nothing but an ordinary login and password.

My personal opinion


Banning the messages section brought no fundamental change to user security. It merely drew a line and cut off a group of would-be hackers who, without really understanding what they were doing, could have gained full access to the data. For everyone else with more programming experience, gaining access to correspondence is only a matter of time. In the first part of the article I showed by my own example, having written a program for downloading attachments, that a library capable of pretending to be a user is not far off. Maybe I will finish it myself. VK developers need to be prepared for this and come up with ways to detect suspicious user activity if the privacy of our data really matters to them.

