When I hear the words "restored by a neural network", I go and check my backups

In addition to being an IT specialist, I am also a historian of technology, and that shapes my reaction to news about the latest achievements in digital technology. A month ago I decided to start writing a book for people who are far from IT but close to historical research and sources (“Digital source study - specific problems” is being written on a site for draft books), in which I will explain what the development of digital technology has meant for them.

A couple of days after that, a news item flashed across the Internet: “‘Arrival of the Train’ has been improved with the help of neural networks: the 1896 film can now be watched in 4K at 60 frames per second”. This is a good occasion to tell IT people about the same thing.

I don’t have the original film “Arrival of the Train”, so as test samples I used modern photographs (reduced or desaturated) plus photos from (presumably) the 1930s.

0. What is the problem?


The problem discussed here arises from how real historians and neural networks actually work.

In the layman's view, the ideal historian sits exclusively in archives and works with official, well-preserved documents. In reality, historians work with whatever sources they have, in whatever form those sources have reached them.

Besides official documents in state archives, sources can include personal photos, letters, memoirs and so on. Unfortunately, historians very often work not with original documents but with various copies.

Have you ever heard it said that various icons and texts “have come down to us in lists”? Here the word “list” does not mean a catalog in which a work is mentioned, but a copy of the work itself. The term comes from the verb “to write off”, that is, to copy.

Many texts, photographs and films have reached us only as copies, and there is no guarantee that the only copy of the film “Seventeen Moments of Spring” to reach the historians of the future will not be a colorized and cropped version. For the paths of a historical source are inscrutable.

On the other hand, there is plenty of news about a neural network restoring or improving something. It sounds like magic, and many people get the feeling that some kind of artificial intelligence really can restore things.

In fact, no restoration of color, or of detail in small pictures, is taking place, nor can it. The program simply adds to the photo or video whatever elements its algorithms judge to be appropriate.

Unfortunately, a lost image cannot really be restored: desaturation is an irreversible operation, and if part of the image is missing from a photograph, it cannot be recovered from that photograph alone.
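Why removing color cannot be undone can be seen in a few lines. The sketch below is only an illustration under stated assumptions: it uses the common Rec. 601 luma weights, which may differ from what a particular graphics editor applies, and the function names and sample colors are hypothetical.

```python
def to_gray(rgb):
    """Convert an (R, G, B) triple to one gray level (Rec. 601 weights, assumed)."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def colors_with_gray(level, step=5):
    """List every color on a coarse RGB grid that collapses to the given gray level."""
    values = range(0, 256, step)
    return [(r, g, b)
            for r in values for g in values for b in values
            if to_gray((r, g, b)) == level]


# Two clearly different colors (hypothetical examples).
red = (255, 0, 0)
green = (0, 130, 0)

# Every grid color that becomes the same gray as pure red.
candidates = colors_with_gray(to_gray(red))
```

Here `to_gray(red)` and `to_gray(green)` give the same value, and `candidates` holds many more colors with the same fate. Nothing in the gray value says which of them was in front of the camera.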

So neural networks do exactly what people do in such cases: they fantasize on the basis of their experience.

Now I will show what these fantasies produce.

1. Comparison of different colorization services


Although colorizing photos and films is nothing new, it is now available to anyone with Internet access, and many people take advantage of this opportunity.

We already live in a world full of colorized photographs of soldiers of the Great Patriotic War, the interiors of the Titanic, the royal family and much else.

To an uninitiated person it may seem that the original color is being restored, and that a colorized photograph shows how people and objects of a hundred years ago actually looked. On the basis of such photos someone may start drawing conclusions about people's lives in the past and analyzing various events and situations.
And although I understand that the real color cannot be recovered from a black-and-white photograph, as a researcher I must check and make sure that I am right.

To test this, I took two modern color photographs, desaturated them in a graphics editor, and ran them through online colorization services.

1.1 Colorization of a Ford A Phaeton


In this case I used a photograph I took at the end of January 2020 at Moscow Domodedovo Airport. I do not know how closely the paintwork of these cars matches their original color, but that does not matter. In this experiment we are checking how accurately the color of the desaturated photo is restored.

Colorization of the car Ford A Phaeton

I ran this experiment on photographs of different cars and the result never changes: every service colors real cars differently, and none colors them correctly.

That said, I personally prefer not the original but the result from deepai.org: a calm body color with blue sides on the roof. (In this version of the illustration the original color is shown in strips 2 and 7, and I also like strip 5, colored by algorithmia.com, where one part is yellow and another is red.)

The problem with car coloring has a very simple explanation: the data built into each neural network. Just as with manual coloring, automatic coloring reveals exactly what experience the coloring was based on.

That is, there is no question of restoring the original color, nor can there be.

Of course, some people will say that you just need to feed the neural network even more photos and then everything will be fine, but this contradicts the very principle of neural networks: they simply average the data loaded into them and cannot go beyond the “experience” obtained this way.
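A deliberately simplified sketch of this averaging, not a description of how any of the services above is actually built: assume a model that must give one answer for a given gray input and is scored by mean squared error. The three car colors are made up for illustration and chosen so that all of them have a gray level of about 100 under the Rec. 601 weights.

```python
def squared_error(a, b):
    """Squared distance between two RGB colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_error(prediction, samples):
    """Average squared error of one fixed prediction over all samples."""
    return sum(squared_error(prediction, s) for s in samples) / len(samples)


def mean_color(samples):
    """Channel-wise average of the sample colors."""
    return tuple(sum(channel) / len(samples) for channel in zip(*samples))


# Hypothetical "training data": three cars that look alike in gray
# but were really red, blue and green.
seen_colors = [(220, 50, 45), (40, 100, 255), (30, 150, 25)]

average = mean_color(seen_colors)
errors = {color: mean_error(color, seen_colors) for color in seen_colors}
average_error = mean_error(average, seen_colors)
```

The average comes out as a dull grayish tone, roughly (97, 100, 108). It has a lower average error than any of the real colors, yet it matches none of the cars. More data of the same kind moves the average around; it does not tell the model which car is in a particular photo.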

1.2 Colorization of the fountain at VDNH


The next experiment used a photograph showing architecture and many people in colorful clothes. The original photo was cropped, desaturated, and uploaded to the colorization services.

Colorization of the fountain at VDNH

Because there are so many objects to color, the result is not as clear-cut as with the Ford A Phaeton.

Yes, none of the services colored the statues gold, the tulips at the bottom of the picture red, or the T-shirts bright green and bright blue. However, all of them coped brilliantly with the white T-shirt of the man sitting on the parapet of the fountain and the white blouse of the woman walking from right to left with a handbag at her side.

So once again we have a completely predictable result: colorization services cannot restore the real color.

But the value of this example is not in repeating an obvious fact. Repeating obvious facts is of course necessary and right, but there is one more point.

Bonus from 9may.mail.ru


In addition to colorization, the 9may.mail.ru service performs an operation called “defect removal”. If you compare a photo that was only colorized with a colorized photo that also had its defects removed, you will find a very interesting feature.

Bonus from 9may.mail.ru

This illustration shows an enlarged fragment of the right edge of the fountain photo. As you can clearly see, during “defect removal” a sculptural element was removed (I will not venture to name it :))

Similar “defect removals” showed up in other photographs colorized by 9may.mail.ru, though the deletions there were not as large.

Thus the historical source was not only colored incorrectly but also acquired “scuffs” that destroyed part of the image (which again brings us back to the question of “Digital wear and tear”).

This example leads smoothly into the next part of the story, on how “improvement” of photographs by neural networks affects historical sources.

2. Increase in photo size


Like colorization, enlargement of photographs existed in the pre-digital era.

The result in both cases is the same: we start to see the smallest element of the photo. In analog photography that was the “grain”; now its place has been taken by the “pixel”, but they are the same in essence: the minimal indivisible element (I really want to say “atomic”, but despite its name the atom is not indivisible :))

If we look at a chessboard through a magnifying optical device (a telescope, binoculars, etc.), we can “zoom in” on it and make out details that were not visible before.

But if we photographed the chessboard so that it fit into a single grain or pixel, there is no way to “zoom in” and make out each square individually. Enlarging such a picture, we will see a large single-color spot where the chessboard should be.

Exactly the same thing happens if we reduce the pixel dimensions of a digital photograph of a chessboard: the information about the squares is lost, and there is no way to recover it from that photograph alone.
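The chessboard argument can be checked with a toy example. This is a minimal sketch, assuming the simplest possible reduction (averaging all pixels into one) and the simplest possible enlargement (repeating that pixel); real resizing filters are more elaborate, and all names here are hypothetical.

```python
def chessboard(n=8):
    """An n x n board of white (255) and black (0) squares, one pixel per square."""
    return [[255 if (row + col) % 2 == 0 else 0 for col in range(n)]
            for row in range(n)]


def shrink_to_one_pixel(image):
    """Reduce the whole image to a single pixel by averaging every value."""
    pixels = [value for row in image for value in row]
    return sum(pixels) / len(pixels)


def enlarge(pixel, n=8):
    """Blow one pixel back up to n x n by repeating it."""
    return [[pixel] * n for _ in range(n)]


board = chessboard()
plain_gray_wall = [[127.5] * 8 for _ in range(8)]

board_pixel = shrink_to_one_pixel(board)
wall_pixel = shrink_to_one_pixel(plain_gray_wall)
restored = enlarge(board_pixel)
```

The chessboard and a plain gray wall shrink to the same single value, so after reduction they are indistinguishable, and `restored` is the one-color spot described above rather than a board.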

I feel awkward stating such a commonplace idea, but practice shows that the irreversibility of downscaling a digital photograph is not obvious to everyone.

From time to time news appears that some neural network has enlarged and improved an old photo, so that we can now see details we could not see before.

Just as with colorization, I tried applying online services to real photos.

2.1 Unknown mill from the 1930s


One Saturday evening a colleague sent me a link to a photograph on the Perm State Archive page on VKontakte: 1024 by 705 pixels, put through JPEG compression several times, with barely legible inscriptions.

Unknown Mill from the 1930s

We had a great time solving this riddle, and on Monday he confirmed our findings by going to the archive and studying the original photograph.

This allowed me to conduct an experiment and see what neural networks are capable of.

Unknown mill from the 1930s - comparison

In the end the most readable option was “simple enlargement” (in fact, I had read this inscription simply by zooming in on a smartphone screen).

biz.mail.ru made the inscription unreadable at high magnification, but the line “Acme Road Mach Co” remains partially readable at a certain scale.

The remaining candidates added so much noise that the inscription could no longer be read at all, although it remained partially recognizable.

That is, the services for "improving photos" did the exact opposite: they made the real photo worse.

And if you say that improving inscriptions on old photographs is not what such services are for, I will agree, because that is precisely the problem. These services exist and are marketed as services for “recovery” and “restoration” without explaining to users the risks and consequences of the technology used. People who study the history of their family or their town may well “improve” their digital photographs.

And I strongly doubt that all of them will carefully keep the original, unimproved photo.

I have another example involving the Perm archive and the attribution of photos, but it will appear in the next update of Digital Source Studies; for now I would rather return to the cars I photographed at Domodedovo.

2.2 Hood Lorraine-Dietrich B36


To test what photo enlargement can do, I took one of my photos, reduced its pixel dimensions from 4000 by 3000 to 1024 by 768, and ran it through the same services as the mill photo from the previous example.

Lorraine-Dietrich B36

And while an ordinary viewer of such "improved" pictures does not look at them closely, I was interested in the small details.

Hood Lorraine-Dietrich B36

The result was predictable.

The logo on the radiator grille is recognizable but distorted: its lines have been evened out.

The side vent holes have been smoothed over and cannot be told apart from glare on the hood.

As expected, many small details disappeared, but this example is not here merely to confirm once more that information is irreversibly lost when the pixel dimensions of a digital photograph are reduced.

If you looked at the photos carefully, you have already seen signs that a neural network was at work here.

Bonus from letsenhance.io


This is the moment to recall how neural networks work: they select suitable options from their own "experience" gained through training.

Now I will show exactly how letsenhance.io enlarged by a factor of 4 a photo that I had previously reduced by a factor of 4.

On the left you see the original photo before reduction; on the right, the one obtained after enlargement. (The intermediate reduced photo is not shown.)

Bonus from letsenhance.io

Yes, that's right: this is a monkey's face.

And if you see in this a funny incident, a problem with the training of a neural network, or its misuse, I see something quite different: a huge number of digital photos that have been and will be "improved" by neural networks and will go into circulation. Some of them will replace the originals once the originals are lost.

And if before I started writing this article I was merely aware of the problems that come with the fashion for improving or restoring images with neural networks, now the problem has acquired a very specific face.

But this is not the end of the story.

3. Increasing the number of frames in a video


To get a film, one large, colorful picture is not enough. There must be many such pictures, and they must replace one another very quickly.

One way to improve films is to increase the speed at which these pictures replace one another, or, to give it its proper name, to “increase the frame rate”.

And here, too, there is nothing new. Just as with desaturation and reduction of pixel dimensions, there is no way to obtain information about what happened between frames.

One can guess how the subject moved in the frame and draw it in on the newly added frames, but, as with colorization and enlargement, this is the invention of new details, not the restoration of what actually happened.

This is best illustrated by a frame from a demo of the DAIN neural network. (Judging by the description of the “Arrival of a Train” video mentioned earlier, this is the network its authors used to increase the frame rate.)

Increasing the number of frames in a video

Here is a comparison of 3 options for increasing the frame rate from 12 fps to 24 fps.

The top left frame is the original video.
The bottom right is the result of DAIN.
The remaining two are the solutions the DAIN creators compare themselves with.

As you can see, all three approaches to increasing the frame rate try to find an average state between two frames. Although the DAIN result (bottom right frame) looks sharper than the SepConv and ToFlow results, it still shows the shirt on the back and the head being smeared.

And even when the technology advances and such smearing disappears, this will not change the fact that what happened between the frames cannot be restored, and all we can do is construct some kind of averaged state.
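The point about in-between frames can be shown with a toy one-row “video”. This is a minimal sketch assuming naive pixel blending; it is not how DAIN, SepConv or ToFlow work internally, and the frame layout and names are hypothetical.

```python
def frame(position, width=8):
    """A one-row frame with a single bright object at the given position."""
    return [255 if x == position else 0 for x in range(width)]


def blend(a, b):
    """Naive in-between frame: the pixel-wise average of two frames."""
    return [(p + q) / 2 for p, q in zip(a, b)]


# The two frames that were actually recorded.
first, last = frame(1), frame(5)

# Two different things that could really have happened in between.
# Both are fully consistent with the recorded frames.
moved_steadily = frame(3)
paused_then_jumped = frame(1)

# What averaging gives: two half-bright ghosts instead of one object.
guess = blend(first, last)
```

The blended frame contains a faint copy of the object at both positions, which is the smearing visible in the comparison above. A smarter method may pick `moved_steadily` and look sharp, but the recorded frames alone cannot say whether that, or `paused_then_jumped`, is what happened.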

Conclusion


As an IT specialist, I understand that these technologies are not designed to preserve digital sources correctly. Neural networks are needed to produce attractive, easily digestible content.

That is why films are colorized, cropped and given a higher frame rate.

This is just show business, and the authors of a technology are not obliged to care how users apply what they have developed.

But as a historian, I see the results of using these technologies. The growing number of photos and films “improved by neural networks” will lead to their ending up among the materials used as historical sources in various studies. Accompanying processes will wash out the old versions of files and turn the “improved” copies into the only ones available (hello, “Digital Wear”).

This process cannot be stopped, but approaches can be developed to minimize the damage. That, in fact, is what the book on digital source studies is about, and it is aimed specifically at my fellow historians rather than at IT industry specialists.

There is, however, one measure open to everyone regardless of profession: stop using the words “recovery” and “restoration” for the process of creating easily digestible media content, so as not to give the uninitiated a false impression of the nature of this process and of the resulting product.

There is another word for this:
[…] If R. limited itself to correcting only this disharmony, its role would have to be recognized as highly desirable and useful.

(Bold emphasis is mine.)

Brockhaus and Efron Encyclopedic Dictionary, Volume XXVIA, p. 624

Published in EIGHTEEN NINETY-NINE.

As you can see, this problem has been known since well before the current millennium and was already relevant when the original film “Arrival of the Train” appeared.
