Eyes, brain, video quality: reflections on 120fps, 8K, HDR, rods, cones and the "soap opera effect"

I have written a lot in this blog about the fidelity of sound reproduction, while video has been unfairly neglected and touched on only indirectly. I decided to fix that: this article is devoted entirely to the problems of image quality and high-fidelity video playback, as well as to their connection with human vision.



We can define impeccable fidelity of video playback as a technological level at which the image on a screen cannot be distinguished from the surrounding reality. Neither traditional TVs and projectors nor virtual-reality glasses and headsets are currently capable of this, but the patterns and the directions worth working in are already visible. Below I will try to identify the most significant criteria of playback quality in accordance with how our visual perception works.

Neurophysiology of visual perception


To understand how significant any given criterion is, you need some idea of how vision works and how the brain recognizes and processes visual signals. The human retina contains an extremely important area: the fovea, the central pit of the retina. This area, which makes up only about 1% of the total area of the visual sensor, is what actually sees and transmits a high-definition, detailed image (i.e. it has a sufficient density of photoreceptors). Moreover, the information coming from this small patch keeps more than half of the neurons of the visual cortex busy.



At the same time, the fovea covers only about 2 degrees of the visual field. A larger-scale picture of the surrounding world is assembled in the visual cortex from separately "scanned" fragments, as well as with the help of visual memory. This visual "scanning" is fast, but it still takes time. At the moment there is no consensus on the resolution beyond which the human eye can no longer tell the difference.

The brain processes visual information quickly and imperceptibly, but it is precisely this high speed of perception and processing that lets us clearly notice the difference between frame rates and a number of other features that signal to the brain that the image is a fake rather than reality. At the same time, the entire picture of reality we see is a product of continuous brain activity, lagging slightly (by milliseconds) behind the events happening at the moment. So it can be argued that our eyes live in the very recent past.



An important aspect is frame rate. The illusion of continuous motion already arises at 13 to 17 frames per second. It is also known that human vision can catch objects on a screen that appear for a split second (from 1/16 to 1/220 of a second). This ability is individual and is especially well developed in esports players and video engineers. According to Professor Stuart Anstis, the brain can reduce the "delay" of visual perception to 10-15 milliseconds.

Rods are practically insensitive to color, and cones are slower than rods. Roughly speaking, the "maximum FPS" of human vision belongs to the rods. Cones are concentrated in the center of the retina while rods dominate its periphery; together they produce a complete picture, with the "slow" cones providing most of the color perception and the rods mainly registering motion.

The inertia of the retinal receptors (rods and cones) is considered the main physical limit on the speed of visual perception; it largely determines the time delay in transmitting information from the eye to the brain. Researchers at the Massachusetts Institute of Technology measured the minimum photoreceptor inertia for transmitting visual information at 13 ms (which corresponds to about 77 fps), confirming Anstis's data. It is also known that the minimum combined rod-cone inertia is 20 ms (which corresponds to 50 fps). Simply put, the physiological limit on the speed of visual perception is about 77 fps, and on color perception about 50 fps.
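The conversion behind these numbers is simple arithmetic: the equivalent frame rate is just the reciprocal of the receptor latency. A minimal sketch in Python (the function name is mine, purely for illustration):

```python
def latency_to_fps(latency_ms: float) -> float:
    """Frame rate equivalent to a given photoreceptor latency in milliseconds."""
    return 1000.0 / latency_ms

print(f"13 ms -> {latency_to_fps(13):.0f} fps")  # ~77 fps, the MIT figure
print(f"20 ms -> {latency_to_fps(20):.0f} fps")  # 50 fps, rod-cone inertia
```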

However, there are empirical observations indicating that a 50 fps picture and, say, a 96 fps picture differ visibly in color intensity and in subjective assessments of image clarity. This difference justifies producing films and other content at frame rates above 77 fps.

Some authors write that the brain is equipped with a certain algorithm for processing visual information, and even compare it with the processing algorithms of cameras. In reality the comparison is not correct, and the notion of an "algorithm" applies here only with a great deal of stretching. Our brain, as many readers probably know, is an analog system, and the electrochemical and biochemical processes occurring in it determine the results of its activity. You can call that an algorithm, but it is very far from digital data-processing algorithms.

Thus, the criteria responsible for fidelity of reproduction relate directly to how we receive visual information and how the brain perceives and processes it. It is fair to say that the secret of realism lies in creating an illusion that the brain cannot identify as a fake image. And here lies the big problem: data on the speed of perceiving real scenes and forming a visual image are poorly studied, vary widely between individuals, and depend on hundreds of physiological factors. For this reason, standards of high-quality imaging are usually determined empirically, by that most precise of methods: scientific trial and error.

Resolution


People who are just beginning to dig into the problems of digital image quality often believe that its only significant criterion is resolution. We can partly agree with them, since the detail of a picture depends on it first of all. Today there is debate about whether formats with resolution above Full HD are really needed. It is clear that realism in video cannot be achieved without high resolution, although it is far from the only condition for high fidelity of reproduction.

From subjective experience we can conclude that if you look closely and pay attention to detail, especially with projectors and really large screens, the use of 4K is justified. Moreover, today 4K is turning into a kind of standard for people who care about picture quality: anything below it, supposedly, is indecent to buy. Retailers, including ours, are quite happy with this stereotype, since it lets us earn more.
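One way to make this debate concrete is to compute the angular pixel density for a specific screen and viewing distance and compare it with the commonly cited acuity limit of about 60 pixels per degree (1 arcminute). A rough sketch; the screen size, viewing distance and acuity figure are common rules of thumb, not data from this article:

```python
import math

def pixels_per_degree(h_pixels: int, screen_width_m: float, distance_m: float) -> float:
    """Average angular pixel density across the screen width, seen from distance_m."""
    width_deg = 2 * math.degrees(math.atan(screen_width_m / (2 * distance_m)))
    return h_pixels / width_deg

# A 65" 16:9 panel is about 1.44 m wide; compare formats at a 2.5 m viewing distance.
for name, h_pixels in [("1080p", 1920), ("4K", 3840), ("8K", 7680)]:
    print(f"{name}: {pixels_per_degree(h_pixels, 1.44, 2.5):.0f} px/deg")
```

By this estimate, at 2.5 m a 65-inch Full HD panel already delivers about 60 px/deg, i.e. it sits right at the acuity limit, which is why the benefit of 4K shows up mainly on very large screens or at short viewing distances.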



Already today many buyers consider 8K a necessity. As a rule, this desire does not arise because the difference is actually noticeable to the buyer. Often they simply want a device they will have no doubts about, i.e. to know that the TV or projector they bought was created at the limit of the technological capabilities of its time. Resolution takes on special significance in VR systems, which aim for user immersion and strive to create the most convincing illusion of reality.

Contrary to the forecasts of sane but not very shrewd experts, broadcasting in 8K has already begun. NHK and RED broadcast in this format, and maybe someone else I don't know about, but these were the pioneers. Based on research in the field of VR, some predict an 8K future and write that the format has great practical value. I do not share such frank optimism, though I am convinced that sooner or later differences from 4K will be found, and that 8K will not be the last step in the resolution race.



Not because 8K is really necessary, but because progress should not stop on its account. The technological ability to increase resolution depends on the achievable pixel size: logically, the higher the resolution, the smaller the pixel. I doubt that pixels can be shrunk indefinitely; sooner or later the process will hit a physical limit.
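To put pixel size in perspective, here is a rough estimate of the physical pixel pitch for a flat panel; the 65-inch diagonal and 16:9 aspect ratio are my illustrative assumptions, not figures from the article:

```python
import math

def pixel_pitch_mm(diagonal_inch: float, h_pixels: int, aspect=(16, 9)) -> float:
    """Physical pixel pitch in millimeters for a flat panel of a given diagonal."""
    w, h = aspect
    width_mm = diagonal_inch * 25.4 * w / math.hypot(w, h)
    return width_mm / h_pixels

for name, px in [("1080p", 1920), ("4K", 3840), ("8K", 7680)]:
    print(f'{name} on a 65" panel: {pixel_pitch_mm(65, px):.3f} mm pitch')
```

On a 65-inch panel an 8K pixel is still roughly 0.19 mm across, so the near-term limit is less about physics than about manufacturing yield and cost; but each further doubling of resolution halves the pitch again.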

The problem with high resolution is the ever-growing volume of video data and, accordingly, the need to keep increasing the bandwidth of data channels. In television and regular broadcasting, the colossal volumes of high-resolution data (even 4K) sometimes cause lag and distortion due to insufficient channel bandwidth.
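A back-of-the-envelope calculation shows why bandwidth becomes the bottleneck. The sketch below estimates uncompressed bitrates; the 10-bit 4:2:0 sampling (15 bits per pixel) and the frame rates are my assumptions for illustration:

```python
def raw_gbps(width: int, height: int, fps: int, bits_per_pixel: int = 15) -> float:
    """Uncompressed video bitrate in gigabits per second (10-bit 4:2:0 assumed)."""
    return width * height * fps * bits_per_pixel / 1e9

for name, w, h, fps in [("1080p50", 1920, 1080, 50),
                        ("4K60",   3840, 2160, 60),
                        ("8K120",  7680, 4320, 120)]:
    print(f"{name}: {raw_gbps(w, h, fps):.1f} Gbit/s uncompressed")
```

Real broadcasts are, of course, heavily compressed, but the compressed bitrate scales with these raw numbers (roughly 1.6 Gbit/s for 1080p50 versus almost 60 Gbit/s for 8K120), which is exactly why even 4K sometimes chokes the channel.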

Dynamic range


The ability of our vision to perceive light and colors of varying intensity and to distinguish a large number of shades formed another criterion of video image quality: dynamic range, a characteristic that captures the ratio of the brightest tone to the dimmest (but not yet black) one in video content. High dynamic range, and the technology providing it today, is called HDR. It is generally accepted that the human eye can handle a dynamic range of about 1,000,000:1, and sometimes even more. This ability of vision is largely due to the need to distinguish the size and shape of surrounding objects even in rather poor lighting.
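For intuition, a contrast ratio is often restated in photographic "stops", i.e. doublings of brightness; the 1,000:1 figure for a typical SDR panel below is a common ballpark, not a number from this article:

```python
import math

def ratio_to_stops(ratio: float) -> float:
    """Convert a contrast ratio into photographic stops (doublings of brightness)."""
    return math.log2(ratio)

print(f"1,000,000:1 -> {ratio_to_stops(1_000_000):.1f} stops")  # ~20 stops, the eye per the article
print(f"1,000:1     -> {ratio_to_stops(1_000):.1f} stops")      # ~10 stops, a common SDR panel ballpark
```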



A well-known experiment reflects the importance of dynamic range alongside resolution: 1080p SDR was compared with UHD HDR, and then 1080p HDR with UHD HDR. In the first comparison the difference was visible to the naked eye; in the second it was not obvious. Subjectively, HDR was rated as improving picture quality (clarity, detail, accuracy) by 90% compared with SDR.

Frame rate


In this blog we once touched on the question of frame rate. Objectively, a high frame rate definitely makes the image more realistic, as was proved by the world's first film shot at 96fps, which we have already written about. Characteristically, that film was a documentary. It is no coincidence that frame rates many times higher than cinema's traditional 24fps are used in the world's best mass-produced virtual-reality system, the HTC Vive Pro. As I mentioned, it is in VR that the need for the most realistic possible image is most acute.



That documentary record was later beaten by Ang Lee's American film "Billy Lynn's Long Halftime Walk", shot at 120fps. Despite these breakthroughs in frame rate, most people greet the innovations with indignation, accusing such frame rates of the "soap opera effect", a "TV look", "low cinematography" and "artistic worthlessness". The high frame rate, they say, kills all the "warm tube magic of cinema".

The effect is explained by the fact that the vast majority of critics identify good cinema with what they have already seen, and they saw good cinema in theaters at 24fps. At the same time, most people associate higher frame rates with TV series, far from all of which were of decent quality. The standard is so rooted in our culture that it is guaranteed to be reflected in perception and becomes a criterion of quality in itself, although objectively it should not be.

Summary


As with sound, the logical components of fidelity of reproduction turned out to be quite understandable, physically explainable characteristics. And here too, by analogy, there is a demand for the archaic (with sound it was warm tubes) and sharp criticism of the progressive direction of development. Evidently, generations must pass before good cinema and other video content is judged not through the prism of frames per second but by other criteria. Perhaps I have missed some significant factors of fidelity of reproduction; let me know in the comments. I would also appreciate your own ideas about what the video of the future might look like.



