On color, sound, and “crowd research” as a beauty of its own kind

This story began, as you might expect, in the line for morning coffee at a good and friendly IT company that is notable for little else. It began, as usual, with the usual provocation.

The author of the idea probably no longer remembers whether he gave it any thought before voicing it or, as usual, blurted out the first thing that came to mind and then set about vigorously proving he was right. Go on, refute it if you can.

So, what was the provocation? Everyone who went to school probably remembers the dual nature of light, and that one side of it is the picture of light as a wave. Light waves occupy a certain frequency range, and each shade of the visible spectrum corresponds to a certain wavelength. On the one hand, this range is continuous. On the other hand, it is well established that the human eye has only three kinds of color receptors, each most sensitive to its own part of the spectrum. Every other color reaches perception as a combination of two or three components, from which the brain “reconstructs” the original shade. It is the same in a monitor: we place three lamps, red, green and blue, set the required intensity of each, and get a pixel.

Now watch the hands closely: if sound is also a wave, and different pitches are encoded by different frequencies, is the same effect observed in the perception of sound? Can the set of perceived frequencies be divided into those registered by a single “sensor” and those perceived through a “combination”?

A warning up front: there will be no moral below the cut, but if you are interested in the reasoning around refuting the provocation ...

... then this simple idea gave rise to plenty of it.

Right there at the coffee machine, arguments for and against began to pour in. The main argument in favor is the sheer beauty of the analogy. Some people have so-called color hearing, a poorly studied phenomenon that nevertheless suggests the mechanisms of perception might be similar.

In addition, the human ear hears some frequencies noticeably worse than others, even within the audible range. This is known and verified, and the mp3 compression algorithm actively exploits it.

An amateur musician who happened to be among us confirmed that in some musical intervals two sounds are clearly audible, while in others they are much harder to tell apart. We could not verify this, since some of those present had no ear for music.

A former artist who happened to be among us said that any color can, of course, be mixed from yellow, pink and blue paint, but the paints must be of a particular shade: the further they are from the ideal, the dirtier the resulting mix (CMYK is, in effect, inverted RGB, isn't it?).

The main objection: sound is a longitudinal wave, while light is a transverse one. This may strongly affect whether such mixing is possible at all, and it also hints that the body perceives the two in quite different ways.

The second objection, no less strong, is that music can be transposed to other keys. You can raise or lower every note by a semitone, for example, and the result will not be perceived very differently. For people without absolute pitch, there will be no difference at all.

But the favorite key among the masses is still A minor!

The remaining arguments clearly drifted toward the personal. In particular, the author took a fair amount of flak that day for poking fun at the snobbery of sound engineers about the quality of playback equipment. What a drama it would be if it turned out that, out of all this luxury of faithful reproduction, we can in principle hear only a very, very small part.

In general, lances were broken and everyone went their separate ways, but I, as they say, was hooked. The thing is that, algorithmically, it is sometimes much harder to verify that a phenomenon does not exist than to confirm that it does. For the latter, any single successful example will do. For the former, you have to dig deep through the possible options, and at the end you will still be asked: is there really no cat in the room, or were you just looking the wrong way?

Now, to the point


As the saying goes, what is there to think about, you just have to jump.

Let's start from the analogy with paints. With a single color (or sound) nothing can be told, but if you mix two, you can see (or hear) from the result whether the original components were “pure” enough. If the mix sounds like a “simple” wave of some single frequency, we have found two of the desired components at once. If not, at least one of them is “dirty”. The range we search is from 20 to 40,000 Hz (human hearing is usually quoted as roughly 20 to 20,000 Hz, so this leaves a margin); beyond these limits there is clearly no point in adding anything to anything.

Second question: what should a single test look like? If we take two frequencies, then, as with color, it is probably logical to add them in different amplitude proportions. The first and last sounds are the pure frequencies themselves, the middle one is a 50/50 mix, and there are a couple of variants between each edge and the middle. That makes about 7 steps of one second each, plus a second of pause between tests.
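A minimal sketch of one such test, assuming equally spaced mixing proportions and plain sine waves. The sample rate and the names `mix_weights`, `mixed_tone` and `pair_test` are illustrative assumptions, not taken from the project.

```python
import math

SAMPLE_RATE = 44_100  # samples per second; an assumed, common audio rate


def mix_weights(steps=7):
    """Share of the second frequency at each step: 0.0 is the pure first tone, 1.0 the pure second."""
    return [i / (steps - 1) for i in range(steps)]


def mixed_tone(f1, f2, weight, seconds=1.0, rate=SAMPLE_RATE):
    """Samples of two sine waves summed with amplitudes (1 - weight) and weight."""
    n = int(seconds * rate)
    return [
        (1 - weight) * math.sin(2 * math.pi * f1 * t / rate)
        + weight * math.sin(2 * math.pi * f2 * t / rate)
        for t in range(n)
    ]


def pair_test(f1, f2, steps=7):
    """One test for a pair of frequencies: a list of `steps` one-second tones."""
    return [mixed_tone(f1, f2, w) for w in mix_weights(steps)]
```

Calling `pair_test(440, 660)` would give seven one-second sample buffers, from pure 440 Hz through a 50/50 mix to pure 660 Hz; the one-second pause between tests is left to whatever plays them.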

In each set we combine several frequencies, so that the most “plausible” result can be picked from a range of options. But listening to the whole set should take no longer than, say, 10 minutes, so that you can pause, assess the damage done to your psyche, decide whether the set contained anything interesting, and think about whether to continue or call it a day.

Let's count. The total time of a set is divided by the time needed to listen to one pair of frequencies: (10 minutes x 60) / (7 seconds + 1 second) = 75. Since we have two dimensions and take our sets to be “square”, we take the square root of 75, which is about 8.66. For simplicity, we take 8 steps along each dimension.
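The same arithmetic as a small sketch; the function and parameter names are made up for illustration.

```python
import math


def set_dimensions(set_seconds=10 * 60, tone_steps=7, pause_seconds=1):
    """Return (pairs that fit in one set, steps per side of a square set)."""
    pair_seconds = tone_steps + pause_seconds    # 7 one-second steps plus a pause
    pairs_per_set = set_seconds // pair_seconds  # 600 // 8 = 75
    side = math.isqrt(pairs_per_set)             # floor of sqrt(75), i.e. 8
    return pairs_per_set, side


print(set_dimensions())  # (75, 8)
```

Rounding the side down to 8 means a set holds 64 pairs rather than 75, so it fits inside the 10-minute budget with some room to spare.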

Now we choose the discretization, that is, how many test points to divide the original range into. If the phenomenon exists, it is logical to assume a certain “smoothness”: as we approach the desired “point”, the result will sound more and more plausible, until we hear (if we do) a single sound of a “new” frequency in the combination of two different sounds. For reference we take a table of note pitches, of which there are plenty freely available.

The semitone can serve as a subjective measure of sensitivity (a good musician can sometimes hear a half or even a quarter of this interval, whereas for “mere mortals” even a semitone is too close). In absolute terms, the frequency difference it spans grows with pitch: around 40 Hz two adjacent semitones are a few hertz apart, around 400 Hz a few tens of hertz, and around 4000 Hz a few hundred.
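These orders of magnitude are easy to check, assuming the note table follows equal temperament (12 equal steps per octave), which is an assumption here since the text does not say which table was used.

```python
SEMITONE_RATIO = 2 ** (1 / 12)  # equal temperament: 12 equal steps per octave


def semitone_width(freq_hz):
    """Distance in Hz from a pitch to the pitch one semitone above it."""
    return freq_hz * (SEMITONE_RATIO - 1)


for f in (40, 400, 4000):
    print(f, round(semitone_width(f), 1))
```

Under that assumption a semitone is about 5.9% of the starting frequency, which gives roughly 2.4, 24 and 238 Hz at the three pitches named above.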

So we make the step width in a test set depend on the frequency of its first element. For simplicity we take a linear relationship: step width = start frequency * 0.005

Now let's calculate the number of test sets. We know the size of a set along each dimension, the frequency range, and the step width coefficient of 0.005. We write a simple algorithm, run it, and get the answer: 194. A reminder: that is per dimension. Since the order in which the waves are added does not matter, instead of simply squaring this number we count (194 * (194 - 1)) / 2 = 18721 sets. Listening time for one set: 64 * 8 = 512 seconds.
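The algorithm itself is not shown in the article. The sketch below is one reading of the rules above that reproduces the quoted figure; it assumes the step width stays constant within a set and that each set starts where the previous one ends. The function names are hypothetical.

```python
def count_sets(f_min=20.0, f_max=40_000.0, steps=8, coeff=0.005):
    """Count sets along one dimension, each covering `steps` steps of width start * coeff."""
    count = 0
    start = f_min
    while start < f_max:
        step = start * coeff   # step width is fixed by the set's first frequency
        start += steps * step  # assumption: the next set begins where this one ends
        count += 1
    return count


def unordered_pairs(n):
    """Number of ways to pick two different sets when order does not matter."""
    return n * (n - 1) // 2


sets_per_dimension = count_sets()
print(sets_per_dimension, unordered_pairs(sets_per_dimension))  # 194 18721
```

With these assumptions each set stretches the frequency by a factor of 1.04, and it takes 194 such sets to get from 20 Hz past 40,000 Hz.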

18721 * 512 = 9585152 seconds = 159752 minutes = 2662 hours = 111 days

Hold on, rewind: are we not going to sleep at all? If we count 8 hours in a day rather than 24, we get about 333 days. About a year. Hmm.
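The estimate can be reproduced in a few lines; `listening_time` and its parameters are illustrative names.

```python
def listening_time(sets=18_721, set_seconds=512, hours_per_day=24):
    """Return (total seconds, total hours, days at the given listening hours per day)."""
    total_seconds = sets * set_seconds
    hours = total_seconds / 3600
    return total_seconds, hours, hours / hours_per_day


print(listening_time())                 # 9585152 s, about 2662.5 h, about 111 days
print(listening_time(hours_per_day=8))  # same total, about 333 days
```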

Somewhere around here it became clear why the answer to the simple question from the third paragraph of this article is still not in the school biology textbook. We need a researcher with a near-perfect ear for music, with equipment that can accurately reproduce sound over a wide frequency range, and psychologically stable enough to listen to monotonous sound sets for eight hours a day for a whole year. And, most importantly, it is not entirely clear what practical benefit this would bring. Any volunteers?

What crowd research has to do with it


Still, I did not want to just give up. If the task can, by and large, be divided into parts, why not split it among a number of participants and solve it faster? Today's technology makes it possible to share observations quickly and without unnecessary effort.

So, we need an interface that automatically steps through and plays a test set. It should let you navigate between individual tests, mark interesting points, and replay a specific test. In addition, a “reference sound” is needed: a test of the same duration but made of “pure” waves, whose frequencies run from the first frequency of the test to the second. And the wave can also be “drawn”, so that the phenomenon can be studied not only by ear but also visually.



That is roughly how the interface should look. Preliminary tests showed, by the way, that things are not so simple with that cat in the room. In a large share of randomly picked frequency pairs, two sounds are clearly audible; you can safely mark them and move on. But in some there is something odd, which makes one take a fresh look at the phenomenon of consonance (I think it comes down to the intervals, but so far this is not as obvious as one would like).
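As for the “reference sound” mentioned earlier, here is a rough sketch of how it could be generated. It assumes the pure tones are spaced linearly between the two frequencies of the test, one per step; the article does not specify the spacing, and all names and the sample rate are illustrative.

```python
import math


def reference_frequencies(f1, f2, steps=7):
    """Frequencies of the pure reference tones, evenly spaced from f1 to f2."""
    return [f1 + (f2 - f1) * i / (steps - 1) for i in range(steps)]


def pure_tone(freq, seconds=1.0, rate=44_100):
    """Samples of a single sine wave; `rate` is an assumed sample rate."""
    return [math.sin(2 * math.pi * freq * t / rate) for t in range(int(seconds * rate))]


def reference_sound(f1, f2, steps=7):
    """One pure tone per step, matching the length of a mixed test."""
    return [pure_tone(f) for f in reference_frequencies(f1, f2, steps)]
```

Playing a mixed test and its reference back to back lets the listener compare each blend against a single “clean” frequency from the same interval.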

In addition, an interface is needed for viewing the shared database and choosing a set of tests for today. The main thing we want to see is whether a given square has already been checked and, if someone found something interesting there, to read in the comments what exactly it was like. Something like this:



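The bookkeeping behind such a view can be sketched as a small in-memory structure. This is only an illustration of the idea, not the project's actual data model; the `Square` and `Results` names and their methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Square:
    """One cell of the shared grid: whether it was listened to and what was reported."""
    checked: bool = False
    comments: list = field(default_factory=list)


class Results:
    """Hypothetical store of results, keyed by an unordered pair of set indices."""

    def __init__(self):
        self._squares = {}

    @staticmethod
    def _key(i, j):
        # The order of the two sets does not matter, so (i, j) and (j, i) share one key.
        return (min(i, j), max(i, j))

    def report(self, i, j, comment=None):
        """Mark a square as checked and optionally attach a listener's comment."""
        square = self._squares.setdefault(self._key(i, j), Square())
        square.checked = True
        if comment:
            square.comments.append(comment)

    def is_checked(self, i, j):
        square = self._squares.get(self._key(i, j))
        return square is not None and square.checked

    def comments(self, i, j):
        square = self._squares.get(self._key(i, j))
        return list(square.comments) if square else []
```

Keying by the sorted pair mirrors the earlier observation that the order of the added waves is not important, so only about half of the full grid needs to be stored.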
A project for this task is slowly being developed on GitHub.

It is in Java, so that there are no portability problems on other platforms later. On Spring, because why not. In IDEA, because it really is convenient. The plan is to organize it as a client-server application, with additional client options in the form of a web application and an Android application (what the screenshots show is done in Swing under Ubuntu for now). If someone, stranger things have happened, wants to help with advice or a pull request, who am I to stand in their way.

Instead of a conclusion, I propose discussing the prospects of the “crowd research” format in the comments. If people successfully raise money to produce a device that many of them want, why not collect individual facts to form an answer to a question that many are interested in? Perhaps this research format deserves wider use?
