Empirical probability

(Image: a frame from the Monty Hall TV show. The guest failed to calculate the probabilities correctly, so he won a surprised llama instead of the car.)

Let's discuss what we mean when we say the word "probability". I ask you to try to answer this question not as a student or a "pure" mathematician would, but as it should be understood by an engineer, an applied researcher, or anyone else who has to make decisions on the basis of empirical data.

Naive approach


Personally, I understand a statement such as “a symmetric coin lands heads up with 50% probability” as follows:

“If you toss a coin many times, then in about half of the cases it will land heads up.”

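More precisely, I usually use the simplified "six-sigma" rule (three standard deviations on either side of the mean), according to which, in a series of, say, 100 tosses, the number of heads will be given by the formula:

$$100 \cdot \frac{1}{2} \pm 3\sqrt{100 \cdot \frac{1}{2}\left(1 - \frac{1}{2}\right)} = 50 \pm 15$$

that is, it will lie between 35 and 65.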
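The arithmetic behind this interval can be checked with a minimal sketch. The function name `three_sigma_interval` is my own for illustration, and the factor of 3 is the assumption that makes the bounds come out at 35 and 65.

```python
import math

def three_sigma_interval(n: int, p: float) -> tuple[float, float]:
    """Mean plus/minus three standard deviations for the number of heads in n tosses."""
    mean = n * p                         # expected number of heads
    sigma = math.sqrt(n * p * (1 - p))   # standard deviation of that number
    return mean - 3 * sigma, mean + 3 * sigma

low, high = three_sigma_interval(100, 0.5)
print(low, high)  # 35.0 65.0
```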

Without a doubt, my statement contains a logical error: theoretically, the number of heads in the experiment can turn out to be less than 35 or more than 65. However, if in practice the number of heads in the first hundred tosses really does fall outside these bounds, I will be very surprised.

Academic Science Perspective


Contradictions and mistakes are not very good, even if they appear rarely. Perhaps there is a better way to give meaning to the concept of probability, one that is free of logical errors and does not contradict experience? Let us turn to exact science for advice and try to recall a university course!

If we restrict ourselves to cases where the experiment has only a finite number of possible outcomes, then according to traditional university courses the concept of probability reduces to assigning each such outcome a certain non-negative weight, with the additional requirement that the sum of all weights be equal to one.
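As a minimal sketch of this definition, the two requirements can be written as a check on a table of weights. The names `is_valid_weighting` and `event_weight`, and the sample tables, are my own illustrations rather than anything from a standard course.

```python
from fractions import Fraction

def is_valid_weighting(weights: dict[str, Fraction]) -> bool:
    """Check the two requirements: non-negative weights that sum to one."""
    return all(w >= 0 for w in weights.values()) and sum(weights.values()) == 1

def event_weight(weights: dict[str, Fraction], event: set[str]) -> Fraction:
    """Weight of an event: the sum of the weights of the outcomes it contains."""
    return sum((weights[outcome] for outcome in event), Fraction(0))

symmetric = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}
broken = {"heads": Fraction(2, 3), "tails": Fraction(2, 3)}  # sums to 4/3

print(is_valid_weighting(symmetric))  # True
print(is_valid_weighting(broken))     # False
print(event_weight(symmetric, {"heads", "tails"}))  # 1
```

Exact fractions are used instead of floats so that the "sums to one" test is not disturbed by rounding.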

Presented in this form, the theory of probability is indeed free from contradictions (it has a model) and allows one to formally prove many interesting results, such as the Law of Large Numbers or the Central Limit Theorem. However, for the experimenter, all these results remain purely formal and have no meaning until he answers the following questions:

  1. How to choose the right weight for the outcome of a particular experiment?
  2. If the weights are assigned incorrectly, can this be understood from observations?
  3. If the weights are assigned correctly, what predictions can be made regarding future experiments?

Abstract theories


At this point I would like to stop and make a small remark about abstract theories in their modern sense. According to "pure" mathematicians, to create an abstract (first-order) theory you just have to do three things:

  • Reserve words (character strings) that will denote formal variables
  • Reserve words that will denote formal relations (unary, binary, ternary, ...) between formal variables
  • Using formal relations between formal variables as atomic statements, write out any number of logical formulas that will serve as the formal axioms of your abstract theory


Let me give you a simple example.

We reserve all lowercase letters of the Latin alphabet as the names of formal variables.

We reserve two words, “is_line” and “is_point”, for the unary formal relations, and two more words, “belongs_to” and “coincides_with”, for the binary relations of our theory.

As axioms, we take the following logical statements:

i) For all a, b: if [a is_line] and [b is_line] and not [a coincides_with b], then there exists d such that: [d is_point] and [d belongs_to a] and [d belongs_to b] and (for all c: if [c is_point] and [c belongs_to a] and [c belongs_to b], then [c coincides_with d])

ii) For all a, b: if [a is_point] and [b is_point] and not [a coincides_with b], then there exists d such that: [d is_line] and [a belongs_to d] and [b belongs_to d] and (for all c: if [c is_line] and [a belongs_to c] and [b belongs_to c], then [c coincides_with d])

(Image: parallel lines intersect. Illustration taken from robinurton.com)

For the sake of readability, I have enclosed atomic statements in square brackets. If you have studied projective geometry, you probably recognized in this example the axioms of an abstract projective plane. In plain language, axiom i) says that any two distinct straight lines intersect at exactly one point, and axiom ii) says that exactly one straight line passes through any two distinct points.

It is worth recalling here that formal variables and formal relationships are just sequences of printed or handwritten characters. When you create an abstract theory, it is not even necessary to assume that formal variables in reality can mean some things, and formal relations are real relations between these things. Thus, any meaning in formal statements is initially absent.

Using formal relations between formal variables as atomic formulas, you can construct other formal logical statements in addition to the axioms. If such a statement can be deduced from the axioms of the theory according to the rules of symbolic logic, then it is a (formal) theorem of this theory. Just like formal axioms, formal theorems initially carry no meaning and express no properties of the world around us.

Why then are abstract theories created?

Model and Interpretation


Take some sentence from our everyday speech, for example: "A black cat sits on a window." The same sentence could be written differently: "There exist x and y such that: [x is_cat] and [x has_black_color] and [y is_window] and [x sits_on y]".

As you can see, our playful sentence in its second form bears some resemblance to formal logical statements. However, there is an important difference between them. While the formal variables and formal relations that make up formal statements mean nothing, the variables x and y in the last example denote empirical objects, a specific cat and a specific window, and each of the relations “is_cat”, “is_window”, “has_black_color” and “sits_on” refers to a well-defined individual or mutual empirical quality of these objects.

By “empirical” I mean any concept that can be defined solely in terms of empirical data and for which, in addition, there is an algorithm that determines from experimental data whether it is present or not. All concepts used in macroscopic physics, such as length, mass, current strength, or amount of energy, are empirical, while the concepts of “god” and “truth” are not regarded as such at the moment.

Variables denoting empirical objects and relations naming empirical properties may reasonably be called material. Thus, if all atomic statements of a logical formula are material relations between material variables, then all these atomic statements, and the logical formula as a whole, become meaningful: they acquire a sense and a value. Their sense is the assertion of a certain property of the surrounding world, and their value is either true or false.

The simplest way to make sure that a meaningful logical statement is true is a repeated experiment or long-term observation of the world. For example, to accept the statement “You cannot fit an elephant into a matchbox” as true, you just have to try to push it in there many times.

Being naturally intelligent creatures, people quickly realized that checking every statement empirically takes a long time and is not always safe. So they soon discovered another way: it turned out that by performing certain manipulations on sets of true statements, one can obtain many new logical statements, all of which magically turn out to be true as well.

The big surprise was that the kind of manipulations in question and the rules for applying them in no way required knowledge of the meaning of the statements, but relied only on the way their logical formulas are written. For example, whatever the meaningful statements A and B may be, if the statements “A” and “If A, then B” are both true, then the statement “B” also turns out to be true.
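That this particular rule never leads from truth to falsehood can be confirmed by brute force over all truth values. This is a minimal sketch, and the helper `implies` is my own name for the usual material implication.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Material implication: "If a, then b"."""
    return (not a) or b

# Whenever "A" and "If A, then B" are both true, "B" must be true as well.
modus_ponens_is_sound = all(
    b
    for a, b in product([False, True], repeat=2)
    if a and implies(a, b)
)
print(modus_ponens_is_sound)  # True
```

Note that the check never looks at what A and B mean, only at their truth values.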

So, to understand whether a statement is true, it is no longer necessary to know its meaning. As a result, now anyone can take an arbitrary list of logical formulas, and considering them conditionally “true” (in other words, formal axioms), with the help of a certain set of manipulations (formal rules of inference) get other, conditionally “true” logical formulas.

The benefit of such seemingly meaningless exercises appears only when another person, one who deals with experiments, decides for some reason to use the formal variables and formal relations as names for real objects and their mutual empirical properties. Such a decision in itself means that the formal theory has received a meaningful interpretation, and every statement in its language becomes meaningful and acquires a sense.

If a theory is interpreted in such a way that all its axioms turn out to be true, then all its theorems will be true as well; such an interpretation is considered consistent for the theory and serves as its (material) model.

Examples
Let us return to the abstract theory of the projective plane and in three ways "breathe" meaning into it.

  1. . :
    «_» ;
    «_» — , , ;
    «» — ;
    «» «» «» — .
  2. .
    «» , - ;
    «» — , , , ;
    «» «» — , , .
  3. : , .
    «». , ;
    «» — , ;
    «» — ( );
    «» — .

Interpretation 1) is not a model. Indeed, on a flat sheet of Whatman paper some lines will be parallel and will not intersect, however large the sheet may be. The remaining two interpretations serve as models of the projective plane.
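Whether a given interpretation satisfies axioms i) and ii) can be checked mechanically when the structure is finite. The sketch below is my own illustration and is not one of the three interpretations above: it assumes that points are arbitrary labels, that lines are sets of points, that "belongs_to" means set membership and that "coincides_with" means equality. It compares the seven-point Fano plane with a tiny four-point "flat" plane in which parallel lines exist.

```python
from itertools import combinations

def satisfies_axiom_i(lines) -> bool:
    """Axiom i): any two distinct lines share exactly one point."""
    return all(len(a & b) == 1 for a, b in combinations(lines, 2))

def satisfies_axiom_ii(points, lines) -> bool:
    """Axiom ii): any two distinct points lie on exactly one common line."""
    return all(
        sum(1 for line in lines if p in line and q in line) == 1
        for p, q in combinations(points, 2)
    )

def is_model(points, lines) -> bool:
    return satisfies_axiom_i(lines) and satisfies_axiom_ii(points, lines)

# Fano plane: 7 points, 7 lines of 3 points each.
fano_points = range(1, 8)
fano_lines = [frozenset(s) for s in (
    (1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6), (2, 5, 7), (3, 4, 7), (3, 5, 6),
)]

# A four-point "flat" plane: every pair of points is a line, so {A, B} and
# {C, D} never meet, just like parallel lines on a sheet of paper.
flat_points = "ABCD"
flat_lines = [frozenset(pair) for pair in combinations(flat_points, 2)]

print(is_model(fano_points, fano_lines))  # True
print(is_model(flat_points, flat_lines))  # False
```

The flat plane still satisfies axiom ii); it is axiom i) alone that the parallel lines violate.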

Error checking


What happens if an experimenter, trying to explain his observations, selects the “wrong” theory? As a rule, in such cases, the experimenter will quickly discover a discrepancy between what the theory predicts and what actually happens.

(Image: when something is wrong with your model of the world)

Take, for example, a land surveyor. As long as he deals with small flat plots, the accuracy of his measuring tools does not allow him to detect violations of any axioms or theorems of Euclidean geometry. However, as soon as the surveyor takes on work on a planetary scale, he finds straight lines that intersect each other twice, the sum of the angles in large triangles changes, and the circumference of a circle ceases to be equal to 2πr. The discrepancy between the predictions and the experimental data should force the surveyor to take some other geometry as a model.
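The size of the surveyor's discrepancy can be estimated with a short sketch. It assumes the Earth is a perfect sphere with a mean radius of 6371 km (my assumption, not a figure from the text) and uses the standard formula for a circle drawn on a sphere, C = 2πR·sin(r/R), where r is measured along the surface.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a perfectly spherical Earth

def circumference_on_sphere(r_km: float, sphere_radius_km: float = EARTH_RADIUS_KM) -> float:
    """Circumference of a circle whose radius r_km is measured along the surface."""
    return 2 * math.pi * sphere_radius_km * math.sin(r_km / sphere_radius_km)

for r in (1.0, 100.0, 5000.0):
    ratio = circumference_on_sphere(r) / (2 * math.pi * r)
    print(f"r = {r:7.1f} km: measured / Euclidean = {ratio:.6f}")
```

For a one-kilometer plot the ratio is indistinguishable from 1 at any realistic measuring accuracy, while for a radius of thousands of kilometers the Euclidean prediction is off by a noticeable fraction.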

Another example is a physicist. As long as his observations concern slowly moving bodies, he can safely apply the Galilean rule for adding velocities and Newtonian dynamics: within the required accuracy, the theoretical predictions will coincide with the experimental results. However, if the physicist tries to apply the same (essentially abstract) theories to predict the trajectory of an electron in a particle accelerator, he will suffer a crushing fiasco: the laws of the Lorentzian world apply here.
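A minimal sketch of the physicist's situation, restricted to velocities along one line, compares the Galilean sum u + v with the relativistic composition (u + v) / (1 + uv/c²). The function names and the sample speeds are my own illustrative choices.

```python
C = 299_792_458.0  # speed of light in m/s

def galilean_sum(u: float, v: float) -> float:
    """Galilean rule: velocities simply add."""
    return u + v

def lorentz_sum(u: float, v: float) -> float:
    """Relativistic composition of two collinear velocities."""
    return (u + v) / (1 + u * v / C**2)

# Slow bodies: the two rules agree far beyond any measuring accuracy.
print(galilean_sum(10.0, 20.0), lorentz_sum(10.0, 20.0))

# Fast bodies: the Galilean prediction exceeds c, the relativistic one does not.
print(galilean_sum(0.9 * C, 0.9 * C) / C, lorentz_sum(0.9 * C, 0.9 * C) / C)
```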

Responding with a contradiction to inappropriate use is a “gentlemanly” trait of almost all natural-science theories. If they did not possess it, then, as you will see later, experimenters could draw valid but conflicting conclusions from the same empirical data.

So, back to our main topic. Picture three mathematicians who asked a random passerby to toss one of his coins a hundred times in a row.

The first mathematician assumed that the coin would be described by the theory of Bernoulli trials with weights of 1/2 for both heads and tails. The second had once read that minting technology breaks the symmetry of coins, so he chose a theory of Bernoulli trials in which tails has a weight of 1/3 and heads a weight of 2/3. The third mathematician was fond of philosophy and, for the sake of an existential experiment, assigned a weight of 1 to heads and 0 to tails. Thus each of the three mathematicians had chosen an abstract theory through which he was going to view the result.

In forty-seven out of a hundred tosses, the coin landed heads up.

The first mathematician exclaimed that the result deviates from the mean he had calculated by less than “three sigma”, and that there is no contradiction between his interpretation and experience.

The second mathematician exclaimed that the result deviates from the mean he had calculated by more than “three sigma”, that the total weight of such outcomes is less than 5/1000, and that there is no contradiction between his interpretation and experience.

The philosopher exclaimed that, according to his calculations, the weight of the sequence obtained in the experiment is zero, that the total weight of all sequences containing at least one tails is also zero, and that there is no contradiction between his interpretation and experience.

Apparently, we have to admit that each of the mathematicians is right. What, then, is the meaning of the assigned weights?
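The three calculations can be reproduced with exact arithmetic. This is a minimal sketch under my own reading of the story: "such outcomes" is taken to mean all head counts lying more than three sigma from the mean on either side, and the names `pmf` and `check` are hypothetical.

```python
from fractions import Fraction
from math import comb, sqrt

N, OBSERVED = 100, 47  # number of tosses and observed number of heads

def pmf(k: int, p: Fraction) -> Fraction:
    """Total weight of all sequences with exactly k heads in N tosses."""
    return comb(N, k) * p**k * (1 - p) ** (N - k)

def check(p: Fraction) -> tuple[float, float, float]:
    """Return (deviation of OBSERVED from the mean, three sigma, weight beyond three sigma)."""
    mean = N * p
    three_sigma = 3 * sqrt(N * p * (1 - p))
    tail = sum(pmf(k, p) for k in range(N + 1) if abs(k - mean) > three_sigma)
    return float(abs(OBSERVED - mean)), three_sigma, float(tail)

theories = [("first", Fraction(1, 2)), ("second", Fraction(2, 3)), ("philosopher", Fraction(1))]
for name, p in theories:
    deviation, three_sigma, tail = check(p)
    print(f"{name}: deviation = {deviation:.2f}, 3 sigma = {three_sigma:.2f}, "
          f"weight beyond 3 sigma = {tail:.5f}")

# Weight the philosopher's theory assigns to any outcome with 47 heads.
print(pmf(OBSERVED, Fraction(1)))  # 0
```

Every one of the three theories assigns the observed result some weight, possibly tiny or zero, and none of them is thereby contradicted in the logical sense.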

Evidence


As already mentioned, by choosing a suitable theory and constructing its interpretation, the researcher gains the opportunity to prove hypotheses using the formal derivation procedure alone. Confidence in the truth of statements derived from the axioms is determined only by trust in the truth of the axioms themselves in their interpreted sense.

The use of deductive methods does not prohibit looking for patterns directly in the data and trying to justify them experimentally. Moreover, these two approaches are not equivalent: the fact that a hypothesis has an experimental justification does not mean that it can be proved formally, and vice versa. For example, from personal experience I am almost sure that all crows are black, and thanks to the theorems of geometry, that the area of a circle with a radius of one kilometer is π square kilometers. At the same time, I have no theory to formally prove the first statement, and no experience to experimentally substantiate the second.

In cases where a hypothesis about an empirical regularity both has an experimental justification and can be formally proved within the framework of the accepted theory, the regularity is said to have received a theoretical explanation. For example, the pattern discovered by Kepler in the shapes of the orbits of celestial bodies has a theoretical explanation in the framework of the Newtonian theory of gravity.

If you think about it, any pattern is a restriction on the possible results of observations: a crow can only be black, the area of a circle cannot be much larger or smaller than πr², and planets cannot move along anything other than an ellipse.

It should also be intuitively clear that formal inference methods must not introduce any restrictions beyond those imposed by the interpreted content of the axioms. Indeed, had it been otherwise, a situation could arise in which the axioms are “true” and one of the theorems contradicts the observations.

In fact, the interpreted statements of theorems are just convenient reformulations of the combined restrictions of the axioms, applied to some particular set of circumstances. For example, the ellipticity of orbits is a consequence of the law of gravity and Newton's three laws of dynamics in circumstances where one of the two celestial bodies is heavy and “motionless”, while the other is light and does not move too “fast”.

The conclusion of this section is the following statement: “The restrictions imposed by the axioms of a theory must, taken together, be no weaker than the restrictions imposed by the empirical laws that the experimenter intends to explain using this theory.”

The Naked King


“In the capital of this king, life was very cheerful; foreign guests came almost every day, and one day two swindlers appeared. They pretended to be weavers and said that they could make a fabric so wonderful that nothing better could be imagined: apart from its unusually beautiful pattern and colors, it also had the amazing property of becoming invisible to any person who was unfit for his office or hopelessly stupid.”
Hans Christian Andersen, “The Emperor's New Clothes”


(French students demand a new philosophy of science. Source: salamancartvaldia.es)

Let's return to the theory of probability and the three mathematicians with a coin.

What do you think: if the mathematicians repeat their experiment many times, will they discover any empirical laws? In other words, will they be able to reasonably conclude that sequences of a certain type cannot be observed in their experiments?

And the second question: if there are empirical laws, then which of them can be explained in the framework of the generally accepted probability theory?

I'm afraid to disappoint you, but the answer to the second question is extremely simple: "None."

Indeed, all that the axioms of probability require, in their interpreted sense, is that the weights assigned to heads and tails be non-negative and sum to one. Once this requirement is met, any sequence of heads and tails is admissible in the observations, since it does not change the assigned weights and therefore creates no contradiction with the axioms. Hence the conclusion: in their interpreted sense, the axioms of probability theory impose no restrictions whatsoever on the possible results of observations and therefore, in a strict logical sense, cannot explain any patterns in the data.

As for the existence of empirical laws, two opinions are possible here.

On the one hand, if the coin is not made with any special tricks, then on each toss it can land either heads up or tails up, so the experiment can end with any sequence of them, which means that empirical laws, in the strict sense of the term, do not exist.

On the other hand, even someone who devotes a whole life to experiments with a symmetric coin is unlikely to see a single series of 100 tosses containing no more than 10 heads (in a single series, the chances are less than 1 in 10^15). This means that the experimenter may, with a clear conscience, accept the statement “In a series of 100 tosses, a symmetric coin will land heads up at least 11 times” as a well-founded empirical regularity.
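The bound quoted in parentheses can be verified exactly. In this minimal sketch the function name `prob_at_most` is my own, and the coin is assumed to be perfectly symmetric with independent tosses.

```python
from fractions import Fraction
from math import comb

def prob_at_most(k_max: int, n: int = 100) -> Fraction:
    """Exact weight of all series of n symmetric tosses with at most k_max heads."""
    favorable = sum(comb(n, k) for k in range(k_max + 1))
    return Fraction(favorable, 2**n)

p = prob_at_most(10)
print(float(p))                    # on the order of 1e-17
print(p < Fraction(1, 10**15))     # True
```

Exact integer arithmetic is used here because the quantity is far too small to trust to a rough floating-point estimate of each term.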

Here we clearly arrive at a contradiction between the philosophy of science and common sense. Which of the two should we follow?

When it comes to specific decisions, we have to act categorically: attack or defend, operate or continue medical treatment, make a deal or refuse the offer. In such circumstances you cannot use probability theory at all without first committing errors in its interpretation. In some cases unlikely events have to be treated as impossible; in others, probability has to be replaced with frequency, or the mathematical expectation has to be thought of as the average over a finite series of experiments.

The reason for this strange situation should hardly be sought in defects of abstract probability theory: there is every reason to believe that this mathematical discipline is consistent. The point is rather that any theory based on the philosophy of an unambiguous “Yes” and “No”, of absolute “Truth” and “Objective reality”, is unlikely to match our intuitive understanding of what “probability” is and how to measure it. It is not even certain that this concept is real rather than a simplification of some concept not yet discovered (as once happened with the “celestial sphere” or the “aether wind”).

If a theory is not fully developed and its interpretations are often contradictory, is it worth putting into practice? In cases where the result does not differ too much from common sense, it probably is! For example, Leibniz, Euler, Lagrange, Fourier and many of their contemporaries successfully used the “analysis of infinitesimals” long before anyone managed to create any theory of real numbers.
Do not take science too strictly!

As a belated April Fool’s joke.
Sergey Kovalenko.

2020
magnolia@bk.ru


(author: Alexas_Fotos)
