Probability Theories: Preparing for an Interview and Solving “Paradoxes”


Every year I participate in about a hundred interviews in JetBrains educational projects : interviewing applicants at the Computer Science Center and ITMO's corporate magistracy (by the way, recruitmentgoes to the program right now). All interviews are arranged according to one pattern: we ask you to solve problems on the spot and ask basic questions on the disciplines that students studied at universities. Most of the questions we ask are quite simple - you need to give a definition of a concept, formulate a property or theorem. Unfortunately, in a significant proportion of students, all these definitions are eroded immediately after exams at universities. It would seem that there is nothing surprising? In the modern world, any definition can be google in a couple of seconds, if necessary. But the inability to restore the basic definition indicates a lack of understanding of the essence of the subject.

If a misunderstanding of algebra or mathematical analysis can have little effect on your life, then a misunderstanding of probability theory makes you an easy target for deception and manipulation. Judgments about the probabilities of various events are so deeply embedded in our daily lives that the ability to reason and distinguish truth from ignorance or manipulation is necessary. In this short review, we will talk about the basic concepts of probability theory, learn how to correctly formulate statements about simple random processes, and analyze a few paradoxes. Part of the material was borrowed from A. Shen 's brochure “Probability: Examples and Tasks” , which I highly recommend for independent study.

Before talking about definitions, we need to agree on where does randomness come from in our world. For example, why do we think coin tossing is a random process? From the point of view of classical physics, which describes processes in the macrocosm, everything is determined, therefore, by the parameters of the coin toss, it is possible to determine unambiguously which side it will fall. However, in practice it turns out that it is impossible to measure and take into account all the forces that actually affect the coin, and therefore the result of this experiment is considered to be random. It is important to understand that this question is not a question of probability theory. Probability theory works with models - for it, a coin in which the eagle and tails fall equally often, and a coin in which there are two times as many eagles as tails are just two different models. The question iswhich of the models is more consistent with the observed reality is a question of our experience (experience shows that the frequency of the eagle and the tails is approximately the same). Thus, the first thing we need to agree on a model.

Definitions


To determine a model that allows us to talk about probabilities, we need to describe a probabilistic space .

The probability space in the simplest final case consists of many elementary outcomes Ω={a1,a2,
,an} and a set of non-negative numbers {p1,p2,
,pn}such that their sum is equal to 1. Quite often, all outcomes are considered equally probable, i.e.p1=p2=⋯=pn. In a more complex infinite case, it is necessary to separately isolate the set of events of interest to us and set the probabilities of events using a function called a probability measure . An event is a set consisting of elementary events, i.e. any subsetΩ. Event probabilityE⊆Ωis denoted by Pr[E], Is the sum of all such pi, what ai∈E. In particular, the probability of an empty eventE=∅ equals zero and events E=Ω equal to 1. In the case when all outcomes are considered equally probable, the probability of an event is simply equal to the ratio of the number of outcomes contained in the event to the total number of elementary outcomes, i.e. Pr[E]=|E|/|Ω|.

The probability of any event is between 0 and 1. If the probability of the event is zero, then this event is called impossible ; if the probability of the event is equal to unity, then such an event is called reliable .

It is important that without determining the probability space it is impossible (in the mathematical sense) to talk about the probability of something.

Comment


Based on the definition of a probability space, it is easy to distinguish between probability theory and statistics: probability theory predicts frequencies based on knowledge of the probability space, and statistics solves the inverse problem - determines the parameters of an unknown probability space based on the observed frequencies.

Example: Coin Flip


We assume that the coin minted is “correct” or “symmetrical”, i.e. it equally often falls out with an eagle and a tails, and never rises on an edge. Then the set of elementary outcomes consists of two elements,Ω={,}. Since we agreed that the coin is “correct”, it is reasonable to assume thatp1=p2=1/2. Now let's list all the possible events and their probabilities.

  1. Neither the eagle nor the tails will fall out. This corresponds to the event.E=∅, Pr[E]=0.
  2. An eagle will fall E={}, Pr[E]=1/2.
  3. Tails will fall out E={}, Pr[E]=1/2.
  4. An eagle or tails will fall E={,}, Pr[E]=1/2+1/2=1.

Example: Dice Flip


As in the case of the coin, we will assume that the playing die falls out all the faces equally often. Then the set of elementary outcomes consists of six elements,Ω={1,2,3,4,5,6}, all their probabilities are equal p1=p2=⋯=p6=1/6. The number of different events in this experiment is64=26(this is the number of all subsets of a set of 6 elements). Surprisingly, the question “how many different events are there in an experiment with a tossing a dice?”, In my observation, baffles 9 out of 10 applicants.
Let's look at some examples of events.

  1. Drop 1 E={1}, Pr[E]=1/6.
  2. There will be a number greater than three, E={4,5,6}, Pr[E]=1/6+1/6+1/6=1/2.
  3. A multiple of three will appear, E={3,6}, Pr[E]=1/6+1/6=1/3.

Example: two coin flips


Under the same assumptions about the "symmetry" of the coin, we define the set of elementary outcomes as the set of ordered pairs

Ω={(,),(,),(,),(,)}.

The symmetry of the coin allows us to conclude that all elementary outcomes are equally probable, i.e. p1=p2=p3=p4=1/4.
Examples of events.

  1. In the first roll, tails E={(,),(,)}, Pr[E]=1/4+1/4=1/2.
  2. At least one tails will fall out E={(,),(,),(,)}, Pr[E]=1/4+1/4+1/4=3/4.
  3. The coin will be dropped twice on one side, E={(,),(,)}, Pr[E]=1/4+1/4=1/2.

Example: selecting a random number from the 2020 calendar


Many elementary outcomes Ω={1,2,
,31}. How to choose probabilities? It depends on how the experiment is arranged. For example, we can tear out a random sheet of a tear-off calendar and see the number on it. The most accurate model describing this experiment would be a probability space with366outcomes where the same numbers of different months differ. And then the probability that the number 1 falls out would be the sum of the probabilities of elementary outcomes corresponding to the first numbers of different months, i.e.12⋅1/366. But for convenience we can consider a simpler set of elementary outcomesΩ with 31 outcomes, but with different probabilities: p1=p2=⋯=p29=12/366, p30=11/366, p31=7/366.

An example of an event: "the drawn day of the month is divided by 10". This corresponds to the event.
E={10,20,30}, Pr[E]=p10+p20+p30=(12+12+11)/366=35/366.

Comment


Once we have determined the probability space (i.e., we have decided on the set Ωand the probabilities that we attribute to elementary outcomes), then the question of the probability of some event becomes purely arithmetic. In other words, as soon as we have chosen some mathematical model, which from our point of view describes the physical process, the probabilities of all events are uniquely determined.

Self-Test Tasks


In each problem, one should first describe the probability space, and only then do the calculations.

  1. : . , .
  2. .
  3. 1 20. , , :
    • ;
    • 3;
    • 2, 3;
    • 2, 3;
    • 9;
    • , 3.

,


Consider the following experiment: toss two coins and look at which sides they fell. One could say that in this problem there are only three outcomes: two tails, two eagles and an eagle and tails. If we assume that all outcomes are equally possible, it turns out that the probability of two eagles falling is equal to 1/3. Mathematics does not prohibit us from considering such a probabilistic space, but experimental verification suggests that in the physical world the answer is rather closer to 1/4. Therefore, you should not assume by default all outcomes are equally likely, otherwise we will get 1/2 in response to a question about the likelihood of a dinosaur meeting.

Probability formula


We say that two events are incompatible if their intersection is equal to an empty set. That is, there are no outcomes that would correspond to both events. Example: the events “an even number fell on a dice” and “one or three fell on a dice” are incompatible.

Incompatible events have the following property. Let beA and B- two incompatible events. The probability that at least one of them will happen is equal to the sum of the probabilitiesA and B, in other words Pr[AâˆȘB]=Pr[A]+Pr[B]event AâˆȘBalso called the sum of the eventsA and B and denote A+B. This property is not executed for arbitrary events. For example, the events “an even number fell on a dice” and “a number more than four fell on a dice” are not incompatible and the sum of their probabilities (5/6) is greater than the probability of their sum (4/6).

Consider the following problem. In the bag are balls of three colors: white, yellow and black. Moreover, it is known that white10% of the total, and yellow - 15%. What is the probability that a randomly drawn ball will be bright? Accurate counting shows that if in a bagN balls, then the event in question corresponds to 0.1N+0.15N=0.25N balls, i.e. 25%of the total number of balls. The events “pulled a white ball” and “pulled a yellow ball” are incompatible, so the probability that the ball will be light is equal to the sum of the probabilities of these events.

Events are called opposite if exactly one of them always happens. From this definition, we can conclude that, firstly, these events are incompatible, and secondly, their total probability is 1. An event opposite to the eventEexpressed as Ω∖E (if all elementary outcomes have a positive probability, then this is the only such event).

Self-Test Challenge


Randomly select a number n from 1 to 100. Consider the following events:

  1. number n evenly;
  2. number n odd;
  3. number n divisible by 4;
  4. number n has a remainder of 2 when divided by 4;
  5. number n has a remainder of 1 when divided by 4.

Which of these events are incompatible? (indicate all pairs)

Inclusion and Exclusion Formula


How to determine the probability of the sum of two events that are not incompatible? Consider the following example. Among school students15% percent know French and 20%know German. The proportion of those who speak both languages ​​in all5%. What is the proportion of students who know at least one of these two languages? If we draw a diagram, if we add up the shares of those who know French and those who know German, then we will count twice those who know both languages. Therefore, the answer:15%+20%−5%=30%.

The same question can also be formulated in the language of probability theory: with what probability does a randomly selected student know at least one of two languages? A similar reasoning leads us to the following formula:

Pr[AâˆȘB]=Pr[A]+Pr[B]−Pr[A∩B],

Where A∩B Is an intersection of events A and B, i.e. this event consists of those elementary outcomes that enter simultaneously inA, and in B(such an event is also called a product of eventsA and B and denote Pr[AB])

Self-Test Challenge


It is known that class students with deuces in algebra make up 25%, and students with deuces in geometry make up 15%. How many students have deuces in algebra and geometry, if students who do not have deuces in any of the subjects make up 70%?

Conditional probability


Again, consider the problem of students and foreign languages. What proportion among students who know German knows French? The answer is easy to figure out by looking at the picture. It is necessary to calculate the ratio of the number of students who know both languages ​​to the number of students who know German, i.e.0.05N0.2N=25%. Turning to the language of probability theory, one can ask the following question: what is the probability that a randomly chosen student knows French, provided that he knows German? Let the eventsA and Bcorrespond to the fact that a randomly selected student knows French and German, respectively. Then the desired probability is called the conditional probability of occurrenceA provided B and is designated Pr[A∣B]. By analogy, we obtain the following formula for conditional probability:

Pr[A∣B]=Pr[A∩B]Pr[B].

What is the probability that a randomly chosen student knows German, provided that he knows French?

From the conditional probability formula, we can obtain a formula for the probability of the product of two events.

Pr[A∩B]=Pr[B]⋅Pr[A∣B].

In words: to find the likelihood that both events will occur A and B, you need to multiply the probability of the event B on the conditional probability of an event A with known B.

Self-Test Challenge


In a class of 50% of boys; 60% of boys love ice cream. What is the proportion of boys who love ice cream among class students? How to reformulate this in the language of probability theory?

Independence


Consider an experiment with throwing two dice: red and blue. There are 36 outcomes in this experiment that we consider equally possible. The likelihood that a triple will fall on a red dice is equal to1/6 (6 out of 36 outcomes), the probability that a triple will fall on a blue cube is also equal 1/6. What is the likelihood that a three will fall on a blue cube, provided that a three has fallen on red? Using the conditional probability formula, you need to calculate the ratio of the probability of a triple on both cubes to the probability of a triple on red. We get1/361/6=1/6. Note that the presence of information that a triple has fallen on a red cube does not affect the probability of a triple falling out on blue. Such events will be called independent . We will say that the eventsA and B independent if

Pr[A∣B]=Pr[A].

(This definition assumes that both probabilities of events A and Bstrictly greater than zero.)

An alternative definition can be obtained by using the definition of conditional probability: two events are called independent if the probability of their product is equal to the product of their probabilities.

Pr[AB]=Pr[A]⋅Pr[B].



Self-Test Tasks


  1. Are the “know German” and “know French” events independent?
  2. Throw one dice. Are events independent:
    1. “Even fell out” and “odd fell out”,
    2. “Even fell out” and “2 fell out”,
    3. "Even fell out" and "a multiple of three fell out."

The next step is a conversation about the Bayes formula, which is derived from the definition of conditional probability. Rewrite the definition:

P[B∣A]=P[A∩B]P[A]⇒P[A∩B]=P[B∣A]⋅P[A].


And substituting this into the definition, we get the Bayes formula

P[A∣B]=P[A∩B]P[B]=P[B∣A]⋅P[A]P[B],


which allows you to swap the event and condition under the sign of probability. I think that about applying the Bayes formula you need to write a separate post, for example, one .

We will end here with definitions and before moving on to paradoxes, let's discuss, and in which cases we can talk about probability.

When can we talk about probability?


I propose to consider several issues that illustrate the importance of wording.

What is the likelihood that when walking along the street you will meet a dinosaur?

I think it is clear to everyone that this is not 1/2. But still, how to answer this question correctly? The problem with this question is that it is formulated incorrectly - it is impossible to unambiguously determine the probability space from it, and therefore it is impossible to talk about probability. You can suggest some other wording of the question, in which it will be obvious. For example, starting tomorrow, on every street in the city, every minute with a probability of 0.00001, a dinosaur materializes and exists for an hour without leaving anywhere. In this formulation, the random process is understandable and you can assess the likelihood of a meeting if you determine how the walk is arranged, how long it takes and how many streets it touches.


You tossed a coin and without peeping covered it with your hand. What is the likelihood that the coin is turned eagle up?

I'd like to say that in this case certainly the probability is 1/2. However, strictly speaking, there is no longer any random process. The coin has already fallen to one side. Just because you don’t know something does not mean that it is something random. For example, if you do not know the solution to the equation, this does not mean that any number can be its solution with equal probability. Therefore, in this case, the probability space cannot be described. You can reformulate the question, for example, like this: "What is the probability that you will guess the side of the coin, if at random it is equally probable to choose an eagle or tails?". In this formulation, it is already clear what is a random process (choosing an eagle or tails), how to determine the probability space and get the answer 1/2. Moreover, in such a wording, it is completely unimportant whether the coin was “honest” or not.

Comment. Our confidence in something can also be described in terms of probability theory - this is done in the framework of the Bayesian interpretation of probability theory. This interpretation allows us to use the apparatus of probability theory to assess our confidence in the truth of certain statements (not necessarily random) based on the information that we know. However, it is worth noting that in this case the concept of probability becomes subjective - the same event from the point of view of different observers may have different probabilities. For example, in poker you can consider the probability of a spades falling as positive (since you do not see her on the table and in your hand), and your opponent, who already has a spades in your hand, will evaluate the probability of her falling as zero. In this case, one can also come up with an option in which both estimates turn out to be different from the “real”, objective, probability. There is no contradiction, because these are three different sizes (players have different information,and the objective probability in this case corresponds to complete information).

You woke up in the morning. What is the probability that today is Sunday?

I think that you already understood that the 1/7 answer is incorrect, or rather, the question is incorrect. It is not clear what is a random process. In order to get 1/7 you need to clarify the question, for example, like this: you fall asleep on Sunday evening and randomly wake up any morning next week, what is the probability that you will wake up on Sunday? But even with this clarification, if you ask about the day of the week after you woke up (after a random choice was made), such a question will remain incorrect - otherwise you have to assume that you are in a superposition of all days of the week before until you look at the calendar.


I wrote some (specific) number on the board and claim that I have successfully tested it for simplicity twice with a probabilistic algorithm, which is mistaken with a probability of less than 1%. How likely is this number to be prime?

I would like to say that this number is prime with a probability of more than 99.99%. However, from a mathematical point of view, a number can be either simple or not. Therefore, it is incorrect to say so. After the algorithm has completed work, there is no longer anything random in this formulation of the problem, therefore, there is no probability either. It would be right to say that you are 99.99% sure that this number is prime, but you can only state this if you trust me 100% :)

Paradoxes


In this section, we will try to parse several well-known "paradoxes" of probability theory and understand that they either have no contradictions or the questions are posed incorrectly.

The Monty Hall Paradox


This is a very famous paradox . Many copies were broken about him, including even eminent mathematicians gave the wrong answer.
Imagine that you became a participant in a game in which you need to choose one of three doors. There is a car behind one of the doors, goats behind two other doors. You select one of the doors, for example, number 1, after which the host, who knows where the car is and where the goats are, opens one of the remaining doors, for example, number 3, behind which the goat is located. After that, he asks you if you would like to change your choice and choose door number 2? Will your chances of winning a car increase if you accept the host’s offer and change your choice?

As Wikipedia suggests , in order for the task to be defined correctly, we need to clarify that the participant in the game knows the following rules in advance:

  1. the car is equally likely to be placed behind any of the three doors;
  2. the presenter knows where the car is;
  3. in any case, the leader must open the door with the goat (but not the one chosen by the player) and invite the player to change the choice;
  4. if the presenter has a choice which of two doors to open, he chooses any of them with the same probability.

If you are not familiar with this paradox, then I suggest you think for a few minutes about what the correct answer will be.


In order to answer the asked question, let's figure out what is a random process here. The clarification shows that the random process is mentioned only in paragraphs 1 and 4: “the car is equally likely to be located behind any of the three doors” and “if the presenter has a choice which of the two doors to open, he chooses either of them with the same probability”. The question we must learn to answer is: “Will your chances of winning a car increase if you accept the host’s offer and change your choice?” Those. we are asked about which of the two strategies gives a greater probability of winning. I note that condition number 4 does not affect the fact of winning the player, so it makes no sense to include it in the probability space. Therefore, it is proposed to choose a probability space with many elementary outcomes.Ω={1,2,3}corresponding to the door number behind which the car is located and the probabilities p1=p2=p3=1/3. Now consider two strategies of the player: “leave the selected door”, we denoteS1, and “change the door”, we denote S2.

We don’t know how the player chooses the first door, but we don’t need to know. It is enough to check how the strategy works in all first-door elections. Denote byd the door that the player chose initially, and through x- the door behind which the car is hidden. Then for anyd∈{1,2,3} event “player won by using strategy S1"Corresponds to the fact that he guessed the right door on the first try. Formally speaking, we are interested in the event.E1={d}, i.e. x=d, and its probability 1/3. Event “player won using strategyS2»Corresponds to the opposite event E2=Ω∖{d}, i.e. x≠d, and its probability 2/3. It remains to be noted once again that if this analysis is true for any choicedTherefore, it is true with any strategy for choosing the first door. In addition, we note that we did not use condition 4 in any way.

As you can see, there are no ambiguities here, this task is called a paradox only because the answer may not correspond to intuition. But this happens quite often in mathematics.

The paradox of a boy and a girl


I quote Wikipedia .
The task was first formulated in 1959 when Martin Gardner published one of the earliest versions of this paradox in Scientific American, entitled “The Two Children Problem”, where he cited the following statement:

  • Mr. Jones has two children. The eldest child is a girl. What is the likelihood that both children are girls?
  • Mr. Smith has two children. At least one child is a boy. What is the likelihood that both children are boys?

Gardner himself initially answered 1/2 and 1/3respectively, but subsequently realized that the situation in the second case is ambiguous. The answer to the second question may be1/2 depending on how it was found out that one of the children is a boy.

Probabilistic space given Ω={,,,} and all probabilities are equal 1/4. In the first case, we know that the event is completedE={,}. Therefore, subject toEthe probability of two girls is 1/2.

In the second case, everything is more complicated, because it’s not clear how we learned that Mr. Smith has one of the children’s boys. Two options can be suggested:

  1. A random person with two children is selected and asked if there is a boy among his children. Then the probability of two boys is 1/3, because this corresponds to the probability of MM subject to the eventE={,,}.
  2. A random person with two children is selected, his random child (senior or youngest) is selected and his gender is asked. This experiment corresponds to another probabilistic space in which one must still take into account the choice of the child about whom they are asked. It will have 8 elementary outcomes, and four of them will suit us (MM was asked about the eldest, MM was asked about the younger, MD was asked about the older, DM was asked about the younger). Two outcomes suit us, so the answer is 1/2.

The Paradox of Sleeping Beauty


The discussion of this paradox is motivated by this post on the Habré , which caused widespread discussion, but there is a description of this paradox on Wikipedia .
The test subject (Sleeping Beauty) is given an injection of sleeping pills. A symmetric coin is thrown. In the case of an eagle’s loss, it is woken up, and the experiment ends there. In the case of tails, they wake her up, give her a second injection (after which she forgets about the wake up) and wake up the next day without throwing coins (in this case, the experiment lasts two days in a row). The whole procedure is known to Beauty, but she does not have information on which day she was woken up.

Imagine yourself in the place of Sleeping Beauty. You woke up. What is the likelihood that a coin has gone tails?

It is proposed to consider two alternative solutions with different results.


Solution 1


You do not have any information about the result of a coin loss and previous awakenings. Since it is known that the coin is fair, it can be assumed that the probability of tails1/2.

Decision 2


Let's do the experiment 1000 times. Sleeping Beauty is woken up on average 500 times with an eagle and 1000 times with a tails (because when tails fall, the Sleeping Beauty is asked 2 times). Therefore, the probability of tails2/3.

It seems that both decisions can claim to be right. However, in an attempt to determine the probability space, serious difficulties await us. What is a random process? The fact is that when the Sleeping Beauty wakes up, there is no longer any random process. The choice has already been made. She does not know the result of this choice, but nothing accidental is already there. This brings us back to the dinosaur example. If you do not know if there is a dinosaur around the corner, then this does not mean that it is there with a probability of 1/2. Therefore, “Decision 1” does not answer the question about probability, but the question about the degree of confidence of the Sleeping Beauty. And “Solution 2” suggests considering a completely different experiment, in which a completely different question is being asked, which is proposed to be answered by an external observer before the experiment begins.

In order to give this question a mathematical meaning and get the desired answer 2/3, you will have to use some philosophical device, such as “sharing souls”. For example, like this: you go into the soul relocation apparatus, after which a coin is thrown for the Sleeping Beauty, which creates two parallel universes: one where the coin fell by an eagle, and the other where it falls by tails. In total, in the space-time of these two alternative universes, there are three different awakenings of the Sleeping Beauty. The Soul Relocation Apparatus with a 1/3 probability injects your soul into the body of the Sleeping Beauty shortly before one of these awakenings. What is the probability that you wake up in a parallel universe where the tails fell out?

As you can see, in order to give a mathematical meaning to this question, one has to fantasize well, but this is not done by mathematicians, but by philosophers (more in this post ). To say that “both solutions are correct” is incorrect from a mathematical point of view.

Self-Test Challenge


Explain why in the problem about the children of a sailor, with which this post begins , the question is posed incorrectly (i.e., neither 1/2 nor 1/3 are the correct answer).

Endless case


When we move on to the infinite case, i.e. If we consider experiments with an infinite number of elementary outcomes, then everything becomes much more complicated. I will not go into details and will not even determine the probability space for an infinite case, because this requires more complex math. However, to illustrate, I note that in the infinite case there can be such (bad) sets of elementary outcomes that do not have probability (immeasurable sets). Moreover, for all good (measurable) events, the probability is uniquely determined. Therefore, those “paradoxes” that arise in the infinite case also arise due to the ambiguity of the choice of probability space. A good case in point is the Bertrand paradox, showing how seemingly equivalent (actually not) probabilistic spaces lead to different results.

Instead of a conclusion


Even if you are not going to go anywhere or get interviewed for technical positions in an IT company, you may still want to refresh your knowledge of mathematics, which can be useful in programming. I can advise the online course of the CS center on probability theory , which is read by A.I. The brave ones.


BONUS


I invite everyone to listen to a lecture by Alexander Shen "Generators of" random numbers: theory and practice " this Sunday April 26 at 14:00 at the Computer Science Club . The lecture will be given in zoom, to participate you need to sign up for a course or subscribe to the newsletter.


All Articles