Probability

From Jonathan Gardner's Tech Wiki

Abstract

These are some notes on probability theory.

Introduction

Probability is the science and math of studying what is possible and how likely it is to occur. When we say, "There is a 10% chance you will roll a 1", we mean, "If you roll the dice a lot of times (like a thousand or more), then about 10% of those rolls will be a 1."

Probability is terrible at predicting what a single event will be. That is, if you are playing a dice game, and you are betting your life's fortune on an event, there is a good chance you will lose your money. That's why gambling is always a bad idea.

Probability is very good at predicting what a lot of events will look like. If I say that there is a one in a million chance you will get in a car accident every time you drive, then you should feel safe driving around a thousand times, since it is unlikely you will see any car accidents. If I say that there is a one in a hundred chance you will get in a car accident every time you drive, then you should really limit your driving to fewer than ten times; otherwise you might see a car accident. If I look at a population of a million people who drive every day, each with a one in a million chance of getting in a car accident, then I should prepare a plan for a car accident or two every day.
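The driving numbers above can be checked with a few lines of arithmetic. This is only a sketch: it assumes every drive is independent and has the same chance of an accident, and the function name chance_of_at_least_one is made up for illustration.

```python
def chance_of_at_least_one(p, n):
    """Chance that something with per-try probability p happens
    at least once in n independent tries."""
    return 1 - (1 - p) ** n

# One in a million per drive, a thousand drives: roughly a 0.1% chance.
safe = chance_of_at_least_one(1e-6, 1000)

# One in a hundred per drive, ten drives: a little under 10%.
risky = chance_of_at_least_one(0.01, 10)

# A million drivers, each with a one in a million chance today:
# the expected number of accidents per day is n * p.
expected_per_day = 1_000_000 * 1e-6

print(f"{safe:.4%}  {risky:.2%}  {expected_per_day}")
```

The first two numbers are chances for one driver; the last one is an average over the whole population, which is where probability is at its best.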

Probability doesn't change based on past experience. That is, if you are getting 7's consistently, your chances of getting a 7 in the future are no more or less than they were before you started playing. (Unless, of course, you switch your dice with loaded ones.) Your luck doesn't change, and you don't have luck momentum. It really is better to quit when you are ahead, since probability predicts you shouldn't be ahead in the first place.

Probability can be an exact science. It all depends on what we know. Sometimes we can get good at making probability predictions even when we don't know much.

Statistics is roughly related to probability. Statistics, however, has to do with counting while probability has to do with predicting. A large part of probability is also counting, so that accounts for their relation. To summarize the difference, statistics says that 10% of the people in your classroom have blonde hair. Probability says that if you blindly and randomly pick a single person from the classroom, there is a 10% chance that person will have blonde hair.

Where things get hairy is when you start confusing statistics with probability. Just because 10% of the people in your classroom are blonde doesn't mean 10% of the people who will walk through the door into the classroom will have blonde hair. Likewise, when medical scientists confuse statistics with probability ("eating spinach reduces your chance of colon cancer by 5% since we surveyed a group and those who ate spinach had 5% rarer occurrences of cancer"), they are saying things they cannot know and shouldn't pretend to know. Going from one to the other is difficult at best, and should always be treated with the highest levels of suspicion.

What is the chance of X?

When I say "X" (or "Y"), I mean the chance that the future will adhere to the rules of X (or Y). So X and Y are really rules to be satisfied.

For instance, X can be "the coin I flipped will read 'heads'." Or X can be "the dice I rolled sum to 7." Or it can even be, "I am dealt two pair in a game of poker." Or X could be, "I get cancer and die by the age of 60." Or even, "that I earn a million dollars by my 40th birthday."

The chance of X is written as P(X) and can be computed by taking all possible future events and counting the ones that satisfy X and the ones that don't. For each specific possible future, we'll call it 's' if it is possible and 'x' if it satisfies X as well. Note that all x are in s, since they are all possible.

Then, based on the probability of each scenario, we add up the probabilities of each one that satisfies X and we can calculate P(X) precisely.

Formally,

       Σ P(x)   over every x satisfying X
P(X) = ------
       Σ P(s)   over all possibilities s

In card games, dice, and other controlled events, it is assumed that each card or each side of the die is just as likely to appear as any other. This may not be true. For instance, a clever shuffler can deal you a specific card, or a loaded die will show one side more often than others. When this assumption is violated, you can no longer assume that one possibility is just as likely as another.

So, with all possibilities just as probable as each other,

P(X) = n(x) / n(s)

where n(x) is the number of x's and n(s) is the number of possibilities, each equally probable.

For instance, the chance of throwing a 1 on a 6-sided die is:

P(throwing a 1 on a six-sided die) = n(threw a 1) / n(threw something) = 1/6

The chance of drawing a Jack of Clubs from a 52-card Poker deck is:

1/52

The chance of drawing a Jack of Clubs if someone has already drawn a Jack of Clubs and there are 42 cards left:

0/42

The chance of drawing a Jack of Clubs if no one has drawn the Jack of Clubs and there are 42 cards left:

1/42
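The counting rule P(X) = n(x) / n(s) translates directly into code. Here is a minimal sketch, assuming every possibility is equally likely; the helper name chance and the way the deck is spelled out are my own choices for illustration.

```python
from fractions import Fraction

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]

def chance(possibilities, satisfies):
    """P(X) = n(x) / n(s): count the possibilities that satisfy the rule."""
    possibilities = list(possibilities)
    favorable = sum(1 for s in possibilities if satisfies(s))
    return Fraction(favorable, len(possibilities))

# Throwing a 1 on a six-sided die.
die = chance(range(1, 7), lambda face: face == 1)

# Drawing the Jack of Clubs from a full 52-card deck.
deck = [(rank, suit) for suit in SUITS for rank in RANKS]
jack = chance(deck, lambda card: card == ("J", "clubs"))

print(die, jack)
```

Using Fraction keeps the answers as exact ratios instead of rounded decimals.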

What is the chance of X and Y occurring together?

If X precludes Y (or vice versa), then there is no chance. For instance, if X says, "the die I throw shows a 5" and Y says, "the die I throw shows a 4" then the chance of both occurring is 0.

If, however, the two events are independent, for instance, X being "the first die I throw shows a 5" and Y being "the second die I throw shows a 4", then the chance is simply the product of the two probabilities: 1/6 × 1/6 = 1/36. (We are assuming, quite reasonably, that one die throw doesn't affect the other. If it did, this wouldn't hold true.)

Note that you can't be vague with this without rewriting the rules. For instance, X cannot be "one die of two shows a 5" and Y "the other die shows a 4". These kinds of rules need to be simplified before doing the math.
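One way to check the product rule, and to see why the vague version of the rule gives a different answer, is to list all 36 ordered throws of two dice and count. This is a sketch that assumes fair dice; the helper chance is a made-up name.

```python
from fractions import Fraction
from itertools import product

# Every ordered outcome of throwing two dice: (first, second).
throws = list(product(range(1, 7), repeat=2))

def chance(satisfies):
    """Fraction of the 36 equally likely throws that satisfy the rule."""
    favorable = sum(1 for t in throws if satisfies(t))
    return Fraction(favorable, len(throws))

p_first_5 = chance(lambda t: t[0] == 5)
p_second_4 = chance(lambda t: t[1] == 4)
p_both = chance(lambda t: t[0] == 5 and t[1] == 4)

# The vague rule "one die shows a 5 and the other shows a 4" matches
# both (5, 4) and (4, 5), so it is twice as likely as the precise rule.
p_vague = chance(lambda t: sorted(t) == [4, 5])

print(p_both == p_first_5 * p_second_4, p_both, p_vague)
```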

What is the chance of X or Y occurring?

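As long as X precludes Y (that is, they can never both occur, as in the first "X and Y" example above), the chance of X or Y occurring is the sum of the chances of X and Y. If both can occur together, the sum counts those cases twice, so you have to subtract the chance of X and Y occurring together: P(X or Y) = P(X) + P(Y) - P(X and Y).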
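The "or" rule can be checked by counting throws of two dice. In this sketch (fair dice assumed, helper names made up), the first pair of rules can never hold for the same throw, so their chances simply add. The second pair can both hold for the throw (5, 4), so adding their chances counts that throw twice and it has to be subtracted once.

```python
from fractions import Fraction
from itertools import product

throws = list(product(range(1, 7), repeat=2))

def chance(satisfies):
    """Fraction of the 36 equally likely throws that satisfy the rule."""
    return Fraction(sum(1 for t in throws if satisfies(t)), len(throws))

# Rules that cannot both hold: "the dice sum to 7" and "the dice sum to 11".
p_7 = chance(lambda t: t[0] + t[1] == 7)
p_11 = chance(lambda t: t[0] + t[1] == 11)
p_7_or_11 = chance(lambda t: t[0] + t[1] in (7, 11))

# Rules that can both hold: "first die is 5" and "second die is 4".
p_first_5 = chance(lambda t: t[0] == 5)
p_second_4 = chance(lambda t: t[1] == 4)
p_either = chance(lambda t: t[0] == 5 or t[1] == 4)
p_both = chance(lambda t: t[0] == 5 and t[1] == 4)

print(p_7_or_11 == p_7 + p_11)
print(p_either == p_first_5 + p_second_4 - p_both)
```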

Everything from here

The above is enough to mathematically construct the probabilities of most games you might play that either involve dice or cards. For instance, we can calculate, given the above, the chances of you having the opportunity to buy James Street given 5 players.

How many...

The following are some notes for counting. Note that the factorial is used frequently.

How many ways can you arrange n things?

One very common thing you have to count when doing probability is "How many ways can you arrange n things?"

The number of ways to arrange n things is simply n!.

n! = n·(n-1)·(n-2)·...·3·2·1

The explanation is simply that you have as an option n ways to choose the first thing, (n-1) ways to choose the second thing (since you've already used one up), (n-2) ways to choose the third thing (since you've used up two now), and so on.

Imagine a tree. The leaves of the tree represent each possible arrangement. The branches represent how we got there.

From the trunk, you have n possibilities. The first item could be any one of the possible items.

At the next level, you have n-1 possibilities for each of the first n branches. The second item can only be any one of the possible items minus the one we already used up.

At the next level, you have n-2 possibilities for each of the n×(n-1) branches. That's because 2 items have been used up.

And so it goes on, until you run out of choices.
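The tree can be walked in code. The sketch below (the function name arrange is made up) picks one of the remaining items at each level, exactly as described above, and yields one arrangement per leaf. The count is then compared with n! and with what itertools.permutations produces.

```python
from itertools import permutations
from math import factorial

def arrange(remaining, chosen=()):
    """Walk the tree: each level picks one of the items not yet used."""
    if not remaining:
        yield chosen  # a leaf: every item has been placed
    for i, item in enumerate(remaining):
        rest = remaining[:i] + remaining[i + 1:]
        yield from arrange(rest, chosen + (item,))

items = ("A", "B", "C", "D")
leaves = list(arrange(items))

print(len(leaves), factorial(len(items)))
print(set(leaves) == set(permutations(items)))
```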

How many ways can you arrange n things in m buckets of 1 item each?

Drawing from a pile/bag

In cases where order is important (it usually isn't):

nPr = P(n,r) = number of ordered combinations of drawing r items from n distinct items.
             = n!/(n-r)!

Note that this is the same as multiplying the r numbers from n down to n-r+1. So P(6,4) is 6×5×4×3 = 360.

In cases where order isn't important:

nCr = C(n,r) = number of unordered combinations of drawing r items from n distinct items
             = nPr/r!

Why the relationship? Given r items, there are r! ways to order them. So if we know the number of ordered ways to draw the items, then there are that many divided by r! ways to draw the items where order isn't important.
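Python's math module (3.8 and later) has perm and comb built in, so the formulas can be checked against a brute-force count. A small sketch using the P(6,4) example:

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

n, r = 6, 4

ordered = perm(n, r)      # n! / (n-r)!
unordered = comb(n, r)    # nPr / r!

# The same numbers from the formulas in the text.
assert ordered == factorial(n) // factorial(n - r)
assert unordered == ordered // factorial(r)

# And from actually listing every draw.
assert ordered == len(list(permutations(range(n), r)))
assert unordered == len(list(combinations(range(n), r)))

print(ordered, unordered)
```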

Poker
How many ways to draw 5 cards from 52? C(52,5) = 2,598,960
Rummy
Ways to draw 7 cards? C(52,7) = 133,784,560
Hearts/Spades
Ways to draw 13 cards? C(52,13) = 635,013,559,600
Phase 10
C(107,10) = 35,137,373,005,735

Given something like Phase 10 where you have 107 cards, 8 of which are wild, how many hands have...

0 wild
C(8,0)*C(99,10)
1 wild
C(8,1)*C(99,9)
2 wild
C(8,2)*C(99,8)
3 wild
C(8,3)*C(99,7), etc...
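The whole list can be generated in a loop. This sketch uses the deck size and wild count given above (107 cards, 8 wild, 10-card hands); the constant names are my own. As a sanity check, the counts for every possible number of wilds should add up to C(107,10).

```python
from math import comb

DECK, WILD, HAND = 107, 8, 10   # the numbers used in the text above
OTHER = DECK - WILD             # 99 cards that are not wild

total = comb(DECK, HAND)

# Hands with exactly k wilds: choose k of the wilds, then fill the
# rest of the hand from the other cards.
hands_by_wilds = {
    k: comb(WILD, k) * comb(OTHER, HAND - k) for k in range(WILD + 1)
}

for k, hands in hands_by_wilds.items():
    print(f"{k} wild: {hands:>20,} hands  ({hands / total:.4%})")

print(sum(hands_by_wilds.values()) == total)
```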

In poker, how many hands are a royal flush? Well, there are only 4 ways to make a royal flush (one per suit), so 4. How many hands have 4 of a kind? For 4 aces, C(4,4)*C(48,1)=48. There are 13 ranks, so 13*48=624.
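Dividing these counts by the total number of 5-card hands turns them into probabilities. A short sketch, assuming a fair 52-card deck:

```python
from fractions import Fraction
from math import comb

total_hands = comb(52, 5)

royal_flushes = 4                               # one per suit
four_of_a_kind = 13 * comb(4, 4) * comb(48, 1)  # 13 ranks, any fifth card

p_royal = Fraction(royal_flushes, total_hands)
p_quads = Fraction(four_of_a_kind, total_hands)

print(p_royal, p_quads)
```

The fractions reduce to 1 in 649,740 for a royal flush and 1 in 4,165 for four of a kind.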

Rolling Dice

Rolling dice is fundamentally different. Unlike cards, the result of one die doesn't depend on another. In cards, if you already have the 4 of spades, you aren't going to get it again (in a single deck).

The number of possibilities is (number of possibilities of 1 throw)^(number of throws). So in Yahtzee, there are 6^5 = 7,776 possibilities (ordered).

There is always 1 possibility for a particular ordered throw.

If the order doesn't matter, then the number of ordered throws that give a particular result is calculated with multinomial coefficients. You have:

(total)!
--------
(ones)!(twos)!(threes)!(fours)!...

So, for 5 dice, if you want four 1's and one 4, there are exactly:

  5!      120
------ = ------ = 5
 4!1!     24×1

If you wanted 2 1's, 2 2's, and a 3,

   5!        120
-------- = ------ = 30
 2!2!1!     2×2×1
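Both worked examples can be reproduced with a small multinomial helper and then double-checked by listing all 6^5 ordered throws. The function name orderings is made up for this sketch.

```python
from collections import Counter
from itertools import product
from math import factorial

def orderings(counts):
    """Multinomial coefficient: (total)! / (c1! * c2! * ...)."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result

four_ones_one_four = orderings([4, 1])   # four 1's and one 4
two_two_one = orderings([2, 2, 1])       # two 1's, two 2's, and a 3

# Brute force: count the ordered throws of 5 dice that contain
# exactly two 1's, two 2's, and one 3.
target = Counter({1: 2, 2: 2, 3: 1})
brute = sum(
    1 for throw in product(range(1, 7), repeat=5) if Counter(throw) == target
)

print(four_ones_one_four, two_two_one, brute)
```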


If you are only interested in pairs or triplets, and not in which numbers make them up, then you have to multiply by 6!/(6 - number of distinct numbers used)!. For instance, if you wanted two pairs and a singlet, you could start by counting how many ordered throws have 2 1's, 2 2's, and a 3 (30). Now, instead of a 1, you can choose any of 6 numbers. Instead of a 2, any of the 5 remaining numbers. Instead of a 3, any of the 4 remaining numbers. So you have 30*6!/3! = 30*6*5*4 = 3600 possibilities. Note here, however, that you are double counting. For instance, the throw 11223 is counted once with 1 as the first pair and 2 as the second, and again with 2 as the first pair and 1 as the second. Since the pairs are indistinguishable from one another, you have to divide by 2. So the number of possibilities is really 1800.
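The count of 1800 is small enough to confirm by brute force: list all 6^5 = 7,776 ordered throws and keep the ones with two pairs and a singlet. A sketch, with is_two_pairs as a made-up helper name:

```python
from collections import Counter
from itertools import product
from math import factorial

def is_two_pairs(throw):
    """True when the throw has exactly two pairs and one singlet."""
    return sorted(Counter(throw).values()) == [1, 2, 2]

throws = list(product(range(1, 7), repeat=5))
two_pairs = sum(1 for t in throws if is_two_pairs(t))

# The formula from the text: 30 orderings, times the ways to pick the
# three numbers in order, divided by 2 for the interchangeable pairs.
formula = 30 * (factorial(6) // factorial(6 - 3)) // 2

print(len(throws), two_pairs, formula)
```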

See Yahtzee Probability by Mark-Jason Dominus.

Monte Carlo Method

What if you don't want to count, or can't? Then roll the dice, and count how many times your event happens. With enough rolls, you should get a pretty good picture of what the probabilities are.

The Monte Carlo method is good at predicting the possibilities that are most likely to occur. It is very bad at predicting very unlikely possibilities. This makes it suitable for real life since improbable things don't occur very often.
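Here is a minimal Monte Carlo sketch for the Yahtzee question "what is the chance of two pairs and a singlet in one throw of 5 dice?" The function names, the number of trials, and the fixed seed (used only so the run is repeatable) are all my own choices. The estimate is compared with the exact answer 1800/7776.

```python
import random
from collections import Counter

def is_two_pairs(throw):
    """True when the throw has exactly two pairs and one singlet."""
    return sorted(Counter(throw).values()) == [1, 2, 2]

def estimate(event, dice=5, trials=50_000, seed=0):
    """Throw the dice many times and report how often the event happens."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        throw = [rng.randint(1, 6) for _ in range(dice)]
        if event(throw):
            hits += 1
    return hits / trials

estimated = estimate(is_two_pairs)
exact = 1800 / 6**5

print(f"estimated {estimated:.4f}, exact {exact:.4f}")
```

An event this common settles down quickly. For something like five of a kind (6 throws out of 7,776) the same number of trials sees only a few dozen hits, so the estimate is much rougher.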

The Monte Carlo method is very, very useful, especially for scenarios where you have to make a choice before the next dice roll or the next card draw. If you have a strategy encoded in a program, then you can see how well that strategy works by running it. You can also test two strategies against each other to see which one works better.
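As a toy illustration of comparing strategies, the sketch below plays a single turn of a simplified version of the dice game Pig: you keep rolling one die and adding to your turn total, but rolling a 1 ends the turn with nothing. The game, the two "hold at" thresholds, and all the names are my own choices for the example. A strategy here is just the total at which you stop rolling.

```python
import random

def play_turn(hold_at, rng):
    """Roll until the turn total reaches hold_at; a 1 loses everything."""
    total = 0
    while total < hold_at:
        roll = rng.randint(1, 6)
        if roll == 1:
            return 0
        total += roll
    return total

def average_score(hold_at, trials=20_000, seed=0):
    """Average turn score for the strategy 'stop once you reach hold_at'."""
    rng = random.Random(seed)
    return sum(play_turn(hold_at, rng) for _ in range(trials)) / trials

cautious = average_score(hold_at=5)
bolder = average_score(hold_at=20)

print(f"hold at 5: {cautious:.2f}   hold at 20: {bolder:.2f}")
```

Running both strategies over many turns shows which one scores more on average, without having to work out the exact probabilities by hand.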