In probability theory and statistics, the hypergeometric distribution is a discrete probability distribution that describes the probability of (exactly) k successes in n draws from a finite population (of size N) without replacement.(1)
Above you see the mathematical formula for the hypergeometric distribution (i.e. the probability mass function), where
- N is the population size
- m is the number of potential successes in the population
- n is the number of draws
- k is the number of successes among the n draws
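Written out in the notation of the list above, the probability mass function takes the following standard form (here \binom{a}{b} denotes the binomial coefficient "a choose b"):

```latex
P(X = k) = \frac{\binom{m}{k}\,\binom{N-m}{n-k}}{\binom{N}{n}}
```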
For our example, these parameters become:
- N is the size of the deck (75 cards)
- m is the number of copies of the card in the deck (2)
- n is the number of draws (the number of cards in our opening hand, i.e. 7)
- k is the number of copies we want to draw (1)
Oh wait, that last number for k is not entirely right! We do not actually want the probability of having exactly one copy in our opening hand (X=1), but of having at least one copy in our opening hand (X≥1). Now we have two ways of solving this:
- Add the two probabilities for X=1 and X=2 (see generalized formula below), or
- Take 1 and then subtract the probability for X=0 from it.
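For a 75 card deck with 2 copies and a 7 card opening hand, the two approaches can be worked out as follows. Both give the same result, and the second one needs only a single term:

```latex
\begin{aligned}
P(X \ge 1) &= P(X = 1) + P(X = 2)
            = \frac{\binom{2}{1}\binom{73}{6}}{\binom{75}{7}} + \frac{\binom{2}{2}\binom{73}{5}}{\binom{75}{7}}
            = \frac{952}{5550} + \frac{42}{5550}
            = \frac{994}{5550} \approx 0.179 \\
P(X \ge 1) &= 1 - P(X = 0)
            = 1 - \frac{\binom{2}{0}\binom{73}{7}}{\binom{75}{7}}
            = 1 - \frac{68 \cdot 67}{75 \cdot 74}
            = 1 - \frac{4556}{5550} \approx 0.179
\end{aligned}
```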
So there's a 17.9% chance that we have at least one Information Highway in our opening hand (with 2 copies in our 75 card deck). The next question is how our chances of drawing at least one copy change if we change either the deck size or the number of copies (of Information Highway) in our deck.
As you can see above, the calculations become tedious fast if you want to calculate the resulting percentages over and over again for different parameters (different values of k, n, m or N). Luckily, modern spreadsheet programs(2) have built-in functions for this type of calculation:
- In Microsoft Excel you can use the function HYPGEOMDIST to calculate such a probability. In general it looks like this when you enter it (depending on your locale, the argument separator is a comma instead of a semicolon):
  = HYPGEOMDIST(successes_in_sample; sample_size; number_of_successes; population)
  = HYPGEOMDIST(k; n; m; N)
  Excel 2010 and later also offer HYPGEOM.DIST, which takes a fifth parameter (cumulative)(3):
  = HYPGEOM.DIST(k; n; m; N; TRUE|FALSE)
  When entering the parameters from our example above, it looks like this:
  = 1-HYPGEOM.DIST(0; 7; 2; 75; FALSE)
- A similar function, also called HYPGEOMDIST, exists in Apache OpenOffice:
  = HYPGEOMDIST(k; n; m; N)
  When entering the parameters from our example above, it looks like this:
  = 1-HYPGEOMDIST(0; 7; 2; 75)
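If you'd rather not use a spreadsheet, the same calculation can be sketched in a few lines of Python. This is only a minimal sketch: the function names hypergeom_pmf and at_least_one are made up for illustration, and math.comb requires Python 3.8 or later.

```python
from math import comb


def hypergeom_pmf(k, n, m, N):
    """P(X = k): exactly k successes in n draws without replacement,
    from a population of size N that contains m successes."""
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)


def at_least_one(n, m, N):
    """P(X >= 1), computed as 1 - P(X = 0)."""
    return 1 - hypergeom_pmf(0, n, m, N)


# Our example: 7 card opening hand, 2 copies, 75 card deck.
print(f"{at_least_one(7, 2, 75):.2%}")  # prints 17.91%
```

The arguments are in the same order as in the spreadsheet functions above (k; n; m; N).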
As you can see, it makes a huge difference how large your deck is. For the 60 card deck the probability is as high as 22.15%, and for the 90 card deck as low as 15.03%. This is one of the reasons for playing smaller decks, especially if you're playing a number of cards that you only have one or two copies of in your deck.
What happens if we now vary the number of copies of the card we want to have in our opening hand (in a deck of 75 cards)?
As you can see, the probabilities again change dramatically when we increase the number of copies in our deck. I am not suggesting that you add seven Information Highways to your deck, but if you want to have a 50% chance of having some sort of crypt acceleration in your opening hand, you could add something like 2 Information Highway, 2 Dreams of the Sphinx and 3 Zillah's Valley to your deck (for example).
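The "how many copies do I need for a 50% chance" question can also be answered by simply trying increasing numbers of copies. The following is a rough sketch under the assumptions used so far (75 card deck, 7 card opening hand); the helper names at_least_one and min_copies are hypothetical and only serve the illustration.

```python
from math import comb


def at_least_one(copies, deck_size=75, hand_size=7):
    """Chance of at least one of `copies` cards in the opening hand."""
    return 1 - comb(deck_size - copies, hand_size) / comb(deck_size, hand_size)


def min_copies(target, deck_size=75, hand_size=7):
    """Smallest number of copies whose chance reaches `target`, or None."""
    for copies in range(deck_size + 1):
        if at_least_one(copies, deck_size, hand_size) >= target:
            return copies
    return None


copies = min_copies(0.5)
print(copies, f"{at_least_one(copies):.1%}")
```

With these defaults, six copies stay below 50% and seven copies land slightly above it, which matches the 2 + 2 + 3 split suggested above.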
Of course I'm aware that all of you know that increasing the number of copies in your deck, or decreasing your deck size, increases the chance of drawing a particular card, but I wanted to give you exact numbers instead of an educated guess.
Varying both the deck size and the number of copies in the deck, the final chart looks like this:
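The numbers behind such a chart can be reproduced with a short Python sketch. The deck sizes (60, 75 and 90 cards) and the range of 1 to 7 copies are taken from the examples in this post; the function name at_least_one and the table layout are my own assumptions.

```python
from math import comb


def at_least_one(copies, deck_size, hand_size=7):
    """Chance of at least one of `copies` cards in a 7 card opening hand."""
    return 1 - comb(deck_size - copies, hand_size) / comb(deck_size, hand_size)


deck_sizes = [60, 75, 90]
copy_counts = range(1, 8)

# Header row: one column per deck size.
print("copies" + "".join(f"{size:>9}" for size in deck_sizes))

# One row per number of copies, with the chance as a percentage.
for copies in copy_counts:
    row = "".join(f"{at_least_one(copies, size):>9.2%}" for size in deck_sizes)
    print(f"{copies:>6}" + row)
```

The row for 2 copies should show the 22.15%, 17.91% and 15.03% figures discussed earlier.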
Next time: more fun with multivariate hypergeometric distribution!
- Hypergeometric Distribution (on Wikipedia)
- Hypergeometric Calculator
- Probability: Drawing Cards from Decks (in "The Mathematics of Magic The Gathering")
(1) cf. the binomial distribution, which describes the probability of k successes in n draws with replacement.
(2) In Perl you can use the module Math-GSL-0.26 (Math::GSL::CDF).
(3) Only Excel 2010 (or higher) has the 5th parameter (cumulative), via the newer function HYPGEOM.DIST; the legacy HYPGEOMDIST takes four parameters.