Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I understand that "swap with entire deck" can't possibly be uniform because it has 52^n input possibilities, which is not divisible by 52! (and that the correct Fisher-Yates having 52! input possibilities and being able to generate every possible outcome is one way to prove that it is uniform). However, I'm not sure I can come up with an intuition for why any particular bias should exist, or why there is a discontinuity that makes it much more likely for a card to end up a short distance after its starting position:

http://en.wikipedia.org/wiki/File:Orderbias.png

Anyone have a good explanation?



I'm not entirely sure, but I have been able to figure a few things out.

By running some simulations by shuffling the range from 1 to n (with n going from 3 to 7), I found that at least one of the most common permutations had always started with 2. I'm unable to come up with a reason to explain this, but I was able to figure out something else interesting.

Imagining that the deck is vertical and going through each card and swapping it with a random card in the deck, the random card that has been swapped will never move further up the deck while the other card can still possibly move further up the deck. This implies that cards at the top of the deck should stay near the top and the cards at the bottom should stay near the bottom. I tested this by taking the sum of the sums of the first half of all the permutations and the sum of the sums of the second half of all the permutations and found that the total sum of the first halves was slightly smaller than the total sum of the second halves.

  n first halves | second halves
  3           53 |            54
  4         1265 |          1295
  5        18322 |         18976
  6       461683 |        498093
  7      9638931        10051128
So it seems that cards starting near the top are more likely to end up near the top and cards starting near the bottom are more likely to end up near the bottom.

Note: for odd numbers I threw out the middle number.


Here's my attempt at an intuitive explanation:

After the first swap, the distribution of numbers at the first position is uniform. What happens in the second swap? Well, the second position contains 1 with probability 1/n, and 2 with probability (n-1)/n. With probability 1/n, this position that is highly biased towards 2 (for n > 2) will get swapped with the first position. So 2 is more likely to be found in the first position.

Of course, it gets messy to carry this out for the n swaps and I still haven't really answered your question yet. To do that let's make the following assumption: after the ith swap, the distribution of numbers at position i is uniform. (I think you can prove this inductively but I haven't tried working it out - I was studying for a midterm before I got nerd-sniped.)

Before the ith swap, position i is biased towards number i. By our assumption, position i-1 is uniform. By making position i uniform we are essentially "smoothing out" the bias at position i to the other positions. Hence, as in the example I started with, it is more likely after swap i that position i-1 will contain number i, at the expense of the other numbers. This then reduces the likelihood of finding number k in positions 1, 2, etc. as opposed to position k-1: 2 is more likely in 1, 3 is more likely in 2, and so on.


There's a decent explanation here:

http://www.codinghorror.com/blog/2007/12/the-danger-of-naive...

There's also a more complete story about the online poker exploit here:

http://www.cigital.com/papers/download/developer_gambling.ph...




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: