"The chances that anyone has ever shuffled a pack of cards in the same way twice in the history of the world are infinitesimally small, statistically speaking."
From QI: http://qi.com/infocloud/playing-cards
Given the task of creating a random sample size of N unique permutations of K cards then for quite small values of K, and N in the millions for example the total number of permutations of K cards is much greater than N. This means that S random shuffles of the cards, where S is "quite close" to N is likely to generate the N unique permutations.
Factorials get large quickly!
The number of permutations of K cards is given by K!, K-factorial.
Here is a table of factorials:
In [6]:
from math import factorial as fact
for k in range(1,10) + range(10, 41, 5):
print('K = %2i, K! = %10e (= %i)' % (k, fact(k), fact(k)))
Random sample counts for different N and K
Python uses the random modules Mersenne Twister algorithm producing 53-bit precision floats having a period of 2**19937-1 for its pseudorandom number generator and the unbiased Fischer-Yates (a.k.a Knuth) shuffle algorithm.
In [14]:
from random import shuffle
for n in (1000, 10000, 100000, 1000000):
kstart = [x for x in range(20) if fact(x) >= n][0]
for k in range(kstart, kstart + 6):
perms = set()
perm = list(range(k))
shuffles = 0
while len(perms) < n:
for i in range(n - len(perms)):
shuffle(perm)
shuffles += 1
perms.add(tuple(perm))
print('N=%7i, K=%2i, shuffles=%7i i.e %5.1f%% more shuffles than N, K!/N= %i'
% (n, k, shuffles, shuffles * 100. / n - 100, fact(k) // n))
print('')
That last line of the table above means that if you want a million perms of just fifteen cards then you are likely to get them by just shuffling the cards and writing what you get down a million times!