Proposal: Array.prototype.shuffle (Fisher-Yates)
Isiah Meadows
isiahmeadows at gmail.com
Sun Apr 29 22:27:42 UTC 2018
BTW, I added this to my list of various proposed array additions (as a
weak one) [1].
I did do a little reading up and found that in general, there's a
major glitch that makes shuffling very hard to get right (issues even
Underscore's `_.shuffle` doesn't address), specifically that of the
size of the random number generator vs how many permutations it can
hit. Specifically, if you want any properly unbiased shuffles covering
all permutations of arrays larger than 34 entries (and that's not a
lot), you can't use any major engines' `Math.random` (which typically
have a max seed+period of 128 bits). You have to roll your own
implementation, and you need at least something like xorshift1024* [2]
(1024-bit max seed/period size) for up to 170 entries, MT19937
[3]/WELL19937 [4] (seed/period up to 2^19937-1) for up to 2080
entries, or MT44497/WELL44497 for up to 4199 entries. Keep in mind the
minimum seed/period size in bits grows roughly `ceil(log2(fact(N)))`
where `N` is the length of the array [5], and the only way you can
guarantee you can even generate all possible permutations of all
arrays (ignoring potential bias) is through a true hardware generator
(with its potentially infinite number of possibilities). Also, another
concern is that the loop's bottleneck is specifically the random
number generation, so you can't get too slow without resulting in a
*very* slow shuffle.
If you want anything minimally biased and much larger than that,
you'll need a cryptographically secure pseudorandom number generator
just to cover the possible states. (This is about where the
intersection meets between basic statistics and cryptography, since
cipher blocks are frequently that long.) But I'm leaving that as out
of scope of that proposal.
[1]: https://github.com/isiahmeadows/array-additions-proposal#arrayprototypeshuffle
[2]: https://en.wikipedia.org/wiki/Xorshift#xorshift*
[3]: https://en.wikipedia.org/wiki/Mersenne_Twister
[4]: https://en.wikipedia.org/wiki/Well_equidistributed_long-period_linear
[5]: http://www.wolframalpha.com/input/?i=plot+ceiling(log2(x!))+where+0+%3C+x+%3C+10000
-----
Isiah Meadows
me at isiahmeadows.com
Looking for web consulting? Or a new website?
Send me an email and we can get started.
www.isiahmeadows.com
On Sun, Apr 29, 2018 at 4:01 PM, Alexander Lichter <es at lichter.io> wrote:
> On 29.04.2018 18:34, Naveen Chawla wrote:
>>
>> I don't think there's such a thing as "real random" in digital algos, just
>> "pseudo random".
>
> You are right. I meant 'unbiased' pseudo randomness here.
>
>> Apart from card games, what's the use case?
>
> There are a lot of potential use cases. The best that comes into my mind is
> sampling test data.
>
>
> On 29.04.2018 19:01, Isiah Meadows wrote:
>>
>> I think this would be better suited for a library function rather than a
>> language feature. I could see this also being useful also for
>> randomized displays, but that's about it. And I'm not sure what an
>> engine could provide here that a library couldn't - you can't really
>> get much faster than what's in the language (minus bounds checking,
>> but the likely frequent cache misses will eclipse that greatly), and
>> it's not unlocking any real new possibilities.
>
> As Tab Atkins Jr. already pointed out it's not about performance benefits.
> It's about how error-prone custom shuffle implementations are/can be.
>
> _______________________________________________
> es-discuss mailing list
> es-discuss at mozilla.org
> https://mail.mozilla.org/listinfo/es-discuss
More information about the es-discuss
mailing list