Proposal: Array.prototype.shuffle (Fisher-Yates)

Isiah Meadows isiahmeadows at gmail.com
Mon Apr 30 08:59:20 UTC 2018


If you can come up with something better than what I have, feel free to
file an issue against my repo. (I have minimal need, but I'd love to have
something that works.)

https://github.com/isiahmeadows/array-additions-proposal/issues/new

-----

Just an item of note about algorithm choice: I strongly doubt TC39 reps
would be okay with a cryptographic primitive being used in the spec, since
it's pretty useless elsewhere, and I'm not sure about the legal aspect
(export restrictions, etc.).

On Mon, Apr 30, 2018, 01:50 J Decker <d3ck0r at gmail.com> wrote:

>
>> >
>> > While a good shuffle should be able to start from the initial state and
>> > generate a good shuffled result, it is slightly better to progressively
>> > shuffle the previous result array into new positions than to start from
>> an
>> > initial state and compute a 1-off.
>>
>> Evidence? (BTW, the Fisher-Yates shuffle is theoretically as biased as
>> the RNG that powers it.)
>>
>>
> Specifically re-1-off vs progressive shuffling?  Not on-hand; it was an
> evaluation I did 10 years ago; and while the difference wasn't that much,
> it was notable.  the rate for any single ball to be in a specific position
> took longer (by iteration count) when resetting the array to shuffle to
> initial conditions (1-N)
>
>
> Fisher-Yates. is biased by the usage of the number generator; not just the
> generator.
> I've been considering how to graphically/visually demonstrate it; but I'v
> failed so far.
>
> for a list of 10 numbers; and a 'perfect' RNG.
> For the last number, it's 90% guaranteed to evaluate it's shuffle twice.
> (10% it won't move) It has a 10% chance to be in the last position, but
> only a 1.1% chance to be in the second to last position.  (1/10 * 1/9 =
> 1/90 = 0.011...) so it will NOT be the last position 90% but it will NOT be
> the second to last 98.9% of the time.  For an ideal shuffle, every original
> position will end up in every other position 10% of the time.
>
>
>
>> >
>> > On Sun, Apr 29, 2018 at 3:27 PM, Isiah Meadows <isiahmeadows at gmail.com>
>> > wrote:
>> >>
>> >> BTW, I added this to my list of various proposed array additions (as a
>> >> weak one) [1].
>> >>
>> >> I did do a little reading up and found that in general, there's a
>> >> major glitch that makes shuffling very hard to get right (issues even
>> >> Underscore's `_.shuffle` doesn't address), specifically that of the
>> >> size of the random number generator vs how many permutations it can
>> >> hit. Specifically, if you want any properly unbiased shuffles covering
>> >> all permutations of arrays larger than 34 entries (and that's not a
>> >> lot), you can't use any major engines' `Math.random` (which typically
>> >> have a max seed+period of 128 bits). You have to roll your own
>> >> implementation, and you need at least something like xorshift1024* [2]
>> >> (1024-bit max seed/period size) for up to 170 entries, MT19937
>> >> [3]/WELL19937 [4] (seed/period up to 2^19937-1) for up to 2080
>> >> entries, or MT44497/WELL44497 for up to 4199 entries. Keep in mind the
>> >> minimum seed/period size in bits grows roughly `ceil(log2(fact(N)))`
>> >> where `N` is the length of the array [5], and the only way you can
>> >> guarantee you can even generate all possible permutations of all
>> >> arrays (ignoring potential bias) is through a true hardware generator
>> >> (with its potentially infinite number of possibilities). Also, another
>> >> concern is that the loop's bottleneck is specifically the random
>> >> number generation, so you can't get too slow without resulting in a
>> >> *very* slow shuffle.
>> >>
>> >> If you want anything minimally biased and much larger than that,
>> >> you'll need a cryptographically secure pseudorandom number generator
>> >> just to cover the possible states. (This is about where the
>> >> intersection meets between basic statistics and cryptography, since
>> >> cipher blocks are frequently that long.) But I'm leaving that as out
>> >> of scope of that proposal.
>> >>
>> >> [1]:
>> >>
>> https://github.com/isiahmeadows/array-additions-proposal#arrayprototypeshuffle
>> >> [2]: https://en.wikipedia.org/wiki/Xorshift#xorshift*
>> >> [3]: https://en.wikipedia.org/wiki/Mersenne_Twister
>> >> [4]:
>> https://en.wikipedia.org/wiki/Well_equidistributed_long-period_linear
>> >> [5]:
>> >>
>> http://www.wolframalpha.com/input/?i=plot+ceiling(log2(x!))+where+0+%3C+x+%3C+10000
>> >>
>> >> -----
>> >>
>> >> Isiah Meadows
>> >> me at isiahmeadows.com
>> >>
>> >> Looking for web consulting? Or a new website?
>> >> Send me an email and we can get started.
>> >> www.isiahmeadows.com
>> >>
>> >>
>> >> On Sun, Apr 29, 2018 at 4:01 PM, Alexander Lichter <es at lichter.io>
>> wrote:
>> >> > On 29.04.2018 18:34, Naveen Chawla wrote:
>> >> >>
>> >> >> I don't think there's such a thing as "real random" in digital
>> algos,
>> >> >> just
>> >> >> "pseudo random".
>> >> >
>> >> > You are right. I meant 'unbiased' pseudo randomness here.
>> >> >
>> >> >> Apart from card games, what's the use case?
>> >> >
>> >> > There are a lot of potential use cases. The best that comes into my
>> mind
>> >> > is
>> >> > sampling test data.
>> >> >
>> >> >
>> >> > On 29.04.2018 19:01, Isiah Meadows wrote:
>> >> >>
>> >> >> I think this would be better suited for a library function rather
>> than
>> >> >> a
>> >> >> language feature. I could see this also being useful also for
>> >> >> randomized displays, but that's about it. And I'm not sure what an
>> >> >> engine could provide here that a library couldn't - you can't really
>> >> >> get much faster than what's in the language (minus bounds checking,
>> >> >> but the likely frequent cache misses will eclipse that greatly), and
>> >> >> it's not unlocking any real new possibilities.
>> >> >
>> >> > As Tab Atkins Jr. already pointed out it's not about performance
>> >> > benefits.
>> >> > It's about how error-prone custom shuffle implementations are/can be.
>> >> >
>> >> > _______________________________________________
>> >> > es-discuss mailing list
>> >> > es-discuss at mozilla.org
>> >> > https://mail.mozilla.org/listinfo/es-discuss
>> >> _______________________________________________
>> >> es-discuss mailing list
>> >> es-discuss at mozilla.org
>> >> https://mail.mozilla.org/listinfo/es-discuss
>> >
>> >
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20180430/52acfe32/attachment-0001.html>


More information about the es-discuss mailing list