Yet another attempt at typed JS data

kai zhu kaizhu256 at gmail.com
Mon Feb 10 17:12:56 UTC 2020


if you really care about performance AND structured-data, perhaps
javascript is not the best tool for the job.

if you go to the wasm-sqlite3 demo @
https://kaizhu256.github.io/demo-sqljs-csv/
you can paste the following code into the browser's dev-console to ingest a
million-row csv and run queries on it:

```js
(async function () {
    "use strict";
    let csv;
    let ii;
    let randomSelect;
    let result;
    randomSelect = function (list) {
    /*
     * this function will select a random element from list
     */
        return list[Math.floor(list.length * Math.random())];
    };
    csv = "rowid,name,address,random\n";
    ii = 0;
    while (ii < 1000000) {
        csv += (
            ii + 1 + ","
            + randomSelect([
                "Bob", "Jane", "John"
            ]) + " " + randomSelect([
                "Doe", "Smith", "Williams"
            ]) + ","
            + "\"1234 Main St., Los Angeles, CA 90001\","
            + Math.random() + "\n"
        );
        ii += 1;
    }
    console.error(csv.slice(0, 1000));
    // rowid,name,address,random
    // 1,Jane Doe,"1234 Main St., Los Angeles, CA 90001",0.8783498663648375
    // 2,Bob Williams,"1234 Main St., Los Angeles, CA 90001",0.22973214766766303
    // 3,John Doe,"1234 Main St., Los Angeles, CA 90001",0.8658095647533652
    // 4,Jane Smith,"1234 Main St., Los Angeles, CA 90001",0.27730496836028085
    // ...
    // 1000000,Jane Williams,"1234 Main St., Los Angeles, CA 90001",0.43105992922801883
    await window.sqljsTableImport({
        csv,
        tableName: "table1"
    });
    // sqljsTableImport - 945 ms - inserted 12,572 rows - 1 MB
    // sqljsTableImport - 1163 ms - inserted 25,017 rows - 2 MB
    // ...
    // sqljsTableImport - 6242 ms - inserted 997,423 rows - 81 MB
    // sqljsTableImport - 6252 ms - inserted 1,000,000 rows - 81 MB
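    result = await window.sqljsExec(
        "SELECT * FROM table1 WHERE\n"
        // trailing "%" wildcard, so names beginning with "John" match;
        // without it, LIKE only matches a name that is exactly "John"
        + "name LIKE 'John%'\n"
        + "AND random > 0.5\n"
        + "ORDER BY random DESC\n"
        + "LIMIT 1000;"
    );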
    console.error(result.results[0].values);
    // ["961621", "John Doe", "1234 Main St., Los Angeles, CA 90001", "0.999 ...
    // ["51800", "John Williams", "1234 Main St., Los Angeles, CA 90001", "0.999 ...
    // ["241184", "John Smith", "1234 Main St., Los Angeles, CA 90001", "0.999 ...
    // ["591592", "John Williams", "1234 Main St., Los Angeles, CA 90001", "0.999 ...
    // ["32403", "John Doe", "1234 Main St., Los Angeles, CA 90001", "0.999 ...
    // ["847237", "John Smith", "1234 Main St., Los Angeles, CA 90001", "0.999 ...
    // ["23195", "John Doe", "1234 Main St., Los Angeles, CA 90001", "0.999 ...
    // ["136423", "John Smith", "1234 Main St., Los Angeles, CA 90001", "0.999 ...
}());
```
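
for readers without the demo page open, the SELECT above can be roughly
approximated in plain javascript over an array of row-objects. this is only a
sketch under assumptions: the sample rows are made up, `queryRows` is a
hypothetical helper (not part of the demo), and it assumes the filter is meant
to match names beginning with "John", as the sample output suggests. it says
nothing about how the wasm-sqlite3 build works internally or how it performs:

```javascript
"use strict";
// hypothetical sample rows, shaped like the csv columns above
const rows = [
    {rowid: 1, name: "Jane Doe", random: 0.87},
    {rowid: 2, name: "John Doe", random: 0.22},
    {rowid: 3, name: "John Smith", random: 0.91},
    {rowid: 4, name: "Bob Williams", random: 0.65},
    {rowid: 5, name: "John Williams", random: 0.58}
];
// hypothetical helper that mimics
// WHERE name LIKE 'John%' AND random > 0.5 ORDER BY random DESC LIMIT <limit>
// note: startsWith is case-sensitive, whereas sqlite's LIKE ignores ascii case
function queryRows(list, limit) {
    // filter returns a new array, so the sort below does not mutate list
    return list.filter(function (row) {
        return row.name.startsWith("John") && row.random > 0.5;
    }).sort(function (aa, bb) {
        return bb.random - aa.random;
    }).slice(0, limit);
}
const result = queryRows(rows, 1000);
console.log(result.map(function (row) {
    return row.rowid;
}));
// [ 3, 5 ]
```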
-kai

On Mon, Feb 10, 2020 at 3:08 AM Andrea Giammarchi <
andrea.giammarchi at gmail.com> wrote:

> Unfortunately, `Array.from({ length: 4 }, () => whatever)` produces a
> holey array, so the `.repeat(...)` idea, if capable of packing
> elements in a better way, wouldn't be so terrible as a simplification.
>
> Although, the intent of this proposal was also to grant "shapes", or a
> kind for each entry, the same way typed Arrays do, but maybe that would
> require some better primitive, as in `const Shape =
> Object.defineShape(...)` and `Object.createShape(Shape)` or similar.
>
> On Sun, Feb 9, 2020 at 10:01 PM Jordan Harband <ljharb at gmail.com> wrote:
>
>> That already exists - `Array.from({ length: 4 }, () => whatever)` - I
>> assume that the hope is to have an array where it is *impossible* for it to
>> have the wrong "kind" of data, and a userland factory function wouldn't
>> provide that.
>>
>> On Sun, Feb 9, 2020 at 10:39 AM kai zhu <kaizhu256 at gmail.com> wrote:
>>
>>> > It's a bit of a mess to create an Array that is not holed and gets
>>> best optimizations [1], and this proposal would like to address that exact
>>> case.
>>>
>>> could the performance issue be resolved more easily with a simple
>>> static-function `Array.repeat(<length>, <repeater>)`?
>>>
>>> ```js
>>> let structuredList;
>>> structuredList = Array.repeat(4, function (ii) {
>>>     return {
>>>         index: 2 * ii + 1,
>>>         tags: []
>>>     };
>>> });
>>> /*
>>> structuredList = [
>>>     { index: 1, tags: [] },
>>>     { index: 3, tags: [] },
>>>     { index: 5, tags: [] },
>>>     { index: 7, tags: [] }
>>> ];
>>>  */
>>> ```
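
`Array.repeat` is not part of the language, so the snippet above cannot run
as-is. a userland approximation of the idea might look like the sketch below.
the name `arrayRepeat` is hypothetical, and building the list with `push` is
only assumed (following the advice in the V8 elements-kinds post cited in this
thread) to avoid creating holes; whether a given engine keeps the result
packed is not something this code can verify:

```javascript
"use strict";
// hypothetical userland stand-in for the proposed
// Array.repeat(<length>, <repeater>)
function arrayRepeat(length, repeater) {
    const list = [];
    let ii = 0;
    // push one element at a time, so no index is ever left unassigned
    while (ii < length) {
        list.push(repeater(ii));
        ii += 1;
    }
    return list;
}
const structuredList = arrayRepeat(4, function (ii) {
    return {
        index: 2 * ii + 1,
        tags: []
    };
});
console.log(JSON.stringify(structuredList));
// [{"index":1,"tags":[]},{"index":3,"tags":[]},{"index":5,"tags":[]},{"index":7,"tags":[]}]
```

each element gets its own `tags` array, because the repeater is called once
per index.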
>>>
>>> the only time i can practically enforce the shape of a "StructuredArray"
>>> is during element-insertion,
>>> and a userland insertion/creation function would be just as effective as
>>> a StructuredArray constructor.
>>>
>>> enforcing shapes during element deletions and updates is going to be
>>> hard
>>> and likely just as confusing with StructuredArray as they are with
>>> regular Array.
>>>
>>> also note that most javascript arrays need to be easily JSON-serialized
>>> for message-passing
>>> over-the-wire (commonly http) to external systems.
>>>
>>> -kai
>>>
>>> On Sat, Feb 8, 2020 at 3:46 AM Andrea Giammarchi <
>>> andrea.giammarchi at gmail.com> wrote:
>>>
>>>> > having to retroactively add checks like...
>>>>
>>>> we already have typed arrays in JS so I don't think this would be any
>>>> different
>>>>
>>>> > I _think_ that modern virtual machines already did these
>>>> optimisations even though there isn't a TypedArray like that.
>>>>
>>>> It's a bit of a mess to create an Array that is not holed and gets best
>>>> optimizations [1], and this proposal would like to address that exact case.
>>>>
>>>> [1] https://v8.dev/blog/elements-kinds
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> es-discuss mailing list
>>>> es-discuss at mozilla.org
>>>> https://mail.mozilla.org/listinfo/es-discuss
>>>>