100% Pre-Release Longitudinal Table Splitting

Frank Bertsch fbertsch at mozilla.com
Thu Mar 9 13:34:33 UTC 2017


Hi All,

I'm building the pre-release longitudinal dataset, which would have every
probe for every process as a column. However, we are running into an issue
where the resulting table is too wide for Spark [0]. My plan now is to
split it into three tables, one per process type (parent, content, gpu).
Each table would carry all of the environment data for a client, but if
you wanted to compare probes across process types, you would have to join
the tables on the client.
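
To make the cost of the split concrete, here is a minimal sketch of what a
cross-process comparison might look like. It uses Python's built-in sqlite3
as a stand-in for Spark SQL, and the table names, the client_id join key,
the os column, and the gc_ms probe column are all assumptions made for
illustration, not the final schema.

```python
import sqlite3

# Hypothetical per-process tables: the names, the process list, and the
# columns are illustrative assumptions, not the real longitudinal schema.
PROCESSES = ("parent", "content", "gpu")

conn = sqlite3.connect(":memory:")
for process in PROCESSES:
    # Every table repeats the environment data (here just "os") and
    # holds only the probes recorded for its own process type.
    conn.execute(
        f"CREATE TABLE longitudinal_{process} ("
        "client_id TEXT PRIMARY KEY, os TEXT, gc_ms INTEGER)"
    )

# Made-up sample rows; client "b" has no gpu process data.
rows = {
    "parent": [("a", "Linux", 12), ("b", "Windows", 30)],
    "content": [("a", "Linux", 45), ("b", "Windows", 8)],
    "gpu": [("a", "Linux", 3)],
}
for process, data in rows.items():
    conn.executemany(
        f"INSERT INTO longitudinal_{process} VALUES (?, ?, ?)", data
    )


def compare_probe(conn):
    """Return one row per client with the probe value from each process."""
    query = """
        SELECT p.client_id, p.os, p.gc_ms, c.gc_ms, g.gc_ms
        FROM longitudinal_parent AS p
        JOIN longitudinal_content AS c ON c.client_id = p.client_id
        LEFT JOIN longitudinal_gpu AS g ON g.client_id = p.client_id
        ORDER BY p.client_id
    """
    return conn.execute(query).fetchall()


result = compare_probe(conn)
for row in result:
    print(row)
```

The LEFT JOIN on the gpu table is a choice made for this sketch so that
clients without a gpu process still show up, with a NULL in that column.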

Any objections to this?

-Frank

[0] https://issues.apache.org/jira/browse/SPARK-18016


