new de-orphaned fhr data available
Anurag Phadke
aphadke at mozilla.com
Tue Mar 4 18:28:09 PST 2014
Hey John
We will start working with Brendan to productionize this code. It shouldn't be too hard.
-anurag
> On Mar 4, 2014, at 6:22 PM, John Jensen <jjensen at mozilla.com> wrote:
>
> Hi all,
>
> AFAICT this is going to be our main source-of-truth FHR dataset for some time, so we should get this process routinized as much as is practicable. Anurag, how problematic would this be?
>
> John
>
> ----- Original Message -----
>> From: "Brendan Colloran" <bcolloran at mozilla.com>
>> To: "Saptarshi Guha" <joy at mozilla.com>
>> Cc: "Analytics team" <quants at mozilla.com>, "Anurag Phadke" <aphadke at mozilla.com>, "Thirunavukarasu Meyarivan"
>> <tmary at mozilla.com>, "FHR-dev" <fhr-dev at mozilla.org>
>> Sent: Tuesday, March 4, 2014 1:51:06 PM
>> Subject: Re: new de-orphaned fhr data available
>>
>> Mmm, well we haven't had a Formal Discussion with IT or anything, and we haven't
>> followed CAB procedures, so I guess everything is subject to change without
>> warning until we decide to stop doing things the cowboy way. But the job
>> runs once a week (on Thursday), and for the time being I will consistently
>> be putting the final cleaned data on hdfs at:
>>
>> /user/bcolloran/fhrDeorphaned_YYYY-MM-DD
>>
>> The pipeline takes as long as it takes, which seems to be about 5ish days.
>> But since it's also variable based on cluster load, I'll continue sending
>> email alerts out until we formalize things a bit. But if you wish you can
>> look for new data at the above path.
>>
>> Because the data is big and the space is limited, I'm also only retaining the
>> two most recent de-orphaned data sets. Older data will be deleted when the
>> new data is done being processed.
>>
>> -Brendan
>>
>> ----- Original Message -----
>>> From: "Saptarshi Guha" <sguha at mozilla.com>
>>> To: "Brendan Colloran" <bcolloran at mozilla.com>
>>> Cc: "Analytics team" <quants at mozilla.com>, "Anurag Phadke"
>>> <aphadke at mozilla.com>, "Thirunavukarasu Meyarivan"
>>> <tmary at mozilla.com>, "FHR-dev" <fhr-dev at mozilla.org>
>>> Sent: Tuesday, March 4, 2014 1:29:26 PM
>>> Subject: Re: new de-orphaned fhr data available
>>>
>>> Will this be automated? cron job? So that the sampler can inspect for the
>>> presence of a file and then do its thing.
>>>
>>>
>>> ----- Original Message -----
>>>> From: "Brendan Colloran" <bcolloran at mozilla.com>
>>>> To: "Analytics team" <quants at mozilla.com>, "Anurag Phadke"
>>>> <aphadke at mozilla.com>, "Thirunavukarasu Meyarivan"
>>>> <tmary at mozilla.com>
>>>> Cc: "FHR-dev" <fhr-dev at mozilla.org>
>>>> Sent: Tuesday, March 4, 2014 11:17:30 AM
>>>> Subject: new de-orphaned fhr data available
>>>>
>>>> on hdfs at:
>>>> /user/bcolloran/fhrDeorphaned_2014-02-27
>>>>
>>>> saptarshi, please reply-all when you have samples ready. thanks!
>>>>
>>>> -brendan
>
> --
> John Jensen | Director, Metrics | Mozilla Corporation
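
The polling approach discussed in this thread (the sampler checking hdfs for a new fhrDeorphaned_YYYY-MM-DD directory) could be sketched roughly as follows. This is a minimal illustration, not code that anyone on the thread actually ran: the function names are hypothetical, and it assumes the stock `hdfs dfs -ls` command is on the PATH and that a dated directory only appears once the data is final.

```python
import re
import subprocess
from datetime import date

# Naming pattern taken from the thread: fhrDeorphaned_YYYY-MM-DD
DATASET_RE = re.compile(r"fhrDeorphaned_(\d{4})-(\d{2})-(\d{2})$")


def dataset_date(path):
    """Return the date encoded in a dataset path, or None if it does not match."""
    m = DATASET_RE.search(path.rstrip("/"))
    if m is None:
        return None
    try:
        return date(*map(int, m.groups()))
    except ValueError:  # well-formed name but impossible date, e.g. month 13
        return None


def newest_dataset(paths, after=None):
    """Pick the most recent dataset path, optionally only if newer than `after`."""
    dated = [(dataset_date(p), p) for p in paths]
    dated = [(d, p) for d, p in dated if d is not None]
    if after is not None:
        dated = [(d, p) for d, p in dated if d > after]
    return max(dated)[1] if dated else None


def list_hdfs_dir(parent="/user/bcolloran"):
    """List entries under `parent` (hypothetical helper around `hdfs dfs -ls`)."""
    out = subprocess.run(
        ["hdfs", "dfs", "-ls", parent],
        check=True, capture_output=True, text=True,
    ).stdout
    # Entry lines start with a permission string ("d..." or "-...") and end
    # with the path; the "Found N items" header line is skipped.
    return [line.split()[-1] for line in out.splitlines()
            if line.startswith(("d", "-"))]
```

A weekly cron job on the consumer side could call `newest_dataset(list_hdfs_dir(), after=last_processed_date)` and start sampling only when it returns a path. Since only the two most recent data sets are retained, the consumer would need to pick up each one before it is deleted.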