new de-orphaned fhr data available
Brendan Colloran
bcolloran at mozilla.com
Fri Mar 7 13:38:18 PST 2014
Tmary requested that we move the pipeline's start day from Thursday to Monday, to reduce the risk of cluster downtime over the weekend. I am going to abort the current batch and start a new one next Monday. I will send a note when that batch is complete, probably around next Friday.
-Brendan
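
For anyone scripting against the dated path convention described in the quoted thread below (/user/bcolloran/fhrDeorphaned_YYYY-MM-DD, with only the two most recent data sets retained), here is a minimal sketch of how a consumer such as the sampler might pick the newest batch out of a directory listing. It is not the pipeline's actual code. The function names are hypothetical, the listing is assumed to have been obtained separately (for example from an HDFS directory listing), and every date in the usage example other than 2014-02-27 is made up for illustration.

```python
import re
from datetime import date

# Path convention taken from the thread; everything else here is illustrative.
_PATTERN = re.compile(r"^/user/bcolloran/fhrDeorphaned_(\d{4})-(\d{2})-(\d{2})$")


def parse_batch_date(path):
    """Return the date encoded in a de-orphaned data set path, or None."""
    m = _PATTERN.match(path.rstrip("/"))
    if not m:
        return None
    try:
        return date(int(m.group(1)), int(m.group(2)), int(m.group(3)))
    except ValueError:
        # Digits matched the pattern but do not form a real calendar date.
        return None


def split_by_retention(paths, keep=2):
    """Split paths into (kept, stale), newest first, keeping `keep` batches.

    Paths that do not follow the naming convention are ignored entirely.
    """
    dated = []
    for p in paths:
        d = parse_batch_date(p)
        if d is not None:
            dated.append((d, p))
    dated.sort(reverse=True)
    ordered = [p for _, p in dated]
    return ordered[:keep], ordered[keep:]


# Hypothetical listing; only the 2014-02-27 entry appears in the thread.
listing = [
    "/user/bcolloran/fhrDeorphaned_2014-02-27",
    "/user/bcolloran/fhrDeorphaned_2014-02-20",
    "/user/bcolloran/fhrDeorphaned_2014-03-06",
    "/user/bcolloran/somethingElse",
]
kept, stale = split_by_retention(listing)
latest = kept[0] if kept else None
```

Note that a directory showing up under this name does not by itself prove the batch is complete; until the process is formalized, the email notices remain the signal that a data set is ready.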
----- Original Message -----
> From: "Anurag Phadke" <aphadke at mozilla.com>
> To: "John Jensen" <jjensen at mozilla.com>
> Cc: "Brendan Colloran" <bcolloran at mozilla.com>, "Saptarshi Guha" <joy at mozilla.com>, "Analytics team"
> <quants at mozilla.com>, "Thirunavukarasu Meyarivan" <tmary at mozilla.com>, "FHR-dev" <fhr-dev at mozilla.org>
> Sent: Tuesday, March 4, 2014 6:28:09 PM
> Subject: Re: new de-orphaned fhr data available
>
> Hey John
> We will start working with Brendan to productionize this code. It shouldn't
> be too hard.
>
> -anurag
>
> > On Mar 4, 2014, at 6:22 PM, John Jensen <jjensen at mozilla.com> wrote:
> >
> > Hi all,
> >
> > AFAICT this is going to be our main source-of-truth FHR dataset for some
> > time, so we should get this process routinized as much as is practicable.
> > Anurag, how problematic would this be?
> >
> > John
> >
> > ----- Original Message -----
> >> From: "Brendan Colloran" <bcolloran at mozilla.com>
> >> To: "Saptarshi Guha" <joy at mozilla.com>
> >> Cc: "Analytics team" <quants at mozilla.com>, "Anurag Phadke"
> >> <aphadke at mozilla.com>, "Thirunavukarasu Meyarivan"
> >> <tmary at mozilla.com>, "FHR-dev" <fhr-dev at mozilla.org>
> >> Sent: Tuesday, March 4, 2014 1:51:06 PM
> >> Subject: Re: new de-orphaned fhr data available
> >>
> >> Mmm, well, we haven't had a Formal Discussion with IT or anything, and we
> >> haven't followed CAB procedures, so I guess everything is subject to
> >> change without warning until we decide to stop doing things the cowboy
> >> way. But the job runs once a week (on Thursday), and for the time being I
> >> will consistently be putting the final cleaned data on hdfs at:
> >>
> >> /user/bcolloran/fhrDeorphaned_YYYY-MM-DD
> >>
> >> The pipeline takes as long as it takes, which seems to be about 5ish days.
> >> But since it's also variable based on cluster load, I'll continue sending
> >> email alerts out until we formalize things a bit. But if you wish you can
> >> look for new data at the above path.
> >>
> >> Because the data is big and the space is limited, I'm also only retaining
> >> the two most recent de-orphaned data sets. Older data will be deleted when
> >> the new data is done being processed.
> >>
> >> -Brendan
> >>
> >> ----- Original Message -----
> >>> From: "Saptarshi Guha" <sguha at mozilla.com>
> >>> To: "Brendan Colloran" <bcolloran at mozilla.com>
> >>> Cc: "Analytics team" <quants at mozilla.com>, "Anurag Phadke"
> >>> <aphadke at mozilla.com>, "Thirunavukarasu Meyarivan"
> >>> <tmary at mozilla.com>, "FHR-dev" <fhr-dev at mozilla.org>
> >>> Sent: Tuesday, March 4, 2014 1:29:26 PM
> >>> Subject: Re: new de-orphaned fhr data available
> >>>
> >>> Will this be automated? A cron job? That way the sampler can check for
> >>> the presence of a file and then do its thing.
> >>>
> >>>
> >>> ----- Original Message -----
> >>>> From: "Brendan Colloran" <bcolloran at mozilla.com>
> >>>> To: "Analytics team" <quants at mozilla.com>, "Anurag Phadke"
> >>>> <aphadke at mozilla.com>, "Thirunavukarasu Meyarivan"
> >>>> <tmary at mozilla.com>
> >>>> Cc: "FHR-dev" <fhr-dev at mozilla.org>
> >>>> Sent: Tuesday, March 4, 2014 11:17:30 AM
> >>>> Subject: new de-orphaned fhr data available
> >>>>
> >>>> on hdfs at:
> >>>> /user/bcolloran/fhrDeorphaned_2014-02-27
> >>>>
> >>>> saptarshi, please reply-all when you have samples ready. thanks!
> >>>>
> >>>> -brendan
> >
> > --
> > John Jensen | Director, Metrics | Mozilla Corporation
>