TMO Stability Dashboard is Changing
Benjamin Smedberg
benjamin at smedbergs.us
Wed Mar 1 17:50:20 UTC 2017
Especially in the transition period, having a consistent view across versions
to see whether a regression is date-based is important. I agree that builds which are
very old are much less interesting (probably defining "very old" is a
per-channel thing).
A lot of what release management needs is to determine whether we have a
regression by build, and to solve that issue the current view seems
inadequate. We really need a view that shows the crash rate per build. That's
far more valuable than inferring a regression by looking at dates.
--BDS
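
A minimal sketch of the two views being contrasted above, crash rate by
date versus crash rate by build. It assumes each row carries an
activity_date (the crash_aggregates field named later in this thread) plus
hypothetical build_id, crashes, and usage_hours fields; the sample numbers
are made up for illustration and are not dashboard data.

```python
from collections import defaultdict


def crash_rate_by(rows, key):
    """Return crashes per 1000 usage hours, grouped by rows[key]."""
    crashes = defaultdict(int)
    hours = defaultdict(float)
    for row in rows:
        crashes[row[key]] += row["crashes"]
        hours[row[key]] += row["usage_hours"]
    # Skip groups with no recorded usage to avoid dividing by zero.
    return {k: 1000.0 * crashes[k] / hours[k] for k in crashes if hours[k] > 0}


# Illustrative rows only; field names other than activity_date are assumptions.
rows = [
    {"activity_date": "2017-01-26", "build_id": "20170123", "crashes": 10, "usage_hours": 5000.0},
    {"activity_date": "2017-01-27", "build_id": "20170123", "crashes": 10, "usage_hours": 5000.0},
    {"activity_date": "2017-01-27", "build_id": "20170126", "crashes": 30, "usage_hours": 5000.0},
]

by_date = crash_rate_by(rows, "activity_date")
by_build = crash_rate_by(rows, "build_id")
```

In this made-up data the per-date view only shows the rate rising on the
second day, while the per-build view attributes the rise to the newer
build, which is the distinction the message above is drawing.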
On Tue, Feb 28, 2017 at 1:11 PM, Chris Hutten-Czapski <chutten at mozilla.com>
wrote:
> Relevance to release engineering and short-to-moderate term planning, I
> suppose. It is easier (possible) to improve the crash rate of the
> population that is up-to-date than the population that isn't?
>
> I'm not particularly attached to this approach; it is what came out of
> user feedback during development and use.
>
> However, conversely: How is showing all beta builds no matter their
> version number better than only showing the most-recent version? Are there
> signals that would show in that view that wouldn't show in this one?
>
> :chutten
>
> On Tue, Feb 28, 2017 at 1:04 PM, Benjamin Smedberg <benjamin at smedbergs.us>
> wrote:
>
>> Why do we have to choose? Why is either of these behaviors better than
>> showing all beta builds no matter their version number?
>>
>> --BDS
>>
>> On Tue, Feb 28, 2017 at 1:02 PM, Chris Hutten-Czapski <
>> chutten at mozilla.com> wrote:
>>
>>> The default (only) view of the dashboard shows crash rates by crash date
>>> (crash_aggregates' activity_date), so long as the crash happened in the
>>> most-recent (previously, most-used) version (not build).
>>>
>>> An example may help.
>>>
>>> For January 27, the previous version of the dashboard showed crashes
>>> for all beta builds of beta 51, even though beta 52 had already been
>>> released. This was because more people were spending more time on
>>> beta 51 than on beta 52.
>>>
>>> The present version of the dashboard now shows all crashes from beta52
>>> on that date instead, since crash rates on beta51 cease to be something we
>>> can do anything about. It's release 51 we then care about and can influence.
>>>
>>> I hope this helps.
>>>
>>> :chutten
>>>
>>>
>>> On Tue, Feb 28, 2017 at 12:50 PM, Benjamin Smedberg <
>>> benjamin at smedbergs.us> wrote:
>>>
>>>> I don't quite understand what this means.
>>>>
>>>> Is the default view of this dashboard to show crash rates by date, or
>>>> crash rates by build?
>>>>
>>>> If showing crash rates by date, why do you care what the version is?
>>>> Just show all versions for the channel.
>>>> If showing crash rates by build, you should be able to just line up the
>>>> e.g. 53 nightlies < 54 nightlies < 55 nightlies all on the same graph.
>>>>
>>>> In either case, it doesn't seem useful to require people to pick a
>>>> particular version.
>>>>
>>>> For most metrics, but especially for crashes, being able to switch
>>>> between date metrics and build metrics is important, because some
>>>> regressions are caused by stuff we check in and therefore show up clearly
>>>> on per-build charts. Other things such as crashes caused by an external
>>>> website are date-driven and having the date-based view helps correlate that
>>>> across channels.
>>>>
>>>> Having to pick a version is one of the least attractive things about
>>>> the histogram views on t.m.o as well. If you're chasing a nightly
>>>> regression around a version bump, you end up having to switch for no
>>>> particular reason.
>>>>
>>>> --BDS
>>>>
>>>>
>>>> On Tue, Feb 28, 2017 at 12:43 PM, Chris Hutten-Czapski <
>>>> chutten at mozilla.com> wrote:
>>>>
>>>>> Just a small change [1] with large ramifications to the TMO Stability
>>>>> Dashboard: https://telemetry.mozilla.org/crashes/
>>>>>
>>>>> Previously the dashboard was looking at whatever was the most-used
>>>>> release on a particular day, and plotting those crash numbers. Around
>>>>> release days that approach is rather less useful, as it might show data
>>>>> from an older, now-abandoned, release.
>>>>>
>>>>> From now on it will show data from the most-recent release for that
>>>>> particular day.
>>>>>
>>>>> It uses https://product-details.mozilla.org/ to determine what
>>>>> release is most recent. More details in the pull request and commit message.
>>>>>
>>>>> One cool thing from this is you can now more easily pick out when
>>>>> release numbers changed because there's a dip in the kuh graphs.
>>>>>
>>>>> One less-cool thing is that the (oft-confusing) %ge numbers in the
>>>>> table are no longer sufficient to give you an idea of how
>>>>> inflated/deflated the crash figures may be, as they are still tuned to
>>>>> last week's usage volume, regardless of how many releases it was split
>>>>> across.
>>>>>
>>>>> I have yet to come up with a replacement "trust" figure for how likely
>>>>> the numbers are to reflect some ideal, "true" crash rate and so am just
>>>>> wishing bug 1336360 along so that main pings will start being received
>>>>> faster.
>>>>>
>>>>> If you have any questions, please do ask.
>>>>>
>>>>> :chutten
>>>>>
>>>>> [1]: https://github.com/mozilla/telemetry-dashboard/pull/282
>>>>>
>>>>> _______________________________________________
>>>>> fhr-dev mailing list
>>>>> fhr-dev at mozilla.org
>>>>> https://mail.mozilla.org/listinfo/fhr-dev
>>>>>
>>>>>
>>>>
>>>
>>
>
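
The change discussed in this thread, from plotting the most-used version
to plotting the most-recent version for a given day, can be sketched
roughly as follows. This is an illustration under stated assumptions, not
the dashboard's actual code: the function names are hypothetical, versions
are simplified to integers, and the release dates are placeholders rather
than values taken from product-details.

```python
from datetime import date


def most_used_version(usage_hours_by_version):
    """Previous behaviour (sketch): the version with the most usage that day."""
    return max(usage_hours_by_version, key=usage_hours_by_version.get)


def most_recent_version(release_dates, day):
    """Current behaviour (sketch): the newest version released on or before day."""
    released = [version for version, released_on in release_dates.items()
                if released_on <= day]
    return max(released) if released else None


# Placeholder inputs for illustration; not real usage or release data.
usage = {51: 9000.0, 52: 3000.0}
release_dates = {51: date(2016, 12, 1), 52: date(2017, 1, 26)}
day = date(2017, 1, 27)

old_choice = most_used_version(usage)
new_choice = most_recent_version(release_dates, day)
```

With these placeholder inputs the older rule picks 51 (more usage hours)
and the newer rule picks 52 (latest release on or before the day),
mirroring the beta 51 / beta 52 example given in the thread.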