Intermittent test failures

Bob Silverberg bsilverberg at
Mon Oct 30 13:22:59 UTC 2017

There was some discussion on Friday about how best to measure intermittents
and what criteria we should use for a goal. Kris and Andrew suggested that
we should look at the number of failures, as opposed to the number of bugs,
and provided a couple of sample queries:


I took a bit from each of these and produced a query that gives us a total
count and covers the majority of WebExtensions tests:

Is this what we want to base our Q4 OKR on?

On Fri, Oct 6, 2017 at 9:50 AM, Bob Silverberg <bsilverberg at>
wrote:

> It turns out the first query referenced in the above email is no longer
> correct. Please use [1] instead.
>
> [1]
>
> On Fri, Sep 29, 2017 at 9:15 AM, Bob Silverberg <bsilverberg at>
> wrote:
>> Hi Team and Friends of Add-Ons,
>>
>> You may have noticed that there are a large number of bugs filed about
>> intermittent failures of our tests. At Andy's suggestion I have undertaken
>> to better understand where we are with these and what action, if any,
>> should be taken about them.
>>
>> I spoke to Joel Maher about the stockwell project, which aims to monitor
>> intermittents and take action on those requiring it.
>>
>> In his opinion, the only intermittents that we should be spending any
>> time on are ones that have been identified via stockwell, and all of those
>> have a whiteboard tag that includes "stockwell". This means that we don't
>> really need to do any work in terms of querying and research; all we need
>> is a couple of Bugzilla queries.
>>
>> [stockwell needswork] [1] - Any current intermittents that are serious
>> enough to need our attention will be tagged with [stockwell needswork], so
>> if we are looking for current intermittents to address, this is our
>> one-stop shop. Bugs are marked with [stockwell needswork] when they fail 30
>> or more times in a 7-day period. There are currently zero of these bugs.
>>
>> [stockwell disabled] [2] - Any tests that have been disabled because they
>> were failing intermittently with too high a frequency will be tagged with
>> [stockwell disabled]. After dealing with any bugs that are labelled
>> "needswork", we should start working through the backlog of these disabled
>> tests. There are currently 14 of these bugs.
>>
>> Any new bugs filed with [stockwell needswork] will appear in our normal
>> triage, as these bugs are also assigned a priority of "--", as opposed to
>> other intermittents, which are assigned a priority of P5. When we encounter
>> these during triage we should find an appropriate person to whom to assign
>> them, and that person should ideally try to fix the intermittent within a
>> couple of weeks.
>>
>> For the disabled tests, I am going to do some more research to identify
>> just how disabled they are. If a test is disabled on a single platform, it
>> is less of an issue for us than if it is disabled across all platforms.
>> Once we identify which tests are disabled enough to represent a significant
>> reduction of coverage, we should try to find owners for those bugs and
>> get them fixed in a timely manner.
>>
>> If you find yourself needinfo'd or assigned to an intermittent, it is
>> likely because of the effort described above. If you are a contributor and
>> you'd like to help us restore some test coverage, please feel free to
>> browse through the list of disabled tests at [2] and let us know if you'd
>> like to take any of them on.
>>
>> If anyone has any questions about any of this, please reply to this email.
>>
>> Thanks,
>> Bob
>>
>> [1]
>> [2]
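
For anyone who wants to reason about the [stockwell needswork] threshold
quoted above (30 or more failures in a 7-day period), here is a minimal
sketch of that rule as a sliding-window check. This is not how stockwell
itself computes the tag; the function name, the constants, and the choice
to treat the window as half-open (failures exactly 7 days apart are not
counted together) are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Threshold and window taken from the email; the names are hypothetical
# and are not part of stockwell or Bugzilla.
FAILURE_THRESHOLD = 30
WINDOW = timedelta(days=7)


def needs_work(failure_times, threshold=FAILURE_THRESHOLD, window=WINDOW):
    """Return True if some span shorter than `window` holds `threshold`
    or more failures."""
    times = sorted(failure_times)
    start = 0
    for end, current in enumerate(times):
        # Move the left edge forward until the span fits inside the window
        # (assumption: a failure exactly `window` older is excluded).
        while current - times[start] >= window:
            start += 1
        if end - start + 1 >= threshold:
            return True
    return False
```

Given a list of datetime objects, one per recorded failure of a test,
needs_work(failure_times) says whether the test would cross the threshold
under these assumptions. A test that fails once a day never does, while 30
failures spread over five days does.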

More information about the Dev-addons mailing list