Intermittent test failures

Andrew McKay amckay at
Mon Oct 30 19:35:20 UTC 2017

Since a single bug can cover a test that fails a lot or one that fails
very infrequently, counting failures rather than bugs makes a lot of
sense to me.
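
To illustrate why the two measures can diverge, here is a minimal Python
sketch. The bug numbers and failure counts are made up for illustration and
do not come from Bugzilla or from any of the queries mentioned in this
thread; IntermittentBug and summarize are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class IntermittentBug:
    bug_id: int         # hypothetical bug number, not a real Bugzilla id
    failure_count: int  # failures recorded against this bug in the period


def summarize(bugs):
    """Return (number of bugs, total number of failures)."""
    return len(bugs), sum(b.failure_count for b in bugs)


# Made-up data: one very noisy test and two that almost never fail.
bugs = [
    IntermittentBug(bug_id=1, failure_count=120),
    IntermittentBug(bug_id=2, failure_count=1),
    IntermittentBug(bug_id=3, failure_count=2),
]

bug_count, failure_total = summarize(bugs)
print(f"{bug_count} bugs, {failure_total} failures")
```

In this made-up data, fixing bug 1 alone lowers the bug count by only one
but removes nearly all of the failures, which is the difference a
failure-based goal would capture.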

On 30 October 2017 at 06:22, Bob Silverberg <bsilverberg at> wrote:
> There was some discussion on Friday about how best to measure intermittents
> and what criteria we should use for a goal. Kris and Andrew suggested that
> we should look at the number of failures, as opposed to the number of bugs,
> and provided a couple of sample queries:
>
> Andrew's:
> Kris':
>
> I took a bit from each of these and produced one that gives us a total
> count, and includes the majority of WebExtensions tests:
>
> Is this what we want to base our Q4 OKR on?
> On Fri, Oct 6, 2017 at 9:50 AM, Bob Silverberg <bsilverberg at>
> wrote:
>> It turns out the first query referenced in the above email is no longer
>> correct. Please use [1] instead.
>> [1]
>> On Fri, Sep 29, 2017 at 9:15 AM, Bob Silverberg <bsilverberg at>
>> wrote:
>>> Hi Team and Friends of Add-Ons,
>>> You may have noticed that there are a large number of bugs filed about
>>> intermittent failures of our tests. At Andy's suggestion I have undertaken
>>> to better understand where we are with these and what action, if any, should
>>> be taken about them.
>>> I spoke to Joel Maher about the stockwell project, which aims to monitor
>>> intermittents and take action on those requiring it.
>>> In his opinion, the only intermittents that we should be spending any
>>> time on are ones that have been identified via stockwell, and all of those
>>> carry a whiteboard tag that includes "stockwell". This means that we
>>> don't really need to do any work in terms of querying and research; all we
>>> need is a couple of Bugzilla queries.
>>> [stockwell needswork] [1] - Any current intermittents that are serious
>>> enough to need our attention will be tagged with [stockwell needswork], so
>>> if we are looking for current intermittents to address, this is our one-stop
>>> shop. Bugs are marked with [stockwell needswork] when they fail 30 or more
>>> times in a 7-day period. There are currently zero of these bugs.
>>> [stockwell disabled] [2] - Any tests that have been disabled because they
>>> were failing intermittently with too high a frequency will be tagged with
>>> [stockwell disabled]. After dealing with any bugs that are labelled
>>> "needswork", we should start working through the backlog of these disabled
>>> tests. There are currently 14 of these bugs.
>>> Any new bugs filed with [stockwell needswork] will appear in our normal
>>> triage, as these bugs are also assigned a priority of "--", as opposed to
>>> other intermittents, which are assigned a priority of P5. When we encounter
>>> these during triage, we should find an appropriate person to assign them
>>> to, and that person should ideally try to fix the intermittent within a
>>> couple of weeks.
>>> For the disabled tests, I am going to do some more research to identify
>>> just how disabled they are. If a test is disabled on a single platform, it
>>> is less of an issue for us than if it is disabled across all platforms. Once
>>> we identify which tests are disabled enough to represent a significant
>>> reduction of coverage, we should try to find owners for those bugs and
>>> get them fixed in a timely manner.
>>> If you find yourself needinfo'd or assigned to an intermittent, it is
>>> likely because of the effort described above. If you are a contributor and
>>> you'd like to help us restore some test coverage, please feel free to browse
>>> through the list of disabled tests at [2] and let us know if you'd like to
>>> take any of them on.
>>> If anyone has any questions about any of this, please reply to this
>>> email.
>>> Thanks,
>>> Bob
>>> [1]
>>> [2]
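
The [stockwell needswork] rule quoted above (30 or more failures in a 7-day
period, with tagged bugs getting priority "--" and other intermittents
getting P5) can be sketched roughly as follows. This is only an
illustration: how stockwell actually defines its window is not stated in
the thread, so the sketch assumes 7 calendar days ending on a given day,
and needs_work and triage_priority are hypothetical names, not part of any
real tool.

```python
from datetime import date, timedelta

NEEDSWORK_THRESHOLD = 30  # "30 or more times", per the quoted email
WINDOW_DAYS = 7           # "in a 7-day period", per the quoted email


def needs_work(failure_dates, today):
    """Return True if there are at least 30 failures in the window.

    Assumption: the window is the 7 calendar days ending on `today`,
    inclusive. The real stockwell tooling may count differently.
    """
    window_start = today - timedelta(days=WINDOW_DAYS - 1)
    recent = [d for d in failure_dates if window_start <= d <= today]
    return len(recent) >= NEEDSWORK_THRESHOLD


def triage_priority(failure_dates, today):
    """Map the rule onto the priorities described in the email."""
    return "--" if needs_work(failure_dates, today) else "P5"


today = date(2017, 9, 29)

# Made-up data: 30 failures spread over the last 7 days.
noisy = [today - timedelta(days=i % WINDOW_DAYS) for i in range(30)]
# Made-up data: 30 failures, but only one per day over the last 30 days.
quiet = [today - timedelta(days=i) for i in range(30)]

print(triage_priority(noisy, today))
print(triage_priority(quiet, today))
```

Both made-up tests have 30 failures in total, but only the first has them
all inside the 7-day window, so only it would show up in normal triage
under this reading of the rule.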