Going forward with Quality

Ludovic Hirlimann ludovic at mozilla.com
Sun Sep 2 16:17:03 UTC 2012


Hi,

tl;dr: a description of how quality control is done today and ideas on
how we should continue.

I'm Ludovic and I'm the quality lead for Thunderbird. I've been managing 
quality with the help of the community since February 2009. I work full 
time (5 days a week) on quality and I try to make sure that we ship a 
stable product.

So, what does my job consist of these days?


  Today


    I Tools

Maintaining the tools and making sure the tools Mozilla uses stay 
compatible with the way we use them. The tools we use are:

  * Socorro (the interface to the crash database)
  * Bugzilla (the bug tracking system)
  * Litmus (and its upcoming replacement, the testcase management software)

There isn't much work there: just checking once in a while that things 
continue to work. The people maintaining the tools have always been a 
pleasure to work with.


    II Testing

Testing is divided between new feature testing and regression testing.


      New feature testing

This involves figuring out what the new feature will be, how it will 
work, and using it with normal use cases and then edge cases. Once 
testing is done this means:

  * Adding new testcases to Litmus or the testcase management tool
  * Filing bugs so the developers can try to fix them before the
    feature makes it to a release

I usually send an email to the Thunderbird-testers mailing list when I 
feel that the feature is "ready", to get broader testing. In this email 
I usually explain what the new feature is supposed to do, and give 
hints on where to file bugs.


      Regression testing

This is done prior to releases and is how I make sure that Thunderbird 
doesn't regress. I always test the auto-update mechanism myself for all 
releases, as it's easier that way and it needs coordination with 
Release Engineering and the release drivers; this is around 2 hours of 
testing when everything is coordinated properly.

Since we've jumped on the rapid release trains, I usually do my 
regression testing in the following way (a small scripting sketch 
follows the list):

 1. Fire up my Windows VM
 2. Install the beta that we are going to release, or the final build
 3. Create a POP account
     1. Receive emails
     2. Send emails
     3. Send an email to the other account with a return receipt on
     4. Create a saved search
 4. Create an IMAP account
      * Do the same as for the POP3 testing
 5. Create an RSS account
      * Add a few feeds, read them (and as I usually use a newspaper
        feed, check that it gets updated)
 6. Create an NNTP account
      * Post to some .test newsgroup
 7. Send an email to the release driver stating whether the build is OK

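(Most of the above is manual UI testing by nature, but the send/receive 
part of steps 3 and 4 could be sanity-checked with a small script before 
a run. Here is a minimal sketch in Python; the host name, account and 
ports are placeholders for a throwaway test server, and it only proves 
that the test accounts work - it does not exercise Thunderbird itself.)

    # Smoke-test sketch: can we send and then see mail over POP and IMAP?
    # HOST/USER/PASSWORD are placeholders for a disposable test account.
    import imaplib
    import poplib
    import smtplib
    from email.message import EmailMessage

    HOST = "mail.example.test"
    USER = "qa@example.test"
    PASSWORD = "secret"

    def send_test_mail(subject):
        # Mirrors "send emails" in steps 3.2/3.3 of the checklist.
        msg = EmailMessage()
        msg["From"] = USER
        msg["To"] = USER
        msg["Subject"] = subject
        msg.set_content("Automated regression smoke test.")
        with smtplib.SMTP(HOST, 587) as smtp:
            smtp.starttls()
            smtp.login(USER, PASSWORD)
            smtp.send_message(msg)

    def count_pop_messages():
        # Mirrors "receive emails" on the POP account (step 3.1).
        pop = poplib.POP3_SSL(HOST)
        pop.user(USER)
        pop.pass_(PASSWORD)
        count, _size = pop.stat()
        pop.quit()
        return count

    def count_imap_messages():
        # Same check over IMAP (step 4).
        with imaplib.IMAP4_SSL(HOST) as imap:
            imap.login(USER, PASSWORD)
            imap.select("INBOX")
            status, data = imap.search(None, "ALL")
            return len(data[0].split()) if status == "OK" else 0

    if __name__ == "__main__":
        send_test_mail("regression smoke test")
        print("POP messages:", count_pop_messages())
        print("IMAP messages:", count_imap_messages())
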
When a large feature lands I will call for testers on 
mozilla.dev.apps.thunderbird, tb-planning and the thunderbird-testers 
mailing list (e.g. the last time we did that was when maildir-like 
support landed), to have more than one pair of eyes trying to figure 
out what might be broken. I don't do it more often because I feel it's 
time consuming, I don't get many testers (the max is about 20, the min 
is 1 or 2), and we don't find regressions, or not enough to my taste, 
when using this method.


    III Bugzilla

Most of my time is spent in Bugzilla (a small query sketch follows the 
list):

 1. Reading all comments on all bugs
 2. Making sure flags are set
 3. Trying to recruit new contributors
      * by email
 4. Engaging with new contributors
      * by email
 5. Engaging with the rest of Mozilla when we are affected by a Core issue
      * either by mail, or in the bugs
 6. Trying to find important regression issues
      * pinging the proper developers on these
      * asking on IRC about other issues when I'm unsure
 7. Replying ASAP to new bugs to try to capture as much information as
    possible
      * replying on the day the bug is fresh, when bug reporters are most
        likely to answer questions and add information

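(For item 7, a quick query against Bugzilla's REST API can list what was 
filed during the last day, so fresh reports don't slip through. Below is 
a rough sketch in Python; the endpoint and parameter names come from the 
public REST documentation and are my assumption, not part of the 
workflow described above.)

    # List Thunderbird bugs filed in the last 24 hours so they can be
    # answered while the report is still fresh.
    import datetime
    import requests

    BUGZILLA = "https://bugzilla.mozilla.org/rest/bug"

    def fresh_thunderbird_bugs(hours=24):
        since = (datetime.datetime.utcnow()
                 - datetime.timedelta(hours=hours)).strftime("%Y-%m-%dT%H:%M:%SZ")
        params = {
            "product": "Thunderbird",
            "creation_time": since,  # created at or after this time
            "include_fields": "id,summary,component",
        }
        resp = requests.get(BUGZILLA, params=params, timeout=30)
        resp.raise_for_status()
        return resp.json().get("bugs", [])

    if __name__ == "__main__":
        for bug in fresh_thunderbird_bugs():
            print("bug %d [%s] %s" % (bug["id"], bug["component"], bug["summary"]))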

  Going Forward


    I Bugzilla

We need to find a process that works for both developers and people 
involved in QA so that bugs get fixed.
We need to fix old bugs as well as new bugs that arise from new features 
landing or from Core Gecko changes.

Here is a first list of criteria we should use for bubbling up bugs:

 1. The number of people affected (we'll probably need some input from
    support for this)
 2. Is it caused by a new feature?
 3. Is it a main feature of the product, or just an edge case (e.g. an
    edge case of printing)?

Then we'll need a way to expose those bugs/issues to devs, and devs 
will need a way to look at, assign and fix them. I'm thinking about 
sending a summary email at a known cadence (e.g. once a month, once a 
week, or every 15 days) - a rough sketch of how such a summary could be 
put together follows.

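(To make that concrete, here is a rough sketch of how the three criteria 
could feed a simple ranking for that summary mail. The record layout, 
the weights and the sample entries are all made up for illustration; the 
actual scoring is exactly what we would need to agree on.)

    # Toy ranking of bugs by the three criteria above; weights and the
    # sample bugs are invented purely for illustration.
    from dataclasses import dataclass

    @dataclass
    class QualityBug:
        bug_id: int
        summary: str
        users_affected: int     # criterion 1, e.g. an estimate from support
        from_new_feature: bool  # criterion 2
        main_feature: bool      # criterion 3

    def score(bug):
        # Arbitrary weights: main features and fresh regressions bubble up.
        s = bug.users_affected
        if bug.from_new_feature:
            s += 50
        if bug.main_feature:
            s += 100
        return s

    def summary_mail(bugs):
        lines = ["Top quality issues this period:", ""]
        for bug in sorted(bugs, key=score, reverse=True):
            lines.append("  bug %d: %s (score %d)"
                         % (bug.bug_id, bug.summary, score(bug)))
        return "\n".join(lines)

    if __name__ == "__main__":
        sample = [  # fictional entries, not real bug numbers
            QualityBug(1, "printing cuts off very long messages", 5, False, False),
            QualityBug(2, "IMAP folders fail to sync after upgrade", 300, True, True),
        ]
        print(summary_mail(sample))
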
I think that once we've got a list of criteria that both devs and 
contributors to quality agree on, we'll just need more people helping 
in Bugzilla.

Right now there are between 0 and 7 people helping at various levels in 
Bugzilla (some searching for duplicates, some moving bugs to the proper 
component, some asking questions and trying to get more information from 
our users, some closing bugs that we can't do much with because of a 
lack of precise information). While I've been trying over the last few 
years to grow the number of contributors in Bugzilla, I've never managed 
to make it grow. People come, stay and leave, with a few exceptions. So 
if you have ideas on how we could grow the number of people helping in 
Bugzilla, please chime in.


    II Testing

With the time that I'll be allowed to work on the project I should be 
able to continue doing update testing, but I should probably write down 
somewhere exactly what I'm doing so someone else could take over.

For new feature testing I propose to crowdsource it directly when a 
new feature lands:

  * Organize a few days of testing where:
      * the people testing meet on IRC
      * the developer is on IRC too so they can answer quickly
      * bugs are flagged to be easily findable by any developer who
        would want to fix the feature before it reaches mainstream

For regression testing, see below.


      Automated testing

One thing that changed the quality of Thunderbird in the last few years 
was when, in 2010, we started requiring each new patch to come with new 
tests. I can't stress enough how much this has made catching regressions 
easier and faster. I would like us to enforce this policy in the future 
more strictly than it has been in the past.

Also, some areas are under-tested by unit tests; I think it would be 
wise to put some of our engineering effort into adding more tests. I've 
been a total failure at that over the last 3 years. Ideas for getting 
people to spend coding effort on adding more tests are more than 
welcome. We could use jcranmer's great work on code coverage to easily 
figure out where more tests are needed (a small sketch of mining such a 
report follows).

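(As an illustration: if the coverage data ends up in lcov's .info 
format, a few lines of Python are enough to list the least-covered 
files. The file name below is made up, and whether the existing coverage 
reports use exactly this format is an assumption on my part.)

    # Report source files whose line coverage is below a threshold,
    # reading an lcov-style .info file (SF:/LF:/LH: records).
    def low_coverage_files(info_path, threshold=50.0):
        current, found, hit = None, 0, 0
        with open(info_path) as fh:
            for line in fh:
                line = line.strip()
                if line.startswith("SF:"):
                    current, found, hit = line[3:], 0, 0
                elif line.startswith("LF:"):
                    found = int(line[3:])
                elif line.startswith("LH:"):
                    hit = int(line[3:])
                elif line == "end_of_record" and current and found:
                    pct = 100.0 * hit / found
                    if pct < threshold:
                        yield pct, current

    if __name__ == "__main__":
        for pct, path in sorted(low_coverage_files("comm-central.info")):
            print("%5.1f%%  %s" % (pct, path))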

      Informal testing

This is what I call testing done by people using Earlybird, beta and 
Daily builds on a daily basis - people that use our pre-release software 
all the time, not just when a test event comes up. I think it's way too 
fragmented right now. We get good feedback on Daily and sometimes on 
beta. I can't recall that we ever caught anything or got any feedback on 
Earlybird. We currently don't have enough users running pre-release 
versions of Thunderbird: in the past we have caught major regressions on 
the day of the release. This is largely due to all the configuration 
options mailnews offers, plus the zillion configurations of add-ons and 
external software our user base uses.
Mark Banner has the complete numbers (Mark, can you chime in with these 
in the conversation please), but I think we need to grow these numbers 
and maybe scatter them less (e.g. have more Daily users and beta users, 
and push Earlybird less). Ideas on how to achieve this growth, or where 
it should grow, are very welcome.


      More Formal testing

I'd like to do more formal testing: have a week in the beta cycle where 
we gather a big number of volunteers and they run all the testcases we 
have in our testcase management tool. As I said before, I thought it was 
too time consuming to do these, but in a context where I follow Bugzilla 
less closely, the advantages of such event-driven testing stand out:

  * a tracking bug, so bugs are easy to find by devs
  * we get people to test all areas (not just those used by regular
    beta testers)

The issues with this are:

  * Having enough people willing to spend enough time to get 100%
    coverage (with 20 people each participating for a good hour I think
    it's achievable)
  * Some tests are difficult to set up (e.g. you need LDAP, you need a
    proxy)
  * People tend to come to one event; if we hold them too often they
    stop coming
  * How to get enough people to reach 100% coverage every time
  * Having a set of testcases that is up to date

Getting more people for a one-time shot is easy; keeping them is hard.


    III Tools


I don't see any forthcoming issues with the tools, so I'm not really 
worried on that front.


    Conclusion

None yet, chime in. Talk, argue and let's build a plan on how to make 
Thunderbird even better and raise the quality standard for email clients.

Ludo
PS: I probably forgot a few things here; please ask, argue, and let's 
move Quality forward.

-- 
@lhirlimann on twitter
https://wiki.mozilla.org/Thunderbird:Testing

my photos http://www.flickr.com/photos/lhirlimann/collections/
