Firefox update hotfix deployed

Gregory Szorc gps at mozilla.com
Wed Jul 23 01:14:31 UTC 2014


On 7/16/14, 11:27 AM, Gregory Szorc wrote:
> We just deployed a new Firefox hotfix to our entire user base.
>
> Hotfix v20140527.01 (tracked in bug 928173) identifies Firefox installs
> that are "stuck" on old releases and attempts to upgrade them to Firefox
> 30 by downloading a full installer and running it, bypassing the
> built-in updating mechanism. The goal of this hotfix is to reduce the
> long tail of installs running older Firefox releases and get these users
> on a modern, safer, better, faster, stronger (but not harder) Firefox.
>
> The hotfix will be downloaded and installed on every Firefox 10+
> install. However, the upgrading only occurs if a number of conditions
> are met [1]. Essentially, Firefox < 29 on Windows x86 non-partner builds
> on the release channel that have auto updating enabled. So it is stated
> explicitly, we only do the upgrade on clients that should have already
> upgraded automatically: we don't go behind our user's backs and force an
> unwanted update.
>
> If you see an uptick in new Firefox 30 users or an increase of surprised
> users that their Firefox is drastically different, this is likely the
> reason.
>
> If you see something strange, don't hesitate to pick up the phone and
> call Benjamin Smedberg or myself, especially if you think the hotfix
> should be pulled.
>
> [1]
> https://hg.mozilla.org/releases/firefox-hotfixes/file/d29a3d24405c/v20140527.01/resource/update.jsm#l875

I thought people would be interested in learning how effective this 
hotfix has been and what was observed from the user base.

Data Analyzed
=============

The update hotfix submits data to Mozilla upon uninstall and other key 
events.

Data is only submitted if the hotfix was applicable to a particular 
install. For example, if you were running Firefox 30 when the hotfix was 
installed, you didn't upload data.

Data is only submitted if the user has consented to data upload. This 
effectively boils down to FHR or Telemetry being enabled. This means we 
have no direct hotfix data on some users. We account for users upgrading 
from pre-FHR Firefoxes and wait for FHR to kick in before we assume they 
can't submit data and uninstall. This allows us to maximize data return.

Due to a bug in the hotfix or possibly in Gecko (bug 1040231), combined 
with a server that didn't react well to malformed data, *we lost about the 
first 24 hours of data*. This means we don't have direct data for hotfix 
installs that completed during this time. This is very unfortunate and 
adds a massive asterisk to our analysis.

However, if a hotfix install completed during the first 24 hours, that 
likely means either a) the hotfix did its job and the user upgraded, or 
b) the hotfix encountered a fatal error and uninstalled itself. "a" can 
be measured via other data sources. The ratios for "b" should hopefully 
be similar for the non-lost data, so we can extrapolate.

High-Level Overview
===================

Total records encountered: > 4.5M
Success rate: ~47.1%
Clients with hotfix still installed: ~52.7%
Uninstall due to bugs or other weirdness: 0.2%
Clients experiencing download failures: 2.40%
Windows compatibility mode detected: 0.70%

Install Attempts
================

Here is a breakdown of the number of install attempts per client:

0    4.96%
1   92.77%
2    2.12%
3    0.12%

Here is a breakdown of failure counts:

0    52.47%
1    46.68%
2     0.75%
3     0.08%
4     0.01%
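
The share of clients with at least one failure can be read off the table 
above. A minimal sketch of that arithmetic, assuming the table is a 
per-client distribution (the names below are illustrative, not from the 
hotfix):

```python
# Failure-count distribution from the table above: failures -> % of clients.
failure_counts = {0: 52.47, 1: 46.68, 2: 0.75, 3: 0.08, 4: 0.01}


def share_with_failures(dist):
    """Percentage of clients that recorded one or more failed attempts."""
    return sum(pct for count, pct in dist.items() if count >= 1)


at_least_one = share_with_failures(failure_counts)
print(f"{at_least_one:.2f}% of clients had at least one failure")
```

Because the published percentages are rounded, the result can differ 
from 100% minus the zero-failure row by a hundredth of a point.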

So, ~47.5% of clients experienced at least one failed install attempt. 
Here is a breakdown of the exit codes for the overall install attempts:

Success              45.04%
Elevation Cancelled  40.76%
Accessing Log Failed  4.55%
Unarchiving Failed    0.28%
Elevation Failed      0.23%
Installation Failed   0.08%
Other                 0.08%

"Elevation Cancelled" means that we showed a UAC prompt to the user and 
the user did not grant privileges to allow the install to continue. This 
is treated as a transient failure by the hotfix.

"Elevation Failed" means the user attempted to grant privileges via UAC 
but couldn't. This could mean the user needs an admin password (which 
they do not know or entered incorrectly).

"Accessing log failed" is tracked by bug 1040234 and appears to be a 
legitimate issue with the hotfix, affecting mostly Windows XP users.

Upgraded From
=============

Clients that successfully upgraded were upgraded from the following 
Firefox versions:

10	 0.41%
11	 0.36%
12	 2.01%
13	 0.48%
14	 0.47%
15	 0.74%
16	 0.85%
17	 0.39%
18	 0.36%
19	 0.44%
20	 2.13%
21	 3.75%
22	 3.89%
23	 4.70%
24	 6.29%
25	 7.78%
26	12.47%
27	16.86%
28	30.71%
Unknown	 4.91%

Having more clients come from more modern versions is expected, as 
Firefox users are on a "long tail" of versions.

We somehow had 127 clients upgrade from the Aurora or Nightly channel 
(Firefox versions with "a" in the version string). We didn't explicitly 
check for this in the "is hotfix applicable" check. Instead, we checked 
that channel == "release". I suspect these are instances of profile 
copying.
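
To illustrate the kind of guard that was missing, here is a rough sketch 
of an applicability predicate with an explicit pre-release version check 
added. This is not the hotfix's actual code (that lives in update.jsm): 
the Install type, its field names, and the "windows"/"x86" values are 
all hypothetical, and only the conditions themselves come from the 
announcement quoted above.

```python
import re
from dataclasses import dataclass


@dataclass
class Install:
    # All field names and values are hypothetical, for illustration only.
    version: str         # e.g. "28.0" or "31.0a2"
    channel: str         # update channel, e.g. "release"
    os: str
    arch: str
    partner_build: bool
    auto_update: bool


def major_version(version):
    """Leading integer of a version string ("28.0.1" -> 28)."""
    match = re.match(r"\d+", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return int(match.group())


def is_prerelease(version):
    """Aurora and Nightly version strings contain an "a" (e.g. "31.0a2")."""
    return "a" in version


def is_applicable(install):
    return (
        install.channel == "release"
        and not is_prerelease(install.version)  # the extra guard
        and 10 <= major_version(install.version) < 29
        and install.os == "windows"
        and install.arch == "x86"
        and not install.partner_build
        and install.auto_update
    )
```

With a check like this, a copied profile that reports channel "release" 
but runs a "31.0a2" build would be rejected by the version test alone.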

Notification
============

If an installation fails in a transient manner, the hotfix will show a 
notification on startup until Firefox is upgraded or a "real" error is 
encountered.

Only 5.19% of reporting clients have seen the notification. This value 
seems extremely weird to me. We have the hotfix still installed on over 
50% of the reporting clients. If we had daily users who were restarting 
the browser daily, I'd expect clients seeing the notification to be 
similar to those that still have the hotfix installed. *This needs 
further investigation.*

Profile Sharing
===============

0.12% of clients had the hotfix uninstalled because it was "no longer 
applicable." This means the install met the hotfix's conditions when the 
hotfix was installed, but those conditions later changed, causing the 
hotfix to uninstall itself.

The overwhelming majority of these clients appeared to switch to a 
different release channel. The beta channel accounted for 0.03%. The 
"none" channel accounted for 0.08%. This means that app.update.channel 
is throwing or is returning "none." I'm scratching my head on this one.

It's worth noting that if I turn off record deduplication, 0.16% of 
records are reporting "no longer applicable." I suspect a small fraction 
of our users are copying profiles and running them with a different 
Firefox install. (We've seen FHR records that better identify what's 
going on.)

Download Speed
==============

I performed a crude analysis of the average download speed of each 
client. Here is a histogram of how things broke down:

Bps	%	cumulative %
0	1.21	 1.21
10000	2.58	 3.79
20000	2.77	 6.56
30000	2.68	 9.24
40000	2.83	12.07
50000	2.69	14.75
60000	2.07	16.82
70000	1.90	18.72
80000	1.86	20.58
90000	1.87	22.45
100000	4.57	27.02
125000	2.95	29.97
150000	2.78	32.75
175000	2.74	35.49
200000	5.33	40.82
250000	3.39	44.21
300000	3.16	47.37
350000	2.93	50.30
400000	2.46	52.76
450000	2.30	55.07
500000	3.90	58.96
600000	3.70	62.66
700000	2.90	65.56
800000	2.38	67.94
900000	1.98	69.92
1000000	8.03	77.95
1500000	4.91	82.87
2000000	5.39	88.26
3000000	2.75	91.00
4000000	1.66	92.66
5000000	7.33	100.00

"B" in these values is bytes, not bits.

It's worth noting that 22.45% of hotfix users are downloading at 
<100KBps and 50% are below 400KBps.
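
The cumulative column and the two figures above can be reproduced from 
the per-bucket percentages. A small sketch, assuming each Bps label is 
the lower edge of its bucket (which matches the 22.45% figure); the 
function names are mine:

```python
from itertools import accumulate

# (bucket lower bound in bytes/second, % of clients), copied from the table.
# Assumption: each label is the lower edge of its bucket.
BUCKETS = [
    (0, 1.21), (10000, 2.58), (20000, 2.77), (30000, 2.68), (40000, 2.83),
    (50000, 2.69), (60000, 2.07), (70000, 1.90), (80000, 1.86), (90000, 1.87),
    (100000, 4.57), (125000, 2.95), (150000, 2.78), (175000, 2.74),
    (200000, 5.33), (250000, 3.39), (300000, 3.16), (350000, 2.93),
    (400000, 2.46), (450000, 2.30), (500000, 3.90), (600000, 3.70),
    (700000, 2.90), (800000, 2.38), (900000, 1.98), (1000000, 8.03),
    (1500000, 4.91), (2000000, 5.39), (3000000, 2.75), (4000000, 1.66),
    (5000000, 7.33),
]


def cumulative(buckets):
    """Pair each bucket's lower bound with the running total of percentages."""
    totals = accumulate(pct for _, pct in buckets)
    return [(lower, total) for (lower, _), total in zip(buckets, totals)]


def share_below(buckets, limit_bps):
    """Percentage of clients in buckets that start below limit_bps.

    Only meaningful when limit_bps is itself a bucket boundary.
    """
    return sum(pct for lower, pct in buckets if lower < limit_bps)


below_100k = share_below(BUCKETS, 100_000)
below_400k = share_below(BUCKETS, 400_000)
print(f"<100KBps: {below_100k:.2f}%  <400KBps: {below_400k:.2f}%")
```

Since the inputs are already rounded to two decimals, the recomputed 
totals can differ from the table's cumulative column by about 0.01.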

I empathize with the 1.21% of users in the 0-10KBps bucket: we made them 
download a ~28M installer which would take at least 48 minutes. Users in 
the other slow buckets don't fare much better. I haven't 
cross-referenced these aggregate results to see if our users on slower 
connections had the patience to sit through the full download.
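
For reference, the download-time estimate is simple division. A sketch, 
assuming "~28M" means 28 MiB and that the client sustains the given rate 
for the whole download:

```python
# Assumption: "~28M" is read as 28 MiB; the real installer size may differ.
INSTALLER_BYTES = 28 * 1024 * 1024


def download_minutes(size_bytes, bytes_per_second):
    """Minutes needed to fetch size_bytes at a constant bytes_per_second."""
    if bytes_per_second <= 0:
        raise ValueError("rate must be positive")
    return size_bytes / bytes_per_second / 60


# 10,000 Bps is the top of the slowest bucket, so this is its best case.
slowest_bucket_best_case = download_minutes(INSTALLER_BYTES, 10_000)
at_100k = download_minutes(INSTALLER_BYTES, 100_000)
print(f"{slowest_bucket_best_case:.1f} min at 10KBps, {at_100k:.1f} min at 100KBps")
```

Anyone averaging below 10KBps would need proportionally longer than the 
best-case figure.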

The hotfix did not limit the Necko channel in any way (unlike the 
built-in update mechanism, which throttles).

Download Failures
=================

As reported above, 2.40% of reporting clients experienced at least 1 
download failure.

A download failure occurs when the hotfix thinks it has successfully 
downloaded an installer but the file size or SHA512 verification fails.
This could mean a) download/Necko code is buggy, b) HTTP servers are 
lying or misbehaving (possibly a captive portal), or c) evil parties are 
rewriting Firefox installers. This will need additional analysis.
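
The size-plus-hash check described above looks roughly like the 
following. This is a generic sketch and not the hotfix's implementation; 
verify_download and its parameters are hypothetical names, and the 
"installer" is a stand-in byte string.

```python
import hashlib
import io

CHUNK_SIZE = 64 * 1024


def verify_download(stream, expected_size, expected_sha512_hex):
    """Return True only if both the byte count and SHA-512 digest match."""
    digest = hashlib.sha512()
    size = 0
    # Read in chunks so a large installer never has to fit in memory.
    for chunk in iter(lambda: stream.read(CHUNK_SIZE), b""):
        size += len(chunk)
        digest.update(chunk)
    if size != expected_size:
        return False
    return digest.hexdigest() == expected_sha512_hex.lower()


# Stand-in data: a real caller would open the downloaded file in "rb" mode.
payload = b"pretend installer bytes"
good_hash = hashlib.sha512(payload).hexdigest()

ok = verify_download(io.BytesIO(payload), len(payload), good_hash)
truncated = verify_download(io.BytesIO(payload[:-1]), len(payload), good_hash)
print(ok, truncated)
```

A truncated body fails the size test, while a rewritten body of the same 
length fails the digest test, so a check like this cannot by itself tell 
causes a), b) and c) apart.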

Summary
=======

The hotfix has so far upgraded a few million clients to Firefox 30!

I'm concerned about the few million clients that still have the hotfix 
installed, especially the ones that apparently haven't seen a 
notification. I'll need to deep dive into payloads to get a feel for 
what's happening.

Remember, we lost data for clients that completed in the first ~24 hours.

Let me know if you have any questions or data requests.

Gregory


