Firefox update hotfix deployed

Gregory Szorc gps at mozilla.com
Fri Aug 8 19:46:48 UTC 2014


On 7/22/14 6:14 PM, Gregory Szorc wrote:
> On 7/16/14, 11:27 AM, Gregory Szorc wrote:
>> We just deployed a new Firefox hotfix to our entire user base.
>>
>> Hotfix v20140527.01 (tracked in bug 928173) identifies Firefox installs
>> that are "stuck" on old releases and attempts to upgrade them to Firefox
>> 30 by downloading a full installer and running it, bypassing the
>> built-in updating mechanism. The goal of this hotfix is to reduce the
>> long tail of installs running older Firefox releases and get these users
>> on a modern, safer, better, faster, stronger (but not harder) Firefox.
>>
>> The hotfix will be downloaded and installed on every Firefox 10+
>> install. However, the upgrading only occurs if a number of conditions
>> are met [1]. Essentially, Firefox < 29 on Windows x86 non-partner builds
>> on the release channel that have auto updating enabled. To state it
>> explicitly: we only do the upgrade on clients that should have already
>> upgraded automatically. We don't go behind our users' backs and force an
>> unwanted update.
>>
>> If you see an uptick in new Firefox 30 users or an increase in users
>> surprised that their Firefox is drastically different, this is likely
>> the reason.
>>
>> If you see something strange, don't hesitate to pick up the phone and
>> call Benjamin Smedberg or myself, especially if you think the hotfix
>> should be pulled.
>>
>> [1]
>> https://hg.mozilla.org/releases/firefox-hotfixes/file/d29a3d24405c/v20140527.01/resource/update.jsm#l875
>>
>
> I thought people would be interested in learning how effective this hotfix
> has been and what was observed from the user base.
>
> Data Analyzed
> =============
>
> The update hotfix submits data to Mozilla upon uninstall and other key
> events.
>
> Data is only submitted if the hotfix was compatible with a particular
> install. e.g. if you were running Firefox 30 when the hotfix was
> installed, you didn't upload data.
>
> Data is only submitted if the user has consented to data upload. This
> effectively boils down to FHR or Telemetry being enabled. This means we
> have no direct hotfix data on some users. We account for users upgrading
> from pre-FHR Firefoxes and wait for FHR to kick in before we assume they
> can't submit data and uninstall. This allows us to maximize data return.
>
> Due to a bug in the hotfix (possibly Gecko) (bug 1040231) combined with
> a server that didn't react well to malformed data, *we lost about the
> first 24 hours of data*. This means we don't have direct data for hotfix
> installs that completed during this time. This is very unfortunate and
> adds a massive asterisk to our analysis.
>
> However, if a hotfix install completed during the first 24 hours, that
> likely means a) the hotfix did its job and the user upgraded or b) the
> hotfix encountered a fatal error and uninstalled itself. "a" can be
> measured via other data sources. The ratios for "b" should hopefully be
> similar for the non-lost data, so we can extrapolate.
>
> High-Level Overview
> ===================
>
> Total records encountered: > 4.5M
> Success rate: ~47.1%
> Clients with hotfix still installed: ~52.7%
> Uninstall due to bugs or other weirdness: 0.2%
> Clients experiencing download failures: 2.40%
> Windows compatibility mode detected: 0.70%
>
> Install Attempts
> ================
>
> Here is a breakdown of the number of install attempts per client:
>
> 0    4.96%
> 1   92.77%
> 2    2.12%
> 3    0.12%
>
> Here is a breakdown of failure counts:
>
> 0    52.47%
> 1    46.68%
> 2     0.75%
> 3     0.08%
> 4     0.01%
>
> So, ~47.5% of our install attempts result in failure. Here is a
> breakdown of the exit codes for the overall install attempts:
>
> Success              45.04%
> Elevation Cancelled  40.76%
> Accessing Log Failed  4.55%
> Unarchiving Failed    0.28%
> Elevation Failed      0.23%
> Installation Failed   0.08%
> Other                 0.08%
>
> "Elevation Cancelled" means that we showed a UAC prompt to the user and
> the user did not grant privileges to allow the install to continue. This
> is treated as a transient failure by the hotfix.
>
> "Elevation Failed" means the user attempted to grant privileges via UAC
> but couldn't. This could mean the user needs an admin password (which
> they do not know or entered incorrectly).
>
> "Accessing log failed" is tracked by bug 1040234 and appears to be a
> legitimate issue with the hotfix, affecting mostly Windows XP users.
>
> Upgraded From
> =============
>
> Clients that successfully upgraded were upgraded from the following
> Firefox versions:
>
> 10     0.41%
> 11     0.36%
> 12     2.01%
> 13     0.48%
> 14     0.47%
> 15     0.74%
> 16     0.85%
> 17     0.39%
> 18     0.36%
> 19     0.44%
> 20     2.13%
> 21     3.75%
> 22     3.89%
> 23     4.70%
> 24     6.29%
> 25     7.78%
> 26    12.47%
> 27    16.86%
> 28    30.71%
> Unknown     4.91%
>
> Having more clients come from more modern versions is expected, as
> Firefox users are on a "long tail" of versions.
>
> We somehow had 127 clients upgrade from the Aurora or Nightly channel
> (Firefox versions with "a" in the version string). We didn't explicitly
> check for this in the "is hotfix applicable" check. Instead, we checked
> that channel == "release". I suspect these are instances of profile
> copying.
>
> Notification
> ============
>
> If an installation fails in a transient manner, the hotfix will show a
> notification on startup until Firefox is upgraded or a "real" error is
> encountered.
>
> Only 5.19% of reporting clients have seen the notification. This value
> seems extremely weird to me. We have the hotfix still installed on over
> 50% of the reporting clients. If we had daily users who were restarting
> the browser daily, I'd expect clients seeing the notification to be
> similar to those that still have the hotfix installed. *This needs
> further investigation.*
>
> Profile Sharing
> ===============
>
> 0.12% of clients had the hotfix uninstalled because it was "no longer
> applicable." This means the hotfix, when installed, met the conditions
> of the hotfix. But those conditions changed making the hotfix no longer
> applicable, causing the hotfix to uninstall.
>
> The overwhelming majority of these clients appeared to switch to a
> different release channel. The beta channel accounted for 0.03%. The
> "none" channel accounted for 0.08%. This means that app.update.channel
> is throwing or is returning "none." I'm scratching my head on this one.
>
> It's worth noting that if I turn off record deduplication, 0.16% of
> records are reporting "no longer applicable." I suspect a small fraction
> of our users are copying profiles and running them with a different
> Firefox install. (We've seen FHR records that better identify what's
> going on.)
>
> Download Speed
> ==============
>
> I performed a crude analysis of the average download speed of each
> client. Here is a histogram of how things broke down:
>
> Bps    %    cumulative %
> 0    1.21     1.21
> 10000    2.58     3.79
> 20000    2.77     6.56
> 30000    2.68     9.24
> 40000    2.83    12.07
> 50000    2.69    14.75
> 60000    2.07    16.82
> 70000    1.90    18.72
> 80000    1.86    20.58
> 90000    1.87    22.45
> 100000    4.57    27.02
> 125000    2.95    29.97
> 150000    2.78    32.75
> 175000    2.74    35.49
> 200000    5.33    40.82
> 250000    3.39    44.21
> 300000    3.16    47.37
> 350000    2.93    50.30
> 400000    2.46    52.76
> 450000    2.30    55.07
> 500000    3.90    58.96
> 600000    3.70    62.66
> 700000    2.90    65.56
> 800000    2.38    67.94
> 900000    1.98    69.92
> 1000000    8.03    77.95
> 1500000    4.91    82.87
> 2000000    5.39    88.26
> 3000000    2.75    91.00
> 4000000    1.66    92.66
> 5000000    7.33    100.00
>
> "B" in these values is bytes, not bits.
>
> It's worth noting that 22.45% of hotfix users are downloading at
> <100KBps and 50% are below 400KBps.
>
> I empathize with the 1.21% of users in the 0-10KBps bucket: we made them
> download a ~28M installer which would take at least 48 minutes. Users in
> the other slow buckets don't fare much better. I haven't
> cross-referenced these aggregate results to see if our users on slower
> connections had the patience to sit through the full download.
>
> The hotfix did not limit the Necko channel in any way (unlike the
> built-in update mechanism, which throttles).
>
> Download Failures
> =================
>
> As reported above, 2.40% of reporting clients experienced at least 1
> download failure.
>
> A download failure occurs when the hotfix thinks it has successfully
> downloaded an installer but the file size or SHA512 verification fails.
> This could mean a) download/Necko code is buggy, b) HTTP servers are
> lying or misbehaving (possibly captive portal), or c) evil parties are
> rewriting Firefox installers. This will need additional analysis.
>
> Summary
> =======
>
> The hotfix has so far upgraded a few million clients to Firefox 30!
>
> I'm concerned about the few million clients that still have the hotfix
> installed, especially the ones that apparently haven't seen a
> notification. I'll need to deep dive into payloads to get a feel for
> what's happening.
>
> Remember, we lost data for clients that completed in the first ~24 hours.
>
> Let me know if you have any questions or data requests.

I pulled the data again today and have more data to report.

We now have over 15M hotfix installs reporting. That's 15 million people
who are running (or were running) a slower, buggier, and more vulnerable
Firefox. About 7M have not yet been upgraded by the hotfix :/

The counts below are histograms. A value is recorded in a bucket if it
is <= the reported bucket value.
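
The bucketing and the three percentage columns can be sketched as
follows. This is only an illustration of the table layout, not the
actual stats job: the function name and the sample values are
hypothetical, and it assumes each value lands in the first bucket whose
bound it does not exceed, with "inf" catching everything else.

```python
from bisect import bisect_left
from math import inf


def cumulative_histogram(values, bounds):
    """Return rows of (bound, % here, % cumulative, % left).

    Hypothetical helper: each value is counted in the first bucket whose
    upper bound is >= the value; an extra "inf" bucket catches the rest.
    """
    total = len(values)
    if total == 0:
        return []
    edges = sorted(bounds) + [inf]
    counts = [0] * len(edges)
    for value in values:
        # bisect_left returns the index of the first edge >= value.
        counts[bisect_left(edges, value)] += 1
    rows = []
    running = 0
    for edge, count in zip(edges, counts):
        running += count
        here = 100.0 * count / total
        cumulative = 100.0 * running / total
        rows.append((edge, here, cumulative, 100.0 - cumulative))
    return rows


# Made-up per-client attempt counts, purely to show the output shape.
for bound, here, cumulative, left in cumulative_histogram(
    [0, 1, 1, 2, 7], [0, 1, 2, 5]
):
    print(f"{bound!s:>5} {here:6.2f}% {cumulative:7.2f}% {left:7.2f}%")
```

"% left" is simply 100 minus "% cumulative", i.e. the share of clients
whose value is above the bucket.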

What             Count  % here    % cumulative  % left

downloadAttempts     0  1.65%       1.65%      98.35%
downloadAttempts     1 79.88%      81.53%      18.47%
downloadAttempts     2  7.06%      88.59%      11.41%
downloadAttempts     3  2.93%      91.52%       8.48%
downloadAttempts     4  1.65%      93.17%       6.83%
downloadAttempts     5  1.12%      94.29%       5.71%
downloadAttempts    10  2.79%      97.08%       2.92%
downloadAttempts    15  1.06%      98.14%       1.86%
downloadAttempts    20  0.52%      98.66%       1.34%
downloadAttempts    30  0.54%      99.21%       0.79%
downloadAttempts    40  0.26%      99.47%       0.53%
downloadAttempts    50  0.15%      99.62%       0.38%
downloadAttempts   100  0.27%      99.89%       0.11%
downloadAttempts   inf  0.11%     100.00%       0.00%
downloadComplete     0  4.93%       4.93%      95.07%
downloadComplete     1 92.33%      97.26%       2.74%
downloadComplete     2  1.59%      98.84%       1.16%
downloadComplete     5  0.53%      99.37%       0.63%
downloadComplete    10  0.60%      99.98%       0.02%
downloadComplete    20  0.02%     100.00%       0.00%
downloadComplete   inf  0.00%     100.00%       0.00%
downloadFailures     0 97.57%      97.57%       2.43%
downloadFailures     1  1.31%      98.88%       1.12%
downloadFailures     2  0.29%      99.17%       0.83%
downloadFailures     5  0.25%      99.42%       0.58%
downloadFailures    10  0.57%      99.99%       0.01%
downloadFailures   inf  0.01%     100.00%       0.00%
downloadSuccess      0  5.64%       5.64%      94.36%
downloadSuccess      1 94.19%      99.83%       0.17%
downloadSuccess      2  0.15%      99.98%       0.02%
downloadSuccess    inf  0.02%     100.00%       0.00%

98.35% of our hotfix users attempted to download an installer. 94.36% 
(of the total population) were successful. The hotfix has been deployed 
for a few weeks, so the fact that a few percent of our users couldn't 
get an installer downloaded (and validated) is a bit worrying.

Bug 1042461 could account for some of that. I also noticed some weird 
CDN behavior and filed bug 1042481 to investigate that.

(downloadComplete means the hotfix thinks it finished a download and 
downloadSuccess means the download passed integrity checks.)

The hotfix supports download resume:

downloadResumes     0 86.62%      86.62%      13.38%
downloadResumes     1  6.50%      93.11%       6.89%
downloadResumes     2  2.35%      95.46%       4.54%
downloadResumes     3  1.21%      96.67%       3.33%
downloadResumes     4  0.74%      97.41%       2.59%
downloadResumes     5  0.49%      97.90%       2.10%
downloadResumes    10  1.11%      99.01%       0.99%
downloadResumes    15  0.39%      99.39%       0.61%
downloadResumes    20  0.20%      99.59%       0.41%
downloadResumes    30  0.18%      99.77%       0.23%
downloadResumes    40  0.08%      99.85%       0.15%
downloadResumes    50  0.04%      99.90%       0.10%
downloadResumes   100  0.08%      99.97%       0.03%
downloadResumes   inf  0.03%     100.00%       0.00%

The majority of users (86.62%) appeared to download the installer within
one process lifetime. The vast majority were able to completely download
the installer with only a handful of resume attempts.

What about the speed users were able to download at, in bytes per second?

downloadMean     10000  1.18%       1.18%      98.82%
downloadMean     25000  3.82%       5.00%      95.00%
downloadMean     50000  6.86%      11.85%      88.15%
downloadMean     75000  5.66%      17.51%      82.49%
downloadMean    100000  4.66%      22.17%      77.83%
downloadMean    150000  7.41%      29.58%      70.42%
downloadMean    200000  5.49%      35.07%      64.93%
downloadMean    250000  5.21%      40.27%      59.73%
downloadMean    300000  3.39%      43.66%      56.34%
downloadMean    350000  3.13%      46.79%      53.21%
downloadMean    400000  2.84%      49.63%      50.37%
downloadMean    450000  2.36%      51.99%      48.01%
downloadMean    500000  2.20%      54.18%      45.82%
downloadMean    600000  3.70%      57.88%      42.12%
downloadMean    700000  3.50%      61.38%      38.62%
downloadMean    800000  2.73%      64.11%      35.89%
downloadMean    900000  2.25%      66.36%      33.64%
downloadMean   1000000  1.85%      68.20%      31.80%
downloadMean   1250000  4.27%      72.47%      27.53%
downloadMean   1500000  3.18%      75.65%      24.35%
downloadMean   1750000  2.35%      77.99%      22.01%
downloadMean   2000000  2.19%      80.19%      19.81%
downloadMean   3000000  4.95%      85.13%      14.87%
downloadMean   4000000  2.49%      87.63%      12.37%
downloadMean   5000000  1.48%      89.11%      10.89%
downloadMean       inf  6.18%      95.29%       4.71%

There is a non-trivial possibility that the code for calculating 
download speed is not accurate and skews data towards higher numbers.
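
For a sense of scale, a bucket's upper bound gives a lower bound on how
long a client in that bucket needs to fetch the installer. The sketch
below assumes the "~28M" installer size from the earlier report means
28 MiB (the exact size is not given here) and takes the reported speeds
at face value. The helper name is hypothetical; the cumulative
percentages are copied from the table above.

```python
# Assumption: "~28M" is read as 28 MiB; the real installer size may differ.
INSTALLER_BYTES = 28 * 1024 * 1024


def min_download_minutes(bucket_bps, size_bytes=INSTALLER_BYTES):
    """Lower bound, in minutes, for a client at or below bucket_bps."""
    return size_bytes / bucket_bps / 60.0


# (bucket upper bound in bytes/s, cumulative % of clients at or below it)
CUMULATIVE = [
    (10000, 1.18),
    (50000, 11.85),
    (100000, 22.17),
    (400000, 49.63),
    (1000000, 68.20),
]

for bps, cumulative in CUMULATIVE:
    minutes = min_download_minutes(bps)
    print(f"{cumulative:5.2f}% of clients needed at least {minutes:5.1f} min")
```

Under these assumptions, the slowest ~1% of clients needed more than 48
minutes and roughly half needed more than a minute.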

I may update my stats job to break this out by locale to get a rough 
approximation of client-seen download speeds by geo.

What happened when users attempted an install?

installAttempts    0  5.08%       5.08%      94.92%
installAttempts    1 91.77%      96.85%       3.15%
installAttempts    2  2.83%      99.68%       0.32%
installAttempts    3  0.24%      99.92%       0.08%
installAttempts    4  0.05%      99.97%       0.03%
installAttempts    5  0.01%      99.98%       0.02%
installAttempts   10  0.01%     100.00%       0.00%
installAttempts  inf  0.00%     100.00%       0.00%
installFailures    0 58.46%      58.46%      41.54%
installFailures    1 40.53%      99.00%       1.00%
installFailures    2  0.82%      99.82%       0.18%
installFailures    3  0.13%      99.94%       0.06%
installFailures    4  0.03%      99.98%       0.02%
installFailures    5  0.01%      99.99%       0.01%
installFailures  inf  0.01%     100.00%       0.00%

41.54% have experienced a failure running the installer. The majority of 
these are due to UAC cancellation or failure. (The hotfix needs to 
request admin privileges via UAC because it's initiated from Firefox 
itself.)

If we encounter a "soft" failure such as UAC elevation problems, we fall 
back to showing a notification every day and asking the user to upgrade. 
~40% of the hotfix-compatible users appear to be in this mode.

Here is how users are interacting with the notification:

notificationsClicked    0 96.81%      96.81%       3.19%
notificationsClicked    1  2.87%      99.68%       0.32%
notificationsClicked    2  0.24%      99.92%       0.08%
notificationsClicked    3  0.05%      99.97%       0.03%
notificationsClicked    4  0.01%      99.98%       0.02%
notificationsClicked    5  0.01%      99.99%       0.01%
notificationsClicked    6  0.00%      99.99%       0.01%
notificationsClicked    7  0.00%     100.00%       0.00%
notificationsRemoved     0 93.08%      93.08%       6.92%
notificationsRemoved     1  3.50%      96.59%       3.41%
notificationsRemoved     2  1.40%      97.98%       2.02%
notificationsRemoved     3  0.68%      98.66%       1.34%
notificationsRemoved     4  0.40%      99.06%       0.94%
notificationsRemoved     5  0.26%      99.32%       0.68%
notificationsRemoved     6  0.18%      99.50%       0.50%
notificationsRemoved     7  0.13%      99.63%       0.37%
notificationsRemoved     8  0.10%      99.72%       0.28%
notificationsRemoved     9  0.07%      99.80%       0.20%
notificationsRemoved    10  0.05%      99.85%       0.15%
notificationsRemoved    15  0.12%      99.97%       0.03%
notificationsRemoved    20  0.02%      99.99%       0.01%
notificationsRemoved    25  0.00%     100.00%       0.00%
notificationsShown     0 91.84%      91.84%       8.16%
notificationsShown     1  3.43%      95.28%       4.72%
notificationsShown     2  1.69%      96.97%       3.03%
notificationsShown     3  0.89%      97.86%       2.14%
notificationsShown     4  0.56%      98.42%       1.58%
notificationsShown     5  0.37%      98.79%       1.21%
notificationsShown     6  0.28%      99.07%       0.93%
notificationsShown     7  0.21%      99.28%       0.72%
notificationsShown     8  0.17%      99.45%       0.55%
notificationsShown     9  0.13%      99.57%       0.43%
notificationsShown    10  0.10%      99.67%       0.33%
notificationsShown    15  0.25%      99.93%       0.07%
notificationsShown    20  0.06%      99.99%       0.01%
notificationsShown    25  0.01%     100.00%       0.00%

Due to bug 1049727, we don't collect data from users who don't attempt
an install through the notification. We are assuming these users are
seeing the notification and not acting on it (because if they acted on
it, they would attempt an install and we would see data afterwards).

What's troubling to me is that we have ~40% of users that should be in 
the notification mode. 8.16% are reporting that they've seen a 
notification and 3.19% are attempting an install by clicking the 
notification. We can infer from the absence of data that over 90% of 
users that should be in notification mode aren't clicking through. This 
could be due to bugs displaying the notification. But user apathy 
towards non-sticky notifications is a known behavior, so 90% ignore rate 
seems within the realm of possibility to me.
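
The arithmetic behind that estimate, as a rough sketch. The ~40% figure
is the approximate share quoted above, the other two numbers come from
the "% left" column of the bucket-0 rows, and the variable and function
names are only illustrative. Because of bug 1049727 the reported
numbers are a lower bound on what users actually saw.

```python
def share_of(part_pct, whole_pct):
    """Express part_pct as a percentage of whole_pct."""
    return 100.0 * part_pct / whole_pct


IN_NOTIFICATION_MODE = 40.0  # "~40%" from the text; approximate
SHOWN = 8.16    # clients reporting notificationsShown > 0
CLICKED = 3.19  # clients reporting notificationsClicked > 0

shown_share = share_of(SHOWN, IN_NOTIFICATION_MODE)
clicked_share = share_of(CLICKED, IN_NOTIFICATION_MODE)
ignore_rate = 100.0 - clicked_share
click_through_when_shown = share_of(CLICKED, SHOWN)

print(f"reported seeing it:    {shown_share:.1f}% of notification mode")
print(f"clicked through:       {clicked_share:.1f}% of notification mode")
print(f"did not click through: {ignore_rate:.1f}% of notification mode")
print(f"clicked, given shown:  {click_through_when_shown:.1f}%")
```

With these inputs about 8% of the clients expected to be in notification
mode clicked through, which is where the "over 90%" comes from. Among
clients that did report seeing a notification, the click rate is much
higher, which is consistent with the reporting gap hiding most of the
non-clickers.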

Overall, the hotfix success rate judged from uninstall reason is as follows:

uninstallReason   ADDON_MANAGER_UNINSTALL  0.00%       0.00%     100.00%
uninstallReason        LAUNCHER_RUN_ERROR  0.01%       0.01%      99.99%
uninstallReason      LAUNCHER_START_ERROR  0.01%       0.02%      99.98%
uninstallReason     MAX_DOWNLOAD_FAILURES  0.00%       0.02%      99.98%
uninstallReason            NOT_APPLICABLE  0.00%       0.02%      99.98%
uninstallReason                 NO_LOCALE  0.01%       0.04%      99.96%
uninstallReason      NO_LONGER_APPLICABLE  0.14%       0.18%      99.82%
uninstallReason                      None 46.25%      46.43%      53.57%
uninstallReason        SUCCESSFUL_UPGRADE 53.57%     100.00%       0.00%

So, 53.57% of installs were successful. The hotfix is still installed on 
46.25% of the user base. Part of this 46.25% is in a post-upgrade wait 
state. But that amount should be low.

Of the clients that attempted an install, here are the versions they
were on at the time of upgrade.

upgradedFrom        10  0.41%       0.41%      99.59%
upgradedFrom        11  0.36%       0.78%      99.22%
upgradedFrom        12  2.04%       2.81%      97.19%
upgradedFrom        13  0.48%       3.29%      96.71%
upgradedFrom        14  0.49%       3.78%      96.22%
upgradedFrom        15  0.73%       4.52%      95.48%
upgradedFrom        16  0.85%       5.37%      94.63%
upgradedFrom        17  0.50%       5.87%      94.13%
upgradedFrom        18  0.47%       6.34%      93.66%
upgradedFrom        19  0.58%       6.91%      93.09%
upgradedFrom        20  2.43%       9.35%      90.65%
upgradedFrom        21  3.49%      12.84%      87.16%
upgradedFrom        22  3.64%      16.48%      83.52%
upgradedFrom        23  4.49%      20.97%      79.03%
upgradedFrom        24  6.01%      26.98%      73.02%
upgradedFrom        25  7.78%      34.75%      65.25%
upgradedFrom        26 12.93%      47.68%      52.32%
upgradedFrom        27 17.16%      64.84%      35.16%
upgradedFrom        28 30.12%      94.97%       5.03%
upgradedFrom        29  0.00%      94.97%       5.03%
upgradedFrom      None  5.03%     100.00%       0.00%

It's no surprise that more users are on modern versions. The spike in 
Firefox 12 is interesting. We do have more users stuck on Firefox 12 for 
some reason.

And in case you were curious about the breakdown of Windows versions for
the hotfix-compatible users:

windowsVersion            5.1.0  0.00%       0.04%      99.96%
windowsVersion            5.1.1  0.00%       0.04%      99.96%
windowsVersion            5.1.2  6.64%       6.68%      93.32%
windowsVersion            5.1.3 23.42%      30.09%      69.91%
windowsVersion            6.0.0  0.46%      30.56%      69.44%
windowsVersion            6.0.1  0.80%      31.37%      68.63%
windowsVersion            6.0.2  2.51%      33.88%      66.12%
windowsVersion            6.1.0 14.47%      48.35%      51.65%
windowsVersion            6.1.1 42.79%      91.14%       8.86%
windowsVersion            6.1.3  0.01%      91.16%       8.84%
windowsVersion            6.2.0  6.03%      97.19%       2.81%
windowsVersion            6.3.0  2.81%     100.00%       0.00%

For reference:

5.1 XP
5.2 XP 64, Server 2003
6.0 Vista, Server 2008
6.1 7, Server 2008 R2
6.2 8, Server 2012
6.3 8.1, Server 2012 R2

I'm not sure how this compares to the version breakdown for the overall 
Firefox user base.
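
To read the table by Windows release rather than by full version
string, the rows can be grouped on the first two components. This is a
sketch: the names come from the reference list above, the percentages
are the "% here" column, and what the third component means is not
stated here, so it is simply dropped. The function name is hypothetical.

```python
from collections import defaultdict

WINDOWS_NAMES = {
    "5.1": "XP",
    "5.2": "XP 64, Server 2003",
    "6.0": "Vista, Server 2008",
    "6.1": "7, Server 2008 R2",
    "6.2": "8, Server 2012",
    "6.3": "8.1, Server 2012 R2",
}

# (windowsVersion, % here) copied from the table above.
ROWS = [
    ("5.1.0", 0.00), ("5.1.1", 0.00), ("5.1.2", 6.64), ("5.1.3", 23.42),
    ("6.0.0", 0.46), ("6.0.1", 0.80), ("6.0.2", 2.51),
    ("6.1.0", 14.47), ("6.1.1", 42.79), ("6.1.3", 0.01),
    ("6.2.0", 6.03), ("6.3.0", 2.81),
]


def by_release(rows):
    """Sum percentages per major.minor release, dropping the last part."""
    totals = defaultdict(float)
    for version, pct in rows:
        major_minor = version.rsplit(".", 1)[0]
        totals[major_minor] += pct
    return dict(totals)


for release, pct in sorted(by_release(ROWS).items()):
    name = WINDOWS_NAMES.get(release, "unknown")
    print(f"{release}  {pct:6.2f}%  {name}")
```

The per-row percentages are rounded, so the sums will not add up to
exactly 100%. Grouped this way, XP is about 30% and Windows 7 about 57%
of the hotfix-compatible clients.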

At this point, we are considering releasing a minor hotfix followup to 
try to capture more data from users apparently stuck in a notification 
loop and to fix other low-hanging fruit. We may follow up with or skip 
to a new major version of the hotfix that fixes larger bugs and is 
possibly more aggressive about getting an upgrade to work.



More information about the firefox-dev mailing list