Telemetry Experiments: experiment, A/B, and questionnaire implementation in Firefox
benjamin at smedbergs.us
Tue Jan 28 23:14:34 UTC 2014
One of the things I have been looking at in some detail recently is how
we can use qualitative measurements in Firefox. This includes better
integration of existing Telemetry and FHR systems, but also measurements
which don't fit into those systems.
Part of my study was prompted by a request from Mozilla user research,
who want to be able to run experiments and collect data from Firefox
users, in a similar way to the mostly-defunct Test Pilot program, but
with a better sample population and more rigorous engineering and release.
I also collected examples of problems that various groups have wanted to
solve using data collection. It turns out that some of these use cases
can already be solved using existing deployment and measurement systems,
while others need additional features.
I'm interested to know whether there are other important cases which are
significantly different from the ones which I've already collected. In
particular, I'm looking at the following variables:
* User population: what kind of user population is desirable/necessary
in order to answer the question? For an early-stage UI demo, the
desired population may be users who want to live on the bleeding
edge and are willing to live with bugs. For some studies, we may
want to examine the behavior of users in particular countries or of
users who have particular addons installed.
* Data privacy characteristics: in order to answer the question, do we
need to collect any identifying information, such as URLs? Does
collecting the data provide direct benefit back to users?
* Engineering: does the measurement require changing core code, or can
the measurement be implemented as addon code? What is the expected
quality of the change being considered?
* Result monitoring: what kind of result monitoring is necessary? Do
we expect a single report to run after a while, or will this measure
ongoing Firefox behavior? Is it important to be able to correlate
results against other pieces of data?
* User interactions: to what extent should users be aware that a
measurement or experiment is in progress? Do we want to ask them
specific questions, or does the experiment require some sort of
opt-in or opt-out? (This is related to the questions about privacy
and user population.)
This quarter (in Firefox 30) my team is going to focus on building out
one specific system, a way to deploy experiment code to prerelease users
in Firefox desktop builds. We're going to start out small, solving a
specific request from Gregg Lind in user research for a tool to deploy
some experiments related to search behavior in Firefox.
Each experiment will be deployed as a restartless addon, and
measurements will be taken via some combination of existing FHR and
telemetry data collection channels. The experiment system will be
limited to people with telemetry enabled(1) and each experiment will
also be able to set additional conditions, such as limiting the
experiment to users in certain release channels, locales, addons or lack
of addons, etc.
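As a rough illustration of how such conditions might be evaluated, here is a minimal Python sketch. It is not the planned implementation: the Client and Conditions types, their field names, and the addon ID are all hypothetical and chosen only to mirror the conditions listed above (telemetry enabled, release channel, locale, addons or lack of addons).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Client:
    """Hypothetical snapshot of one Firefox profile (not a real API)."""
    telemetry_enabled: bool
    channel: str                      # e.g. "nightly", "aurora", "beta"
    locale: str                       # e.g. "en-US"
    addons: frozenset = frozenset()   # IDs of installed addons


@dataclass(frozen=True)
class Conditions:
    """Hypothetical per-experiment conditions; None means "no restriction"."""
    channels: Optional[frozenset] = None
    locales: Optional[frozenset] = None
    required_addons: frozenset = frozenset()
    excluded_addons: frozenset = frozenset()


def is_eligible(client: Client, cond: Conditions) -> bool:
    """Return True if the client may be offered the experiment."""
    # Telemetry is the baseline requirement for every experiment.
    if not client.telemetry_enabled:
        return False
    if cond.channels is not None and client.channel not in cond.channels:
        return False
    if cond.locales is not None and client.locale not in cond.locales:
        return False
    # Every required addon must be installed.
    if not cond.required_addons <= client.addons:
        return False
    # No excluded addon may be installed ("lack of addons").
    if cond.excluded_addons & client.addons:
        return False
    return True
```

For example, a search experiment limited to en-US nightly and aurora users without a particular addon would set channels, locales, and excluded_addons, and leave required_addons empty.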
After this first phase is complete, I expect to extend this system. We
will probably want to be able to run similar experiments in Firefox for
Android, although addons can do far less UI modification in general. We
will also want to handle A/B testing where we don't install an addon,
but simply flip various pref configurations. We also plan on extending
this same system as a way to deploy questionnaires or surveys to users.
For example, if we find an addon which appears to be malware, we might
ask users whether they know the addon is installed, whether they
installed it intentionally, etc. I am interested in whether people have
specific high-priority studies or surveys in mind that can serve as
models for future revisions.
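To make the pref-flipping A/B case concrete, here is one possible sketch in Python. How users would actually be assigned to branches has not been decided; hashing a client ID together with an experiment ID is simply a common way to get a stable, roughly even split, and is an assumption here. The experiment ID, branch names, and pref name are hypothetical.

```python
import hashlib

# Hypothetical branches: each maps pref names to the values to set.
BRANCH_PREFS = {
    "control": {},
    "treatment": {"example.hypothetical.pref": True},
}


def assign_branch(client_id: str, experiment_id: str, branches: list) -> str:
    """Deterministically pick a branch for this client and experiment."""
    key = f"{experiment_id}:{client_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the hash as an integer bucket.
    index = int.from_bytes(digest[:8], "big") % len(branches)
    return branches[index]


def prefs_for(client_id: str, experiment_id: str):
    """Return (branch name, prefs to flip) for this client."""
    # Sort the names so the branch order does not depend on dict ordering.
    branch = assign_branch(client_id, experiment_id, sorted(BRANCH_PREFS))
    return branch, BRANCH_PREFS[branch]
```

Because the assignment depends only on the two IDs, the same client lands in the same branch on every run, and including the experiment ID in the hash keeps assignments in different experiments independent of each other.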
Finally, we are considering whether and how to combine FHR and telemetry
data collection. Each system currently has weaknesses which we'd like to
address, and it seems that the best way forward is to combine them. This
is still in early decision-making, but I've written up a proposal here
Questions, concerns? Followup to firefox-dev please.
1. Currently telemetry is enabled by default in nightly and aurora
builds, and I have requested that it be enabled by default in all
prerelease builds (including beta). Being able to run experiments on
beta users and measure the results is critical, since our beta user
population is much more representative of release users.