<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<br>
<div class="moz-cite-prefix">On 12/22/2014 5:46 AM, Gervase Markham
wrote:<br>
</div>
<blockquote cite="mid:5497F67F.9020006@mozilla.org" type="cite">
<pre wrap="">On 20/12/14 07:15, Roberto Agostino Vitillo wrote:
</pre>
<blockquote type="cite">
<pre wrap="">If you work with Telemetry data please have a look at this
blogpost: <a class="moz-txt-link-freetext" href="http://robertovitillo.com/2014/12/19/clientid-in-telemetry-submissions/">http://robertovitillo.com/2014/12/19/clientid-in-telemetry-submissions/</a>.
</pre>
</blockquote>
<pre wrap="">
I seem to remember that when we first did telemetry, one of the user
privacy protections we built in was that individual users could not be
tracked across sessions. We accepted the limitations of this in order to
preserve privacy.
Does this blog post mean we have changed our position?</pre>
</blockquote>
<br>
The short answer is yes.<br>
<br>
We did not build an identifier into Telemetry during its
initial development, partly because we wanted to deploy the system
and understand the data, and partly because many of the performance
metrics that Taras wanted to optimize didn't require an identifier.<br>
<br>
We've had a profile identifier as part of Firefox Health Report
(FHR) since it was initially deployed. This was originally a
"rolling" profile identifier which changed at each new upload, but
this rotating identifier caused significant data issues and was
replaced with a stable identifier earlier this year.<br>
<br>
As we've worked with both FHR and telemetry, we've learned several
things about the two systems:<br>
<ul>
<li>Engineering is painful on both the client and server because
FHR and telemetry are so different; it's impossible to
prototype/prove a measurement on the prerelease population
before making it part of FHR for the release population.</li>
<li>FHR is very complex and has a very long lag time between events
and their measurement.</li>
<li>There are important questions that Telemetry cannot answer
about user behavior over time because it lacks an identifier.</li>
<li>The telemetry dataset has a lot of bits of data that in many
cases can be used to heuristically associate records even without an
identifier.<br>
</li>
</ul>
<p>To solve some of these issues, we're combining the FHR and
telemetry systems for desktop Firefox, and using the same general
upload mechanism. For release (FHR) users, we will be collecting
the carefully controlled set of data that has a direct
relationship to improving the user experience, either for
individual users or in general. For prerelease (telemetry) users,
we will continue with the larger set of measurements including
speculative histograms.<br>
</p>
<p>Most measurements will be included in the main session ping, which
is associated with a profile ID. For metrics which are more
sensitive, we have the option of sending a separate data ping
which doesn't have the profile ID and is much less identifying in
general. I believe that overall this change is likely to improve
user privacy for the more sensitive data while at the same time
enabling necessary user-behavior analysis as we start running more
experiments to test features with our prerelease users.<br>
</p>
<p>More details about the unification plan can be found at
<a class="moz-txt-link-freetext" href="https://docs.google.com/document/d/1IGpzsYGi_sq3YFQDAPyKOkU_BKvXAC95fZYA2i4ceVs/edit?usp=sharing">https://docs.google.com/document/d/1IGpzsYGi_sq3YFQDAPyKOkU_BKvXAC95fZYA2i4ceVs/edit?usp=sharing</a></p>
<p>I'm happy to have feedback on the current plan. The primary
mailing list we're using to discuss and implement this plan is the
fhr-dev list, which includes client and server engineering as well
as metrics engineers and other interested parties:
<a class="moz-txt-link-freetext" href="https://mail.mozilla.org/listinfo/fhr-dev">https://mail.mozilla.org/listinfo/fhr-dev</a>. As the Firefox data
collection steward/module owner, I am ultimately responsible for
Firefox data collection, including the privacy and data quality
tradeoffs that we have to make.<br>
</p>
<p>--BDS<br>
<br>
</p>
</body>
</html>