Changing the ID scheme for FHR

Benjamin Smedberg benjamin at smedbergs.us
Mon Mar 10 06:41:35 PDT 2014


Because of ongoing issues with data quality in Firefox Health Report 
(FHR), we are planning on changing the way data is sent and stored on 
the server.


Currently Firefox and the server try to cooperate in storing only one 
record for each browser instance; they do this by creating a new upload 
ID on each upload and removing the document stored under the prior 
upload ID. In practice this system is not working: backups, profile 
copying, machine imaging, and similar scenarios are causing "orphan" 
documents in the FHR dataset, skewing many kinds of statistics that FHR 
was designed to collect.
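
To illustrate how an orphan can arise, here is a rough Python sketch of 
the rotating-ID scheme. It is a toy model, not the real client or server 
code: the class names, the in-memory dict, and the old_id parameter are 
all assumptions made up for this example. It simulates a profile that is 
restored from a backup taken before its most recent upload.

```python
import uuid


class RotatingServer:
    """Toy in-memory document store keyed by upload ID (hypothetical)."""

    def __init__(self):
        self.documents = {}

    def upload(self, new_id, payload, old_id=None):
        # Store the new document, then drop the one from the prior upload.
        self.documents[new_id] = payload
        if old_id is not None:
            self.documents.pop(old_id, None)


class Client:
    """Toy client that remembers only its most recent upload ID."""

    def __init__(self):
        self.last_id = None

    def upload(self, server, payload):
        new_id = str(uuid.uuid4())
        server.upload(new_id, payload, old_id=self.last_id)
        self.last_id = new_id


def simulate_backup_restore():
    server = RotatingServer()
    client = Client()

    client.upload(server, {"day": 1})
    backed_up_id = client.last_id      # backup taken after the day-1 upload

    client.upload(server, {"day": 2})  # replaces the day-1 document

    client.last_id = backed_up_id      # profile restored from the backup
    client.upload(server, {"day": 3})  # asks to delete day 1, which is gone

    return server.documents


documents = simulate_backup_restore()
```

In this model the server ends up holding two documents for one browser: 
the day-2 document is never deleted, because the restored profile no 
longer knows its ID.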


We are changing this system to upload using a single identifier per 
browser instance for all uploads.


The original varying document ID was developed to improve user privacy. 
In reality it doesn't help: to associate FHR data with a particular 
person you would have to get the ID from their client, but the client 
already stores all of the relevant data. The rotating server ID 
therefore provides no additional privacy benefit.


For a transition period, we are going to add the stable identifier to 
the FHR payload: this will allow us to reliably measure the orphaning 
problem and give us time to add some additional error-handling and 
logging code to the collection servers. After we've verified that the 
stable ID isn't causing new problems, we will switch the client to 
upload using the stable ID.
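
As a rough idea of how the transition-period data could be used to 
measure orphaning, the Python sketch below groups stored documents by 
the stable identifier carried in the payload. The field name "stable_id" 
and the dict-of-documents layout are assumptions for illustration only, 
not the actual payload format. Note that a surplus document is not 
necessarily an orphan: copied profiles that are still active would show 
up the same way.

```python
from collections import Counter


def count_surplus_documents(documents):
    """Return {stable_id: surplus} for stable IDs seen in more than one document.

    `documents` maps upload ID -> payload dict. Payloads without the
    hypothetical "stable_id" field (older clients) are skipped.
    """
    counts = Counter(
        payload["stable_id"]
        for payload in documents.values()
        if "stable_id" in payload
    )
    # One document per stable ID is expected; anything beyond that is surplus.
    return {sid: n - 1 for sid, n in counts.items() if n > 1}


example = {
    "upload-a": {"stable_id": "x"},
    "upload-b": {"stable_id": "x"},
    "upload-c": {"stable_id": "y"},
    "upload-d": {},  # payload from a client that predates the stable ID
}
surplus = count_surplus_documents(example)
```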


In the case of profile copying and machine imaging, we may end up in a 
state where multiple Firefox profiles are uploading data to the same 
identifier. This may not be a problem in practice, but if we do measure 
this happening, we have a plan to help address the issue:


On upload, the server will compare the new data with existing data. If 
the data doesn't match, the server will log the affected documents and 
mark the upload ID as inactive. The next time any client tries to upload 
to an inactive ID, the server will instruct the client to generate and 
switch to a new random ID.
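
The flow above could look roughly like the following Python sketch. 
Everything in it is hypothetical: the class and function names, the 
string return values, and in particular the definition of "doesn't 
match" (here, any day present in both the stored and the new payload 
must carry identical data). The sketch also assumes that the upload 
which triggers the mismatch is not stored; the plan above does not 
specify that detail.

```python
import uuid


def is_consistent(existing, new):
    # Assumed rule: days present in both payloads must have identical data.
    return all(new[day] == data for day, data in existing.items() if day in new)


class StableIdServer:
    """Toy in-memory server for the stable-ID scheme (hypothetical)."""

    def __init__(self):
        self.documents = {}
        self.inactive = set()
        self.conflict_log = []

    def upload(self, client_id, payload):
        if client_id in self.inactive:
            # Tell the client to generate and switch to a new random ID.
            return "regenerate"
        existing = self.documents.get(client_id)
        if existing is not None and not is_consistent(existing, payload):
            # Log the affected documents and mark the ID as inactive.
            self.conflict_log.append((client_id, existing, payload))
            self.inactive.add(client_id)
            return "flagged"
        self.documents[client_id] = payload
        return "ok"


def client_upload(server, state, payload):
    """Upload once; if told to regenerate, switch IDs and retry."""
    result = server.upload(state["id"], payload)
    if result == "regenerate":
        state["id"] = str(uuid.uuid4())
        result = server.upload(state["id"], payload)
    return result


# Two copied profiles start out sharing one identifier.
server = StableIdServer()
profile_a = {"id": "shared"}
profile_b = {"id": "shared"}

r1 = client_upload(server, profile_a, {"d1": 1})
r2 = client_upload(server, profile_b, {"d1": 2})           # diverging data
r3 = client_upload(server, profile_a, {"d1": 1, "d2": 5})  # gets a new ID
r4 = client_upload(server, profile_b, {"d1": 2, "d2": 7})  # gets a new ID
```

In this model both profiles end up on distinct fresh IDs after the 
shared ID is marked inactive, and the server keeps one log entry 
describing the conflict.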


The client-side bug for the initial stage of this work is bug 968419.


Please direct any questions or concerns to the fhr-dev mailing list.


--BDS
