Suggestions for the new unified FHR/Telemetry/Experiment ping
Mark Reid
mreid at mozilla.com
Mon Jan 26 06:57:07 PST 2015
On Sun, Jan 25, 2015 at 12:58 PM, Benjamin Smedberg <benjamin at smedbergs.us>
wrote:
>
>
> Georg wrote:
>
> My assumption was that we will not reset.
>
>
> The proposal as written is that we will reset all the histograms for each
> subsession. Otherwise, realtime dashboards which process incoming pings will
> multiple-count various metrics, and we definitely want to avoid this.
>
So Vladan's interpretation in the initial message is correct.
>
>
> On 1/24/2015 1:35 AM, Vladan Djeric wrote:
>
>
>
> - It will be hard to do per-session analyses
>
>
> I have several responses here:
>
> 1) It will be a bit harder than currently, but I don't think that it will
> be extremely hard. There will be an efficient API to fetch all the pings
> associated with a user, which should make it relatively straightforward to
> stitch together an entire session from its pieces. This is a functional
> requirement for the more qualitative analyses, which will have to stitch
> together an entire user history and not just individual sessions. Doing an
> individual session should be fairly easy.
>
> 2) I treat the session orientation of telemetry as an unfortunate
> limitation, not a desirable property, for almost all of the use cases that
> I've seen. I'd like us to try and move away from reporting metrics based on
> sessions. Can you describe in more detail the use cases where analyzing
> data by session is preferable to analyzing by some constant denominator? We
> should be willing to use both clock time and activeTicks as denominators,
> and these denominators can both be calculated looking at individual
> subsession pings.
>
> 3) For the case of the current telemetry dashboard, I'd like to understand
> why simply replacing the current whole-session analysis with the new
> subsessions would produce statistically worse results than the current
> session-based analysis.
>
I believe the metrics team did a review of the general approach and
statistical validity of telemetry's use of histograms a few years back -
does anyone have a record of that?
Generally, I think there is value in maintaining continuity with the
existing telemetry dashboard if feasible. We are already aiming to change
a lot of things at once, so being able to compare with previously collected
Telemetry data would be nice.
>
>
> - Many of the 1000+ Telemetry measurements are inherently
> "per-session" and can't meaningfully be split into session fragments:
> - Flag histograms
> <https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Adding_a_new_Telemetry_probe#Choosing_a_Histogram_Type> track
> feature usage per-session.
> - They are automatically initialized to a value of "false" at the
> beginning of a session, and can only be set to "true" once.
> - If we reset Telemetry measurements every time we create a new
> ping, we'll be reporting nonsense: pings from the same session will
> contradict each other on whether a feature was ever used during the session
> - This would feed bad data to both the dashboards and any
> custom analyses
>
> *If* you really care about this per-session, why can't you just take
> "true" from any of the subsessions as an indication that it's true for the
> entire session?
>
> And if we just report by subsession, how is this much different from the
> skew that we already have between users who have lots of short sessions and
> users that keep their browser open for days or weeks?
>
> Maybe this just indicates that we're mis-using histograms for
> non-aggregate measurements, and we should just have a separate list of flag
> metrics which are treated differently.
>
I think the major difference is that one would need to know *for each
histogram* how to re-combine the subsessions into the whole-session
representation. At the moment, they can all be treated the same way. Maybe
having a different method per *type* of histogram would be enough; I don't
know.
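
As a rough illustration of the per-type idea, here is a minimal Python
sketch. The kind names ("flag", "count", "bucketed") and the value shapes
(a bool, an int, a bucket-to-count dict) are assumptions made for this
example, not the actual Telemetry representation.

```python
from collections import Counter
from typing import Any, Callable, Dict, List


def merge_flag(fragments: List[bool]) -> bool:
    # A flag is true for the whole session if any subsession set it.
    return any(fragments)


def merge_count(fragments: List[int]) -> int:
    # Counts from successive subsessions add up to the session total.
    return sum(fragments)


def merge_bucketed(fragments: List[Dict[int, int]]) -> Dict[int, int]:
    # Bucketed histograms are summed bucket by bucket.
    total: Counter = Counter()
    for fragment in fragments:
        total.update(fragment)
    return dict(total)


# Hypothetical mapping from histogram kind to its recombination rule.
MERGERS: Dict[str, Callable[[List[Any]], Any]] = {
    "flag": merge_flag,
    "count": merge_count,
    "bucketed": merge_bucketed,
}


def merge_subsessions(kind: str, fragments: List[Any]) -> Any:
    """Rebuild a whole-session value from its subsession fragments."""
    return MERGERS[kind](fragments)
```

If a table like this covers every histogram type, the recombination step
needs to know only the type of each histogram, not each histogram
individually.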
>
> - Count histograms
> <https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Adding_a_new_Telemetry_probe#Choosing_a_Histogram_Type> are
> also per-session measurements. You can't aggregate a count-histogram value
> from the middle of a session together with final values from other sessions
>
>
> Won't summing across the subsessions get you the total count for the
> session?
>
>
> - For custom analyses, we sometimes want to correlate measurements
> from the beginning of a session with measurements from the end of a session
> (which could have lasted several days), e.g. histograms related to startup
> performance vs later performance
> - We would need that messy server-side session reconstruction
> process to get at per-session data.
> - More generally, a ping generated as a result of local time &
> environment changes is not inherently meaningful to us, unlike a full user
> session
>
>
> I don't understand this case. Assuming session stitching works, which is a
> general requirement for all sorts of analyses, this should work no worse
> than currently, and you potentially have finer-grain data on the subsequent
> days if that's useful.
>
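
For reference, the stitching step discussed here might look roughly like
the Python sketch below. It assumes the pings for one user have already
been fetched, and the field names "session_id" and "subsession_counter"
are hypothetical; the real ping format may differ.

```python
from collections import defaultdict
from typing import Dict, List


def stitch_sessions(pings: List[dict]) -> Dict[str, List[dict]]:
    """Group subsession pings by session and order them within each session."""
    sessions: Dict[str, List[dict]] = defaultdict(list)
    for ping in pings:
        # "session_id" is an assumed field identifying the parent session.
        sessions[ping["session_id"]].append(ping)
    for fragments in sessions.values():
        # "subsession_counter" is an assumed per-session sequence number.
        fragments.sort(key=lambda p: p["subsession_counter"])
    return dict(sessions)
```

Comparing startup measurements with later ones would then mean comparing
the first fragment of a session with the later fragments.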
>
> - Resetting Telemetry and FHR data when a TelemetryExperiment begins
> removes valuable context from the experiment ping. It's possible to
> reconstruct it, but that's yet another server-side job to run
>
> I don't understand this. Is this also assuming that stitching is
> expensive?
>
>
> - There's overhead from sending a new ping for each mid-session
> environment change
> - There's also a small privacy issue with creating ordered,
> fine-grained reports of user actions, e.g. when a user goes through their
> add-ons list and disables 5 addons, we report each user action
> - Either coalesce successive environment-change pings, or carefully
> vet which mid-session environment changes generate a new ping
>
>
> I think it's worth considering whether there's a window of time where
> multiple changes get coalesced. But I'm not particularly worried about the
> privacy problem, since we do in fact want to record when users disable
> addons.
>
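
One possible shape for such a coalescing window, as a Python sketch. The
window length and the use of plain timestamps in seconds are assumptions
for illustration only; nothing here reflects a decided design.

```python
from typing import List


def coalesce_changes(timestamps: List[float], window: float) -> List[List[float]]:
    """Group change timestamps so each group starts a new window.

    A change joins the current group if it occurs within `window` seconds
    of the first change in that group; otherwise it opens a new group.
    """
    groups: List[List[float]] = []
    for ts in sorted(timestamps):
        if groups and ts - groups[-1][0] <= window:
            groups[-1].append(ts)
        else:
            groups.append([ts])
    return groups
```

Each resulting group would trigger one ping instead of one ping per
change, so disabling five add-ons in quick succession would produce a
single ping.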
> I'd like to propose that we implement the following modifications to the
> FHR/Telemetry v4 document:
>
> 1. Do not reset *Telemetry* measurements when a session crosses the
> 24-hour boundary
> - Continue to "reset" Telemetry measurements when we start a new
> session
> - There's no need to reset Telemetry on most environment changes
> (e.g. amount of memory installed) since those can't happen without a
> Firefox restart anyway.
> 2. Record mid-session environment changes (add-ons and
> TelemetryExperiments) in a special section in the ping.
> - For each such environment change, document the change in the
> section and also attach a snapshot of the Telemetry & FHR data at the time
> of the change
> - After the snapshot is saved, reset Telemetry and FHR measurements
> for the current session. In other words, snapshot & then build up a diff
> - For each additional environment change during the same session,
> just repeat and append to the new section
> - Telemetry backend scripts (dashboard, regression detector etc)
> can just ignore experiment/add-on change pings
>
> This model has some nice properties:
>
> - The *final ping* of a session is equivalent to a Telemetry
> saved-session ping
> - Per-session analyses are as easy to do as before
> - No need to run any session reconstruction jobs!
> - Every main ping submitted is meaningful without needing any
> reconstruction steps. All pings will contain the current FHR state + all
> the Telemetry measurements from the current session
> - Most pings will only have one environment change, so the relevant
> measurements that happened after the change are all going to be in the
> regular Telemetry/FHR section
> - However, when deeper analysis is required, Experiment pings will
> also have information about what was happening BEFORE the experiment began
> - Analyzing pings with multiple environment changes won't be much
> harder
>
>
> I feel like this proposal is optimizing for the wrong things.
>
> You are making a distinction between "Telemetry" measurements and other
> measurements in a way which I am specifically trying to avoid. The goal is
> to use the common histogram system for everything. At least some of those
> measurements must be distinguished by subsession. I explicitly want to get
> rid of the current situation where "telemetry metrics" are treated one way,
> and "FHR metrics" are treated in some entirely separate manner. We want to
> be able to use the standard histograms/keyed histograms for almost
> everything.
>
> For the simple things like the telemetry dashboard, I believe that doing
> all analysis by subsession is good enough (no worse than the current
> situation). For more complex queries, both stitching together an entire
> session and stitching together the history per-user will not only be
> possible but should be fairly efficient.
>
> --BDS
>
>