Suggestions for the new unified FHR/Telemetry/Experiment ping
Benjamin Smedberg
benjamin at smedbergs.us
Sun Jan 25 08:58:05 PST 2015
Georg wrote:
> My assumption was that we will not reset.
The proposal as written is that we will reset all the histograms for
each subsession. Otherwise, realtime dashboards which process incoming
pings will multiple-count various metrics, and we definitely want to
avoid this.
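To illustrate the multiple-counting problem, here is a minimal Python sketch. The bucket name and the numbers are made up for illustration, and `aggregate` is a hypothetical stand-in for whatever a dashboard does with incoming pings:

```python
from collections import Counter

def aggregate(pings):
    """Sum histogram buckets across pings, as a streaming dashboard might."""
    total = Counter()
    for ping in pings:
        total.update(ping)  # adds each bucket's count to the running total
    return total

# Hypothetical session with 5 real events: 3 in the first subsession, 2 in the second.
cumulative_pings = [{"bucket_0": 3}, {"bucket_0": 5}]  # no reset: the second ping repeats the first 3
reset_pings = [{"bucket_0": 3}, {"bucket_0": 2}]       # reset at each subsession boundary

print(aggregate(cumulative_pings)["bucket_0"])  # 8, which overcounts the 5 real events
print(aggregate(reset_pings)["bucket_0"])       # 5
```

With resets, a consumer can add up pings as they arrive without knowing which session they belong to.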
On 1/24/2015 1:35 AM, Vladan Djeric wrote:
>
> * It will be hard to do per-session analyses
>
I have several responses here:
1) It will be a bit harder than currently, but I don't think that it
will be extremely hard. There will be an efficient API to fetch all the
pings associated with a user, which should make it relatively
straightforward to stitch together an entire session from its pieces.
This is a functional requirement for the more qualitative analyses,
which will have to stitch together an entire user history and not just
individual sessions. Doing an individual session should be fairly easy.
2) I treat the session orientation of telemetry as an unfortunate
limitation, not a desirable property, for almost all of the use cases
that I've seen. I'd like us to try and move away from reporting metrics
based on sessions. Can you describe in more detail the use cases where
analyzing data by session is preferable to analyzing by some constant
denominator? We should be willing to use both clock time and activeTicks
as denominators, and these denominators can both be calculated looking
at individual subsession pings.
3) For the case of the current telemetry dashboard, I'd like to
understand why simply replacing the current whole-session analysis with
the new subsessions would produce statistically worse results than the
current session-based analysis.
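As a rough sketch of points 1) and 2), assuming each subsession ping carries a session identifier, a per-session counter, and the two denominators. The field names `sessionId`, `subsessionCounter`, `hangs`, `activeTicks`, and `clockSeconds` are hypothetical and are not taken from the proposal:

```python
from collections import defaultdict

def stitch_sessions(pings):
    """Group one user's pings by session and order each session's subsessions."""
    sessions = defaultdict(list)
    for ping in pings:
        sessions[ping["sessionId"]].append(ping)
    return {
        sid: sorted(parts, key=lambda p: p["subsessionCounter"])
        for sid, parts in sessions.items()
    }

def rate(pings, metric, denominator):
    """Total of `metric` divided by total of `denominator` over the given pings."""
    num = sum(p[metric] for p in pings)
    den = sum(p[denominator] for p in pings)
    return num / den if den else None

# Made-up pings for a single user, arriving out of order.
pings = [
    {"sessionId": "s1", "subsessionCounter": 2, "hangs": 1, "activeTicks": 100, "clockSeconds": 3600},
    {"sessionId": "s1", "subsessionCounter": 1, "hangs": 3, "activeTicks": 300, "clockSeconds": 7200},
    {"sessionId": "s2", "subsessionCounter": 1, "hangs": 0, "activeTicks": 50, "clockSeconds": 600},
]

sessions = stitch_sessions(pings)
print([p["subsessionCounter"] for p in sessions["s1"]])  # [1, 2]

# Per-session rate after stitching, and a rate computed straight from
# subsession pings with no stitching at all.
print(rate(sessions["s1"], "hangs", "activeTicks"))
print(rate(pings, "hangs", "clockSeconds"))
```

The second `rate` call never looks at `sessionId`, which is the point of using a constant denominator: the subsession pings can be summed directly.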
> * Many of the 1000+ Telemetry measurements are inherently
> "per-session" and can't meaningfully be split into session fragments:
> o Flag histograms
> <https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Adding_a_new_Telemetry_probe#Choosing_a_Histogram_Type> track
> feature usage per-session.
> + They are automatically initialized to a value of "false"
> at the beginning of a session, and can only be set to
> "true" once.
> + If we reset Telemetry measurements every time we create a
> new ping, we'll be reporting nonsense: pings from the same
> session will contradict each other on whether a feature
> was ever used during the session
> # This would feed bad data to both the dashboards and
> any custom analyses
>
*If* you really care about this per-session, why can't you just take
"true" from any of the subsessions as an indication that it's true for
the entire session?
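In Python terms this is a logical OR over the subsessions. A minimal sketch, assuming the flag values for one session have already been collected into a list (`session_flag` is a hypothetical helper name):

```python
def session_flag(subsession_flags):
    """A flag is true for the session if any subsession reported it as true."""
    return any(subsession_flags)

print(session_flag([False, True, False]))  # True
print(session_flag([False, False]))        # False
```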
And if we just report by subsession, how is this much different from the
skew that we already have between users who have lots of short sessions
and users that keep their browser open for days or weeks?
Maybe this just indicates that we're mis-using histograms for
non-aggregate measurements, and we should just have a separate list of
flag metrics which are treated differently.
> o Count histograms
> <https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Adding_a_new_Telemetry_probe#Choosing_a_Histogram_Type> are
> also per-session measurements. You can't aggregate a
> count-histogram value from the middle of a session together
> with final values from other sessions
>
Won't summing across the subsessions get you the total count for the
session?
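A minimal sketch of that sum, assuming the count is reset at every subsession boundary so that each ping reports only its own increment (`session_count` is a hypothetical helper name and the numbers are made up):

```python
def session_count(subsession_counts):
    """Total count for a session, given the per-subsession increments."""
    return sum(subsession_counts)

print(session_count([4, 0, 7]))  # 11
```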
> * For custom analyses, we sometimes want to correlate measurements
> from the beginning of a session with measurements from the end of
> a session (which could have lasted several days), e.g. histograms
> related to startup performance vs later performance
> o We would need that messy server-side session reconstruction
> process to get at per-session data.
> o More generally, a ping generated as a result of local time &
> environment changes is not inherently meaningful to us, unlike
> a full user session
>
I don't understand this case. Assuming session stitching works, which is
a general requirement for all sorts of analyses, this should work no
worse than currently, and you potentially have finer-grained data on the
subsequent days if that's useful.
> * Resetting Telemetry and FHR data when a TelemetryExperiment begins
> removes valuable context from the experiment ping. It's possible
> to reconstruct it, but that's yet another server-side job to run
>
I don't understand this. Is this also assuming that stitching is expensive?
> * There's overhead from sending a new ping for each mid-session
> environment change
> o There's also a small privacy issue with creating ordered,
> fine-grained reports of user actions, e.g. when a user goes
> through their add-ons list and disables 5 addons, we report
> each user action
> o Either coalesce successive environment-change pings, or
> carefully vet which mid-session environment changes generate a
> new ping
>
I think it's worth considering whether there's a window of time where
multiple changes get coalesced. But I'm not particularly worried about
the privacy problem, since we do in fact want to record when users
disable addons.
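One possible shape for such a coalescing window, sketched in Python. The 60-second window, the tuple layout, and the `coalesce` name are all assumptions for illustration; nothing here is specified in the proposal:

```python
def coalesce(changes, window_seconds):
    """Batch environment changes that fall within window_seconds of the
    first change in the current batch.

    `changes` is a list of (timestamp_seconds, description) tuples,
    sorted by timestamp. Each returned batch would produce one ping.
    """
    batches = []
    for ts, desc in changes:
        if batches and ts - batches[-1]["start"] <= window_seconds:
            batches[-1]["changes"].append(desc)
        else:
            batches.append({"start": ts, "changes": [desc]})
    return batches

# A user disables three add-ons in quick succession, then one more much later.
changes = [(0, "disable addon A"), (10, "disable addon B"),
           (20, "disable addon C"), (400, "disable addon D")]
batches = coalesce(changes, window_seconds=60)
print(len(batches))               # 2
print(len(batches[0]["changes"])) # 3
```

Each batch still records every individual change, so nothing is lost about which add-ons were disabled; only the number of pings goes down.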
> I'd like to propose that we implement the following modifications to
> the FHR/Telemetry v4 document:
>
> 1. Do not reset _Telemetry_ measurements when a session crosses the
> 24-hour boundary
> * Continue to "reset" Telemetry measurements when we start a new
> session
> * There's no need to reset Telemetry on most environment changes
> (e.g. amount of memory installed) since those can't happen
> without a Firefox restart anyway.
> 2. Record mid-session environment changes (add-ons and
> TelemetryExperiments) in a special section in the ping.
> * For each such environment change, document the change in the
> section and also attach a snapshot of the Telemetry & FHR data
> at the time of the change
> * After the snapshot is saved, reset Telemetry and FHR
> measurements for the current session. In other words, snapshot
> & then build up a diff
> * For each additional environment change during the same
> session, just repeat and append to the new section
> * Telemetry backend scripts (dashboard, regression detector etc)
> can just ignore experiment/add-on change pings
>
> This model has some nice properties:
>
> * The *final ping* of a session is equivalent to a Telemetry
> saved-session ping
> o Per-session analyses are as easy to do as before
> o No need to run any session reconstruction jobs!
> * Every main ping submitted is meaningful without needing any
> reconstruction steps. All pings will contain the current FHR state
> + all the Telemetry measurements from the current session
> * Most pings will only have one environment change, so the relevant
> measurements that happened after the change are all going to be in
> the regular Telemetry/FHR section
> * However, when deeper analysis is required, Experiment pings will
> also have information about what was happening BEFORE the
> experiment began
> * Analyzing pings with multiple environment changes won't be much harder
>
I feel like this proposal is optimizing for the wrong things.
You are making a distinction between "Telemetry" measurements and other
measurements in a way which I am specifically trying to avoid. The goal
is to use the common histogram system for everything. At least some of
those measurements must be distinguished by subsession. I explicitly
want to get rid of the current situation where "telemetry metrics" are
treated one way, and "FHR metrics" are treated in some entirely separate
manner. We want to be able to use the standard histograms/keyed
histograms for almost everything.
For the simple things like the telemetry dashboard, I believe that doing
all analysis by subsession is good enough (no worse than the current
situation). For more complex queries, both stitching together an entire
session and stitching together the history per-user will not only be
possible but should be fairly efficient.
--BDS