RR API Discussion (librr)

Benoit Girard bgirard at mozilla.com
Mon Oct 13 22:43:00 PDT 2014


On Mon, Oct 13, 2014 at 11:48 PM, Chris Jones <jones.chris.g at gmail.com> wrote:
> On Mon, Oct 13, 2014 at 5:33 PM, Benoit Girard <bgirard at mozilla.com> wrote:
>>
>> More importantly what API should rr provide?
>
>
> Do you mind enumerating your use cases for recording and replay?  That would
> help drive this discussion.

My use case (we should probably fork a discussion on how to use rr for
performance at some point) is to extend the Gecko profiler UI,
Cleopatra. Currently the profiler keeps a time-ordered list of samples.
Cleopatra could connect to librr via js-ctypes when an rr recording is
present, and librr could provide features to dig into what happens
between profiler samples within a hotspot. Ideally, for usability, the
librr JS bridge could automate things. Today the profiler is less than
useful in cases where the hotspot is obvious but the crucial
information isn't on the stack, such as restyles.
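
To make the js-ctypes angle concrete, here's a very rough sketch of
what a minimal C ABI for librr could look like. None of these names
exist today; they're purely illustrative of the kind of surface
Cleopatra could ctypes.open() and call into.

// Purely hypothetical librr C ABI -- rr doesn't ship a library
// interface today, and every name below is invented for illustration.
#include <cstdint>

extern "C" {
  // Opaque handle to a loaded recording.
  struct rr_trace;

  // Open/close a recorded trace directory.
  rr_trace* rr_trace_open(const char* trace_dir);
  void rr_trace_close(rr_trace* trace);

  // Bounds of the recording, expressed as rr event numbers.
  int64_t rr_trace_first_event(rr_trace* trace);
  int64_t rr_trace_last_event(rr_trace* trace);
}

The JS bridge would just mirror these signatures with js-ctypes and
pass the opaque pointers around.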

Here are the use cases that I think would be useful for the replay API:

1) Mapping a sample in the profile to an rr timestamp somehow. This
should of course be very general. Samples can be (nearly) uniquely
matched by their timestamp. Being able to map the moment when the
variable 'delta' at the call site below equals a certain value to an
rr timestamp would solve this:
http://mxr.mozilla.org/mozilla-central/source/tools/profiler/TableTicker.cpp?rev=4ca3b16e24bd#770.
The call site in the profiler could even be modified if required.
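
To make 1) concrete, here is an entirely hypothetical sketch of what
such a mapping call could look like against the surface above (the
function name is invented; only the TableTicker.cpp line comes from
the link):

#include <cstdint>
#include <cstdio>

extern "C" {
  struct rr_trace;  // opaque recording handle, as in the sketch above
  // Replay forward until the breakpoint at file:line is hit with
  // |condition| (a debugger-style expression) true, and return the rr
  // event number at that point, or -1 if it is never hit.
  int64_t rr_run_until_condition(rr_trace* trace, const char* file,
                                 int line, const char* condition);
}

// Annotate one profiler sample with the rr event it corresponds to.
int64_t EventForSample(rr_trace* trace, double sample_time_ms) {
  char cond[64];
  snprintf(cond, sizeof(cond), "delta == %f", sample_time_ms);
  return rr_run_until_condition(trace, "TableTicker.cpp", 770, cond);
}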

2) Being able to replay to an rr timestamp. From the profiler timeline
we could then open the debugger at a nearby sample and let the user
inspect and follow the execution. This could be useful in figuring out
why a restyle (or similar) was unusually expensive.
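
Again purely as a sketch with invented names, 2) might be a single
call that replays to an event and then listens for a debugger
connection, roughly the hand-off 'rr replay' already does with gdb:

#include <cstdint>

extern "C" {
  struct rr_trace;  // opaque recording handle, as in the earlier sketch
  // Replay up to |event|, then block listening for a gdb connection on
  // |port| so the profiler UI can hand off to a real debugger there.
  int rr_debug_at_event(rr_trace* trace, int64_t event, int port);
}

The timeline would resolve a sample to an event via 1) and then call
this with the nearest event.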

3) Unsure: Being able to collect an ordered list of PC/IP values for
all the threads between two rr timestamps. I'm not sure if this is
feasible in a reasonable amount of time. It would be nice to select
50 ms of execution in the profiler trace and have a time slider where
you can watch the PC/IP and quickly follow the execution through the
code by dragging the slider/stepping offline. When an interesting spot
has been found, 2) could be used to open a debugger there. Inspecting
code by stepping in GDB is painful, frustratingly slow, and prone to
stepping past the point of interest.
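
If 3) did turn out to be feasible, the API could be as dumb as a
single-step loop; everything below is invented, and whether it runs in
reasonable time is exactly the open question:

#include <cstdint>
#include <utility>
#include <vector>

extern "C" {
  struct rr_trace;  // opaque recording handle, as in the earlier sketch
  int64_t rr_current_event(rr_trace* trace);
  // Advance the replay by one instruction in whichever task runs next;
  // returns the tid that ran and writes its program counter to |*pc|.
  int32_t rr_single_step(rr_trace* trace, uint64_t* pc);
}

// Ordered (tid, pc) trace between the current position and |end_event|,
// which the profiler UI could scrub through with a time slider.
std::vector<std::pair<int32_t, uint64_t>>
CollectPcs(rr_trace* trace, int64_t end_event) {
  std::vector<std::pair<int32_t, uint64_t>> pcs;
  while (rr_current_event(trace) < end_event) {
    uint64_t pc = 0;
    int32_t tid = rr_single_step(trace, &pc);
    pcs.emplace_back(tid, pc);
  }
  return pcs;
}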

4) Gathering expensive logging data during the replay step instead of
the record step. When profiling you wouldn't log the world, but it's
common after profiling to find that the events you wanted are not in
the log. Being able to use 2) to launch a debugger would help a great
deal, but it would be even better to have something use the
ReplaySession to gather information that wasn't logged during the
original run. For instance, after the fact we could go and fetch a
detailed log of the IPDL/IPC messages and their timings/ordering
without explicitly instrumenting and logging the data during the
record+profiling session. Likewise we could replay and fish for extra
networking event data, or the list of events added to the Gecko event
loop.
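
A sketch of 4), again with invented names: librr calls back into the
tool whenever a chosen function is entered during replay, so the IPC
log gets reconstructed after the fact instead of being recorded up
front. The IPC symbol is only an example of something one might hook.

#include <cstdint>
#include <cstdio>

extern "C" {
  struct rr_trace;  // opaque recording handle, as in the earlier sketch
  typedef void (*rr_hit_callback)(rr_trace* trace, int64_t event,
                                  void* closure);
  // Replay to the end of the recording, invoking |cb| every time the
  // function named |symbol| is entered.
  int rr_replay_with_hook(rr_trace* trace, const char* symbol,
                          rr_hit_callback cb, void* closure);
}

static void LogIpcSend(rr_trace*, int64_t event, void*) {
  // A real tool would also read the message name/size out of the
  // replayed process's memory here; that read API is left out of this
  // sketch.
  fprintf(stderr, "IPC send at rr event %lld\n", (long long)event);
}

void DumpIpcLog(rr_trace* trace) {
  rr_replay_with_hook(trace, "mozilla::ipc::MessageChannel::Send",
                      &LogIpcSend, nullptr);
}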

5) Doing 4) but needing to diverge from the recorded execution.
Getting a display list dump is very useful when a rendering
performance problem has been caught, but it's not usually turned on,
and even when it is, the performance overhead is sizable.
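
And a sketch of 5), with every librr name invented: fork a throwaway
copy of the replayed process at a chosen event and call a dump helper
there, so the display list dump never has to be enabled while
recording.

#include <cstdint>

extern "C" {
  struct rr_trace;      // opaque recording handle, as before
  struct rr_diversion;  // forked, no-longer-deterministic replay copy
  rr_diversion* rr_diverge_at_event(rr_trace* trace, int64_t event);
  // Evaluate a debugger-style |expression| inside the diverged process
  // and return its textual result (ownership glossed over here).
  const char* rr_diversion_call(rr_diversion* d, const char* expression);
  void rr_diversion_destroy(rr_diversion* d);
}

const char* DumpDisplayListAt(rr_trace* trace, int64_t event) {
  rr_diversion* d = rr_diverge_at_event(trace, event);
  // The expression is illustrative; a real tool would call whatever
  // dump helper Gecko actually exposes at that point in the replay.
  const char* dump = rr_diversion_call(d, "DumpDisplayListForDebug()");
  rr_diversion_destroy(d);
  return dump;
}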

Here's a rough use case sketch putting some of the above together to
debug a rendering performance problem:
https://www.dropbox.com/s/9av6tq190vk8k9u/2014-10-14%2001.36.25.jpg?dl=0
Basically the profiler trace will often catch something like 'too much
time spent in pixman blending/scaling', but the profile can still be
missing a lot of crucial information. Why was the page layerized this
way? Which content node(s) contributed the most to the pixman
bottleneck?

