Handling state collisions

Robert O'Callahan robert at ocallahan.org
Tue Mar 17 00:46:36 UTC 2015

>From the beginning, rr has depended on the assumption that the states
defined by (perfcounter tick count, general-purpose register values) pairs
are unique during a program execution. Reverse-execution stresses this
assumption much more than before, because we often do a large number of
single-step operations, thus examining almost every state reached for a
given perfcounter-tick value. I've run into a real-life example where the
assumption doesn't hold.

It's in a debug build of Gecko, where some code calls
Matrix::ProjectRectBounds, which starts with:
  points[0] = ProjectPoint(aRect.TopLeft());
  points[1] = ProjectPoint(aRect.TopRight());
  points[2] = ProjectPoint(aRect.BottomRight());
  points[3] = ProjectPoint(aRect.BottomLeft());
During the execution of those lines, no conditional branches execute.
ProjectPoint is
  Point4D ProjectPoint(const Point& aPoint) const {
    float z = -(aPoint.x * _13 + aPoint.y * _23 + _43) / _33;
    return *this * Point4D(aPoint.x, aPoint.y, z, 1);
There is (at least) one point inside the Point4D constructor where, during
different invocations of the constructor, the GP registers are identical.

So, what to do about it? One option would be to use an opt Gecko build. I
suspect more aggressive register allocation, plus more inlining, would make
this bug go away in this case at least.

Another option would be to incorporate extra-registers (x86 XSAVE area)
into the state tuple. That would fix the case I'm looking at, because the
projected points have different coordinates and show up in the XMM
registers. However, I think it wouldn't solve this bug when the input rect
is empty, which is likely to occur.

Another option would be to do some best-effort stack walking to incorporate
some return addresses into the state tuple. In an x86 debug build, [$rbp]
is usually the caller's RBP and [$rbp+8] is the return address in the
caller. We can therefore record N return addresses by reading 16 bytes per
stack frame. This won't work in opt builds in general, but hopefully that
doesn't matter. This should not have much performance impact, since this
data is usually not needed.

I'll try implementing the last option.

oIo otoeololo oyooouo otohoaoto oaonoyooonoeo owohooo oioso oaonogoroyo
owoiotoho oao oboroootohoeoro oooro osoiosotoeoro owoiololo oboeo
osouobojoeocoto otooo ojouodogomoeonoto.o oAogoaoiono,o oaonoyooonoeo
osoaoyoso otooo oao oboroootohoeoro oooro osoiosotoeoro,o o‘oRoaocoao,o’o
oaonosowoeoroaoboloeo otooo otohoeo ocooouoroto.o oAonodo oaonoyooonoeo
osoaoyoso,o o‘oYooouo ofooooolo!o’o owoiololo oboeo oiono odoaonogoeoro
otohoeo ofoioroeo ooofo ohoeololo.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/rr-dev/attachments/20150317/ac061986/attachment.html>

More information about the rr-dev mailing list