Handling state collisions

Kyle Huey me at kylehuey.com
Tue Mar 17 00:55:13 UTC 2015

On Mon, Mar 16, 2015 at 5:46 PM, Robert O'Callahan <robert at ocallahan.org>

> From the beginning, rr has depended on the assumption that the states
> defined by (perfcounter tick count, general-purpose register values) pairs
> are unique during a program execution. Reverse-execution stresses this
> assumption much more than before, because we often do a large number of
> single-step operations, thus examining almost every state reached for a
> given perfcounter-tick value. I've run into a real-life example where the
> assumption doesn't hold.
> It's in a debug build of Gecko, where some code calls
> Matrix::ProjectRectBounds, which starts with:
>   points[0] = ProjectPoint(aRect.TopLeft());
>   points[1] = ProjectPoint(aRect.TopRight());
>   points[2] = ProjectPoint(aRect.BottomRight());
>   points[3] = ProjectPoint(aRect.BottomLeft());
> During the execution of those lines, no conditional branches execute.
> ProjectPoint is
>   Point4D ProjectPoint(const Point& aPoint) const {
>     float z = -(aPoint.x * _13 + aPoint.y * _23 + _43) / _33;
>     return *this * Point4D(aPoint.x, aPoint.y, z, 1);
>   }
> There is (at least) one point inside the Point4D constructor where, during
> different invocations of the constructor, the GP registers are identical.
> So, what to do about it? One option would be to use an opt Gecko build. I
> suspect more aggressive register allocation, plus more inlining, would make
> this bug go away in this case at least.
> Another option would be to incorporate extra-registers (x86 XSAVE area)
> into the state tuple. That would fix the case I'm looking at, because the
> projected points have different coordinates and show up in the XMM
> registers. However, I think it wouldn't solve this bug when the input rect
> is empty, which is likely to occur.
> Another option would be to do some best-effort stack walking to
> incorporate some return addresses into the state tuple. In an x86 debug
> build, [$rbp] is usually the caller's RBP and [$rbp+8] is the return
> address in the caller. We can therefore record N return addresses by
> reading 16 bytes per stack frame. This won't work in opt builds in general,
> but hopefully that doesn't matter. This should not have much performance
> impact, since this data is usually not needed.
> I'll try implementing the last option.

I also think the last option is worth exploring.  On some architectures
(e.g. ARM) that use a link register you get this for free, so it seems like
it would contribute to architecture parity.

- Kyle
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/rr-dev/attachments/20150316/b2b51dda/attachment.html>

More information about the rr-dev mailing list