Problems in Subprocesses of Tracees

Jun Inoue jun.lambda at gmail.com
Mon Dec 11 15:04:44 UTC 2017


On Sat, Dec 9, 2017 at 11:23 AM, Robert O'Callahan <robert at ocallahan.org> wrote:
> Your explanation makes sense. I don't really have any idea what the bug
> could be.
>
> One way to narrow down the divergence would be to inject additional dummy
> syscalls (could be anything, even just an invalid syscall number) into the
> trace between the last known good point and the divergence. For example
> between the print, close and exit. If you can inject them into if_print that
> would help too. Basically, the control flow seems to diverge between
> recording and replay and those divergences are only detected at traced
> syscall boundaries, so inserting more of those boundaries narrows down the
> window during which control flow diverges. The control flow probably
> diverges based on some data value so if we narrow it down enough we can
> figure out what the changed data value is.
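
For reference, a marker of the kind suggested above could be as small as the sketch below. The helper name trace_marker is hypothetical, and the sketch assumes an out-of-range syscall number simply fails (normally with ENOSYS) without side effects.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical helper: issue a syscall with an invalid number so that the
 * tracer sees one more syscall boundary. The call is expected to fail and
 * change nothing else; errno is restored so that the surrounding code
 * (e.g. error handling after ioctl) is not disturbed. */
static long trace_marker(void)
{
    int saved_errno = errno;
    long ret = syscall(-1L);   /* -1 is not a valid syscall number */
    errno = saved_errno;
    return ret;
}
```

Calling trace_marker() between the print, close and exit would add one boundary per call site.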

I just came across -C on-syscalls, which easily revealed the offending
syscall.  It's this line:

    if (ioctl(skfd, SIOCGIFMAP, &ifr) < 0)

in lib/interface.c, on line 459, of net-tools-1.60+git20161116.90da8a0
as downloaded by 'apt-get source net-tools' on Ubuntu 17.04.

The memory divergence file suggests ifr.ifr_ifru.ifru_data and
ifr.ifr_ifru.ifru_newname have diverged after this call.  Before the
call, ifr looks like

{ifr_ifrn = {ifrn_name = "enxb88d1255ba5c"},
 ifr_ifru = {
    ifru_addr = {sa_family = 1500,
      sa_data = "\000\000\022U\272\\\340_\341\230\372U\000"},
    ifru_dstaddr = {sa_family = 1500,
      sa_data = "\000\000\022U\272\\\340_\341\230\372U\000"},
    ifru_broadaddr = {sa_family = 1500,
      sa_data = "\000\000\022U\272\\\340_\341\230\372U\000"},
    ifru_netmask = {sa_family = 1500,
      sa_data = "\000\000\022U\272\\\340_\341\230\372U\000"},
    ifru_hwaddr = {sa_family = 1500,
      sa_data = "\000\000\022U\272\\\340_\341\230\372U\000"},
    ifru_flags = 1500, ifru_ivalue = 1500, ifru_mtu = 1500,
    ifru_map = {mem_start = 6681746532955325916,
      mem_end = 94534795091936, base_addr = 24704,
      irq = 207 '\317', dma = 139 '\213', port = 252 '\374'},
    ifru_slave = "\334\005\000\000\022U\272\\\340_\341\230\372U\000",
    ifru_newname = "\334\005\000\000\022U\272\\\340_\341\230\372U\000",
    ifru_data = 0x5cba5512000005dc <error: Cannot access memory at
      address 0x5cba5512000005dc>}}

where enxb88d1255ba5c is the name of my wireless interface.  I guess
this is another case of this issue:
https://github.com/mozilla/rr/issues/1827
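
One way to see how much of the union the kernel actually writes for this request, independent of rr, is to poison the union before the call and count the bytes that changed afterwards. This is only a sketch: the helper name bytes_changed_by_gifmap is hypothetical, and it is not how net-tools itself calls the ioctl. A byte that the kernel happens to set to the poison value is counted as unchanged, so the result is a lower bound.

```c
#define _DEFAULT_SOURCE
#include <string.h>
#include <unistd.h>
#include <net/if.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

/* Hypothetical helper: fill ifr_ifru with a poison byte, issue SIOCGIFMAP
 * for the named interface, and count how many bytes of the union no longer
 * hold the poison value. Returns -1 if the socket or the ioctl fails. */
static int bytes_changed_by_gifmap(const char *ifname)
{
    struct ifreq ifr;
    const unsigned char poison = 0xAA;
    int changed = 0;

    int skfd = socket(AF_INET, SOCK_DGRAM, 0);
    if (skfd < 0)
        return -1;

    memset(&ifr, 0, sizeof ifr);
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
    memset(&ifr.ifr_ifru, poison, sizeof ifr.ifr_ifru);

    if (ioctl(skfd, SIOCGIFMAP, &ifr) < 0) {
        close(skfd);
        return -1;
    }
    close(skfd);

    /* Bytes still equal to the poison value were (most likely) not written. */
    const unsigned char *p = (const unsigned char *)&ifr.ifr_ifru;
    for (size_t i = 0; i < sizeof ifr.ifr_ifru; i++)
        if (p[i] != poison)
            changed++;
    return changed;
}
```

Comparing the result for an interface such as "lo" (an assumed name) against sizeof ifr.ifr_ifru shows whether any bytes of the union are left as they were before the call.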

According to the memory divergence file, there are a dozen other places
where the values disagree, but I don't know what those addresses are
used for.  Any tips/ideas on how to figure this out?  Some of them
look like mere administrative differences between recording and
replaying.

The memory divergence file can be found here
(https://drive.google.com/open?id=1u-FQaqBIqzZXObJsnwC3RAhBNK8rJflf),
though I guess it won't make much sense without the binaries.
The address range [0x7ffc8bcf5c20..0x7ffc8bcf5c48) is the ifr.
I should also note that the experimental conditions have changed since
my initial report, in which ifconfig was being called from an rtcd.  If
you just run ifconfig under rr on a computer with wireless, you can
(probably) reproduce this problem.
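
The quoted range is 0x28 = 40 bytes long, which is what sizeof(struct ifreq) should be on x86-64 Linux (an assumption about the ABI, not something checked against the trace). The sketch below maps an address from the divergence file to an offset inside ifr and says whether it falls in the ifr_ifru union. The helper names are hypothetical and the two constants are just the endpoints quoted above.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <net/if.h>

/* Endpoints of the half-open range quoted from the divergence file. */
static const unsigned long long IFR_START = 0x7ffc8bcf5c20ULL;
static const unsigned long long IFR_END   = 0x7ffc8bcf5c48ULL;

/* Hypothetical helper: byte offset of addr within ifr, or -1 if addr lies
 * outside [IFR_START, IFR_END). */
static long ifr_offset(unsigned long long addr)
{
    if (addr < IFR_START || addr >= IFR_END)
        return -1;
    return (long)(addr - IFR_START);
}

/* Hypothetical helper: 1 if addr falls inside the ifr_ifru union (the part
 * SIOCGIFMAP fills in), 0 if it is in the name or outside ifr entirely. */
static int in_ifru_union(unsigned long long addr)
{
    long off = ifr_offset(addr);
    return off >= (long)offsetof(struct ifreq, ifr_ifru);
}
```

Diverging addresses that do not land inside this range belong to something other than ifr and need to be identified separately.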


>
> Rob
> --
> lbir ye,ea yer.tnietoehr  rdn rdsme,anea lurpr  edna e hnysnenh hhe uresyf
> toD
> selthor  stor  edna  siewaoeodm  or v sstvr  esBa  kbvted,t
> rdsme,aoreseoouoto
> o l euetiuruewFa  kbn e hnystoivateweh uresyf tulsa rehr  rdm  or rnea lurpr
> .a war hsrer holsa rodvted,t  nenh hneireseoouot.tniesiewaoeivatewt sstvr
> esn



-- 
Jun Inoue

