Kernel bypass networking using pseudo-syscalls?

Robert O'Callahan robert at ocallahan.org
Tue Aug 1 22:24:24 UTC 2017


On Wed, Aug 2, 2017 at 9:37 AM, Chris Pick <rr at chrispick.com> wrote:

> I was evangelizing rr to a friend last night
>

Thank you! :-)


> who's been debugging a program that uses kernel bypass for networking.
>
> I assumed that 1) userspace uses DMA to interact with the networking
> hardware and 2) like shared memory, DMA isn't supported by rr.  If those are
> true and, further assuming 3) all the DMA/magic is hidden behind a pair of
> send_pkt() and recv_pkt() functions,
>

How true is assumption #3, do you know? I don't know anything about these
interfaces.


> would it be possible to have rr treat that pair as a set of custom
> pseudo-syscalls, recording their inputs and outputs for later replay?
>
> I imagine something similar must be done if rr supports recording/replaying
> vDSO functions?
>

We do that by patching the vDSO entry points to perform the equivalent
regular syscall. That is pretty simple to do.
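
As a rough illustration of what "perform the equivalent regular syscall"
means (this is not rr's patching code, and the function name is made up for
the example), the fast vDSO path for clock_gettime can be replaced by an
explicit kernel entry that a ptrace-based tracer is able to observe:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: does the work of the vDSO clock_gettime entry
 * point, but by trapping into the kernel with the ordinary syscall.
 * The vDSO version runs entirely in userspace, so a tracer never sees it;
 * this version is visible as a normal syscall. */
static int clock_gettime_via_syscall(clockid_t clk, struct timespec *ts)
{
    assert(ts != NULL);
    return (int)syscall(SYS_clock_gettime, clk, ts);
}
```

Calling clock_gettime_via_syscall(CLOCK_MONOTONIC, &ts) fills in ts the same
way the libc call would, just without the userspace shortcut.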

I think that what you're suggesting could be implemented, but it would be
significant work. One way to do it would be to read symbols during
recording to locate the recv_pkt() function, and patch its exit with an
rr-specific system call which takes the packet buffer address and size as
parameters. This would basically behave the same as a read syscall:
logging the output buffer during recording, and storing it back there
during replay. Then one would add support for that syscall to librrpreload
to get a non-kernel fast path.

If you could manually patch the recv_pkt() function or a wrapper around it,
that would make this a lot easier.
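
To make the wrapper idea concrete, here is a minimal sketch in C of the
record/replay behaviour described above. Everything in it is an assumption
for illustration: the recv_pkt() signature, the pkt_log structure, the log
format (a length followed by the bytes) and the fake device function are all
hypothetical, and this does not use rr's syscall buffering or librrpreload
at all. It only shows the "log the output buffer when recording, store it
back when replaying" pattern:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed shape of the kernel-bypass receive call: fills buf with up to
 * cap bytes and returns the packet length. */
typedef int (*recv_pkt_fn)(void *buf, size_t cap);

enum pkt_mode { PKT_RECORD, PKT_REPLAY };

struct pkt_log {
    enum pkt_mode mode;
    FILE *file;            /* trace of received packets */
    recv_pkt_fn real_recv; /* only used while recording */
};

/* Wrapper the program would call instead of recv_pkt().
 * Returns the packet length, or -1 on a log error or exhausted trace. */
static int logged_recv_pkt(struct pkt_log *log, void *buf, size_t cap)
{
    int32_t n;

    assert(log != NULL && log->file != NULL);
    if (log->mode == PKT_RECORD) {
        /* Recording: touch the device, then save what it produced. */
        n = log->real_recv(buf, cap);
        if (fwrite(&n, sizeof n, 1, log->file) != 1)
            return -1;
        if (n > 0 && fwrite(buf, 1, (size_t)n, log->file) != (size_t)n)
            return -1;
        return n;
    }
    /* Replay: never touch the device, just restore the saved bytes. */
    if (fread(&n, sizeof n, 1, log->file) != 1)
        return -1;
    if (n > 0) {
        if ((size_t)n > cap)
            return -1;
        if (fread(buf, 1, (size_t)n, log->file) != (size_t)n)
            return -1;
    }
    return n;
}

/* Stand-in for the real device call, only here to exercise the sketch. */
static int fake_recv_pkt(void *buf, size_t cap)
{
    static const char payload[] = "hello";
    size_t n = sizeof payload - 1;

    if (n > cap)
        n = cap;
    memcpy(buf, payload, n);
    return (int)n;
}
```

A recording run would set mode to PKT_RECORD with real_recv pointing at the
real receive function; a replay run would open the same log with PKT_REPLAY
and leave real_recv unused.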

Of course that approach would only work if the program does not have data
races involving the DMA buffer. If it does, rr might not produce the
correct execution during replay.

Another question is whether it would be possible for your user to configure
their program to not use the kernel bypass, e.g. using a regular socket API
instead, and whether that would make it impossible for them to debug their
bug.

Rob