Handling unshare() and kernel namespaces
jones.chris.g at gmail.com
Thu Apr 16 16:44:32 UTC 2015
On Thu, Apr 16, 2015 at 4:40 AM, Robert O'Callahan <robert at ocallahan.org> wrote:
> However, during replay the directory chroot()ed to doesn't exist, because
> it's created and unlinked during the recording, so it seems to me we can't
> execute the chroot() after all.
The directory might not exist in the sandbox case, but programs like apache
can also chroot to well-known system directories (or at least could in the
past). This feels similar to our copy-or-not-copy mmap heuristics: if the
chroot dir is user-writable, we ought to "copy" it (see below), and if
it's a system dir, optimistically assume it won't meaningfully change.

Then during replay, on a chroot to a non-copied dir we would just execute
it, and on a chroot to a copied dir we could create a new tmp dir in our
emufs like we do for copied mmap files and then execute the chroot. (We
would need to preserve identity because theoretically multiple processes
could chroot into the same dir.)
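To make the heuristic and the identity requirement concrete, here is a rough sketch in Python rather than rr's own code. Every name in it (`should_copy_chroot_dir`, `dir_identity`, `ReplayChrootDirs`, the `emufs_root` argument) is made up for illustration, and using `os.access` as the "user-writable" test is an assumption; a real recorder would likely need something stricter, since a root user can write everywhere.

```python
import os
import tempfile


def should_copy_chroot_dir(path):
    """Hypothetical record-time heuristic: "copy" dirs the user can write to.

    Assumption: os.access() is a good enough stand-in for "user-writable".
    It is always true for root, so this is only a sketch.
    """
    return os.access(path, os.W_OK)


def dir_identity(path):
    """Identity of a directory at record time: (device, inode)."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


class ReplayChrootDirs:
    """Maps each recorded dir identity to exactly one replay-time tmp dir."""

    def __init__(self, emufs_root):
        self.emufs_root = emufs_root
        self._by_identity = {}

    def replay_dir_for(self, recorded_identity):
        # Two tasks that chrooted into the same recorded dir must land in
        # the same replay dir, so create the tmp dir only on first use.
        if recorded_identity not in self._by_identity:
            self._by_identity[recorded_identity] = tempfile.mkdtemp(
                dir=self.emufs_root, prefix="chroot-")
        return self._by_identity[recorded_identity]
```

During replay, a chroot to a non-copied dir would bypass this table and be executed against the recorded path directly.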
But of course we wouldn't want to try preserving arbitrary fs layouts in
the copied dirs, so we can probably stipulate for now that on a chroot into
a user-writeable dir, the dir must be empty. (Maybe with an exception or
two TBD.) But if we make that stipulation, for that case it becomes less
important to actually execute the chroot during replay. So we could take
whichever approach looks simpler.
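The empty-dir stipulation could be checked at record time along these lines. This is only a sketch: `chroot_target_is_supported` is a hypothetical name, the writability test via `os.access` is an assumption, and the "exception or two TBD" is deliberately not modeled.

```python
import os


def chroot_target_is_supported(path):
    """Return True if a chroot into `path` fits the stipulation above.

    Non-writable dirs are treated as system dirs and accepted as-is.
    User-writable dirs are accepted only when they are empty.
    """
    if not os.access(path, os.W_OK):
        return True  # assumed to be a stable system dir
    with os.scandir(path) as entries:
        # The dir is empty if the iterator yields nothing at all.
        return next(entries, None) is None
```

A recorder that gets False back would presumably abort the recording with a clear message instead of producing a trace that can't be replayed.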
> That seems mostly OK since syscalls using file paths are almost entirely
> emulated (i.e., ignored) during replay. (Chris, I said on IRC that
> emulating chroot would be really hard, but fortunately I think I was
> totally wrong.)
Well, to be clear, I was referring to emulation during recording, or in
other words boiling away the chroot by implementing it through chdir calls and
file path rewriting. I think we can still probably agree that that sounds
> The only exception I'm aware of is execve(), where the filename passed
> during recording is passed to the kernel for execution during replay. This
> is already a problem when a tracee execs a temporary file. However,
> execve() in a chroot() sandbox is unlikely to be used, since the usual
> ld.so interpreter and standard libraries cannot be loaded, so the
> executable has to be carefully crafted or system libraries carefully pulled
> into the sandbox. In practice I think we can just fail if execve() occurs
> after a chroot().
That sounds fine to me. The most common use of chroot I'm aware of is
exactly to prevent exec'ing arbitrary code, so that lines up well.
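The "just fail if execve() occurs after a chroot()" rule amounts to one flag per task. The sketch below is illustrative only: `TaskState`, `UnsupportedTraceError`, and the hook names are hypothetical, not rr's actual structures. It assumes the flag is copied to children, since the kernel passes the root directory on across fork.

```python
class UnsupportedTraceError(Exception):
    """Raised when the tracee does something the recorder won't handle."""


class TaskState:
    """Per-task bookkeeping for the chroot-then-execve check."""

    def __init__(self, chrooted=False):
        self.chrooted = chrooted

    def on_chroot_succeeded(self):
        # Only successful chroot() calls change the task's root.
        self.chrooted = True

    def on_fork(self):
        # The child inherits the parent's root dir, so copy the flag.
        return TaskState(chrooted=self.chrooted)

    def on_execve(self, filename):
        if self.chrooted:
            raise UnsupportedTraceError(
                "execve(%r) after chroot() is not supported" % filename)
```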
> It would be nice to treat the exec'ed file similarly to other mmapped
> files, copying it to the trace or saving a hardlink to the trace directory
> and exec'ing that. (Theoretically we should do the same for ld.so or
> whatever other interpreter is specified for the binary.) It's a bit tricky
> to do because the filename passed to the kernel affects the layout and
> contents of the memory after exec, so we'd have to record and replay more
> of that. I think we can put this off.
Also agreed, and I think it's mostly orthogonal to chroot processing,