Sandboxing means running a program in a restricted environment (for example, with no permission to open new files and with no or restricted network access) in order to protect the system from malicious or erroneous software.
In Fedora Linux, the `policycoreutils` package contains `bin/sandbox`, a sandbox based on SELinux.
This sandbox is not perfect, however, so in this post I will describe some proposed changes and implementation considerations for improving it.
The first thing to say is that it is implemented as two executables: a Python script which calls (if there are no errors) a binary program written in C. In principle, this two-level structure should be eliminated (for example, for performance reasons) and replaced with a single monolithic C program. However, this is not urgent.
Consider two scenarios in which execution of the sandboxed program ends:
- It terminates normally.
- The user or software does not want to wait more than 30 seconds and kills it with SIGKILL. (Somebody may argue that it should first be sent SIGTERM and given time to exit gracefully before SIGKILL. But this probably doesn’t matter for a sandboxed program, as it has not opened any files anyway and thus there is no need to close them.)
It terminates normally
In this case we (or rather the application which started the sandbox to do some calculations in it) need to know when it terminates.
Note that the sandboxed program may fork/exec children and then exit itself. It may also call setsid().
At first it may seem that we can just waitpid() for the sandbox process, but the process may create children and then exit itself. That would give a false sense that our program has really finished, while polluting the process space with child processes which may never exit. Therefore I propose that the sandbox process fork before loading the actual sandboxed program. The forked process would first move itself into a cgroup and then execute (now without forking) the actual sandboxed program. The original process would wait until the cgroup becomes empty.
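The fork, join, exec sequence proposed above can be sketched in a few lines. This is only an illustration, not the actual sandbox code: it assumes the cgroup directory already exists and that a process joins it by writing its PID to the `cgroup.procs` file, and the function names `join_cgroup` and `launch_in_cgroup` are hypothetical.

```python
import os
import sys


def join_cgroup(cgroup_dir: str, pid: int) -> None:
    """Move process `pid` into the cgroup by writing to its cgroup.procs file."""
    with open(os.path.join(cgroup_dir, "cgroup.procs"), "w") as f:
        f.write(f"{pid}\n")


def launch_in_cgroup(cgroup_dir: str, argv: list[str]) -> int:
    """Fork once; the child joins the cgroup, then execs argv without forking again.

    Returns the child's PID in the parent process.
    """
    pid = os.fork()
    if pid == 0:
        try:
            # Join the cgroup *before* exec, so everything the sandboxed
            # program later forks is created inside the cgroup as well.
            join_cgroup(cgroup_dir, os.getpid())
            os.execvp(argv[0], argv)
        finally:
            # Reached only if joining the cgroup or the exec itself failed.
            os._exit(127)
    return pid


if __name__ == "__main__" and len(sys.argv) > 2:
    # Hypothetical usage: script.py <cgroup directory> <command> [args...]
    child = launch_in_cgroup(sys.argv[1], sys.argv[2:])
    os.waitpid(child, 0)  # reaps the direct child only, not its descendants
```

Note that the waitpid() at the end only reaps the direct child. Deciding that the whole sandbox has finished still requires checking that the cgroup is empty.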
Waiting until a cgroup becomes empty is probably possible with the 2.4 Notification API. (Please comment on whether it can be done this way.) If this turns out to be impossible, a Linux kernel patch should be devised.
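For comparison, here is a rough sketch of one way to wait for an empty cgroup. It does not use the notification API mentioned above. Instead it assumes the cgroup v2 hierarchy, where each non-root cgroup has a `cgroup.events` file whose `populated` key is 1 while the cgroup or any descendant contains live processes. The names `is_populated` and `wait_until_empty` are hypothetical.

```python
import os
import time


def is_populated(cgroup_dir: str) -> bool:
    """Return True if cgroup.events reports live processes in the cgroup subtree."""
    with open(os.path.join(cgroup_dir, "cgroup.events")) as f:
        for line in f:
            key, _, value = line.partition(" ")
            if key == "populated":
                return value.strip() == "1"
    raise ValueError("no 'populated' key in cgroup.events")


def wait_until_empty(cgroup_dir: str, timeout: float, interval: float = 0.1) -> bool:
    """Re-read cgroup.events until the cgroup is empty.

    Returns True when the cgroup became empty, or False if `timeout`
    seconds passed first (the caller would then kill the sandbox).
    """
    deadline = time.monotonic() + timeout
    while is_populated(cgroup_dir):
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True
```

Sleeping between reads is the simplest option. The kernel also generates a file-modified event on `cgroup.events` when a value changes, so a more careful version could block on that event instead of polling on a timer.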
The user or software does not want to wait more than 30 seconds
First we need to freeze this cgroup (so that no hacker can create new processes in the cgroup, possibly faster than we can kill them).
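As a minimal sketch of the freezing step, assuming the cgroup v2 freezer (where writing 1 to `cgroup.freeze` freezes the cgroup and all of its descendants, and `cgroup.events` reports `frozen 1` once that has completed), it could look like this. The helper names `freeze` and `is_frozen` are hypothetical.

```python
import os


def freeze(cgroup_dir: str) -> None:
    """Ask the kernel to freeze the cgroup and all of its descendants (cgroup v2)."""
    with open(os.path.join(cgroup_dir, "cgroup.freeze"), "w") as f:
        f.write("1\n")


def is_frozen(cgroup_dir: str) -> bool:
    """Return True once cgroup.events reports 'frozen 1'."""
    with open(os.path.join(cgroup_dir, "cgroup.events")) as f:
        fields = dict(line.split() for line in f if line.strip())
    return fields.get("frozen") == "1"
```

Freezing is not instantaneous, so a caller would normally re-check `is_frozen` after calling `freeze` before it starts enumerating processes.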
Then we should recursively enumerate all processes in this cgroup and all its subgroups and kill every process with SIGKILL.
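The enumeration and kill step might look roughly like the following. This is a sketch under the assumption that every cgroup directory lists its member PIDs, one per line, in a `cgroup.procs` file. The names `iter_pids` and `kill_all` are hypothetical, and the `kill` parameter exists only so that the function can be tried without signalling real processes.

```python
import os
import signal
from typing import Callable, Iterator


def iter_pids(cgroup_dir: str) -> Iterator[int]:
    """Yield every PID listed in cgroup.procs of cgroup_dir and of all its subgroups."""
    for root, _dirs, files in os.walk(cgroup_dir):
        if "cgroup.procs" in files:
            with open(os.path.join(root, "cgroup.procs")) as f:
                for line in f:
                    if line.strip():
                        yield int(line)


def kill_all(cgroup_dir: str, kill: Callable[[int, int], None] = os.kill) -> list[int]:
    """Send SIGKILL to every process in the cgroup subtree; return the PIDs signalled."""
    killed = []
    for pid in iter_pids(cgroup_dir):
        try:
            kill(pid, signal.SIGKILL)
            killed.append(pid)
        except ProcessLookupError:
            pass  # the process exited between enumeration and kill
    return killed
```

Because the cgroup is frozen first, the list of PIDs should not change while it is being walked. The `ProcessLookupError` handler is only a safety net.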