While working on crun, I was surprised by how much time the kernel
spent in the copy_mount_options function.
The root cause was passing an empty string instead of NULL when there
are no options for the mount syscall.
In the common mount case, copy_mount_options takes most of the time.
The data argument of the mount(2) syscall lets userspace pass some
additional options down to the kernel. How these options are
interpreted is specific to the file system. Generally it is a
comma-separated string of values.
    int mount(const char *source, const char *target,
              const char *filesystemtype, unsigned long mountflags,
              const void *data);
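As an illustration of the comma-separated format, here is a minimal sketch of a helper that joins individual options into one string suitable for the data argument. The name join_mount_options is hypothetical and not part of crun or libc; the tmpfs options in the usage note are only an example of what a file system might accept.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: join n option strings with commas into buf.
   Returns 0 on success, -1 if buf is too small to hold the result. */
static int
join_mount_options (char *buf, size_t size, const char *const *opts, size_t n)
{
  size_t used = 0;

  if (size == 0)
    return -1;
  buf[0] = '\0';
  for (size_t i = 0; i < n; i++)
    {
      /* Prefix every option except the first with a comma. */
      int len = snprintf (buf + used, size - used, "%s%s",
                          i ? "," : "", opts[i]);
      if (len < 0 || (size_t) len >= size - used)
        return -1;
      used += (size_t) len;
    }
  return 0;
}
```

For example, joining "size=64m" and "mode=0755" yields "size=64m,mode=0755", which could then be passed as data when mounting a tmpfs.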
On a mount, the kernel internally allocates a page of memory and
copies data into it. If the whole page cannot be copied from userspace
because a fault happened, the rest of the buffer is zeroed with memset.
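The behaviour described above can be modelled in userspace. The sketch below is not kernel code: the function name, the MODEL_PAGE_SIZE constant (assumed to be 4096) and the readable parameter, which stands in for the point where a fault would stop the copy, are all assumptions made for illustration. It shows that even a one-byte empty string ends up costing a full page of work.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096 /* assumed page size for this model */

/* Userspace model of the copy: take as many bytes as are readable from
   src, up to one page, and zero whatever is left of the page.
   Returns the number of bytes actually copied from src. */
static size_t
model_copy_mount_options (char *page, const char *src, size_t readable)
{
  size_t copied = readable < MODEL_PAGE_SIZE ? readable : MODEL_PAGE_SIZE;

  memcpy (page, src, copied);
  memset (page + copied, 0, MODEL_PAGE_SIZE - copied);
  return copied;
}
```

In this model, the whole page is written on every call, whether the options are a long string or just "".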
If there are no options to pass down, using NULL is preferable to
the empty string, as the kernel will not allocate a memory page and
won't attempt any copy from user space.
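One way to apply this in a caller is a small wrapper that normalizes the options before the syscall. This is a minimal sketch, and mount_data_or_null is a hypothetical name, not a function from crun or libc.

```c
#include <stddef.h>

/* Hypothetical helper: map a missing or empty option string to NULL so
   that mount(2) receives no data pointer at all. */
static const char *
mount_data_or_null (const char *options)
{
  if (options == NULL || options[0] == '\0')
    return NULL;
  return options;
}
```

A call site would then look like mount (source, target, fstype, flags, mount_data_or_null (options)), so that callers holding an empty string do not need to special-case it themselves.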
To measure the improvement, I ran the following program:
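    #define _GNU_SOURCE
    #include <sched.h>
    #include <sys/mount.h>

    int main(void)
    {
        int i;

        /* Enter a new mount namespace. */
        unshare (CLONE_NEWNS);
        /* Remount /tmp repeatedly, passing "" as the data argument. */
        for (i = 0; i < 100000; i++)
            mount (NULL, "/tmp", NULL, MS_REMOUNT | MS_PRIVATE, "");
        return 0;
    }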
and I got these results:
# \time ./do-mounts
0.04user 7.64system 0:07.72elapsed 99%CPU (0avgtext+0avgdata 1192maxresident)k
0inputs+0outputs (0major+62minor)pagefaults 0swaps
Replacing the empty options string with NULL:
# \time ./do-mounts
0.04user 0.61system 0:00.66elapsed 99%CPU (0avgtext+0avgdata 1100maxresident)k
0inputs+0outputs (0major+60minor)pagefaults 0swaps
That is almost 12 times faster!