Run containers without pulling images
CRFS is a Google project that aims to run a container without pulling its image first. The key insight is that in practice a container process only accesses a small fraction of the files in its image, so fetching the entire image before startup wastes both time and disk space. CRFS achieves this through the stargz (Seekable tar.gz) format, which restructures each compressed layer so that individual files can be fetched on demand rather than requiring the entire tarball to be downloaded and extracted upfront.
The idea is quite smart: an OCI layer, which is basically a compressed tarball, is modified so that it is possible to seek inside it and access a single file. Instead of compressing the whole tar as a single stream, stargz compresses each file separately and concatenates the resulting gzip streams. Because a concatenation of gzip streams is itself a valid gzip stream, old clients are still able to handle the stargz’ipped layer as a regular .tar.gz file.
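To make the concatenation trick concrete, here is a minimal Python sketch. It is not the real stargz layout: it compresses raw file contents instead of tar entries, and the file names and the in-memory `index` are made up for illustration. It only shows that a blob built from per-file gzip streams can be read both as a whole and one file at a time.

```python
import gzip
import io

# Hypothetical file contents standing in for the entries of a layer.
files = {
    "etc/hostname": b"container\n",
    "usr/share/doc/README": b"rarely read documentation\n" * 100,
}

# Compress every file as its own gzip stream and append it to the blob,
# remembering where each stream starts and how long it is.
blob = io.BytesIO()
index = {}  # name -> (offset, compressed length); an assumption of this sketch
for name, data in files.items():
    member = gzip.compress(data)
    index[name] = (blob.tell(), len(member))
    blob.write(member)
layer = blob.getvalue()


def read_file(name):
    """Decompress a single file using only its own byte range of the blob."""
    offset, length = index[name]
    return gzip.decompress(layer[offset:offset + length])


# An "old client" that knows nothing about the index can still
# decompress the whole blob as one ordinary gzip stream.
whole = gzip.decompress(layer)
```

With a remote registry, the slice in `read_file` would presumably become a ranged read of the layer blob, so only the bytes of the requested file cross the network.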
In an attempt to support CRFS with fuse-overlayfs, I’ve worked on adding a plugin system to fuse-overlayfs (https://github.com/containers/fuse-overlayfs/pull/119). It will make it possible to extend fuse-overlayfs and support different ways to retrieve data from the lower layers.
The second step is a plugin that can handle CRFS. It is still a PoC, but it seems to work quite nicely: https://github.com/giuseppe/crfs-plugin
To create a stargz image, you’d need to use stargzify:
|
|
Once stargzify is installed, an image can be converted as:
|
|
The image was pushed to the registry. Let’s create a container:
|
|
The image reference, passed to fuse-overlayfs encoded in base64, is mounted at the merged directory.
|
|
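As a small illustration of the encoding step only, this Python sketch base64-encodes an image reference. The reference is hypothetical, the standard base64 alphabet is an assumption, and the exact option syntax the plugin expects is not reproduced here.

```python
import base64

# Hypothetical image reference, used only for illustration.
image = "docker.io/example/fedora:stargz"

# Standard base64 alphabet (an assumption of this sketch).
encoded = base64.b64encode(image.encode("utf-8")).decode("ascii")

# Decoding gives back the original reference.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
```

The encoded string contains no `:` or `,` characters, which are commonly used as separators in mount options.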
To run the container, we can take advantage of the Podman --rootfs feature. It tells Podman not to manage the storage for the container, but to use the specified path as its rootfs.
|
|
Now we are in a container where files from the lower layers will be loaded on demand when requested.