Docker pain

Ok, so you want to run some nice Docker containers. After some success you realise your little application lives in an image several hundred MB in size! After a bit of digging you find that, somewhere down the chain of base images, it is built on a full Linux distro. Turns out this is pretty common.

So after some more reading you find that you have a few options:

  • Base your image off something like "busybox", which gives you a nice small base.
  • Base your image off "scratch", which is a reserved image name that is completely empty.
  • Build your binary in a full OS image, then chain a scratch image onto that build (a multi-stage build), copying just the things you need into scratch.

Wait, what?

The busybox image is nice, but it contains a bunch of things you're probably never going to use. If you're obsessed with minimalism, looking inside that container is going to annoy you.

You can try to run regular compiled programs, but you have to use ldd $(which program) to figure out all the shared libraries and copy them into your container. Fragile!! It's fragile because the environment you build in is probably different from the busybox environment (e.g. a different libc).
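For what it's worth, the ldd step can be approximated without a shell. Here is a minimal sketch in Go, assuming a Linux ELF binary: it reads the DT_NEEDED entries using the standard debug/elf package. The function name sharedLibs is made up for illustration. Unlike ldd, it only lists the library names the binary asks for directly, not the resolved paths or the dependencies of those libraries, so treat it as an inventory aid and not a replacement.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// sharedLibs (hypothetical helper name) returns the DT_NEEDED entries of an
// ELF binary: the shared libraries the dynamic linker must find at run time.
// A statically linked binary yields an empty list.
func sharedLibs(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return f.ImportedLibraries()
}

func main() {
	// Inspect the path given as the first argument, or this program
	// itself when no argument is given.
	var path string
	if len(os.Args) > 1 {
		path = os.Args[1]
	} else {
		self, err := os.Executable()
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		path = self
	}

	libs, err := sharedLibs(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if len(libs) == 0 {
		fmt.Println("no shared library dependencies declared")
		return
	}
	for _, lib := range libs {
		fmt.Println(lib)
	}
}
```

Each name printed is a file you would have to copy into the container yourself, and each of those files may in turn need more libraries, which is exactly where the fragility comes from.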

It turns out the closest you will get to success is to write your thing in golang. Why? Because golang gives you the option of compiling statically. Yay! And it mostly works. It works until you use some golang library that depends on a shared object (typically through cgo). Dammit, back to copying shared object files inside the container again :-/
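To make the happy path concrete, here is a rough sketch of a program that stays static: it uses only the standard library and is meant to be built with cgo switched off. The greeting function, the command-line convention and the Dockerfile in the comment are assumptions for illustration, not a recommended layout.

```go
// Build a static binary (assumes a go.mod in the current directory):
//
//	CGO_ENABLED=0 go build -o app .
//
// Running ldd on the result should then report that it is not a dynamic
// executable. A possible Dockerfile that builds in a full image and ships
// only the binary in scratch (image tag and paths are assumptions):
//
//	FROM golang AS build
//	WORKDIR /src
//	COPY . .
//	RUN CGO_ENABLED=0 go build -o /app .
//
//	FROM scratch
//	COPY --from=build /app /app
//	ENTRYPOINT ["/app"]
package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
)

// greeting builds the response body; it falls back to "world" when no
// name is supplied.
func greeting(name string) string {
	if name == "" {
		name = "world"
	}
	return fmt.Sprintf("hello, %s\n", name)
}

func main() {
	// With no argument, print one greeting and exit. This doubles as a
	// smoke test, since a scratch image has no shell to poke around with.
	if len(os.Args) < 2 {
		fmt.Print(greeting(""))
		return
	}

	// With an address argument (for example ":8080"), serve greetings
	// over HTTP, reading the name from the "name" query parameter.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, greeting(r.URL.Query().Get("name")))
	})
	log.Fatal(http.ListenAndServe(os.Args[1], nil))
}
```

With cgo enabled, the net package may link against the system libc for name resolution; with CGO_ENABLED=0 it uses its pure Go resolver, which is why the binary ends up with no shared library dependencies.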

The other road is to use a full OS image to build all your things, then make an inventory of all the required things and copy them from the build image into a scratch image. Ugh.

C rules again

So the "easiest" thing is to write your program in C and link as much as you can statically. For libs that don't provide a static option, copy the shared objects into the container for runtime linking.

Writing webapps in C... Um.

Where to?

I don't know but I feel there are a few issues:

  • High level languages relying on being in an OS environment (I'm looking at you Perl - requiring a C compiler for all that XS)
  • It's really hard to make a stand-alone executable that has zero dependencies (and does something useful).
  • People are lazy and just base their container images off Debian or Ubuntu because they can't be bothered figuring all this out.
  • There is a tendency for higher level languages to call into "C" shared objects (I know shared objects are not "C", but they are probably compiled from C so..). It's probably better than rewriting all that code hundreds of times but it does mean your fancy language inherits all that baggage from the old world.

There is no technical reason why we can't have tiny efficient Docker containers. A simple statically compiled golang program is still almost 10 MB in size, and while that's a shit ton of machine code, it's still small compared to containers based off a full Debian distro for no good reason.

At the end of the day it would be nice to somehow revisit all this stuff now that we are in the 21st century and try to throw off some of the baggage that we have. Debian actively discourages library packages from distributing static archives (for good reason), which makes linking libraries statically more difficult. So I guess what we need is a way to bundle up an application with its shared libraries into a container-friendly thing. But then all the higher level things have to be taught how to play nicely with that, and that's unlikely.