Managing Unstable Dependencies in Docker: The Power of Intermediate Images

by Robert Moskal, October 5th, 2023

Too Long; Didn't Read

If your app has unstable third party dependencies, an intermediate image can ensure you can build anywhere.

Docker changed the software development game by packaging an application and its dependencies together. It largely eliminated the pain of onboarding developers and deploying applications. But there was an upside to the old way where an oil-covered sysadmin manually built an environment for an application. As long as nothing was touched, it would run until the server gave out.


Today, it's quite common for an application to be built and deployed many, many times a day. If dependencies for your application are unstable, you have to be careful. Say your app depends on something like Joe's gcsfuse library for "mounting and accessing Cloud Storage buckets as local file systems."


This might be crucial to the functioning of your application. It lives in a vendor repo on the internet and might get installed in your Dockerfile something like this:


FROM python:3.9-buster

# Add the vendor's apt repository and signing key, then install gcsfuse.
RUN set -e; \
    apt-get update -y && apt-get install -y \
        tini \
        lsb-release; \
    gcsFuseRepo=gcsfuse-`lsb_release -c -s`; \
    echo "deb http://packages.eat.at.joes.com/apt $gcsFuseRepo main" | \
        tee /etc/apt/sources.list.d/gcsfuse.list; \
    curl https://packages.eat.at.joes.com/apt/doc/apt-key.gpg | \
        apt-key add -; \
    apt-get update; \
    apt-get install -y gcsfuse \
    && apt-get clean


Towards the bottom of the Dockerfile, you might install your app- or language-specific dependencies and then specify the entry point into your container.


RUN pip install --no-cache-dir -r requirements.txt 
RUN chmod +x /app/cloud_driver.sh  
CMD ["/app/cloud_driver.sh"]  


As long as you don't edit the top part of the file, Docker uses the cached layer that lives on your local file system. And as long as the cache is intact, Docker will never pull from https://packages.eat.at.joes.com.


But if the cache is cleared or you run the build process on another machine, that repository must be available. If it's not, YOU ARE DEAD IN THE WATER!
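One way to find out whether a build quietly depends on a warm cache is to rebuild with the cache disabled, which approximates what a fresh machine would do. The following is a minimal sketch, assuming the Dockerfile sits in the current directory; the tag name cache-drill and the DRY_RUN switch are hypothetical conveniences, not part of the original setup.

```shell
#!/bin/sh
# Sketch: rebuild without the layer cache to simulate a fresh machine.
set -eu

# Print each command; execute it only when DRY_RUN=0 (hypothetical switch).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# --no-cache makes Docker re-run every instruction, including the apt steps
# that reach out to the vendor repository; --pull re-fetches the base image.
cold_build() {
    run docker build --no-cache --pull -t "$1" .
}

cold_build cache-drill
```

By default this only prints the command. Running it with DRY_RUN=0 performs the real build, which fails if the vendor repository is unreachable.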


In fact, I wrote this up after Google's gcsFuse repository went offline. In this case, a bunch of folks complained on GitHub. Google eventually provided a workaround.


Still, my team was blocked for several days, and someone had to edit the Dockerfile to implement the fix. If we had been dealing with eat.at.joes.com, who knows what might have happened.


Read on if it's crucial that you can always build your containerized application.

Intermediate Images to the Rescue

My preferred solution is to create two Docker images: one that installs the unstable dependencies, and another containing your app, which uses the first as its base.


So:

FROM python:3.9-buster

# Intermediate image: everything that depends on the vendor's apt repository.
RUN set -e; \
    apt-get update -y && apt-get install -y \
        tini \
        lsb-release; \
    gcsFuseRepo=gcsfuse-`lsb_release -c -s`; \
    echo "deb http://packages.cloud.google.com/apt $gcsFuseRepo main" | \
        tee /etc/apt/sources.list.d/gcsfuse.list; \
    curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | \
        apt-key add -; \
    apt-get update; \
    apt-get install -y gcsfuse \
    && apt-get clean


You then build this image and push it to a container registry:

docker build -t my-intermediary-image:latest .
docker push my-intermediary-image:latest
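In practice the image name usually carries a registry prefix, and a version tag alongside latest makes it easier to go back to an earlier build. A minimal sketch of that idea follows; the registry path registry.example.com/my-team, the version label, and the DRY_RUN switch are all hypothetical placeholders.

```shell
#!/bin/sh
# Sketch: build the intermediate image and push it under a versioned tag.
# REGISTRY and the version label are hypothetical; substitute your own.
set -eu

REGISTRY="${REGISTRY:-registry.example.com/my-team}"
IMAGE_NAME="my-intermediary-image"

# Compose the fully qualified image reference for a given version label.
image_ref() {
    printf '%s/%s:%s\n' "$REGISTRY" "$IMAGE_NAME" "$1"
}

# Print each command; execute it only when DRY_RUN=0 (hypothetical switch).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Build from the Dockerfile in the current directory, then push the result.
publish_intermediate() {
    ref="$(image_ref "$1")"
    run docker build -t "$ref" .
    run docker push "$ref"
}

publish_intermediate gcsfuse-v1
```

With DRY_RUN unset the script only prints the two docker commands. Set DRY_RUN=0 to run them once you are logged in to your registry.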


Base your application Dockerfile on the intermediate image:

FROM my-intermediary-image:latest

RUN pip install --no-cache-dir -r requirements.txt 
RUN chmod +x /app/cloud_driver.sh  
CMD ["/app/cloud_driver.sh"]  


Once pushed, my-intermediary-image is effectively immutable. It already contains gcsfuse, so building your application no longer touches the vendor's repository, and you'll be able to build almost anywhere.


You can build and tag a new intermediate image whenever there's a new and better version of gcsfuse. If you don't like it, you can simply build your app with a previous version.
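Falling back to a previous version can be reduced to a single flag if the base tag is a build argument. The sketch below assumes intermediate images have been pushed under version tags such as the hypothetical gcsfuse-v1; the argument name BASE_TAG, the image name my-app, and the DRY_RUN switch are likewise illustrative.

```shell
#!/bin/sh
# Sketch: choose the intermediate image version when building the app.
set -eu

# Write an application Dockerfile whose base tag is a build argument.
# An ARG declared before FROM may be used in the FROM line.
write_app_dockerfile() {
    cat > "$1" <<'EOF'
ARG BASE_TAG=latest
FROM my-intermediary-image:${BASE_TAG}

# (steps that copy the application into the image are omitted here)
RUN pip install --no-cache-dir -r requirements.txt
RUN chmod +x /app/cloud_driver.sh
CMD ["/app/cloud_driver.sh"]
EOF
}

# Print each command; execute it only when DRY_RUN=0 (hypothetical switch).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

workdir="$(mktemp -d)"
write_app_dockerfile "$workdir/Dockerfile"

# Build the app on top of the chosen intermediate tag.
build_app() {
    run docker build --build-arg BASE_TAG="$1" -t my-app -f "$workdir/Dockerfile" .
}

build_app gcsfuse-v1
```

Rolling back is then a matter of calling build_app with an older tag; the Dockerfile itself does not change.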

What About CI/CD Providers?

Yes, CI/CD providers can cache your Docker layers between builds. But it's often a premium feature, and in all cases you'll have to make sure the vendor-specific tooling is set up to support it.

So go check your CI/CD setup! And if it's crucial for you to be able to build your app anywhere, use an intermediate image.

Other Solutions

There are other solutions, like hosting the third-party code in an artifact repository such as AWS S3, Nexus, or Artifactory. If you're already using one of these services, it might be a good solution for you, though it will complicate your Dockerfile.


At an operating system level, you could mirror or proxy the repo using tools like apt-mirror or apt-cacher-ng.
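For the proxy route, apt has to be told where the cache lives. The sketch below writes the one-line apt configuration that does this; the host name apt-cache.internal is a hypothetical placeholder, while 3142 is the default apt-cacher-ng port. It writes to a temporary directory so it can be tried safely; in an image the target would normally be /etc/apt/apt.conf.d.

```shell
#!/bin/sh
# Sketch: point apt at a caching proxy such as apt-cacher-ng.
# apt-cache.internal is a hypothetical host; 3142 is apt-cacher-ng's default port.
set -eu

APT_PROXY_URL="${APT_PROXY_URL:-http://apt-cache.internal:3142}"

# Write an apt configuration snippet that routes HTTP downloads through the proxy.
# $1 is the apt.conf.d directory (normally /etc/apt/apt.conf.d).
write_apt_proxy_conf() {
    printf 'Acquire::http::Proxy "%s";\n' "$APT_PROXY_URL" > "$1/01proxy"
}

conf_dir="$(mktemp -d)"   # stand-in for /etc/apt/apt.conf.d in this demo
write_apt_proxy_conf "$conf_dir"
cat "$conf_dir/01proxy"
```

Note that a caching proxy can only serve packages it has already fetched at least once, so it has to be in place before the vendor repository goes offline.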


These methods require you to come up with a way of updating the third-party code. In many cases, you'll also have to implement versioning yourself if you want the fallback behavior you get out of the box with the intermediate image.