Intro
Poetry is my go-to build tool now, but I found it hard to get a simple, working Docker image for different scenarios.
Most of the time I have some console scripts in my Python package, and these console scripts are also available after running poetry install on my developer machine. My needs were the following:
- Poetry installed for development, incl. Python console scripts (entrypoints)
- An additional user dev to run the Docker images, having the same UID as the developer (assuming 1000)
- Creating a production build using the poetry build command for a slimmer final image
- Proper separation between Poetry-related folders and the project folder ($PWD locally, /app in Docker)
It took me some time to figure out everything, so I'd like to share my learnings here.
Python base image
The base image does not yet contain Poetry, but it does contain all system dependencies that your final application requires. Consider Poetry to be a build tool: the Python container with Python itself is enough to run a Python package built by Poetry. This stage is later reused by the production stage, where only the built package is copied from the builder stage and then installed.
Missing Docker COPY options for Windows users
The COPY --chown option is not supported for Windows containers, and it is used in the examples below. As a workaround, I guess you can solve the problem by using RUN chown -R dev:dev ... inside the Dockerfile after the COPY operations.
# Creating a python base with shared environment variables
ARG PYTHON_VERSION=3.10.7
FROM python:$PYTHON_VERSION-bullseye as python-base
ARG POETRY_VERSION=1.1.13
ARG DEV_UID=1000
# The env vars are inherited by all stages that use FROM python-base as ...
# OSTYPE has to be set since it is normally present on Linux and is used by /venv/bin/activate
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
PIP_NO_CACHE_DIR=off \
PIP_DISABLE_PIP_VERSION_CHECK=on \
PIP_DEFAULT_TIMEOUT=100 \
POETRY_VIRTUALENVS_IN_PROJECT=false \
POETRY_NO_INTERACTION=1 \
POETRY_VERSION=$POETRY_VERSION \
OSTYPE='linux-gnu'
# Set up proper permissions so that the app does not run as root,
# update the system and install the dependencies of your application
RUN mkdir -p /app \
    && groupadd --gid $DEV_UID -r dev \
    && useradd --create-home -d /home/dev --uid $DEV_UID -g dev -r dev \
    && chown dev:dev -R /app \
    && apt-get update && apt-get upgrade -y \
    && apt-get install --no-install-recommends -y \
        bash \
    # Remove the apt package lists to keep the layer small
    && rm -rf /var/lib/apt/lists/*
Explanations of environment variables
- PYTHONUNBUFFERED: Ensures that the output streams are sent straight to the terminal without buffering
- PYTHONDONTWRITEBYTECODE: Prevents Python from writing *.pyc files, the bytecode cache files that are created when a module is first imported. This makes sense if the main Python process does not spawn further processes.
- PIP_NO_CACHE_DIR: Disables the pip cache, as in pip install --no-cache-dir ..., resulting in smaller images
- POETRY_VIRTUALENVS_IN_PROJECT: Tells Poetry not to create a .venv inside /app. On your developer machine, it is then best to set poetry config virtualenvs.in-project to false as well.
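To confirm that a stage really inherited these variables, a small check can be run in a shell inside the container. The following is only a sketch: the function name check_env is made up for this illustration, and the expected values are the ones set by the ENV instruction of the base image.

```shell
# Hypothetical helper (not part of Docker, pip or Poetry):
# compare an environment variable against an expected value.
check_env() {
    name="$1"
    expected="$2"
    # Look up the variable whose name is stored in $name; empty if unset.
    eval "actual=\${$name-}"
    if [ "$actual" = "$expected" ]; then
        echo "ok: $name=$actual"
    else
        echo "mismatch: $name='$actual' (expected '$expected')"
        return 1
    fi
}

# Check some of the values set in the base image.
status=0
check_env PYTHONUNBUFFERED 1 || status=1
check_env PYTHONDONTWRITEBYTECODE 1 || status=1
check_env POETRY_VIRTUALENVS_IN_PROJECT false || status=1
check_env POETRY_NO_INTERACTION 1 || status=1
echo "overall status: $status"
```

Outside the container most of these checks will report a mismatch, which is expected, since the variables only exist in images derived from python-base.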
Development build
You would build this stage for development, additionally mounting $PWD to /app, e.g. to detect file changes when developing.
The development stage does the following:
- Installs poetry, respecting the POETRY_* environment variables during install
- Copies the pyproject.toml and the poetry.lock
- Installs the package dependencies (incl. dev ones) using poetry install
- Copies the entire source code (COPY . .) using user dev and workdir /app
- Optional: runs RUN poetry install under user dev a second time to install the entrypoint scripts
FROM python-base as development
ENV PATH="/home/dev/.local/bin:$PATH"
# COPY entrypoint.sh /entrypoint.sh
# RUN chmod +x '/entrypoint.sh'
# Install Poetry - installer respects $POETRY_VERSION & $POETRY_HOME
# RUN curl -sSL 'https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py' | python
# the official way of installing, but piping some script from the internet is kind of dangerous
# Continue under dev user to install poetry using poetry defaults for installation, cache and virtualenv dirs
USER dev
RUN pip install poetry==$POETRY_VERSION
WORKDIR /app
COPY --chown=dev:dev ./poetry.lock ./pyproject.toml ./
RUN poetry install
# Note: --chown is necessary so that the files are owned by dev and not root
COPY --chown=dev:dev . .
# small hack to get cli installed if you have entries in [tool.poetry.scripts] in pyproject.toml
# RUN poetry install
# ENTRYPOINT /entrypoint.sh $0 $@
CMD ["poetry", "run", "python", "somefile.py"]
Note
- Use poetry run ... in order to run commands inside the virtual environment
- Poetry itself as well as the requirements are installed as user dev
Should I use a .dockerignore to keep sensitive files from being built into the image?
When using a multistage build like this, we might build files into the development container that are not necessary in production.
This is fine as long as we make sure that the final stage doesn't contain (sensitive) developer files (caution with COPY . .).
So you can do without a .dockerignore file as well.
Making entrypoints available
If you have console scripts in your pyproject.toml, your entrypoints (in /opt/.cache/virtualenvs/.../bin) will not be created unless the source code of your package is available. This is why we can get away with a small hack: after COPY . . we run poetry install a second time, and this time the step takes only minimal time because the dependencies are already installed.
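Whether an entrypoint actually ended up on the PATH can be checked from a shell that has the virtual environment active (for example one started via poetry run). A minimal sketch, where the function name require_cmd is hypothetical and mycli stands for whatever script name you declared in pyproject.toml:

```shell
# Hypothetical helper: report whether a command can be resolved via PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# mycli is a placeholder for your own console script name.
require_cmd mycli || echo "hint: run poetry install again after the source code was copied"
```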
Example docker-compose
A docker-compose file may look like this (replace package_name with the name of your package according to pyproject.toml):
version: "3.8"
services:
  app:
    container_name: package_name
    build:
      context: .
      target: development
    image: "package_name:development"
    restart: always
    volumes:
      - $PWD:/app
Production build
Here we actually use two stages in order to get rid of Poetry in the final image. The first stage builds on the development stage, which has Poetry and the source code, and builds the package. The second stage just copies the sdist package and installs it into the container itself.
FROM development as prod-builder
# The CLI cannot be installed during the first poetry install (none of the code is present at that step),
# so the package is built here, where the source code is available
RUN poetry build -f sdist && ls ./dist
FROM python-base as production
WORKDIR /app
COPY --from=prod-builder /app/dist ./dist
RUN pip install $(find dist -type f -name "*.tar.gz") && rm -r dist
USER dev
CMD ["mycli", "--help"]
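The install step above relies on find returning the built sdist. A slightly more defensive variant could make sure that exactly one archive is present before handing it to pip. This is a sketch under assumptions: the function name find_sdist is hypothetical, and the demo uses a temporary directory with an empty placeholder file instead of a real build.

```shell
# Hypothetical helper: print the path of the single *.tar.gz below a
# directory, or fail if there is none or more than one.
find_sdist() {
    dist_dir="$1"
    matches=$(find "$dist_dir" -type f -name '*.tar.gz')
    # Count the non-empty lines of the find output.
    count=$(printf '%s\n' "$matches" | grep -c . || true)
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one sdist in $dist_dir, found $count" >&2
        return 1
    fi
    printf '%s\n' "$matches"
}

# Demo with a placeholder file standing in for the output of poetry build.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/dist"
: > "$demo_dir/dist/package_name-0.1.0.tar.gz"
echo "would install: $(find_sdist "$demo_dir/dist")"
rm -rf "$demo_dir"
```

Inside a Dockerfile the same check would have to be written inline in the RUN instruction or shipped as a small script, since shell functions do not carry over between instructions.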
Complete example
Here is the full Dockerfile consisting of all the stages above:
# Creating a python base with shared environment variables
ARG PYTHON_VERSION=3.10.7
FROM python:$PYTHON_VERSION-bullseye as python-base
ARG POETRY_VERSION=1.1.13
ARG DEV_UID=1000
# The env vars are inherited by all stages that use FROM python-base as ...
# OSTYPE has to be set since it is normally present on Linux and is used by /venv/bin/activate
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
PIP_NO_CACHE_DIR=off \
PIP_DISABLE_PIP_VERSION_CHECK=on \
PIP_DEFAULT_TIMEOUT=100 \
POETRY_VIRTUALENVS_IN_PROJECT=false \
POETRY_NO_INTERACTION=1 \
POETRY_VERSION=$POETRY_VERSION \
OSTYPE='linux-gnu'
# Set up proper permissions so that the app does not run as root,
# update the system and install the dependencies of your application
RUN mkdir -p /app \
    && groupadd --gid $DEV_UID -r dev \
    && useradd --create-home -d /home/dev --uid $DEV_UID -g dev -r dev \
    && chown dev:dev -R /app \
    && apt-get update && apt-get upgrade -y \
    && apt-get install --no-install-recommends -y \
        bash \
    # Remove the apt package lists to keep the layer small
    && rm -rf /var/lib/apt/lists/*
FROM python-base as development
ENV PATH="/home/dev/.local/bin:$PATH"
# COPY entrypoint.sh /entrypoint.sh
# RUN chmod +x '/entrypoint.sh'
# Install Poetry - installer respects $POETRY_VERSION & $POETRY_HOME
# RUN curl -sSL 'https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py' | python
# the official way of installing, but piping some script from the internet is kind of dangerous
# Continue under dev user to install poetry using poetry defaults for installation, cache and virtualenv dirs
USER dev
RUN pip install poetry==$POETRY_VERSION
WORKDIR /app
COPY --chown=dev:dev ./poetry.lock ./pyproject.toml ./
RUN poetry install
# Note: --chown is necessary so that the files are owned by dev and not root
COPY --chown=dev:dev . .
# small hack to get cli installed if you have entries in [tool.poetry.scripts] in pyproject.toml
# RUN poetry install
# ENTRYPOINT /entrypoint.sh $0 $@
CMD ["poetry", "run", "python", "somefile.py"]
FROM development as prod-builder
# The CLI cannot be installed during the first poetry install (none of the code is present at that step),
# so the package is built here, where the source code is available
RUN poetry build -f sdist && ls ./dist
FROM python-base as production
WORKDIR /app
COPY --from=prod-builder /app/dist ./dist
RUN pip install $(find dist -type f -name "*.tar.gz") && rm -r dist
USER dev
CMD ["mycli", "--help"]
Example build script
Here is an example build script that takes advantage of the info in pyproject.toml regarding package name and version for tagging. It shows the usage of the build args used in the Dockerfile above.
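#!/usr/bin/env bash
set -euo pipefail

# Read package name and version from pyproject.toml (grep -P requires GNU grep)
version=$(grep -Po '^version\s=\s"\K\d+\.\d+\.\d+' pyproject.toml)
app_name=$(grep -Po '^name\s=\s"\K[\w_-]+' pyproject.toml)
# Build target, defaults to development
target=${1:-development}
python_version=3.10.7
# Use the UID of the current user for the dev user inside the image
dev_uid=$(id -u)

# Build a stage and tag the image with the target name
function build {
    docker build --tag "$app_name:$target" --target "$target" \
        --build-arg PYTHON_VERSION="$python_version" --build-arg DEV_UID="$dev_uid" .
    echo "$app_name:$target"
}

# Build the production stage and tag the image with the package version
function build_prod {
    docker build --tag "$app_name:$version" --target "$target" \
        --build-arg PYTHON_VERSION="$python_version" --build-arg DEV_UID="$dev_uid" .
    echo "$app_name:$version"
}

# Remove dangling images and stopped containers
function cleanup {
    docker image prune -f && docker container prune -f
}

if [ "$target" = 'production' ]; then
    build_prod
else
    build
fi
cleanup > /dev/null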
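grep -P is a GNU extension and is not available everywhere (the grep shipped with macOS, for example, lacks it). A sed-based lookup could serve as a portable replacement. This is a naive sketch, not a TOML parser: the function name read_toml_value is hypothetical, and it assumes that name and version appear as simple key = "value" lines, with the first match being the one under [tool.poetry].

```shell
# Hypothetical helper: print the first double-quoted value assigned to a key
# that stands at the start of a line.
read_toml_value() {
    key="$1"
    file="$2"
    sed -n "s/^${key}[[:space:]]*=[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$file" | head -n 1
}

# Demo with a made-up pyproject.toml in a temporary directory.
demo_dir=$(mktemp -d)
cat > "$demo_dir/pyproject.toml" <<'EOF'
[tool.poetry]
name = "package_name"
version = "0.12.3"
description = "demo"

[tool.poetry.dependencies]
python = "^3.10"
EOF

app_name=$(read_toml_value name "$demo_dir/pyproject.toml")
version=$(read_toml_value version "$demo_dir/pyproject.toml")
echo "$app_name:$version"   # prints package_name:0.12.3
rm -rf "$demo_dir"
```

Unlike a fixed pattern of single digits, this variant takes whatever is between the quotes, so versions such as 0.12.3 or pre-release suffixes are kept as they are.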
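Usage: pass the build target as the first argument of the build script; without an argument, the development stage is built.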
Created: November 8, 2022