
Intro

Poetry is my go-to build tool now, but I found it hard to get a simple, working Docker image for different scenarios. Most of the time my Python package contains some console scripts, and these are also available after running poetry install on my developer machine. My needs were the following:

  • Poetry installed for development, including Python console scripts (entrypoints)
  • An additional user dev to run the Docker images, with the same UID as the developer (assumed to be 1000)
  • A production build created with the poetry build command for a slimmer final image
  • Proper separation between Poetry-related folders and the project folder ($PWD locally, /app in Docker)

It took me some time to figure everything out, so I'd like to share what I learned here.

Python base image

The base image does not yet contain Poetry, but it does contain all system dependencies that your final application requires. Think of Poetry as a build tool: a Python container on its own is enough to run a package that was built by Poetry. This stage is reused later by the production stage, where only the built package is copied over from the builder stage and then installed.

Lacking docker COPY options for Windows users

The --chown option of COPY, which is used in the examples below, only works when building Linux containers and is not supported for Windows containers. As a workaround, I guess you can use RUN chown -R dev:dev ... inside the Dockerfile after the COPY operations.

examples/dockerfiles/poetry/python_base

# Creating a python base with shared environment variables
ARG PYTHON_VERSION=3.10.7
FROM python:$PYTHON_VERSION-bullseye as python-base
ARG POETRY_VERSION=1.1.13
ARG DEV_UID=1000

# The env vars are inherited by all images that use FROM python-base as ....
# OSTYPE has to be set since it is normally present on Linux and used by /venv/bin/activate
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PIP_NO_CACHE_DIR=off \
    PIP_DISABLE_PIP_VERSION_CHECK=on \
    PIP_DEFAULT_TIMEOUT=100 \
    POETRY_VIRTUALENVS_IN_PROJECT=false \
    POETRY_NO_INTERACTION=1 \
    POETRY_VERSION=$POETRY_VERSION \
    OSTYPE='linux-gnu'

# Set up proper permissions so that the app does not run as root,
# update the system and install the dependencies of your application
RUN mkdir -p /app \
    && groupadd --gid $DEV_UID -r dev \
    && useradd --create-home -d /home/dev --uid $DEV_UID -r -g dev dev \
    && chown dev:dev -R /app \
    && apt-get update && apt-get upgrade -y \
    && apt-get install --no-install-recommends -y \
        bash \
    && rm -rf /var/lib/apt/lists/*

Explanations of environment variables

  • PYTHONUNBUFFERED: Ensures that the stdout and stderr streams are sent straight to the terminal without buffering
  • PYTHONDONTWRITEBYTECODE: Prevents Python from writing *.pyc files, the bytecode files normally created the first time a module is imported. This makes sense if the main Python process does not spawn further processes.
  • PIP_NO_CACHE_DIR: Disables the pip cache, as in pip install --no-cache-dir ..., resulting in smaller images. The value off looks contradictory, but pip treats any valid boolean value of this variable as "cache disabled".
  • POETRY_VIRTUALENVS_IN_PROJECT: Tells Poetry not to create a .venv inside /app. On your developer machine, it is then best to set poetry config virtualenvs.in-project to false as well.
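
Since every stage built FROM python-base should inherit these variables, a quick way to confirm it is to check the environment inside a running container. The following is only a minimal sketch; missing_env_vars is a hypothetical helper name, not part of Docker or Poetry.

```shell
# Hypothetical helper: print the names of the given variables that are NOT
# present in the environment and return non-zero if any are missing.
missing_env_vars() {
    local name missing=""
    for name in "$@"; do
        # printenv exits non-zero when the variable is not in the environment
        if ! printenv "$name" > /dev/null; then
            missing="$missing $name"
        fi
    done
    # Strip the leading space before printing the list
    echo "${missing# }"
    [ -z "$missing" ]
}
```

Inside a container started from the development image, missing_env_vars PYTHONUNBUFFERED POETRY_VERSION OSTYPE should then print an empty line and return 0.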

Development build

You would build this stage for development and additionally mount $PWD to /app, e.g. so that file changes are picked up while developing. The development stage does the following:

  • Installs Poetry, respecting the POETRY_* environment variables during installation
  • Copies pyproject.toml and poetry.lock
  • Installs the package dependencies (incl. dev ones) using poetry install
  • Copies the entire source code with COPY . . using user dev and workdir /app
  • Optional: runs poetry install under user dev a second time to install the entrypoint scripts

examples/dockerfiles/poetry/stage_development

FROM python-base as development

ENV PATH="/home/dev/.local/bin:$PATH"

# COPY entrypoint.sh /entrypoint.sh
# RUN chmod +x '/entrypoint.sh'

# Install Poetry - installer respects $POETRY_VERSION & $POETRY_HOME
# RUN curl -sSL 'https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py' | python
# the official way of installing, but piping some script from the internet is kind of dangerous

# Continue under dev user to install poetry using poetry defaults for installation, cache and virtualenv dirs
USER dev
RUN pip install poetry==$POETRY_VERSION

WORKDIR /app
COPY --chown=dev:dev ./poetry.lock ./pyproject.toml ./
RUN poetry install

# Note: --chown is necessary so that the files are owned by dev and not root
COPY --chown=dev:dev . .

# small hack to get cli installed if you have entries in [tool.poetry.scripts] in pyproject.toml
# RUN poetry install

# ENTRYPOINT /entrypoint.sh $0 $@
CMD ["poetry", "run", "python", "somefile.py"]

Note

  • Use poetry run ... to run a command inside the virtual environment
  • Poetry itself as well as the requirements are installed as user dev

Should I use a .dockerignore to exclude sensitive files from being built into the image?

When using a multistage build like this, we might build files into the development container that are not necessary in production. This is fine as long as we make sure the final stage doesn't contain (sensitive) developer files (🔥 caution with COPY . .). In that case you can do without a .dockerignore file.
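
Because the production stage installs only the sdist, one way to check that nothing sensitive ends up in the final image is to inspect the archive that poetry build produces. A minimal sketch, assuming a hypothetical helper find_sensitive_in_sdist and an illustrative, non-exhaustive set of file name patterns (.env*, *.pem, id_rsa):

```shell
# Hypothetical helper: list archive members whose file name looks sensitive.
# Returns non-zero (grep's exit status) when nothing suspicious is found.
find_sensitive_in_sdist() {
    local archive="$1"
    # The patterns are illustrative assumptions; adapt them to your project.
    tar -tzf "$archive" | grep -E '(^|/)(\.env[^/]*|[^/]*\.pem|id_rsa)$'
}
```

After poetry build -f sdist, you could run it against the *.tar.gz file in ./dist; any printed path deserves a closer look.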

Making entrypoints available

If you have console scripts in your pyproject.toml, your entrypoints (in /opt/.cache/virtualenvs/.../bin) will not be created unless the source code of your package is available. This is why we can get away with a small hack: running poetry install a second time, after COPY . . has copied the source. Since all dependencies are already installed at that point, the second run takes very little time.
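
Whether the second poetry install is needed depends on whether pyproject.toml declares any console scripts. The following sketch lists the names under [tool.poetry.scripts]; list_console_scripts is a hypothetical helper, and the awk logic is a line-based approximation rather than a real TOML parser.

```shell
# Hypothetical helper: print the script names declared in [tool.poetry.scripts].
list_console_scripts() {
    local file="${1:-pyproject.toml}"
    awk '
        # Every table header switches the flag on or off
        /^\[/ { in_scripts = ($0 ~ /^\[tool\.poetry\.scripts\]/); next }
        # Inside the scripts table, print the key left of the "=" sign
        in_scripts && /^[A-Za-z0-9_.-]+[ \t]*=/ { sub(/[ \t]*=.*/, ""); print }
    ' "$file"
}
```

If the function prints nothing, the project has no console scripts and the commented-out second RUN poetry install can stay disabled.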

Example docker-compose

A docker-compose file may look like this (replace package_name with the name of your package according to pyproject.toml):

examples/dockerfiles/poetry/docker-compose.yml

version: "3.8"

services:
  app:
    container_name: package_name
    build:
      context: .
      target: development
    image: "package_name:development"
    restart: always
    volumes:
      - $PWD:/app
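
With $PWD mounted into /app, files written by the container are owned by the UID of user dev, so it helps if that UID equals your own. Below is a small sketch of a check you could run before starting the service; check_dev_uid is a hypothetical helper and 1000 is the default DEV_UID from the Dockerfile.

```shell
# Hypothetical helper: fail when the host UID differs from the UID baked
# into the image (first argument, defaulting to 1000).
check_dev_uid() {
    local image_uid="${1:-1000}" host_uid
    host_uid=$(id -u)
    if [ "$host_uid" -ne "$image_uid" ]; then
        echo "host UID $host_uid differs from image UID $image_uid;" \
             "consider rebuilding with --build-arg DEV_UID=$host_uid" >&2
        return 1
    fi
}
```

A possible use is check_dev_uid && docker-compose up --build, so that a mismatch is noticed before root-owned or foreign-owned files appear in the project folder.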

Production build

Here we actually use two stages in order to get rid of Poetry in the final image. The first stage builds on the development stage, which has Poetry and the source code, and builds the package. The second stage just copies the sdist package and installs it into the container itself.

examples/dockerfiles/poetry/stage_production

FROM development as prod-builder
# The CLI cannot be installed during the first poetry install (none of the code is present at that step),
# so the package is built here from the full source
RUN poetry build -f sdist && ls ./dist

FROM python-base as production
WORKDIR /app
COPY --from=prod-builder /app/dist ./dist
RUN pip install $(find dist -type f -name "*.tar.gz") && rm -r dist
USER dev
CMD ["mycli", "--help"]
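
The pip install $(find ...) line assumes that dist contains exactly one *.tar.gz file. If you want the build to fail loudly otherwise, a guard along these lines could be used; single_sdist is a hypothetical helper, shown here as a plain shell function rather than as part of the Dockerfile.

```shell
# Hypothetical helper: print the path of the only sdist in a directory,
# or fail with a message when there is none or more than one.
single_sdist() {
    local dist_dir="${1:-dist}" count
    # tr removes the padding that some wc implementations add
    count=$(find "$dist_dir" -type f -name '*.tar.gz' | wc -l | tr -d ' ')
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one sdist in $dist_dir, found $count" >&2
        return 1
    fi
    find "$dist_dir" -type f -name '*.tar.gz'
}
```

In a script this could be used as pip install "$(single_sdist dist)".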

Complete example

Here is the full Dockerfile consisting of all the stages above:

# Creating a python base with shared environment variables
ARG PYTHON_VERSION=3.10.7
FROM python:$PYTHON_VERSION-bullseye as python-base
ARG POETRY_VERSION=1.1.13
ARG DEV_UID=1000

# The env vars are inherited by all images that use FROM python-base as ....
# OSTYPE has to be set since it is normally present on Linux and used by /venv/bin/activate
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PIP_NO_CACHE_DIR=off \
    PIP_DISABLE_PIP_VERSION_CHECK=on \
    PIP_DEFAULT_TIMEOUT=100 \
    POETRY_VIRTUALENVS_IN_PROJECT=false \
    POETRY_NO_INTERACTION=1 \
    POETRY_VERSION=$POETRY_VERSION \
    OSTYPE='linux-gnu'

# Set up proper permissions so that the app does not run as root,
# update the system and install the dependencies of your application
RUN mkdir -p /app \
    && groupadd --gid $DEV_UID -r dev \
    && useradd --create-home -d /home/dev --uid $DEV_UID -r -g dev dev \
    && chown dev:dev -R /app \
    && apt-get update && apt-get upgrade -y \
    && apt-get install --no-install-recommends -y \
        bash \
    && rm -rf /var/lib/apt/lists/*

FROM python-base as development

ENV PATH="/home/dev/.local/bin:$PATH"

# COPY entrypoint.sh /entrypoint.sh
# RUN chmod +x '/entrypoint.sh'

# Install Poetry - installer respects $POETRY_VERSION & $POETRY_HOME
# RUN curl -sSL 'https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py' | python
# the official way of installing, but piping some script from the internet is kind of dangerous

# Continue under dev user to install poetry using poetry defaults for installation, cache and virtualenv dirs
USER dev
RUN pip install poetry==$POETRY_VERSION

WORKDIR /app
COPY --chown=dev:dev ./poetry.lock ./pyproject.toml ./
RUN poetry install

# Note: --chown is necessary so that the files are owned by dev and not root
COPY --chown=dev:dev . .

# small hack to get cli installed if you have entries in [tool.poetry.scripts] in pyproject.toml
# RUN poetry install

# ENTRYPOINT /entrypoint.sh $0 $@
CMD ["poetry", "run", "python", "somefile.py"]

FROM development as prod-builder
# The CLI cannot be installed during the first poetry install (none of the code is present at that step),
# so the package is built here from the full source
RUN poetry build -f sdist && ls ./dist

FROM python-base as production
WORKDIR /app
COPY --from=prod-builder /app/dist ./dist
RUN pip install $(find dist -type f -name "*.tar.gz") && rm -r dist
USER dev
CMD ["mycli", "--help"]

Example build script

Here is an example build script that takes advantage of the info in pyproject.toml regarding package name and version for tagging. It shows the usage of the build args used in the Dockerfile above.

examples/scripts/poetry_build.sh

#!/usr/bin/env bash

# Assumes that version = "X.Y.Z" and name = "..." each start a line in pyproject.toml
version=$(grep -Po '^version\s*=\s*"\K\d+\.\d+\.\d+' pyproject.toml)
target=${1:-development}
app_name=$(grep -Po '^name\s*=\s*"\K[\w_-]+' pyproject.toml)
python_version=3.10.7
dev_uid=$(id -u)

function build {
    docker build --tag "$app_name:$target" --target "$target" \
      --build-arg PYTHON_VERSION="$python_version" --build-arg DEV_UID="$dev_uid" .
    echo "$app_name:$target"
}

function build_prod {
    docker build --tag "$app_name:$version" --target "$target" \
      --build-arg PYTHON_VERSION="$python_version" --build-arg DEV_UID="$dev_uid" .
    echo "$app_name:$version"
}

function cleanup {
  docker image prune -f && docker container prune -f
}

if [ "$target" = 'production' ]; then
  build_prod
else
  build
fi

cleanup > /dev/null

Usage (assuming the script is saved as build.sh):

# For development build
./build.sh
# Build another stage, e.g. production
./build.sh production
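
grep -P needs a grep built with PCRE support, which GNU grep usually has but, for example, the BSD grep shipped with macOS does not. If that is a concern, the two lookups in the script could be sketched with sed instead. read_toml_value is a hypothetical helper and, like the grep version, it only matches simple key = "value" lines rather than parsing TOML.

```shell
# Hypothetical helper: print the quoted value of the first line that
# starts with `key = "..."` in the given file (default: pyproject.toml).
read_toml_value() {
    local key="$1" file="${2:-pyproject.toml}"
    sed -n "s/^${key}[[:space:]]*=[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$file" | head -n 1
}
```

In the build script this would read version=$(read_toml_value version) and app_name=$(read_toml_value name).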

Last update: November 8, 2022
Created: November 8, 2022