There was a need to reduce the size of a Python application's Docker image. The build also had to support offline installation from private repositories, both for PyPI packages and for APT updates and package installs.
Let's go through the optimization line by line.
Docker build: buildx and syntax
- First of all, make sure Docker uses docker buildx as its default docker build.
If you have installed Docker Desktop, you don't need to enable BuildKit. If you are running a Docker Engine version earlier than 23.0, you can enable BuildKit either by setting an environment variable, or by making BuildKit the default setting in the daemon configuration.
DOCKER_BUILDKIT=1 docker build --file /path/to/dockerfile -t docker_image_name:tag .
- Add the # syntax=docker/dockerfile:1.4 directive as the first line of the Dockerfile; it is needed for the heredoc in the Dockerfile.
# syntax=docker/dockerfile:1.4 # Required for heredocs [3, 4]
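The Docker Engine version rule above can be checked before a build. This is only a minimal shell sketch, not part of the original setup: the helper name needs_buildkit_env is hypothetical, it compares only the major version against 23, and it assumes docker version --format '{{.Server.Version}}' can reach the daemon.

```shell
# Hypothetical helper: succeeds (exit 0) when the given Docker Engine
# version is older than 23.0, i.e. BuildKit still has to be enabled.
needs_buildkit_env() {
  major=${1%%.*}            # "20.10.24" -> "20"
  [ "$major" -lt 23 ]
}

# Only query the engine when the docker CLI is installed and the daemon answers.
if command -v docker >/dev/null 2>&1; then
  engine_version=$(docker version --format '{{.Server.Version}}' 2>/dev/null || true)
  if [ -n "$engine_version" ] && needs_buildkit_env "$engine_version"; then
    echo "Engine $engine_version: set DOCKER_BUILDKIT=1 before docker build"
  fi
fi
```

On Docker Desktop or Engine 23.0 and later the check prints nothing, which matches the note that BuildKit is already the default there.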
Project Directory Tree
├── main.py
├── requirements.txt
└── src
    ├── log.py
    └── prometheus.py
Multistage Dockerfile
First, the base stage is provisioned so that it can be reused by the build and runtime stages that follow.
base stage
- Base image
ARG JFROG=jfrog.example.com
FROM ${JFROG}/docker/python:3.13-slim AS base
- Change the default SHELL
  - A safer custom shell with the pipefail and errexit options. It is very useful for the heredoc in the private Debian repository setup section.
SHELL ["/bin/bash", "-c", "-o", "pipefail", "-o", "errexit"]
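To see what the two options buy you, the sketch below runs the same script with plain bash -c and with the argument list from the SHELL instruction, relying on Docker appending the RUN command string as the last argument. The helper names run_plain and run_strict and the sample script are hypothetical, and bash is assumed to be installed on the machine where you try it.

```shell
# Run a script the way a plain shell form would: bash -c "<script>".
run_plain() { bash -c "$1"; }

# Run a script with the argv produced by the custom SHELL instruction:
# bash -c -o pipefail -o errexit "<script>".
run_strict() { bash -c -o pipefail -o errexit "$1"; }

script='false | true; echo "reached the next line"'

# Plain bash: the pipeline status is that of `true`, so execution continues.
run_plain "$script"

# Strict bash: pipefail makes the pipeline fail and errexit stops the script.
if run_strict "$script"; then
  echo "strict shell: script succeeded"
else
  echo "strict shell: script aborted at the failing pipeline"
fi
```

In a multi-line heredoc this matters because a failing command in the middle would otherwise be ignored and the RUN step would still be reported as successful.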
- Environments
ARG JFROG=jfrog.example.com
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=on \
    PIP_INDEX_URL=https://${JFROG}/artifactory/api/pypi/python/simple/
- PYTHONUNBUFFERED: when set, Python does not buffer stdout and stderr, so logs show up immediately.
- PYTHONDONTWRITEBYTECODE: when set, Python does not write .pyc bytecode files.
- PIP_DISABLE_PIP_VERSION_CHECK: controls whether pip checks for a newer version of itself during the requirements installation (on/off).
- PIP_INDEX_URL: sets the index URL that pip downloads and installs from, globally. If your private PyPI repository has a different structure, change the value of PIP_INDEX_URL.
Private Debian Repository (Offline Installation)
A heredoc is used in the Dockerfile to replace the base image's apt sources, so that packages are updated and installed from the private Debian repository. The heredoc needs the Dockerfile syntax directive mentioned before.
If your private Debian repository has a different structure, change the URIs.
DEB822 format (apt .sources files)
# Using DEB822 format (.sources files) - for newer systems
RUN <<EOF
CODENAME=$(grep VERSION_CODENAME /etc/os-release | cut -d'=' -f2)
DISTRO=$(grep '^ID=' /etc/os-release | cut -d'=' -f2)
cat > /etc/apt/sources.list.d/debian.sources <<SOURCE_FILE_CONTENT
Types: deb
URIs: https://${JFROG}/artifactory/debian/debian/
Suites: ${CODENAME} ${CODENAME}-updates
Components: main
Trusted: true

Types: deb
URIs: https://${JFROG}/artifactory/debian/debian-security/
Suites: ${CODENAME}-security
Components: main
Trusted: true
SOURCE_FILE_CONTENT
EOF
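The logic inside the heredoc can be dry-run outside Docker before it goes into an image. The following is a rough sketch under a few assumptions: the helper name render_debian_sources and the sample os-release values are made up for illustration, the unused DISTRO lookup is left out, and the output goes to stdout instead of /etc/apt/sources.list.d/debian.sources.

```shell
# Hypothetical helper: print the DEB822 sources for a given os-release file.
#   $1 = path to an os-release file, $2 = private repository host
render_debian_sources() {
  codename=$(grep '^VERSION_CODENAME=' "$1" | cut -d'=' -f2)
  host=$2
  cat <<SOURCE_FILE_CONTENT
Types: deb
URIs: https://${host}/artifactory/debian/debian/
Suites: ${codename} ${codename}-updates
Components: main
Trusted: true

Types: deb
URIs: https://${host}/artifactory/debian/debian-security/
Suites: ${codename}-security
Components: main
Trusted: true
SOURCE_FILE_CONTENT
}

# Dry run against a sample os-release file (illustrative values only).
sample_dir=$(mktemp -d)
printf 'ID=debian\nVERSION_CODENAME=bookworm\n' > "$sample_dir/os-release"
render_debian_sources "$sample_dir/os-release" "jfrog.example.com"
rm -rf "$sample_dir"
```

The blank line between the two stanzas is what separates the main and security entries in the DEB822 format, so it is worth checking that it survives in the generated file.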
- Install the shared, common packages used by all stages.
  - Recommended packages are skipped during installation to reduce the image size.
  - After installation, the apt package lists are removed to keep the image small.
RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates \
    curl \
    gnupg \
    lsb-release \
    && rm -rf /var/lib/apt/lists/*
build stage
- Use the prepared base image as the build image
FROM base AS build
- Build-specific packages are not needed in every stage, so they are installed only in the build stage.
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential && \
    rm -rf /var/lib/apt/lists/*
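- Install requirements
  - Change directory to app.
  - Create the virtualenv; in the runtime stage, the virtualenv will be copied into the image.
  - Use the cache mount for a faster build.
  - For the sake of image size, install requirements with the pip cache disabled via the --no-cache-dir flag. Note that --no-cache-dir makes pip bypass its cache, so the cache mount only speeds up builds if the flag is dropped; the mount's contents are never written to an image layer either way.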
WORKDIR /app
RUN python -m venv .venv
ENV PATH="/app/.venv/bin:$PATH"
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
    pip --timeout 100 install --no-cache-dir -r requirements.txt
runtime stage
- Use the prepared base image as the runtime image
FROM base AS runtime
WORKDIR /app
- Security best practices
  - Create a group and user to leverage the Kubernetes runAsUser, runAsGroup and fsGroup securityContext settings.
RUN addgroup --gid 1001 --system nonroot && \
    adduser --no-create-home --shell /bin/false \
    --disabled-password --uid 1001 --system --group nonroot
USER nonroot:nonroot
- VirtualEnv
  - Add /app/.venv/bin to PATH.
  - Copy the virtualenv from the build stage.
ENV VIRTUAL_ENV=/app/.venv \
    PATH="/app/.venv/bin:$PATH"
COPY --from=build --chown=nonroot:nonroot /app/.venv /app/.venv
- Copy the src directory and main.py.
COPY --chown=nonroot:nonroot src /app/src
COPY --chown=nonroot:nonroot main.py .
- CMD to run a container from the image.
CMD ["python", "/app/main.py"]
Before and After the Optimization
Before the Optimization
The Dockerfile was:
FROM jfrog.example.com/docker/python:latest
WORKDIR /app
ADD src/ .
RUN pip config set global.index-url https://jfrog.example.com/artifactory/api/pypi/python/simple/ && \
    pip --timeout 100 install -r requirements.txt
CMD ["python","-u","main.py"]
After the build, the image size was 1.02GB.
Final Dockerfile After Optimization
After all the optimizations and the multistage Dockerfile, the image size was reduced to 242MB.
# syntax=docker/dockerfile:1.4
ARG JFROG=jfrog.example.com
FROM ${JFROG}/docker/python:3.13-slim AS base

SHELL ["/bin/bash", "-c", "-o", "pipefail", "-o", "errexit"]

ARG JFROG=jfrog.example.com
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=on \
    PIP_INDEX_URL=https://${JFROG}/artifactory/api/pypi/python/simple/

# Using DEB822 format (.sources files) - for newer systems
RUN <<EOF
CODENAME=$(grep VERSION_CODENAME /etc/os-release | cut -d'=' -f2)
DISTRO=$(grep '^ID=' /etc/os-release | cut -d'=' -f2)
cat > /etc/apt/sources.list.d/debian.sources <<SOURCE_FILE_CONTENT
Types: deb
URIs: https://${JFROG}/artifactory/debian/debian/
Suites: ${CODENAME} ${CODENAME}-updates
Components: main
Trusted: true

Types: deb
URIs: https://${JFROG}/artifactory/debian/debian-security/
Suites: ${CODENAME}-security
Components: main
Trusted: true
SOURCE_FILE_CONTENT
EOF

RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates \
    curl \
    gnupg \
    lsb-release \
    && rm -rf /var/lib/apt/lists/*

FROM base AS build

RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app
RUN python -m venv .venv
ENV PATH="/app/.venv/bin:$PATH"
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
    pip --timeout 100 install --no-cache-dir -r requirements.txt

FROM base AS runtime

WORKDIR /app

RUN addgroup --gid 1001 --system nonroot && \
    adduser --no-create-home --shell /bin/false \
    --disabled-password --uid 1001 --system --group nonroot
USER nonroot:nonroot

ENV VIRTUAL_ENV=/app/.venv \
    PATH="/app/.venv/bin:$PATH"
COPY --from=build --chown=nonroot:nonroot /app/.venv /app/.venv
COPY --chown=nonroot:nonroot src /app/src
COPY --chown=nonroot:nonroot main.py .

CMD ["python", "/app/main.py"]
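To put the two sizes quoted above in perspective, here is a small shell sketch of the arithmetic. The helper name size_reduction_percent is hypothetical, and 1.02GB is treated as 1020MB, which assumes decimal units.

```shell
# Hypothetical helper: percentage saved when going from $1 MB to $2 MB.
size_reduction_percent() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# 1.02GB before (taken as 1020MB) versus 242MB after.
size_reduction_percent 1020 242   # prints 76.3
```

So under that assumption the optimized image is roughly three quarters smaller than the original one.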
Top comments (2)
This is a great optimization! One potential limitation is relying on private repositories for both APT and PyPI—how would you handle builds if the private repos were temporarily unavailable, or for developers who may not have access? Any suggestions for fallbacks or more robust offline support?
I will make sure the private repository is HA.
You're right: if a developer has no access, the image won't build.
The entire pip index URL could be an ARG, in case there is no private PyPI repository.
The APT sources could also be copied from a local file into the image, and that could be controlled by an ARG as well:
if the ARG has been set, the private repo is used; otherwise the base image's apt sources are used.
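The fallback described in this reply could look roughly like the shell below, which is the kind of logic that would sit inside a RUN step. It is only a sketch under assumptions: the helper names pip_index_url_for and configure_apt_sources are hypothetical, an empty host stands for "ARG not set", only the main stanza is written to keep it short, and https://pypi.org/simple/ is used as the public fallback index.

```shell
# Hypothetical helper: pick the pip index URL for the build.
#   $1 = private repository host; empty means no private repository
pip_index_url_for() {
  if [ -n "$1" ]; then
    echo "https://$1/artifactory/api/pypi/python/simple/"
  else
    echo "https://pypi.org/simple/"   # public PyPI index
  fi
}

# Hypothetical helper: write private apt sources only when a host is given.
#   $1 = private repository host, $2 = sources file to write, $3 = codename
configure_apt_sources() {
  if [ -z "$1" ]; then
    return 0   # keep whatever sources the base image already ships
  fi
  cat > "$2" <<SOURCE_FILE_CONTENT
Types: deb
URIs: https://$1/artifactory/debian/debian/
Suites: $3 $3-updates
Components: main
Trusted: true
SOURCE_FILE_CONTENT
}

# Dry run in a temporary directory instead of /etc/apt/sources.list.d.
demo_dir=$(mktemp -d)
configure_apt_sources "jfrog.example.com" "$demo_dir/debian.sources" "bookworm"
cat "$demo_dir/debian.sources"
pip_index_url_for ""
rm -rf "$demo_dir"
```

With an empty host nothing is written and the public index is returned, so developers without access to the private repositories would still get a working build, at the cost of no longer being offline.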