Layered Docker images and missing target architectures
I tried for some time to create a toolchain for LUMI that uses software compiled for the Zen3 architecture, but as the runners we are using don't have Zen3 hardware, I was not able to achieve this with our Spack setup.
Job #1368386 failed for ed4c4a44.
I have fallen back to cross-compiling within our CI for now.
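(For reference, cross-compiling for a newer microarchitecture from a generic x86_64 runner works as long as the compiler knows the target; a minimal, untested sketch:)
# Verify that Spack knows the zen3 microarchitecture
spack arch --known-targets | grep zen3
# Cross-build on a non-Zen3 x86_64 host; Spack translates target=zen3
# into the matching compiler flags (e.g. -march=znver3 for GCC)
spack install gcc@12.4.0 target=zen3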
Another issue, though, is that the .gitlab-ci.yml becomes overly complicated and convoluted when setting up different base images for different bootstraps like ROCm, CUDA, etc., as well as specialized target architectures like Zen3, various ARM variants, etc.
- My first idea was a layered approach.
Dockerfile.Zen3-layer, built on top of the ROCm base image:
FROM i10git.cs.fau.de:5005/ci/images/spack-23-rocm:latest
# Prefer Zen3 as the default target for all Spack packages
RUN spack config add "packages:all:target:[zen3]"
# Build a Zen3-targeted toolchain with the system compiler
RUN spack compiler find && \
    spack install gcc@12.4.0 target=zen3 && \
    spack install llvm@17.0.6 target=zen3
# Register the freshly built GCC with Spack
RUN spack load gcc@12.4.0 && \
    spack compiler find
# Make the newly built GCC the default compiler
RUN spack config add "packages:all:compiler:[gcc@12.4.0]"
RUN spack clean -a
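Each layer file would then be built and pushed individually, along the lines of (image tag hypothetical):
docker build -f Dockerfile.Zen3-layer \
  -t i10git.cs.fau.de:5005/ci/images/spack-23-rocm-zen3 .
docker push i10git.cs.fau.de:5005/ci/images/spack-23-rocm-zen3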
But this would result in a bunch of Dockerfiles, one per combination of base image and target architecture.
- Another idea was to use Docker build arguments (ARG) to get rid of the redundancy across the bootstrap images.
Dockerfile.generic-spack:
ARG BASE_IMAGE=nvidia/cuda:12.8.1-devel-ubuntu24.04
FROM ${BASE_IMAGE}
# Rest of the code
.gitlab-ci.yml:
spack-23-cuda-dockerimage:
  script:
    - |
      docker build --pull -f Dockerfile.generic-spack \
        --build-arg BASE_IMAGE=nvidia/cuda:12.8.1-devel-ubuntu24.04 \
        -t i10git.cs.fau.de:5005/ci/images/spack-23-cuda \
        --cache-from i10git.cs.fau.de:5005/ci/images/spack-23-cuda .
    - docker push i10git.cs.fau.de:5005/ci/images/spack-23-cuda

spack-23-rocm-dockerimage:
  script:
    - |
      docker build --pull -f Dockerfile.generic-spack \
        --build-arg BASE_IMAGE=rocm/dev-ubuntu-24.04:6.4.3-complete \
        -t i10git.cs.fau.de:5005/ci/images/spack-23-rocm \
        --cache-from i10git.cs.fau.de:5005/ci/images/spack-23-rocm .
    - docker push i10git.cs.fau.de:5005/ci/images/spack-23-rocm
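The duplication between such jobs could be factored out with a hidden template job and extends; a sketch (job and variable names are made up):
.build-spack-image:
  script:
    - |
      docker build --pull -f Dockerfile.generic-spack \
        --build-arg BASE_IMAGE=${BASE_IMAGE} \
        -t ${IMAGE} \
        --cache-from ${IMAGE} .
    - docker push ${IMAGE}

spack-23-cuda-dockerimage:
  extends: .build-spack-image
  variables:
    BASE_IMAGE: nvidia/cuda:12.8.1-devel-ubuntu24.04
    IMAGE: i10git.cs.fau.de:5005/ci/images/spack-23-cuda

spack-23-rocm-dockerimage:
  extends: .build-spack-image
  variables:
    BASE_IMAGE: rocm/dev-ubuntu-24.04:6.4.3-complete
    IMAGE: i10git.cs.fau.de:5005/ci/images/spack-23-rocm
Even so, one job per image remains, so the file still grows with every base image and target architecture.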
- The last solution I found is a multi-stage Dockerfile.
Dockerfile.unified:
# ===================================================================
# STAGE 1: A reusable "template" for the basic Spack setup.
# ===================================================================
ARG BASE_IMAGE=nvidia/cuda:12.8.1-devel-ubuntu24.04
FROM ${BASE_IMAGE} AS spack-base-template
# ... contents of the generic Spack base Dockerfile ...
ENTRYPOINT ["spack-env"]
CMD ["interactive-shell"]
# ===================================================================
# FINAL IMAGES: Define the final images you want to be able to build.
# ===================================================================
# Final target for a generic Spack image (uses default BASE_IMAGE)
FROM spack-base-template AS spack-generic
# Final target for the ROCm base image with Spack
FROM spack-base-template AS rocm
# Final target for the CUDA base image with Spack
FROM spack-base-template AS cuda
# Final target for the ROCm image WITH Zen3 compilers.
# It starts FROM the 'rocm' stage, which already has the right base.
FROM rocm AS rocm-zen3
RUN spack config add "packages:all:target:[zen3]"
RUN spack compiler find && \
spack install gcc@12.4.0 target=zen3 && \
spack install llvm@17.0.6 target=zen3
RUN spack load gcc@12.4.0 && \
spack compiler find
RUN spack config add "packages:all:compiler:[gcc@12.4.0]"
RUN spack clean -a
# Final target for the CUDA12 image WITH Zen3 compilers.
# It starts FROM the 'cuda' stage, which already has the right base.
FROM cuda AS cuda-zen3
# ... exactly the same commands as in the rocm-zen3 stage, or factoring them
# into a shared stage and using COPY, which is not ideal either.
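The unified file would then be driven via --target, roughly like this (tag name hypothetical):
docker build --pull -f Dockerfile.unified \
  --build-arg BASE_IMAGE=rocm/dev-ubuntu-24.04:6.4.3-complete \
  --target rocm-zen3 \
  -t i10git.cs.fau.de:5005/ci/images/spack-23-rocm-zen3 .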
All of these solutions would result in overly complicated .gitlab-ci.yml files that are not nice to maintain and would likely require code generation.