Dockerfiles
Two configurations are possible depending on how OpenIFS might be used in a container:
- The user works interactively inside the container. The external experiment directory is mounted as a sub-directory inside the container environment. Depending on the configuration, the user either has access to the entire OpenIFS installation inside the container or is prevented from accessing the source code.
- The user works only from the experiment directory. Instead of executing the model binary, the OpenIFS run script starts a container environment in which the experiment runs in isolation. Immediately after the experiment has completed, the container is removed. The user has no access to any part of the model installation.
The Dockerfile describes the build process of the container image. Examples are provided in OpenIFS from version 43r3v1 onwards. The naming convention for Dockerfiles is:
Dockerfile.oifs<RELEASE>.<NOTE>.<ARCH>.<TYPE>
RELEASE: The OpenIFS release, e.g. 43r3v1.
NOTE: Describes features of this build; in our examples this is either 'user' or 'root'.
ARCH: The architecture for which this image is built, e.g. x86_64.
TYPE: Type of build. Here 'bld' is used, but it could be changed to 'dev' or 'test', for example.
Example: Dockerfile.oifs43r3v1.user.x86_64.bld will generate an image of the OpenIFS 43r3v1 release.
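The naming convention can be checked mechanically. The following is a minimal sketch in POSIX shell; parse_dockerfile_name is a hypothetical helper and not part of the OpenIFS distribution. It assumes that RELEASE and NOTE contain no dots, and that TYPE is the text after the last dot.

```shell
# Split a Dockerfile name of the form
#   Dockerfile.oifs<RELEASE>.<NOTE>.<ARCH>.<TYPE>
# into its four fields (hypothetical helper, for illustration only).
parse_dockerfile_name() {
    name=${1##*/}                         # drop any leading directory
    case $name in
        Dockerfile.oifs*.*.*.*) ;;        # looks like the convention
        *) echo "unexpected Dockerfile name: $name" >&2; return 1 ;;
    esac
    rest=${name#Dockerfile.oifs}          # e.g. 43r3v1.user.x86_64.bld
    release=${rest%%.*}; rest=${rest#*.}  # text before the first dot
    note=${rest%%.*};    rest=${rest#*.}  # text before the next dot
    arch=${rest%.*}                       # everything up to the last dot
    build_type=${rest##*.}                # text after the last dot
    echo "RELEASE=$release NOTE=$note ARCH=$arch TYPE=$build_type"
}

parse_dockerfile_name Dockerfile.oifs43r3v1.user.x86_64.bld
# prints: RELEASE=43r3v1 NOTE=user ARCH=x86_64 TYPE=bld
```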
Example Build Process
This section describes how to generate a container image from a Dockerfile.
1. Start by navigating to the docker directory in the OpenIFS distribution:
% cd tools/docker
2. Make a copy of one of the Dockerfile templates that can be found in this directory. There are two versions:
% ls Dockerfile*
Dockerfile.oifs43r3v1.root.x86_64.bld  Dockerfile.oifs43r3v1.user.x86_64.bld
% cp Dockerfile.oifs43r3v1.user.x86_64.bld Dockerfile
The note 'root' indicates that OpenIFS is installed into the system directory /usr/local, whereas the 'user' version installs into the user's account. Both Dockerfiles create a user called 'oifs'.
Which you choose depends on your application. The 'root' version might be more suited to a training workshop, for example. We'll use the 'user' version in this example.
3. Put a copy of the OpenIFS distribution tarfile as downloaded into the same directory as the Dockerfile.
Make sure the version number of the tarfile matches that specified in the Dockerfile, because the build process will unpack this file inside the container.
You should have these files in your build directory (your version numbers may be different):
-rw-r--r-- 1 glenn staff     4255  6 May 17:35 Dockerfile
-rw-r--r-- 1 glenn staff 30611901  6 May 18:13 oifs43r3v1.tar.gz
4. Build the OpenIFS Docker image.
The following command builds the image oifs43r3v1.user. Change 'user' to 'root' if building the other variant.
% docker build -t oifs43r3v1.user . # note the trailing '.' to build in the current dir
On the workstations at ECMWF, four network proxy variables need to be passed to the build so that the internet can be accessed from within the container:
docker build -t oifs43r3v1.user --build-arg http_proxy="$http_proxy" --build-arg ftp_proxy="$ftp_proxy" --build-arg https_proxy="$https_proxy" --build-arg no_proxy="$no_proxy" .
This runs the build process for the image, which contains the minimum software required to run OpenIFS.
The image is based on an Ubuntu Linux LTS version. After downloading the base Ubuntu image, the Dockerfile executes the following steps: the necessary developer tools are installed (e.g. GNU compiler, MPI and maths libraries); the ecCodes library is downloaded and compiled; the OpenIFS sources are unpacked from the tar archive; and the model binaries are compiled. The last step sets file permissions and moves the model executable to an install location.
At the end of the build process, successful image creation is reported as:
Successfully tagged oifs43r3v1.user:latest
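Steps 3 and 4 can be wrapped in a small helper that checks the build directory before starting the build. This is only a sketch under stated assumptions: the function names check_build_dir and build_oifs_image are hypothetical, the tarfile is assumed to be named oifs<RELEASE>.tar.gz as in the listing above, and the Dockerfile is assumed to contain the release string literally somewhere in its text. Proxy options are added only for those variables that are set, so the same call should work with or without the ECMWF proxy settings.

```shell
# check_build_dir RELEASE
# Succeeds if ./Dockerfile and ./oifs<RELEASE>.tar.gz exist and the
# Dockerfile mentions RELEASE (assumption: the release string appears
# literally in the Dockerfile).
check_build_dir() {
    release=$1
    tarfile="oifs$release.tar.gz"
    [ -f Dockerfile ] || { echo "Dockerfile not found" >&2; return 1; }
    [ -f "$tarfile" ] || { echo "$tarfile not found" >&2; return 1; }
    grep -qF -- "$release" Dockerfile || {
        echo "Dockerfile does not mention release $release" >&2
        return 1
    }
}

# build_oifs_image RELEASE NOTE
# Builds the image oifs<RELEASE>.<NOTE> from the current directory.
build_oifs_image() {
    release=$1
    note=$2
    check_build_dir "$release" || return 1
    set --                                # reuse "$@" as the option list
    for var in http_proxy https_proxy ftp_proxy no_proxy; do
        eval "value=\${$var:-}"           # value of the variable named in $var
        if [ -n "$value" ]; then
            set -- "$@" --build-arg "$var=$value"
        fi
    done
    docker build -t "oifs$release.$note" "$@" .
}
```

Called from tools/docker as "build_oifs_image 43r3v1 user", this would run the same docker build command as shown in step 4.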
Running the docker image
We can verify that the image is available with docker images and start it in a container with the docker run command:
% docker images
REPOSITORY        TAG      IMAGE ID       CREATED          SIZE
oifs43r3v1.user   latest   982f6e82bb93   39 minutes ago   873MB
ubuntu            latest   72300a873c2c   13 days ago      64.2MB
% docker run -it oifs43r3v1.user
oifs@40a923f11202:~$
Our command line prompt has changed as we are now the user 'oifs' inside the container.
A directory listing shows the following structure (your version numbers may be different):
oifs@40a923f11202:~$ ls
oifs43r3v1
oifs@40a923f11202:~$ ls oifs43r3v1
CHANGES  COPYING  NOTICE  READMEs  examples  make  oifs-config.sh  t21test  tools
CITE  LICENSE  README  bin  fcm  oifs-config.editme.sh  src  t21test_xios
oifs@40a923f11202:~$
The compiled model executables can be found in oifs43r3v1/make/gnu-opt/oifs/bin and can be moved to another install location:
oifs@40a923f11202:~$ ls oifs43r3v1/make/gnu-opt/oifs/bin
getres.exe  grib_set_vtable.exe  master.exe  spinterp.exe  timeint.exe  vod2uv.exe
gptosp.exe  intsst.exe  rgrid.exe  sptogp.exe  uvtovod.exe
The ecCodes library is found in its default destination under /usr/local/lib.
If using the 'root' Dockerfile, the install location will be in /usr/local and not the home directory of the 'oifs' user.
In order to run the acceptance test as the root user, the file t21test/job needs editing:
EXPID=epc8
MASTER=/usr/local/bin/master.exe
In order to run the executable with the command mpirun as root, the following option needs to be added:
$OIFS_RUNCMD --allow-run-as-root $MASTER -e $EXPID
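The two edits to t21test/job described above could be scripted with sed. This is a sketch only: patch_job_for_root is a hypothetical helper, and it assumes that the job file contains a line starting with MASTER= and a run line of the form $OIFS_RUNCMD $MASTER followed by further options. Check your copy of the file before relying on it.

```shell
# patch_job_for_root JOBFILE
# Points MASTER at /usr/local/bin/master.exe and inserts
# --allow-run-as-root after $OIFS_RUNCMD. A backup is kept as JOBFILE.orig.
# Assumption: the run line reads "$OIFS_RUNCMD $MASTER ..." in the original.
patch_job_for_root() {
    job=$1
    cp -p "$job" "$job.orig" || return 1
    sed -e 's|^MASTER=.*|MASTER=/usr/local/bin/master.exe|' \
        -e 's|\$OIFS_RUNCMD \$MASTER|$OIFS_RUNCMD --allow-run-as-root $MASTER|' \
        "$job.orig" > "$job"
}
```

Usage inside the container would be "patch_job_for_root t21test/job". On a file that has already been patched, the sed expressions change nothing further, but the backup in job.orig is overwritten with the patched version.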
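With the command 'exit' the container stops. Files created or changed inside it are not part of the image, so a new container started with docker run will not contain them, and they are lost for good once the stopped container is removed. The next section will show how results can be retained and OpenIFS experiments can be run using a container.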
Running OpenIFS experiments in a container
Due to the temporary nature of containers all model results that are created in an experiment need to be stored outside the container. One possible method is to mount an external experiment directory inside the container. Data written to the mounted directory will be retained once the container is removed.
Assume an experiment directory at /scratch/user/exp/.
Sub-directories are allowed; however, symbolic links to other file system locations will not work. The symbolic links created by oifs_run at its first run will need to be created manually as sub-directories instead.
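One way to turn such links into real sub-directories is to replace each link with a copy of its target. The sketch below is an assumption about how this could be done, not the procedure used by the OpenIFS team. materialise_links is a hypothetical helper, it only looks at non-hidden entries directly under the experiment directory, and copying may need considerable disk space if the links point at large input data sets.

```shell
# materialise_links EXPDIR
# Replace each symbolic link directly under EXPDIR with a copy of its target.
materialise_links() {
    expdir=$1
    for link in "$expdir"/*; do
        [ -L "$link" ] || continue        # skip anything that is not a symlink
        if [ ! -e "$link" ]; then         # dangling link: nothing to copy
            echo "skipping dangling link: $link" >&2
            continue
        fi
        # -L follows the link, so the target's content is copied
        cp -RL "$link" "$link.tmp.$$" || return 1
        rm "$link"                        # removes the link, not the target
        mv "$link.tmp.$$" "$link"
    done
}
```

For the example directory, this would be called on the host as "materialise_links /scratch/user/exp" before starting the container.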
This experiment directory is mounted to the container when it is invoked:
docker run -v /scratch/user/exp:/home/oifs/exp:rw -it oifs43r3v1.user
Inside the container, the experiment directory is mounted with read and write permissions as the sub-directory exp of the oifs home directory (/home/oifs/exp).
In order to mount the external experiment directory successfully, all its files and sub-directories need to have full read-write-execute access:
chmod -R 777 /scratch/user/exp
All files in the mounted directory that are newly created or modified inside the container are owned by the container user; seen from outside the container, their ownership will therefore differ from that of the host user.
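The docker run call with the mounted experiment directory can be wrapped in a small function. This is a sketch; run_in_exp is a hypothetical name and the default image name is taken from the build example. The function converts the directory to an absolute path first, because docker run -v interprets a bare relative name such as exp as a named volume rather than as a host directory.

```shell
# run_in_exp EXPDIR [IMAGE]
# Start an interactive container with EXPDIR mounted at /home/oifs/exp.
run_in_exp() {
    # resolve EXPDIR to an absolute path and fail early if it does not exist
    expdir=$(cd "$1" 2>/dev/null && pwd) || {
        echo "no such directory: $1" >&2
        return 1
    }
    image=${2:-oifs43r3v1.user}
    docker run -v "$expdir:/home/oifs/exp:rw" -it "$image"
}
```

For example, "run_in_exp /scratch/user/exp" runs the same command as shown above.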
Invoking the container from the OpenIFS run script
An alternative method of using OpenIFS in a container is to include the docker call inside the oifs_run script, where it replaces the execution of the model binary with mpirun.
This method is only suitable for running the model interactively (i.e. no batch job submission). The modification in the script is as follows:
- set:
export OIFS_EXE=/home/oifs/oifs43r3v1/make/gnu-opt/oifs/bin/master.exe # or wherever your master.exe is located
- comment out the code block that checks for the OIFS executable:
###if [ -d "$OIFS_EXE" ]; then ..... fi
- do not copy the executable:
##\cp -f "$OIFS_EXE" . || true
- replace the call of the RUNCMD:
Remove this line:
$RUNCMD ./$(basename "$OIFS_EXE") || {
and replace it with this line:
docker run -v /scratch/user/exp/:/home/oifs/exp:rw <oifs_image> bash -c "cd exp && ulimit -s unlimited && $OIFS_EXE" || {
<oifs_image> is the name of the OpenIFS docker image. You may need to adjust the directories used above depending on your docker image.
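To keep the host paths and image name in one place, the replacement line can also be written as a function. This is a sketch: OIFS_IMAGE, OIFS_EXPDIR and run_model_in_container are hypothetical names that do not exist in oifs_run. The --rm option is an addition that is not part of the line above; it tells Docker to delete the container once the command has finished, which matches the behaviour described for this configuration.

```shell
# Hypothetical settings; adjust them to your installation.
OIFS_IMAGE=${OIFS_IMAGE:-oifs43r3v1.user}       # name of the OpenIFS image
OIFS_EXPDIR=${OIFS_EXPDIR:-/scratch/user/exp}   # experiment directory on the host
OIFS_EXE=${OIFS_EXE:-/home/oifs/oifs43r3v1/make/gnu-opt/oifs/bin/master.exe}

# Run the model binary inside a throw-away container.
run_model_in_container() {
    # --rm: remove the container as soon as the model run has finished
    docker run --rm -v "$OIFS_EXPDIR:/home/oifs/exp:rw" "$OIFS_IMAGE" \
        bash -c "cd exp && ulimit -s unlimited && $OIFS_EXE"
}
```

In oifs_run, the call "run_model_in_container || {" would then take the place of the docker run line, leaving the existing error-handling block unchanged.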
When using this method, the Docker container environment remains "concealed" from the model user, who does not need to interact with it.
Batch Job Submission
The use of Docker containers when running OpenIFS on HPC facilities has been tested successfully and with good scalability on the Piz Daint Cray XC50 at the Swiss National Supercomputing Centre in December 2019 using local computing support. At present we do not yet offer this capability at ECMWF.
Crib Sheet: Important Docker commands
This section contains a list of frequently used Docker commands.
Start the Docker daemon on your machine (if not already running):
sudo systemctl start docker
sudo systemctl restart docker
sudo systemctl status docker
which is actually: sudo /usr/bin/systemctl status docker
ECMWF users may need to contact servicedesk to request permission to run Docker.
Which images are on my machine:
docker images
docker rmi oifs # remove image oifs, might need -f option
docker rmi $(docker images -qa) # removes all images, might need -f option
docker save -o oifs_image.tar oifs # saves image oifs to a tar file
docker load -i oifs_image.tar # loads saved docker image into the local image store
Which containers are running:
docker ps
docker ps -a # show all containers
docker rm 6skd897asd # removes container beginning with 6sk...
docker rm $(docker ps -qa) # removes all containers, might need -f option
Build docker image:
docker build -t <image name> . # uses file called Dockerfile
docker build -t <image name> -f <docker file> . # uses the named Dockerfile; the trailing '.' is the build directory
At ECMWF, use the proxy arguments:
docker build -t oifs --build-arg http_proxy="$http_proxy" --build-arg ftp_proxy="$ftp_proxy" --build-arg https_proxy="$https_proxy" --build-arg no_proxy="$no_proxy" .
Run docker images in container:
docker run -it ubuntu # run interactively with tty output
docker run -it oifs # run image oifs interactively
docker run -v /scratch/user:/scratch:rw -it oifs # mount host directory /scratch/user as /scratch inside container
docker run -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=unix$DISPLAY metview metview # allows Metview to open X Window from inside the container
2 Comments
Unknown User (nagc)
To add files into a running container, we can do this:
This will tar up the files in the directory 'oifs43r3v1' on the host computer, and untar them on the running docker container into directory '/usr/local/oifs'
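The command itself is missing from the comment above. The following sketch is consistent with the description but is an assumption, not the commenter's original command. copy_into_container is a hypothetical helper, and the container argument is the id or name shown by docker ps.

```shell
# copy_into_container SRC_DIR CONTAINER DEST_DIR
# Pack SRC_DIR on the host and unpack it below DEST_DIR in a running container.
copy_into_container() {
    src=$1
    container=$2
    dest=$3
    # tar writes the archive to stdout; "docker exec -i" keeps stdin open so
    # that the tar process inside the container can read and unpack it.
    tar -cf - "$src" | docker exec -i "$container" tar -xf - -C "$dest"
}
```

For the case described, the call would be "copy_into_container oifs43r3v1 <container> /usr/local/oifs". The destination directory must already exist in the container. The docker cp command is an alternative for copying files into a container.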
Unknown User (nagc)
To debug a failed docker build:
Look for the last container id before the error message:
The 'id' we need is '5f112d3aeefb'.
Commit the container and then run it starting a shell. The container will be at the point where the last command succeeded. You can then proceed to execute the commands by hand to find out which failed and put it right in the Dockerfile.
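The commit-and-run step might look like the sketch below. The commands are not preserved in the comment above, so this is an assumption. debug_failed_build and the image name oifs-debug are hypothetical. The approach relies on the classic builder output, which prints container ids; recent Docker versions that build with BuildKit may not show them.

```shell
# debug_failed_build CONTAINER_ID [IMAGE_NAME]
# Snapshot a container left behind by a failed build and open a shell in it.
debug_failed_build() {
    id=$1
    image=${2:-oifs-debug}             # hypothetical name for the snapshot image
    docker commit "$id" "$image" &&    # save the container's state as an image
    docker run -it "$image" bash       # start a shell in that state
}
```

With the id from the example, the call is "debug_failed_build 5f112d3aeefb". The snapshot image can be removed afterwards with docker rmi.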