...
The following exercises will help you understand how to build your own software on the ATOS HPCF or ECS.
Table of Contents
...
Before we start...
Ensure your environment is clean by running:
No Format module reset
Create a directory for this tutorial and cd into it:
No Format mkdir -p compiling_tutorial cd compiling_tutorial
Building simple programs
With your favourite editor, create three hello world programs, one in C, one in C++ and one in Fortran:
Code Block language cpp title hello.c collapse true #include <stdio.h> int main(int argc, char** argv) { printf("Hello World from a C program\n"); return 0; }
Code Block language cpp title hello++.cc collapse true #include <iostream> int main() { std::cout << "Hello World from a C++ program!" << std::endl; }
Code Block language cpp title hellof.f90 collapse true program hello print *, "Hello World from a Fortran Program!" end program hello
Compile and run each one of them with the GNU compilers (
gcc
,g++
,gfortran
)Expand title Solution We can build them with:
No Format gcc -o hello hello.c g++ -o hello++ hello++.cc gfortran -o hellof hellof.f90
All going well, we should now have the executables, which we can run:
No Format $ for exe in hello hello++ hellof; do ./$exe; done Hello World from a C program Hello World from a C++ program! Hello World from a Fortran Program!
Now, use the generic environment variables for the different compilers (
$CC
,$CXX
,$FC
) and rebuild and rerun; you should see no difference from the above results.Expand title Solution We can rebuild them with:
No Format $CC -o hello hello.c $CXX -o hello++ hello++.cc $FC -o hellof hellof.f90
We can now run them exactly in the same way:
No Format $ for exe in hello hello++ hellof; do ./$exe; done Hello World from a C program Hello World from a C++ program! Hello World from a Fortran Program!
Tip title Always use the environment variables for compilers We always recommend using the environment variables since it will make it easier for you to switch to a different compiler.
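For example, you can quickly check which compilers these variables currently point to (a minimal check; the exact compiler and version shown will depend on the environment you have loaded):
No Format echo $CC $CXX $FC $CC --version | head -n 1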
Managing Dependencies
We are now going to use a simple program that will display versions of different libraries linked to it. Create a file called
versions.c
using your favourite editor with the following contents:Code Block language cpp title versions.c collapse true #include <stdio.h> #include <hdf5.h> #include <netcdf.h> #include <eccodes.h> int main() { #if defined(__INTEL_LLVM_COMPILER) printf("Compiler: Intel LLVM %d\n", __INTEL_LLVM_COMPILER); #elif defined(__INTEL_COMPILER) printf("Compiler: Intel Classic %d\n", __INTEL_COMPILER); #elif defined(__clang_version__) printf("Compiler: Clang %s\n", __clang_version__); #elif defined(__GNUC__) printf("Compiler: GCC %d.%d.%d\n", __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__); #else printf("Compiler information not available\n"); #endif // HDF5 version unsigned majnum, minnum, relnum; H5get_libversion(&majnum, &minnum, &relnum); printf("HDF5 version: %u.%u.%u\n", majnum, minnum, relnum); // NetCDF version printf("NetCDF version: %s\n", nc_inq_libvers()); // ECCODES version printf("ECCODES version: "); codes_print_api_version(stdout); printf("\n"); return 0; }
Try to naively compile this program with:
No Format $CC -o versions versions.c
The compilation above fails, as the compiler does not know where to find the different libraries. We need to add some additional flags so the compiler can find both the include headers and link to the actual libraries.
Let's use the existing software installed on the system with modules, and benefit from the corresponding environment variables *_DIR which are defined in them to manually construct the include and library flags:
No Format $CC -o versions versions.c -I$HDF5_DIR/include -I$NETCDF4_DIR/include -I$ECCODES_DIR/include -L$HDF5_DIR/lib -lhdf5 -L$NETCDF4_DIR/lib -lnetcdf -L$ECCODES_DIR/lib -leccodes
Load the appropriate modules so that the line above completes successfully and generates the
versions
executable:Expand title Solution You will need to load the following modules to have those variables defined:
No Format module load hdf5 netcdf4 ecmwf-toolbox $CC -o versions versions.c -I$HDF5_DIR/include -I$NETCDF4_DIR/include -I$ECCODES_DIR/include -L$HDF5_DIR/lib -lhdf5 -L$NETCDF4_DIR/lib -lnetcdf -L$ECCODES_DIR/lib -leccodes
The
versions
executable should now be in your current directory:No Format ls versions
Run
./versions
. You will get an error such as the one below:No Format $ ./versions ./versions: error while loading shared libraries: libhdf5.so.200: cannot open shared object file: No such file or directory
While you passed the location of the libraries at compile time, the program cannot find the libraries at runtime. Inspect the executable with
ldd
to see what libraries are missingExpand title Solution ldd is a utility that prints the shared libraries required by each program or shared library specified on the command line:
No Format $ ldd versions linux-vdso.so.1 (0x00007ffffada9000) libhdf5.so.200 => not found libnetcdf.so.19 => not found libeccodes.so => not found libc.so.6 => /lib64/libc.so.6 (0x000014932ff36000) /lib64/ld-linux-x86-64.so.2 (0x00001493302fb000)
Can you make that program run successfully?
Expand title Solution While you passed the location of the libraries at compile time, the program cannot find the libraries at runtime. There are two solutions:
Use the environment variable LD_LIBRARY_PATH - not recommended for the long term
Use the environment variable LD_LIBRARY_PATH. Check that ldd with the environment variable defined reports all libraries found:
No Format LD_LIBRARY_PATH=$HDF5_DIR/lib:$NETCDF4_DIR/lib:$ECCODES_DIR/lib ldd ./versions
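If you want to run the program this way (rather than just inspect it with ldd), you can export the variable first. This is a sketch of the workaround described above, not the recommended long-term approach:
No Format export LD_LIBRARY_PATH=$HDF5_DIR/lib:$NETCDF4_DIR/lib:$ECCODES_DIR/lib ./versions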
Rebuild with
rpath
- robust solutionUse the
rpath
strategy to engrave the library paths into the actual executable at link time, so it always knows where to find them at runtime. Rebuild your program with:No Format $CC -o versions versions.c -I$HDF5_DIR/include -I$NETCDF4_DIR/include -I$ECCODES_DIR/include -L$HDF5_DIR/lib -Wl,-rpath,$HDF5_DIR/lib -lhdf5 -L$NETCDF4_DIR/lib -Wl,-rpath,$NETCDF4_DIR/lib -lnetcdf -L$ECCODES_DIR/lib -Wl,-rpath,$ECCODES_DIR/lib -leccodes
Check with ldd that all libraries are still found, even with LD_LIBRARY_PATH unset:
No Format unset LD_LIBRARY_PATH ldd ./versions
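You can also inspect the paths that were engraved into the binary itself. One way to do this (assuming the binutils readelf tool is available) is to look at the RPATH/RUNPATH entries in the dynamic section:
No Format readelf -d versions | grep -iE "rpath|runpath"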
Final version
For convenience, all those software modules define the
*_INCLUDE
and*_LIB
variables:No Format module show hdf5 netcdf4 ecmwf-toolbox | grep -e _INCLUDE -e _LIB
You can use those in your compilation directly, with the following simplified compilation line:
No Format $CC -o versions versions.c $HDF5_INCLUDE $NETCDF4_INCLUDE $ECCODES_INCLUDE $HDF5_LIB $NETCDF4_LIB $ECCODES_LIB
Now you can run your program without any additional settings:
No Format $ ./versions Compiler: GCC 8.5.0 HDF5 version: <HDF5 version> NetCDF version: <NetCDF version> of <date> ECCODES version: <ecCodes version>
Can you rebuild the program so it uses the "old" versions of all those libraries in modules? Ensure the output of the program matches the versions loaded in modules. Do the same with the latest versions.
Expand title Solution You need to load the desired versions of the modules:
No Format module load hdf5/old netcdf4/old ecmwf-toolbox/old
And then rebuild and run the program:
No Format $CC -o versions versions.c $HDF5_INCLUDE $NETCDF4_INCLUDE $ECCODES_INCLUDE $HDF5_LIB $NETCDF4_LIB $ECCODES_LIB ./versions
The output should match the versions loaded by the modules:
No Format echo $HDF5_VERSION $NETCDF4_VERSION $ECCODES_VERSION
Repeat the operation with
No Format module load --latest hdf5 netcdf4 ecmwf-toolbox
To simplify the build process, let's create a simple Makefile for this program. With your favourite editor, create a file called
Makefile
in the same directory with the following contents:Note title Watch the indentation Make sure that the indentations at the beginning of the lines are tabs and not spaces!
Code Block language bash title Makefile collapse true # # Makefile # # Make sure all the relevant modules are loaded before running make EXEC = hello hello++ hellof versions # TODO: Add the necessary variables into CFLAGS and LDFLAGS definition CFLAGS = LDFLAGS = all: $(EXEC) %: %.c $(CC) -o $@ $^ $(CFLAGS) $(LDFLAGS) %: %.cc $(CXX) -o $@ $^ %: %.f90 $(F90) -o $@ $^ test: $(EXEC) @for exe in $(EXEC); do ./$$exe; done ldd: versions @ldd versions | grep -e netcdf.so -e eccodes.so -e hdf5.so clean: rm -f $(EXEC)
You can test it works by running:
No Format make clean test ldd
Expand title Solution Edit the Makefile and add the
*_INCLUDE
and*_LIB
variables which are defined by the modules:Code Block language bash title Makefile collapse true # # Makefile # # Make sure all the relevant modules are loaded before running make EXEC = hello hello++ hellof versions CFLAGS = $(HDF5_INCLUDE) $(NETCDF4_INCLUDE) $(ECCODES_INCLUDE) LDFLAGS = $(HDF5_LIB) $(NETCDF4_LIB) $(ECCODES_LIB) all: $(EXEC) %: %.c $(CC) -o $@ $^ $(CFLAGS) $(LDFLAGS) %: %.cc $(CXX) -o $@ $^ %: %.f90 $(F90) -o $@ $^ test: $(EXEC) @for exe in $(EXEC); do ./$$exe; done ldd: versions @ldd versions | grep -e netcdf.so -e eccodes.so -e hdf5.so clean: rm -f $(EXEC)
Then run it with:
No Format make clean test ldd
Using different toolchains: prgenv
So far we have used the default compiler toolchain to build this program. Because of the installation paths of the libraries, it is easy to see both the version of each library used and the compiler flavour with ldd:
No Format |
---|
$ make ldd
libhdf5.so.200 => /usr/local/apps/hdf5/<HDF5 version>/GNU/8.5/lib/libhdf5.so.200 (0x000014f612b7d000)
libnetcdf.so.19 => /usr/local/apps/netcdf4/<NetCDF version>/GNU/8.5/lib/libnetcdf.so.19 (0x000014f611f2a000)
libeccodes.so => /usr/local/apps/ecmwf-toolbox/<ecCodes version>/GNU/8.5/lib/libeccodes.so (0x000014f611836000) |
Rebuild the program with:
- The default GNU GCC compiler.
- The default Classic Intel compiler.
- The default LLVM-based Intel compiler.
- The default AMD AOCC.
Use the following command to test and show what versions of the libraries are being used at any point:
No Format make clean test ldd
Expand title Solution You can perform this test with the following one-liner, exploiting the
prgenv
module:No Format for pe in gnu intel intel-llvm amd; do module load prgenv/$pe; make clean test ldd; echo "******************"; done
Pay attention to the following aspects:
- The Lmod module command informs you that it has reloaded the corresponding modules when changing the prgenv. This ensures the libraries used in your program are built with the same compiler for maximum compatibility.
- The compiler command changes automatically, since we are using the environment variable $CC in the Makefile.
- The include and library flags in the compilation lines are adapted automatically based on the libraries loaded.
- The final binary is linked with the corresponding libraries for the version of the compiler, as shown by the ldd output.
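To see the first two points for yourself, you can print which compiler $CC resolves to under each prgenv. This is just a quick check, using the same prgenv names as in the one-liner above:
No Format for pe in gnu intel intel-llvm amd; do module load prgenv/$pe; $CC --version | head -n 1; done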
Rebuild the program with the "new" GNU GCC compiler. Use the same command as above to test and show what versions of the libraries are being used at any point.
Expand title Solution This time we need to be on the GNU prgenv, but also select the "new" gcc compiler instead of just the default.
Remember you can look at the versions available in modules and their corresponding labels with:
No Format module avail gcc
This sequence of commands should do the trick:
No Format module load prgenv/gnu gcc/new make clean test ldd
Rebuild the program with the Classic Intel compiler once again, but this time reset your module environment once the executable has been produced and before running it. What happens when you run it?
Expand title Solution Let's look at the following sequence of commands:
No Format module load prgenv/intel make clean versions module reset make test
The result will be something similar to:
No Format ./versions: error while loading shared libraries: libifport.so.5: cannot open shared object file: No such file or directory
Inspecting the executable with ldd throws some missing libraries at runtime:
No Format $ ldd versions | grep "not found" libcilkrts.so.5 => not found libifcoremt.so.5 => not found libifport.so.5 => not found libimf.so => not found libintlc.so.5 => not found libirng.so => not found libsvml.so => not found ...
That shows how, for Intel-built programs, you must have the intel environment set up at both compile and run times.
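A minimal way to recover, following the same logic, is to load the Intel prgenv again before running the executable, for example:
No Format module load prgenv/intel make test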
Bringing MPI into the mix
Beyond the different compiler flavours on offer, we can also choose different MPI implementations for our MPI parallel programs. On the Atos HPCF and ECS, we can choose from the following implementations:
Implementation | Module | Description |
---|---|---|
OpenMPI | openmpi | Standard OpenMPI implementation provided by Atos |
Intel MPI | intel-mpi | Intel's MPI implementation, based on MPICH. Part of the Intel OneAPI distribution |
HPC-X OpenMPI | hpcx-openmpi | NVIDIA's optimised flavour of OpenMPI. This is the recommended option |
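You can check which versions of these modules are available on the system with module avail, using the module names from the table above:
No Format module avail openmpi intel-mpi hpcx-openmpi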
For the next exercise, we will use this adapted hello world code for MPI.
Code Block language cpp title mpiversions.c collapse true
#include <stdio.h>
#include <string.h>
#include <mpi.h>
int main(int argc, char **argv) {
int rank, size;
char compiler[100];
char mpi_version[MPI_MAX_LIBRARY_VERSION_STRING];
int len;
#if defined(__INTEL_LLVM_COMPILER)
sprintf(compiler, "Intel LLVM %d", __INTEL_LLVM_COMPILER);
#elif defined(__INTEL_COMPILER)
sprintf(compiler, "Intel Classic %d", __INTEL_COMPILER);
#elif defined(__clang_version__)
sprintf(compiler, "Clang %s", __clang_version__);
#elif defined(__GNUC__)
sprintf(compiler, "GCC %d.%d.%d", __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__);
#else
sprintf(compiler,"information not available");
#endif
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Get_library_version(mpi_version, &len);
mpi_version[len] = '\0';
printf("Hello from MPI rank %d of %d. Compiler: %s MPI Flavour: %s\n", rank, size, compiler, mpi_version);
MPI_Finalize();
return 0;
} |
Reset your environment with:
No Format module reset
With your favourite editor, create the file
mpiversions.c
with the code above, and compile it into the executablempiversions
. Hint: You may use the modulehpcx-openmpi
.Expand title Solution In order to compile MPI parallel programs, we need to use the MPI wrappers such as
mpicc
for C,mpicxx
/mpic++
for C++ ormpif77
/mpif90
/mpifort
for Fortran code. They will in turn call the corresponding compiler loaded in your environment, and will make sure all the necessary flags to compile and link against the MPI library are set. These wrappers are made available when you load one of the MPI modules in the system. We will use the default
hpcx-openmpi
.Since we are building a C program, we will use
mpicc
:No Format module load hpcx-openmpi mpicc -o mpiversions mpiversions.c
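If you are curious about what the wrapper does under the hood, OpenMPI-based wrappers such as the one provided by hpcx-openmpi can print the underlying compilation command. This is an optional check, not required for the exercise:
No Format mpicc --showme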
Write a small batch job that will compile and run the program using 2 processors and submit it to the batch system:
Expand title Solution You may write a job script similar to the following
Code Block language bash title mpiversions.sh #!/bin/bash #SBATCH -J mpiversions #SBATCH -o %x.out #SBATCH -n 2 module load hpcx-openmpi mpicc -o mpiversions mpiversions.c srun ./mpiversions
You may then submit it to the batch system with
sbatch
:No Format sbatch mpiversions.sh
Inspect the output to check what versions of Compiler and MPI are reported
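Since the job writes its output to %x.out, it will land in a file named after the job, mpiversions.out in this case. For example:
No Format cat mpiversions.out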
Tweak the previous job to build and run the
mpiversions
program with as many combinations of compiler families and MPI implementations as you can.Expand title Solution You may amend your existing
mpiversions.sh
script with a couple of loops on the prgenvs and MPI implementations:Code Block language bash title mpiversions.sh #!/bin/bash #SBATCH -J mpiversions #SBATCH -o %x.out #SBATCH -n 2 for pe in gnu intel intel-llvm amd; do for mpi in hpcx-openmpi intel-mpi openmpi; do module load prgenv/$pe $mpi module list mpicc -o mpiversions mpiversions.c srun ./mpiversions echo "******************" done done
You may then submit it to the batch system with
sbatch
:No Format sbatch mpiversions.sh
Inspect again the output to check what versions of Compiler and MPI are reported.
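One quick way to scan all the combinations in the output is to filter on the lines printed by the program, for example:
No Format grep "MPI rank" mpiversions.out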
Real-world example: CDO
To put into practice what we have learned so far, let's try to build and install CDO. You would typically not need to build this particular application, since it is already available as part of the standard software stack via modules or easily installable with conda. However, it is a good illustration of how to build a real-world piece of software with dependencies on other software packages and libraries.
The goal of this exercise is for you to be able to build CDO and install it under one of your storage spaces (HOME or PERM), and then successfully run:
No Format |
---|
<PREFIX>/bin/cdo -V |
You will need to:
- Familiarise yourself with the installation instructions of this package in the official documentation.
- Decide your installation path and your build path.
- Download the source code from the CDO website.
- Set up your build environment (i.e. modules, environment variables) for a successful build
- Build and install the software
- Test that it works with the command above.
Make sure that CDO is built at least with support to:
- NetCDF
- HDF5
- SZLIB (hint: use AEC)
- ecCodes
- PROJ
- CMOR
- UDUNITS
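Once your build is installed, a quick way to confirm these features made it in is to look at the Features and Libraries lines printed by cdo -V, similar to the solution output shown further down:
No Format <PREFIX>/bin/cdo -V 2>&1 | grep -iE "features|libraries"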
Tip
It is strongly recommended you bundle all your build process in a job script that you can submit in batch. That way you can request additional cpus and speed up your compilation by exploiting build parallelism with make -j.
If you would like a starting point for such a job, you can start from the following example, adding and amending the necessary bits as needed:
Code Block language bash title build_cdo.sh
#!/bin/bash
#SBATCH -J build_cdo
#SBATCH -o %x.out
#SBATCH -n 8
set -x
set -e
set -u
set -o pipefail
# Get the URL and VERSION for the latest CDO available
URL=$(curl -s https://code.mpimet.mpg.de/projects/cdo/files | grep attachments/download | sed -e "s:.*href=\"\(.*\)\".*:https\://code.mpimet.mpg.de\1:" | head -n 1)
VERSION=$(echo $URL | sed -e "s:.*/cdo-\(.*\).tar.gz:\1:")
# TODO: Define installation prefix and build directory
# Hint: Use somewhere in your PERM for installation.
PREFIX=
BUILDDIR=
# Move to our BUILD DIRECTORY
mkdir -p $BUILDDIR
cd $BUILDDIR
# Download source
[ -f cdo-$VERSION.tar.gz ] || wget $URL
[ -d cdo-$VERSION ] || tar xvf cdo-$VERSION.tar.gz
cd cdo-$VERSION
# TODO: Define the environment for the build
# TODO: Configure the build
# Make sure we start a clean build
make clean
# Build
make -j $SLURM_NTASKS
# Install
make install
# Check installed binary
$PREFIX/bin/cdo -V |
Expand title Solution
We will take a step-by-step approach, taking the build_cdo.sh
template above as the starting point.
The first thing we need to do is to decide where to install your personal CDO. A good choice would be using your PERM space, and we may use the same structure as the production installation. Because in the script we already have a variable called $VERSION containing the CDO version to install, we can also use that. This way we could have multiple versions of the same package installed alongside each other, should we require it in the future:
Code Block | ||
---|---|---|
| ||
# Define installation prefix and build directory
PREFIX=$PERM/apps/cdo/$VERSION |
Next comes the decision on what to use for the build itself. Since it is a relatively small build, for performance you might use $TMPDIR
which is local to the node.
Code Block | ||
---|---|---|
| ||
BUILDDIR=$TMPDIR |
However, if you are going to submit it as a batch job, the directory will be wiped at the end. While you are putting the build script together, it may be more practical to have the build directory somewhere that is not deleted after a failed build so you can inspect output files and troubleshoot. As an example, we could pick the following directory in PERM:
Code Block | ||
---|---|---|
| ||
BUILDDIR=$PERM/apps/cdo/build |
Let's look at the environment for the build. All of the dependencies listed above are already available, so we may leverage that by loading all the corresponding modules:
Code Block | ||
---|---|---|
| ||
# Define the environment for the build
module load hdf5 netcdf4 aec ecmwf-toolbox proj cmor udunits |
At this point, we need to refer to the installation instructions of this package in the official documentation. We can see it is a classic autotools package, which is typically built with the configure - make - make install
sequence. We should then look at the configure --help
to see how to enable all the extra options to configure the build:
No Format |
---|
# Configure the build
./configure --help && exit |
Since we are just getting some short help, we can just run the script locally to get the configure output.
No Format |
---|
bash ./build_cdo.sh |
We should inspect the output of the configure help command, and identify what options are to be used:
No Format |
---|
--with-szlib=<yes|no|directory> (default=no)
--with-hdf5=<yes|no|directory> (default=no)
--with-netcdf=<yes|no|directory> (default=no)
--with-udunits2=<directory>
--with-cmor=<directory> Specify location of CMOR library.
--with-eccodes=<yes|no|directory> (default=no)
--with-proj=<directory> Specify location of PROJ library for cartographic
projections. |
Since all those dependencies are not installed on system paths, we will need to specify the installation directory for each one of them. We may then use the *_DIR
environment variables defined by the corresponding modules we load just before.
We will also define where to install the package with the --prefix option. Let's amend the configure line with:
No Format |
---|
# Configure the build
./configure --prefix=$PREFIX --with-hdf5=$HDF5_DIR --with-netcdf=$NETCDF4_DIR --with-szlib=$AEC_DIR --with-eccodes=$ECCODES_DIR --with-proj=$proj_DIR --with-cmor=$CMOR_DIR --with-udunits2=$UDUNITS_DIR |
Note that for PROJ, since the variable $PROJ_DIR
has a special meaning in the package itself, we must use the lowercase version $proj_DIR.
We are now ready to attempt our first build. Submit the build script to the batch system with:
No Format |
---|
sbatch build_cdo.sh |
While it builds, we may keep an eye on the progress with:
No Format |
---|
tail -f build_cdo.out |
At this point CDO build and installation should complete successfully, but the execution of the newly installed CDO at the end fails with:
No Format |
---|
$ grep -v ECMWF-INFO build_cdo.out | tail
make[1]: Leaving directory '/etc/ecmwf/nfs/dh2_perm_a/user/apps/cdo/build/cdo-2.3.0/test/pytest'
make[1]: Entering directory '/etc/ecmwf/nfs/dh2_perm_a/user/apps/cdo/build/cdo-2.3.0'
make[2]: Entering directory '/etc/ecmwf/nfs/dh2_perm_a/user/apps/cdo/build/cdo-2.3.0'
make[2]: Nothing to be done for 'install-exec-am'.
make[2]: Nothing to be done for 'install-data-am'.
make[2]: Leaving directory '/etc/ecmwf/nfs/dh2_perm_a/user/apps/cdo/build/cdo-2.3.0'
make[1]: Leaving directory '/etc/ecmwf/nfs/dh2_perm_a/user/apps/cdo/build/cdo-2.3.0'
+ /perm/user/apps/cdo/2.3.0/bin/cdo -V
/perm/user/apps/cdo/2.3.0/bin/cdo: error while loading shared libraries: libproj.so.25: cannot open shared object file: No such file or directory |
If we inspect the resulting binary with ldd, we will notice there are a few libraries that cannot be found at runtime:
No Format |
---|
$ ldd $PERM/apps/cdo/2.3.0/bin/cdo
linux-vdso.so.1 (0x00007ffc3fbb6000)
libproj.so.25 => not found
libeccodes.so => not found
libcmor.so => /usr/local/apps/cmor/3.7.1/lib/libcmor.so (0x00001484b970d000)
libudunits2.so.0 => /usr/local/apps/udunits/2.2.28/lib/libudunits2.so.0 (0x00001484b94ed000)
libexpat.so.1 => /lib64/libexpat.so.1 (0x00001484b92b1000)
libnetcdf.so.19 => /usr/local/apps/netcdf4/4.9.1/GNU/8.5/lib/libnetcdf.so.19 (0x00001484b8e86000)
libbz2.so.1 => /lib64/libbz2.so.1 (0x00001484b8c75000)
libzstd.so.1 => /lib64/libzstd.so.1 (0x00001484b89d1000)
libxml2.so.2 => /lib64/libxml2.so.2 (0x00001484b8669000)
libcurl.so.4 => /lib64/libcurl.so.4 (0x00001484b83db000)
libhdf5_hl.so.200 => /usr/local/apps/hdf5/1.12.2/GNU/8.5/lib/libhdf5_hl.so.200 (0x00001484b81ba000)
libhdf5.so.200 => /usr/local/apps/hdf5/1.12.2/GNU/8.5/lib/libhdf5.so.200 (0x00001484b7bca000)
libz.so.1 => /lib64/libz.so.1 (0x00001484b79b2000)
libdl.so.2 => /lib64/libdl.so.2 (0x00001484b77ae000)
libsz.so.2 => /lib64/libsz.so.2 (0x00001484b75ab000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00001484b738b000)
libuuid.so.1 => /usr/local/apps/cmor/3.7.1/lib/libuuid.so.1 (0x00001484b7187000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00001484b6df2000)
libm.so.6 => /lib64/libm.so.6 (0x00001484b6a70000)
libgomp.so.1 => /lib64/libgomp.so.1 (0x00001484b6838000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00001484b6620000)
libc.so.6 => /lib64/libc.so.6 (0x00001484b625b000)
/lib64/ld-linux-x86-64.so.2 (0x000014850b1a3000)
libjson-c.so.4 => /usr/local/apps/cmor/3.7.1/lib/libjson-c.so.4 (0x00001484b604c000)
libgfortran.so.5 => /lib64/libgfortran.so.5 (0x00001484b5bcd000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00001484b59a6000)
libnghttp2.so.14 => /lib64/libnghttp2.so.14 (0x00001484b577f000)
libidn2.so.0 => /lib64/libidn2.so.0 (0x00001484b5561000)
libssh.so.4 => /lib64/libssh.so.4 (0x00001484b52f2000)
libpsl.so.5 => /lib64/libpsl.so.5 (0x00001484b50e1000)
libssl.so.1.1 => /lib64/libssl.so.1.1 (0x00001484b4e4d000)
libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00001484b4964000)
libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x00001484b470f000)
libkrb5.so.3 => /lib64/libkrb5.so.3 (0x00001484b4425000)
libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x00001484b420e000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00001484b400a000)
libldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x00001484b3dbb000)
liblber-2.4.so.2 => /lib64/liblber-2.4.so.2 (0x00001484b3bab000)
libbrotlidec.so.1 => /lib64/libbrotlidec.so.1 (0x00001484b399e000)
libaec.so.0 => /lib64/libaec.so.0 (0x00001484b3796000)
libquadmath.so.0 => /lib64/libquadmath.so.0 (0x00001484b3555000)
libunistring.so.2 => /lib64/libunistring.so.2 (0x00001484b31d4000)
librt.so.1 => /lib64/librt.so.1 (0x00001484b2fcc000)
libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x00001484b2dbb000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00001484b2bb7000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x00001484b29a0000)
libsasl2.so.3 => /lib64/libsasl2.so.3 (0x00001484b2782000)
libbrotlicommon.so.1 => /lib64/libbrotlicommon.so.1 (0x00001484b2561000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00001484b2337000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00001484b210e000)
libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00001484b1e8a000) |
We are missing the PROJ and ecCodes libraries. We will need to explicitly set RPATHs when we build CDO to make sure the libraries are found. In Autotools packages, and as shown in the configure help we ran earlier, we may pass any extra link flags through the LDFLAGS
environment variable. We can amend our build script, setting that variable just after loading the dependency modules:
No Format |
---|
# Define the environment for the build module load hdf5 netcdf4 aec ecmwf-toolbox proj cmor udunits # We will need to explicitly set the rpath for proj and eccodes, since the CDO build system will not do it for us export LDFLAGS="-Wl,-rpath,$proj_DIR/lib -Wl,-rpath,$ECCODES_DIR/lib" |
If we submit the build job again and wait for it to complete, we should see something like:
No Format |
---|
$ grep -v ECMWF-INFO build_cdo.out | tail -n 25 make[2]: Nothing to be done for 'install-exec-am'. make[2]: Nothing to be done for 'install-data-am'. make[2]: Leaving directory '/etc/ecmwf/nfs/dh2_perm_a/user/apps/cdo/build/cdo-2.3.0' make[1]: Leaving directory '/etc/ecmwf/nfs/dh2_perm_a/user/apps/cdo/build/cdo-2.3.0' + /perm/user/apps/cdo/2.3.0/bin/cdo -V Climate Data Operators version 2.3.0 (https://mpimet.mpg.de/cdo) System: x86_64-pc-linux-gnu CXX Compiler: g++ -std=gnu++17 -g -O2 -fopenmp -pthread CXX version : g++ (GCC) 8.5.0 20210514 (Red Hat 8.5.0-15) C Compiler: gcc -g -O2 -fopenmp -pthread -pthread C version : gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-15) F77 Compiler: gfortran -g -O2 F77 version : GNU Fortran (GCC) 8.5.0 20210514 (Red Hat 8.5.0-15) Features: 7/503GB 16/256threads c++17 OpenMP45 Fortran pthreads HDF5 NC4/HDF5/threadsafe OPeNDAP sz udunits2 proj cmor sse2 Libraries: yac/3.0.1 NetCDF/4.9.1 HDF5/1.12.2 proj/9.1.1 cmor/3.7.1 CDI data types: SizeType=size_t CDI file types: srv ext ieg grb1 grb2 nc1 nc2 nc4 nc4c nc5 nczarr CDI library version : 2.3.0 cgribex library version : 2.1.1 ecCodes library version : 2.30.2 NetCDF library version : 4.9.1 of Feb 9 2023 13:54:09 $ exse library version : 1.5.0 FILE library version : 1.9.1 |
For reference, this is the complete and functional job script to build CDO:
Code Block language bash title build_cdo.sh
#!/bin/bash #SBATCH -J build_cdo #SBATCH -o %x.out #SBATCH -n 8 set -x set -e set -u set -o pipefail # Get the URL and VERSION for the latest CDO available URL=$(curl -s https://code.mpimet.mpg.de/projects/cdo/files | grep attachments/download | sed -e "s:.*href=\"\(.*\)\".*:https\://code.mpimet.mpg.de\1:" | head -n 1) VERSION=$(echo $URL | sed -e "s:.*/cdo-\(.*\).tar.gz:\1:") # Define installation prefix and build directory PREFIX=$PERM/apps/cdo/$VERSION #BUILDDIR=$TMPDIR BUILDDIR=$PERM/apps/cdo/build # Move to our BUILD DIRECTORY mkdir -p $BUILDDIR cd $BUILDDIR # Download source [ -f cdo-$VERSION.tar.gz ] || wget $URL [ -d cdo-$VERSION ] || tar xvf cdo-$VERSION.tar.gz cd cdo-$VERSION # Define the environment for the build module load hdf5 netcdf4 aec ecmwf-toolbox proj cmor udunits # We will need to explicitly set the rpath for proj and eccodes, since the CDO build system will not do it for us export LDFLAGS="-Wl,-rpath,$proj_DIR/lib -Wl,-rpath,$ECCODES_DIR/lib" # Configure the build ./configure --prefix=$PREFIX --with-hdf5=$HDF5_DIR --with-netcdf=$NETCDF4_DIR --with-szlib=$AEC_DIR --with-eccodes=$ECCODES_DIR --with-proj=$proj_DIR --with-cmor=$CMOR_DIR --with-udunits2=$UDUNITS_DIR # Make sure we start a clean build make clean # Build make -j $SLURM_NTASKS # Install make install # Check installed binary $PREFIX/bin/cdo -V
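To use your newly installed CDO afterwards without typing the full path, you can prepend its bin directory to your PATH. This is just a sketch; adjust the version to the one you actually installed:
No Format export PATH=$PERM/apps/cdo/<version>/bin:$PATH cdo -V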
...