Usage of ELPA on JUWELS

ELPA 2018.11.001 is available on JUWELS in a pure MPI version, a hybrid MPI/OpenMP version, and a version for GPU usage.
ELPA 2016.05.004 is also available, but only in the pure MPI version and only in older Stages. The user interfaces of the two versions differ. Version 2018.11.001 will be the last version with the legacy interface to ELPA; all later versions will only support the new interface.
All versions are available with the Intel compiler together with ParaStationMPI or IntelMPI, and with GCC together with ParaStationMPI.
GPU-enabled versions are available on JUWELS. To get full GPU performance it is necessary to enable the NVIDIA Multi-Process Service (MPS) by adding

#SBATCH --gres=gpu:4 --partition=gpus --cuda-mps

to your batch script. The NVIDIA Multi-Process Service is not compatible with OpenMP on the CPUs; therefore only the pure MPI versions are available for GPU usage.
All other versions are combined modules containing both the pure MPI version and the hybrid MPI/OpenMP version.
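Putting these pieces together, a minimal batch script for a GPU run might look as follows (job geometry, walltime, and the executable name are placeholders you must adapt; one MPI rank per GPU is an assumption, not a requirement stated here):

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4        # assumption: one pure-MPI rank per GPU
#SBATCH --time=00:30:00
#SBATCH --gres=gpu:4 --partition=gpus --cuda-mps   # enable the NVIDIA Multi-Process Service

module load Intel ParaStationMPI   # or one of the other compiler/MPI combinations listed above
module load ELPA/2018.11.001

srun ./name.x                      # your ELPA-linked executable (placeholder name)
```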

ELPA needs MKL for ScaLAPACK, BLACS, LAPACK, and BLAS. It cannot be used with OpenMPI, as the BLACS from MKL does not support OpenMPI.

The latest available version is 2020.05.001 in Stages/Devel-2019a. To access that version, run

ml load Stages/Devel-2019a

Compiling and linking a Fortran program name.f that calls ELPA routines with pure MPI looks as follows:

module load intel-para [Intel IntelMPI] [GCC ParaStationMPI]
module load ELPA/2018.11.001
mpif90 name.f [-O3] \
-I$ELPA_INCLUDE/elpa -L$ELPA_LIB -lelpa \
-lmkl_scalapack_lp64 -lblacs_intelmpi_lp64 \
-lmkl_intel_lp64 -lmkl_sequential -lmkl_core [ -liomp5 -lpthread]
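For orientation, the new (non-legacy) ELPA interface mentioned above is object based. The following is only a sketch of the typical call sequence, assuming the matrix-distribution variables (na, nev, na_rows, na_cols, nblk, my_prow, my_pcol) and the distributed arrays a, ev, z have already been set up via MPI/BLACS; consult the ELPA documentation for the authoritative interface:

```fortran
use elpa
class(elpa_t), pointer :: e
integer :: status

! check that the library supports the API version this code was written against
if (elpa_init(20171201) /= ELPA_OK) stop "incompatible ELPA API version"

e => elpa_allocate()

! describe the block-cyclically distributed matrix (variables assumed set up via BLACS)
call e%set("na",          na,      status)   ! global matrix size
call e%set("nev",         nev,     status)   ! number of eigenvectors wanted
call e%set("local_nrows", na_rows, status)
call e%set("local_ncols", na_cols, status)
call e%set("nblk",        nblk,    status)   ! block size of the block-cyclic layout
call e%set("mpi_comm_parent", MPI_COMM_WORLD, status)
call e%set("process_row", my_prow, status)
call e%set("process_col", my_pcol, status)
status = e%setup()

! solve: eigenvalues into ev, eigenvectors into z
call e%eigenvectors(a, ev, z, status)

call elpa_deallocate(e)
call elpa_uninit()
```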

For the hybrid version it is recommended (but not required) to load the threaded version of ParaStationMPI (for IntelMPI this is the default), so for the hybrid version use

module load Intel
module load ParaStationMPI/5.2.2-1-mt [module load IntelMPI]
module load ELPA/2018.11.001
mpif90 [-O3] -qopenmp name.f \
-I$ELPA_INCLUDE_OPENMP/elpa -L$ELPA_LIB -lelpa_openmp \
-lmkl_scalapack_lp64 -lblacs_intelmpi_lp64 \
-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread

Even with the hybrid version of ELPA you should link the threaded version of MKL, because part of the threaded parallelization comes from using threaded BLAS routines. When executing a hybrid program using ELPA, it is necessary to add

export ELPA_DEFAULT_omp=<number of OpenMP threads per MPI process wanted>

in the batch script before the srun command to activate the OpenMP version of ELPA.
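For example, a batch script fragment for a hybrid run with 4 OpenMP threads per MPI process might look like this (node and task counts are placeholders; setting OMP_NUM_THREADS alongside ELPA_DEFAULT_omp for the program's own OpenMP regions and threaded MKL is an assumption about a typical setup):

```shell
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=12
#SBATCH --cpus-per-task=4

export OMP_NUM_THREADS=4     # threads for your own OpenMP regions and threaded MKL
export ELPA_DEFAULT_omp=4    # activate the OpenMP version of ELPA

srun ./name.x                # your hybrid ELPA-linked executable (placeholder name)
```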

Examples, together with a Makefile to build them, can be found in $ELPA_ROOT/examples.