Usage of ELPA on JURECA

ELPA 2018.11.001 is available on JURECA in a pure MPI and a hybrid MPI and OpenMP version.
A version compiled for GPU usage is also available.
On the Booster a version compiled for KNL usage is also available.
ELPA 2016.05.004 is also available, but only in an older Stage and only in the pure MPI version. The user interfaces of the two versions differ. Version 2018.11.001 will be the last version with the legacy ELPA interface; all later versions will only support the new interface. ELPA needs MKL for ScaLAPACK, BLACS, LAPACK, and BLAS. It cannot be used with OpenMPI, because the BLACS implementation from MKL is incompatible with OpenMPI.

The latest available version is 2020.05.001 in Stages/Devel-2019a. To access that version, run

ml load Stages/Devel-2019a
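To check which ELPA installations are visible, the Lmod module system used on JURECA can be queried. This is a sketch; the exact module names shown depend on the loaded Stage:

```shell
# Sketch: inspect available ELPA modules (Lmod commands; output depends on the Stage)
ml load Stages/Devel-2019a   # switch to the Stage mentioned above
ml avail ELPA                # list ELPA modules visible in the current Stage
ml spider ELPA               # list all ELPA versions across all Stages
```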

Versions with GPU support are available on JURECA. To get full GPU performance it is necessary to enable the NVIDIA Multi-Process Service by adding

#SBATCH --gres=gpu:4 --partition=gpus --cuda-mps

in your batch script. The NVIDIA Multi-Process Service is not compatible with the use of OpenMP on the CPUs, so only the pure MPI versions are available for GPU usage. All other versions are combined modules that contain both the pure MPI version and the hybrid MPI/OpenMP version.
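A minimal batch script using these options might look as follows. The node count, task count, wall time, and executable name are placeholders; only the --gres/--partition/--cuda-mps line is taken from the text above:

```shell
#!/bin/bash
#SBATCH --nodes=1                                   # placeholder values
#SBATCH --ntasks-per-node=24
#SBATCH --time=00:30:00
#SBATCH --gres=gpu:4 --partition=gpus --cuda-mps    # enable MPS for full GPU performance

srun ./elpa_prog                                    # hypothetical executable name
```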

Compiling and linking a Fortran program name.f that calls ELPA routines from the pure MPI version looks as follows:

module load intel-para
module load ELPA/2018.11.001
mpif90 name.f [-O3 ] \
-I$ELPA_INCLUDE/elpa -L$ELPA_LIB -lelpa \
-lmkl_scalapack_lp64 -lblacs_intelmpi_lp64 \
-lmkl_intel_lp64 -lmkl_sequential -lmkl_core [-liomp5 -lpthread]

For the hybrid version it is recommended (but not required) to load the threaded version of ParaStation MPI (for Intel MPI this is the default), so for the hybrid version use

module load Intel
module load ParaStationMPI/5.2.2-1-mt
module load ELPA/2018.11.001
mpif90 [-O3] -qopenmp name.f \
-I$ELPA_INCLUDE_OPENMP/elpa -L$ELPA_LIB -lelpa_openmp \
-lmkl_scalapack_lp64 -lblacs_intelmpi_lp64 \
-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread

Even with the hybrid version of ELPA you should link the threaded version of MKL, because part of the threaded parallelization comes from using threaded BLAS routines. When executing a hybrid program using ELPA it is necessary to add

export ELPA_DEFAULT_omp=<number of OpenMP threads per MPI process wanted>

in the batch script before the srun command to activate the OpenMP version of ELPA.
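Putting this together, a batch script for a hybrid run could look like the following sketch; the node and thread counts and the executable name are assumptions:

```shell
#!/bin/bash
#SBATCH --nodes=2                 # placeholder values
#SBATCH --ntasks-per-node=4       # MPI processes per node
#SBATCH --cpus-per-task=6         # OpenMP threads per MPI process

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
export ELPA_DEFAULT_omp=$SLURM_CPUS_PER_TASK   # activate the OpenMP version of ELPA

srun ./elpa_prog                  # hypothetical executable name
```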

Examples, together with a Makefile to build them, can be found in $ELPA_ROOT/examples.
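Since $ELPA_ROOT usually points to a read-only installation directory, one way to try the examples is to copy them to a writable location first. This is a sketch; the copy target is arbitrary:

```shell
module load intel-para ELPA/2018.11.001    # modules as described above
cp -r $ELPA_ROOT/examples $HOME/elpa-examples
cd $HOME/elpa-examples
make                                       # build the examples with the provided Makefile
```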