Extending available MPI runtime environments
This module is one in a sequence of modules that together form the overall capabilities of the HTC library (see HTC MPI-Aware Tasks, the most relevant previous module, where support for forked MPI workloads was added). It adds support for additional MPI runtimes, making the library more portable across HPC systems.
Purpose of Module
This module extends the MPI runtimes supported by jobqueue_features beyond the original SLURM (srun) and mpiexec launchers to OpenMPI, Intel MPI and MPICH. The support includes passing each runtime the arguments needed for reasonable process pinning, based on the system architecture and the resources requested for each worker.
Background Information
To date, we have only included MPI launchers that do not require complex configuration (srun and mpiexec). To extend the set of supported MPI launchers, we also need to account for how the launcher distributes processes and threads. This information is available because it is dictated by the system configuration file and by the arguments the user provides when creating the Dask cluster to which they submit their tasks.
The main goal here is to make a best-effort mapping between the user request and the MPI launcher options that distribute and pin the processes/threads across the target system.
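The kind of mapping described above can be illustrated with a minimal sketch. This is not the jobqueue_features implementation: the function name build_launch_command, its parameters, and the launcher keys are hypothetical and chosen only for illustration. The flags are commonly documented pinning options for each launcher, but their exact behaviour varies between versions, so they should be checked against the documentation of the MPI installation on the target system.

```python
from typing import List


def build_launch_command(
    launcher: str,
    nodes: int,
    tasks_per_node: int,
    cpus_per_task: int,
    program: List[str],
) -> List[str]:
    """Hypothetical helper: translate a resource request into launcher arguments.

    Assumes one MPI process per task and `cpus_per_task` cores per process.
    """
    total_tasks = nodes * tasks_per_node

    if launcher == "srun":
        # SLURM derives placement from the allocation; bind each task to cores.
        args = [
            "srun",
            f"--ntasks={total_tasks}",
            f"--ntasks-per-node={tasks_per_node}",
            f"--cpus-per-task={cpus_per_task}",
            "--cpu-bind=cores",
        ]
    elif launcher == "openmpi":
        # ppr = processes per resource; PE = processing elements (cores) per process.
        args = [
            "mpirun",
            "-np", str(total_tasks),
            "--map-by", f"ppr:{tasks_per_node}:node:PE={cpus_per_task}",
            "--bind-to", "core",
        ]
    elif launcher == "intelmpi":
        # I_MPI_PIN_DOMAIN sets the number of logical CPUs in each pinning domain.
        args = [
            "mpirun",
            "-n", str(total_tasks),
            "-ppn", str(tasks_per_node),
            "-genv", "I_MPI_PIN_DOMAIN", str(cpus_per_task),
        ]
    elif launcher == "mpich":
        # Hydra's -bind-to accepts a count suffix, for example core:2.
        args = [
            "mpiexec",
            "-n", str(total_tasks),
            "-ppn", str(tasks_per_node),
            "-bind-to", f"core:{cpus_per_task}",
        ]
    else:
        raise ValueError(f"Unsupported launcher: {launcher}")

    return args + list(program)


if __name__ == "__main__":
    # Example request: 2 nodes, 4 MPI processes per node, 6 cores per process.
    cmd = build_launch_command("openmpi", 2, 4, 6, ["python", "task.py"])
    print(" ".join(cmd))
```

Returning the command as a list of arguments rather than a single string keeps quoting unambiguous if the result is later passed to something like subprocess.run. A real implementation would also need to take the system configuration into account (for example, hyperthreading or cores per node), which this sketch ignores.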
Building and Testing
The library is a Python module and can be installed with
python setup.py install
More details about how to install a Python package can be found in, for example, Install Python packages on the research computing systems at IU.
To run the tests for the MPI launchers within the library, you need the pytest Python package. You can run all the relevant tests from the jobqueue_features directory with
pytest tests/test_mpi_wrapper.py
Source Code
The latest version of the library is available on the jobqueue_features GitHub repository.
The code that was originally created specifically for this module can be seen in the Merge Request that added support for OpenMPI and Intel MPI, and the Merge Request that added support for MPICH.