HTC Multi-node Tasks
This module is the third in a sequence that will form the overall capabilities of the HTC library (see HTC Library Configuration in YAML for the previous module). It enables tasks to be run over a set of nodes (specifically MPI/OpenMP tasks).
Purpose of Module
The initial goal is to allow the HTC library to control tasks that are executed via an MPI launcher command. The launcher is forked as a child process from within the library, and the task tracked by Dask is actually the process created by that launcher.
The implementation is intended to be generic, but the specific example implementation provided is for the srun launcher that is used on the JURECA system.
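The launcher pattern described above can be illustrated with a minimal sketch using only the Python standard library. This is not the library's implementation: the function names `build_launcher_command` and `run_with_launcher`, the `SRUN_LAUNCHER` options, and the `STANDIN_LAUNCHER` used for testing without SLURM are all assumptions made for illustration.

```python
import subprocess
import sys
from typing import List, Sequence


def build_launcher_command(launcher: Sequence[str], task: Sequence[str]) -> List[str]:
    """Prefix the task's command line with the launcher's command line."""
    return list(launcher) + list(task)


def run_with_launcher(
    launcher: Sequence[str], task: Sequence[str]
) -> subprocess.CompletedProcess:
    """Start the launcher as a child process and wait for it to finish.

    The caller only sees the launcher process; the launcher is the one
    that starts the actual task (on one or more nodes).
    """
    command = build_launcher_command(launcher, task)
    return subprocess.run(command, capture_output=True, text=True, check=False)


# Hypothetical launcher line for a SLURM system (the task count is an assumption).
SRUN_LAUNCHER = ["srun", "--ntasks=4"]

# Stand-in launcher for machines without SLURM: a Python one-liner that
# simply runs whatever command follows it and forwards the exit code.
STANDIN_LAUNCHER = [
    sys.executable,
    "-c",
    "import subprocess, sys; sys.exit(subprocess.call(sys.argv[1:]))",
]

if __name__ == "__main__":
    task = [sys.executable, "-c", "print('hello from task')"]
    result = run_with_launcher(STANDIN_LAUNCHER, task)
    print(result.returncode, result.stdout.strip())
```

Keeping the launcher as a plain list of arguments is one way to keep the approach generic: swapping `STANDIN_LAUNCHER` for `SRUN_LAUNCHER` (or another MPI launcher) changes only the prefix of the command, not the code that starts and tracks the process.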
Background Information
This module builds upon the work described in HTC Library Configuration in YAML.
Building and Testing
The library is a Python module and can be installed with
python setup.py install
More details on installing a Python package can be found in, for example, Install Python packages on the research computing systems at IU.
To run the tests for the decorators within the library, you need the pytest Python package. You can run all the relevant tests from the jobqueue_features directory with
pytest tests/test_mpi_wrapper.py
Specific examples of usage for the JURECA system are available in the examples subdirectory.
Source Code
The latest version of the library is available on the jobqueue_features GitHub repository.
The code originally created specifically for this module can be seen in the HTC/MPI Merge Request, which is in the original private repository of the code. Additional, more complex examples were provided in the HTC/MPI examples Merge Request.