Introduction

The Mantik platform and Python package let you submit applications (see Submitting an Application) and use the corresponding API to run them on remote systems such as high-performance computers (HPC, e.g. JUWELS).

Projects must follow a set of conventions. These conventions add a level of abstraction that makes it easy to execute the same application on different systems.

These conventions build on the MLflow project conventions. On top of that, we extend them with the Compute Backend Config, which configures the execution environment. Optionally, you can define Apptainer images or Python virtual environments that will be used on the remote system to run the application.

In summary, three essential files are required:

  1. Python script (or package) with CLI: To run a Python application with Mantik, you need either at least one Python script or a Python package that has a CLI.

    This script - or the code that is invoked by the CLI - might, for example, be responsible for training an ML model or using a trained model for inference. In either case, the script may use MLflow methods to track training/inference parameters, model metrics, and artifacts (e.g. plots). (Details on experiment tracking are explained in Using MLflow for Tracking.)

    Here, we assume that the application has a Python script that uses the argparse module to parse the arguments passed to it, so that the behavior of the code can be changed at execution time. This lets you use Mantik to its full potential.

    A CLI - either in a script or via a package - is optional, though, and an application may just as well accept no arguments.

  2. MLproject file: Defines the application’s entry points, i.e. its scripts and their arguments.

    Its structure is defined by the MLflow conventions for MLprojects.

    This file declares how your script can be invoked and which parameters it accepts. Mantik uses the MLproject file to let you modify an application's input parameters when submitting it to an HPC system.

    Details are explained in Preparing Your Application.

  3. Compute Backend config: Configures the execution environment of an MLproject on an external system such as an HPC system.

    The config can be written in YAML or JSON format.

    Details are explained in Compute Backend Config.
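As an illustration of the first item in the list above, here is a minimal sketch of a Python script with an argparse-based CLI. The script layout, the function names (`build_parser`, `train`, `main`), and the two hyperparameters are hypothetical and not prescribed by Mantik; the training loop is a placeholder that only produces a number to report.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Create the CLI of a hypothetical training script."""
    parser = argparse.ArgumentParser(description="Train a toy model.")
    # Hypothetical hyperparameters; replace them with whatever your code needs.
    parser.add_argument("--learning-rate", type=float, default=0.01)
    parser.add_argument("--epochs", type=int, default=10)
    return parser


def train(learning_rate: float, epochs: int) -> float:
    """Placeholder for real training: shrinks a made-up loss once per epoch."""
    loss = 1.0
    for _ in range(epochs):
        loss *= 1.0 - learning_rate
    return loss


def main(argv=None) -> float:
    # argv=None makes argparse read the arguments from the command line.
    args = build_parser().parse_args(argv)
    loss = train(args.learning_rate, args.epochs)
    # If MLflow is used for tracking, calls such as mlflow.log_param(...)
    # and mlflow.log_metric(...) would typically go here.
    print(f"final loss: {loss:.4f}")
    return loss


if __name__ == "__main__":
    main()
```

Assuming the script is saved as `train.py` (a hypothetical name), it could be invoked as `python train.py --learning-rate 0.05 --epochs 20`. Exposing every tunable value as a CLI argument is what allows the parameters to be changed at submission time without editing the code.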

If these three files are provided, a project can be executed on an external system via the Mantik platform, the Python library, or the CLI.
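To make the role of the MLproject file more concrete, the following sketch mirrors the structure of an MLproject entry point (name, parameters with defaults, and a command template with `{placeholder}` fields) as a Python dictionary and renders the resulting command line. This is only an illustration of the idea: the real MLproject file is written in YAML, the project name, script name, and parameters are hypothetical, and `render_command` is not how MLflow or Mantik implement parameter substitution.

```python
# Python mirror of a hypothetical MLproject file (the real file is YAML).
mlproject = {
    "name": "my-project",
    "entry_points": {
        "main": {
            "parameters": {
                "learning_rate": {"type": "float", "default": 0.01},
                "epochs": {"type": "float", "default": 10},
            },
            "command": (
                "python train.py "
                "--learning-rate {learning_rate} --epochs {epochs}"
            ),
        }
    },
}


def render_command(project: dict, entry_point: str = "main", **overrides) -> str:
    """Fill an entry point's command template with defaults and overrides."""
    entry = project["entry_points"][entry_point]
    unknown = set(overrides) - set(entry["parameters"])
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    # Start from the declared defaults, then apply user-supplied values.
    values = {name: spec["default"] for name, spec in entry["parameters"].items()}
    values.update(overrides)
    return entry["command"].format(**values)


print(render_command(mlproject))
print(render_command(mlproject, epochs=50))
```

The first call uses only the declared defaults; the second overrides `epochs`, which corresponds to changing an input parameter at submission time while the script itself stays unchanged.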