2. Mantik for Inference

This section explains how you can use the mantik platform to package trained models and use them for inference.

It assumes that you have already trained a model successfully on the platform. If that is not the case, train a model first and then come back to this tutorial.

First, install the mantik package. You will also need Docker for this.

pip install mantik[docker]

Set up the necessary environment variables.

export MANTIK_USERNAME=<MANTIK USERNAME>
export MANTIK_PASSWORD=<MANTIK PASSWORD>

eval $(mantik init)

This exports your MLflow tracking credentials, including MLFLOW_TRACKING_TOKEN, to the current shell. The token is used to authenticate the API requests below.

2.1. Register a Run as a Trained Model

Use the Mantik API directly to register a trained model.

Make a POST request to register the trained model as shown below:

curl -X 'POST' \
  'https://api.cloud.mantik.ai/projects/<PROJECT ID>/models/trained' \
  -H 'accept: application/json' \
  -H "Authorization: Bearer ${MLFLOW_TRACKING_TOKEN}" \
  -H 'Content-Type: application/json' \
  -d '{
  "uri": "<MLFLOW ARTIFACT URI>",
  "location": "S3"
}'
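If you prefer Python over curl, the same request can be made with the requests library. The following is a minimal sketch, assuming MLFLOW_TRACKING_TOKEN has been exported to your environment (e.g. via eval $(mantik init)); replace the placeholders with your own values.

import os

import requests

PROJECT_ID = "<PROJECT ID>"
ARTIFACT_URI = "<MLFLOW ARTIFACT URI>"

# Token exported by `eval $(mantik init)`.
token = os.environ["MLFLOW_TRACKING_TOKEN"]

# Register the run's model artifact as a trained model.
response = requests.post(
    f"https://api.cloud.mantik.ai/projects/{PROJECT_ID}/models/trained",
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    },
    json={"uri": ARTIFACT_URI, "location": "S3"},
)
response.raise_for_status()

# The response contains the ID of the newly registered model (see below).
print(response.json()["modelId"])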

You can find the MLFLOW ARTIFACT URI in the MLflow UI: click on the name of your run and copy the artifact URI shown on the run page.

[Screenshots: locating the MLflow artifact URI in the MLflow UI]

You can find the PROJECT ID in the project settings.

An example request looks like:

curl -X 'POST' \
  'https://api.cloud.mantik.ai/projects/3db1391f-db74-4d2d-a1e1-10c4575d53ab/models/trained' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer eyJraWQiOiIuhASOUDHs9df0123' \
  -H 'Content-Type: application/json' \
  -d '{
  "uri": "mlflow-artifacts:/60/4a83e0a502374b17b9244484d3566d84/artifacts/model",
  "location": "S3"
}'

If the request was successful, you will get a response like

{"modelId":"5472e202-c9b1-4f7b-a552-7f8c43cb14b2"}

Copy the modelId. That’s your MODEL ID.

2.2. Dockerize a Trained Model

Get the PROJECT ID and MODEL ID of the model you want to dockerize.

Call the corresponding API endpoint to start the dockerization process:

curl -X 'POST' \
  'https://api.cloud.mantik.ai/projects/<PROJECT ID>/models/trained/<MODEL ID>/docker/build' \
  -H 'accept: application/json' \
  -H "Authorization: Bearer ${MLFLOW_TRACKING_TOKEN}" \
  -d ''

You should get the response null back. The build process will take a few minutes.
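The same call can also be made from Python. Again a minimal sketch, assuming MLFLOW_TRACKING_TOKEN is exported:

import os

import requests

PROJECT_ID = "<PROJECT ID>"
MODEL_ID = "<MODEL ID>"

token = os.environ["MLFLOW_TRACKING_TOKEN"]

# Trigger the Docker image build for the registered trained model.
response = requests.post(
    f"https://api.cloud.mantik.ai/projects/{PROJECT_ID}"
    f"/models/trained/{MODEL_ID}/docker/build",
    headers={"Authorization": f"Bearer {token}"},
)
response.raise_for_status()
print(response.text)  # expected to be "null"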

2.3. Download a Docker Image with the Model

Given enough time, the dockerized model will be ready. There is currently no easy way to check whether a dockerized model is ready for download, so the download may fail until the image build has finished (see the retry sketch after the options list below).

Download the model using

mantik models download --project-id="<PROJECT ID>" --model-id="<MODEL ID>" --load

The --load flag tells mantik to unzip the downloaded archive and load the model image directly into Docker. To instead keep it as a .tar.gz file that you can easily share with others, omit the --load flag.

Available options are

Options:
  --project-id UUID               [required]
  --model-id UUID                 [required]
  --target-dir TEXT               Path to directory where the zipped tarball
                                  image will be downloaded.  [default: ./]
  --image-type [docker|apptainer]
                                  Type of the image to fetch from the mantik
                                  platform.  [default: docker]
  --load / --no-load              Load the tarball image into docker.
                                  [default: no-load]
  --help                          Show this message and exit.
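Since there is no readiness check, one pragmatic workaround is to simply retry the download until it succeeds. The sketch below wraps the mantik CLI with subprocess and assumes that the command exits with a non-zero status while the image is not yet available; the number of attempts and the retry interval are arbitrary.

import subprocess
import time

PROJECT_ID = "<PROJECT ID>"
MODEL_ID = "<MODEL ID>"

# Retry the download every two minutes, at most ten times.
for attempt in range(10):
    result = subprocess.run(
        [
            "mantik", "models", "download",
            f"--project-id={PROJECT_ID}",
            f"--model-id={MODEL_ID}",
            "--load",
        ]
    )
    if result.returncode == 0:
        print("Download succeeded.")
        break
    print(f"Attempt {attempt + 1} failed, retrying in two minutes...")
    time.sleep(120)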

2.4. Run the Model for Inference

Now it’s time to run the model.

Assuming the image has been downloaded and loaded into Docker, it can be started using

docker run -p 8080:8080 <MODEL ID>-docker

It is now ready for inference.
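Optionally, you can check that the server is up before sending requests. The sketch below assumes the image runs the standard MLflow scoring server, which exposes a /ping health endpoint; if your image differs, skip this check.

import requests

# The MLflow scoring server answers /ping with HTTP 200 once it is ready.
response = requests.get("http://127.0.0.1:8080/ping")
print(response.status_code)  # 200 means the server is ready to accept requests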

To actually run inference, you need to call its invocations endpoint. Open a Jupyter notebook on your local machine, or simply copy the code below into a Python file and run it.

import json

import pandas as pd
import requests

# Feature columns expected by the model.
columns = [
    "fixed acidity",
    "volatile acidity",
    "citric acid",
    "residual sugar",
    "chlorides",
    "free sulfur dioxide",
    "total sulfur dioxide",
    "density",
    "pH",
    "sulphates",
    "alcohol",
]

# A single sample to run inference on.
data = [
    [7, 0.27, 0.36, 20.7, 0.045, 45, 170, 1.001, 3, 0.45, 8.8],
]

df = pd.DataFrame(data, columns=columns)

# Serialize the dataframe in "split" orientation, as expected by the
# invocations endpoint.
_predict_json = json.dumps({"dataframe_split": df.to_dict(orient="split")})

# Send the payload to the model server and print the predictions.
response = requests.post(
    "http://127.0.0.1:8080/invocations",
    headers={"Content-Type": "application/json"},
    data=_predict_json,
)
print(response.text)

The output is something like

{"predictions": [5.550530190667395]}

which is the result of the inference.