FAQ and Common Issues¶
Q: Why am I getting an Invalid token response although my credentials are correct?¶
A: Tokens are stored locally in a JSON file at ~/.mantik/tokens.json. If you experience any issues with authentication, simply delete that file and retry.
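Deleting the file can also be done programmatically. A minimal sketch, assuming the default token location mentioned above:

```python
from pathlib import Path

# Default location of the locally cached Mantik tokens.
token_file = Path.home() / ".mantik" / "tokens.json"

# missing_ok=True (Python 3.8+) avoids an error if the file
# has already been removed.
token_file.unlink(missing_ok=True)
```

On the next authentication attempt, a fresh token file will be created.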
Q: Why is an experiment created via the MLflow GUI not recognized by Mantik?¶
A: When an experiment is created directly in the MLflow UI, it is not recorded in Mantik. This can leave Mantik and MLflow in a mismatched state. The same applies to experiments created programmatically via mlflow.create_experiment: they will not be recognized by Mantik either. For experiments to be recognized by both MLflow and Mantik, you have to create them using the Mantik platform.
Q: Why am I getting multiple runs when submitting my application via Mantik?¶
A: Counter question: do you pass a run_id or run_name to mlflow.start_run(), or explicitly set the MLFLOW_RUN_ID environment variable? If yes: don't do it! If no, please submit a bug report.
Managing runs explicitly from scripts poses some challenges:

MLflow has a concept called ActiveRun, which represents the currently active MLflow run. When a run is submitted remotely, an MLflow run is created automatically, and that run's ID is set as MLFLOW_RUN_ID in the run's environment. When using mlflow.start_run(), MLflow will pick up this run and use it for logging.

The MLflow docs state that an ActiveRun can be picked up in a script using mlflow.active_run(). This method, though, might not behave as expected: it only picks up a run that was created or resumed earlier in the current process (or script) using mlflow.start_run(). Unlike mlflow.start_run(), it does not consume MLFLOW_RUN_ID.

If you really want to manage runs directly from scripts, mlflow.end_run() needs to be called to stop the automatically created run before a new MLflow run can be created.
Q: How can I improve performance when tracking with MLflow?¶
A: Our tests have shown that the following MLflow methods are preferred when tracking:
log_params: at least 3x faster than log_param, and up to 20x faster for a large number of parameters
log_metrics: at least 3x faster than log_metric, and up to 20x faster for a large number of metrics
log_artifact: up to 2x faster than log_artifacts (note that here the singular form is faster, unlike log_params vs. log_param and log_metrics vs. log_metric)
Q: How can I schedule a run without triggering a run immediately?¶
A: Unfortunately, this is currently not possible.