Deploying an MLflow Remote Server with Docker, S3 and SQL

MLflow is an open-source platform for managing your machine learning lifecycle. You can either run MLflow locally on your system, or host an MLflow Tracking Server, which allows multiple people to log models and store them remotely in a model repository for quick deployment and reuse.

In this article, I'll walk you through deploying MLflow on a remote server using Docker, an S3-compatible object store of your choice (MinIO or Ceph), and a SQL backend (SQLite or MySQL).

Setting up the Server

mkdir mlflow-server
cd mlflow-server
python3 -m venv env
source env/bin/activate
pip3 install mlflow

Setting up the Backend Store

MLflow stores model parameters, metrics, and run metadata in a backend store, which you can specify. By default, MLflow uses the local filesystem, but that is not an ideal setup for a remote tracking server. Here, I'll show you how to set it up using either SQLite3 or MySQL, whichever you prefer. I would recommend MySQL in a production environment.

Using SQLite3

sudo apt install sqlite3
sqlite3 store.db   # creates the database file
# press Ctrl+D to exit sqlite
# nothing else needs to be done
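MLflow reaches this database through a SQLAlchemy-style URI, which you'll pass as the backend store URI when starting the server below. The path is relative to wherever you launch the server from:

sqlite:///store.db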

Using MySQL

pip3 install pymysql
sudo apt install mysql-server
mysql -u root -p   # open a MySQL shell and run:
CREATE USER 'mlflow-user' IDENTIFIED BY 'password';
CREATE DATABASE mlflowruns;
GRANT ALL PRIVILEGES ON mlflowruns.* TO 'mlflow-user';
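The corresponding backend store URI uses the pymysql driver installed above. Assuming the database runs on the same host as the tracking server:

mysql+pymysql://mlflow-user:password@localhost/mlflowruns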

Setting up the Artifact Store

The artifact store is where your model's files live. This includes your environment and any other files you need to recreate and deploy your model instantly. By default, your model's artifacts are also stored on the filesystem, in a folder named mlruns, but this won't work for a remote tracking server: every client needs read and write access to the artifacts, which a filesystem path on the server can't provide. Here, we'll use S3 buckets to store model artifacts, and the two options we'll look at are MinIO and Ceph. MinIO is comparatively lightweight, though Ceph is much more powerful and easier to get started with, thanks to Ceph Nano.

Using MinIO

sudo docker pull minio/minio
sudo docker run -p <exposed_port_for_minio>:9000 --name minio1 \
-e MINIO_ACCESS_KEY=<your_access_key_id> \
-e MINIO_SECRET_KEY=<your_secret_key> \
-v /mnt/data:/data \
-v /mnt/config:/root/.minio \
minio/minio server /data
pip3 install minio
pip3 install boto3
export MLFLOW_S3_ENDPOINT_URL=http://127.0.0.1:<exposed_port_for_minio>
export AWS_ACCESS_KEY_ID=<your_access_key_id>
export AWS_SECRET_ACCESS_KEY=<your_secret_key>
wget https://gist.githubusercontent.com/kaushalvivek/9f1905e25a28526dfeaaecf80ef5c361/raw/3bbb30954d8ee7144fe2ec183b78999518226040/create_bucket.py
python3 create_bucket.py
rm create_bucket.py
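The gist above presumably just creates a bucket for MLflow to write into. If you'd rather not download it, a minimal equivalent using boto3 looks something like this; note that the bucket name mlflow is my assumption here, not the gist's, so use whatever name you plan to reference in the artifact root:

# create_bucket.py equivalent -- a sketch, assuming the env vars exported above are set
import os
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url=os.environ["MLFLOW_S3_ENDPOINT_URL"],
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
)
s3.create_bucket(Bucket="mlflow")  # "mlflow" is an assumed bucket name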

Using Ceph

# download the Ceph Nano binary (cn) from github.com/ceph/cn first
./cn cluster start -d /tmp mlflow-cluster -f huge
# cn prints the S3 endpoint, access key and secret key on startup
./cn s3 mb mlflow-cluster mlflow-buc
export MLFLOW_S3_ENDPOINT_URL=http://127.0.0.1:<exposed_port_for_ceph>
export AWS_ACCESS_KEY_ID=<your_access_key_id>
export AWS_SECRET_ACCESS_KEY=<your_secret_key>

Starting the Server

mlflow server -p <exposed_port_for_mlflow> \
--host 0.0.0.0 \
--backend-store-uri <backend_store_uri> \
--default-artifact-root <default_artifact_root>
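For example, with the MySQL store from above and a MinIO bucket named mlflow (the port 5000 and the bucket name are illustrative, not required values):

mlflow server -p 5000 \
--host 0.0.0.0 \
--backend-store-uri mysql+pymysql://mlflow-user:password@localhost/mlflowruns \
--default-artifact-root s3://mlflow/

The shell you start the server from should still have the MLFLOW_S3_ENDPOINT_URL and AWS credential variables from earlier exported, so the server can read artifacts back for the UI.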

Setting up the Client

mkdir mlflow-work
cd mlflow-work
python3 -m venv env
source env/bin/activate
pip3 install mlflow
export MLFLOW_TRACKING_URI=http://<remote_server_ip>:<exposed_port_for_mlflow>
export MLFLOW_S3_ENDPOINT_URL=http://<remote_server_ip>:<exposed_port_for_artifact_root>
export AWS_ACCESS_KEY_ID=<enter_aws_access_key_id>
export AWS_SECRET_ACCESS_KEY=<enter_aws_secret_access_key>
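With the environment configured, a short script is enough to confirm that runs land in the database and artifacts land in the bucket. A minimal sketch; the experiment name, parameter, and metric here are made up for illustration:

# test_run.py
import mlflow

mlflow.set_experiment("remote-test")   # created on the remote server if absent
with mlflow.start_run():
    mlflow.log_param("alpha", 0.5)
    mlflow.log_metric("rmse", 0.87)
    mlflow.log_artifact("test_run.py")  # uploads this file to the S3 artifact store

If the run shows up in the MLflow UI at http://<remote_server_ip>:<exposed_port_for_mlflow> and the artifact appears in your bucket, the setup is working end to end.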

Hacker | Engineer — vivekkaushal.com