Using Cog and Acorn to Deploy and Scale Machine Learning Models on Kubernetes

by Janakiram MSV | May 24, 2023


Deploying and Scaling Containerized Machine Learning Models – Part 2

This four-part series focuses on leveraging the Cog and Acorn frameworks to package, deploy, and scale machine learning models in cloud native environments. Part 1 introduces Cog as the framework for containerizing ML models, while this part (part 2) focuses on integrating Cog with Acorn to target Kubernetes clusters for deployment. Part 3 discusses leveraging GPUs to accelerate the inference of computer vision models, and finally, part 4 covers deploying transformer models that deal with natural language processing (NLP).

In the first part of this tutorial series, I introduced Cog as the MLOps tool to containerize machine learning models. This guide explores how Cog and Acorn work together to deploy and scale machine learning models on Kubernetes.

The Acorn framework complements Cog by extending the workflow to Kubernetes. As we saw in the previous tutorial, Cog simplifies building Docker images that perform inference on ML models, but it can't wrap the image inside a Kubernetes deployment manifest and expose the REST API as a service.

Acorn is a framework that simplifies the packaging and deployment of container images, so its workflow starts where Cog's ends.

In the previous part of this series, we generated a Docker image through Cog and tested it locally. Let’s extend this further to integrate with Acorn. 

First, let's push the Docker image to a container registry; in this case, Docker Hub.

export DOCKER_HUB_USERNAME=janakiramm   # replace with your Docker Hub username
# Tag the image built in part 1 (assumed to be named salpred), then push it
docker tag salpred $DOCKER_HUB_USERNAME/salpred
docker push $DOCKER_HUB_USERNAME/salpred

Building our Acornfile

It’s time to define the Acornfile that turns the Docker image into a set of Kubernetes resources.

Create a directory called deploy and, inside it, create an Acornfile with the following contents:

containers:{
    "salary": {
        image: "janakiramm/salpred:latest"
        ports: publish: "80:5000/http"
    }
}

The content of this Acornfile is straightforward: it pulls the Docker image from Docker Hub, runs it on the Kubernetes cluster, and publishes the container's port 5000 on port 80 over HTTP. Remember that Cog, by default, exposes its REST API endpoint on port 5000.

Let’s go ahead and run the Acorn application. 

acorn run -n salpred .

Wait for the endpoint URL to be generated; the Acorn CLI can show it, as seen below.
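Running acorn apps lists the application along with its generated endpoint URL:

acorn apps

Once the app reports a healthy status, use cURL to test the endpoint: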

curl -X POST -H "Content-Type: application/json" -d '{"input": {"exp": "25"}}' --silent http://salary-salpred-0ca08a35.d0s0td.alpha.on-acorn.io/predictions

Piping the response through the jq CLI filters the JSON output down to just the prediction.
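For example, assuming the standard Cog response shape, where the result is returned under the output key, a jq filter extracts just the predicted value:

curl -X POST -H "Content-Type: application/json" -d '{"input": {"exp": "25"}}' --silent http://salary-salpred-0ca08a35.d0s0td.alpha.on-acorn.io/predictions | jq '.output'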

Great! Now, our ML model’s inference code is running in Kubernetes as a microservice. 

Scaling our ML Model on Kubernetes

Let’s go ahead and scale the service by modifying the Acornfile to add the scale key with a value of 4.

containers:{
    "salary": {
        image: "janakiramm/salpred:latest"
        ports: publish: "80:5000/http"
        scale: 4
    }
}

Delete the Acorn app and run it again. 
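Assuming the app was named salpred as before, the following commands remove it and launch it again with the updated Acornfile:

acorn rm salpred
acorn run -n salpred .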

The Kubernetes namespace associated with the Acorn app now shows four pods, confirming that the deployment has scaled.
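To verify this with kubectl, find the namespace Acorn generated for the app and list its pods. The namespace name below is a hypothetical placeholder; substitute the one created in your cluster:

kubectl get namespaces | grep salpred
kubectl get pods -n salpred-xxxxxxxx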

We can take advantage of the Acorn framework to add probes, command-line arguments, external volumes, and more. We've been able to accelerate and simplify running machine learning on Kubernetes without reducing flexibility or portability, and we've shown how to scale machine learning models as demand increases.
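As an example of extending the Acornfile further, here is a minimal sketch of a readiness probe added to the salary container. It follows the probe syntax from the Acorn documentation, and the probe URL is an assumption; point it at whichever endpoint your Cog server is known to answer:

containers: {
    "salary": {
        image: "janakiramm/salpred:latest"
        ports: publish: "80:5000/http"
        scale: 4
        probes: [
            {
                // Readiness probe; the URL path is an assumed health endpoint
                type: "readiness"
                http: {
                    url: "http://localhost:5000/"
                }
            }
        ]
    }
}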

In the next part of this series, we will use Cog and Acorn to deploy a GPU-accelerated computer vision model.

Janakiram is a practicing architect, analyst, and advisor focusing on emerging infrastructure technologies. He provides strategic advisory to hyperscalers, technology platform companies, startups, ISVs, and enterprises. As a practitioner working with a diverse enterprise customer base across cloud native, machine learning, IoT, and edge domains, Janakiram gains insight into the enterprise challenges, pitfalls, and opportunities involved in emerging technology adoption. Janakiram is an Amazon, Microsoft, and Google certified cloud architect, as well as a CNCF Ambassador and Microsoft Regional Director. He is an active contributor at Gigaom Research, Forbes, The New Stack, and InfoWorld. You can follow him on Twitter.

Header Photo by Jeffrey Betts on Unsplash

