MLOps Best Practices

Challenges arise as the production of machine learning models scales up to the enterprise level. MLOps plays a role in mitigating those challenges: handling scalability, automating repetitive work, reducing dependencies, and streamlining decision making. Simply put, MLOps is like the cousin of DevOps.

It’s a set of practices that unify the process of ML development and operation.

This article serves as a general guide for anyone looking to develop their next machine learning pipeline, offering short summaries of the core topics of MLOps.

1. Communication and collaboration between roles — “ML products are a team effort”

A successful data science team consists of roles that bring different skill sets and responsibilities to the project. Subject matter experts, data scientists, software engineers, and business analysts each play an important part, but they don't use the same tools or share the same baseline knowledge, which makes communicating effectively harder. That is why practicing collaboration and communication between the roles at every step of the way is important: it keeps the whole team getting around the track as quickly as possible.

2. Establish Business Objectives — “Have a clear goal in mind”

Data science teams run into problems all the time when they struggle to prove to stakeholders and upper management how their model provides value to the company. Often a model falls short because there were no clear objectives during the exploratory and maintenance phases of development. To combat this, broad business questions like "How do we get people to stay longer on a web page?" need to be translated into performance metrics that can be set as goals for the model to strive for. The point of this practice is to give the data engineers and scientists a foundational starting point to work from, and to avoid the risk of solving a problem that doesn't serve the business in the long run.
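As a hypothetical illustration of turning a broad business question into a measurable objective: the question about keeping people on a web page could be restated as a concrete lift target on session duration. The numbers and the 10% target below are made up for the sketch.

```python
# Hypothetical sketch: restate "how do we get people to stay longer
# on a web page?" as a measurable target the team can evaluate against.
# All data and the 10% target are illustrative assumptions.

baseline_sessions = [42, 55, 38, 61, 47]   # session durations (seconds), made-up
candidate_sessions = [50, 58, 45, 66, 52]  # sessions under the new model, made-up

def mean(xs):
    return sum(xs) / len(xs)

# Business goal as a metric: lift mean session duration by at least 10%.
TARGET_LIFT = 0.10
lift = mean(candidate_sessions) / mean(baseline_sessions) - 1
meets_objective = lift >= TARGET_LIFT
```

With a numeric target like this, "is the model providing value?" becomes a question the whole team can answer the same way.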

3. Obtaining the Ground Truth — “Validate the dataset”

Validating the source of the data and labeling it correctly can be an arduous process, perhaps the most time-consuming of all. That is why it is important to budget the time and resources early in the development process: depending on the size of the dataset, labeling can become a hindrance to model performance. For example, if the model is trained to detect objects in a picture, obtaining the ground truth involves labeling each observation in the dataset with a bounding box, which has to be done manually.

After the model is put into production, its performance may drift because the live data no longer reflects the population the model was trained on. In this case, retraining the model on newly labeled data becomes necessary, and that costs time and resources.
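One way to notice that drift before it hurts users is to compare the distribution of a feature at training time with what the model sees in production. A minimal stdlib-only sketch, using a two-sample Kolmogorov-Smirnov statistic with a made-up threshold (the data and alert level are illustrative assumptions, not from the article):

```python
import bisect

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the two samples' empirical CDFs."""
    a, b = sorted(a), sorted(b)
    def cdf(xs, v):
        return bisect.bisect_right(xs, v) / len(xs)  # fraction of xs <= v
    return max(abs(cdf(a, v) - cdf(b, v)) for v in set(a) | set(b))

# Made-up feature values at training time vs. in production
train_feature = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5]
live_feature = [0.4, 0.5, 0.5, 0.6, 0.7, 0.8]

DRIFT_THRESHOLD = 0.3  # assumed alert level; tune per feature
drifted = ks_statistic(train_feature, live_feature) > DRIFT_THRESHOLD
```

When the check fires, that is the signal to start collecting and labeling fresh data for retraining.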

4. Choosing the Right Model — “Experiment and Reproduce”

This process involves a lot of experimentation and validation, and the result must ultimately be reproducible by the DevOps team for deployment. Practice experimenting with simple models and working up in complexity; this helps you find the best balance between effectiveness and use of resources. The goal is to end up with a model that is plug and play (like an API), scalable, compatible with the production environment, and whose inputs and outputs are easy to understand.
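The "start simple, work up" loop can be sketched in a few lines. This is a hypothetical stdlib-only example on synthetic data: a majority-class baseline versus a slightly more complex learned threshold, with a fixed random seed so every run of the experiment is reproducible.

```python
# Illustrative sketch: compare a trivial baseline against a slightly more
# complex model, and only promote the complex one if it clearly wins.
# Data, model names, and numbers are assumptions for the example.
import random

random.seed(42)  # fixed seed -> the experiment reproduces exactly

# Synthetic binary-classification data: label is 1 when the feature > 0.5
data = [(x := random.random(), 1 if x > 0.5 else 0) for _ in range(1000)]

def accuracy(predict):
    return sum(predict(x) == y for x, y in data) / len(data)

# Model 1: always predict the positive class (majority-style baseline)
baseline_acc = accuracy(lambda x: 1)

# Model 2: a single learned threshold, one step up in complexity
best_threshold = max((t / 100 for t in range(100)),
                     key=lambda t: accuracy(lambda x, t=t: 1 if x > t else 0))
threshold_acc = accuracy(lambda x: 1 if x > best_threshold else 0)

# Promote the more complex model only if it beats the baseline
chosen = "threshold" if threshold_acc > baseline_acc else "baseline"
```

The same discipline scales up: each new level of complexity has to justify itself against the simpler model before it earns its extra resource cost.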

5. Determine the Type of Deployment — “Continuous Integration and Delivery”

It is important to decide on the type of deployment before setting off on the development of a machine learning pipeline, because certain software frameworks only support specific packages. The production environment needs to be cohesive with the model of choice.

MLOps addresses the challenges that arise once the model is ready to enter production. It borrows the continuous integration and continuous delivery (CI/CD) principle commonly used in DevOps. The difference is that with ML, the data and the models are continuously being updated, whereas traditional software only requires CI/CD for code.

Once the model has been trained and evaluated to deliver value to its users, the next step is deploying the pipeline so it can be continuously integrated as the data changes over time. This is where MLOps adds the machinery to continuously deliver newly trained and validated models.
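In practice, CI/CD for ML often looks like a pipeline that is triggered by changes to the data as well as the code. A hypothetical GitHub Actions fragment (file names, step names, and the accuracy gate are all assumptions for illustration):

```yaml
# Hypothetical CI workflow: retrain and validate whenever code or data change.
name: ml-ci
on:
  push:
    paths: ["src/**", "data/**"]   # data changes trigger retraining too
jobs:
  train-and-validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install -r requirements.txt
      - run: python train.py                        # retrain on updated data
      - run: python evaluate.py --min-accuracy 0.9  # gate delivery on a metric
```

The evaluation step acts as the quality gate: only models that clear the agreed metric move on toward delivery.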

6. Containerization — “Works every time, all the time”

Tools like Docker provide an isolated environment for the model and its accompanying applications to run in production, ensuring that the required packages are always available.

Each module of the pipeline can be kept in a container that keeps the environment variables consistent, reducing the number of dependencies the model needs to work correctly. When a deployment involves multiple containers, Kubernetes serves as a great MLOps tool for orchestrating them.
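A minimal sketch of what such a container might look like for a model-serving module. The file names (requirements.txt, model.pkl, serve.py) are assumptions for illustration, not from the article:

```dockerfile
# Hypothetical image for a model-serving module of the pipeline.
FROM python:3.11-slim

WORKDIR /app

# Install pinned dependencies first so this layer is cached between builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the model artifact and the serving code
COPY model.pkl serve.py ./

# The same entry point runs every time the container starts
CMD ["python", "serve.py"]
```

Because the image pins the Python version and the package list, the model sees the same environment on a laptop, a CI runner, and a production cluster.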


More businesses at the enterprise level are looking to invest in MLOps, which serves to streamline, automate, and scale their ML pipelines. The field is growing every day, and not everything could be covered in this article, but keep some of these ideas in mind so that when your team is considering its next project, you can introduce the concept of MLOps.


Introducing MLOps, by Mark Treveil, Nicolas Omont, Clément Stenac, Kenji Lefevre, Du Phan, Joachim Zentici, Adrien Lavoillotte, Makoto Miyazaki, and Lynn Heidmann (2020).
