Continuous Delivery Principles for Machine Learning
Real-world software engineering is an iterative process. One of its main objectives is to get changes of all types (new features, configuration changes, bug fixes and experiments) into production and into the hands of users safely, quickly and sustainably. Continuous Delivery (CD), a software engineering discipline, takes a principled approach to this exact problem. The core idea of CD is to create a repeatable, reliable and incrementally improving process for taking software from concept to the end user.

Like software development, building real-world machine learning (ML) algorithms is also an iterative process with a similar objective: how do I get my ML algorithms into production and into the hands of users in a safe, quick and sustainable way? In most companies, the current process of building, testing and deploying models to production is at best ad hoc.

At Indix, while building the Google of Products, we have had some good success in applying the best practices of continuous delivery to our machine learning pipelines using open source tools and frameworks. The talk will not focus on the theory of ML or on choosing the right ML algorithm, but specifically on the last-mile problem of taking models to production and the lessons learned while applying the concepts of CD to ML.

Here are some of the key questions that the talk will try to answer:

1. ML Models Repository as analogous to a Software Artifacts Repository - Similar to a software repository, what are the features of a Models Repository that aid traceability and reproducibility? Specifically, how do you manage models end to end, including model metadata, visualization and lineage?
2. ML Pipelines to orchestrate and visualize the end-to-end flow - A typical ML workflow has multiple stages. How do you model your entire workflow as a pipeline (similar to a Build Pipeline in CD) to automate the process and help visualize the end-to-end flow?
3. Model Quality Assurance - What quality gates and evaluation metrics, either manual or automated, should be used before exporting (promoting) models for serving in production? What happens when several different models are in play? How do you measure the models individually and then also in combination?
4. Serving Models in Production - How do you serve and scale these models in production? What happens when these models are heterogeneous (built using different languages such as Scala, Python etc.)?
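
As a rough illustration of questions 1 and 3 above, the sketch below shows one way a repository entry and an automated quality gate could fit together. It is not the Indix implementation: the `ModelRecord` fields, the `passes_quality_gate` function, the metric names and the threshold values are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ModelRecord:
    """Hypothetical repository entry with enough metadata to trace a model's lineage."""
    name: str
    version: str
    training_data_ref: str  # assumed: identifier of the dataset snapshot used for training
    code_revision: str      # assumed: source control revision of the training code
    metrics: Dict[str, float] = field(default_factory=dict)


def passes_quality_gate(candidate: ModelRecord, thresholds: Dict[str, float]) -> bool:
    """Return True only if every gated metric is present and meets its minimum value."""
    return all(
        candidate.metrics.get(metric, float("-inf")) >= minimum
        for metric, minimum in thresholds.items()
    )


# Illustrative values only; none of these come from a real model.
candidate = ModelRecord(
    name="classifier",
    version="1.4.0",
    training_data_ref="snapshot-001",
    code_revision="abc1234",
    metrics={"precision": 0.93, "recall": 0.88},
)
thresholds = {"precision": 0.90, "recall": 0.85}
promote = passes_quality_gate(candidate, thresholds)
```

In this sketch a metric missing from the record fails the gate, so a model that was never evaluated on a gated metric cannot be promoted by accident.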