Forem

MLOps Community

MLOps Coffee Sessions #10 Analyzing the Article “Continuous Delivery and Automation Pipelines in Machine Learning" // Part 2

Second installation David and Demetrios reviewing the google paper about Continuous training and automated pipelines. They dive deep into machine learning monitoring and also what exactly continuous training actually entails. Some key highlights are:

Automatically retraining and serving the models:
When to do it?
Outlier detection
Drift detection

Outlier detection:
What is it?
How you deal with it
Drift detection
Individual features may start to drift. This could be a bug or it could be perfectly normal behavior that indicates that the world has changed requiring the model to be retrained.

Example changes:
shifts in people’s preferences
marketing campaigns
competitor moves
the weather
the news cycle
Locations
Time
Devices (clients)

If the world you're working with is changing over time, model deployment should be treated as a continuous process. What this tells me is that you should keep the data scientists and engineers working on the model instead of immediately moving to another project.

Deeper dive into concept drift
Feature/target distributions change

An overview of concept drift applications: “.. data analysis applications, data evolve over time and must be analyzed in near real time. Patterns and relations in such data often evolve over time, thus, models built for analyzing such data quickly become obsolete over time. In machine learning and data mining this phenomenon is referred to as concept drift.”
https://www.win.tue.nl/~mpechen/publications/pubs/CD_applications15.pdf
https://www-ai.cs.tu-dortmund.de/LEHRE/FACHPROJEKT/SS12/paper/concept-drift/tsymbal2004.pdf

Types of concept drift:
Sudden
Gradual
Google in some way is trying to address this concern - the world is changing and you want your ML system to change as well so it can avoid decreased performance but also improve over time and adapt to its environment. This sort of robustness is necessary for certain domains.
Continuous delivery and automation of pipelines (data, training, prediction service) was built with this in mind. Minimizing the commit-to-deploy interval and maximize the velocity software delivery and its components: maintainability, extensibility, and testability
Then the pipeline is ready, you can now run it. So you can do this continuously. After the pipeline is deployed to the production environment, it will be executed automatically and repetitively to produce a trained model that is stored in a central model registry.
This pipeline should be able to be run on a schedule or based on triggers: certain events that you have configured to your business domain - new data or drop in performance from the prod model.
The link between the model artifact and the pipeline is never severed. What pipeline trained them? What data was extracted, validated and how was it prepared? What was the training configuration and how was it evaluated? Etc. metrics are key here! Lineage tracking!!!
Keeping a close tie between the dev/experiment pipeline and the continuous production pipeline helps avoid inconsistencies between model artifacts produced by the pipeline and models beings served - hard to debug

Join our slack community: https://join.slack.com/t/mlops-community/shared_invite/zt-391hcpnl-aSwNf_X5RyYSh40MiRe9Lw
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register

Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Cris Sterry on LinkedIn: https://www.linkedin.com/in/chrissterry/

Episode source