ML Infra Best Practices: Monitoring Overview

One of the first questions to answer before deploying a model is: what is the business risk of deploying it?

Beyond the obvious checks on the health of the model, there are several aspects of a model's performance that you will need to monitor in a production setup.

Operational Health

This involves monitoring the basic health of the model: uptime, latency, and other service metrics. With respect to traffic, other things to monitor include changes in the request rate. There are many…