24. Learning from monitoring more than 30 ML use-cases, Lina Weichbrodt

Software monitoring basics
Detect errors as soon as possible and prioritize them by severity
4 signals to watch (see the sketch below):
latency
traffic
errors
saturation
Focus on symptoms for end-users
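A minimal sketch of exposing these four signals from a Python inference service, assuming the prometheus_client library; the metric names and the predict() stub are illustrative, not from the talk.

```python
# Sketch: exposing latency, traffic, errors, and saturation from an inference service.
import time
from prometheus_client import Counter, Gauge, Histogram, start_http_server

REQUESTS = Counter("inference_requests_total", "Traffic: total prediction requests")
ERRORS = Counter("inference_errors_total", "Errors: failed prediction requests")
LATENCY = Histogram("inference_latency_seconds", "Latency: seconds spent per prediction")
IN_FLIGHT = Gauge("inference_in_flight", "Saturation proxy: requests currently being served")

def predict(features):
    return sum(features)  # stand-in for the real model

def handle_request(features):
    REQUESTS.inc()
    IN_FLIGHT.inc()
    start = time.perf_counter()
    try:
        return predict(features)
    except Exception:
        ERRORS.inc()
        raise
    finally:
        LATENCY.observe(time.perf_counter() - start)
        IN_FLIGHT.dec()

if __name__ == "__main__":
    start_http_server(8000)  # metrics exposed at http://localhost:8000/metrics
    handle_request([1.0, 2.0, 3.0])  # a real service would keep the process alive
```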
Higher complexity in ML systems than in traditional ones
Monitoring motivation: silent failures can have a huge commercial impact
Unit changes
Data loss in calls to an external service
Aggressive post-processing filters plus seasonality drift made data scarcer
Symptoms-based monitoring: focusing on output first
Priority 1
Data scientists often ask "can I monitor the evaluation metric in prod?" It depends on the latency of the target feedback loop.
Focus on what stakeholders want to avoid
An ML app needs trust: monitoring helps reassure business and product stakeholders
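A possible shape for such a symptom-first check on the output; the concrete symptom (share of empty fallback responses) and the threshold are assumed examples, not from the talk.

```python
# Sketch: alert on a symptom stakeholders want to avoid, computed over recent output.
from collections import deque

WINDOW = deque(maxlen=1000)   # outcome of the last 1000 responses
MAX_EMPTY_SHARE = 0.05        # illustrative threshold

def record_response(recommendations):
    WINDOW.append(len(recommendations) == 0)

def empty_share():
    return sum(WINDOW) / len(WINDOW) if WINDOW else 0.0

def check_symptom():
    share = empty_share()
    if share > MAX_EMPTY_SHARE:
        # in production this would raise an alert or page someone instead of printing
        print(f"ALERT: {share:.1%} of recent responses are empty fallbacks")
```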
Priority 2
A lot of ML monitoring is done with evaluation metrics
Evaluation metrics are often not available in real time (recall, F-beta score, etc.)
Monitoring metrics focus on detecting problems, not on evaluating data quality, and are easier to implement than evaluation metrics
Measure the response distribution and use simple heuristics or statistical tests: the Kolmogorov-Smirnov two-sample test, the D1 distance, or the population stability index (sketch below)
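A minimal sketch of two of these checks on a model's output scores, assuming scipy and numpy; the bin count, thresholds, and synthetic score data are illustrative assumptions.

```python
# Sketch: compare today's response distribution against a reference window.
import numpy as np
from scipy.stats import ks_2samp

def psi(reference, current, bins=10, eps=1e-6):
    """Population stability index between two samples of model outputs."""
    edges = np.histogram_bin_edges(reference, bins=bins)
    ref_pct = np.histogram(reference, bins=edges)[0] / len(reference) + eps
    cur_pct = np.histogram(current, bins=edges)[0] / len(current) + eps
    return float(np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct)))

# Synthetic stand-ins for last week's and today's prediction scores.
reference_scores = np.random.default_rng(0).beta(2, 5, size=10_000)
current_scores = np.random.default_rng(1).beta(2, 4, size=10_000)

result = ks_2samp(reference_scores, current_scores)
print(f"KS statistic={result.statistic:.3f}, p-value={result.pvalue:.3g}, "
      f"PSI={psi(reference_scores, current_scores):.3f}")
# Common heuristic: PSI above ~0.2 or a very small KS p-value => investigate the shift.
```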
Priority 3
Useful for understanding how inputs change
Do I need an MLOps observability tool?
dedicated monitoring and explainability tools such as Aporia or Superwise
monitoring as part of a full-featured platform like Seldon or SageMaker
add logging to the inference service, display the logs on a dashboard, and create alerts (see the sketch below)
you often don’t need a new tool
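A minimal sketch of the "add logging" option, assuming only Python's standard library; the logged fields and the predict() stub are illustrative.

```python
# Sketch: structured JSON logging in the inference path, reusable by the existing log stack.
import json
import logging
import time
import uuid

logger = logging.getLogger("inference")
logging.basicConfig(level=logging.INFO, format="%(message)s")

def predict(features):
    return 0.42  # stand-in for the real model

def predict_and_log(features):
    start = time.perf_counter()
    prediction = predict(features)
    logger.info(json.dumps({
        "request_id": str(uuid.uuid4()),
        "ts": time.time(),
        "latency_ms": round((time.perf_counter() - start) * 1000, 2),
        "prediction": prediction,
        "n_features": len(features),
    }))
    return prediction

# These JSON lines can be shipped to an existing log stack (e.g. ELK, CloudWatch,
# Grafana Loki) to build dashboards and alerts without adopting a new MLOps tool.
predict_and_log([1.0, 2.0, 3.0])
```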