DataRobot Model Monitoring Skill

This skill provides comprehensive guidance for monitoring deployed models, tracking performance metrics, detecting data drift, and managing model health.

Quick Start

Most common use case: Check deployment health and data drift

Check service stats: deployment.get_service_stats(...) to review prediction volume/latency
Check drift: deployment.get_feature_drift(...) / deployment.get_target_drift(...)
Compare over time: Use deployment.get_service_stats_over_time(...) and repeat the drift calls over successive time windows to assess trends

Example: "Check the health of deployment abc123 and report any data drift issues"

When to use this skill

Use this skill when you need to:

Monitor model performance in production
Track data drift and feature drift
Detect prediction anomalies
Monitor prediction accuracy over time
Set up alerts for model degradation
Analyze model health metrics
Compare production performance to training performance

Key capabilities

1. Performance Monitoring

Track prediction accuracy and metrics over time
Compare production metrics to training metrics
Monitor prediction volume and latency
Identify performance degradation trends

2. Data Drift Detection

Detect changes in feature distributions
Identify feature drift (statistical changes)
Monitor target drift (if actuals available)
Alert on significant drift events

3. Prediction Monitoring

Monitor prediction distributions
Detect prediction anomalies
Track prediction confidence scores
Identify unusual prediction patterns

4. Health Management

Assess overall model health
Generate monitoring reports
Set up automated alerts
Manage model retraining triggers

Workflow examples

Example 1: Check model health and drift

User request: "Check the health of deployment abc123 and report any data drift issues."

Agent workflow:

Get deployment monitoring status
Retrieve recent performance metrics
Check for data drift in key features
Compare current metrics to baseline (training)
Identify any significant drift or degradation
Report findings with recommendations
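
The last three steps of this workflow can be sketched as a small, SDK-independent helper. This is a minimal illustration, not part of the DataRobot SDK: the function name summarize_findings, its parameters, and the 5% relative-drop cutoff are assumptions made for the example. It expects drift scores already extracted into a plain dict (for example from the name and drift_score fields of the feature drift results) and a higher-is-better metric such as AUC.

```python
from typing import Dict, List


def summarize_findings(
    drift_scores: Dict[str, float],
    baseline_metric: float,
    production_metric: float,
    drift_threshold: float = 0.2,
    max_relative_drop: float = 0.05,  # hypothetical cutoff, tune per use case
) -> List[str]:
    """Return human-readable findings for a deployment health report.

    Assumes a higher-is-better metric such as AUC or accuracy.
    """
    findings = []

    # Features whose drift score exceeds the threshold, worst first.
    drifted = sorted(
        ((name, score) for name, score in drift_scores.items() if score > drift_threshold),
        key=lambda item: item[1],
        reverse=True,
    )
    for name, score in drifted:
        findings.append(
            f"Feature '{name}' drift score {score:.2f} exceeds {drift_threshold:.2f}"
        )

    # Relative drop of the production metric against the training baseline.
    if baseline_metric > 0:
        drop = (baseline_metric - production_metric) / baseline_metric
        if drop > max_relative_drop:
            findings.append(f"Metric dropped {drop:.1%} relative to the training baseline")

    return findings or ["No significant drift or degradation detected"]
```

Keeping the reporting logic separate from the SDK calls makes it easy to unit test and to reuse with metrics from other sources.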

Example 2: Set up drift monitoring alerts

User request: "Set up alerts for deployment xyz789 to notify when feature drift exceeds 0.2."

Agent workflow:

Get deployment configuration
Configure drift threshold (0.2)
Set up alert notifications
Specify which features to monitor
Test alert configuration
Confirm monitoring is active
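
As a rough illustration of the threshold and feature-selection steps, the sketch below performs the check on the client side. It does not configure DataRobot's own notification settings; check_drift_alerts, monitored_features, and the notify callback are hypothetical names introduced for this example, and the drift scores are assumed to be available as a plain dict.

```python
from typing import Callable, Dict, Iterable, List, Optional


def check_drift_alerts(
    drift_scores: Dict[str, float],
    threshold: float = 0.2,
    monitored_features: Optional[Iterable[str]] = None,
    notify: Callable[[str], None] = print,  # hypothetical notification hook
) -> List[str]:
    """Notify for each monitored feature whose drift score exceeds the threshold."""
    # Monitor every feature unless an explicit subset is given.
    monitored = (
        set(monitored_features) if monitored_features is not None else set(drift_scores)
    )
    breached = sorted(
        name
        for name, score in drift_scores.items()
        if name in monitored and score > threshold
    )
    for name in breached:
        notify(
            f"ALERT: drift for '{name}' is {drift_scores[name]:.2f} "
            f"(threshold {threshold:.2f})"
        )
    return breached
```

Passing a list's append method as notify is one simple way to test the alert configuration before wiring in a real channel such as email or chat.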

Using DataRobot SDK

This skill guides you to use the DataRobot Python SDK directly. Install the SDK if needed:

pip install datarobot

Key SDK Operations

Use these DataRobot SDK and MLOps API methods for monitoring:

Deployment Monitoring:

deployment.get_service_stats(...) - Get service statistics (latency, volume, etc.)
deployment.get_feature_drift(...) - Get feature drift metrics (returns FeatureDrift objects)
deployment.get_target_drift(...) - Get target drift metrics (returns TargetDrift)
deployment.get_prediction_results(...) - Retrieve recorded prediction results (if enabled)

Model Performance:

model.metrics - Model performance metrics by partition (an attribute on the Model object)
model.get_roc_curve(source) - Get the ROC curve for a data partition such as "validation", for comparison

Note: Some monitoring features may require the DataRobot MLOps API. See the Common Patterns section below for examples.

Best practices

Regular monitoring: Check model health regularly, not just when issues arise
Baseline comparison: Always compare production metrics to training baseline
Drift thresholds: Set appropriate drift thresholds based on your domain
Key features: Focus monitoring on high-importance features
Automated alerts: Set up alerts for critical issues
Historical analysis: Track trends over time, not just point-in-time metrics
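
For historical analysis, one approach is to query the same metric over consecutive time windows and compare the results. The helper below is a sketch under assumptions: hourly_windows is a hypothetical name, and aligning the boundaries to the top of the hour reflects the expectation that the monitoring API works on hourly boundaries, which should be verified against your SDK version.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple


def hourly_windows(
    end: datetime, window_days: int, count: int
) -> List[Tuple[datetime, datetime]]:
    """Return `count` consecutive (start, end) windows, oldest first.

    All boundaries are aligned to the top of the hour.
    """
    aligned_end = end.replace(minute=0, second=0, microsecond=0)
    step = timedelta(days=window_days)
    windows = []
    for i in range(count, 0, -1):
        windows.append((aligned_end - i * step, aligned_end - (i - 1) * step))
    return windows


# Four weekly windows ending at the current hour (UTC).
windows = hourly_windows(datetime.now(timezone.utc), window_days=7, count=4)
```

Each (start, end) pair can then be passed as start_time and end_time to calls such as deployment.get_feature_drift(...) to build a trend rather than a single snapshot.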

Common patterns

Pattern 1: Health check

import datarobot as dr
import os

# Initialize client
client = dr.Client(
    token=os.getenv("DATAROBOT_API_TOKEN"),
    endpoint=os.getenv("DATAROBOT_ENDPOINT")
)

# Get deployment
deployment = dr.Deployment.get("abc123")

# Get service stats (requires MLOps monitoring to be enabled)
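stats = deployment.get_service_stats()
# ServiceStats exposes its values through the `metrics` dict (keys use the API's camelCase names)
print(f"Prediction count: {stats.metrics.get('totalPredictions')}")
print(f"Response time (ms): {stats.metrics.get('responseTime')}")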

# Get recorded prediction results (if available / enabled)
try:
    recent = deployment.get_prediction_results(limit=10)
    print(f"Recent prediction results: {len(recent)}")
except Exception as e:
    print(f"Prediction results not available: {e}")

Pattern 2: Drift detection

import datarobot as dr

# Get deployment
deployment = dr.Deployment.get("abc123")

# Get feature drift (requires MLOps monitoring)
try:
    drifts = deployment.get_feature_drift()
    high = [d for d in drifts if (d.drift_score or 0) > 0.2]
    print(f"Features with drift_score > 0.2: {len(high)}")
    for d in high[:10]:
        print(f"{d.name}: {d.drift_score}")
except Exception as e:
    print(f"Feature drift requires MLOps monitoring: {e}")

Monitoring metrics

Performance Metrics

Accuracy: Prediction accuracy (classification)
RMSE/MAE: Prediction error (regression)
AUC: Model discrimination (classification)
Prediction volume: Number of predictions made

Drift Metrics

Feature drift: Statistical changes in feature distributions
Target drift: Changes in target distribution (if available)
Prediction drift: Changes in prediction distributions
Drift score: Overall drift severity (0-1 scale)

Alert thresholds

Recommended thresholds:

High drift: > 0.3 (significant changes, investigate immediately)
Medium drift: 0.15-0.3 (moderate changes, monitor closely)
Low drift: < 0.15 (minor changes, normal variation)

Adjust thresholds based on your domain and use case sensitivity.
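
The bands above can be encoded as a small classifier. This is a minimal sketch: drift_severity is a hypothetical helper name, and treating the boundary values 0.15 and 0.3 as "medium" is an assumption, since the recommended ranges do not say which band the endpoints belong to.

```python
def drift_severity(score: float, medium: float = 0.15, high: float = 0.3) -> str:
    """Map a drift score to the recommended severity band."""
    if score > high:
        return "high"  # significant changes, investigate immediately
    if score >= medium:
        return "medium"  # moderate changes, monitor closely
    return "low"  # minor changes, normal variation
```

Exposing the cutoffs as parameters keeps domain-specific adjustments in one place.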

Model health status

Healthy: Performance within expected range, minimal drift
Degrading: Performance declining, some drift detected
Unhealthy: Significant performance issues or high drift
Unknown: Insufficient data for assessment
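
One possible way to derive these statuses from monitoring data is sketched below. The function health_status, the minimum of 100 predictions, and the metric-drop cutoffs (2% and 10%) are illustrative assumptions, not DataRobot definitions; the drift cutoffs reuse the recommended 0.15 and 0.3 thresholds.

```python
from typing import Optional


def health_status(
    prediction_count: int,
    max_drift: Optional[float],
    relative_metric_drop: Optional[float],
    min_predictions: int = 100,  # hypothetical minimum volume
) -> str:
    """Classify model health from prediction volume, drift, and metric degradation."""
    # Without enough predictions or any drift measurement, no assessment is possible.
    if prediction_count < min_predictions or max_drift is None:
        return "Unknown"

    # Treat a missing performance comparison as "no measured drop".
    drop = relative_metric_drop or 0.0

    if max_drift > 0.3 or drop > 0.10:
        return "Unhealthy"
    if max_drift >= 0.15 or drop > 0.02:
        return "Degrading"
    return "Healthy"
```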

Error handling

Common errors and solutions:

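Insufficient data: Monitoring needs a minimum prediction volume; wait for more predictions or widen the time window
Baseline unavailable: Drift is measured against the training data, so ensure the training baseline is available for the deployed model
Access issues: Verify the API token and your permissions on the deployment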
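
A hedged sketch of how these cases might be separated in code follows. The wrapper try_monitoring_call is a hypothetical helper; it reads a status_code attribute from the exception, which the SDK's client errors carry, and the mapping from HTTP status to cause is an assumption to adapt to the errors you actually observe.

```python
from typing import Any, Callable, Optional, Tuple


def try_monitoring_call(call: Callable[[], Any]) -> Tuple[Optional[Any], Optional[str]]:
    """Run a monitoring call and return (result, error_hint)."""
    try:
        return call(), None
    except Exception as exc:
        # Exceptions without a status_code fall through to the generic hint.
        status = getattr(exc, "status_code", None)
        if status in (401, 403):
            hint = "Access issue: verify the API token and deployment permissions"
        elif status == 404:
            hint = "Not found: check the deployment ID"
        elif status == 422:
            hint = "Request rejected: monitoring data or the training baseline may be missing"
        else:
            hint = "Unexpected error"
        return None, f"{hint} ({exc})"
```

A call such as try_monitoring_call(lambda: deployment.get_feature_drift()) then yields either the drift results or a message that points at the likely cause.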

SDK Setup

Install DataRobot SDK

pip install datarobot

Initialize Client

import datarobot as dr
import os

client = dr.Client(
    token=os.getenv("DATAROBOT_API_TOKEN"),
    endpoint=os.getenv("DATAROBOT_ENDPOINT", "https://app.datarobot.com/api/v2")
)

Note: Some monitoring features require DataRobot MLOps API access. Check your DataRobot plan for MLOps availability.
