Uncertainty Profiler

Gaussian Processes provide predictive uncertainty as a first-class output — but raw variance values are hard to interpret. The UncertaintyProfiler turns them into actionable diagnostics: it classifies regions as interpolation vs. extrapolation, detects when confidence intervals are too wide or too narrow, and can recalibrate the uncertainty scale against held-out validation data.

Initialization

import numpy as np
import GPy
import gpclarity

X = np.linspace(0, 10, 50).reshape(-1, 1)
y = np.sin(X).flatten() + 0.1 * np.random.randn(50)
model = GPy.models.GPRegression(X, y[:, None], GPy.kern.RBF(1))
model.optimize()

profiler = gpclarity.UncertaintyProfiler(model, X_train=X)

Pass X_train so the profiler can distinguish interpolation from extrapolation. The optional config argument (an UncertaintyConfig object) adjusts thresholds such as the high-uncertainty percentile cutoff.

Prediction with Intervals

X_test = np.linspace(-2, 12, 200).reshape(-1, 1)
result = profiler.predict(X_test)

print(result.mean.shape)       # (200, 1)
print(result.variance.shape)   # (200, 1)

# 2-sigma confidence interval
lower, upper = result.get_interval(2.0)

predict() returns a PredictionResult dataclass. get_interval(sigma) computes mean ± sigma * std, where std is the square root of the predictive variance.
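The interval arithmetic can be reproduced by hand, which helps when checking results. The sketch below is a standalone NumPy illustration, not gpclarity's implementation; the helper name interval_from_moments is hypothetical.

```python
import numpy as np

def interval_from_moments(mean, variance, sigma=2.0):
    """Return (lower, upper) as mean -/+ sigma * std (hypothetical helper)."""
    mean = np.asarray(mean, dtype=float)
    # Clip tiny negative variances that can arise from floating-point round-off.
    std = np.sqrt(np.clip(np.asarray(variance, dtype=float), 0.0, None))
    return mean - sigma * std, mean + sigma * std

# Two predictions with standard deviations 0.5 and 1.0
mean = np.array([[0.0], [1.0]])
variance = np.array([[0.25], [1.0]])
lower, upper = interval_from_moments(mean, variance, sigma=2.0)
print(lower.ravel(), upper.ravel())
```

The output arrays keep the (n, 1) shape of the inputs, matching the shapes printed above.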

Diagnostics

compute_diagnostics() returns a plain dictionary summarising uncertainty across the test set:

diag = profiler.compute_diagnostics(X_test)

print(diag["mean_uncertainty"])        # average predictive variance
print(diag["max_uncertainty"])         # peak variance
print(diag["uncertainty_std"])         # spread across the test set
print(diag["high_uncertainty_ratio"])  # fraction above the 90th percentile
print(diag["n_extrapolation_points"])  # points outside the training hull
print(diag["coefficient_of_variation"])  # std / mean, a scale-free spread metric

A coefficient_of_variation between 0.1 and 10.0 is considered well-calibrated. Values outside that range suggest the uncertainty scale may need adjustment.
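To make these statistics concrete, here is a rough NumPy sketch of how such a summary could be derived from an array of predictive variances. It is an illustration under assumptions, not the library's code: the function name summarize_variance is hypothetical, and it assumes the population standard deviation and a strict "greater than" comparison against the percentile threshold.

```python
import numpy as np

def summarize_variance(variance, percentile=90.0):
    """Summary statistics over predictive variances (hypothetical helper)."""
    v = np.asarray(variance, dtype=float).ravel()
    mean, std = v.mean(), v.std()  # population std (ddof=0), an assumption
    threshold = np.percentile(v, percentile)
    return {
        "mean_uncertainty": float(mean),
        "max_uncertainty": float(v.max()),
        "uncertainty_std": float(std),
        # Fraction of points strictly above the percentile threshold
        "high_uncertainty_ratio": float(np.mean(v > threshold)),
        # Undefined when the mean variance is zero
        "coefficient_of_variation": float(std / mean) if mean > 0 else float("nan"),
    }

stats = summarize_variance([1.0, 1.0, 1.0, 1.0, 5.0])
print(stats["coefficient_of_variation"])
```

In this toy input, one large variance among four small ones gives a coefficient of variation just under 1, inside the 0.1 to 10.0 range quoted above.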

Region Classification

classify_regions() assigns each test point one of four labels from the UncertaintyRegion enum:

labels = profiler.classify_regions(X_test)
# Each label is one of:
# UncertaintyRegion.INTERPOLATION  — inside the training hull, low uncertainty
# UncertaintyRegion.BOUNDARY       — near the edge of the training data
# UncertaintyRegion.EXTRAPOLATION  — outside the training hull
# UncertaintyRegion.STRUCTURAL     — high uncertainty despite dense training data
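For intuition, the four labels can be mimicked for 1-D inputs with a simple rule set. The sketch below is not how gpclarity classifies points: the Region enum is a stand-in for UncertaintyRegion, and the margin_frac and var_threshold parameters are assumptions chosen for illustration.

```python
from enum import Enum
import numpy as np

class Region(Enum):  # hypothetical stand-in for UncertaintyRegion
    INTERPOLATION = "interpolation"
    BOUNDARY = "boundary"
    EXTRAPOLATION = "extrapolation"
    STRUCTURAL = "structural"

def classify_1d(x_test, x_train, variance, var_threshold, margin_frac=0.05):
    """Label 1-D test points using the training range and a variance cutoff."""
    x_test = np.asarray(x_test, dtype=float).ravel()
    x_train = np.asarray(x_train, dtype=float).ravel()
    variance = np.asarray(variance, dtype=float).ravel()
    lo, hi = x_train.min(), x_train.max()
    margin = margin_frac * (hi - lo)  # width of the band treated as "boundary"
    labels = []
    for x, v in zip(x_test, variance):
        if x < lo or x > hi:
            labels.append(Region.EXTRAPOLATION)   # outside the training range
        elif x - lo < margin or hi - x < margin:
            labels.append(Region.BOUNDARY)        # close to an edge of the data
        elif v > var_threshold:
            labels.append(Region.STRUCTURAL)      # interior but still uncertain
        else:
            labels.append(Region.INTERPOLATION)   # interior and confident
    return labels

x_train = np.linspace(0, 10, 50)
x_test = np.array([-1.0, 0.2, 5.0, 6.0, 11.0])
variance = np.array([0.9, 0.1, 0.01, 0.5, 0.95])
labels = classify_1d(x_test, x_train, variance, var_threshold=0.3)
print([label.name for label in labels])
```

In one dimension the training hull reduces to the interval between the smallest and largest training input; higher-dimensional inputs would need a proper hull or distance-based test.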

identify_uncertainty_regions() returns the actual points in each category:

regions = profiler.identify_uncertainty_regions(X_test, threshold_percentile=90)
print(regions["high_uncertainty_points"]["points"])
print(regions["low_uncertainty_points"]["points"])
print(regions["threshold"])   # variance threshold used
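The percentile split itself is straightforward to reproduce. The following is a minimal sketch, assuming the threshold is the given percentile of the test-set variances and that points strictly above it count as high-uncertainty; split_by_percentile and its return keys are hypothetical and simpler than the nested dict shown above.

```python
import numpy as np

def split_by_percentile(x, variance, threshold_percentile=90.0):
    """Split points into high/low-variance groups (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    v = np.asarray(variance, dtype=float).ravel()
    threshold = np.percentile(v, threshold_percentile)
    high = v > threshold  # boolean mask over test points
    return {"high": x[high], "low": x[~high], "threshold": float(threshold)}

x = np.arange(10, dtype=float).reshape(-1, 1)
variance = np.arange(10, dtype=float)  # variances 0, 1, ..., 9
parts = split_by_percentile(x, variance, threshold_percentile=90)
print(parts["threshold"], parts["high"].shape, parts["low"].shape)
```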

Uncertainty Calibration

If you have held-out validation data with known targets, calibrate_uncertainty() finds the scalar multiplier that brings the model’s uncertainty in line with observed prediction errors:

X_val = np.linspace(0, 10, 20).reshape(-1, 1)
y_val = np.sin(X_val).flatten() + 0.1 * np.random.randn(20)

cal = profiler.calibrate_uncertainty(X_val, y_val)
print(cal["optimal_scale"])    # multiply raw std by this factor
print(cal["coverage_before"])  # empirical 95% coverage before calibration
print(cal["coverage_after"])   # coverage after applying optimal_scale

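After calibration, coverage_after should be close to 0.95 for 2-sigma intervals. A model that is already well calibrated has coverage_before near 0.95 and optimal_scale near 1. If optimal_scale >> 1, the model is overconfident (its intervals are too narrow); if optimal_scale << 1, it is underconfident (its intervals are too wide).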
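One simple way to obtain such a scalar is to match an empirical quantile of the standardized residuals. The sketch below shows that approach as an assumption; gpclarity may use a different procedure, and the function name fit_std_scale is hypothetical.

```python
import numpy as np

def fit_std_scale(y_true, mean, variance, sigma=2.0, target=0.95):
    """Find a multiplier for std so sigma-intervals reach the target coverage."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    mean = np.asarray(mean, dtype=float).ravel()
    std = np.sqrt(np.asarray(variance, dtype=float).ravel())  # assumes std > 0
    z = np.abs(y_true - mean) / std  # standardized absolute residuals
    coverage_before = float(np.mean(z <= sigma))
    # Scale std so that the target quantile of z lands exactly at sigma.
    scale = float(np.quantile(z, target) / sigma)
    coverage_after = float(np.mean(z <= sigma * scale))
    return {
        "optimal_scale": scale,
        "coverage_before": coverage_before,
        "coverage_after": coverage_after,
    }

# Synthetic overconfident model: true noise std is 1.0, predicted std is 0.5
rng = np.random.default_rng(0)
y_obs = rng.normal(0.0, 1.0, size=2000)
cal_demo = fit_std_scale(y_obs, mean=np.zeros(2000), variance=np.full(2000, 0.25))
print(cal_demo)
```

Because the predicted standard deviation here is half the true noise level, the fitted scale comes out well above 1, which is the overconfident case.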

Visualization

profiler.plot(
    X_test,
    X_train=X, y_train=y,
    confidence_level=2.0,    # number of sigma for shaded band
    show_regions=True,       # colour-code extrapolation regions
)

The plot shows the posterior mean, confidence band, training data, and (optionally) a colour overlay for extrapolation regions.

Full Summary

get_summary() combines diagnostics, region classification, and recommendations into a single dict:

summary = profiler.get_summary(X_test)

print(summary["mean_uncertainty"])
print(summary["n_extrapolation_points"])

for rec in summary["recommendations"]:
    print("-", rec)

The recommendations list contains actionable strings such as “High extrapolation ratio — restrict predictions to the training domain” or “Model is overconfident — consider calibrating uncertainty scale”.
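Recommendations of this kind are typically produced by simple threshold rules over the diagnostics. The sketch below illustrates the idea only; the function name recommend, its parameters, and both default thresholds are assumptions, not values taken from gpclarity, and the message strings are modeled on the examples above.

```python
def recommend(extrapolation_ratio, optimal_scale,
              max_extrapolation_ratio=0.2, scale_tolerance=1.5):
    """Turn two diagnostics into advice strings (hypothetical rules)."""
    recs = []
    # Too many test points fall outside the training domain.
    if extrapolation_ratio > max_extrapolation_ratio:
        recs.append("High extrapolation ratio: restrict predictions to the training domain")
    # Raw std must be inflated noticeably to reach target coverage.
    if optimal_scale > scale_tolerance:
        recs.append("Model is overconfident: consider calibrating uncertainty scale")
    return recs

recs_flagged = recommend(extrapolation_ratio=0.3, optimal_scale=2.0)
recs_clean = recommend(extrapolation_ratio=0.05, optimal_scale=1.0)
for rec in recs_flagged:
    print("-", rec)
```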