Now: sampled replay on native traces is live
Experiments
The experiments control plane now replays sampled native traces against a candidate prompt version, compares output and latency, and auto-promotes the candidate when the checks pass.
Judging: automatic regression checks use output similarity and latency thresholds
Release: passing candidates auto-promote; failing ones are blocked
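
The judging and release rules above can be sketched roughly as follows. This is a minimal illustration, not the control plane's actual implementation. The names `ReplayResult`, `passes`, and `release_decision`, the threshold values, the use of `difflib` as the similarity metric, and the choice of the recorded trace as the comparison baseline are all assumptions made for the example.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

# Hypothetical thresholds: the real values and similarity metric are not
# stated in this note.
MIN_SIMILARITY = 0.8      # minimum output similarity, in [0, 1]
MAX_LATENCY_RATIO = 1.2   # candidate may be at most 20% slower than baseline


@dataclass
class ReplayResult:
    """One sampled trace replayed against the candidate prompt version."""
    baseline_output: str        # assumed: output recorded in the native trace
    candidate_output: str       # output produced by the candidate on replay
    baseline_latency_ms: float
    candidate_latency_ms: float


def passes(result: ReplayResult) -> bool:
    """Regression check for one trace: similarity and latency must both hold."""
    similarity = SequenceMatcher(
        None, result.baseline_output, result.candidate_output
    ).ratio()
    latency_ok = (
        result.candidate_latency_ms
        <= result.baseline_latency_ms * MAX_LATENCY_RATIO
    )
    return similarity >= MIN_SIMILARITY and latency_ok


def release_decision(results: list[ReplayResult]) -> str:
    """Return "promote" only if every replayed trace passes, else "blocked"."""
    if not results:
        # Assumption: with no evidence, the candidate is not promoted.
        return "blocked"
    return "promote" if all(passes(r) for r in results) else "blocked"
```

In this sketch a single failing trace blocks the release. A real gate might instead tolerate a small failure rate across the sample; the note does not say which policy is used.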