Anomaly & Waste Detection
Five detection families. One Remediation catalog. CloudPi catches billing anomalies, budget burn, idle waste, environment mismatch, and rightsizing opportunities, with automation trust earned one rung at a time.
The problem
A misconfigured auto-scaling group. A forgotten data transfer pipeline. A test workload left running after a demo. None trigger monitoring alerts. They add $500/day to the bill until someone notices.
Meanwhile, non-prod environments run 24/7 for teams that work 8 hours. Dev SKUs drift to prod-tier sizes. Unattached disks pile up sprint after sprint.
How CloudPi fixes it
Five detection families, each on its natural cadence:
| Family | Cadence | What it catches |
|---|---|---|
| Billing anomalies | Daily | Spikes at resource / service / region / project level |
| Budget policies | Daily | Budget burn vs. threshold (Review at 70%, Escalation at 90%) |
| Waste cleanup | 7 / 14 days | Unattached disks, stopped VMs, orphaned public IPs |
| Environment mismatch | 14 / 30 days | Dev/QA resources running prod-tier SKUs |
| Rightsizing | 15 / 30 days | Over-provisioned compute, idle databases, nodes under 10% CPU |
Short cadences catch fast-moving money. Long cadences catch slow-forming drift.
Every finding fires into the same Remediation catalog - a safe-actions inventory the owner can inspect, tune, and (when ready) automate.
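As an illustration of the budget-policy row in the table, the 70% and 90% thresholds could be applied as in the Python sketch below. The function name, the inclusive comparisons, and the spend/budget inputs are assumptions for illustration, not CloudPi's actual API.

```python
from typing import Optional

# Thresholds come from the table; everything else here is hypothetical.
REVIEW_THRESHOLD = 0.70
ESCALATION_THRESHOLD = 0.90


def classify_budget_burn(spend: float, budget: float) -> Optional[str]:
    """Return 'Escalation', 'Review', or None for spend-to-date vs. budget."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    burn = spend / budget
    # Check the higher threshold first so 90%+ is never reported as Review.
    if burn >= ESCALATION_THRESHOLD:
        return "Escalation"
    if burn >= REVIEW_THRESHOLD:
        return "Review"
    return None
```

For example, under these assumptions a project that has spent 7,500 of a 10,000 budget would be classified as Review, and one at 9,500 as Escalation.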
The maturity ladder
The owner moves each policy family up or down independently, per environment:
- Rung 1 - Ticket-only (crawl). Every finding opens an ADO ticket. Nothing runs without approval.
- Rung 2 - Gated (walk). Fix pre-staged for 1-click approval. If there is no response within the SLA, the finding escalates.
- Rung 3 - Auto-save (run). Policy fires, CloudPi executes, audit entry written, and the saving shows on the dashboard the next day.
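A minimal sketch of how the three rungs might route a finding. The `plan_actions` helper, the action names, and the sample settings are hypothetical and not part of CloudPi's API. Rungs are stored per (policy family, environment) pair, mirroring the independent per-environment control described above.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Rung(Enum):
    TICKET_ONLY = 1  # crawl
    GATED = 2        # walk
    AUTO_SAVE = 3    # run


# Hypothetical settings: one rung per (policy family, environment).
RUNGS: Dict[Tuple[str, str], Rung] = {
    ("waste_cleanup", "dev"): Rung.AUTO_SAVE,
    ("waste_cleanup", "prod"): Rung.TICKET_ONLY,
    ("rightsizing", "dev"): Rung.GATED,
}


def plan_actions(family: str, env: str, approved: bool = False,
                 sla_expired: bool = False) -> List[str]:
    """Return the illustrative steps for a finding at its configured rung."""
    # Assumption: unknown pairs fall back to the most conservative rung.
    rung = RUNGS.get((family, env), Rung.TICKET_ONLY)
    if rung is Rung.TICKET_ONLY:
        return ["open_ticket"]
    if rung is Rung.GATED:
        if approved:
            # Assumption: approved gated fixes are audited like auto-saves.
            return ["execute_fix", "write_audit_entry"]
        if sla_expired:
            return ["escalate"]
        return ["stage_fix", "request_approval"]
    # Rung.AUTO_SAVE: execute immediately and record the action.
    return ["execute_fix", "write_audit_entry"]
```

Keeping the rung in a lookup table rather than in the policy itself is one way to let an owner promote or demote a single family in a single environment without touching the others.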
Safety nets
- Grace periods + snapshots. 7-day grace for unattached disks. 14-day for stopped VMs. Everything reversible.
- Environment tier gates. Auto-save is allowed in sandbox/dev first. Never in prod without explicit owner promotion.
- Circuit breakers. More than N remediations in M hours triggers an auto-demote to Gated and pings the owner.
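The circuit-breaker rule can be sketched as a sliding-window counter. This is an illustration only: the `CircuitBreaker` class is hypothetical, N and M are left as constructor parameters because the page does not fix their values, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque
from datetime import datetime, timedelta


class CircuitBreaker:
    """Trips when more than `max_remediations` occur within `window`."""

    def __init__(self, max_remediations: int, window: timedelta):
        self.max_remediations = max_remediations  # N
        self.window = window                      # M hours
        self._events = deque()                    # remediation timestamps

    def record(self, when: datetime) -> bool:
        """Record one remediation; return True if the breaker trips."""
        self._events.append(when)
        # Drop events that have aged out of the window.
        cutoff = when - self.window
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()
        # "More than N" means strictly greater than the limit.
        return len(self._events) > self.max_remediations
```

A caller would demote the policy family to Gated and notify the owner whenever `record` returns `True`, for example with `CircuitBreaker(max_remediations=20, window=timedelta(hours=6))`, where both values are placeholders.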
Features used
- Policy-based recommendations (5 families)
- Billing analysis and anomaly detection
- Workflow automation (3 rungs)
- ADO / Jira / ServiceNow ticket integration
- Self-service dashboards
Catch waste on day 1, not day 21.