feat: add rain data audit and prediction scripts

2026-03-05 08:01:54 +11:00
parent 5bfa910495
commit 96e72d7c43
13 changed files with 1004 additions and 182 deletions

todo.md (new file, 57 lines)

@@ -0,0 +1,57 @@
# Predictive Model TODO
Priority key: `P0` = critical/blocking, `P1` = important, `P2` = later optimization.
## 1) Scope and Success Criteria
- [x] [P0] Lock v1 target: predict `rain_next_1h >= 0.2mm`.
- [x] [P0] Define the decision use-case (alerts vs dashboard signal).
- [x] [P0] Set acceptance metrics and thresholds (precision, recall, ROC-AUC).
- [x] [P0] Freeze training window with explicit UTC start/end timestamps.
## 2) Data Quality and Label Validation
- [ ] [P0] Audit `observations_ws90` and `observations_baro` for missingness, gaps, duplicates, and out-of-order rows. (script ready: `scripts/audit_rain_data.py`; run on the runtime machine)
- [ ] [P0] Validate rain label construction from `rain_mm` (counter resets, negative deltas, spikes). (script ready: `scripts/audit_rain_data.py`; run on the runtime machine)
- [ ] [P0] Measure class balance by week (rain-positive vs rain-negative). (script ready: `scripts/audit_rain_data.py`; run on the runtime machine)
- [ ] [P1] Document known data issues and mitigation rules.
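The label-validation items above hinge on turning the cumulative `rain_mm` counter into per-interval rainfall. A minimal sketch of that delta logic, assuming the counter occasionally resets to zero and that implausibly large jumps are sensor spikes (the function name and `spike_cap` threshold are illustrative, not from the audit script):

```python
def rain_deltas(counter_values, spike_cap=20.0):
    """Derive per-interval rainfall from a cumulative rain counter.

    Assumptions to be confirmed by the audit: a negative delta means
    the counter reset, so the new reading itself is that interval's
    rainfall; deltas above `spike_cap` mm are treated as sensor
    spikes and flagged as missing (None).
    """
    deltas = []
    for prev, curr in zip(counter_values, counter_values[1:]):
        d = curr - prev
        if d < 0:          # counter reset: counter restarted from zero
            d = curr
        if d is not None and d > spike_cap:
            d = None       # implausible jump: flag as missing
        deltas.append(d)
    return deltas
```

The v1 label would then be `delta >= 0.2` on the next hour's total, per the target locked in section 1.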
## 3) Dataset and Feature Engineering
- [ ] [P1] Extract reusable dataset-builder logic from training script into a maintainable module/workflow.
- [ ] [P1] Add lag/rolling features (means, stddev, deltas) for core sensor inputs.
- [ ] [P1] Encode wind direction properly (cyclical encoding).
- [ ] [P2] Add calendar features (hour-of-day, day-of-week, seasonality proxies).
- [ ] [P1] Join aligned forecast features from `forecast_openmeteo_hourly` (precip prob, cloud cover, wind, pressure).
- [ ] [P1] Persist versioned dataset snapshots for reproducibility.
## 4) Modeling and Validation
- [x] [P0] Keep logistic regression as baseline.
- [ ] [P1] Add at least one tree-based baseline (e.g. gradient boosting).
- [x] [P0] Use strict time-based train/validation/test splits (no random shuffling).
- [ ] [P1] Add walk-forward backtesting across multiple temporal folds.
- [ ] [P1] Tune hyperparameters on validation data only.
- [ ] [P1] Calibrate probabilities (Platt or isotonic) and compare calibration quality.
- [x] [P0] Choose and lock the operating threshold based on use-case costs.
## 5) Evaluation and Reporting
- [x] [P0] Report ROC-AUC, PR-AUC, confusion matrix, precision, recall, and Brier score.
- [ ] [P1] Compare against naive baselines (persistence and simple forecast-threshold rules).
- [ ] [P2] Slice performance by periods/weather regimes (day/night, rainy weeks, etc.).
- [ ] [P1] Produce a short model card (data window, features, metrics, known limitations).
## 6) Packaging and Deployment
- [ ] [P1] Version model artifacts and feature schema together.
- [x] [P0] Implement inference path with feature parity between training and serving.
- [x] [P0] Add prediction storage table for predicted probabilities and realized outcomes.
- [ ] [P1] Expose predictions via API and optionally surface in web dashboard.
- [ ] [P2] Add scheduled retraining with rollback to last-known-good model.
## 7) Monitoring and Operations
- [ ] [P1] Track feature drift and prediction drift over time.
- [ ] [P1] Track calibration drift and realized performance after deployment.
- [ ] [P1] Add alerts for training/inference/data pipeline failures.
- [ ] [P1] Document runbook for train/evaluate/deploy/rollback.
## 8) Immediate Next Steps (This Week)
- [ ] [P0] Run first full data audit and label-quality checks. (blocked on this machine; run on the runtime machine)
- [ ] [P0] Train baseline model on full available history and capture metrics. (blocked on this machine; run on the runtime machine)
- [ ] [P1] Add one expanded feature set and rerun evaluation.
- [x] [P0] Decide v1 threshold and define deployment interface.