another bugfix

2026-03-12 20:29:29 +11:00
parent d1237eed44
commit 20316cee91
8 changed files with 293 additions and 23 deletions
@@ -11,7 +11,7 @@ This document captures known data-quality issues observed in the rain-model pipe
 | Sensor gaps | Missing 5-minute buckets from WS90/barometer ingestion. | Resample to 5-minute grid; barometer interpolated with short limit (`limit=3`); gap lengths tracked by audit. |
 | Out-of-order arrivals | Late MQTT events can arrive with older `ts`. | Audit reports out-of-order count by sorting on `received_at` and checking `ts` monotonicity. |
 | Duplicate rows | Replays/reconnects can duplicate sensor rows. | Audit reports duplicate counts by `(ts, station_id)` for WS90 and `(ts, source)` for barometer. |
-| Forecast sparsity/jitter | Hourly forecast retrieval cadence does not always align with 5-minute features. | Select latest forecast per `ts` (`DISTINCT ON` + `retrieved_at DESC`), resample to 5 minutes, short forward/backfill windows, and clip `fc_precip_prob` to `[0,1]`. |
+| Forecast sparsity/jitter | Hourly forecast retrieval cadence does not always align with 5-minute features. | Select latest forecast per `ts` (`DISTINCT ON` + `retrieved_at DESC`), resample to 5 minutes, apply short forward/backfill windows, and clip `fc_precip_prob` to `[0,1]`. If `precip_prob` is unavailable upstream, derive it from `precip_mm` (`1` if `precip_mm > 0`, else `0`). |
 | Local vs UTC day boundary | Daily rainfall resets can look wrong when local timezone is not respected. | Station timezone is configured via `site.timezone` and used by Wunderground uploader; model training/inference stays UTC-based for split consistency. |

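The `precip_prob` fallback in the forecast row can be sketched as below. This is a minimal illustration, not the pipeline's code: the function name `fill_precip_prob`, the use of `None` for a missing value, and the assumption that probabilities are already on a 0 to 1 scale are all hypothetical. Only the rule itself (`precip_mm > 0` maps to `1`, otherwise `0`, then clip to `[0,1]`) comes from the table.

```python
from typing import Optional


def fill_precip_prob(
    precip_prob: Optional[float], precip_mm: Optional[float]
) -> Optional[float]:
    """Return a precipitation probability in [0, 1], or None if nothing is known."""
    if precip_prob is None:
        if precip_mm is None:
            return None  # nothing to derive the fallback from
        # Fallback rule from the table: any forecast rain counts as probability 1.
        precip_prob = 1.0 if precip_mm > 0 else 0.0
    # Clip to the valid probability range.
    return min(1.0, max(0.0, precip_prob))
```

The fallback is deliberately coarse: it turns an amount into a hard 0/1 indicator, so rows filled this way carry less information than a real upstream probability.
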
 ## Audit Command
@@ -39,6 +39,7 @@ Review in report:
 - `candidate_models[*].hyperparameter_tuning`
 - `candidate_models[*].calibration_comparison`
 - `naive_baselines_test`
+- `sliced_performance_test`
 - `walk_forward_backtest`

 ## 3) Deploy
@@ -65,10 +66,10 @@ python scripts/predict_rain_model.py \

 ## 4) Rollback

-1. Identify the last known-good model artifact in `models/`.
-2. Point deployment to that artifact (worker env `RAIN_MODEL_PATH` or manual inference path).
-3. Re-run inference command and verify writes in `predictions_rain_1h`.
-4. Keep the failed artifact/report for postmortem.
+1. The worker now keeps a backup model at `RAIN_MODEL_BACKUP_PATH` and promotes new models only after candidate training succeeds.
+2. If promotion fails or no candidate model is produced, the worker keeps the active model unchanged.
+3. If inference starts without `RAIN_MODEL_PATH` but backup exists, the worker restores from backup automatically.
+4. Keep failed candidate artifacts for postmortem.

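The promote-and-restore behaviour in the steps above could look roughly like the following sketch. It is not the worker's actual code: the function names `promote_candidate` and `ensure_active_model` are hypothetical, and it assumes "without `RAIN_MODEL_PATH`" means the file at that path is missing. The paths passed in stand for the values of `RAIN_MODEL_PATH` and `RAIN_MODEL_BACKUP_PATH`.

```python
import shutil
from pathlib import Path
from typing import Optional


def promote_candidate(candidate: Optional[Path], active: Path, backup: Path) -> bool:
    """Promote a trained candidate; leave the active model untouched otherwise."""
    if candidate is None or not candidate.exists():
        return False  # no candidate produced: keep the active model unchanged
    if active.exists():
        shutil.copy2(active, backup)  # remember the last known-good model
    # Copy to a temporary name first so a failed copy never corrupts the active file.
    tmp = active.with_name(active.name + ".tmp")
    shutil.copy2(candidate, tmp)
    tmp.replace(active)  # rename is atomic when both paths share a filesystem
    return True


def ensure_active_model(active: Path, backup: Path) -> bool:
    """Restore the active model from backup if it is missing; True if a model is usable."""
    if active.exists():
        return True
    if backup.exists():
        shutil.copy2(backup, active)
        return True
    return False
```

A worker following this sketch would call `ensure_active_model` before inference and `promote_candidate` only after candidate training has returned successfully.
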
 ## 5) Monitoring

@@ -134,6 +135,7 @@ The script exits non-zero on failure, so it can directly drive alerting.
 - `RAIN_CALIBRATION_METHODS`
 - `RAIN_WALK_FORWARD_FOLDS`
 - `RAIN_ALLOW_EMPTY_DATA`
+- `RAIN_MODEL_BACKUP_PATH`
 - `RAIN_MODEL_CARD_PATH`

 Recommended production defaults:
@@ -48,6 +48,8 @@ Feature-set options:
 - `baseline`: original 5 local observation features.
 - `extended`: adds wind-direction encoding, lag/rolling stats, recent rain accumulation,
  and aligned forecast features from `forecast_openmeteo_hourly`.
+- `extended_calendar`: `extended` plus UTC calendar seasonality features
+  (`hour_*`, `dow_*`, `month_*`, `is_weekend`).

 Model-family options (`train_rain_model.py`):
 - `logreg`: logistic regression baseline.
@@ -117,6 +119,20 @@ python scripts/train_rain_model.py \
  --dataset-out "models/datasets/rain_dataset_{model_version}_{feature_set}.csv"
 ```

+### 3b.1) Train extended + calendar (P2) feature-set model
+```sh
+python scripts/train_rain_model.py \
+  --site "home" \
+  --start "2026-02-01T00:00:00Z" \
+  --end "2026-03-03T23:55:00Z" \
+  --feature-set "extended_calendar" \
+  --model-family "auto" \
+  --forecast-model "ecmwf" \
+  --model-version "rain-auto-v1-extended-calendar" \
+  --out "models/rain_model_extended_calendar.pkl" \
+  --report-out "models/rain_model_report_extended_calendar.json"
+```
+
 ### 3c) Train tree-based baseline (P1)
 ```sh
 python scripts/train_rain_model.py \
@@ -186,6 +202,7 @@ The `rainml` service in `docker-compose.yml` now runs:
 - configurable tuning/calibration behavior (`RAIN_TUNE_HYPERPARAMETERS`,
  `RAIN_MAX_HYPERPARAM_TRIALS`, `RAIN_CALIBRATION_METHODS`)
 - graceful gap handling for temporary source outages (`RAIN_ALLOW_EMPTY_DATA=true`)
+- automatic rollback path for last-known-good model (`RAIN_MODEL_BACKUP_PATH`)
 - optional model-card output (`RAIN_MODEL_CARD_PATH`)

 Artifacts are persisted to `./models` on the host.
@@ -198,6 +215,7 @@ docker compose logs -f rainml
 ## Output
 - Audit report: `models/rain_data_audit.json`
 - Training report: `models/rain_model_report.json`
+- Regime slices in training report: `sliced_performance_test`
 - Model card: `models/model_card_<model_version>.md`
 - Model artifact: `models/rain_model.pkl`
 - Dataset snapshot: `models/datasets/rain_dataset_<model_version>_<feature_set>.csv`
@@ -222,6 +240,12 @@ docker compose logs -f rainml
 - `fc_temp_c`, `fc_rh`, `fc_pressure_msl_hpa`, `fc_wind_m_s`, `fc_wind_gust_m_s`,
  `fc_precip_mm`, `fc_precip_prob`, `fc_cloud_cover`

+## Model Features (extended_calendar extras)
+- `hour_sin`, `hour_cos`
+- `dow_sin`, `dow_cos`
+- `month_sin`, `month_cos`
+- `is_weekend`
+
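A sine/cosine pair keeps cyclic fields continuous across their wrap-around (23:00 sits next to 00:00). The sketch below shows one plausible way to compute the features listed above. The periods (24 hours, 7 days, 12 months), the Monday=0 weekday convention, and the function name `calendar_features` are assumptions, not taken from the training script.

```python
import math
from datetime import datetime, timezone


def calendar_features(ts: datetime) -> dict:
    """Cyclical UTC calendar features for one timezone-aware timestamp."""
    if ts.tzinfo is None:
        raise ValueError("ts must be timezone-aware")
    ts = ts.astimezone(timezone.utc)  # features are defined on UTC time
    hour = ts.hour + ts.minute / 60.0
    dow = ts.weekday()  # Monday=0 .. Sunday=6
    month = ts.month - 1  # 0 .. 11
    two_pi = 2.0 * math.pi
    return {
        "hour_sin": math.sin(two_pi * hour / 24.0),
        "hour_cos": math.cos(two_pi * hour / 24.0),
        "dow_sin": math.sin(two_pi * dow / 7.0),
        "dow_cos": math.cos(two_pi * dow / 7.0),
        "month_sin": math.sin(two_pi * month / 12.0),
        "month_cos": math.cos(two_pi * month / 12.0),
        "is_weekend": 1 if dow >= 5 else 0,
    }
```

For example, `calendar_features(datetime(2026, 3, 7, 6, 0, tzinfo=timezone.utc))` describes a Saturday at 06:00 UTC, so `is_weekend` is `1` and `hour_sin` is at its maximum.
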
 ## Notes
 - Data is resampled into 5-minute buckets.
 - Label is derived from incremental rain from WS90 cumulative `rain_mm`.
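
As a rough illustration of the label note above, per-bucket rain can be recovered from a cumulative counter by differencing consecutive readings. The sketch is an assumption-laden example, not the project's labeling code: the function name `rain_increments` is hypothetical, and treating negative differences (for example a counter reset) as zero rain is an assumed policy.

```python
from typing import List, Sequence


def rain_increments(cumulative_mm: Sequence[float]) -> List[float]:
    """Per-bucket rain from a cumulative counter; the first bucket gets 0."""
    if not cumulative_mm:
        return []
    increments = [0.0]  # no predecessor for the first reading
    for prev, cur in zip(cumulative_mm, cumulative_mm[1:]):
        delta = cur - prev
        # A negative delta means the counter reset (assumed); count it as no rain.
        increments.append(delta if delta > 0 else 0.0)
    return increments
```
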