another bugfix
@@ -11,7 +11,7 @@ This document captures known data-quality issues observed in the rain-model pipe
 | Sensor gaps | Missing 5-minute buckets from WS90/barometer ingestion. | Resample to 5-minute grid; barometer interpolated with short limit (`limit=3`); gap lengths tracked by audit. |
 | Out-of-order arrivals | Late MQTT events can arrive with older `ts`. | Audit reports out-of-order count by sorting on `received_at` and checking `ts` monotonicity. |
 | Duplicate rows | Replays/reconnects can duplicate sensor rows. | Audit reports duplicate counts by `(ts, station_id)` for WS90 and `(ts, source)` for barometer. |
-| Forecast sparsity/jitter | Hourly forecast retrieval cadence does not always align with 5-minute features. | Select latest forecast per `ts` (`DISTINCT ON` + `retrieved_at DESC`), resample to 5 minutes, short forward/backfill windows, and clip `fc_precip_prob` to `[0,1]`. |
+| Forecast sparsity/jitter | Hourly forecast retrieval cadence does not always align with 5-minute features. | Select latest forecast per `ts` (`DISTINCT ON` + `retrieved_at DESC`), resample to 5 minutes, short forward/backfill windows, and clip `fc_precip_prob` to `[0,1]`. If `precip_prob` is unavailable upstream, backfill from `precip_mm` (`>0 => 1`, else `0`). |
 | Local vs UTC day boundary | Daily rainfall resets can look wrong when local timezone is not respected. | Station timezone is configured via `site.timezone` and used by Wunderground uploader; model training/inference stays UTC-based for split consistency. |
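
The mitigations in the table rows above can be sketched in pandas roughly as follows. This is a minimal illustration, not the pipeline's actual code: the function names and the `pressure_hpa` column are hypothetical, `ts` and `received_at` are assumed to be timezone-aware datetime columns, and `precip_prob` is assumed to already be a fraction rather than a percentage.

```python
import pandas as pd


def resample_barometer(df: pd.DataFrame, value_col: str = "pressure_hpa") -> pd.Series:
    """Put barometer readings on a 5-minute grid and fill only short gaps."""
    series = df.set_index("ts")[value_col].sort_index()
    grid = series.resample("5min").mean()
    # limit=3 fills at most three consecutive empty buckets (15 minutes);
    # anything beyond that stays NaN so the gap remains visible.
    return grid.interpolate(limit=3)


def count_duplicates(df: pd.DataFrame, keys: list[str]) -> int:
    """Count rows that repeat an earlier row on the given key columns."""
    return int(df.duplicated(subset=keys).sum())


def count_out_of_order(df: pd.DataFrame) -> int:
    """Count rows whose `ts` is older than the previously received row."""
    ordered = df.sort_values("received_at", kind="stable")
    # A negative step in `ts` means an older reading arrived late.
    return int((ordered["ts"].diff() < pd.Timedelta(0)).sum())


def fill_precip_prob(fc: pd.DataFrame) -> pd.DataFrame:
    """Fall back to a 0/1 probability from `precip_mm`, then clip to [0, 1]."""
    out = fc.copy()
    fallback = (out["precip_mm"] > 0).astype(float)
    if "precip_prob" in out.columns:
        out["precip_prob"] = out["precip_prob"].fillna(fallback)
    else:
        out["precip_prob"] = fallback
    out["fc_precip_prob"] = out["precip_prob"].clip(lower=0.0, upper=1.0)
    return out
```

Duplicate counting would be called with `["ts", "station_id"]` for WS90 rows and `["ts", "source"]` for barometer rows, matching the keys named in the table.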

 ## Audit Command
@@ -39,6 +39,7 @@ Review in report:
 - `candidate_models[*].hyperparameter_tuning`
 - `candidate_models[*].calibration_comparison`
 - `naive_baselines_test`
+- `sliced_performance_test`
 - `walk_forward_backtest`
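
A quick way to confirm that a training report contains the sections listed above is a small check like the one below. It is only a sketch: the helper name is hypothetical, and it assumes the report is a JSON object in which `candidate_models` is a list of objects, which the `[*]` notation suggests but the source does not spell out.

```python
import json
from pathlib import Path

# Section names taken from the review list above.
TOP_LEVEL_KEYS = ("naive_baselines_test", "sliced_performance_test", "walk_forward_backtest")
PER_CANDIDATE_KEYS = ("hyperparameter_tuning", "calibration_comparison")


def missing_report_sections(report: dict) -> list[str]:
    """Return the names of expected report sections that are absent."""
    missing = [key for key in TOP_LEVEL_KEYS if key not in report]
    for i, candidate in enumerate(report.get("candidate_models", [])):
        missing += [
            f"candidate_models[{i}].{key}"
            for key in PER_CANDIDATE_KEYS
            if key not in candidate
        ]
    return missing


def load_report(path: str) -> dict:
    """Read a JSON training report from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

For example, `missing_report_sections(load_report("models/rain_model_report.json"))` would return an empty list when every section is present.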

 ## 3) Deploy
@@ -65,10 +66,10 @@ python scripts/predict_rain_model.py \

 ## 4) Rollback

-1. Identify the last known-good model artifact in `models/`.
-2. Point deployment to that artifact (worker env `RAIN_MODEL_PATH` or manual inference path).
-3. Re-run inference command and verify writes in `predictions_rain_1h`.
-4. Keep the failed artifact/report for postmortem.
+1. The worker now keeps a backup model at `RAIN_MODEL_BACKUP_PATH` and promotes new models only after candidate training succeeds.
+2. If promotion fails or no candidate model is produced, the worker keeps the active model unchanged.
+3. If inference starts without `RAIN_MODEL_PATH` but backup exists, the worker restores from backup automatically.
+4. Keep failed candidate artifacts for postmortem.
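
The promote-and-restore behavior described in the new rollback steps could look roughly like this. It is a sketch under assumptions, not the worker's actual implementation: both function names are hypothetical, `active` and `backup` stand for the paths configured via `RAIN_MODEL_PATH` and `RAIN_MODEL_BACKUP_PATH`, and "starts without `RAIN_MODEL_PATH`" is read here as "the file at that path is missing".

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def promote_candidate(candidate: Optional[Path], active: Path, backup: Path) -> bool:
    """Promote a trained candidate; leave the active model alone otherwise."""
    if candidate is None or not candidate.exists():
        # No candidate was produced: keep the active model unchanged.
        return False
    if active.exists():
        # Preserve the current model as the last-known-good backup.
        shutil.copy2(active, backup)
    # Copy next to the target, then swap in one step so a crash mid-copy
    # never leaves a half-written active model. The candidate file itself
    # is kept for later inspection.
    tmp = active.with_suffix(active.suffix + ".tmp")
    shutil.copy2(candidate, tmp)
    os.replace(tmp, active)
    return True


def ensure_active_model(active: Path, backup: Path) -> bool:
    """Restore the active model from backup if it is missing."""
    if active.exists():
        return True
    if backup.exists():
        shutil.copy2(backup, active)
        return True
    return False
```

Copying rather than moving the candidate is a deliberate choice here, so that failed or superseded artifacts stay available for postmortem.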

 ## 5) Monitoring
@@ -134,6 +135,7 @@ The script exits non-zero on failure, so it can directly drive alerting.
 - `RAIN_CALIBRATION_METHODS`
 - `RAIN_WALK_FORWARD_FOLDS`
 - `RAIN_ALLOW_EMPTY_DATA`
+- `RAIN_MODEL_BACKUP_PATH`
 - `RAIN_MODEL_CARD_PATH`

 Recommended production defaults:
@@ -48,6 +48,8 @@ Feature-set options:
 - `baseline`: original 5 local observation features.
 - `extended`: adds wind-direction encoding, lag/rolling stats, recent rain accumulation,
   and aligned forecast features from `forecast_openmeteo_hourly`.
+- `extended_calendar`: `extended` plus UTC calendar seasonality features
+  (`hour_*`, `dow_*`, `month_*`, `is_weekend`).

 Model-family options (`train_rain_model.py`):
 - `logreg`: logistic regression baseline.
@@ -117,6 +119,20 @@ python scripts/train_rain_model.py \
   --dataset-out "models/datasets/rain_dataset_{model_version}_{feature_set}.csv"
 ```

+### 3b.1) Train expanded + calendar (P2) feature-set model
+```sh
+python scripts/train_rain_model.py \
+  --site "home" \
+  --start "2026-02-01T00:00:00Z" \
+  --end "2026-03-03T23:55:00Z" \
+  --feature-set "extended_calendar" \
+  --model-family "auto" \
+  --forecast-model "ecmwf" \
+  --model-version "rain-auto-v1-extended-calendar" \
+  --out "models/rain_model_extended_calendar.pkl" \
+  --report-out "models/rain_model_report_extended_calendar.json"
+```
+
 ### 3c) Train tree-based baseline (P1)
 ```sh
 python scripts/train_rain_model.py \
@@ -186,6 +202,7 @@ The `rainml` service in `docker-compose.yml` now runs:
 - configurable tuning/calibration behavior (`RAIN_TUNE_HYPERPARAMETERS`,
   `RAIN_MAX_HYPERPARAM_TRIALS`, `RAIN_CALIBRATION_METHODS`)
 - graceful gap handling for temporary source outages (`RAIN_ALLOW_EMPTY_DATA=true`)
+- automatic rollback path for last-known-good model (`RAIN_MODEL_BACKUP_PATH`)
 - optional model-card output (`RAIN_MODEL_CARD_PATH`)

 Artifacts are persisted to `./models` on the host.
@@ -198,6 +215,7 @@ docker compose logs -f rainml
 ## Output
 - Audit report: `models/rain_data_audit.json`
 - Training report: `models/rain_model_report.json`
+- Regime slices in training report: `sliced_performance_test`
 - Model card: `models/model_card_<model_version>.md`
 - Model artifact: `models/rain_model.pkl`
 - Dataset snapshot: `models/datasets/rain_dataset_<model_version>_<feature_set>.csv`
@@ -222,6 +240,12 @@ docker compose logs -f rainml
 - `fc_temp_c`, `fc_rh`, `fc_pressure_msl_hpa`, `fc_wind_m_s`, `fc_wind_gust_m_s`,
   `fc_precip_mm`, `fc_precip_prob`, `fc_cloud_cover`

+## Model Features (extended_calendar extras)
+- `hour_sin`, `hour_cos`
+- `dow_sin`, `dow_cos`
+- `month_sin`, `month_cos`
+- `is_weekend`
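
Features with these names are commonly produced by a sine/cosine encoding of UTC calendar fields, which keeps hour 23 close to hour 0. The sketch below shows one way to do that; the function name, the `ts` column default, and the exact periods and phase (for example, January mapped to angle 0) are assumptions, not confirmed details of `train_rain_model.py`.

```python
import numpy as np
import pandas as pd


def add_calendar_features(df: pd.DataFrame, ts_col: str = "ts") -> pd.DataFrame:
    """Append cyclical UTC calendar features to a copy of the frame."""
    ts = pd.to_datetime(df[ts_col], utc=True)
    out = df.copy()
    fields = (
        ("hour", ts.dt.hour, 24),
        ("dow", ts.dt.dayofweek, 7),     # Monday = 0 ... Sunday = 6
        ("month", ts.dt.month - 1, 12),  # shift so January = 0
    )
    for name, value, period in fields:
        angle = 2 * np.pi * value / period
        out[f"{name}_sin"] = np.sin(angle)
        out[f"{name}_cos"] = np.cos(angle)
    # Saturday and Sunday count as weekend.
    out["is_weekend"] = (ts.dt.dayofweek >= 5).astype(int)
    return out
```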
+
 ## Notes
 - Data is resampled into 5-minute buckets.
 - Label is derived from incremental rain from WS90 cumulative `rain_mm`.
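
To illustrate the last note, incremental rain can be recovered from a cumulative counter by differencing and discarding negative steps. The sketch below is hypothetical in several respects: the function names, the treatment of a counter reset as "no new rain", and the label rule (at least `threshold_mm` of rain within the next `horizon_buckets` 5-minute buckets) are assumptions chosen for illustration, since the source does not state the horizon or threshold.

```python
import pandas as pd


def rain_increments(rain_mm: pd.Series) -> pd.Series:
    """Per-bucket rainfall from a cumulative counter."""
    inc = rain_mm.diff()
    # A negative step means the counter was reset; treat it as no new rain
    # (assumption) instead of a large negative amount.
    return inc.clip(lower=0).fillna(0.0)


def rain_ahead_label(
    rain_mm: pd.Series,
    horizon_buckets: int = 12,   # 12 x 5 minutes = 1 hour (assumed horizon)
    threshold_mm: float = 0.2,   # hypothetical threshold
) -> pd.Series:
    """1.0 if enough rain falls in the following buckets, 0.0 if not, NaN if unknown."""
    inc = rain_increments(rain_mm)
    # Rolling sum over the reversed series gives a forward-looking window
    # covering the current bucket and the ones after it.
    forward = inc.iloc[::-1].rolling(horizon_buckets, min_periods=horizon_buckets).sum().iloc[::-1]
    # Shift so the window starts at the next bucket and excludes the current one.
    future = forward.shift(-1)
    label = (future >= threshold_mm).astype(float)
    # Keep NaN where the full horizon is not available yet.
    return label.where(future.notna())
```

Leaving the label as NaN near the end of the series avoids training on rows whose outcome window is incomplete.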