another bugfix

2026-03-12 20:29:29 +11:00
parent d1237eed44
commit 20316cee91
8 changed files with 293 additions and 23 deletions

@@ -48,6 +48,8 @@ Feature-set options:
- `baseline`: original 5 local observation features.
- `extended`: adds wind-direction encoding, lag/rolling stats, recent rain accumulation,
and aligned forecast features from `forecast_openmeteo_hourly`.
- `extended_calendar`: `extended` plus UTC calendar seasonality features
(`hour_*`, `dow_*`, `month_*`, `is_weekend`).
Model-family options (`train_rain_model.py`):
- `logreg`: logistic regression baseline.
@@ -117,6 +119,20 @@ python scripts/train_rain_model.py \
--dataset-out "models/datasets/rain_dataset_{model_version}_{feature_set}.csv"
```
### 3b.1) Train extended + calendar (P2) feature-set model
```sh
python scripts/train_rain_model.py \
--site "home" \
--start "2026-02-01T00:00:00Z" \
--end "2026-03-03T23:55:00Z" \
--feature-set "extended_calendar" \
--model-family "auto" \
--forecast-model "ecmwf" \
--model-version "rain-auto-v1-extended-calendar" \
--out "models/rain_model_extended_calendar.pkl" \
--report-out "models/rain_model_report_extended_calendar.json"
```
### 3c) Train tree-based baseline (P1)
```sh
python scripts/train_rain_model.py \
@@ -186,6 +202,7 @@ The `rainml` service in `docker-compose.yml` now runs:
- configurable tuning/calibration behavior (`RAIN_TUNE_HYPERPARAMETERS`,
`RAIN_MAX_HYPERPARAM_TRIALS`, `RAIN_CALIBRATION_METHODS`)
- graceful gap handling for temporary source outages (`RAIN_ALLOW_EMPTY_DATA=true`)
- automatic rollback path for last-known-good model (`RAIN_MODEL_BACKUP_PATH`)
- optional model-card output (`RAIN_MODEL_CARD_PATH`)
Artifacts are persisted to `./models` on the host.
@@ -198,6 +215,7 @@ docker compose logs -f rainml
## Output
- Audit report: `models/rain_data_audit.json`
- Training report: `models/rain_model_report.json`
- Regime slices in training report: `sliced_performance_test`
- Model card: `models/model_card_<model_version>.md`
- Model artifact: `models/rain_model.pkl`
- Dataset snapshot: `models/datasets/rain_dataset_<model_version>_<feature_set>.csv`
@@ -222,6 +240,12 @@ docker compose logs -f rainml
- `fc_temp_c`, `fc_rh`, `fc_pressure_msl_hpa`, `fc_wind_m_s`, `fc_wind_gust_m_s`,
`fc_precip_mm`, `fc_precip_prob`, `fc_cloud_cover`
## Model Features (extended_calendar extras)
- `hour_sin`, `hour_cos`
- `dow_sin`, `dow_cos`
- `month_sin`, `month_cos`
- `is_weekend`
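
The `*_sin`/`*_cos` pairs above suggest a cyclical encoding of UTC calendar fields. The following is a minimal sketch of how such features could be derived for one timestamp, not the code in `train_rain_model.py`. The helper name `calendar_features`, the periods (24 hours, 7 days, 12 months), the fractional-hour handling, and the Saturday/Sunday weekend rule are all assumptions made for illustration.

```python
import math
from datetime import datetime, timezone


def calendar_features(ts: datetime) -> dict[str, float]:
    """Hypothetical helper: cyclical UTC calendar features for one timestamp."""
    # Assumption: naive timestamps are already in UTC.
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)
    ts = ts.astimezone(timezone.utc)

    def cyc(value: float, period: float) -> tuple[float, float]:
        # Map a periodic value onto the unit circle so that the end of a
        # cycle (e.g. 23:55) sits next to its start (00:00).
        angle = 2.0 * math.pi * value / period
        return math.sin(angle), math.cos(angle)

    hour_sin, hour_cos = cyc(ts.hour + ts.minute / 60.0, 24.0)
    dow_sin, dow_cos = cyc(ts.weekday(), 7.0)        # Monday = 0
    month_sin, month_cos = cyc(ts.month - 1, 12.0)   # January = 0

    return {
        "hour_sin": hour_sin,
        "hour_cos": hour_cos,
        "dow_sin": dow_sin,
        "dow_cos": dow_cos,
        "month_sin": month_sin,
        "month_cos": month_cos,
        # Assumption: weekend means Saturday or Sunday in UTC.
        "is_weekend": 1.0 if ts.weekday() >= 5 else 0.0,
    }


features = calendar_features(datetime(2026, 3, 7, 6, 0, tzinfo=timezone.utc))
```

Encoding each field as a sine/cosine pair keeps adjacent times close in feature space, which a raw integer hour or month does not do across the wrap-around.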
## Notes
- Data is resampled into 5-minute buckets.
- Label is derived from incremental rain from WS90 cumulative `rain_mm`.