improve model training

2026-03-12 20:39:44 +11:00
parent 20316cee91
commit 9785fc0235
8 changed files with 536 additions and 4 deletions

@@ -43,6 +43,7 @@ pip install -r scripts/requirements.txt
`predictions_rain_1h`.
- `scripts/run_rain_ml_worker.py`: long-running worker for periodic training + prediction.
- `scripts/check_rain_pipeline_health.py`: freshness/failure check for alerting.
- `scripts/recommend_rain_model.py`: rank saved training reports and recommend a deployment candidate.
Feature-set options:
- `baseline`: original 5 local observation features.
@@ -181,6 +182,22 @@ python scripts/train_rain_model.py \
--model-card-out "models/model_card_{model_version}.md"
```
### 3f) Walk-forward threshold policy (more temporally robust alert threshold)
```sh
python scripts/train_rain_model.py \
--site "home" \
--start "2026-02-01T00:00:00Z" \
--end "2026-03-03T23:55:00Z" \
--feature-set "extended" \
--model-family "auto" \
--forecast-model "ecmwf" \
--threshold-policy "walk_forward" \
--walk-forward-folds 4 \
--model-version "rain-auto-v1-extended-wf-threshold" \
--out "models/rain_model_auto.pkl" \
--report-out "models/rain_model_report_auto.json"
```
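The `walk_forward` policy itself is not shown in this diff. As a rough sketch of the general idea only: given chronologically ordered out-of-sample probabilities and labels, one could pick the best F1 threshold on each later time block and take the median across blocks. The function names, the candidate grid, the F1 criterion, and the median aggregation are all assumptions here, not the behavior of `train_rain_model.py`.

```python
from statistics import median


def f1_at_threshold(probs, labels, threshold):
    """F1 score when predicting rain for every probability >= threshold."""
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    fn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def walk_forward_threshold(probs, labels, folds=4, candidates=None):
    """Pick the best threshold on each later time block, then take the median.

    probs and labels must be in chronological order. The first block stands
    in for the initial training window, so it is never scored.
    """
    if candidates is None:
        candidates = [i / 20 for i in range(1, 20)]  # 0.05 .. 0.95 (assumed grid)
    n = len(probs)
    block = n // (folds + 1)
    if block == 0:
        raise ValueError("not enough samples for the requested number of folds")
    per_fold = []
    for k in range(1, folds + 1):
        start = k * block
        # The last fold absorbs any leftover samples.
        end = n if k == folds else (k + 1) * block
        fold_probs, fold_labels = probs[start:end], labels[start:end]
        best = max(
            candidates,
            key=lambda t: f1_at_threshold(fold_probs, fold_labels, t),
        )
        per_fold.append(best)
    return median(per_fold)
```

Taking the median rather than a single validation-split optimum is one way to keep a threshold from overfitting to a single weather regime; a real implementation would also retrain the model on each expanding window before scoring the next block.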
### 4) Run inference and store prediction
```sh
python scripts/predict_rain_model.py \
@@ -200,7 +217,7 @@ The `rainml` service in `docker-compose.yml` now runs:
- periodic retraining (default every 24 hours)
- periodic prediction writes (default every 10 minutes)
- configurable tuning/calibration behavior (`RAIN_TUNE_HYPERPARAMETERS`,
-  `RAIN_MAX_HYPERPARAM_TRIALS`, `RAIN_CALIBRATION_METHODS`)
+  `RAIN_MAX_HYPERPARAM_TRIALS`, `RAIN_CALIBRATION_METHODS`, `RAIN_THRESHOLD_POLICY`)
- graceful gap handling for temporary source outages (`RAIN_ALLOW_EMPTY_DATA=true`)
- automatic rollback path for last-known-good model (`RAIN_MODEL_BACKUP_PATH`)
- optional model-card output (`RAIN_MODEL_CARD_PATH`)
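For illustration, a worker might read the settings listed above from the environment roughly as follows. This is a minimal sketch: the `WorkerConfig` class, the helper names, every default value, and the comma-separated format for `RAIN_CALIBRATION_METHODS` are assumptions, not taken from `run_rain_ml_worker.py`.

```python
import os
from dataclasses import dataclass


def _env_bool(name, default, env):
    """Interpret common truthy strings; fall back to default when unset."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class WorkerConfig:
    tune_hyperparameters: bool
    max_hyperparam_trials: int
    calibration_methods: tuple
    threshold_policy: str
    allow_empty_data: bool


def load_worker_config(env=None):
    """Build a config from environment variables (all defaults are assumed)."""
    env = os.environ if env is None else env
    methods = env.get("RAIN_CALIBRATION_METHODS", "none")
    return WorkerConfig(
        tune_hyperparameters=_env_bool("RAIN_TUNE_HYPERPARAMETERS", False, env),
        max_hyperparam_trials=int(env.get("RAIN_MAX_HYPERPARAM_TRIALS", "10")),
        # Assumed format: comma-separated list of method names.
        calibration_methods=tuple(
            m.strip() for m in methods.split(",") if m.strip()
        ),
        threshold_policy=env.get("RAIN_THRESHOLD_POLICY", "validation"),
        allow_empty_data=_env_bool("RAIN_ALLOW_EMPTY_DATA", False, env),
    )
```

Passing a plain dict as `env` keeps the parsing testable without touching the real process environment.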
@@ -222,6 +239,15 @@ docker compose logs -f rainml
- Prediction rows: `predictions_rain_1h` (probability + threshold decision + realized
outcome fields once available)
### 7) Recommend deploy candidate from saved reports
```sh
python scripts/recommend_rain_model.py \
--reports-glob "models/rain_model_report*.json" \
--require-walk-forward \
--top-k 5 \
--json-out "models/rain_model_recommendation.json"
```
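The ranking logic of `recommend_rain_model.py` is not part of this diff. The sketch below only illustrates what the flags above suggest: load every report matching a glob, optionally skip reports without walk-forward results, and return the top `k` by a score. The report schema (`walk_forward` and `metrics.pr_auc` keys) and the choice of PR-AUC as the ranking metric are hypothetical.

```python
import glob
import json


def rank_reports(reports_glob, top_k=5, require_walk_forward=False):
    """Return up to top_k (path, score) pairs, best score first."""
    ranked = []
    for path in sorted(glob.glob(reports_glob)):
        with open(path, encoding="utf-8") as fh:
            report = json.load(fh)
        # Hypothetical schema: both keys below are assumptions.
        if require_walk_forward and "walk_forward" not in report:
            continue
        score = report.get("metrics", {}).get("pr_auc")
        if score is None:
            continue  # skip reports that lack the ranking metric
        ranked.append((path, float(score)))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:top_k]
```

Sorting the glob results first keeps the order stable when two reports tie on score.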
## Model Features (v1 baseline)
- `pressure_trend_1h`
- `humidity`