[P0] Audit observations_ws90 and observations_baro for missingness, gaps, duplicates, and out-of-order rows. (completed on runtime machine)
[P0] Validate rain label construction from rain_mm (counter resets, negative deltas, spikes). (completed on runtime machine)
[P0] Measure class balance by week (rain-positive vs rain-negative). (completed on runtime machine)
[P1] Document known data issues and mitigation rules. (see docs/rain_data_issues.md)
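
The label-validation item above can be illustrated with a minimal sketch. It assumes rain_mm is a cumulative counter, that a negative delta means a counter reset, and that an implausibly large jump is a spike; the function names, the 50 mm spike cap, and the 0.2 mm label threshold are hypothetical and are not taken from train_rain_model.py or docs/rain_data_issues.md.

```python
from typing import List, Optional, Sequence


def rain_increments(
    counter_mm: Sequence[Optional[float]],
    max_step_mm: float = 50.0,  # hypothetical spike cap per interval
) -> List[Optional[float]]:
    """Per-interval rainfall derived from a cumulative counter.

    Returns None where the increment cannot be trusted: the first row,
    rows next to a missing reading, negative deltas (assumed counter
    resets), and deltas above max_step_mm (assumed spikes).
    """
    if not counter_mm:
        return []
    out: List[Optional[float]] = [None]  # first row has no predecessor
    for prev, cur in zip(counter_mm, counter_mm[1:]):
        if prev is None or cur is None:
            out.append(None)
            continue
        delta = cur - prev
        if delta < 0 or delta > max_step_mm:
            out.append(None)
        else:
            out.append(delta)
    return out


def rain_labels(
    increments: Sequence[Optional[float]],
    min_rain_mm: float = 0.2,  # hypothetical rain-positive threshold
) -> List[Optional[int]]:
    """1 = rain-positive, 0 = rain-negative, None = exclude from training."""
    return [None if d is None else int(d >= min_rain_mm) for d in increments]
```

Marking untrusted intervals as None rather than 0 keeps resets and spikes out of both classes, so they do not distort the weekly class-balance measurement.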

3) Dataset and Feature Engineering

[P1] Extract reusable dataset-builder logic from training script into a maintainable module/workflow.
[P1] Add lag/rolling features (means, stddev, deltas) for core sensor inputs.
[P1] Encode wind direction properly (cyclical encoding).
[P2] Add calendar features (hour-of-day, day-of-week, seasonality proxies). (feature-set=extended_calendar)
[P1] Join aligned forecast features from forecast_openmeteo_hourly (precip prob, cloud cover, wind, pressure).
[P1] Persist versioned dataset snapshots for reproducibility.
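
As a sketch of two of the feature items above: cyclical encoding maps a wind direction in degrees onto the unit circle so that 359 and 1 degrees end up close together, and a trailing rolling mean uses only current and past rows. The function names are hypothetical, and the real dataset builder may implement these differently.

```python
import math
from collections import deque
from typing import List, Optional, Sequence, Tuple


def encode_wind_direction(deg: float) -> Tuple[float, float]:
    """Return (sin, cos) of a wind direction given in degrees."""
    rad = math.radians(deg % 360.0)
    return math.sin(rad), math.cos(rad)


def rolling_mean(values: Sequence[float], window: int) -> List[Optional[float]]:
    """Trailing mean over the last `window` rows, current row included.

    Emits None until the window is full, so no value is computed from
    fewer rows than requested and no future rows are ever used.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    buf: deque = deque(maxlen=window)
    out: List[Optional[float]] = []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / window if len(buf) == window else None)
    return out
```

Both components of the encoded direction would be passed to the model as separate features; using only one of them makes directions such as 90 and 270 degrees indistinguishable on the cosine axis.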

4) Modeling and Validation

[P0] Keep logistic regression as baseline.
[P1] Add at least one tree-based baseline (e.g. gradient boosting). (implemented via hist_gb; runtime evaluation pending installation of local Python dependencies)
[P0] Use strict time-based train/validation/test splits (no random shuffling).
[P1] Add walk-forward backtesting across multiple temporal folds. (train_rain_model.py --walk-forward-folds)
[P1] Tune hyperparameters on validation data only. (train_rain_model.py --tune-hyperparameters)
[P1] Calibrate probabilities (Platt or isotonic) and compare calibration quality. (--calibration-methods)
[P0] Choose and lock the operating threshold based on use-case costs.
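
The time-based split and walk-forward items above can be sketched as an expanding-window scheme over rows already sorted by timestamp. This is an illustration under that assumption only; the fold layout actually used by train_rain_model.py --walk-forward-folds is not shown in this document and may differ, and the function name and parameters here are hypothetical.

```python
from typing import List, Tuple


def walk_forward_folds(
    n_rows: int, n_folds: int, min_train: int
) -> List[Tuple[range, range]]:
    """Expanding-window folds as (train_rows, test_rows) index ranges.

    Every test block starts where its training block ends, so a model is
    never evaluated on rows older than the rows it was trained on.
    """
    if n_folds < 1 or min_train < 1 or n_rows - min_train < n_folds:
        raise ValueError("not enough rows for the requested folds")
    test_size = (n_rows - min_train) // n_folds
    folds = []
    for k in range(n_folds):
        train_end = min_train + k * test_size
        # The last fold absorbs any remainder rows.
        test_end = train_end + test_size if k < n_folds - 1 else n_rows
        folds.append((range(0, train_end), range(train_end, test_end)))
    return folds
```

With lag or rolling features in play, leaving a gap between the end of each training block and the start of its test block is a common extra safeguard; that refinement is omitted here.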

[P0] Report ROC-AUC, PR-AUC, confusion matrix, precision, recall, and Brier score.
[P1] Compare against naive baselines (persistence and simple forecast-threshold rules).
[P2] Slice performance by periods/weather regimes (day/night, rainy weeks, etc.). (sliced_performance_test)
[P1] Produce a short model card (data window, features, metrics, known limitations). (--model-card-out)
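
For reference, the threshold-dependent metrics and the Brier score listed above can be computed as in the following dependency-free sketch. The function names are hypothetical; the training script may rely on a library implementation instead.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> Dict[str, int]:
    """Confusion-matrix counts, predicting rain when prob >= threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, p in zip(y_true, y_prob):
        predicted_rain = p >= threshold
        if predicted_rain:
            counts["tp" if y == 1 else "fp"] += 1
        else:
            counts["fn" if y == 1 else "tn"] += 1
    return counts


def precision_recall(counts: Dict[str, int]) -> Tuple[float, float]:
    """Precision and recall; 0.0 when the denominator is empty."""
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared error between predicted probability and outcome."""
    if not y_true:
        raise ValueError("empty input")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)
```

The same functions can score a naive baseline by passing its outputs as y_prob, for example the previous interval's label as a 0/1 "probability" for persistence.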

[P1] Version model artifacts and feature schema together.
[P0] Implement inference path with feature parity between training and serving.
[P0] Add prediction storage table for predicted probabilities and realized outcomes.
[P1] Expose predictions via API and optionally surface in web dashboard.
[P2] Add scheduled retraining with rollback to last-known-good model. (run_rain_ml_worker.py candidate promote + RAIN_MODEL_BACKUP_PATH)
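
One possible way to tie a model artifact to its feature schema, and to enforce feature parity at inference time, is to store a fingerprint of the ordered feature list next to the model and check it before building each input vector. The sketch below is an assumption about how this could look, not a description of the existing inference path; all names are hypothetical.

```python
import hashlib
import json
from typing import List, Mapping, Sequence


def feature_schema_fingerprint(feature_names: Sequence[str]) -> str:
    """SHA-256 over the ordered feature names (order is significant)."""
    payload = json.dumps(list(feature_names), separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_feature_vector(
    row: Mapping[str, float],
    feature_names: Sequence[str],
    expected_fingerprint: str,
) -> List[float]:
    """Build a model input in training order, refusing schema mismatches."""
    if feature_schema_fingerprint(feature_names) != expected_fingerprint:
        raise ValueError("feature schema does not match the model artifact")
    missing = [name for name in feature_names if name not in row]
    if missing:
        raise KeyError(f"missing features at inference: {missing}")
    return [float(row[name]) for name in feature_names]
```

Failing loudly on a mismatch is a deliberate choice in this sketch: a silently reordered or missing column would still yield a probability, just a wrong one.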

[P1] Track feature drift and prediction drift over time. (view: rain_feature_drift_daily, rain_prediction_drift_daily)
[P1] Track calibration drift and realized performance after deployment. (view: rain_calibration_drift_daily)
[P1] Add alerts for training/inference/data pipeline failures. (scripts/check_rain_pipeline_health.py)
[P1] Document runbook for train/evaluate/deploy/rollback. (see docs/rain_model_runbook.md)
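
The definitions of the drift views named above are not shown here. As a generic illustration of how feature or prediction drift can be quantified, the following sketch computes the Population Stability Index (PSI) between a reference window and a current window; PSI is swapped in as an example metric, and the bin edges and function name are hypothetical.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    edges: Sequence[float],
    eps: float = 1e-6,
) -> float:
    """PSI between two samples, binned by ascending `edges`.

    Bin fractions are floored at `eps` so empty bins do not produce
    log(0). Identical distributions give 0; larger means more drift.
    """
    if not reference or not current:
        raise ValueError("both samples must be non-empty")

    def fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * (len(edges) + 1)
        for v in values:
            # Bin index = number of edges at or below the value.
            counts[sum(1 for e in edges if v >= e)] += 1
        return [max(c / len(values), eps) for c in counts]

    ref = fractions(reference)
    cur = fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))
```

Applied per day against a fixed training-window reference, a value like this could back a daily drift series; any alerting cut-off would need to be chosen from observed history rather than assumed.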

[P0] Run first full data audit and label-quality checks. (completed on runtime machine)
[P0] Train baseline model on full available history and capture metrics. (completed on runtime machine)
[P1] Add one expanded feature set and rerun evaluation. (completed on runtime machine 2026-03-12 with feature_set=extended, model_version=rain-auto-v1-extended-202603120932)
[P0] Decide v1 threshold and define deployment interface.
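
The threshold decision above can be framed as minimising expected misclassification cost on validation data. The sketch below assumes a fixed cost per false positive and per false negative; the function names, cost values, and candidate grid are hypothetical and would need to come from the actual use case.

```python
from typing import Optional, Sequence


def expected_cost(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    threshold: float,
    cost_fp: float,
    cost_fn: float,
) -> float:
    """Average cost per row when predicting rain for prob >= threshold."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        predicted_rain = p >= threshold
        if predicted_rain and y == 0:
            total += cost_fp  # false alarm
        elif not predicted_rain and y == 1:
            total += cost_fn  # missed rain
    return total / len(y_true)


def choose_threshold(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    cost_fp: float,
    cost_fn: float,
    candidates: Optional[Sequence[float]] = None,
) -> float:
    """Candidate threshold with the lowest expected cost (lowest on ties)."""
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]
    return min(
        candidates,
        key=lambda t: (expected_cost(y_true, y_prob, t, cost_fp, cost_fn), t),
    )
```

When missed rain is costlier than a false alarm, the selected threshold moves down; when false alarms are costlier, it moves up. Selecting it on validation data and then freezing it keeps the test-set metrics honest.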
