go-weatherstation/docs/rain_prediction.md
2026-02-02 17:08:43 +11:00

Rain Prediction (Next 1 Hour)

This project includes a starter training script for a binary rain-prediction task:

Will we see >= 0.2 mm of rain in the next hour?

It uses local observations (WS90 + barometric pressure) and trains a lightweight logistic regression model. This is a baseline you can iterate on as you collect more data.

What the script does

  • Pulls data from TimescaleDB.
  • Resamples observations to 5-minute buckets.
  • Derives pressure trend (1h) from barometer data.
  • Computes future 1-hour rainfall from the cumulative rain_mm counter.
  • Trains a model and prints evaluation metrics.

The script can optionally save the trained model to a file that you can load later for inference.
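The resampling, pressure-trend, and labelling steps above can be sketched with pandas as follows. This is a minimal illustration, not the script's actual code: the `pressure_hpa` column name and the `build_dataset` helper are assumptions, and the sketch ignores rain counter resets.

```python
import pandas as pd

STEPS_PER_HOUR = 12  # twelve 5-minute buckets per hour


def build_dataset(obs: pd.DataFrame, threshold_mm: float = 0.2) -> pd.DataFrame:
    """Build features and labels from raw observations.

    obs is indexed by UTC timestamp and has the columns
    pressure_hpa (hypothetical name) and rain_mm (cumulative counter).
    """
    # Resample to 5-minute buckets: mean pressure, last counter reading.
    df = obs.resample("5min").agg({"pressure_hpa": "mean", "rain_mm": "last"})
    # Carry the last known counter value across empty buckets.
    df["rain_mm"] = df["rain_mm"].ffill()
    # Pressure trend: change over the previous hour.
    df["pressure_trend_1h"] = df["pressure_hpa"] - df["pressure_hpa"].shift(STEPS_PER_HOUR)
    # Future rainfall: counter one hour ahead minus counter now.
    df["rain_next_1h_mm"] = df["rain_mm"].shift(-STEPS_PER_HOUR) - df["rain_mm"]
    # Drop rows without a full hour of history or a full hour of future data.
    df = df.dropna(subset=["pressure_trend_1h", "rain_next_1h_mm"])
    # Binary label: did at least threshold_mm fall in the next hour?
    return df.assign(rain_next_1h=lambda d: (d["rain_next_1h_mm"] >= threshold_mm).astype(int))
```

Note that the first and last hour of any dataset are dropped, because those rows lack either the pressure history or the future rainfall needed for the label.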

Requirements

Python 3.10+ and:

pandas
numpy
scikit-learn
psycopg2-binary
joblib

Install with:

python3 -m venv .venv
source .venv/bin/activate
pip install -r scripts/requirements.txt

Usage

python scripts/train_rain_model.py \
  --db-url "postgres://postgres:postgres@localhost:5432/micrometeo?sslmode=disable" \
  --site "home" \
  --start "2026-01-01" \
  --end "2026-02-01" \
  --out "models/rain_model.pkl"

You can also provide the connection string via DATABASE_URL:

export DATABASE_URL="postgres://postgres:postgres@localhost:5432/micrometeo?sslmode=disable"
python scripts/train_rain_model.py --site home
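One way to implement the fallback from `--db-url` to `DATABASE_URL` is to use the environment variable as the argparse default. The sketch below is an assumption about how such parsing could look, not a copy of the script; in particular, treating `--site` as required and the `parse_args` helper itself are hypothetical.

```python
import argparse
import os


def parse_args(argv=None, env=None):
    """Parse CLI flags, falling back to DATABASE_URL for the connection string."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Train a 1-hour rain model (sketch)")
    # An explicit --db-url wins; otherwise use DATABASE_URL if it is set.
    parser.add_argument("--db-url", default=env.get("DATABASE_URL"))
    parser.add_argument("--site", required=True)  # assumed to be required
    parser.add_argument("--start")
    parser.add_argument("--end")
    parser.add_argument("--out")
    args = parser.parse_args(argv)
    if not args.db_url:
        parser.error("provide --db-url or set DATABASE_URL")
    return args
```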

Output

The script prints metrics including:

  • accuracy
  • precision / recall
  • ROC AUC
  • confusion matrix
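For reference, the metrics listed above can all be computed with scikit-learn. The following is a minimal sketch; the `report_metrics` helper and the 0.5 decision threshold are assumptions, not necessarily what the script uses.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
    roc_auc_score,
)


def report_metrics(y_true, y_prob, threshold=0.5):
    """Compute the listed metrics from true labels and predicted rain probabilities."""
    # Turn probabilities into hard 0/1 predictions at the chosen threshold.
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, zero_division=0),
        "recall": recall_score(y_true, y_pred, zero_division=0),
        # ROC AUC is computed from the probabilities, not the hard predictions.
        "roc_auc": roc_auc_score(y_true, y_prob),
        # Rows are true classes, columns are predicted classes (0 = dry, 1 = rain).
        "confusion_matrix": confusion_matrix(y_true, y_pred, labels=[0, 1]),
    }
```

Because rain is usually the minority class, accuracy alone can look good for a model that never predicts rain, so precision, recall, and ROC AUC are worth reading together.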

If joblib is installed, the script saves a model bundle to the path given by --out, for example:

models/rain_model.pkl

This bundle contains:

  • The trained model pipeline
  • The feature list used during training
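A bundle like this could be written and read back with joblib roughly as shown below. The dictionary keys `"model"` and `"features"` and both helper functions are assumptions for illustration; check the training script for the actual bundle layout.

```python
import joblib
import pandas as pd


def save_bundle(path, model, features):
    """Store the model together with the feature order it was trained on."""
    joblib.dump({"model": model, "features": list(features)}, path)


def predict_rain_probability(path, observation: dict) -> float:
    """Load a bundle and return P(rain in the next hour) for one observation."""
    bundle = joblib.load(path)
    # Select and order the columns exactly as during training;
    # extra keys in the observation are ignored.
    X = pd.DataFrame([observation])[bundle["features"]]
    # Column 1 of predict_proba is the probability of the positive class.
    return float(bundle["model"].predict_proba(X)[0, 1])
```

Keeping the feature list next to the model guards against silently feeding columns in the wrong order at inference time.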

Data needs / when to run

For a reliable model, you will want:

  • At least 2-4 weeks of observations
  • A mix of rainy and non-rainy periods

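Training on only a few days of data will likely produce an unstable model, because a short window may contain very few rainy (or very few dry) hours to learn from.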
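A quick pre-training sanity check along these lines can flag a dataset that is too short or too one-sided. This is a sketch only: the `check_training_data` helper and the `min_positives` default of 20 are arbitrary assumptions, while the 14-day minimum mirrors the lower bound given above.

```python
import pandas as pd


def check_training_data(
    labels: pd.Series, min_days: int = 14, min_positives: int = 20
) -> list[str]:
    """Return a list of problems found in 0/1 rain labels indexed by timestamp."""
    problems = []
    # Time covered by the dataset, from first to last label.
    span = labels.index.max() - labels.index.min()
    if span < pd.Timedelta(days=min_days):
        problems.append(f"only {span.days} days of data, want at least {min_days}")
    positives = int(labels.sum())
    negatives = len(labels) - positives
    if positives < min_positives:
        problems.append(f"only {positives} rainy samples")
    if negatives < min_positives:
        problems.append(f"only {negatives} dry samples")
    return problems
```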

Features used

The baseline model uses:

  • pressure_trend_1h (hPa)
  • humidity (%)
  • temperature_c (°C)
  • wind_avg_m_s (m/s)
  • wind_max_m_s (m/s)

These are easy to expand once you have more data (e.g. add forecast features).
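To show how the feature list feeds a logistic regression baseline, here is a minimal scikit-learn sketch. The scaling step, `class_weight="balanced"`, the chronological 80/20 split, and the `train_baseline` helper are assumptions rather than the script's confirmed settings; the `rain_next_1h` label column is also a hypothetical name.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

FEATURES = [
    "pressure_trend_1h",
    "humidity",
    "temperature_c",
    "wind_avg_m_s",
    "wind_max_m_s",
]


def train_baseline(df: pd.DataFrame, label_col: str = "rain_next_1h", test_fraction: float = 0.2):
    """Fit a scaled logistic regression and return it with the held-out rows."""
    # Split chronologically so the model is evaluated on data after the training period.
    df = df.sort_index()
    split = int(len(df) * (1 - test_fraction))
    train, test = df.iloc[:split], df.iloc[split:]
    model = make_pipeline(
        StandardScaler(),
        # Balanced class weights compensate for rain being the rare class.
        LogisticRegression(class_weight="balanced", max_iter=1000),
    )
    model.fit(train[FEATURES], train[label_col])
    return model, test
```

Adding a feature then amounts to appending its column name to `FEATURES` and making sure the column exists in the training frame.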

Notes / assumptions

  • Rain detection is based on incremental rain derived from the WS90 rain_mm cumulative counter.
  • Pressure comes from observations_baro.
  • All timestamps are treated as UTC.
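Deriving incremental rain from a cumulative counter is a simple difference, but it is worth deciding what happens if the counter ever goes backwards. The sketch below assumes a decrease means the counter was reset and counts that step as zero rain; whether the WS90 counter actually resets, and how the script treats it, is not stated here. The `incremental_rain` helper is hypothetical.

```python
import pandas as pd


def incremental_rain(cumulative: pd.Series) -> pd.Series:
    """Convert a cumulative rain_mm counter into per-interval rainfall."""
    # Difference between consecutive readings; the first value has no predecessor.
    delta = cumulative.diff()
    # A negative step is assumed to be a counter reset, so count it as no rain.
    delta = delta.clip(lower=0)
    return delta.fillna(0.0)
```

For example, readings of 1.0, 1.5, 1.5, 0.0, 0.25 mm yield increments of 0.0, 0.5, 0.0, 0.0, 0.25 mm.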

Next improvements

Ideas once more data is available:

  • Add forecast precipitation and cloud cover as features
  • Try gradient boosted trees (e.g. XGBoost / LightGBM)
  • Train per-season models
  • Calibrate probabilities (Platt scaling / isotonic regression)
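As an illustration of the calibration idea, scikit-learn's `CalibratedClassifierCV` supports both Platt scaling (`method="sigmoid"`) and isotonic regression (`method="isotonic"`). The sketch below is one possible setup, not part of the current script; the `fit_calibrated` helper and the choice of 3 folds are assumptions.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def fit_calibrated(X, y, method: str = "sigmoid"):
    """Fit a logistic regression wrapped in cross-validated probability calibration.

    method="sigmoid" is Platt scaling; method="isotonic" is isotonic regression.
    """
    base = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    # Each fold trains the base model on one part and fits the calibrator on the rest.
    calibrated = CalibratedClassifierCV(base, method=method, cv=3)
    calibrated.fit(X, y)
    return calibrated
```

Isotonic regression is more flexible but tends to overfit on small datasets, so Platt scaling is probably the safer first choice while rainy samples are scarce. The default folds are also not time-aware, which may matter for weather data.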