
# Rain Model Data Issues and Mitigations

This document captures known data-quality issues observed in the rain-model pipeline and the mitigation rules used in code.

## Issue Register

| Area | Known issue | Mitigation in code/workflow |
|---|---|---|
| WS90 rain counter (`rain_mm`) | Counter resets can produce negative deltas. | `rain_inc_raw = diff(rain_mm)`, then `rain_inc = clip(rain_inc_raw, lower=0)`; reset events (negative raw deltas) are tracked as `rain_reset`. |
| WS90 rain spikes | Isolated large 5-minute jumps may be sensor/transmission anomalies. | Increments of at least 5.0 mm per 5-minute bucket are flagged as `rain_spike_5m`; counts are tracked in the audit and training reports. |
| Sensor gaps | Missing 5-minute buckets from WS90/barometer ingestion. | Resample to a 5-minute grid; barometer values are interpolated across short gaps only (`limit=3`); gap lengths are tracked by the audit. |
| Out-of-order arrivals | Late MQTT events can arrive with older `ts`. | Audit reports out-of-order count by sorting on `received_at` and checking `ts` monotonicity. |
| Duplicate rows | Replays/reconnects can duplicate sensor rows. | Audit reports duplicate counts by `(ts, station_id)` for WS90 and `(ts, source)` for barometer. |
| Forecast sparsity/jitter | Hourly forecast retrieval cadence does not always align with 5-minute features. | Select the latest forecast per `ts` (`DISTINCT ON` + `retrieved_at DESC`), resample to 5 minutes, apply short forward-fill/backfill windows, and clip `fc_precip_prob` to `[0,1]`. |
| Local vs UTC day boundary | Daily rainfall resets can look wrong when the local timezone is not respected. | Station timezone is configured via `site.timezone` and used by the Wunderground uploader; model training/inference stays UTC-based for split consistency. |
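
The counter, spike, out-of-order, and duplicate rules in the register can be sketched with pandas as follows. This is a minimal illustration rather than the pipeline's actual code: the function names and the DataFrame layout are assumptions, and only the column names and the 5.0 mm threshold come from the table above.

```python
import pandas as pd

# Threshold taken from the issue register: >= 5.0 mm in one 5-minute bucket.
SPIKE_THRESHOLD_MM = 5.0


def rain_increments(rain_mm: pd.Series) -> pd.DataFrame:
    """Derive per-bucket increments from the cumulative WS90 counter.

    Hypothetical helper; assumes rain_mm is already on a 5-minute grid.
    """
    raw = rain_mm.diff()  # first row is NaN, negative on a counter reset
    out = pd.DataFrame({"rain_inc_raw": raw})
    out["rain_reset"] = raw < 0            # NaN compares as False
    out["rain_inc"] = raw.clip(lower=0)    # resets contribute 0 mm
    out["rain_spike_5m"] = out["rain_inc"] >= SPIKE_THRESHOLD_MM
    return out


def count_out_of_order(df: pd.DataFrame) -> int:
    """Count rows whose ts moves backwards when ordered by received_at."""
    ordered = df.sort_values("received_at", kind="stable")
    return int((ordered["ts"].diff() < pd.Timedelta(0)).sum())


def count_duplicates(df: pd.DataFrame, keys: list) -> int:
    """Count rows that repeat an earlier row on the given key columns."""
    return int(df.duplicated(subset=keys).sum())
```

For WS90 rows the duplicate key would be `["ts", "station_id"]`, and for barometer rows `["ts", "source"]`, matching the keys listed in the register.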

## Audit Command

Run this regularly and retain JSON reports for comparison:

```sh
python scripts/audit_rain_data.py \
  --site home \
  --start "2026-02-01T00:00:00Z" \
  --end "2026-03-03T23:55:00Z" \
  --feature-set "extended" \
  --forecast-model "ecmwf" \
  --out "models/rain_data_audit.json"
```

## Operational Rules

- Treat large jumps in `rain_reset_count` or `rain_spike_5m_count` as data-quality incidents.
- If `gaps_5m.ws90_max_gap_minutes` or `gaps_5m.baro_max_gap_minutes` exceeds one hour, avoid model refresh until ingestion stabilizes.
- If the forecast row count is unexpectedly low for an `extended` feature run, either fix forecast ingestion first or temporarily fall back to the `baseline` feature set.
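
The first two rules could be automated against retained audit reports along these lines. This is a sketch under assumptions: the report is assumed to be a JSON object in which `gaps_5m` is a nested object and the two counters are top-level keys, and `JUMP_FACTOR` is a hypothetical stand-in for what counts as a "large jump"; none of these are confirmed by the audit script.

```python
import json
from pathlib import Path
from typing import Any, List, Optional

MAX_GAP_MINUTES = 60  # "exceeds one hour" in the rules above
JUMP_FACTOR = 3.0     # hypothetical: flag counters that more than triple


def load_report(path: str) -> dict:
    """Read one audit report, e.g. models/rain_data_audit.json."""
    return json.loads(Path(path).read_text())


def get_path(report: dict, dotted: str, default: Any = 0) -> Any:
    """Look up a dotted key such as 'gaps_5m.ws90_max_gap_minutes'."""
    node: Any = report
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def check_report(current: dict, previous: Optional[dict] = None) -> List[str]:
    """Return human-readable problems; an empty list means no rule fired."""
    problems = []
    for key in ("gaps_5m.ws90_max_gap_minutes", "gaps_5m.baro_max_gap_minutes"):
        gap = get_path(current, key)
        if gap > MAX_GAP_MINUTES:
            problems.append(f"{key}={gap} exceeds {MAX_GAP_MINUTES} min: hold model refresh")
    if previous is not None:
        # Assumed report layout: counters stored as top-level keys.
        for key in ("rain_reset_count", "rain_spike_5m_count"):
            now, before = get_path(current, key), get_path(previous, key)
            if now > JUMP_FACTOR * max(before, 1):
                problems.append(f"{key} jumped from {before} to {now}: data-quality incident")
    return problems
```

Passing the previous run's report as `previous` is what makes retaining the JSON files useful: the counters are only meaningful relative to an earlier baseline.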