@@ -124,6 +124,13 @@ The benchmark command:
- Runs Go and SQL aggregation cores for the latest available daily/monthly windows.
- Writes results to startup logs and exits without changing scheduled defaults.

### Benchmark method and decision record
- Run the benchmark on the target environment and database profile before deciding defaults:
  - `vctp -settings /path/to/vctp.yml -benchmark-aggregations -benchmark-runs 3`
- Current local comparison snapshot (2026-04-20) is recorded in `phase-metrics-2026-04-20.md`.
- Default-path decision remains `settings.scheduled_aggregation_engine: go`.
- Promote SQL only when representative production-scale **Postgres** runs show clear, repeatable wins.

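One way to make "clear, repeatable wins" concrete is to require SQL to beat Go in every paired run by a fixed margin. The sketch below is illustrative only: the function name, the 1.2x threshold, the sample timings, and the `sql` engine value are assumptions, not part of vctp.

```go
package main

import "fmt"

// sqlClearlyWins reports whether SQL was faster than Go in every paired run
// by at least minSpeedup (1.2 means Go took at least 1.2x as long as SQL).
// The pairing of runs and the threshold are assumptions made for this sketch.
func sqlClearlyWins(goSecs, sqlSecs []float64, minSpeedup float64) bool {
	if len(goSecs) == 0 || len(goSecs) != len(sqlSecs) {
		return false // no evidence or unpaired runs: keep the Go default
	}
	for i := range goSecs {
		if sqlSecs[i] <= 0 || goSecs[i]/sqlSecs[i] < minSpeedup {
			return false // a single weak or invalid run is not "repeatable"
		}
	}
	return true
}

func main() {
	// Hypothetical timings in seconds for three runs (-benchmark-runs 3).
	goSecs := []float64{0.90, 0.95, 0.92}
	sqlSecs := []float64{0.88, 0.40, 0.91}

	engine := "go"
	if sqlClearlyWins(goSecs, sqlSecs, 1.2) {
		engine = "sql" // assumed value; the source only confirms "go"
	}
	fmt.Println("scheduled_aggregation_engine:", engine)
}
```

With these sample timings only one of the three runs clears the margin, so the sketch keeps the Go default.
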
## Database Configuration
By default the app uses SQLite and creates/opens `db.sqlite3`.

@@ -351,6 +358,44 @@ These endpoints are considered legacy and are disabled by default unless `settin

When disabled, they return HTTP `410 Gone` with a JSON error payload.

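As a rough illustration of the disabled behaviour, a handler could answer every request with `410 Gone` and a small JSON body. This is a sketch, not vctp's actual handler: the `error` field name, the message text, and the request path are assumptions, since the source only says the payload is JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// goneBody builds the JSON error payload. The field name and message are
// assumptions; the real payload shape is not documented here.
func goneBody() []byte {
	b, _ := json.Marshal(map[string]string{"error": "legacy endpoint is disabled"})
	return b
}

// legacyDisabled answers any request with 410 Gone and the JSON body above.
func legacyDisabled(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusGone)
	_, _ = w.Write(goneBody())
}

func main() {
	// Exercise the handler in memory; "/legacy/example" is a hypothetical path.
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/legacy/example", nil)
	legacyDisabled(rec, req)
	fmt.Println(rec.Code, rec.Body.String())
}
```
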
## Compatibility mode lifecycle (`snapshot_table_compat_mode`)
- Default is `true` during migration phases.
- `true`: scheduled hourly capture continues writing legacy `inventory_hourly_*` outputs in addition to canonical tables.
- `false`: scheduled hourly capture writes canonical hourly cache and lifecycle/totals caches only.
- Disable criteria:
  - parity/integration/compatibility test gates are passing
  - baseline-vs-post-change metrics comparison is recorded and accepted
  - repair/backfill workflows are validated in the target environment
- Rollback to legacy hourly output is immediate: set `snapshot_table_compat_mode: true` and restart the service.
- Compatibility repair/backfill workflows remain available through:
  - `POST /api/snapshots/aggregate`
  - `POST /api/snapshots/repair`
  - `POST /api/snapshots/repair/all`
  - `POST /api/snapshots/regenerate-hourly-reports`
  - `POST /api/vcenters/cache/rebuild`
  - `vctp -settings /path/to/vctp.yml -backfill-vcenter-cache`

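The disable criteria above amount to an all-must-pass checklist. A minimal sketch of that check follows; the `disableGates` type and its field names are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// disableGates mirrors the three disable criteria listed above.
// The type and field names are hypothetical, chosen for this sketch.
type disableGates struct {
	TestGatesPassing          bool // parity/integration/compatibility suites
	MetricsComparisonAccepted bool // baseline-vs-post-change comparison
	RepairWorkflowsValidated  bool // repair/backfill checked in the target environment
}

// unmetGates returns the criteria that still block setting
// snapshot_table_compat_mode to false.
func unmetGates(g disableGates) []string {
	var unmet []string
	if !g.TestGatesPassing {
		unmet = append(unmet, "test gates")
	}
	if !g.MetricsComparisonAccepted {
		unmet = append(unmet, "metrics comparison")
	}
	if !g.RepairWorkflowsValidated {
		unmet = append(unmet, "repair/backfill validation")
	}
	return unmet
}

func main() {
	// Example state: repair workflows have not been validated yet.
	g := disableGates{TestGatesPassing: true, MetricsComparisonAccepted: true}
	if unmet := unmetGates(g); len(unmet) > 0 {
		fmt.Println("keep snapshot_table_compat_mode: true; unmet:", unmet)
		return
	}
	fmt.Println("criteria met; snapshot_table_compat_mode: false is an option")
}
```
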
## Migration runbook (staged rollout, rollback, repair)
1. Baseline: capture current metrics/state (`phase0-baseline.md` style snapshot) and verify auth/report contracts.
2. Enable canonical runtime settings (already defaulted): `capture_write_batch_size: 1000`, `snapshot_table_compat_mode: true`, `async_report_generation: true`, `scheduled_aggregation_engine: go`.
3. Deploy and monitor: review `/metrics`, `snapshot_runs`, `cron_status`, and generated reports for at least one full hourly/daily cycle.
4. Validate canonicity gates: run parity/integration/compatibility suites and compare baseline vs post-change metrics.
5. Optional compatibility reduction: set `snapshot_table_compat_mode: false` only after step 4 passes and repair workflows are validated.
6. SQL default switch gate: only evaluate after production-scale Postgres benchmark evidence; otherwise keep `scheduled_aggregation_engine: go`.

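Before deploying in step 3, it can help to confirm that the effective configuration still matches the four values from step 2. The sketch below models them as a plain struct and reports any key that differs. The struct, its field names, and the `sql` value in the example are assumptions for illustration; how vctp actually loads `vctp.yml` is not shown here.

```go
package main

import "fmt"

// runtimeSettings holds only the four settings named in step 2.
// The struct and field names are hypothetical.
type runtimeSettings struct {
	CaptureWriteBatchSize      int
	SnapshotTableCompatMode    bool
	AsyncReportGeneration      bool
	ScheduledAggregationEngine string
}

// step2Settings returns the values listed in step 2 of the runbook.
func step2Settings() runtimeSettings {
	return runtimeSettings{
		CaptureWriteBatchSize:      1000,
		SnapshotTableCompatMode:    true,
		AsyncReportGeneration:      true,
		ScheduledAggregationEngine: "go",
	}
}

// drift lists the YAML keys whose values differ from step 2.
func drift(s runtimeSettings) []string {
	want := step2Settings()
	var keys []string
	if s.CaptureWriteBatchSize != want.CaptureWriteBatchSize {
		keys = append(keys, "capture_write_batch_size")
	}
	if s.SnapshotTableCompatMode != want.SnapshotTableCompatMode {
		keys = append(keys, "snapshot_table_compat_mode")
	}
	if s.AsyncReportGeneration != want.AsyncReportGeneration {
		keys = append(keys, "async_report_generation")
	}
	if s.ScheduledAggregationEngine != want.ScheduledAggregationEngine {
		keys = append(keys, "scheduled_aggregation_engine")
	}
	return keys
}

func main() {
	current := step2Settings()
	current.ScheduledAggregationEngine = "sql" // assumed value, for illustration
	fmt.Println("settings differing from step 2:", drift(current))
}
```
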
Rollback triggers:
- sustained increase in `vctp_*_failed_total` metrics
- missing/stale summary tables or report outputs
- material mismatch between totals endpoints and expected aggregates
- repeated job timeout or cron failure indicators

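To watch the first trigger, the text served at `/metrics` can be scanned for counters matching `vctp_*_failed_total` and compared across scrapes. The sketch below assumes the endpoint uses the Prometheus text exposition format and uses a deliberately simplified parser; the metric and label names in the sample data are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// failedTotals sums, per metric name, every sample whose name starts with
// "vctp_" and ends with "_failed_total". It is a simplified parser: it
// assumes "name{labels} value" or "name value" lines and skips anything else.
func failedTotals(exposition string) map[string]float64 {
	totals := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // blank line or HELP/TYPE comment
		}
		end := strings.IndexAny(line, "{ ")
		if end < 0 {
			continue
		}
		name := line[:end]
		if !strings.HasPrefix(name, "vctp_") || !strings.HasSuffix(name, "_failed_total") {
			continue
		}
		rest := line[end:]
		if i := strings.LastIndex(rest, "}"); i >= 0 {
			rest = rest[i+1:] // drop the label set
		}
		fields := strings.Fields(rest)
		if len(fields) == 0 {
			continue
		}
		if v, err := strconv.ParseFloat(fields[0], 64); err == nil {
			totals[name] += v
		}
	}
	return totals
}

func main() {
	// Two hypothetical scrapes taken some time apart.
	before := failedTotals("vctp_example_job_failed_total{vcenter=\"a\"} 1\n")
	after := failedTotals("vctp_example_job_failed_total{vcenter=\"a\"} 4\n")
	for name, v := range after {
		if v > before[name] {
			fmt.Printf("%s rose from %v to %v\n", name, before[name], v)
		}
	}
}
```

A single rise between two scrapes is not "sustained"; in practice the comparison would be repeated over several intervals before treating it as a rollback trigger.
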
Rollback actions:
1. Set `scheduled_aggregation_engine: go` (if changed) and restart.
2. Set `snapshot_table_compat_mode: true` and restart.
3. Run `POST /api/snapshots/repair/all`.
4. Run `POST /api/snapshots/regenerate-hourly-reports` and/or `-backfill-vcenter-cache` as needed.
5. Re-check `/metrics`, `snapshot_runs`, and endpoint/report correctness before closing the incident.

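Rollback actions 3 and 4 are plain POST calls, so they can be scripted in order. The sketch below only builds the requests and prints them; the base URL is a placeholder, and any authentication headers the API may require are omitted because they are not described here.

```go
package main

import (
	"fmt"
	"net/http"
)

// rollbackRequests builds, in order, the POST requests for rollback
// actions 3 and 4. baseURL is supplied by the caller; no auth is added.
func rollbackRequests(baseURL string) ([]*http.Request, error) {
	paths := []string{
		"/api/snapshots/repair/all",
		"/api/snapshots/regenerate-hourly-reports",
	}
	reqs := make([]*http.Request, 0, len(paths))
	for _, p := range paths {
		req, err := http.NewRequest(http.MethodPost, baseURL+p, nil)
		if err != nil {
			return nil, err
		}
		reqs = append(reqs, req)
	}
	return reqs, nil
}

func main() {
	// "http://vctp.example.internal" is a hypothetical address.
	reqs, err := rollbackRequests("http://vctp.example.internal")
	if err != nil {
		fmt.Println("building requests failed:", err)
		return
	}
	for _, r := range reqs {
		// Dry run: print instead of sending with http.DefaultClient.Do(r).
		fmt.Println(r.Method, r.URL.String())
	}
}
```

Keeping the request list separate from the sending step makes it easy to review the sequence before running it during an incident.
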
## Settings Reference
All configuration lives under the top-level `settings:` key in `vctp.yml`.

@@ -417,6 +462,9 @@ Snapshots:
- `settings.hourly_index_max_age_days`: age gate for keeping per-hourly-table indexes (`-1` disables cleanup, `0` trims all)
- `settings.snapshot_cleanup_cron`: cron expression for cleanup job
- `settings.reports_dir`: directory to store generated XLSX reports (default: `/var/lib/vctp/reports`)
- `settings.capture_write_batch_size`: hourly canonical write batch size (default: `1000`)
- `settings.snapshot_table_compat_mode`: keep writing legacy hourly snapshot tables during migration (default: `true`)
- `settings.async_report_generation`: defer report generation from the hourly capture hot path (default: `true`)
- `settings.report_summary_pivots`: optional list to override Summary worksheet pivot titles/names/ranges in daily/monthly XLSX reports
  - `metric`: one of `avg_vcpu`, `avg_ram`, `prorated_vm_count`, `vm_name_count`
  - `title`: pivot title text shown on Summary sheet
