Context
- Customers often ask us "how accurate are your forecasts?"
- Many commercial engagements begin with a validation process to build confidence
- Salient calculates and publishes skill scores with every model release
- This guide shows you how to reproduce and understand Salient's internal skill score calculations
- If you're starting to work with Salient, this recipe can accelerate your testing
- The follow-on step is to demonstrate business value by testing Salient's forecasts in decision-making
Validating vs ERA5
https://www.loom.com/share/6cafcaa62f4f4a949955f97efd77c0ca
- Salient evaluates probabilistic forecasts using the Continuous Ranked Probability Score (CRPS):
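For a forecast CDF F and an observed value y, the standard definition (included here for reference) is:

$$\mathrm{CRPS}(F, y) = \int_{-\infty}^{\infty} \left( F(x) - \mathbf{1}\{x \ge y\} \right)^2 \, dx$$

Lower is better: CRPS is zero for a perfect forecast and reduces to mean absolute error when the forecast is deterministic.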
- We calculate CRPS for Salient and other forecasts and publish skill scores for our customers:
- Calculating skill is part of our product development and improvement process
- Every variable, timescale, lead time, and lat/lon point has a unique skill score
- The easiest way to see Salient's skill metrics is visually in the dashboard's "Skill" tab
- All precomputed skill scores are also available via the hindcast_summary software development kit (SDK) function
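As a quick orientation, here is a minimal sketch of pulling those precomputed scores. hindcast_summary is the function named above, but the argument names and values shown are assumptions, so check the SDK docs or the validate notebook for the exact signature:

```python
import salientsdk as sk

# Authenticate first (credentials are placeholders).
sk.login("username", "password")

# Illustrative call: fetch precomputed skill scores at one lat/lon.
# Argument names/values are assumptions; see help(sk.hindcast_summary).
loc = sk.Location(lat=40.0, lon=-105.0)
scores = sk.hindcast_summary(loc=loc, variable="temp", timescale="sub-seasonal")
print(scores)
```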
- Reproduce the methodology that calculates Salient's skill scores with this process:
- Start with the validate Python notebook in salientsdk
- Download all forecasts since 2015, for both the Salient model and a "reference" model for comparison
- Note: Salient's models are trained on pre-2015 data, so these forecasts are out of sample with no leakage of truth into the model training
- Download historical ERA5, which will be the "truth" for the CRPS calculation
- Calculate CRPS for the Salient forecast using the included SDK's crps function
- Also calculate a CRPS for the reference model
- Calculate relative skill with the SDK's crpss function to quantify the improvement of the Salient forecast over the reference (a worked numeric example follows this list).
- (Optional) Compare your hand-calculated skill score to the hindcast_summary precomputed scores. If you set split_set="test" in the notebook, your locally calculated scores will match the precomputed scores.
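To make the skill arithmetic concrete, here is a small self-contained example (plain NumPy rather than the SDK) that computes the standard ensemble CRPS estimator and the resulting CRPSS. The SDK's crps and crpss functions perform the equivalent calculation across full hindcast datasets:

```python
import numpy as np

def ensemble_crps(ens: np.ndarray, obs: float) -> float:
    """Standard CRPS estimator for an ensemble forecast of a single value:
    CRPS = E|X - y| - 0.5 * E|X - X'|."""
    spread = np.abs(ens[:, None] - ens[None, :]).mean()
    return np.abs(ens - obs).mean() - 0.5 * spread

rng = np.random.default_rng(42)
obs = 1.0                                   # the ERA5 "truth" value
salient = rng.normal(0.8, 0.5, size=50)     # sharper, better-centered ensemble
reference = rng.normal(0.0, 1.0, size=50)   # e.g., a climatology-like ensemble

crps_fcst = ensemble_crps(salient, obs)
crps_ref = ensemble_crps(reference, obs)

# CRPSS = 1 - CRPS_forecast / CRPS_reference:
# positive means the forecast improves on the reference; 1.0 is perfect.
crpss = 1 - crps_fcst / crps_ref
print(f"CRPS forecast={crps_fcst:.3f}  reference={crps_ref:.3f}  CRPSS={crpss:.3f}")
```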
- Customize your validation by changing flexible control parameters (an illustrative set follows this list):
- Variable: temperature, precipitation, wind speed, solar irradiance
- Timescale: weeks 1-5, months 1-3, quarters 1-4
- Compare: Salient blend vs. climatology, NOAA GEFS, ECMWF ENS, or ECMWF SEAS5
- Location: any vector of lat/lons, or a polygon shapefile
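For example, one parameter set might look like this. The names and accepted values shown are assumptions; the notebook documents the real ones:

```python
import salientsdk as sk

# Illustrative notebook control parameters; the exact accepted strings
# are documented in the validate notebook itself.
variable = "precip"          # temperature, precipitation, wind speed, or solar
timescale = "sub-seasonal"   # weekly leads; seasonal = monthly, long-range = quarterly
reference = "clim"           # climatology; or a GEFS / ENS / SEAS5 identifier
loc = sk.Location(lat=46.1, lon=-122.5)  # vectors of lat/lons or a polygon
                                         # shapefile are also supported
```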
Key takeaway: Reproducing Salient's skill methodology lets you understand where the scores come from and build confidence in the process.
Validating vs Met Station Observations
https://www.loom.com/share/9d1616ac752e4d139b5e7bf241fd314f
- Some customers prefer to validate Salient's forecasts against on-the-ground weather station observations as the source of truth, instead of the ERA5 reanalysis dataset
- Because ERA5 assimilates multiple inputs over a large geographic area, it may not perfectly represent conditions at a specific point
- First, you'll need a source of observed meteorological data. You can use proprietary tabular data or fetch public station records via the included function get_ghcnd
- Use the function make_observed_ds to convert daily tabular station observations into the same xarray.Dataset format returned by Salient's historical ERA5 function
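A minimal sketch of that step, using the two functions named above; their exact signatures are assumptions here, so check the SDK docs:

```python
import salientsdk as sk

loc = sk.Location(lat=40.0, lon=-105.0)

# Fetch GHCN-Daily station records near the point of interest.
# (Arguments are assumptions; see help(sk.get_ghcnd).)
obs_tabular = sk.get_ghcnd(loc=loc, variable="temp")

# Convert the daily tabular observations into the same xarray.Dataset
# layout that Salient's historical ERA5 function returns, so the stations
# can serve as "truth" in the CRPS calculation.
obs_ds = sk.make_observed_ds(obs_tabular, variable="temp")
```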
- (Optional) Compare the historical observation-ERA5 bias to characterize the magnitude and consistency of differences
- (Optional) Calculate the skill of the native Salient forecast (trained to ERA5) with the observation stations as truth. This establishes a baseline to show how much debiasing can improve the forecasts.
- Download historical debiased forecasts since 2015, which apply a debiasing factor to the native Salient forecast so it matches GHCN observations
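Putting the pieces together, the station-truth validation then mirrors the ERA5 recipe. A hedged sketch, continuing from obs_ds above; crps is the SDK function named earlier, but the call shape is an assumption:

```python
import salientsdk as sk

# `native_fcst` and `debiased_fcst` are the hindcasts downloaded in the
# steps above (native = trained to ERA5; debiased = adjusted toward GHCN);
# `obs_ds` is the station Dataset built earlier. Call shapes are assumed.
crps_native = sk.crps(native_fcst, obs_ds)      # baseline skill vs. stations
crps_debiased = sk.crps(debiased_fcst, obs_ds)  # debiased skill vs. stations

# The relative improvement from debiasing, expressed in CRPSS form:
debias_gain = 1 - crps_debiased / crps_native
```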