Evaluating Ensemble Streamflow Forecasts#

Overview#

This notebook introduces using TEEHR to evaluate ensemble forecasts. Note that ensemble analysis in TEEHR is currently ongoing, and additional metrics and analyses are in development.

In this exercise we will use a pre-existing evaluation stored in the TEEHR repository consisting of two USGS gage locations with Hydrologic Ensemble Forecasting System (HEFS) forecasts from CNRFC.

Set up the example Evaluation#

import shutil
from pathlib import Path

import teehr
from teehr.example_data.setup_ensemble_example import setup_hefs_example

# Tell Bokeh to output plots in the notebook
from bokeh.io import output_notebook
output_notebook()

Loading BokehJS ...

# Define the directory where the Evaluation will be created.
test_eval_dir = Path(Path().home(), "temp", "09_ensemble_metrics")
shutil.rmtree(test_eval_dir, ignore_errors=True)

# Setup the example evaluation using data from the TEEHR repository.
setup_hefs_example(tmpdir=test_eval_dir)

# Initialize the evaluation.
ev = teehr.LocalReadWriteEvaluation(dir_path=test_eval_dir)

Explore the evaluation data#

In this example we have two USGS gage locations corresponding to CNRFC forecast points.

locations_gdf = ev.locations.to_geopandas()
locations_gdf

	id	name	geometry	created_at	updated_at	properties
0	usgs-11402000	SPANISH C AB BLACKHAWK C AT KEDDIE CA	POINT (-120.9544 40.00295)	2026-05-18 14:07:42.913717	2026-05-18 14:07:42.913717	None
1	usgs-11421000	YUBA R NR MARYSVILLE CA	POINT (-121.52496 39.17573)	2026-05-18 14:07:42.913717	2026-05-18 14:07:42.913717	None

The secondary location IDs correspond to the CNRFC forecast point IDs.

ev.location_crosswalks.to_pandas()

	primary_location_id	secondary_location_id	created_at	updated_at	properties
0	usgs-11421000	MRYC1	2026-05-18 14:07:44.452125	2026-05-18 14:07:44.452125	None
1	usgs-11402000	SCBC1	2026-05-18 14:07:44.452125	2026-05-18 14:07:44.452125	None

The primary timeseries are the USGS gage observations.

pt_df = ev. primary_timeseries.to_pandas()
pt_df.head()

	reference_time	value_time	configuration_name	unit_name	variable_name	value	location_id	created_at	updated_at
0	NaT	2024-04-01 16:00:00	usgs_observations	ft^3/s	streamflow_hourly_inst	686.0	usgs-11402000	2026-05-18 14:07:46.418238	2026-05-18 14:07:46.418238
1	NaT	2024-04-02 01:00:00	usgs_observations	ft^3/s	streamflow_hourly_inst	639.0	usgs-11402000	2026-05-18 14:07:46.418238	2026-05-18 14:07:46.418238
2	NaT	2024-04-02 03:00:00	usgs_observations	ft^3/s	streamflow_hourly_inst	625.0	usgs-11402000	2026-05-18 14:07:46.418238	2026-05-18 14:07:46.418238
3	NaT	2024-04-02 09:00:00	usgs_observations	ft^3/s	streamflow_hourly_inst	625.0	usgs-11402000	2026-05-18 14:07:46.418238	2026-05-18 14:07:46.418238
4	NaT	2024-04-02 17:00:00	usgs_observations	ft^3/s	streamflow_hourly_inst	606.0	usgs-11402000	2026-05-18 14:07:46.418238	2026-05-18 14:07:46.418238

The secondary timeseries are the CNRFC HEFS ensemble forecasts corresponding to each location.

st_df = ev.secondary_timeseries.to_pandas()
st_df.head()

	reference_time	value_time	configuration_name	unit_name	variable_name	value	location_id	member	created_at	updated_at
0	2024-04-01 12:00:00	2024-04-01 12:00:00	CNRFC_HEFS	ft^3/s	streamflow_hourly_inst	739.859985	SCBC1	1	2026-05-18 14:07:48.829157	2026-05-18 14:07:48.829157
1	2024-04-01 12:00:00	2024-04-01 12:00:00	CNRFC_HEFS	ft^3/s	streamflow_hourly_inst	739.859985	SCBC1	18	2026-05-18 14:07:48.829157	2026-05-18 14:07:48.829157
2	2024-04-01 12:00:00	2024-04-01 12:00:00	CNRFC_HEFS	ft^3/s	streamflow_hourly_inst	739.859985	SCBC1	38	2026-05-18 14:07:48.829157	2026-05-18 14:07:48.829157
3	2024-04-01 12:00:00	2024-04-01 12:00:00	CNRFC_HEFS	ft^3/s	streamflow_hourly_inst	739.859985	SCBC1	39	2026-05-18 14:07:48.829157	2026-05-18 14:07:48.829157
4	2024-04-01 12:00:00	2024-04-01 13:00:00	CNRFC_HEFS	ft^3/s	streamflow_hourly_inst	729.409973	SCBC1	19	2026-05-18 14:07:48.829157	2026-05-18 14:07:48.829157

HEFS is probabilistic and each forecast consists of several ensemble traces or members. In TEEHR,each trace is represented by the member column in the timeseries table schema.

In this case each forecast contains 42 members.

st_df.member.unique().size

Let’s visualize the data. We’ll use some helper functions to plot the forecasts against the USGS gage data. Note that visualization functionality within TEEHR is in-development.

import pandas as pd
import hvplot.pandas  # noqa

def plot_teehr_timeseries(
    df: pd.DataFrame,
    x_column: str = "value_time",
    y_column: str = "value",
    title: str = None
):
    columns = df.columns
    variable_columns = [col for col in columns if df[col].nunique() > 1]
    variable_columns.remove(x_column)
    variable_columns.remove(y_column)
    figure = df.hvplot(x="value_time", y="value", by=variable_columns)
    return figure.options(
        width=700,
        height=400,
        show_grid=True,
        title=title
    )

# Get the HEFS forecast for location SCBC1.
st_df = ev.secondary_timeseries.filter("location_id == 'SCBC1'").to_pandas()

# Create a line plot of the corresponding USGS observed flow data.
usgs_plot = pt_df[pt_df.location_id == "usgs-11402000"].hvplot(x="value_time", y="value").options(line_width=3, line_dash="dashed", color="black")

# Create a line plot of the HEFS forecast data with the USGS observed flow data.
(plot_teehr_timeseries(st_df, title="SCBC1 HEFS Ensemble vs. USGS") * usgs_plot).options(show_legend=False)

# Get the HEFS forecast for location MRYC1.
st_df = ev.secondary_timeseries.filter("location_id == 'MRYC1'").to_pandas()

# Create a line plot of the corresponding USGS observed flow data.
usgs_plot = pt_df[pt_df.location_id == "usgs-11421000"].hvplot(x="value_time", y="value").options(line_width=3, line_dash="dashed", color="black")

# Create a line plot of the HEFS forecast data with the USGS observed flow data.
(plot_teehr_timeseries(st_df, title="MRYC1 HEFS Ensemble vs. USGS") * usgs_plot).options(show_legend=False)

Calculating the Continuous Ranked Probability Score (CRPS)#

The continuous ranked probability score (CRPS) is a common metric used to evaluate the performance of ensemble forecasts. TEEHR implements CRPS through the scoringrules library.

We can define a CRPS metric object and customize it’s parameters in the same way we defined metrics in previous examples.

crps = teehr.ProbabilisticMetrics.CRPS()

# Specify the estimator to use.
#   "pwm" - Probability Weighted Moment form (default)
#   "nrg" - Energy form
#   "fair" - Fair version of energy form
crps.estimator = "pwm"

# Specify the backend to use.
#   "numba" - Use the numba library for calculations (default)
#   "numpy" - Use the numpy library for calculations
crps.backend = "numba"

crps.output_field_name = "crps_ensemble"

ev.joined_timeseries_view().aggregate(
    metrics=[crps],
    group_by=[
        "primary_location_id",
        "reference_time",
        "configuration_name"
    ],
).order_by("primary_location_id").to_pandas()

	primary_location_id	reference_time	configuration_name	crps_ensemble
0	usgs-11402000	2024-04-01 12:00:00	CNRFC_HEFS	[84.4178695678711, 90.95592498779297, 110.0693...
1	usgs-11421000	2024-04-01 12:00:00	CNRFC_HEFS	[53.0365104675293, 228.47833251953125, 298.672...

By default, CRPS returns an array of scores corresponding to each value time. We can summarize these values by defining a function (typically mean) in the metric object.

import numpy as np

crps.summary_func = np.mean

crps.output_field_name = "mean_crps_ensemble"

ev.joined_timeseries_view().aggregate(
    metrics=[crps],
    group_by=[
        "primary_location_id",
        "reference_time",
        "configuration_name"
    ],
).order_by("primary_location_id").to_pandas()

	primary_location_id	reference_time	configuration_name	mean_crps_ensemble
0	usgs-11402000	2024-04-01 12:00:00	CNRFC_HEFS	62.913597
1	usgs-11421000	2024-04-01 12:00:00	CNRFC_HEFS	577.685852

Summary#

Currently only the CRPS is implemented in TEEHR however additonal metrics and functionality is ongoing including ensemble timeseries visualization, skill score calculation, Brier Score, and more.

ev.spark.stop()