Metrics#
TEEHR provides comprehensive metrics for evaluating hydrologic model performance.
The aggregate() method on tables and views computes metrics across grouped data,
with support for bootstrapping, transforms, multiple metric categories, and a choice of
aggregation engine (Spark-native or Python/pandas-UDF).
Using the Aggregate Method#
The aggregate() method is available on all Table and View objects. It computes
specified metrics grouped by selected fields:
import teehr
from teehr.metrics import DeterministicMetrics
ev = teehr.LocalReadWriteEvaluation(dir_path="/path/to/evaluation")
# Basic metrics query
metrics_df = ev.table("joined_timeseries").aggregate(
    metrics=[
        DeterministicMetrics.KlingGuptaEfficiency(),
        DeterministicMetrics.NashSutcliffeEfficiency(),
    ],
    group_by=["primary_location_id"],
).to_pandas()
Aggregate Parameters#
| Parameter | Description |
|---|---|
| metrics | List of metric instances to compute |
| group_by | List of fields to group by before computing metrics |
| engine | Aggregation engine: "auto", "spark", or "python" |
Choosing an Engine#
The engine parameter controls how metrics are computed under the hood.
| Engine | Behavior |
|---|---|
| "auto" | Routes each metric to the fastest available path. Metrics that have a Spark-native implementation run without pandas UDFs; remaining metrics fall back to the Python/pandas-UDF path. Results are joined before being returned. |
| "spark" | Forces the Spark-native path for every metric. Raises an error if a metric cannot run on that path. |
| "python" | Forces the Python/pandas-UDF path for every metric. Matches the behavior from before the engine parameter was introduced. |
Metrics supported on the Spark-native path (no transform, no bootstrap):
Signature metrics: Count, Minimum, Maximum, Average, Sum, Variance, MaxValueTime
Deterministic metrics: MeanError, RelativeBias, MultiplicativeBias, MeanSquareError, RootMeanSquareError, MeanAbsoluteError, MeanAbsoluteRelativeError, PearsonCorrelation, SpearmanCorrelation, Rsquared, NashSutcliffeEfficiency, NormalizedNashSutcliffeEfficiency, VariabilityRatio, RootMeanStandardDeviationRatio, KlingGuptaEfficiency, KlingGuptaEfficiencyMod1, KlingGuptaEfficiencyMod2, RelativeMean, RelativeMedian, RelativeMinimum, RelativeMaximum, RelativeStandardDeviation
Note
Metrics that use a transform (e.g. transform="log") or a
bootstrap configuration are always routed to the Python path,
even in engine="auto" mode.
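The routing rule described above can be sketched in plain Python. This is only an illustration of the decision logic, not TEEHR's implementation: `MetricSpec`, `SPARK_NATIVE`, and `route` are hypothetical names, the native set is a small subset of the list above, and raising `ValueError` in "spark" mode is an assumption (the exception type is not stated here).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical subset of the Spark-native metric names listed above.
SPARK_NATIVE = {"MeanError", "RelativeBias", "KlingGuptaEfficiency"}


@dataclass
class MetricSpec:
    """Illustrative stand-in for a metric instance (not a TEEHR class)."""
    name: str
    transform: Optional[str] = None
    bootstrap: Optional[object] = None


def route(metrics, engine="auto"):
    """Split metrics into (spark_native, python) lists for a given engine."""
    if engine not in ("auto", "spark", "python"):
        raise ValueError(f"Unknown engine: {engine}")
    if engine == "python":
        return [], list(metrics)
    native, python = [], []
    for m in metrics:
        # A transform or bootstrap always sends the metric to the Python path.
        eligible = (
            m.name in SPARK_NATIVE
            and m.transform is None
            and m.bootstrap is None
        )
        (native if eligible else python).append(m)
    if engine == "spark" and python:
        # Assumed error type; the real exception may differ.
        names = ", ".join(m.name for m in python)
        raise ValueError(f"No Spark-native path for: {names}")
    return native, python


native, python = route([
    MetricSpec("MeanError"),
    MetricSpec("MeanError", transform="log"),
])
print([m.name for m in native], [m.transform for m in python])
```

In this sketch the untransformed MeanError lands in the native list and the log-transformed one in the Python list, mirroring the mixed query in "auto" mode.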
Note
Spark-native quantile-derived metrics may use Spark approximate quantile
algorithms rather than exact order statistics. In the current Spark-native
metric set, this mainly affects metrics that depend on medians, such as
RelativeMedian. The approximation is computed from a distributed summary
of the full group, not from a simple random sample, so it is usually a good
tradeoff for large datasets. The main practical effect is that values very
close to the cutoff can shift slightly relative to an exact pandas result.
If exact quantile behavior is important for your analysis, use
engine="python".
from teehr import DeterministicMetrics
# Explicitly use Spark-native path for a pure-native query
metrics_df = ev.table("joined_timeseries").aggregate(
    metrics=[
        DeterministicMetrics.KlingGuptaEfficiency(),
        DeterministicMetrics.NashSutcliffeEfficiency(),
        DeterministicMetrics.RelativeBias(),
    ],
    group_by=["primary_location_id"],
    engine="spark",
).to_pandas()
# Auto mode: mix native and python metrics transparently
metrics_df = ev.table("joined_timeseries").aggregate(
    metrics=[
        DeterministicMetrics.MeanError(),  # spark-native
        DeterministicMetrics.MeanError(transform="log"),  # python path (transform)
    ],
    group_by=["primary_location_id"],
).to_pandas()
Group By Fields#
The group_by parameter controls how metrics are aggregated. Common groupings:
import teehr.calculated_fields.models.row_level as rcf
# Start from the joined timeseries view
jt = ev.joined_timeseries_view()
# Group by location only
jt.aggregate(metrics=[...], group_by=["primary_location_id"])
# Group by location and configuration
jt.aggregate(metrics=[...], group_by=["primary_location_id", "configuration_name"])
# Group by calculated fields
jt = ev.joined_timeseries_view().add_calculated_fields([
    rcf.Month(),
    rcf.WaterYear(),
])
jt.aggregate(metrics=[...], group_by=["primary_location_id", "water_year", "month"])
Using Metrics#
Import metric classes and instantiate them:
from teehr.metrics import DeterministicMetrics, Signatures, ProbabilisticMetrics
# Deterministic metrics
kge = DeterministicMetrics.KlingGuptaEfficiency()
nse = DeterministicMetrics.NashSutcliffeEfficiency()
rmse = DeterministicMetrics.RootMeanSquareError()
# Signatures (single field statistics)
avg = Signatures.Average()
fdc = Signatures.FlowDurationCurveSlope()
# Probabilistic metrics (ensemble forecasts)
crps = ProbabilisticMetrics.CRPS()
Transforms#
Apply mathematical transformations before computing metrics:
from teehr.metrics.models.base import TransformEnum
# Log-transformed RMSE
rmse = DeterministicMetrics.RootMeanSquareError()
rmse.transform = TransformEnum.log
rmse.add_epsilon = True # Avoid log(0)
Available transforms: log, sqrt, square, cube, exp, inv, abs
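To make the effect of a transform concrete, here is a minimal sketch of a log-transformed RMSE in plain Python. It assumes the transform is applied to both series before the error is computed and that the epsilon is simply added to every value; `EPSILON` and `log_rmse` are hypothetical names, and the offset TEEHR actually uses may differ.

```python
import math

EPSILON = 1e-6  # hypothetical offset; TEEHR's actual value may differ


def log_rmse(primary, secondary, add_epsilon=True):
    """RMSE computed on log-transformed values."""
    eps = EPSILON if add_epsilon else 0.0
    sq_errors = [
        (math.log(s + eps) - math.log(p + eps)) ** 2
        for p, s in zip(primary, secondary)
    ]
    return math.sqrt(sum(sq_errors) / len(sq_errors))


observed = [0.0, 1.0, 10.0, 100.0]
simulated = [0.0, 2.0, 20.0, 200.0]
print(log_rmse(observed, simulated))
```

Without the offset, the zero flows in this example would make `math.log` raise a `ValueError`, which is the situation add_epsilon is meant to avoid.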
Bootstrapping#
Compute confidence intervals using bootstrap resampling. For CircularBlock and
Stationary bootstrapping, block_size is optional. If omitted (or set to None),
TEEHR uses arch.bootstrap.optimal_block_length to estimate an optimal block
size from the primary metric input series:
from teehr.metrics.models.bootstrap import Bootstrappers
# Configure bootstrap
boot = Bootstrappers.CircularBlock(
    reps=1000,
    # block_size=None -> auto estimate using optimal_block_length (b_cb)
    block_size=None,
    seed=42,
    quantiles=[0.05, 0.5, 0.95]
)
# Optional: provide a fixed block size if desired
# boot = Bootstrappers.CircularBlock(reps=1000, block_size=365, seed=42)
# Apply to metric
kge = DeterministicMetrics.KlingGuptaEfficiency()
kge.bootstrap = boot
kge.unpack_results = True  # Separate columns for quantiles
metrics_df = jt.aggregate(
    metrics=[kge],
    group_by=["primary_location_id"],
).to_pandas()
# Results: kling_gupta_efficiency_0.05, _0.5, _0.95
See also: Bootstrappers
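For readers unfamiliar with the technique, the following is a rough standard-library sketch of a circular block bootstrap. It is not how TEEHR or the arch package implements it: all function names are hypothetical, the block size is fixed rather than estimated, and quantiles are read with a simple order-statistic lookup.

```python
import random
import statistics


def circular_block_resample(n, block_size, rng):
    """Return n indices built from fixed-length blocks that wrap around."""
    indices = []
    while len(indices) < n:
        start = rng.randrange(n)
        indices.extend((start + k) % n for k in range(block_size))
    return indices[:n]


def bootstrap_quantiles(primary, secondary, metric, reps, block_size,
                        quantiles, seed=None):
    """Resample the paired series in blocks and summarize the metric."""
    rng = random.Random(seed)
    n = len(primary)
    estimates = []
    for _ in range(reps):
        idx = circular_block_resample(n, block_size, rng)
        estimates.append(metric([primary[i] for i in idx],
                                [secondary[i] for i in idx]))
    estimates.sort()
    # Simple order-statistic lookup; other interpolation rules are possible.
    return {q: estimates[min(int(q * reps), reps - 1)] for q in quantiles}


def mean_error(primary, secondary):
    return statistics.fmean(s - p for p, s in zip(primary, secondary))


obs = [float(i % 12) for i in range(120)]
sim = [v + float(i % 3) for i, v in enumerate(obs)]
result = bootstrap_quantiles(obs, sim, mean_error, reps=200, block_size=12,
                             quantiles=[0.05, 0.5, 0.95], seed=42)
print(result)
```

Resampling whole blocks, rather than individual timesteps, keeps short-range autocorrelation in each replicate, which is why block methods are preferred for timeseries.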
Complete Example#
import teehr
from teehr.metrics import DeterministicMetrics, Signatures
import teehr.calculated_fields.models.row_level as rcf
ev = teehr.LocalReadWriteEvaluation(dir_path="/path/to/evaluation")
# Build view with calculated fields
metrics_df = (
    ev.joined_timeseries_view(add_attrs=True)
    .add_calculated_fields([rcf.WaterYear(), rcf.Seasons()])
    .filter("water_year >= 2015")
    .aggregate(
        metrics=[
            DeterministicMetrics.KlingGuptaEfficiency(),
            DeterministicMetrics.RelativeBias(),
            Signatures.Average(),
        ],
        group_by=["primary_location_id", "season"],
    )
    .order_by(["primary_location_id", "season"])
    .to_pandas()
)
print(metrics_df.head())
ev.spark.stop()
Available Metrics#
The metrics currently built into TEEHR are listed in the tables below. Please note that some are still in development and planned for inclusion in future versions.
Signatures#
Signatures operate on a single field to characterize timeseries properties.
| Available | Description | Short Name | Equation | API Reference |
|---|---|---|---|---|
| | Average | \(Average\) | \(\frac{\sum(prim)}{count}\) | |
| | Count | \(Count\) | \(count\) | |
| | Flow Duration Curve Slope | \(FDC\ Slope\) | \(\frac{q85-q25}{p85-p25}\) | |
| | Max Value Time | \(Max\ Value\ Time\) | \(peak\ time_{prim}\) | |
| | Maximum | \(Max\) | \(max(prim)\) | |
| | Minimum | \(Min\) | \(min(prim)\) | |
| | Sum | \(Sum\) | \(\sum(prim)\) | |
| | Variance | \(Variance\) | \(\sigma^2_{prim}\) | |
Deterministic Metrics#
Deterministic metrics compare two timeseries, typically primary (“observed”) vs. secondary (“modeled”) values.
| Available | Description | Short Name | Equation | API Reference |
|---|---|---|---|---|
| | Mean Error | \(Mean\ Error\) | \(\frac{\sum(sec-prim)}{count}\) | |
| | Relative Bias | \(Relative\ Bias\) | \(\frac{\sum(sec-prim)}{\sum(prim)}\) | |
| | Multiplicative Bias | \(Mult.\ Bias\) | \(\frac{\mu_{sec}}{\mu_{prim}}\) | |
| | Relative Mean | \(RelMean\) | \(\frac{mean(sec)}{mean(prim)}\) | |
| | Relative Median | \(RelMedian\) | \(\frac{median(sec)}{median(prim)}\) | |
| | Relative Minimum | \(RelMin\) | \(\frac{min(sec)}{min(prim)}\) | |
| | Relative Maximum | \(RelMax\) | \(\frac{max(sec)}{max(prim)}\) | |
| | Relative Standard Deviation | \(RelStd\) | \(\frac{std(sec)}{std(prim)}\) | |
| | Mean Square Error | \(MSE\) | \(\frac{\sum(sec-prim)^2}{count}\) | |
| | Root Mean Square Error | \(RMSE\) | \(\sqrt{\frac{\sum(sec-prim)^2}{count}}\) | |
| | Mean Absolute Error | \(MAE\) | \(\frac{\sum\lvert sec-prim\rvert}{count}\) | |
| | Mean Absolute Relative Error | \(Relative\ MAE\) | \(\frac{\sum\lvert sec-prim\rvert}{\sum(prim)}\) | |
| | Pearson Correlation Coefficient | \(r\) | \(r(sec, prim)\) | |
| | Variability Ratio | \(VR\) | \(\frac{\sigma_{sec}}{\sigma_{prim}}\) | |
| | Coefficient of Determination | \(r^2\) | \(r(sec, prim)^2\) | |
| | Nash-Sutcliffe Efficiency | \(NSE\) | \(1-\frac{\sum(prim-sec)^2}{\sum(prim-\mu_{prim})^2}\) | |
| | Normalized Nash-Sutcliffe Efficiency | \(NNSE\) | \(\frac{1}{(2-NSE)}\) | |
| | Kling Gupta Efficiency - original | \(KGE\) | \(1-\sqrt{(r(sec, prim)-1)^2+(\frac{\sigma_{sec}}{\sigma_{prim}}-1)^2+(\frac{\mu_{sec}}{\mu_{prim}}-1)^2}\) | |
| | Kling Gupta Efficiency - modified 1 (2012) | \(KGE'\) | \(1-\sqrt{(r(sec, prim)-1)^2+(\frac{\sigma_{sec}/\mu_{sec}}{\sigma_{prim}/\mu_{prim}}-1)^2+(\frac{\mu_{sec}}{\mu_{prim}}-1)^2}\) | |
| | Kling Gupta Efficiency - modified 2 (2021) | \(KGE''\) | \(1-\sqrt{(r(sec, prim)-1)^2+(\frac{\sigma_{sec}}{\sigma_{prim}}-1)^2+\frac{(\mu_{sec}-\mu_{prim})^2}{\sigma_{prim}^2}}\) | |
| | Annual Peak Relative Bias | \(Ann\ PF\ Bias\) | \(\frac{\sum(ann.\ peak_{sec}-ann.\ peak_{prim})}{\sum(ann.\ peak_{prim})}\) | |
| | Spearman Rank Correlation Coefficient | \(r_s\) | \(1-\frac{6*\sum\lvert rank_{prim}-rank_{sec}\rvert^2}{count(count^2-1)}\) | |
| | Max Value Delta | \(Max\ Val\ Delta\) | \(max(sec) - max(prim)\) | |
| | Root Mean Standard Deviation Ratio | \(RSR\) | \(\frac{RMSE}{\sigma_{prim}}\) | |
| | Max Value Time Delta | \(Max\ Val\ Time\ Delta\) | \(time(max(sec)) - time(max(prim))\) | |
| Coming Soon | Flow Duration Curve Slope Error | \(Slope\ FDC\ Error\) | \(\frac{q66_{sec}-q33_{sec}}{33}-\frac{q66_{prim}-q33_{prim}}{33}\) | N/A |
| Coming Soon | Event Peak Flow Relative Bias | \(Peak\ Bias\) | \(\frac{\sum(peak_{sec}-peak_{prim})}{\sum(peak_{prim})}\) | N/A |
| Coming Soon | Event Peak Flow Timing Error | \(Peak\ Time\ Error\) | \(\frac{\sum(peak\ time_{sec}-peak\ time_{prim})}{count}\) | N/A |
| Coming Soon | Baseflow Index Error | \(BFI\ Error\) | \(\frac{\frac{\mu(baseflow_{sec})}{\mu(sec)}-\frac{\mu(baseflow_{prim})}{\mu(prim)}}{\frac{\mu(baseflow_{prim})}{\mu(prim)}}\) | N/A |
| Coming Soon | Rising Limb Density Error | \(RLD\ Error\) | \(\frac{count(rising\ limb\ events_{sec})}{count(rising\ limb\ timesteps_{sec})}-\frac{count(rising\ limb\ events_{prim})}{count(rising\ limb\ timesteps_{prim})}\) | N/A |
| Coming Soon | Mean Square Error Skill Score (generalized reference) | \(MSESS\) | \(1-\frac{\sum(prim-sec)^2}{\sum(prim-reference)^2}\) | N/A |
| Coming Soon | Runoff Ratio Error | \(RR\ Error\) | \(\left\lvert\frac{\mu(volume_{sec})}{\mu(precip\ volume)}-\frac{\mu(volume_{prim})}{\mu(precip\ volume)}\right\rvert\) | N/A |
| | Confusion Matrix | \(CM\) | \(TP,\ TN,\ FP,\ FN\) | |
| | False Alarm Ratio | \(FAR\) | \(\frac{n_{FP}}{n_{TP}+n_{FP}}\) | |
| | Probability of Detection | \(POD\) | \(\frac{n_{TP}}{n_{TP}+n_{FN}}\) | |
| | Probability of False Detection | \(POFD\) | \(\frac{n_{FP}}{n_{TN}+n_{FP}}\) | |
| | Critical Success Index (Threat Score) | \(CSI\) | \(\frac{n_{TP}}{n_{TP}+n_{FN}+n_{FP}}\) | |
| | Success Ratio | \(SR\) | \(\frac{n_{TP}+n_{TN}}{n_{TP}+n_{FP}+n_{FN}+n_{TN}}\) | |
| | Frequency Bias Index | \(FBI\) | \(\frac{n_{TP}+n_{FP}}{n_{TP}+n_{FN}}\) | |
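As a worked illustration of a few of these equations, the sketch below computes Relative Bias, Nash-Sutcliffe Efficiency, and the original Kling Gupta Efficiency with the standard library only. It is not TEEHR's code: the function names are hypothetical, the KGE bias term uses the standard ratio of means (mean(sec)/mean(prim)), and population standard deviations are assumed (the ratio is the same with sample standard deviations).

```python
import math
import statistics


def pearson_r(prim, sec):
    """Pearson correlation between primary and secondary values."""
    mp, ms = statistics.fmean(prim), statistics.fmean(sec)
    cov = sum((p - mp) * (s - ms) for p, s in zip(prim, sec))
    var_p = sum((p - mp) ** 2 for p in prim)
    var_s = sum((s - ms) ** 2 for s in sec)
    return cov / math.sqrt(var_p * var_s)


def relative_bias(prim, sec):
    """sum(sec - prim) / sum(prim)."""
    return sum(s - p for p, s in zip(prim, sec)) / sum(prim)


def nash_sutcliffe(prim, sec):
    """1 - sum((prim - sec)^2) / sum((prim - mean(prim))^2)."""
    mp = statistics.fmean(prim)
    num = sum((p - s) ** 2 for p, s in zip(prim, sec))
    den = sum((p - mp) ** 2 for p in prim)
    return 1.0 - num / den


def kling_gupta(prim, sec):
    """Original KGE from correlation, variability ratio, and bias ratio."""
    r = pearson_r(prim, sec)
    alpha = statistics.pstdev(sec) / statistics.pstdev(prim)
    beta = statistics.fmean(sec) / statistics.fmean(prim)
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


observed = [1.0, 2.0, 3.0, 4.0]
doubled = [2.0 * v for v in observed]
print(relative_bias(observed, doubled))
print(nash_sutcliffe(observed, doubled))
print(kling_gupta(observed, doubled))
```

A simulation that exactly doubles the observations is perfectly correlated, so the KGE penalty in this example comes entirely from the variability and bias terms.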
Probabilistic Metrics#
Probabilistic metrics compare a value against a distribution of predicted values, such as ensemble forecasts.
| Available | Description | Short Name | Equation | API Reference |
|---|---|---|---|---|
| | Continuous Ranked Probability Score | \(CRPS\) | \(\int_{-\infty}^{\infty} (F(x) - \mathbf{1}_{x \geq y})^2 dx\) | |
| | Brier Score | \(BS\) | \(\frac{\sum(sec\ ensemble\ prob-prim\ outcome)^2}{n}\) | |
| | Brier Skill Score | \(BSS\) | \(1-\frac{BS}{BS_{ref}}\) | |
| | Continuous Ranked Probability Skill Score | \(CRPSS\) | \(1-\frac{CRPS}{CRPS_{ref}}\) | |
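The probabilistic equations can be illustrated with a short standard-library sketch. It assumes the forecast distribution F is the empirical CDF of the ensemble members, in which case the CRPS integral equals E|X - y| - 0.5 E|X - X'|. The function names are hypothetical, and TEEHR's own estimator may differ.

```python
def crps_ensemble(members, observed):
    """Empirical CRPS for one forecast: E|X - y| - 0.5 * E|X - X'|."""
    m = len(members)
    error_term = sum(abs(x - observed) for x in members) / m
    spread_term = sum(abs(a - b) for a in members for b in members) / (m * m)
    return error_term - 0.5 * spread_term


def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


def skill_score(score, reference_score):
    """Generic skill score, 1 - score / reference, as in BSS and CRPSS."""
    return 1.0 - score / reference_score


print(crps_ensemble([0.0, 2.0], 1.0))         # 0.5
print(brier_score([0.5, 0.5], [1, 0]))        # 0.25
print(skill_score(0.25, 0.5))                 # 0.5
```

The first value can be checked against the integral directly: with members 0 and 2 and an observation of 1, the squared CDF difference is 0.25 on both [0, 1) and [1, 2), giving 0.5.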