Signatures#
- class Signatures#
Define and customize signature metrics (single-field statistics).
Notes
Signatures operate on a single field (typically primary_value) to characterize timeseries properties. Available signatures:
- Basic Statistics: Count, Minimum, Maximum, Average, Sum, Variance
- Time-Based: MaxValueTime
- Hydrologic: FlowDurationCurveSlope
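Conceptually, each basic-statistic signature reduces the single input series to one number. A plain-Python sketch of those reductions (illustrative only, not TEEHR's Spark implementation; the variance convention, population vs. sample, is an assumption here):

```python
from statistics import mean, pvariance

series = [2.0, 5.0, 3.0, 4.0]  # stand-in for a primary_value series

count = len(series)           # Count: number of values
minimum = min(series)         # Minimum: smallest value
maximum = max(series)         # Maximum: largest value
average = mean(series)        # Average: arithmetic mean
total = sum(series)           # Sum: total of all values
variance = pvariance(series)  # Variance (population form assumed here)
```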
Example
>>> from teehr import Signatures
>>> avg = Signatures.Average()
>>> fdc = Signatures.FlowDurationCurveSlope(lower_quantile=0.33, upper_quantile=0.66)
Methods
- class Average(*, return_type: str | ~pyspark.sql.types.ArrayType | ~pyspark.sql.types.MapType = 'float', unpack_results: bool = False, unpack_function: ~typing.Callable = <function unpack_sdf_dict_columns>, reference_configuration: str = None, bootstrap: ~typing.Any = None, add_epsilon: bool = False, transform: ~typing.Any = None, output_field_name: str = None, func: ~typing.Callable = None, input_field_names: str | ~teehr.models.str_enum.StrEnum | ~typing.List[str | ~teehr.models.str_enum.StrEnum] = None, attrs: ~typing.Dict = None)#
Average: arithmetic mean of the series.
- default_func() Callable#
Create average metric function.
- class Count(*, return_type: str | ~pyspark.sql.types.ArrayType | ~pyspark.sql.types.MapType = 'float', unpack_results: bool = False, unpack_function: ~typing.Callable = <function unpack_sdf_dict_columns>, reference_configuration: str = None, bootstrap: ~typing.Any = None, add_epsilon: bool = False, transform: ~typing.Any = None, output_field_name: str = None, func: ~typing.Callable = None, input_field_names: str | ~teehr.models.str_enum.StrEnum | ~typing.List[str | ~teehr.models.str_enum.StrEnum] = None, attrs: ~typing.Dict = None)#
Count: number of non-null values in the series.
- default_func() Callable#
Create count metric function.
- class FlowDurationCurveSlope(*, return_type: str | ~pyspark.sql.types.ArrayType | ~pyspark.sql.types.MapType = 'float', unpack_results: bool = False, unpack_function: ~typing.Callable = <function unpack_sdf_dict_columns>, reference_configuration: str = None, bootstrap: ~typing.Any = None, add_epsilon: bool = False, transform: ~typing.Any = None, output_field_name: str = None, func: ~typing.Callable = None, input_field_names: str | ~teehr.models.str_enum.StrEnum | ~typing.List[str | ~teehr.models.str_enum.StrEnum] = None, attrs: ~typing.Dict = None, lower_quantile: float = 0.25, upper_quantile: float = 0.85, as_percentile: bool = False)#
Flow Duration Curve Slope: slope of the FDC between quantiles.
Additional Parameters#
- lower_quantile : float
The lower exceedance probability quantile, by default 0.25.
- upper_quantile : float
The upper exceedance probability quantile, by default 0.85.
- as_percentile : bool
Whether to express exceedance probability as a percentile (0-100) rather than a fraction (0-1), by default False.
- default_func() Callable#
Create flow duration curve slope metric function.
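As a rough sketch of the concept, one common formulation of FDC slope takes the difference of the log flows at the two exceedance probabilities, divided by the distance between the probabilities. The function below is an illustration under that assumption; TEEHR's exact formula may differ:

```python
import numpy as np

def fdc_slope(values, lower_quantile=0.25, upper_quantile=0.85):
    """Slope of the flow duration curve between two exceedance quantiles.

    Illustrative formulation only; TEEHR's implementation may differ in
    log base, interpolation, or null handling.
    """
    flows = np.asarray(values, dtype=float)
    # The flow exceeded with probability p is the (1 - p) quantile.
    q_lower = np.quantile(flows, 1.0 - lower_quantile)
    q_upper = np.quantile(flows, 1.0 - upper_quantile)
    return (np.log(q_lower) - np.log(q_upper)) / (upper_quantile - lower_quantile)
```

A steeper (larger) slope indicates flashier flow; a flatter slope indicates more uniform flow between the two quantiles.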
- class MaxValueTime(*, return_type: str = 'timestamp', unpack_results: bool = False, unpack_function: ~typing.Callable = <function unpack_sdf_dict_columns>, reference_configuration: str = None, bootstrap: ~typing.Any = None, add_epsilon: bool = False, transform: ~typing.Any = None, output_field_name: str = None, func: ~typing.Callable = None, input_field_names: str | ~teehr.models.str_enum.StrEnum | ~typing.List[str | ~teehr.models.str_enum.StrEnum] = None, attrs: ~typing.Dict = None)#
Max Value Time: timestamp when the maximum value occurs.
This signature requires both primary_value and value_time fields.
- default_func() Callable#
Create max_value_time metric function.
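A minimal plain-Python sketch of the idea, pairing each value with its timestamp and returning the time of the largest non-null value (tie-breaking and null handling in TEEHR itself may differ):

```python
from datetime import datetime

def max_value_time(value_times, values):
    """Return the timestamp at which the maximum value occurs.

    Illustrative only: pairs times with values, drops nulls, and takes
    the time of the largest value.
    """
    paired = [(t, v) for t, v in zip(value_times, values) if v is not None]
    return max(paired, key=lambda tv: tv[1])[0]

times = [datetime(2024, 1, 1, h) for h in range(4)]
flows = [1.2, 3.4, 2.8, None]
peak_time = max_value_time(times, flows)
```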
- class Maximum(*, return_type: str | ~pyspark.sql.types.ArrayType | ~pyspark.sql.types.MapType = 'float', unpack_results: bool = False, unpack_function: ~typing.Callable = <function unpack_sdf_dict_columns>, reference_configuration: str = None, bootstrap: ~typing.Any = None, add_epsilon: bool = False, transform: ~typing.Any = None, output_field_name: str = None, func: ~typing.Callable = None, input_field_names: str | ~teehr.models.str_enum.StrEnum | ~typing.List[str | ~teehr.models.str_enum.StrEnum] = None, attrs: ~typing.Dict = None)#
Maximum: largest value in the series.
- default_func() Callable#
Create maximum metric function.
- class Minimum(*, return_type: str | ~pyspark.sql.types.ArrayType | ~pyspark.sql.types.MapType = 'float', unpack_results: bool = False, unpack_function: ~typing.Callable = <function unpack_sdf_dict_columns>, reference_configuration: str = None, bootstrap: ~typing.Any = None, add_epsilon: bool = False, transform: ~typing.Any = None, output_field_name: str = None, func: ~typing.Callable = None, input_field_names: str | ~teehr.models.str_enum.StrEnum | ~typing.List[str | ~teehr.models.str_enum.StrEnum] = None, attrs: ~typing.Dict = None)#
Minimum: smallest value in the series.
- default_func() Callable#
Create minimum metric function.
- class Sum(*, return_type: str | ~pyspark.sql.types.ArrayType | ~pyspark.sql.types.MapType = 'float', unpack_results: bool = False, unpack_function: ~typing.Callable = <function unpack_sdf_dict_columns>, reference_configuration: str = None, bootstrap: ~typing.Any = None, add_epsilon: bool = False, transform: ~typing.Any = None, output_field_name: str = None, func: ~typing.Callable = None, input_field_names: str | ~teehr.models.str_enum.StrEnum | ~typing.List[str | ~teehr.models.str_enum.StrEnum] = None, attrs: ~typing.Dict = None)#
Sum: total of all values in the series.
- default_func() Callable#
Create sum metric function.
- class Variance(*, return_type: str | ~pyspark.sql.types.ArrayType | ~pyspark.sql.types.MapType = 'float', unpack_results: bool = False, unpack_function: ~typing.Callable = <function unpack_sdf_dict_columns>, reference_configuration: str = None, bootstrap: ~typing.Any = None, add_epsilon: bool = False, transform: ~typing.Any = None, output_field_name: str = None, func: ~typing.Callable = None, input_field_names: str | ~teehr.models.str_enum.StrEnum | ~typing.List[str | ~teehr.models.str_enum.StrEnum] = None, attrs: ~typing.Dict = None)#
Variance: statistical variance of the series.
- default_func() Callable#
Create variance metric function.