bootstrap_funcs
Contains functions for bootstrap calculations for use in Spark queries.
Functions
- Return a hashable key that identifies identical bootstrap configs.
- Create the CircularBlock bootstrap function.
- Create the Gumboot bootstrap function.
- Create a single bootstrap UDF that evaluates multiple metrics per draw.
- Create the Stationary bootstrap function.
- Split metrics into non-bootstrap and bootstrap-sharing groups.
- bootstrap_group_key(metric: MetricsBasemodel) → tuple | None
Return a hashable key that identifies identical bootstrap configs.
Two metrics with the same key can share a single set of bootstrap samples. Returns None for metrics without a bootstrap configuration.
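The idea of a shared-sample key can be sketched as follows. This is a minimal illustration, not the library's implementation: `BootstrapConfig`, `Metric`, and `group_key_sketch` are hypothetical stand-ins, and the fields in the key are assumed from the sharing conditions listed elsewhere on this page (class, reps, seed, block_size, quantiles, and input fields).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BootstrapConfig:
    """Hypothetical stand-in for a bootstrap configuration object."""
    name: str
    reps: int
    seed: Optional[int]
    block_size: Optional[int]
    quantiles: Optional[Tuple[float, ...]]


@dataclass(frozen=True)
class Metric:
    """Hypothetical stand-in for MetricsBasemodel."""
    name: str
    input_fields: Tuple[str, ...]
    bootstrap: Optional[BootstrapConfig] = None


def group_key_sketch(metric: Metric) -> Optional[tuple]:
    """Return a hashable tuple of bootstrap settings, or None if there are none."""
    boot = metric.bootstrap
    if boot is None:
        return None
    # Only hashable values go into the key so it can be used as a dict key.
    return (boot.name, boot.reps, boot.seed, boot.block_size,
            boot.quantiles, metric.input_fields)


# Illustrative metric names and field names only.
config = BootstrapConfig("CircularBlock", 1000, 42, None, (0.05, 0.5, 0.95))
fields = ("primary_value", "secondary_value")
kge = Metric("kge", fields, config)
nse = Metric("nse", fields, config)
bias = Metric("bias", fields)

same_group = group_key_sketch(kge) == group_key_sketch(nse)
```

Here `kge` and `nse` produce equal keys and could share one set of samples, while `bias` has no bootstrap configuration and yields None.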
- create_circularblock_func(model: MetricsBasemodel) → Callable
Create the CircularBlock bootstrap function.
If model.bootstrap.block_size is None, the block size is estimated using arch.bootstrap.optimal_block_length (b_cb column).
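For intuition about what a circular block resample looks like, here is a minimal pure-Python sketch of the index-drawing step. It is not the arch implementation and does not estimate the block size; `circular_block_indices` is a hypothetical helper that takes an explicit block size.

```python
import random
from typing import List


def circular_block_indices(n: int, block_size: int, rng: random.Random) -> List[int]:
    """Draw n resampled positions using fixed-length blocks that wrap around."""
    if n <= 0 or block_size <= 0:
        raise ValueError("n and block_size must be positive")
    indices: List[int] = []
    while len(indices) < n:
        # Each block starts at a uniformly random position in the series.
        start = rng.randrange(n)
        # Positions past the end wrap to the beginning (the "circular" part).
        indices.extend((start + offset) % n for offset in range(block_size))
    # The last block is truncated so the resample has the original length.
    return indices[:n]


sample_indices = circular_block_indices(n=10, block_size=3, rng=random.Random(0))
```

Applying `sample_indices` to a time series keeps runs of consecutive observations together, which is what preserves short-range dependence in the resample.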
- create_gumboot_func(model: MetricsBasemodel) → Callable
Create the Gumboot bootstrap function.
Create a single bootstrap UDF that evaluates multiple metrics per draw.
All metrics in the metrics list must share the same bootstrap configuration (same class, reps, seed, block_size, quantiles, and input fields).
- Parameters:
metrics (List[MetricsBasemodel]) – Metrics sharing the same bootstrap config.
minimum_sample_size (int, optional) – Minimum sample count to run bootstrap. Default 30.
minimum_mean (float, optional) – Minimum mean value of primary series to run bootstrap. Default 0.01.
minimum_variance (float, optional) – Minimum variance of primary series to run bootstrap. Default 0.000025.
- Returns:
Callable – UDF returning a dict with per-metric quantiles or raw bootstrap arrays.
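The benefit of evaluating several metrics per draw is that each resample is generated once and reused. The sketch below illustrates that pattern in plain Python under several assumptions: `shared_bootstrap_sketch` and `mean_error` are hypothetical names, simple i.i.d. resampling stands in for the configured bootstrap class, strict less-than comparisons are assumed for the thresholds, and returning None when a threshold is not met is a guess about the fallback behavior.

```python
import random
import statistics
from typing import Callable, Dict, List, Optional, Sequence

MetricFunc = Callable[[Sequence[float], Sequence[float]], float]


def mean_error(primary: Sequence[float], secondary: Sequence[float]) -> float:
    """Example metric: mean of secondary minus mean of primary."""
    return statistics.fmean(secondary) - statistics.fmean(primary)


def shared_bootstrap_sketch(
    primary: Sequence[float],
    secondary: Sequence[float],
    metric_funcs: Dict[str, MetricFunc],
    reps: int = 200,
    seed: int = 42,
    minimum_sample_size: int = 30,
    minimum_mean: float = 0.01,
    minimum_variance: float = 0.000025,
) -> Optional[Dict[str, List[float]]]:
    """Evaluate every metric on the same resampled pairs, once per draw."""
    n = len(primary)
    # Guards mirroring the documented thresholds (fallback is an assumption).
    if n < minimum_sample_size:
        return None
    if statistics.fmean(primary) < minimum_mean:
        return None
    if statistics.pvariance(primary) < minimum_variance:
        return None

    rng = random.Random(seed)
    results: Dict[str, List[float]] = {name: [] for name in metric_funcs}
    for _ in range(reps):
        # One set of indices per draw, shared by all metrics.
        idx = [rng.randrange(n) for _ in range(n)]
        p = [primary[i] for i in idx]
        s = [secondary[i] for i in idx]
        for name, func in metric_funcs.items():
            results[name].append(func(p, s))
    return results


primary = [1.0 + 0.1 * (i % 7) for i in range(40)]
secondary = [value + 0.05 for value in primary]
draws = shared_bootstrap_sketch(primary, secondary, {"mean_error": mean_error})
```

Because the primary and secondary values are resampled with the same indices, pairs stay aligned within each draw, and adding a second metric to `metric_funcs` costs no additional resampling.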
- create_stationary_func(model: MetricsBasemodel) → Callable
Create the Stationary bootstrap function.
If model.bootstrap.block_size is None, the block size is estimated using arch.bootstrap.optimal_block_length (b_sb column).
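In a stationary bootstrap the block length is random rather than fixed: block lengths follow a geometric distribution whose mean is the block size. A minimal pure-Python sketch of the index-drawing step follows. It is not the arch implementation, and `stationary_indices` is a hypothetical helper that takes an explicit block size.

```python
import random
from typing import List


def stationary_indices(n: int, block_size: float, rng: random.Random) -> List[int]:
    """Draw n resampled positions with geometric block lengths (mean block_size)."""
    if n <= 0 or block_size < 1:
        raise ValueError("n must be positive and block_size must be at least 1")
    # Probability of ending the current block at each step.
    p = 1.0 / block_size
    indices = [rng.randrange(n)]
    while len(indices) < n:
        if rng.random() < p:
            # Start a new block at a uniformly random position.
            indices.append(rng.randrange(n))
        else:
            # Continue the current block, wrapping around at the series end.
            indices.append((indices[-1] + 1) % n)
    return indices


sample_indices = stationary_indices(n=50, block_size=5.0, rng=random.Random(1))
```

A larger `block_size` lowers the restart probability, so the resample contains longer runs of consecutive observations.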
- partition_metrics_by_bootstrap(metrics: List[MetricsBasemodel]) → Tuple[List[MetricsBasemodel], Dict[tuple, List[MetricsBasemodel]]]
Split metrics into non-bootstrap and bootstrap-sharing groups.
- Returns:
no_boot (list) – Metrics without a bootstrap config.
boot_groups (dict) – Mapping of group key → list of metrics that can share samples. Singleton groups (len==1) are included so callers can treat all bootstrap metrics uniformly.
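The return shape can be illustrated with a short sketch. This is not the library's code: `partition_sketch` is a hypothetical helper that takes the key function as an explicit argument (the documented function takes only the metrics list), and plain dicts with made-up names stand in for MetricsBasemodel instances.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple, TypeVar

M = TypeVar("M")


def partition_sketch(
    metrics: List[M],
    key_func: Callable[[M], Optional[tuple]],
) -> Tuple[List[M], Dict[tuple, List[M]]]:
    """Split metrics into those without a key and groups that share a key."""
    no_boot: List[M] = []
    boot_groups: Dict[tuple, List[M]] = defaultdict(list)
    for metric in metrics:
        key = key_func(metric)
        if key is None:
            no_boot.append(metric)
        else:
            # Singleton groups are kept, so every bootstrap metric is in a group.
            boot_groups[key].append(metric)
    return no_boot, dict(boot_groups)


# Illustrative stand-ins: "boot" holds a hashable config tuple or None.
metrics = [
    {"name": "bias", "boot": None},
    {"name": "kge", "boot": ("CircularBlock", 1000, 42)},
    {"name": "nse", "boot": ("CircularBlock", 1000, 42)},
    {"name": "rmse", "boot": ("Stationary", 500, 7)},
]
no_boot, boot_groups = partition_sketch(metrics, lambda m: m["boot"])
```

In this example `no_boot` holds only the `bias` entry, while `boot_groups` has two keys: one shared by `kge` and `nse`, and a singleton group for `rmse`.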