bootstrap_funcs

Functions for bootstrap calculations used in Spark queries.

Functions

  • bootstrap_group_key – Return a hashable key that identifies identical bootstrap configs.

  • create_circularblock_func – Create the CircularBlock bootstrap function.

  • create_gumboot_func – Create the Gumboot bootstrap function.

  • create_shared_bootstrap_func – Create a single bootstrap UDF that evaluates multiple metrics per draw.

  • create_stationary_func – Create the Stationary bootstrap function.

  • partition_metrics_by_bootstrap – Split metrics into non-bootstrap and bootstrap-sharing groups.

bootstrap_group_key(metric: MetricsBasemodel) → tuple | None

Return a hashable key that identifies identical bootstrap configs.

Two metrics with the same key can share a single set of bootstrap samples. Returns None for metrics without a bootstrap configuration.
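A minimal sketch of the idea, assuming the key is built from the fields that must match for sharing (class, reps, seed, block_size, quantiles, and input fields). BootConfig, Metric, and group_key are hypothetical stand-ins for illustration, not MetricsBasemodel or the library's implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class BootConfig:
    """Hypothetical stand-in for a metric's bootstrap settings."""
    name: str                          # bootstrap class name
    reps: int
    seed: Optional[int]
    block_size: Optional[int]
    quantiles: Optional[List[float]]


@dataclass
class Metric:
    """Hypothetical stand-in for MetricsBasemodel."""
    output_name: str
    input_fields: List[str]
    bootstrap: Optional[BootConfig] = None


def group_key(metric: Metric) -> Optional[Tuple]:
    """Return a hashable key, or None when no bootstrap is configured."""
    boot = metric.bootstrap
    if boot is None:
        return None
    # Lists are converted to tuples so the key can be used in a dict or set.
    quantiles = None if boot.quantiles is None else tuple(boot.quantiles)
    return (boot.name, boot.reps, boot.seed, boot.block_size,
            quantiles, tuple(metric.input_fields))


cfg = BootConfig("CircularBlock", reps=500, seed=42, block_size=None,
                 quantiles=[0.05, 0.5, 0.95])
kge = Metric("kge", ["primary_value", "secondary_value"], cfg)
nse = Metric("nse", ["primary_value", "secondary_value"], cfg)
bias = Metric("bias", ["primary_value", "secondary_value"])

same_key = group_key(kge) == group_key(nse)  # identical configs, identical key
no_key = group_key(bias)                     # no bootstrap configured
```

In this sketch, kge and nse compare equal on their keys and could reuse one set of samples, while bias yields None.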

create_circularblock_func(model: MetricsBasemodel) → Callable

Create the CircularBlock bootstrap function.

If model.bootstrap.block_size is None, the block size is estimated with arch.bootstrap.optimal_block_length, taking the circular-bootstrap estimate (b_cb column).
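For readers unfamiliar with the technique, the sketch below shows how circular block resampling picks indices: fixed-length blocks start at random positions and wrap around the end of the series. It is a pure-Python illustration with a hypothetical helper name, not the code this function wraps.

```python
import random
from typing import List


def circular_block_indices(n: int, block_size: int, rng: random.Random) -> List[int]:
    """Draw n resampled indices from fixed-length blocks that wrap around."""
    if n <= 0 or block_size <= 0:
        raise ValueError("n and block_size must be positive")
    indices: List[int] = []
    while len(indices) < n:
        start = rng.randrange(n)  # every position is an equally likely start
        # Modulo makes a block wrap from the end of the series to the start.
        indices.extend((start + offset) % n for offset in range(block_size))
    return indices[:n]  # trim the final block to the series length


rng = random.Random(0)
series = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0]
idx = circular_block_indices(len(series), block_size=3, rng=rng)
resampled = [series[i] for i in idx]
```

Larger blocks preserve more of the serial dependence in the original series, which is why the block size matters enough to be estimated when it is not given.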

create_gumboot_func(model: MetricsBasemodel) → Callable

Create the Gumboot bootstrap function.

create_shared_bootstrap_func(metrics: List[MetricsBasemodel], minimum_sample_size: int = 30, minimum_mean: float = 0.01, minimum_variance: float = 2.5e-05) → Callable

Create a single bootstrap UDF that evaluates multiple metrics per draw.

Every metric in the metrics list must share the same bootstrap configuration (same class, reps, seed, block_size, quantiles, and input fields).

Parameters:
  • metrics (List[MetricsBasemodel]) – Metrics sharing the same bootstrap config.

  • minimum_sample_size (int, optional) – Minimum sample count to run bootstrap. Default 30.

  • minimum_mean (float, optional) – Minimum mean value of primary series to run bootstrap. Default 0.01.

  • minimum_variance (float, optional) – Minimum variance of primary series to run bootstrap. Default 0.000025.

Returns:

Callable – UDF that returns a dict holding, for each metric, either its quantiles or its raw bootstrap array.
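The sketch below illustrates the sharing pattern: one index draw per repetition is reused by every metric, and the three minimum thresholds gate whether the bootstrap runs at all. Several things are assumptions made for illustration: the helper make_shared_bootstrap and its arguments are hypothetical, plain i.i.d. resampling stands in for the block bootstraps, skipped inputs return None, and only quantiles (not raw arrays) are returned.

```python
import random
import statistics
from typing import Callable, Dict, List, Optional, Sequence

MetricFunc = Callable[[Sequence[float], Sequence[float]], float]


def make_shared_bootstrap(
    metric_funcs: Dict[str, MetricFunc],
    reps: int,
    seed: int,
    quantiles: Sequence[float],
    minimum_sample_size: int = 30,
    minimum_mean: float = 0.01,
    minimum_variance: float = 2.5e-05,
) -> Callable[..., Dict[str, Optional[List[float]]]]:
    """Build a function that scores every metric on the same resampled draws."""

    def run(primary: Sequence[float], secondary: Sequence[float]):
        n = len(primary)
        # Assumption: inputs that fail a threshold yield None for each metric.
        if (n < max(minimum_sample_size, 1)
                or statistics.fmean(primary) < minimum_mean
                or statistics.pvariance(primary) < minimum_variance):
            return {name: None for name in metric_funcs}

        rng = random.Random(seed)
        draws: Dict[str, List[float]] = {name: [] for name in metric_funcs}
        for _ in range(reps):
            # One index draw per repetition, reused by every metric.
            idx = [rng.randrange(n) for _ in range(n)]
            p = [primary[i] for i in idx]
            s = [secondary[i] for i in idx]
            for name, func in metric_funcs.items():
                draws[name].append(func(p, s))

        result: Dict[str, Optional[List[float]]] = {}
        for name, values in draws.items():
            values.sort()
            # Nearest-rank quantile, chosen here only for simplicity.
            result[name] = [values[min(int(q * reps), reps - 1)] for q in quantiles]
        return result

    return run


def mean_error(p, s):
    return statistics.fmean(s) - statistics.fmean(p)


def mean_abs_error(p, s):
    return statistics.fmean([abs(b - a) for a, b in zip(p, s)])


primary = [1.0 + 0.1 * (i % 10) for i in range(60)]
secondary = [v + 0.05 for v in primary]
shared = make_shared_bootstrap({"me": mean_error, "mae": mean_abs_error},
                               reps=200, seed=1, quantiles=[0.05, 0.5, 0.95])
out = shared(primary, secondary)              # both metrics, same 200 draws
skipped = shared(primary[:5], secondary[:5])  # below minimum_sample_size
```

Drawing the indices once per repetition means the resampling cost is paid once per group rather than once per metric, and the per-metric results come from the same resampled data.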

create_stationary_func(model: MetricsBasemodel) → Callable

Create the Stationary bootstrap function.

If model.bootstrap.block_size is None, the block size is estimated with arch.bootstrap.optimal_block_length, taking the stationary-bootstrap estimate (b_sb column).
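As background, the stationary bootstrap differs from fixed-length block resampling in that block lengths are random, with block_size as their mean. The sketch below is a pure-Python illustration with a hypothetical helper name, not the code this function wraps: at each step a new block starts with probability 1 / block_size, otherwise the current block continues and wraps around the series end.

```python
import random
from typing import List


def stationary_indices(n: int, block_size: float, rng: random.Random) -> List[int]:
    """Draw n indices using blocks whose lengths are geometric with mean block_size."""
    if n <= 0 or block_size < 1:
        raise ValueError("n must be positive and block_size at least 1")
    p_new_block = 1.0 / block_size
    indices = [rng.randrange(n)]
    while len(indices) < n:
        if rng.random() < p_new_block:
            indices.append(rng.randrange(n))        # start a new block
        else:
            indices.append((indices[-1] + 1) % n)   # continue, wrapping around
    return indices


rng = random.Random(0)
series = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
idx = stationary_indices(len(series), block_size=3.0, rng=rng)
resampled = [series[i] for i in idx]
```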

partition_metrics_by_bootstrap(metrics: List[MetricsBasemodel]) → Tuple[List[MetricsBasemodel], Dict[tuple, List[MetricsBasemodel]]]

Split metrics into non-bootstrap and bootstrap-sharing groups.

Returns:

  • no_boot (list) – Metrics without a bootstrap config.

  • boot_groups (dict) – Mapping of group key → list of metrics that can share samples. Singleton groups (len==1) are included so callers can treat all bootstrap metrics uniformly.
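The return shape described above can be sketched as follows. The metrics here are plain dicts and partition and key_func are hypothetical names used only for illustration; the real function operates on MetricsBasemodel instances.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Optional, Tuple


def partition(
    metrics: List[dict],
    key_func: Callable[[dict], Optional[Hashable]],
) -> Tuple[List[dict], Dict[Hashable, List[dict]]]:
    """Split metrics into those without a bootstrap and groups keyed by config."""
    no_boot: List[dict] = []
    boot_groups: Dict[Hashable, List[dict]] = defaultdict(list)
    for metric in metrics:
        key = key_func(metric)
        if key is None:
            no_boot.append(metric)
        else:
            # Singleton groups are kept, so every bootstrap metric is in a group.
            boot_groups[key].append(metric)
    return no_boot, dict(boot_groups)


# Hypothetical metrics: "bootstrap" holds an already-hashable config or None.
metrics = [
    {"name": "kge", "bootstrap": ("CircularBlock", 500, 42)},
    {"name": "nse", "bootstrap": ("CircularBlock", 500, 42)},
    {"name": "rmse", "bootstrap": ("Stationary", 1000, 7)},
    {"name": "bias", "bootstrap": None},
]
no_boot, boot_groups = partition(metrics, key_func=lambda m: m["bootstrap"])
group_names = {key: [m["name"] for m in group] for key, group in boot_groups.items()}
```

Here kge and nse land in one group, rmse forms a singleton group, and bias goes to no_boot, so a caller can build one shared bootstrap function per group without special-casing single metrics.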