teehr.evaluation.fetch.Fetch.nwm_retrospective_grids
- Fetch.nwm_retrospective_grids(nwm_version: SupportedNWMRetroVersionsEnum, variable_name: ForcingVariablesEnum, start_date: str | datetime | Timestamp, end_date: str | datetime | Timestamp, calculate_zonal_weights: bool = True, overwrite_output: bool | None = False, chunk_by: NWMChunkByEnum | None = None, domain: SupportedNWMRetroDomainsEnum | None = 'CONUS', location_id_prefix: str | None = None, timeseries_type: TimeseriesTypeEnum = 'primary', write_mode: TableWriteEnum = 'append', zonal_weights_filepath: Path | str | None = None)
Fetch NWM retrospective gridded data, calculate zonal statistics (currently only the mean is available) of the selected variable for the given zones, and load the results into the TEEHR dataset.
Data is fetched for the location IDs in the locations table having a given location_id_prefix. All dates and times within the files and in the cache file names are in UTC.
The zonal weights file, which contains the fraction of each grid pixel that overlaps each zone, is required. It can be calculated and saved to the cache directory if it does not already exist.
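To illustrate how overlap fractions turn gridded values into a zonal mean, here is a minimal sketch in plain Python. It is not TEEHR's implementation: the function name `zonal_mean`, the `(zone_id, pixel, fraction)` row layout, and the sample grid are all hypothetical, and the sketch assumes every pixel has the same area, so the overlap fraction can be used directly as the weight.

```python
def zonal_mean(pixel_values, weights):
    """Return the weighted mean of pixel values for each zone.

    pixel_values: dict mapping a pixel index to its grid value.
    weights: iterable of (zone_id, pixel_index, fraction) rows, where
        fraction is the share of the pixel that overlaps the zone.
    """
    weighted_sums = {}
    weight_totals = {}
    for zone_id, pixel, fraction in weights:
        weighted_sums[zone_id] = (
            weighted_sums.get(zone_id, 0.0) + fraction * pixel_values[pixel]
        )
        weight_totals[zone_id] = weight_totals.get(zone_id, 0.0) + fraction
    # Normalize by the total weight so partial pixels count proportionally.
    return {z: weighted_sums[z] / weight_totals[z] for z in weighted_sums}


# Hypothetical 2x2 grid of values keyed by (row, col).
grid = {(0, 0): 2.0, (0, 1): 4.0, (1, 0): 6.0, (1, 1): 8.0}

# Hypothetical weights: pixel (0, 1) is split evenly between two zones.
weights = [
    ("huc10-A", (0, 0), 1.0),
    ("huc10-A", (0, 1), 0.5),
    ("huc10-B", (0, 1), 0.5),
    ("huc10-B", (1, 1), 1.0),
]

means = zonal_mean(grid, weights)
```

Because the weights depend only on the grid and the zone polygons, they can be computed once and reused across time steps, which is presumably why the file is cached.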
- Parameters:
nwm_version (SupportedNWMRetroVersionsEnum) – NWM retrospective version to fetch. Currently nwm21 and nwm30 are supported. Note that since there is no change in NWM configuration between version 2.1 and 2.2, no retrospective run was produced for v2.2.
variable_name (str) – Name of the NWM forcing data variable to download (e.g., “PRECIP”, “PSFC”, “Q2D”, …).
start_date (Union[str, datetime, pd.Timestamp]) – Date to begin data ingest. Str formats can include YYYY-MM-DD or MM/DD/YYYY. Rounds down to beginning of day.
    v2.0: 1993-01-01
    v2.1: 1979-01-01
    v3.0: 1979-02-01
end_date (Union[str, datetime, pd.Timestamp]) – Last date to fetch. Rounds up to end of day. Str formats can include YYYY-MM-DD or MM/DD/YYYY.
    v2.0: 2018-12-31
    v2.1: 2020-12-31
    v3.0: 2023-01-31
calculate_zonal_weights (bool) – Flag specifying whether or not to calculate zonal weights. True = calculate; False = use existing file. Default is True.
location_id_prefix (Optional[str]) – Prefix to include when filtering the locations table for polygon primary_location_id. Default is None, in which case all locations are included.
overwrite_output (bool) – Flag specifying whether or not to overwrite output files if they already exist. True = overwrite; False = fail.
chunk_by (Optional[NWMChunkByEnum]) – If None (default), saves all timeseries to a single file; otherwise the data is processed using the specified parameter. Can be ‘week’ or ‘month’ for gridded data.
domain (str) – Geographical domain when NWM version is v3.0. Acceptable values are “Alaska”, “CONUS” (default), “Hawaii”, and “PR”. Only relevant when NWM version equals v3.0.
timeseries_type (str) – Whether to consider as the “primary” or “secondary” timeseries. Default is “primary”.
zonal_weights_filepath (Optional[Union[Path, str]]) – The path to the zonal weights file. If None and calculate_zonal_weights is False, the weights file must exist in the cache for the configuration. Default is None.
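The date handling described for start_date and end_date, together with the ‘month’ option of chunk_by, can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code TEEHR runs: the helpers `parse_date`, `floor_to_day`, `ceil_to_day`, and `month_chunks` are hypothetical names, and "end of day" is assumed here to mean the last representable instant of that calendar day.

```python
from datetime import datetime, time, timedelta

# The two string formats named in the parameter descriptions.
_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(value):
    """Accept a datetime (or subclass) or a string in a supported format."""
    if isinstance(value, datetime):
        return value
    for fmt in _FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {value!r}")


def floor_to_day(value):
    """Round down to 00:00:00 of the same calendar day."""
    return datetime.combine(parse_date(value).date(), time.min)


def ceil_to_day(value):
    """Round up to the last instant of the same calendar day (assumption)."""
    return datetime.combine(parse_date(value).date(), time.max)


def month_chunks(start, end):
    """Split [start, end] into consecutive pieces that never cross a month."""
    chunks = []
    cursor = start
    while cursor <= end:
        # First instant of the month following the cursor.
        if cursor.month == 12:
            next_month = datetime(cursor.year + 1, 1, 1)
        else:
            next_month = datetime(cursor.year, cursor.month + 1, 1)
        chunk_end = min(end, next_month - timedelta(microseconds=1))
        chunks.append((cursor, chunk_end))
        cursor = next_month
    return chunks


start = floor_to_day("2000-01-15")
end = ceil_to_day("03/10/2000")
chunks = month_chunks(start, end)
```

With these inputs the range splits into three pieces (the remainder of January, all of February, and the first ten days of March), so each piece can be processed and written separately instead of holding the whole period in memory.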
Examples
Here we will calculate mean areal precipitation using NWM forcing data for the polygons in the locations table. Pixel weights (fraction of pixel overlap) are calculated for each polygon and stored in the evaluation cache directory.
(see: generate_weights_file() for weights calculation).
>>> from datetime import datetime
>>> import teehr
>>> ev = teehr.Evaluation()
>>> ev.fetch.nwm_retrospective_grids(
...     nwm_version="nwm30",
...     variable_name="RAINRATE",
...     calculate_zonal_weights=True,
...     start_date=datetime(2000, 1, 1),
...     end_date=datetime(2001, 1, 1),
...     location_id_prefix="huc10"
... )
Note
NWM data can also be fetched outside of a TEEHR Evaluation by calling the method directly.
>>> from pathlib import Path
>>> from teehr.fetching.nwm.retrospective_grids import nwm_retro_grids_to_parquet
Perform the calculations, writing to the specified directory.
>>> nwm_retro_grids_to_parquet(
...     nwm_version="nwm30",
...     variable_name="RAINRATE",
...     zonal_weights_filepath=Path(Path.home(), "nextgen_03S_weights.parquet"),
...     start_date="2020-12-18",
...     end_date="2022-12-18",
...     output_parquet_dir=Path(Path.home(), "temp/parquet"),
...     location_id_prefix="huc10",
... )
See also
teehr.fetching.nwm.nwm_grids.nwm_grids_to_parquet()