teehr.Fetch.nwm_retrospective_grids
- Fetch.nwm_retrospective_grids(nwm_version: SupportedNWMRetroVersionsEnum, variable_name: ForcingVariablesEnum, zonal_weights_filepath: str | Path, start_date: str | datetime | Timestamp, end_date: str | datetime | Timestamp, chunk_by: NWMChunkByEnum | None = None, overwrite_output: bool | None = False, domain: SupportedNWMRetroDomainsEnum | None = 'CONUS', location_id_prefix: str | None = None, timeseries_type: TimeseriesTypeEnum = 'primary')
Fetch NWM retrospective gridded data, calculate zonal statistics (currently only the mean is available) of the selected variable for the given zones, and load the results into the TEEHR dataset.
Data is fetched for all location IDs in the locations table. All dates and times within the files and in the cache file names are in UTC.
Pixel values are summarized to zones based on a pre-computed zonal weights file.
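The weighting step described above can be pictured as a weighted mean per zone. The following is only a minimal sketch, not TEEHR's implementation: the function name `zonal_weighted_mean`, the `(location_id, row, col, weight)` tuple layout, and the normalization by the sum of weights are all assumptions made for illustration.

```python
from collections import defaultdict


def zonal_weighted_mean(pixel_values, weights):
    """Hypothetical helper: weighted mean of pixel values per zone.

    pixel_values: dict mapping (row, col) -> pixel value.
    weights: iterable of (location_id, row, col, weight) tuples, where
        weight is the fraction of the pixel overlapping the zone
        (assumed layout, not the actual weights file schema).
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for location_id, row, col, weight in weights:
        weighted_sum[location_id] += weight * pixel_values[(row, col)]
        weight_total[location_id] += weight
    # Skip zones whose total weight is zero to avoid dividing by zero.
    return {
        loc: weighted_sum[loc] / weight_total[loc]
        for loc in weighted_sum
        if weight_total[loc] > 0
    }


# Made-up grid and weights for two zones.
grid = {(0, 0): 2.0, (0, 1): 4.0, (1, 0): 6.0}
weights = [
    ("ngen-101", 0, 0, 1.0),
    ("ngen-101", 0, 1, 0.5),
    ("ngen-102", 1, 0, 0.25),
]
means = zonal_weighted_mean(grid, weights)
```

Because the weights are pre-computed once per set of zones, the same file can be reused for every time step and variable on the same grid.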
- Parameters:
  - nwm_version (SupportedNWMRetroVersionsEnum) – NWM retrospective version to fetch. Currently nwm21 and nwm30 supported.
  - variable_name (str) – Name of the NWM forcing data variable to download (e.g., “PRECIP”, “PSFC”, “Q2D”, …).
  - zonal_weights_filepath (str) – Path to the array containing fraction of pixel overlap for each zone. The values in the location_id field from the zonal weights file are used in the output of this function.
  - start_date (Union[str, datetime, pd.Timestamp]) – Date to begin data ingest. Str formats can include YYYY-MM-DD or MM/DD/YYYY. Rounds down to beginning of day.
  - end_date (Union[str, datetime, pd.Timestamp]) – Last date to fetch. Rounds up to end of day. Str formats can include YYYY-MM-DD or MM/DD/YYYY.
  - chunk_by (Union[NWMChunkByEnum, None] = None) – If None (default) saves all timeseries to a single file, otherwise the data is processed using the specified parameter. Can be: ‘week’ or ‘month’ for gridded data.
  - overwrite_output (bool = False) – Whether output should overwrite files if they exist. Default is False.
  - domain (str = "CONUS") – Geographical domain when NWM version is v3.0. Acceptable values are “Alaska”, “CONUS” (default), “Hawaii”, and “PR”. Only relevant when NWM version equals v3.0.
  - location_id_prefix (Union[str, None]) – Optional location ID prefix to add (prepend) or replace.
  - timeseries_type (str) – Whether to consider as the “primary” or “secondary” timeseries. Default is “primary”.
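As a rough illustration of the date handling described for start_date and end_date, the sketch below parses the two documented string formats and rounds to day boundaries. It is not TEEHR's code: `parse_day` and `normalize_range` are hypothetical helpers, and treating "end of day" as 23:59:59.999999 is an assumption.

```python
from datetime import datetime, time


def parse_day(value):
    """Hypothetical helper: accept a datetime or a documented string format."""
    if isinstance(value, datetime):  # also covers pd.Timestamp, a datetime subclass
        return value
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {value!r}")


def normalize_range(start_date, end_date):
    """Round start down to the beginning of its day and end up to the end of its day."""
    start = datetime.combine(parse_day(start_date).date(), time.min)
    # Assumption: "end of day" is taken as the last representable instant.
    end = datetime.combine(parse_day(end_date).date(), time.max)
    if end < start:
        raise ValueError("end_date must not be earlier than start_date")
    return start, end


start, end = normalize_range("2000-01-01", "01/02/2000")
```

With this reading, a start_date and end_date on the same calendar day still cover that whole day.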
Notes
The location_id values in the zonal weights file are used as location IDs in the output of this function. If a prefix is specified, it is prepended to location_id values that have no prefix, and it replaces the existing prefix on values that already have one. It is assumed that the location_id follows the pattern ‘[prefix]-[unique id]’.
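The prefix rule in the note above can be sketched as follows. This is an illustrative reading only: `apply_location_id_prefix` is a hypothetical helper, not a TEEHR function, and splitting on the first hyphen is an assumption based on the ‘[prefix]-[unique id]’ pattern.

```python
def apply_location_id_prefix(location_id, prefix=None):
    """Hypothetical helper: prepend a prefix, or replace the existing one."""
    if prefix is None:
        # No prefix requested: keep the ID from the weights file unchanged.
        return location_id
    # Assumption: everything before the first hyphen is the existing prefix.
    head, sep, tail = location_id.partition("-")
    unique_id = tail if sep else head
    return f"{prefix}-{unique_id}"


replaced = apply_location_id_prefix("ngen-123", "usgs")
prepended = apply_location_id_prefix("123", "usgs")
unchanged = apply_location_id_prefix("ngen-123")
```

Under this reading, a unique ID that itself contains a hyphen would have its first segment treated as a prefix, which is why the ‘[prefix]-[unique id]’ assumption matters.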
Examples
Here we calculate mean areal precipitation from NWM forcing data for some watersheds (polygons), using a pre-calculated weights file (see generate_weights_file() for the weights calculation).
>>> from datetime import datetime
>>> from pathlib import Path
>>> import teehr
>>> ev = teehr.Evaluation()
>>> ev.fetch.nwm_retrospective_grids(
...     nwm_version="nwm30",
...     variable_name="RAINRATE",
...     zonal_weights_filepath=Path(Path.home(), "nextgen_03S_weights.parquet"),
...     start_date=datetime(2000, 1, 1),
...     end_date=datetime(2001, 1, 1)
... )
Note
NWM data can also be fetched outside of a TEEHR Evaluation by calling the underlying function directly.
>>> from pathlib import Path
>>> from teehr.fetching.nwm.retrospective_grids import nwm_retro_grids_to_parquet
Perform the calculations, writing to the specified directory.
>>> nwm_retro_grids_to_parquet(
...     nwm_version="nwm30",
...     variable_name="RAINRATE",
...     zonal_weights_filepath=Path(Path.home(), "nextgen_03S_weights.parquet"),
...     start_date="2020-12-18",
...     end_date="2022-12-18",
...     output_parquet_dir=Path(Path.home(), "temp/parquet")
... )
See also
teehr.fetching.nwm.nwm_grids.nwm_grids_to_parquet()