Read

class Read(ev=None)

Class to handle reading evaluation data from storage.

Methods

`from_cache`	Read data from table directory as a spark dataframe.
`from_warehouse`	Read data from table as a spark dataframe.

from_cache(path: str | Path, table_schema: DataFrameSchema, pattern: str = '**/*.parquet', file_format: str = 'parquet', show_missing_table_warning: bool = False, **options) → DataFrame

Read data from table directory as a spark dataframe.

Parameters:

path (Union[str, Path]) – The path to the cache directory containing the files.
table_schema (SparkDataFrameSchema) – The schema of the table.
pattern (str, optional) – The pattern to match files. The default is “**/*.parquet”.
file_format (str, optional) – The file format to read. The default is “parquet”.
show_missing_table_warning (bool, optional) – If True, show a warning when an empty table is returned. The default is False.
**options – Additional options to pass to the spark read method.

Returns:

df (ps.DataFrame) – The spark dataframe.
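
To see which files the default pattern would select, here is a minimal sketch using only the Python standard library. It assumes that pattern is applied as a recursive glob relative to path; the source does not confirm how from_cache resolves it internally. The helper list_matching_files is hypothetical and is not part of the Read class.

```python
import tempfile
from pathlib import Path


def list_matching_files(path, pattern="**/*.parquet"):
    """Hypothetical helper: return files under `path` that match `pattern`.

    Assumption: the pattern behaves like a recursive pathlib glob.
    """
    return sorted(p for p in Path(path).glob(pattern) if p.is_file())


# Build a throwaway cache directory with nested and top-level files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a").mkdir()
    (root / "a" / "part-0.parquet").touch()
    (root / "top.parquet").touch()
    (root / "notes.txt").touch()  # not matched by the pattern

    # Store paths relative to the cache root for readability.
    matched = [p.relative_to(root).as_posix() for p in list_matching_files(root)]

print(matched)
```

With a glob of this kind, "**" also matches zero directories, so parquet files sitting directly in the cache directory are picked up alongside nested ones.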

from_warehouse(table_name: str, catalog_name: str = None, namespace_name: str = None) → DataFrame

Read data from table as a spark dataframe.

Parameters:

table_name (str) – The name of the table to read.
catalog_name (str, optional) – The catalog name. If None, uses the active catalog. The default is None.
namespace_name (str, optional) – The namespace name. If None, uses the active namespace. The default is None.

Returns:

df (ps.DataFrame) – The spark dataframe.

Notes

Most users should read tables through ev.table, which calls this method under the hood. Call this method directly only when you need to read from the warehouse outside the context of a table.

For example:

>>> sdf = ev.table("my_table").to_sdf()
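
The fallback behavior described for catalog_name and namespace_name can be sketched in plain Python. This is an illustration only: the helper qualified_table_name, its active_catalog and active_namespace arguments, and the dot-joined identifier format are assumptions, not the library's implementation.

```python
def qualified_table_name(
    table_name,
    catalog_name=None,
    namespace_name=None,
    *,
    active_catalog,
    active_namespace,
):
    """Hypothetical helper: build a dot-joined table identifier.

    Assumption: None means "use the active catalog or namespace", and the
    identifier takes the form catalog.namespace.table.
    """
    catalog = active_catalog if catalog_name is None else catalog_name
    namespace = active_namespace if namespace_name is None else namespace_name
    return f"{catalog}.{namespace}.{table_name}"


# Both names omitted: fall back to the active catalog and namespace.
default_name = qualified_table_name(
    "my_table", active_catalog="cat", active_namespace="ns"
)

# Explicit namespace: only the catalog falls back to the active one.
override_name = qualified_table_name(
    "my_table",
    namespace_name="other",
    active_catalog="cat",
    active_namespace="ns",
)

print(default_name)
print(override_name)
```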
