pandas.read_parquet
- pandas.read_parquet(path, engine='auto', columns=None, use_nullable_dtypes=False, **kwargs)
Load a parquet object from the file path, returning a DataFrame.
- Parameters
- path : str, path object or file-like object
Any valid string path is acceptable. The string could be a URL. Valid URL schemes include http, ftp, s3, gs, and file. For file URLs, a host is expected. A local file could be: file://localhost/path/to/table.parquet. A file URL can also be a path to a directory that contains multiple partitioned parquet files. Both pyarrow and fastparquet support paths to directories as well as file URLs. A directory path could be: file://localhost/path/to/tables or s3://bucket/partition_dir.
If you want to pass in a path object, pandas accepts any os.PathLike.
By file-like object, we refer to objects with a binary read() method, such as a file handle opened in binary mode (e.g. via the builtin open function) or io.BytesIO.
- engine : {‘auto’, ‘pyarrow’, ‘fastparquet’}, default ‘auto’
Parquet library to use. If ‘auto’, then the option io.parquet.engine is used. The default io.parquet.engine behavior is to try ‘pyarrow’, falling back to ‘fastparquet’ if ‘pyarrow’ is unavailable.
- columns : list, default=None
If not None, only these columns will be read from the file.
- use_nullable_dtypes : bool, default False
If True, use dtypes that use pd.NA as missing value indicator for the resulting DataFrame (only applicable for engine="pyarrow"). As new dtypes are added that support pd.NA in the future, the output with this option will change to use those dtypes. Note: this is an experimental option, and behaviour (e.g. additional supported dtypes) may change without notice.
New in version 1.2.0.
- **kwargs
Any additional kwargs are passed to the engine.
- Returns
- DataFrame