pandas.read_parquet

pandas.read_parquet(path, engine='auto', columns=None, storage_options=None, use_nullable_dtypes=_NoDefault.no_default, dtype_backend=_NoDefault.no_default, **kwargs)[source]
Load a parquet object from the file path, returning a DataFrame.

Parameters
path : str, path object or file-like object
    String, path object (implementing os.PathLike[str]), or file-like object implementing a binary read() function. The string could be a URL. Valid URL schemes include http, ftp, s3, gs, and file. For file URLs, a host is expected. A local file could be: file://localhost/path/to/table.parquet. A file URL can also be a path to a directory that contains multiple partitioned parquet files. Both pyarrow and fastparquet support paths to directories as well as file URLs. A directory path could be: file://localhost/path/to/tables or s3://bucket/partition_dir.
engine : {‘auto’, ‘pyarrow’, ‘fastparquet’}, default ‘auto’
    Parquet library to use. If ‘auto’, then the option io.parquet.engine is used. The default io.parquet.engine behavior is to try ‘pyarrow’, falling back to ‘fastparquet’ if ‘pyarrow’ is unavailable.
columns : list, default None
    If not None, only these columns will be read from the file (see the first example below).
storage_options : dict, optional
    Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with “s3://” and “gcs://”) the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details; for more examples on storage options, refer to the pandas IO documentation and the remote-read example below.

    New in version 1.3.0.
use_nullable_dtypes : bool, default False
    If True, use dtypes that use pd.NA as the missing value indicator for the resulting DataFrame (only applicable for the pyarrow engine). As new dtypes that support pd.NA are added in the future, the output with this option will change to use those dtypes. Note: this is an experimental option, and behaviour (e.g. additional supported dtypes) may change without notice.

    Deprecated since version 2.0.
dtype_backend : {“numpy_nullable”, “pyarrow”}, defaults to NumPy-backed DataFrames
    Which dtype_backend to use. By default the returned DataFrame is backed by NumPy arrays. When “numpy_nullable” is set, nullable dtypes are used for all dtypes that have a nullable implementation; when “pyarrow” is set, pyarrow-backed ArrowDtype is used for all dtypes (see the last example below).

    The dtype_backends are still experimental.

    New in version 2.0.
**kwargs
    Any additional kwargs are passed to the engine.
 
Returns

DataFrame
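
Examples

The snippets below are minimal usage sketches, not part of the API above; the file path data/table.parquet is hypothetical, and the pyarrow engine must be installed for them to run as written. Restricting the read with columns and pinning the engine looks like this:

import pandas as pd

# Read only two columns from a local parquet file (hypothetical path),
# forcing the pyarrow engine instead of resolving via io.parquet.engine.
df = pd.read_parquet("data/table.parquet", columns=["a", "b"], engine="pyarrow")
print(df.head())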
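
Reading from remote storage forwards storage_options to fsspec.open. The bucket and prefix below are hypothetical, anonymous access ({"anon": True}) is an assumption about the bucket's permissions, and s3fs must be installed for s3:// URLs:

import pandas as pd

# Read a partitioned parquet dataset from S3 without credentials
# (assumes the bucket allows anonymous access).
df = pd.read_parquet(
    "s3://bucket/partition_dir",
    storage_options={"anon": True},
)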
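
To get nullable extension dtypes instead of the default NumPy-backed result, pass dtype_backend (pandas 2.0 or later; the path is again hypothetical and pyarrow is required for the "pyarrow" backend):

import pandas as pd

# Return pyarrow-backed ArrowDtype columns; missing values surface as pd.NA.
df = pd.read_parquet("data/table.parquet", dtype_backend="pyarrow")
print(df.dtypes)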