pandas.read_orc

pandas.read_orc(path, columns=None, use_nullable_dtypes=False, **kwargs)

Load an ORC object from the file path, returning a DataFrame.

New in version 1.0.0.

Parameters
path : str, path object, or file-like object

String, path object (implementing os.PathLike[str]), or file-like object implementing a binary read() function. The string could be a URL. Valid URL schemes include http, ftp, s3, and file. For file URLs, a host is expected. A local file could be: file://localhost/path/to/table.orc.

columns : list, default None

If not None, only these columns will be read from the file. Output always follows the ordering of the file and not the columns list. This mirrors the original behaviour of pyarrow.orc.ORCFile.read().

use_nullable_dtypes : bool, default False

If True, use dtypes that use pd.NA as the missing value indicator for the resulting DataFrame.

The nullable dtype implementation can be configured by setting the global io.nullable_backend configuration option to "pandas" to use numpy-backed nullable dtypes or "pyarrow" to use pyarrow-backed nullable dtypes (using pd.ArrowDtype).

New in version 2.0.0.

**kwargs

Any additional kwargs are passed to pyarrow.

Returns
DataFrame

Notes

Before using this function, read the user guide section on ORC and install the optional dependencies.