pandas.read_orc#
- pandas.read_orc(path, columns=None, dtype_backend=<no_default>, filesystem=None, **kwargs)[source]#
- Load an ORC object from the file path, returning a DataFrame. - Parameters:
- pathstr, path object, or file-like object
- String, path object (implementing - os.PathLike[str]), or file-like object implementing a binary- read()function. The string could be a URL. Valid URL schemes include http, ftp, s3, and file. For file URLs, a host is expected. A local file could be:- file://localhost/path/to/table.orc.
- columnslist, default None
- If not None, only these columns will be read from the file. Output always follows the ordering of the file and not the columns list. This mirrors the original behaviour of - pyarrow.orc.ORCFile.read().
- dtype_backend{‘numpy_nullable’, ‘pyarrow’}, default ‘numpy_nullable’
- Back-end data type applied to the resultant - DataFrame(still experimental). Behaviour is as follows:- "numpy_nullable": returns nullable-dtype-backed- DataFrame(default).
- "pyarrow": returns pyarrow-backed nullable- ArrowDtypeDataFrame.
 - Added in version 2.0. 
- filesystemfsspec or pyarrow filesystem, default None
- Filesystem object to use when reading the parquet file. - Added in version 2.1.0. 
- **kwargs
- Any additional kwargs are passed to pyarrow. 
 
- Returns:
- DataFrame
 
 - Notes - Before using this function you should read the user guide about ORC and install optional dependencies. - If - pathis a URI scheme pointing to a local or remote file (e.g. “s3://”), a- pyarrow.fsfilesystem will be attempted to read the file. You can also pass a pyarrow or fsspec filesystem object into the filesystem keyword to override this behavior.- Examples - >>> result = pd.read_orc("example_pa.orc")