pandas.ExcelFile

class pandas.ExcelFile(path_or_buffer, engine=None, storage_options=None, engine_kwargs=None)

Class for parsing tabular Excel sheets into DataFrame objects.

See read_excel for more documentation.

Parameters:
path_or_buffer : str, bytes, pathlib.Path, file-like object, xlrd workbook or openpyxl workbook

The workbook to read. If a string or path object, it is expected to be a path to a .xls, .xlsx, .xlsb, .xlsm, .odf, .ods, or .odt file.

engine : str, default None

If path_or_buffer is not a buffer or path, this must be set to identify it. Supported engines: xlrd, openpyxl, odf, pyxlsb, calamine.

Engine compatibility:

  • xlrd supports old-style Excel files (.xls).

  • openpyxl supports newer Excel file formats.

  • odf supports OpenDocument file formats (.odf, .ods, .odt).

  • pyxlsb supports binary Excel files (.xlsb).

  • calamine supports Excel (.xls, .xlsx, .xlsm, .xlsb) and OpenDocument (.ods) file formats.

Changed in version 1.2.0: The engine xlrd now only supports old-style .xls files.

When engine=None, the following logic will be used to determine the engine:

  • If path_or_buffer is an OpenDocument format (.odf, .ods, .odt), then odf will be used.

  • Otherwise if path_or_buffer is an xls format, xlrd will be used.

  • Otherwise if path_or_buffer is in xlsb format, pyxlsb will be used.

Added in version 1.3.0:

  • Otherwise if openpyxl is installed, then openpyxl will be used.

  • Otherwise if xlrd >= 2.0 is installed, a ValueError will be raised.

Warning

Please do not report issues when using xlrd to read .xlsx files. This is not supported; switch to openpyxl instead.
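The fallback order described for engine=None can be sketched in a few lines of Python. This is only an illustration: guess_engine and its openpyxl_installed parameter are hypothetical names, not part of pandas, and the sketch looks only at the file extension. It also simplifies the last step by raising ValueError whenever openpyxl is missing, without checking the installed xlrd version.

```python
import importlib.util
from pathlib import Path


def guess_engine(path, openpyxl_installed=None):
    """Hypothetical helper (not a pandas API): mimic the documented
    engine=None fallback order using only the file extension."""
    suffix = Path(path).suffix.lower()

    # OpenDocument formats are handled by odf.
    if suffix in {".odf", ".ods", ".odt"}:
        return "odf"
    # Old-style Excel files are handled by xlrd.
    if suffix == ".xls":
        return "xlrd"
    # Binary Excel files are handled by pyxlsb.
    if suffix == ".xlsb":
        return "pyxlsb"

    # Everything else falls through to openpyxl, if it is available.
    if openpyxl_installed is None:
        openpyxl_installed = importlib.util.find_spec("openpyxl") is not None
    if openpyxl_installed:
        return "openpyxl"

    # Simplification: the documented rule raises only when xlrd >= 2.0
    # is installed; this sketch raises whenever openpyxl is missing.
    raise ValueError(f"No suitable engine found for {path!r}")
```

Passing openpyxl_installed explicitly makes the fallback easy to exercise without changing the environment, for example guess_engine("report.xlsx", openpyxl_installed=True).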

storage_options : dict, optional

Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with “s3://” and “gcs://”) the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details. For more examples on storage options, see the pandas I/O user guide.
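To make the HTTP(S) case concrete, the snippet below shows what forwarding the key-value pairs as header options amounts to, using only the standard library. The URL and header value are hypothetical, and the snippet is a rough equivalent of the forwarding step, not a copy of the pandas internals.

```python
import urllib.request

# Hypothetical URL and header values, chosen only for illustration.
url = "https://example.com/reports/sales.xlsx"
storage_options = {"User-Agent": "my-analysis-script/1.0"}

# For an HTTP(S) URL, the key-value pairs end up as request headers,
# roughly as if the request had been built like this:
request = urllib.request.Request(url, headers=storage_options)
print(request.header_items())

# The corresponding pandas call (not executed here, it needs network access):
# xls = pd.ExcelFile(url, storage_options=storage_options)
```

No request is sent in this sketch; it only constructs the Request object so the headers can be inspected.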

engine_kwargs : dict, optional

Arbitrary keyword arguments passed to the Excel engine.

See also

DataFrame.to_excel

Write DataFrame to an Excel file.

DataFrame.to_csv

Write DataFrame to a comma-separated values (csv) file.

read_csv

Read a comma-separated values (csv) file into DataFrame.

read_fwf

Read a table of fixed-width formatted lines into DataFrame.

Examples

>>> file = pd.ExcelFile("myfile.xlsx")  
>>> with pd.ExcelFile("myfile.xls") as xls:  
...     df1 = pd.read_excel(xls, "Sheet1")  

Attributes

book

Gets the Excel workbook.

sheet_names

Names of the sheets in the document.

Methods

close()

Close io if necessary.

parse([sheet_name, header, names, ...])

Parse specified sheet(s) into a DataFrame.
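The attributes and methods listed above can be combined as in the following sketch. It assumes openpyxl is installed and builds a small two-sheet workbook in memory so that it does not depend on any file on disk; the sheet names and column contents are made up for the example.

```python
import io

import pandas as pd

# Build a small two-sheet workbook in memory (requires openpyxl).
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    pd.DataFrame({"a": [1, 2]}).to_excel(writer, sheet_name="first", index=False)
    pd.DataFrame({"b": [3.5]}).to_excel(writer, sheet_name="second", index=False)
buffer.seek(0)

# Open the workbook once, then read every sheet from the same handle.
# Leaving the with-block calls close() automatically.
with pd.ExcelFile(buffer, engine="openpyxl") as xls:
    names = xls.sheet_names
    frames = {name: xls.parse(name) for name in names}

print(names)
print(frames["first"])
```

Opening the workbook once and calling parse per sheet avoids re-opening the file for each sheet, which is the main reason to use ExcelFile directly instead of repeated read_excel calls on a path.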