pandas.DataFrame.info#

DataFrame.info(verbose=None, buf=None, max_cols=None, memory_usage=None, show_counts=None)[source]#

Print a concise summary of a DataFrame.

This method prints information about a DataFrame including the index dtype and columns, non-NA values and memory usage.

Parameters:

verbosebool, optional

Whether to print the full summary. By default, the setting in pandas.options.display.max_info_columns is followed.

bufwritable buffer, defaults to sys.stdout

Where to send the output. By default, the output is printed to sys.stdout. Pass a writable buffer if you need to further process the output.

max_colsint, optional

When to switch from the verbose to the truncated output. If the DataFrame has more than max_cols columns, the truncated output is used. By default, the setting in pandas.options.display.max_info_columns is used.

memory_usagebool, str, optional

Specifies whether total memory usage of the DataFrame elements (including the index) should be displayed. By default, this follows the pandas.options.display.memory_usage setting.

True always show memory usage. False never shows memory usage. A value of ‘deep’ is equivalent to “True with deep introspection”. Memory usage is shown in human-readable units (base-2 representation). Without deep introspection a memory estimation is made based in column dtype and number of rows assuming values consume the same memory amount for corresponding dtypes. With deep memory introspection, a real memory usage calculation is performed at the cost of computational resources. See the Frequently Asked Questions for more details.

show_countsbool, optional

Whether to show the non-null counts. By default, this is shown only if the DataFrame is smaller than pandas.options.display.max_info_rows and pandas.options.display.max_info_columns. A value of True always shows the counts, and False never shows the counts.

Returns:

None: This method prints a summary of a DataFrame and returns None.

See also

DataFrame.describe: Generate descriptive statistics of DataFrame columns.
DataFrame.memory_usage: Memory usage of DataFrame columns.

Examples

>>> int_values = [1, 2, 3, 4, 5]
>>> text_values = ["alpha", "beta", "gamma", "delta", "epsilon"]
>>> float_values = [0.0, 0.25, 0.5, 0.75, 1.0]
>>> df = pd.DataFrame(
...     {
...         "int_col": int_values,
...         "text_col": text_values,
...         "float_col": float_values,
...     }
... )
>>> df
    int_col text_col  float_col
0        1    alpha       0.00
1        2     beta       0.25
2        3    gamma       0.50
3        4    delta       0.75
4        5  epsilon       1.00

Prints information of all columns:

>>> df.info(verbose=True)
<class 'pandas.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 3 columns):
 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   int_col    5 non-null      int64
 1   text_col   5 non-null      str
 2   float_col  5 non-null      float64
dtypes: float64(1), int64(1), str(1)
memory usage: 278.0 bytes

Prints a summary of columns count and its dtypes but not per column information:

>>> df.info(verbose=False)
<class 'pandas.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Columns: 3 entries, int_col to float_col
dtypes: float64(1), int64(1), str(1)
memory usage: 278.0 bytes

Pipe output of DataFrame.info to buffer instead of sys.stdout, get buffer content and writes to a text file:

>>> import io
>>> buffer = io.StringIO()
>>> df.info(buf=buffer)
>>> s = buffer.getvalue()
>>> with open("df_info.txt", "w", encoding="utf-8") as f:
...     f.write(s)
260

The memory_usage parameter allows deep introspection mode, specially useful for big DataFrames and fine-tune memory optimization:

>>> random_strings_array = np.random.choice(["a", "b", "c"], 10**6)
>>> df = pd.DataFrame(
...     {
...         "column_1": np.random.choice(["a", "b", "c"], 10**6),
...         "column_2": np.random.choice(["a", "b", "c"], 10**6),
...         "column_3": np.random.choice(["a", "b", "c"], 10**6),
...     }
... )
>>> df.info()
<class 'pandas.DataFrame'>
RangeIndex: 1000000 entries, 0 to 999999
Data columns (total 3 columns):
 #   Column    Non-Null Count    Dtype
---  ------    --------------    -----
 0   column_1  1000000 non-null  str
 1   column_2  1000000 non-null  str
 2   column_3  1000000 non-null  str
dtypes: str(3)
memory usage: 25.7 MB

>>> df.info(memory_usage="deep")
<class 'pandas.DataFrame'>
RangeIndex: 1000000 entries, 0 to 999999
Data columns (total 3 columns):
 #   Column    Non-Null Count    Dtype
---  ------    --------------    -----
 0   column_1  1000000 non-null  str
 1   column_2  1000000 non-null  str
 2   column_3  1000000 non-null  str
dtypes: str(3)
memory usage: 25.7 MB