DataFrame
Constructor
  | 
Two-dimensional, size-mutable, potentially heterogeneous tabular data.  | 
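As a minimal sketch of the constructor, assuming a dict of equal-length lists as input (the column names, index labels, and values are invented for illustration):

```python
import pandas as pd

# Hypothetical data: each dict key becomes a column label,
# and the optional index argument supplies the row labels.
df = pd.DataFrame(
    {"name": ["a", "b", "c"], "score": [1.5, 2.0, 3.25]},
    index=["r1", "r2", "r3"],
)
print(df)
```

Columns may hold different dtypes, which is what "potentially heterogeneous" refers to.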
Attributes and underlying data
Axes
The index (row labels) of the DataFrame.  | 
|
The column labels of the DataFrame.  | 
Return the dtypes in the DataFrame.  | 
|
  | 
Print a concise summary of a DataFrame.  | 
  | 
Return a subset of the DataFrame's columns based on the column dtypes.  | 
Return a NumPy representation of the DataFrame.  | 
|
Return a list representing the axes of the DataFrame.  | 
|
Return an int representing the number of axes / array dimensions.  | 
|
Return an int representing the number of elements in this object.  | 
|
Return a tuple representing the dimensionality of the DataFrame.  | 
|
  | 
Return the memory usage of each column in bytes.  | 
Indicator whether Series/DataFrame is empty.  | 
|
  | 
Return a new object with updated flags.  | 
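The attributes described above can be read directly off a frame. A short sketch, using an invented two-column frame:

```python
import pandas as pd

# Hypothetical frame with one numeric and one string column.
df = pd.DataFrame({"x": [1, 2, 3], "y": ["a", "b", "c"]})

n_rows, n_cols = df.shape   # dimensionality as a tuple
n_dims = df.ndim            # number of axes
n_cells = df.size           # number of elements (rows * columns)
axes = df.axes              # list holding the row index and the column index
is_empty = df.empty         # True only if some axis has length 0

# Subset of columns chosen by dtype: only the numeric column remains.
numeric = df.select_dtypes(include="number")
```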
Conversion
  | 
Cast a pandas object to a specified dtype.  | 
  | 
Convert columns to best possible dtypes using dtypes supporting pd.NA.  | 
Attempt to infer better dtypes for object columns.  | 
|
  | 
Make a copy of this object's indices and data.  | 
Return the bool of a single element Series or DataFrame.  | 
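A brief sketch of casting, dtype inference, and copying, assuming small invented frames:

```python
import pandas as pd

# Hypothetical frame whose numbers are stored as strings.
df = pd.DataFrame({"n": ["1", "2", "3"]})

# Cast one column to a specified dtype.
as_int = df.astype({"n": "int64"})

# Copy the data so that later edits do not touch the original.
dup = df.copy()
dup.loc[0, "n"] = "99"

# A column with a missing value is converted to a nullable dtype.
nullable = pd.DataFrame({"a": [1, 2, None]}).convert_dtypes()
```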
Indexing, iteration
  | 
Return the first n rows.  | 
Access a single value for a row/column label pair.  | 
|
Access a single value for a row/column pair by integer position.  | 
|
Access a group of rows and columns by label(s) or a boolean array.  | 
|
Purely integer-location based indexing for selection by position.  | 
|
  | 
Insert column into DataFrame at specified location.  | 
Iterate over info axis.  | 
|
Iterate over (column name, Series) pairs.  | 
|
(DEPRECATED) Iterate over (column name, Series) pairs.  | 
|
Get the 'info axis' (see Indexing for more).  | 
|
Iterate over DataFrame rows as (index, Series) pairs.  | 
|
  | 
Iterate over DataFrame rows as namedtuples.  | 
  | 
(DEPRECATED) Label-based "fancy indexing" function for DataFrame.  | 
  | 
Return item and drop from frame.  | 
  | 
Return the last n rows.  | 
  | 
Return cross-section from the Series/DataFrame.  | 
  | 
Get item from object for given key (ex: DataFrame column).  | 
  | 
Whether each element in the DataFrame is contained in values.  | 
  | 
Replace values where the condition is False.  | 
  | 
Replace values where the condition is True.  | 
  | 
Query the columns of a DataFrame with a boolean expression.  | 
For more information on .at, .iat, .loc, and
.iloc, see the indexing documentation.
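The four accessors differ in whether they take labels or integer positions, and whether they return a scalar or a selection. A small sketch on an invented frame:

```python
import pandas as pd

# Hypothetical frame with string row labels.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]}, index=["x", "y", "z"])

by_label = df.at["y", "b"]        # single value by row/column label
by_pos = df.iat[0, 1]             # single value by integer position
rows = df.loc[["x", "z"], "a"]    # label-based selection, returns a Series
first_two = df.iloc[:2]           # position-based selection

# where() keeps values where the condition holds and puts NaN elsewhere.
kept = df.where(df > 2)

# query() filters rows with a boolean expression over column names.
filtered = df.query("a >= 2 and b < 30")
```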
Binary operator functions
  | 
Get Addition of dataframe and other, element-wise (binary operator add).  | 
  | 
Get Subtraction of dataframe and other, element-wise (binary operator sub).  | 
  | 
Get Multiplication of dataframe and other, element-wise (binary operator mul).  | 
  | 
Get Floating division of dataframe and other, element-wise (binary operator truediv).  | 
  | 
Get Floating division of dataframe and other, element-wise (binary operator truediv).  | 
  | 
Get Integer division of dataframe and other, element-wise (binary operator floordiv).  | 
  | 
Get Modulo of dataframe and other, element-wise (binary operator mod).  | 
  | 
Get Exponential power of dataframe and other, element-wise (binary operator pow).  | 
  | 
Compute the matrix multiplication between the DataFrame and other.  | 
  | 
Get Addition of dataframe and other, element-wise (binary operator radd).  | 
  | 
Get Subtraction of dataframe and other, element-wise (binary operator rsub).  | 
  | 
Get Multiplication of dataframe and other, element-wise (binary operator rmul).  | 
  | 
Get Floating division of dataframe and other, element-wise (binary operator rtruediv).  | 
  | 
Get Floating division of dataframe and other, element-wise (binary operator rtruediv).  | 
  | 
Get Integer division of dataframe and other, element-wise (binary operator rfloordiv).  | 
  | 
Get Modulo of dataframe and other, element-wise (binary operator rmod).  | 
  | 
Get Exponential power of dataframe and other, element-wise (binary operator rpow).  | 
  | 
Get Less than of dataframe and other, element-wise (binary operator lt).  | 
  | 
Get Greater than of dataframe and other, element-wise (binary operator gt).  | 
  | 
Get Less than or equal to of dataframe and other, element-wise (binary operator le).  | 
  | 
Get Greater than or equal to of dataframe and other, element-wise (binary operator ge).  | 
  | 
Get Not equal to of dataframe and other, element-wise (binary operator ne).  | 
  | 
Get Equal to of dataframe and other, element-wise (binary operator eq).  | 
  | 
Perform column-wise combine with another DataFrame.  | 
  | 
Update null elements with value in the same location in other.  | 
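The method forms of the operators align on labels like the operators themselves, but also accept extra arguments such as fill_value. A sketch with two invented frames whose indexes only partly overlap:

```python
import pandas as pd

# Hypothetical frames sharing only row label 1.
left = pd.DataFrame({"a": [1, 2]}, index=[0, 1])
right = pd.DataFrame({"a": [10, 20]}, index=[1, 2])

plain = left + right                     # NaN where a label is missing on one side
filled = left.add(right, fill_value=0)   # treat the missing side as 0
reflected = left.rsub(10)                # reflected form: 10 - left

# combine_first fills null elements of the caller from the other frame.
patched = plain.combine_first(left)
```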
Function application, GroupBy & window
  | 
Apply a function along an axis of the DataFrame.  | 
  | 
Apply a function to a DataFrame elementwise.  | 
  | 
Apply chainable functions that expect Series or DataFrames.  | 
  | 
Aggregate using one or more operations over the specified axis.  | 
  | 
Aggregate using one or more operations over the specified axis.  | 
  | 
Call func on self producing a DataFrame with the same axis shape as self.  | 
  | 
Group DataFrame using a mapper or by a Series of columns.  | 
  | 
Provide rolling window calculations.  | 
  | 
Provide expanding window calculations.  | 
  | 
Provide exponentially weighted (EW) calculations.  | 
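A compact sketch of function application, aggregation, grouping, and a rolling window, assuming an invented frame with a group column "g" and a value column "v":

```python
import pandas as pd

# Hypothetical data: two groups, three observations.
df = pd.DataFrame({"g": ["a", "a", "b"], "v": [1, 2, 3]})

# apply(): run a function on each column (the default axis).
col_range = df[["v"]].apply(lambda s: s.max() - s.min())

# agg(): several aggregations at once, one result row per operation.
stats = df[["v"]].agg(["min", "max"])

# groupby(): split by "g", then sum "v" within each group.
sums = df.groupby("g")["v"].sum()

# rolling(): sum over a sliding window of two rows.
rolled = df[["v"]].rolling(window=2).sum()

# pipe(): pass the frame through a function in a method chain.
piped = df.pipe(lambda d: d[d["v"] > 1])
```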
Computations / descriptive stats
Return a Series/DataFrame with absolute numeric value of each element.  | 
|
  | 
Return whether all elements are True, potentially over an axis.  | 
  | 
Return whether any element is True, potentially over an axis.  | 
  | 
Trim values at input threshold(s).  | 
  | 
Compute pairwise correlation of columns, excluding NA/null values.  | 
  | 
Compute pairwise correlation.  | 
  | 
Count non-NA cells for each column or row.  | 
  | 
Compute pairwise covariance of columns, excluding NA/null values.  | 
  | 
Return cumulative maximum over a DataFrame or Series axis.  | 
  | 
Return cumulative minimum over a DataFrame or Series axis.  | 
  | 
Return cumulative product over a DataFrame or Series axis.  | 
  | 
Return cumulative sum over a DataFrame or Series axis.  | 
  | 
Generate descriptive statistics.  | 
  | 
First discrete difference of element.  | 
  | 
Evaluate a string describing operations on DataFrame columns.  | 
  | 
Return unbiased kurtosis over requested axis.  | 
  | 
Return unbiased kurtosis over requested axis.  | 
  | 
(DEPRECATED) Return the mean absolute deviation of the values over the requested axis.  | 
  | 
Return the maximum of the values over the requested axis.  | 
  | 
Return the mean of the values over the requested axis.  | 
  | 
Return the median of the values over the requested axis.  | 
  | 
Return the minimum of the values over the requested axis.  | 
  | 
Get the mode(s) of each element along the selected axis.  | 
  | 
Percentage change between the current and a prior element.  | 
  | 
Return the product of the values over the requested axis.  | 
  | 
Return the product of the values over the requested axis.  | 
  | 
Return values at the given quantile over requested axis.  | 
  | 
Compute numerical data ranks (1 through n) along axis.  | 
  | 
Round a DataFrame to a variable number of decimal places.  | 
  | 
Return unbiased standard error of the mean over requested axis.  | 
  | 
Return unbiased skew over requested axis.  | 
  | 
Return the sum of the values over the requested axis.  | 
  | 
Return sample standard deviation over requested axis.  | 
  | 
Return unbiased variance over requested axis.  | 
  | 
Count number of distinct elements in specified axis.  | 
  | 
Return a Series containing counts of unique rows in the DataFrame.  | 
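A few of the statistics above, sketched on an invented frame whose two columns are perfectly anti-correlated:

```python
import pandas as pd

# Hypothetical numeric data.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [4.0, 3.0, 2.0, 1.0]})

summary = df.describe()               # count, mean, std, min, quartiles, max
running = df.cumsum()                 # cumulative sum down each column
clipped = df.clip(lower=2, upper=3)   # trim values to the range [2, 3]
corr = df.corr()                      # pairwise correlation of columns
uniq = df.nunique()                   # distinct values per column
```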
Reindexing / selection / label manipulation
  | 
Prefix labels with string prefix.  | 
  | 
Suffix labels with string suffix.  | 
  | 
Align two objects on their axes with the specified join method.  | 
  | 
Select values at particular time of day (e.g., 9:30AM).  | 
  | 
Select values between particular times of the day (e.g., 9:00-9:30 AM).  | 
  | 
Drop specified labels from rows or columns.  | 
  | 
Return DataFrame with duplicate rows removed.  | 
  | 
Return boolean Series denoting duplicate rows.  | 
  | 
Test whether two objects contain the same elements.  | 
  | 
Subset the dataframe rows or columns according to the specified index labels.  | 
  | 
Select initial periods of time series data based on a date offset.  | 
  | 
Return the first n rows.  | 
  | 
Return index of first occurrence of maximum over requested axis.  | 
  | 
Return index of first occurrence of minimum over requested axis.  | 
  | 
Select final periods of time series data based on a date offset.  | 
  | 
Conform Series/DataFrame to new index with optional filling logic.  | 
  | 
Return an object with matching indices as other object.  | 
  | 
Alter axes labels.  | 
  | 
Set the name of the axis for the index or columns.  | 
  | 
Reset the index, or a level of it.  | 
  | 
Return a random sample of items from an axis of object.  | 
  | 
Assign desired index to given axis.  | 
  | 
Set the DataFrame index using existing columns.  | 
  | 
Return the last n rows.  | 
  | 
Return the elements in the given positional indices along an axis.  | 
  | 
Truncate a Series or DataFrame before and after some index value.  | 
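A sketch of common label manipulations, assuming an invented frame with a key column "k" and one duplicated row:

```python
import pandas as pd

# Hypothetical data; the last two rows are identical.
df = pd.DataFrame({"k": ["x", "y", "y"], "v": [1, 2, 2]})

deduped = df.drop_duplicates()                  # duplicate rows removed
renamed = df.rename(columns={"v": "value"})     # alter column labels
indexed = deduped.set_index("k")                # use an existing column as index
conformed = indexed.reindex(["x", "y", "z"])    # new label "z" is filled with NaN
restored = indexed.reset_index()                # move the index back to a column
```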
Missing data handling
  | 
Synonym for DataFrame.fillna() with method='bfill'.  | 
  | 
Synonym for DataFrame.fillna() with method='bfill'.  | 
  | 
Remove missing values.  | 
  | 
Synonym for DataFrame.fillna() with method='ffill'.  | 
  | 
Fill NA/NaN values using the specified method.  | 
  | 
Fill NaN values using an interpolation method.  | 
Detect missing values.  | 
|
DataFrame.isnull is an alias for DataFrame.isna.  | 
|
Detect existing (non-missing) values.  | 
|
DataFrame.notnull is an alias for DataFrame.notna.  | 
|
  | 
Synonym for DataFrame.fillna() with method='ffill'.  | 
  | 
Replace values given in to_replace with value.  | 
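The detection, removal, and filling methods can be compared side by side. A sketch on an invented frame with one gap per column:

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 2.0, 3.0]})

mask = df.isna()           # boolean frame marking missing cells
complete = df.dropna()     # rows containing any missing value are removed
zeroed = df.fillna(0)      # replace missing values with a constant
forward = df.ffill()       # propagate the last valid value forward
interp = df.interpolate()  # linear interpolation between valid values
```

Forward filling and the default interpolation both leave a leading missing value in place, since there is no earlier value to draw on.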
Reshaping, sorting, transposing
  | 
Return Series/DataFrame with requested index / column level(s) removed.  | 
  | 
Return reshaped DataFrame organized by given index / column values.  | 
  | 
Create a spreadsheet-style pivot table as a DataFrame.  | 
  | 
Rearrange index levels using input order.  | 
  | 
Sort by the values along either axis.  | 
  | 
Sort object by labels (along an axis).  | 
  | 
Return the first n rows ordered by columns in descending order.  | 
  | 
Return the first n rows ordered by columns in ascending order.  | 
  | 
Swap levels i and j in a MultiIndex.  | 
  | 
Stack the prescribed level(s) from columns to index.  | 
  | 
Pivot a level of the (necessarily hierarchical) index labels.  | 
  | 
Interchange axes and swap values axes appropriately.  | 
  | 
Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.  | 
  | 
Transform each element of a list-like to a row, replicating index values.  | 
  | 
Squeeze 1 dimensional axis objects into scalars.  | 
Return an xarray object from the pandas object.  | 
|
  | 
Transpose index and columns.  | 
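A sketch of a wide-to-long round trip plus sorting and transposing, assuming an invented frame with an identifier column "id":

```python
import pandas as pd

# Hypothetical wide-format data.
wide = pd.DataFrame({"id": [1, 2], "x": [10, 30], "y": [20, 40]})

# melt(): unpivot to long format, keeping "id" as the identifier.
long = wide.melt(id_vars="id", var_name="key", value_name="val")

# pivot(): reshape back, one column per distinct "key".
back = long.pivot(index="id", columns="key", values="val")

top = wide.nlargest(1, "x")                       # row with the largest "x"
ordered = wide.sort_values("y", ascending=False)  # sort by values
flipped = wide.T                                  # transpose index and columns
```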
Combining / comparing / joining / merging
  | 
(DEPRECATED) Append rows of other to the end of caller, returning a new object.  | 
  | 
Assign new columns to a DataFrame.  | 
  | 
Compare to another DataFrame and show the differences.  | 
  | 
Join columns of another DataFrame.  | 
  | 
Merge DataFrame or named Series objects with a database-style join.  | 
  | 
Modify in place using non-NA values from another DataFrame.  | 
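A sketch contrasting merge, join, and assign, using two invented frames that share a key column "k":

```python
import pandas as pd

# Hypothetical frames; only k == 2 appears in both.
left = pd.DataFrame({"k": [1, 2], "a": ["p", "q"]})
right = pd.DataFrame({"k": [2, 3], "b": ["r", "s"]})

inner = left.merge(right, on="k")               # database-style inner join
outer = left.merge(right, on="k", how="outer")  # keep keys from both sides

# join() aligns on the index, so the key is moved there first.
joined = left.set_index("k").join(right.set_index("k"))

# assign() returns a new frame with an added column.
extended = left.assign(k2=lambda d: d["k"] * 2)
```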
Flags
Flags refer to attributes of the pandas object. Properties of the dataset (like
the date it was recorded, the URL it was accessed from, etc.) should be stored
in DataFrame.attrs.
  | 
Flags that apply to pandas objects.  | 
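As a small sketch, the allows_duplicate_labels flag can be set on a new object with set_flags and read back through flags (the frame itself is invented):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2]})

# set_flags() returns a new object; the original keeps its flags.
strict = df.set_flags(allows_duplicate_labels=False)

original_flag = df.flags.allows_duplicate_labels
strict_flag = strict.flags.allows_duplicate_labels
```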
Metadata
DataFrame.attrs is a dictionary for storing global metadata for this DataFrame.
Warning
DataFrame.attrs is considered experimental and may change without warning.
Dictionary of global attributes of this dataset.  | 
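A minimal sketch of storing dataset-level metadata in attrs; the key and value are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2]})

# attrs is a plain dict attached to the frame (hypothetical key and value).
df.attrs["source"] = "hypothetical-sensor-42"
print(df.attrs)
```

Because attrs is experimental, whether it survives a given operation should be checked rather than assumed.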
Plotting
DataFrame.plot is both a callable method and a namespace attribute for
specific plotting methods of the form DataFrame.plot.<kind>.
  | 
DataFrame plotting accessor and method  | 
  | 
Draw a stacked area plot.  | 
  | 
Vertical bar plot.  | 
  | 
Make a horizontal bar plot.  | 
  | 
Make a box plot of the DataFrame columns.  | 
  | 
Generate Kernel Density Estimate plot using Gaussian kernels.  | 
  | 
Generate a hexagonal binning plot.  | 
  | 
Draw one histogram of the DataFrame's columns.  | 
  | 
Generate Kernel Density Estimate plot using Gaussian kernels.  | 
  | 
Plot Series or DataFrame as lines.  | 
  | 
Generate a pie plot.  | 
  | 
Create a scatter plot with varying marker point size and color.  | 
  | 
Make a box plot from DataFrame columns.  | 
  | 
Make a histogram of the DataFrame's columns.  | 
Sparse accessor
Sparse-dtype specific methods and attributes are provided under the
DataFrame.sparse accessor.
Ratio of non-sparse points to total (dense) data points.  | 
  | 
Create a new DataFrame from a scipy sparse matrix.  | 
Return the contents of the frame as a sparse SciPy COO matrix.  | 
|
Convert a DataFrame with sparse values to dense.  | 
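A sketch of the accessor on an invented frame with a single sparse column, assuming the default fill value of 0 for integer data:

```python
import pandas as pd

# Hypothetical sparse column: one stored value out of four.
df = pd.DataFrame({"a": pd.arrays.SparseArray([0, 0, 1, 0])})

density = df.sparse.density    # ratio of non-sparse points to total points
dense = df.sparse.to_dense()   # same values in an ordinary dense column
```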
Serialization / IO / conversion
  | 
Construct DataFrame from dict of array-like or dicts.  | 
  | 
Convert structured or record ndarray to DataFrame.  | 
  | 
Write a DataFrame to the ORC format.  | 
  | 
Write a DataFrame to the binary parquet format.  | 
  | 
Pickle (serialize) object to file.  | 
  | 
Write object to a comma-separated values (csv) file.  | 
  | 
Write the contained data to an HDF5 file using HDFStore.  | 
  | 
Write records stored in a DataFrame to a SQL database.  | 
  | 
Convert the DataFrame to a dictionary.  | 
  | 
Write object to an Excel sheet.  | 
  | 
Convert the object to a JSON string.  | 
  | 
Render a DataFrame as an HTML table.  | 
  | 
Write a DataFrame to the binary Feather format.  | 
  | 
Render object to a LaTeX tabular, longtable, or nested table.  | 
  | 
Export DataFrame object to Stata dta format.  | 
  | 
Write a DataFrame to a Google BigQuery table.  | 
  | 
Convert DataFrame to a NumPy record array.  | 
  | 
Render a DataFrame to a console-friendly tabular output.  | 
  | 
Copy object to the system clipboard.  | 
  | 
Print DataFrame in Markdown-friendly format.  | 
Returns a Styler object.  | 
|
  | 
Return the dataframe interchange object implementing the interchange protocol.  |
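A sketch of in-memory conversions that need no files or optional dependencies, using an invented frame:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

as_dict = df.to_dict(orient="list")        # column name -> list of values
records = df.to_dict(orient="records")     # one dict per row
rebuilt = pd.DataFrame.from_dict(as_dict)  # construct a frame from the dict

# With no path argument, to_csv() returns the CSV text as a string.
csv_text = df.to_csv(index=False)
```

The writers for formats such as Parquet, Excel, or HDF5 generally depend on additional packages being installed.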