DataFrame#
Constructor#
| 
 | Two-dimensional, size-mutable, potentially heterogeneous tabular data. | 
Attributes and underlying data#
Axes
| The index (row labels) of the DataFrame. | |
| The column labels of the DataFrame. | 
| Return the dtypes in the DataFrame. | |
| 
 | Print a concise summary of a DataFrame. | 
| 
 | Return a subset of the DataFrame's columns based on the column dtypes. | 
| Return a NumPy representation of the DataFrame. | |
| Return a list representing the axes of the DataFrame. | |
| Return an int representing the number of axes / array dimensions. | |
| Return an int representing the number of elements in this object. | |
| Return a tuple representing the dimensionality of the DataFrame. | |
| 
 | Return the memory usage of each column in bytes. | 
| Indicator whether Series/DataFrame is empty. | |
| 
 | Return a new object with updated flags. | 
Conversion#
| 
 | Cast a pandas object to a specified dtype  | 
| 
 | Convert columns to best possible dtypes using dtypes supporting  | 
| Attempt to infer better dtypes for object columns. | |
| 
 | Make a copy of this object's indices and data. | 
| Return the bool of a single element Series or DataFrame. | 
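As a minimal sketch of the conversion entries above, assuming the rows "Cast a pandas object to a specified dtype" and "Make a copy of this object's indices and data" correspond to `DataFrame.astype` and `DataFrame.copy`; the column names and values are made up for illustration:

```python
import pandas as pd

# Hypothetical data: one column of numeric strings, one of floats.
df = pd.DataFrame({"n": ["1", "2"], "f": [1.0, 2.0]})

# Cast a single column by passing a {column: dtype} mapping.
as_int = df.astype({"n": "int64"})

# copy() returns an independent object; editing it leaves df untouched.
snapshot = df.copy()
snapshot.loc[0, "f"] = 99.0
```

Passing a mapping to `astype` leaves the columns that are not listed unchanged.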
Indexing, iteration#
| 
 | Return the first n rows. | 
| Access a single value for a row/column label pair. | |
| Access a single value for a row/column pair by integer position. | |
| Access a group of rows and columns by label(s) or a boolean array. | |
| Purely integer-location based indexing for selection by position. | |
| 
 | Insert column into DataFrame at specified location. | 
| Iterate over info axis. | |
| Iterate over (column name, Series) pairs. | |
| (DEPRECATED) Iterate over (column name, Series) pairs. | |
| Get the 'info axis' (see Indexing for more). | |
| Iterate over DataFrame rows as (index, Series) pairs. | |
| 
 | Iterate over DataFrame rows as namedtuples. | 
| 
 | (DEPRECATED) Label-based "fancy indexing" function for DataFrame. | 
| 
 | Return item and drop from frame. | 
| 
 | Return the last n rows. | 
| 
 | Return cross-section from the Series/DataFrame. | 
| 
 | Get item from object for given key (ex: DataFrame column). | 
| 
 | Whether each element in the DataFrame is contained in values. | 
| 
 | Replace values where the condition is False. | 
| 
 | Replace values where the condition is True. | 
| 
 | Query the columns of a DataFrame with a boolean expression. | 
For more information on .at, .iat, .loc, and
.iloc, see the indexing documentation.
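The four accessors named above differ in whether they take labels or integer positions, and whether they return a scalar or a selection. A small sketch with made-up data:

```python
import pandas as pd

# Hypothetical frame with string row labels.
df = pd.DataFrame(
    {"price": [1.5, 2.0, 3.25], "qty": [10, 20, 30]},
    index=["a", "b", "c"],
)

label_scalar = df.at["b", "price"]       # single value by row/column label
pos_scalar = df.iat[2, 1]                # single value by integer position
label_slice = df.loc["a":"b", ["qty"]]   # label slices include both endpoints
pos_slice = df.iloc[0:2, 0]              # positional slices exclude the stop
```

Note that `.loc` slicing is inclusive of the end label, while `.iloc` follows ordinary Python slice semantics.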
Binary operator functions#
| 
 | Get Addition of dataframe and other, element-wise (binary operator add). | 
| 
 | Get Subtraction of dataframe and other, element-wise (binary operator sub). | 
| 
 | Get Multiplication of dataframe and other, element-wise (binary operator mul). | 
| 
 | Get Floating division of dataframe and other, element-wise (binary operator truediv). | 
| 
 | Get Floating division of dataframe and other, element-wise (binary operator truediv). | 
| 
 | Get Integer division of dataframe and other, element-wise (binary operator floordiv). | 
| 
 | Get Modulo of dataframe and other, element-wise (binary operator mod). | 
| 
 | Get Exponential power of dataframe and other, element-wise (binary operator pow). | 
| 
 | Compute the matrix multiplication between the DataFrame and other. | 
| 
 | Get Addition of dataframe and other, element-wise (binary operator radd). | 
| 
 | Get Subtraction of dataframe and other, element-wise (binary operator rsub). | 
| 
 | Get Multiplication of dataframe and other, element-wise (binary operator rmul). | 
| 
 | Get Floating division of dataframe and other, element-wise (binary operator rtruediv). | 
| 
 | Get Floating division of dataframe and other, element-wise (binary operator rtruediv). | 
| 
 | Get Integer division of dataframe and other, element-wise (binary operator rfloordiv). | 
| 
 | Get Modulo of dataframe and other, element-wise (binary operator rmod). | 
| 
 | Get Exponential power of dataframe and other, element-wise (binary operator rpow). | 
| 
 | Get Less than of dataframe and other, element-wise (binary operator lt). | 
| 
 | Get Greater than of dataframe and other, element-wise (binary operator gt). | 
| 
 | Get Less than or equal to of dataframe and other, element-wise (binary operator le). | 
| 
 | Get Greater than or equal to of dataframe and other, element-wise (binary operator ge). | 
| 
 | Get Not equal to of dataframe and other, element-wise (binary operator ne). | 
| 
 | Get Equal to of dataframe and other, element-wise (binary operator eq). | 
| 
 | Perform column-wise combine with another DataFrame. | 
| 
 | Update null elements with value in the same location in other. | 
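The method forms of the binary operators listed above accept options that the plain operators do not. A minimal sketch, assuming the "binary operator add" and "binary operator rsub" rows correspond to `DataFrame.add` and `DataFrame.rsub`; the frames are made up for illustration:

```python
import pandas as pd

a = pd.DataFrame({"x": [1, 2]}, index=["r1", "r2"])
b = pd.DataFrame({"x": [10, 20]}, index=["r2", "r3"])

# The + operator aligns on labels; labels missing from either side give NaN.
plain = a + b

# add() with fill_value substitutes 0 for a value missing on one side only.
filled = a.add(b, fill_value=0)

# Reverse variants swap the operands: a.rsub(10) computes 10 - a.
reversed_sub = a.rsub(10)
```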
Function application, GroupBy & window#
| 
 | Apply a function along an axis of the DataFrame. | 
| 
 | Apply a function to a DataFrame elementwise. | 
| 
 | Apply chainable functions that expect Series or DataFrames. | 
| 
 | Aggregate using one or more operations over the specified axis. | 
| 
 | Aggregate using one or more operations over the specified axis. | 
| 
 | Call  | 
| 
 | Group DataFrame using a mapper or by a Series of columns. | 
| 
 | Provide rolling window calculations. | 
| 
 | Provide expanding window calculations. | 
| 
 | Provide exponentially weighted (EW) calculations. | 
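A short sketch of grouping, aggregation, and rolling windows as described in the rows above, assuming they correspond to `DataFrame.groupby`, `agg`, and `DataFrame.rolling`; the data is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({"team": ["a", "a", "b", "b"], "score": [1, 3, 5, 7]})

# Group rows by a column, then aggregate with several operations at once.
totals = df.groupby("team")["score"].agg(["sum", "mean"])

# Rolling window of two rows; the first row has no full window and is NaN.
rolling_mean = df[["score"]].rolling(window=2).mean()
```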
Computations / descriptive stats#
| Return a Series/DataFrame with absolute numeric value of each element. | |
| 
 | Return whether all elements are True, potentially over an axis. | 
| 
 | Return whether any element is True, potentially over an axis. | 
| 
 | Trim values at input threshold(s). | 
| 
 | Compute pairwise correlation of columns, excluding NA/null values. | 
| 
 | Compute pairwise correlation. | 
| 
 | Count non-NA cells for each column or row. | 
| 
 | Compute pairwise covariance of columns, excluding NA/null values. | 
| 
 | Return cumulative maximum over a DataFrame or Series axis. | 
| 
 | Return cumulative minimum over a DataFrame or Series axis. | 
| 
 | Return cumulative product over a DataFrame or Series axis. | 
| 
 | Return cumulative sum over a DataFrame or Series axis. | 
| 
 | Generate descriptive statistics. | 
| 
 | First discrete difference of element. | 
| 
 | Evaluate a string describing operations on DataFrame columns. | 
| 
 | Return unbiased kurtosis over requested axis. | 
| 
 | Return unbiased kurtosis over requested axis. | 
| 
 | (DEPRECATED) Return the mean absolute deviation of the values over the requested axis. | 
| 
 | Return the maximum of the values over the requested axis. | 
| 
 | Return the mean of the values over the requested axis. | 
| 
 | Return the median of the values over the requested axis. | 
| 
 | Return the minimum of the values over the requested axis. | 
| 
 | Get the mode(s) of each element along the selected axis. | 
| 
 | Percentage change between the current and a prior element. | 
| 
 | Return the product of the values over the requested axis. | 
| 
 | Return the product of the values over the requested axis. | 
| 
 | Return values at the given quantile over requested axis. | 
| 
 | Compute numerical data ranks (1 through n) along axis. | 
| 
 | Round a DataFrame to a variable number of decimal places. | 
| 
 | Return unbiased standard error of the mean over requested axis. | 
| 
 | Return unbiased skew over requested axis. | 
| 
 | Return the sum of the values over the requested axis. | 
| 
 | Return sample standard deviation over requested axis. | 
| 
 | Return unbiased variance over requested axis. | 
| 
 | Count number of distinct elements in specified axis. | 
| 
 | Return a Series containing counts of unique rows in the DataFrame. | 
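Most of the reductions above operate "over the requested axis". The sketch below, with made-up data, illustrates the difference between the default column-wise reduction and `axis=1`, plus a few of the counting helpers (assumed to be `nunique` and `value_counts`):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 2], "b": [10.0, 20.0, 20.0]})

col_sums = df.sum()           # default axis=0: one result per column
row_sums = df.sum(axis=1)     # axis=1: one result per row
running = df.cumsum()         # cumulative sum down each column
distinct = df.nunique()       # number of distinct values per column
row_counts = df.value_counts()  # counts of unique rows, most frequent first
```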
Reindexing / selection / label manipulation#
| 
 | Prefix labels with string prefix. | 
| 
 | Suffix labels with string suffix. | 
| 
 | Align two objects on their axes with the specified join method. | 
| 
 | Select values at particular time of day (e.g., 9:30AM). | 
| 
 | Select values between particular times of the day (e.g., 9:00-9:30 AM). | 
| 
 | Drop specified labels from rows or columns. | 
| 
 | Return DataFrame with duplicate rows removed. | 
| 
 | Return boolean Series denoting duplicate rows. | 
| 
 | Test whether two objects contain the same elements. | 
| 
 | Subset the dataframe rows or columns according to the specified index labels. | 
| 
 | Select initial periods of time series data based on a date offset. | 
| 
 | Return the first n rows. | 
| 
 | Return index of first occurrence of maximum over requested axis. | 
| 
 | Return index of first occurrence of minimum over requested axis. | 
| 
 | Select final periods of time series data based on a date offset. | 
| 
 | Conform Series/DataFrame to new index with optional filling logic. | 
| 
 | Return an object with matching indices as other object. | 
| 
 | Alter axes labels. | 
| 
 | Set the name of the axis for the index or columns. | 
| 
 | Reset the index, or a level of it. | 
| 
 | Return a random sample of items from an axis of object. | 
| 
 | Assign desired index to given axis. | 
| 
 | Set the DataFrame index using existing columns. | 
| 
 | Return the last n rows. | 
| 
 | Return the elements in the given positional indices along an axis. | 
| 
 | Truncate a Series or DataFrame before and after some index value. | 
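As an illustration of the label-manipulation entries above, here is a minimal sketch assuming the relevant rows correspond to `drop_duplicates`, `set_index`, `reindex`, and `rename`; the frame is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({"k": ["a", "b", "b"], "v": [1, 2, 2]})

deduped = df.drop_duplicates()            # remove repeated rows
indexed = deduped.set_index("k")          # use an existing column as the index
reindexed = indexed.reindex(["b", "a", "z"])  # conform to new labels; "z" is filled with NaN
renamed = indexed.rename(columns={"v": "value"})  # alter axis labels
```

Labels requested by `reindex` that do not exist in the original index produce missing values unless a fill option is supplied.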
Missing data handling#
| 
 | Synonym for  | 
| 
 | Synonym for  | 
| 
 | Remove missing values. | 
| 
 | Synonym for  | 
| 
 | Fill NA/NaN values using the specified method. | 
| 
 | Fill NaN values using an interpolation method. | 
| Detect missing values. | |
| DataFrame.isnull is an alias for DataFrame.isna. | |
| Detect existing (non-missing) values. | |
| DataFrame.notnull is an alias for DataFrame.notna. | |
| 
 | Synonym for  | 
| 
 | Replace values given in to_replace with value. | 
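The missing-data rows above can be sketched as follows, assuming they correspond to `isna`, `fillna`, `dropna`, and `interpolate`; the values are made up:

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, None, 3.0], "b": [None, 2.0, 2.0]})

missing = df.isna()          # boolean mask of missing cells
filled = df.fillna(0)        # replace every missing cell with 0
dropped = df.dropna()        # keep only rows with no missing values
interp = df.interpolate()    # linear interpolation down each column
```

With the default settings, `interpolate` fills gaps between known values; the leading missing value in column `b` has no earlier neighbour and stays missing.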
Reshaping, sorting, transposing#
| 
 | Return Series/DataFrame with requested index / column level(s) removed. | 
| 
 | Return reshaped DataFrame organized by given index / column values. | 
| 
 | Create a spreadsheet-style pivot table as a DataFrame. | 
| 
 | Rearrange index levels using input order. | 
| 
 | Sort by the values along either axis. | 
| 
 | Sort object by labels (along an axis). | 
| 
 | Return the first n rows ordered by columns in descending order. | 
| 
 | Return the first n rows ordered by columns in ascending order. | 
| 
 | Swap levels i and j in a  | 
| 
 | Stack the prescribed level(s) from columns to index. | 
| 
 | Pivot a level of the (necessarily hierarchical) index labels. | 
| 
 | Interchange axes and swap values axes appropriately. | 
| 
 | Unpivot a DataFrame from wide to long format, optionally leaving identifiers set. | 
| 
 | Transform each element of a list-like to a row, replicating index values. | 
| 
 | Squeeze 1 dimensional axis objects into scalars. | 
| Return an xarray object from the pandas object. | |
| 
 | Transpose index and columns. | 
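To make the wide/long terminology above concrete, here is a small round trip, assuming the "Unpivot a DataFrame from wide to long format" and "Return reshaped DataFrame organized by given index / column values" rows correspond to `DataFrame.melt` and `DataFrame.pivot`. The column names are illustrative only:

```python
import pandas as pd

wide = pd.DataFrame({"id": [1, 2], "x": [10, 20], "y": [30, 40]})

# Wide to long: one row per (id, variable) pair.
long = wide.melt(id_vars="id", var_name="variable", value_name="value")

# Long back to wide: one column per distinct value of "variable".
back = long.pivot(index="id", columns="variable", values="value")
```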
Combining / comparing / joining / merging#
| 
 | (DEPRECATED) Append rows of other to the end of caller, returning a new object. | 
| 
 | Assign new columns to a DataFrame. | 
| 
 | Compare to another DataFrame and show the differences. | 
| 
 | Join columns of another DataFrame. | 
| 
 | Merge DataFrame or named Series objects with a database-style join. | 
| 
 | Modify in place using non-NA values from another DataFrame. | 
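A brief sketch of the database-style join mentioned above, assuming that row corresponds to `DataFrame.merge`; the key and column names are made up:

```python
import pandas as pd

left = pd.DataFrame({"key": ["a", "b", "c"], "l": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "r": [20, 30, 40]})

# Inner join keeps only keys present in both frames.
inner = left.merge(right, on="key", how="inner")

# Outer join keeps every key; indicator=True adds a "_merge" column
# recording which side each row came from.
outer = left.merge(right, on="key", how="outer", indicator=True)
```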
Flags#
Flags refer to attributes of the pandas object. Properties of the dataset (like
the date it was recorded, the URL it was accessed from, etc.) should be stored
in DataFrame.attrs.
| 
 | Flags that apply to pandas objects. | 
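As a hedged illustration of flags, the sketch below uses `allows_duplicate_labels`, read through `DataFrame.flags` and changed through `DataFrame.set_flags`. The data is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2]}, index=["a", "b"])

# By default, duplicate labels are allowed.
default_flag = df.flags.allows_duplicate_labels

# set_flags returns a new object carrying the updated flag.
strict = df.set_flags(allows_duplicate_labels=False)
strict_flag = strict.flags.allows_duplicate_labels
```

Because `set_flags` returns a new object, the flag on the original `df` is unchanged.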
Metadata#
DataFrame.attrs is a dictionary for storing global metadata for this DataFrame.
Warning
DataFrame.attrs is considered experimental and may change without warning.
| Dictionary of global attributes of this dataset. | 
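A minimal sketch of storing dataset-level metadata in `DataFrame.attrs`. The keys and values are hypothetical, and since the feature is marked experimental above, how the metadata carries over to derived objects should not be relied on:

```python
import pandas as pd

df = pd.DataFrame({"temp_c": [20.5, 21.0]})

# attrs behaves like an ordinary dict attached to the frame.
df.attrs["source_url"] = "https://example.com/data.csv"  # placeholder URL
df.attrs["recorded"] = "2024-01-01"                      # placeholder date

metadata_keys = sorted(df.attrs)
```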
Plotting#
DataFrame.plot is both a callable method and a namespace attribute for
specific plotting methods of the form DataFrame.plot.<kind>.
| 
 | DataFrame plotting accessor and method | 
| 
 | Draw a stacked area plot. | 
| 
 | Vertical bar plot. | 
| 
 | Make a horizontal bar plot. | 
| 
 | Make a box plot of the DataFrame columns. | 
| 
 | Generate Kernel Density Estimate plot using Gaussian kernels. | 
| 
 | Generate a hexagonal binning plot. | 
| 
 | Draw one histogram of the DataFrame's columns. | 
| 
 | Generate Kernel Density Estimate plot using Gaussian kernels. | 
| 
 | Plot Series or DataFrame as lines. | 
| 
 | Generate a pie plot. | 
| 
 | Create a scatter plot with varying marker point size and color. | 
| 
 | Make a box plot from DataFrame columns. | 
| 
 | Make a histogram of the DataFrame's columns. | 
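The callable form and the namespace form of `DataFrame.plot` can be compared side by side. This sketch assumes matplotlib is installed (pandas uses it as the default plotting backend) and selects the non-interactive `Agg` backend so it runs without a display:

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; an assumption for this sketch

import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [3, 2, 1]})

# Callable form: the plot kind is passed as an argument.
ax_call = df.plot(kind="line")

# Namespace form: DataFrame.plot.<kind>.
ax_attr = df.plot.line()
```

Both calls return a matplotlib axes object with one line per numeric column.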
Sparse accessor#
Sparse-dtype specific methods and attributes are provided under the
DataFrame.sparse accessor.
| Ratio of non-sparse points to total (dense) data points. | 
| 
 | Create a new DataFrame from a scipy sparse matrix. | 
| Return the contents of the frame as a sparse SciPy COO matrix. | |
| Convert a DataFrame with sparse values to dense. | 
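A small sketch of the sparse accessor, assuming the density and dense-conversion rows above correspond to `DataFrame.sparse.density` and `DataFrame.sparse.to_dense`. The columns are built from `pd.arrays.SparseArray` with made-up values; for integer data the fill value defaults to 0:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "a": pd.arrays.SparseArray([0, 0, 1, 0]),
        "b": pd.arrays.SparseArray([0, 2, 0, 0]),
    }
)

# Two of the eight cells differ from the fill value.
density = df.sparse.density

# Convert back to ordinary (dense) columns.
dense = df.sparse.to_dense()
```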
Serialization / IO / conversion#
| 
 | Construct DataFrame from dict of array-like or dicts. | 
| 
 | Convert structured or record ndarray to DataFrame. | 
| 
 | Write a DataFrame to the ORC format. | 
| 
 | Write a DataFrame to the binary parquet format. | 
| 
 | Pickle (serialize) object to file. | 
| 
 | Write object to a comma-separated values (csv) file. | 
| 
 | Write the contained data to an HDF5 file using HDFStore. | 
| 
 | Write records stored in a DataFrame to a SQL database. | 
| 
 | Convert the DataFrame to a dictionary. | 
| 
 | Write object to an Excel sheet. | 
| 
 | Convert the object to a JSON string. | 
| 
 | Render a DataFrame as an HTML table. | 
| 
 | Write a DataFrame to the binary Feather format. | 
| 
 | Render object to a LaTeX tabular, longtable, or nested table. | 
| 
 | Export DataFrame object to Stata dta format. | 
| 
 | Write a DataFrame to a Google BigQuery table. | 
| 
 | Convert DataFrame to a NumPy record array. | 
| 
 | Render a DataFrame to a console-friendly tabular output. | 
| 
 | Copy object to the system clipboard. | 
| 
 | Print DataFrame in Markdown-friendly format. | 
| Returns a Styler object. | |
| 
 | Return the dataframe interchange object implementing the interchange protocol. |
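As an illustration of a few of the conversion entries above, the sketch below assumes they correspond to `DataFrame.to_dict`, `DataFrame.from_dict`, and `DataFrame.to_csv`, and reads the CSV text back with `pandas.read_csv`. The data is hypothetical:

```python
import io

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

# One dict per row.
records = df.to_dict(orient="records")

# Build a frame from a dict of array-likes.
rebuilt = pd.DataFrame.from_dict({"a": [1, 2], "b": ["x", "y"]})

# With no path argument, to_csv returns the CSV text as a string.
csv_text = df.to_csv(index=False)
round_trip = pd.read_csv(io.StringIO(csv_text))
```

Writing to an in-memory string avoids touching the filesystem; passing a path to `to_csv` writes a file instead.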