API Reference

Input/Output

Pickling

read_pickle(path) Load pickled pandas object (or any other pickled object) from the specified file path

Flat File

read_table(filepath_or_buffer[, sep, ...]) Read general delimited file into DataFrame
read_csv(filepath_or_buffer[, sep, dialect, ...]) Read CSV (comma-separated) file into DataFrame
read_fwf(filepath_or_buffer[, colspecs, widths]) Read a table of fixed-width formatted lines into DataFrame
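As a minimal sketch of the flat-file readers, the example below parses CSV text from an in-memory buffer rather than a file on disk. The column names and values are made up for illustration, and Python 3 is assumed.

```python
import io

import pandas as pd

# Hypothetical CSV content held in memory instead of a file on disk.
csv_text = "name,score\nalice,90\nbob,85\n"

# read_csv accepts a path or any file-like buffer.
df = pd.read_csv(io.StringIO(csv_text))

# Other delimiters are handled through the `sep` argument.
tsv_text = csv_text.replace(",", "\t")
df_tab = pd.read_csv(io.StringIO(tsv_text), sep="\t")

print(df)
```

Both calls should yield the same two-row frame, since only the delimiter differs.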

Clipboard

read_clipboard(**kwargs) Read text from clipboard and pass to read_table.

Excel

read_excel(io, sheetname, **kwds) Read an Excel table into a pandas DataFrame
ExcelFile.parse(sheetname[, header, ...]) Read an Excel table into DataFrame

JSON

read_json([path_or_buf, orient, typ, dtype, ...]) Convert a JSON string to pandas object

HTML

read_html(io[, match, flavor, header, ...]) Read HTML tables into a list of DataFrame objects.

HDFStore: PyTables (HDF5)

read_hdf(path_or_buf, key, **kwargs) Read from the store, closing it if we opened it
HDFStore.put(key, value[, format, append]) Store object in HDFStore
HDFStore.append(key, value[, format, ...]) Append to Table in file. Node must already exist and be Table
HDFStore.get(key) Retrieve pandas object stored in file
HDFStore.select(key[, where, start, stop, ...]) Retrieve pandas object stored in file, optionally based on where

SQL

read_sql(sql, con[, index_col, ...]) Returns a DataFrame corresponding to the result set of the query
read_frame(sql, con[, index_col, ...]) Returns a DataFrame corresponding to the result set of the query
write_frame(frame, name, con[, flavor, ...]) Write records stored in a DataFrame to a SQL database.

Google BigQuery

read_gbq(query[, project_id, ...]) Load data from Google BigQuery.
to_gbq(dataframe, destination_table[, ...]) Write a DataFrame to a Google BigQuery table.

STATA

read_stata(filepath_or_buffer[, ...]) Read Stata file into DataFrame
StataReader.data([convert_dates, ...]) Reads observations from Stata file, converting them into a dataframe
StataReader.data_label() Returns data label of Stata file
StataReader.value_labels() Returns a dict associating each variable name with a dict that associates each value with its corresponding label
StataReader.variable_labels() Returns variable labels as a dict, associating each variable name with its corresponding label
StataWriter.write_file()

General functions

Data manipulations

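melt(frame[, id_vars, value_vars, var_name, ...]) “Unpivots” a DataFrame from wide format to long format, optionally leaving identifier variables set
pivot_table(data[, values, rows, cols, ...]) Create a spreadsheet-style pivot table as a DataFrame. The levels in the pivot table will be stored in MultiIndex objects (hierarchical indexes) on the index and columns of the result DataFrame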
crosstab(rows, cols[, values, rownames, ...]) Compute a simple cross-tabulation of two (or more) factors.
cut(x, bins[, right, labels, retbins, ...]) Return indices of half-open bins to which each value of x belongs.
qcut(x, q[, labels, retbins, precision]) Quantile-based discretization function.
merge(left, right[, how, on, left_on, ...]) Merge DataFrame objects by performing a database-style join operation by columns or indexes
concat(objs[, axis, join, join_axes, ...]) Concatenate pandas objects along a particular axis with optional set logic along the other axes.
get_dummies(data[, prefix, prefix_sep, dummy_na]) Convert categorical variable into dummy/indicator variables
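The following sketch shows how three of the functions above fit together on a toy frame. The frames and column names (`id`, `x`, `y`, `label`) are hypothetical, and only arguments listed in the signatures above are used.

```python
import pandas as pd

wide = pd.DataFrame({"id": [1, 2], "x": [10, 20], "y": [30, 40]})

# melt: one row per (id, variable) pair instead of one column per variable.
long = pd.melt(wide, id_vars=["id"], value_vars=["x", "y"])

# merge: database-style join on the shared "id" column.
labels = pd.DataFrame({"id": [1, 2], "label": ["a", "b"]})
joined = pd.merge(wide, labels, on="id", how="inner")

# concat: stack frames along the row axis.
stacked = pd.concat([wide, wide], axis=0)

print(long)
```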

Top-level missing data

isnull(obj) Detect missing values (NaN in numeric arrays, None/NaN in object arrays)
notnull(obj) Replacement for numpy.isfinite / -numpy.isnan which is suitable for use on object arrays.
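A short sketch of the two top-level detectors on a numeric Series and on a bare `None`; the sample values are illustrative only.

```python
import numpy as np
import pandas as pd

values = pd.Series([1.0, np.nan, 3.0])

# Elementwise masks: True where a value is missing / present.
missing = pd.isnull(values)
present = pd.notnull(values)

# Scalars work too; None counts as missing.
none_is_missing = pd.isnull(None)

print(missing.tolist(), present.tolist(), none_is_missing)
```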

Top-level dealing with datetimes

to_datetime(arg[, errors, dayfirst, utc, ...]) Convert argument to datetime
to_timedelta(arg[, box, unit]) Convert argument to timedelta
date_range([start, end, periods, freq, tz, ...]) Return a fixed frequency datetime index, with day (calendar) as the default frequency
bdate_range([start, end, periods, freq, tz, ...]) Return a fixed frequency datetime index, with business day as the default frequency
period_range([start, end, periods, freq, name]) Return a fixed frequency PeriodIndex, with day (calendar) as the default frequency
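The sketch below contrasts the range constructors on arbitrarily chosen dates. 2014-01-03 fell on a Friday, so the business-day range skips the weekend that follows it.

```python
import pandas as pd

# Parse a single date string into a timestamp.
stamp = pd.to_datetime("2014-01-31")

# Five consecutive calendar days.
days = pd.date_range(start="2014-01-01", periods=5, freq="D")

# Three business days starting on a Friday: the weekend is skipped.
bdays = pd.bdate_range(start="2014-01-03", periods=3)

# Three monthly periods (spans of time rather than points in time).
months = pd.period_range(start="2014-01", periods=3, freq="M")

print(days[-1], bdays[1], months[-1])
```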

Top-level evaluation

eval(expr[, parser, engine, truediv, ...]) Evaluate a Python expression as a string using various backends.

Standard moving window functions

rolling_count(arg, window[, freq, center, ...]) Rolling count of number of non-NaN observations inside provided window.
rolling_sum(arg, window[, min_periods, ...]) Moving sum
rolling_mean(arg, window[, min_periods, ...]) Moving mean
rolling_median(arg, window[, min_periods, ...]) Moving median; O(N log(window)) implementation using skip list
rolling_var(arg, window[, min_periods, ...]) Unbiased moving variance
rolling_std(arg, window[, min_periods, ...]) Unbiased moving standard deviation
rolling_min(arg, window[, min_periods, ...]) Moving min of 1d array of dtype=float64 along axis=0 ignoring NaNs.
rolling_max(arg, window[, min_periods, ...]) Moving max of 1d array of dtype=float64 along axis=0 ignoring NaNs.
rolling_corr(arg1, arg2, window[, ...]) Moving sample correlation
rolling_corr_pairwise(df, window[, min_periods]) Computes pairwise rolling correlation matrices as Panel whose items are
rolling_cov(arg1, arg2, window[, ...]) Unbiased moving covariance
rolling_skew(arg, window[, min_periods, ...]) Unbiased moving skewness
rolling_kurt(arg, window[, min_periods, ...]) Unbiased moving kurtosis
rolling_apply(arg, window, func[, ...]) Generic moving function application
rolling_quantile(arg, window, quantile[, ...]) Moving quantile
rolling_window(arg[, window, win_type, ...]) Applies a moving window of type window_type and size window on the data.
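To make the `window` and `min_periods` arguments concrete, here is a plain-Python sketch of what a moving mean computes. It does not call pandas: `moving_mean` is a hypothetical helper written for illustration. The NaN handling is an assumption about the intended semantics, not the library's implementation: NaNs are skipped, and NaN is emitted when fewer than `min_periods` observations are available.

```python
import math


def moving_mean(values, window, min_periods=None):
    """Illustrative moving mean over a list of floats (not the pandas code)."""
    # Assumption: by default a full window of observations is required.
    if min_periods is None:
        min_periods = window
    out = []
    for i in range(len(values)):
        # The window covers the current element and the window - 1 before it.
        chunk = values[max(0, i - window + 1): i + 1]
        valid = [v for v in chunk if not math.isnan(v)]
        if len(valid) >= min_periods:
            out.append(sum(valid) / len(valid))
        else:
            out.append(float("nan"))
    return out


result = moving_mean([1.0, 2.0, 3.0, 4.0], window=2)
print(result)
```

Under these assumptions the first element is NaN because only one observation is available for a window of two. Passing `min_periods=1` would fill it with the single value seen so far.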

Standard expanding window functions

expanding_count(arg[, freq, center, time_rule]) Expanding count of number of non-NaN observations.
expanding_sum(arg[, min_periods, freq, ...]) Expanding sum
expanding_mean(arg[, min_periods, freq, ...]) Expanding mean
expanding_median(arg[, min_periods, freq, ...]) Expanding median; O(N log(window)) implementation using skip list
expanding_var(arg[, min_periods, freq, ...]) Unbiased expanding variance
expanding_std(arg[, min_periods, freq, ...]) Unbiased expanding standard deviation
expanding_min(arg[, min_periods, freq, ...]) Expanding min of 1d array of dtype=float64 along axis=0 ignoring NaNs.
expanding_max(arg[, min_periods, freq, ...]) Expanding max of 1d array of dtype=float64 along axis=0 ignoring NaNs.
expanding_corr(arg1, arg2[, min_periods, ...]) Expanding sample correlation
expanding_corr_pairwise(df[, min_periods]) Computes pairwise expanding correlation matrices as Panel whose items are
expanding_cov(arg1, arg2[, min_periods, ...]) Unbiased expanding covariance
expanding_skew(arg[, min_periods, freq, ...]) Unbiased expanding skewness
expanding_kurt(arg[, min_periods, freq, ...]) Unbiased expanding kurtosis
expanding_apply(arg, func[, min_periods, ...]) Generic expanding function application
expanding_quantile(arg, quantile[, ...]) Expanding quantile

Exponentially-weighted moving window functions

ewma(arg[, com, span, halflife, ...]) Exponentially-weighted moving average
ewmstd(arg[, com, span, halflife, ...]) Exponentially-weighted moving std
ewmvar(arg[, com, span, halflife, ...]) Exponentially-weighted moving variance
ewmcorr(arg1, arg2[, com, span, halflife, ...]) Exponentially-weighted moving correlation
ewmcov(arg1, arg2[, com, span, halflife, ...]) Exponentially-weighted moving covariance

Series

Constructor

Series([data, index, dtype, name, copy, ...]) One-dimensional ndarray with axis labels (including time series).

Attributes and underlying data

Axes
  • index: axis labels
Series.values Return Series as ndarray
Series.dtype
Series.isnull() Return a boolean same-sized object indicating if the values are null
Series.notnull() Return a boolean same-sized object indicating if the values are not null

Conversion

Series.astype(dtype[, copy, raise_on_error]) Cast object to input numpy.dtype
Series.copy([deep]) Make a copy of this object
Series.isnull() Return a boolean same-sized object indicating if the values are null
Series.notnull() Return a boolean same-sized object indicating if the values are not null

Indexing, iteration

Series.get(label[, default]) Returns value occupying requested label, default to specified missing value if not present.
Series.at
Series.iat
Series.ix
Series.loc
Series.iloc
Series.__iter__()
Series.iteritems() Lazily iterate over (index, value) tuples

For more information on .at, .iat, .ix, .loc, and .iloc, see the indexing documentation.
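A brief sketch of the label-based and position-based accessors on a small Series. The labels and values are made up, and `.ix` is left out of the example on purpose.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=["a", "b", "c"])

by_label = s.loc["b"]        # label-based lookup
by_position = s.iloc[0]      # integer-position lookup
scalar_label = s.at["c"]     # fast scalar access by label
scalar_pos = s.iat[1]        # fast scalar access by position

# get() returns a default instead of raising when the label is absent.
fallback = s.get("z", default=-1)

print(by_label, by_position, scalar_label, scalar_pos, fallback)
```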

Binary operator functions

Series.add(other[, level, fill_value, axis]) Binary operator add with support to substitute a fill_value for missing data
Series.sub(other[, level, fill_value, axis]) Binary operator sub with support to substitute a fill_value for missing data
Series.mul(other[, level, fill_value, axis]) Binary operator mul with support to substitute a fill_value for missing data
Series.div(other[, level, fill_value, axis]) Binary operator truediv with support to substitute a fill_value for missing data
Series.truediv(other[, level, fill_value, axis]) Binary operator truediv with support to substitute a fill_value for missing data
Series.floordiv(other[, level, fill_value, axis]) Binary operator floordiv with support to substitute a fill_value for missing data
Series.mod(other[, level, fill_value, axis]) Binary operator mod with support to substitute a fill_value for missing data
Series.pow(other[, level, fill_value, axis]) Binary operator pow with support to substitute a fill_value for missing data
Series.radd(other[, level, fill_value, axis]) Binary operator radd with support to substitute a fill_value for missing data
Series.rsub(other[, level, fill_value, axis]) Binary operator rsub with support to substitute a fill_value for missing data
Series.rmul(other[, level, fill_value, axis]) Binary operator rmul with support to substitute a fill_value for missing data
Series.rdiv(other[, level, fill_value, axis]) Binary operator rtruediv with support to substitute a fill_value for missing data
Series.rtruediv(other[, level, fill_value, axis]) Binary operator rtruediv with support to substitute a fill_value for missing data
Series.rfloordiv(other[, level, fill_value, ...]) Binary operator rfloordiv with support to substitute a fill_value for missing data
Series.rmod(other[, level, fill_value, axis]) Binary operator rmod with support to substitute a fill_value for missing data
Series.rpow(other[, level, fill_value, axis]) Binary operator rpow with support to substitute a fill_value for missing data
Series.combine(other, func[, fill_value]) Perform elementwise binary operation on two Series using given function
Series.combine_first(other) Combine Series values, choosing the calling Series’s values
Series.round([decimals, out]) Return a Series with each element rounded to the given number of decimals.
Series.lt(other)
Series.gt(other)
Series.le(other)
Series.ge(other)
Series.ne(other)
Series.eq(other)
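The `fill_value` argument of the flexible arithmetic methods matters when the two operands are indexed differently. The sketch below compares the plain `+` operator with `Series.add` on two partially overlapping, made-up Series.

```python
import pandas as pd

a = pd.Series([1.0, 2.0], index=["x", "y"])
b = pd.Series([10.0, 20.0], index=["y", "z"])

# Plain operator: labels present on only one side produce NaN.
plain = a + b

# Flexible method: a label missing on one side is treated as fill_value.
filled = a.add(b, fill_value=0)

print(plain)
print(filled)
```

Only the shared label `y` survives the plain addition, while `filled` keeps all three labels.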

Function application, GroupBy

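Series.apply(func[, convert_dtype, args]) Invoke function on values of Series. Can be ufunc (a NumPy function that applies to the entire Series) or a Python function that only works on single values
Series.map(arg[, na_action]) Map values of Series using input correspondence (which can be a dict, Series, or function)
Series.groupby([by, axis, level, as_index, ...]) Group series using mapper (dict or key function, apply given function to group, return result as series) or by a series of columns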

Computations / Descriptive Stats

Series.abs() Return an object with absolute value taken.
Series.any([axis, out]) Returns True if any of the elements of the Series evaluate to True.
Series.autocorr() Lag-1 autocorrelation
Series.between(left, right[, inclusive]) Return boolean Series equivalent to left <= series <= right. NA values will be treated as False
Series.clip([lower, upper, out]) Trim values at input threshold(s)
Series.clip_lower(threshold) Return copy of the input with values below given value truncated
Series.clip_upper(threshold) Return copy of input with values above given value truncated
Series.corr(other[, method, min_periods]) Compute correlation with other Series, excluding missing values
Series.count([level]) Return number of non-NA/null observations in the Series
Series.cov(other[, min_periods]) Compute covariance with Series, excluding missing values
Series.cummax([axis, dtype, out, skipna]) Return cumulative max over requested axis.
Series.cummin([axis, dtype, out, skipna]) Return cumulative min over requested axis.
Series.cumprod([axis, dtype, out, skipna]) Return cumulative prod over requested axis.
Series.cumsum([axis, dtype, out, skipna]) Return cumulative sum over requested axis.
Series.describe([percentile_width]) Generate various summary statistics of Series, excluding NaN values
Series.diff([periods]) 1st discrete difference of object
Series.kurt([axis, skipna, level, numeric_only]) Return unbiased kurtosis over requested axis
Series.mad([axis, skipna, level]) Return the mean absolute deviation of the values for the requested axis
Series.max([axis, skipna, level, numeric_only]) This method returns the maximum of the values in the object.
Series.mean([axis, skipna, level, numeric_only]) Return the mean of the values for the requested axis
Series.median([axis, skipna, level, ...]) Return the median of the values for the requested axis
Series.min([axis, skipna, level, numeric_only]) This method returns the minimum of the values in the object.
Series.mode() Returns the mode(s) of the dataset.
Series.nunique() Return count of unique elements in the Series
Series.pct_change([periods, fill_method, ...]) Percent change over given number of periods
Series.prod([axis, skipna, level, numeric_only]) Return the product of the values for the requested axis
Series.quantile([q]) Return value at the given quantile, a la scoreatpercentile in scipy.stats
Series.rank([method, na_option, ascending]) Compute data ranks (1 through n).
Series.skew([axis, skipna, level, numeric_only]) Return unbiased skew over requested axis
Series.std([axis, skipna, level, ddof]) Return unbiased standard deviation over requested axis
Series.sum([axis, skipna, level, numeric_only]) Return the sum of the values for the requested axis
Series.unique() Return array of unique values in the Series. Significantly faster than numpy.unique
Series.var([axis, skipna, level, ddof]) Return unbiased variance over requested axis
Series.value_counts([normalize, sort, ...]) Returns Series containing counts of unique values. The resulting Series will be in descending order so that the first element is the most frequently-occurring element

Reindexing / Selection / Label manipulation

Series.align(other[, join, axis, level, ...]) Align two objects on their axes with the specified join method for each axis Index
Series.drop(labels[, axis, level, inplace]) Return new object with labels in requested axis removed
Series.first(offset) Convenience method for subsetting initial periods of time series data
Series.head([n]) Returns first n rows
Series.idxmax([axis, out, skipna]) Index of first occurrence of maximum of values.
Series.idxmin([axis, out, skipna]) Index of first occurrence of minimum of values.
Series.isin(values) Return a boolean Series showing whether each element in the Series is exactly contained in the passed sequence of values
Series.last(offset) Convenience method for subsetting final periods of time series data
Series.reindex([index]) Conform Series to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index
Series.reindex_like(other[, method, copy, limit]) Return an object with indices matching those of other
Series.rename([index]) Alter axes using input function or functions.
Series.reset_index([level, drop, name, inplace]) Analogous to the DataFrame.reset_index function, see docstring there.
Series.select(crit[, axis]) Return data corresponding to axis labels matching criteria
Series.take(indices[, axis, convert]) Analogous to ndarray.take, return Series corresponding to requested indices
Series.tail([n]) Returns last n rows
Series.truncate([before, after, axis, copy]) Truncates a sorted NDFrame before and/or after some particular

Missing data handling

Series.dropna([axis, inplace]) Return Series without null values
Series.fillna([value, method, axis, ...]) Fill NA/NaN values using the specified method
Series.interpolate([method, axis, limit, ...]) Interpolate values according to different methods.
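The three methods above treat a gap differently. The sketch below applies each to the same toy Series with default arguments, assuming the default interpolation method is linear.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

dropped = s.dropna()        # removes the missing entry, keeps labels 0 and 2
filled = s.fillna(0.0)      # replaces the gap with a constant
interp = s.interpolate()    # fills the gap from its neighbours

print(dropped.tolist(), filled.tolist(), interp.tolist())
```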

Reshaping, sorting

Series.argsort([axis, kind, order]) Overrides ndarray.argsort.
Series.order([na_last, ascending, kind]) Sorts Series object, by value, maintaining index-value link
Series.reorder_levels(order) Rearrange index levels using input order.
Series.sort([axis, kind, order, ascending]) Sort values and index labels by value, in place.
Series.sort_index([ascending]) Sort object by labels (along an axis)
Series.sortlevel([level, ascending]) Sort Series with MultiIndex by chosen level. Data will be
Series.swaplevel(i, j[, copy]) Swap levels i and j in a MultiIndex
Series.unstack([level]) Unstack, a.k.a. pivot, Series with MultiIndex to produce DataFrame

Combining / joining / merging

Series.append(to_append[, verify_integrity]) Concatenate two or more Series. The indexes must not overlap
Series.replace([to_replace, value, inplace, ...]) Replace values given in ‘to_replace’ with ‘value’.
Series.update(other) Modify Series in place using non-NA values from passed Series. Aligns on index

String handling

Series.str can be used to access the values of the series as strings and apply several methods to it. Due to implementation details the methods show up here as methods of the StringMethods class.

StringMethods.cat([others, sep, na_rep]) Concatenate arrays of strings with given separator
StringMethods.center(width) “Center” strings, filling left and right side with additional whitespace
StringMethods.contains(pat[, case, flags, na]) Check whether given pattern is contained in each string in the array
StringMethods.count(pat[, flags]) Count occurrences of pattern in each string
StringMethods.decode(encoding[, errors]) Decode character string to unicode using indicated encoding
StringMethods.encode(encoding[, errors]) Encode character string to some other encoding using indicated encoding
StringMethods.endswith(pat[, na]) Return boolean array indicating whether each string ends with passed pattern
StringMethods.extract(pat[, flags]) Find groups in each string using passed regular expression
StringMethods.findall(pat[, flags]) Find all occurrences of pattern or regular expression
StringMethods.get(i) Extract element from lists, tuples, or strings in each element in the array
StringMethods.join(sep) Join lists contained as elements in array, a la str.join
StringMethods.len() Compute length of each string in array.
StringMethods.lower() Convert strings in array to lowercase
StringMethods.lstrip([to_strip]) Strip whitespace (including newlines) from left side of each string in the array
StringMethods.match(pat[, flags]) Deprecated: Find groups in each string using passed regular expression.
StringMethods.pad(width[, side]) Pad strings with whitespace
StringMethods.repeat(repeats) Duplicate each string in the array by indicated number of times
StringMethods.replace(pat, repl[, n, case, ...]) Replace occurrences of pattern pat in each string with repl
StringMethods.rstrip([to_strip]) Strip whitespace (including newlines) from right side of each string in the array
StringMethods.slice([start, stop, step]) Slice substrings from each element in array
StringMethods.slice_replace([i, j]) Slice substrings from each element in array
StringMethods.split([pat, n]) Split each string (a la re.split) in array by given pattern, propagating NA values
StringMethods.startswith(pat[, na]) Return boolean array indicating whether each string starts with passed pattern
StringMethods.strip([to_strip]) Strip whitespace (including newlines) from each string in the array
StringMethods.title() Convert strings to titlecased version
StringMethods.upper() Convert strings in array to uppercase
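As a sketch of how the `Series.str` accessor reaches these methods, the example below applies three of them to a small Series containing one missing entry. The names are made up; missing values are expected to propagate rather than raise.

```python
import pandas as pd

names = pd.Series(["alice", "Bob", None])

upper = names.str.upper()                     # uppercase each string
has_b = names.str.contains("b", case=False)   # case-insensitive pattern test
lengths = names.str.len()                     # length of each string

print(upper.tolist())
```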

Plotting

Series.hist([by, ax, grid, xlabelsize, ...]) Draw histogram of the input series using matplotlib
Series.plot(series[, label, kind, ...]) Plot the input series with the index on the x-axis using matplotlib

Serialization / IO / Conversion

Series.from_csv(path[, sep, parse_dates, ...]) Read delimited file into Series
Series.to_pickle(path) Pickle (serialize) object to input file path
Series.to_csv(path[, index, sep, na_rep, ...]) Write Series to a comma-separated values (csv) file
Series.to_dict() Convert Series to {label -> value} dict
Series.to_frame([name]) Convert Series to DataFrame
Series.to_hdf(path_or_buf, key, **kwargs) activate the HDFStore
Series.to_json([path_or_buf, orient, ...]) Convert the object to a JSON string.
Series.to_sparse([kind, fill_value]) Convert Series to SparseSeries
Series.to_string([buf, na_rep, ...]) Render a string representation of the Series
Series.to_clipboard([excel, sep]) Attempt to write text representation of object to the system clipboard

DataFrame

Constructor

DataFrame([data, index, columns, dtype, copy]) Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns).

Attributes and underlying data

Axes

  • index: row labels
  • columns: column labels
DataFrame.as_matrix([columns]) Convert the frame to its Numpy-array matrix representation. Columns
DataFrame.dtypes
DataFrame.get_dtype_counts() Return the counts of dtypes in this frame
DataFrame.values Numpy representation of NDFrame
DataFrame.axes
DataFrame.ndim Number of axes / array dimensions
DataFrame.shape

Conversion

DataFrame.astype(dtype[, copy, raise_on_error]) Cast object to input numpy.dtype
DataFrame.convert_objects([convert_dates, ...]) Attempt to infer better dtype for object columns
DataFrame.copy([deep]) Make a copy of this object
DataFrame.isnull() Return a boolean same-sized object indicating if the values are null
DataFrame.notnull() Return a boolean same-sized object indicating if the values are not null

Indexing, iteration

DataFrame.head([n]) Returns first n rows
DataFrame.at
DataFrame.iat
DataFrame.ix
DataFrame.loc
DataFrame.iloc
DataFrame.insert(loc, column, value[, ...]) Insert column into DataFrame at specified location.
DataFrame.__iter__() Iterate over info axis
DataFrame.iteritems() Iterator over (column, series) pairs
DataFrame.iterrows() Iterate over rows of DataFrame as (index, Series) pairs.
DataFrame.itertuples([index]) Iterate over rows of DataFrame as tuples, with index value
DataFrame.lookup(row_labels, col_labels) Label-based “fancy indexing” function for DataFrame.
DataFrame.pop(item) Return item and drop from frame.
DataFrame.tail([n]) Returns last n rows
DataFrame.xs(key[, axis, level, copy, ...]) Returns a cross-section (row(s) or column(s)) from the DataFrame.
DataFrame.isin(values) Return boolean DataFrame showing whether each element in the
DataFrame.query(expr, **kwargs) Query the columns of a frame with a boolean expression.

For more information on .at, .iat, .ix, .loc, and .iloc, see the indexing documentation.
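A minimal sketch of a few DataFrame selection tools from the list above, on a made-up three-row frame. The row labels `r1` to `r3` and the columns `a` and `b` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=["r1", "r2", "r3"])

cell = df.loc["r2", "b"]       # single value by row and column label
first_row = df.iloc[0]         # first row by position, returned as a Series
subset = df.query("a > 1")     # rows where the boolean expression holds
mask = df.isin([1, 6])         # elementwise membership test

# iterrows yields (index, Series) pairs.
row_labels = [label for label, _ in df.iterrows()]

print(cell, subset.index.tolist(), row_labels)
```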

Binary operator functions

DataFrame.add(other[, axis, level, fill_value]) Binary operator add with support to substitute a fill_value for missing data in
DataFrame.sub(other[, axis, level, fill_value]) Binary operator sub with support to substitute a fill_value for missing data in
DataFrame.mul(other[, axis, level, fill_value]) Binary operator mul with support to substitute a fill_value for missing data in
DataFrame.div(other[, axis, level, fill_value]) Binary operator truediv with support to substitute a fill_value for missing data in
DataFrame.truediv(other[, axis, level, ...]) Binary operator truediv with support to substitute a fill_value for missing data in
DataFrame.floordiv(other[, axis, level, ...]) Binary operator floordiv with support to substitute a fill_value for missing data in
DataFrame.mod(other[, axis, level, fill_value]) Binary operator mod with support to substitute a fill_value for missing data in
DataFrame.pow(other[, axis, level, fill_value]) Binary operator pow with support to substitute a fill_value for missing data in
DataFrame.radd(other[, axis, level, fill_value]) Binary operator radd with support to substitute a fill_value for missing data in
DataFrame.rsub(other[, axis, level, fill_value]) Binary operator rsub with support to substitute a fill_value for missing data in
DataFrame.rmul(other[, axis, level, fill_value]) Binary operator rmul with support to substitute a fill_value for missing data in
DataFrame.rdiv(other[, axis, level, fill_value]) Binary operator rtruediv with support to substitute a fill_value for missing data in
DataFrame.rtruediv(other[, axis, level, ...]) Binary operator rtruediv with support to substitute a fill_value for missing data in
DataFrame.rfloordiv(other[, axis, level, ...]) Binary operator rfloordiv with support to substitute a fill_value for missing data in
DataFrame.rmod(other[, axis, level, fill_value]) Binary operator rmod with support to substitute a fill_value for missing data in
DataFrame.rpow(other[, axis, level, fill_value]) Binary operator rpow with support to substitute a fill_value for missing data in
DataFrame.lt(other[, axis, level]) Wrapper for flexible comparison methods lt
DataFrame.gt(other[, axis, level]) Wrapper for flexible comparison methods gt
DataFrame.le(other[, axis, level]) Wrapper for flexible comparison methods le
DataFrame.ge(other[, axis, level]) Wrapper for flexible comparison methods ge
DataFrame.ne(other[, axis, level]) Wrapper for flexible comparison methods ne
DataFrame.eq(other[, axis, level]) Wrapper for flexible comparison methods eq
DataFrame.combine(other, func[, fill_value, ...]) Combine two DataFrame objects using func, without propagating NaN values
DataFrame.combineAdd(other) Add two DataFrame objects and do not propagate
DataFrame.combine_first(other) Combine two DataFrame objects and default to non-null values in frame
DataFrame.combineMult(other) Multiply two DataFrame objects and do not propagate NaN values, so if

Function application, GroupBy

DataFrame.apply(func[, axis, broadcast, ...]) Applies function along input axis of DataFrame.
DataFrame.applymap(func) Apply a function to a DataFrame that is intended to operate
DataFrame.groupby([by, axis, level, ...]) Group series using mapper (dict or key function, apply given function to group, return result as series) or by a series of columns
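The sketch below contrasts elementwise application with a grouped aggregation on a toy frame. The column names `key` and `val` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "b", "a"], "val": [1, 2, 3]})

# groupby: split rows by "key", then sum "val" within each group.
totals = df.groupby("key")["val"].sum()

# apply on a column: call the function once per value.
doubled = df["val"].apply(lambda v: v * 2)

# applymap-style elementwise work on a whole frame is not shown here.
print(totals.to_dict(), doubled.tolist())
```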

Computations / Descriptive Stats

DataFrame.abs() Return an object with absolute value taken.
DataFrame.any([axis, bool_only, skipna, level]) Return whether any element is True over requested axis.
DataFrame.clip([lower, upper, out]) Trim values at input threshold(s)
DataFrame.clip_lower(threshold) Return copy of the input with values below given value truncated
DataFrame.clip_upper(threshold) Return copy of input with values above given value truncated
DataFrame.corr([method, min_periods]) Compute pairwise correlation of columns, excluding NA/null values
DataFrame.corrwith(other[, axis, drop]) Compute pairwise correlation between rows or columns of two DataFrame
DataFrame.count([axis, level, numeric_only]) Return Series with number of non-NA/null observations over requested
DataFrame.cov([min_periods]) Compute pairwise covariance of columns, excluding NA/null values
DataFrame.cummax([axis, dtype, out, skipna]) Return cumulative max over requested axis.
DataFrame.cummin([axis, dtype, out, skipna]) Return cumulative min over requested axis.
DataFrame.cumprod([axis, dtype, out, skipna]) Return cumulative prod over requested axis.
DataFrame.cumsum([axis, dtype, out, skipna]) Return cumulative sum over requested axis.
DataFrame.describe([percentile_width]) Generate various summary statistics of each column, excluding
DataFrame.diff([periods]) 1st discrete difference of object
DataFrame.eval(expr, **kwargs) Evaluate an expression in the context of the calling DataFrame
DataFrame.kurt([axis, skipna, level, ...]) Return unbiased kurtosis over requested axis
DataFrame.mad([axis, skipna, level]) Return the mean absolute deviation of the values for the requested axis
DataFrame.max([axis, skipna, level, ...]) This method returns the maximum of the values in the object.
DataFrame.mean([axis, skipna, level, ...]) Return the mean of the values for the requested axis
DataFrame.median([axis, skipna, level, ...]) Return the median of the values for the requested axis
DataFrame.min([axis, skipna, level, ...]) This method returns the minimum of the values in the object.
DataFrame.mode([axis, numeric_only]) Gets the mode of each element along the axis selected.
DataFrame.pct_change([periods, fill_method, ...]) Percent change over given number of periods
DataFrame.prod([axis, skipna, level, ...]) Return the product of the values for the requested axis
DataFrame.quantile([q, axis, numeric_only]) Return values at the given quantile over requested axis, a la
DataFrame.rank([axis, numeric_only, method, ...]) Compute numerical data ranks (1 through n) along axis.
DataFrame.skew([axis, skipna, level, ...]) Return unbiased skew over requested axis
DataFrame.sum([axis, skipna, level, ...]) Return the sum of the values for the requested axis
DataFrame.std([axis, skipna, level, ddof]) Return unbiased standard deviation over requested axis
DataFrame.var([axis, skipna, level, ddof]) Return unbiased variance over requested axis

Reindexing / Selection / Label manipulation

DataFrame.add_prefix(prefix) Concatenate prefix string with panel items names.
DataFrame.add_suffix(suffix) Concatenate suffix string with panel items names
DataFrame.align(other[, join, axis, level, ...]) Align two objects on their axes with the specified join method for each axis Index
DataFrame.drop(labels[, axis, level, inplace]) Return new object with labels in requested axis removed
DataFrame.drop_duplicates([cols, take_last, ...]) Return DataFrame with duplicate rows removed, optionally only
DataFrame.duplicated([cols, take_last]) Return boolean Series denoting duplicate rows, optionally only
DataFrame.filter([items, like, regex, axis]) Restrict the info axis to set of items or wildcard
DataFrame.first(offset) Convenience method for subsetting initial periods of time series data
DataFrame.head([n]) Returns first n rows
DataFrame.idxmax([axis, skipna]) Return index of first occurrence of maximum over requested axis.
DataFrame.idxmin([axis, skipna]) Return index of first occurrence of minimum over requested axis.
DataFrame.last(offset) Convenience method for subsetting final periods of time series data
DataFrame.reindex([index, columns]) Conform DataFrame to new index with optional filling logic, placing
DataFrame.reindex_axis(labels[, axis, ...]) Conform input object to new index with optional filling logic,
DataFrame.reindex_like(other[, method, ...]) Return an object with indices matching those of other
DataFrame.rename([index, columns]) Alter axes using input function or functions.
DataFrame.reset_index([level, drop, ...]) For DataFrame with multi-level index, return new DataFrame with
DataFrame.select(crit[, axis]) Return data corresponding to axis labels matching criteria
DataFrame.set_index(keys[, drop, append, ...]) Set the DataFrame index (row labels) using one or more existing
DataFrame.tail([n]) Returns last n rows
DataFrame.take(indices[, axis, convert, is_copy]) Analogous to ndarray.take
DataFrame.truncate([before, after, axis, copy]) Truncates a sorted NDFrame before and/or after some particular

Missing data handling

DataFrame.dropna([axis, how, thresh, ...]) Return object with labels on given axis omitted where alternately any
DataFrame.fillna([value, method, axis, ...]) Fill NA/NaN values using the specified method
DataFrame.replace([to_replace, value, ...]) Replace values given in ‘to_replace’ with ‘value’.

Reshaping, sorting, transposing

DataFrame.delevel(*args, **kwargs)
DataFrame.pivot([index, columns, values]) Reshape data (produce a “pivot” table) based on column values.
DataFrame.reorder_levels(order[, axis]) Rearrange index levels using input order.
DataFrame.sort([columns, column, axis, ...]) Sort DataFrame either by labels (along either axis) or by the values in
DataFrame.sort_index([axis, by, ascending, ...]) Sort DataFrame either by labels (along either axis) or by the values in
DataFrame.sortlevel([level, axis, ...]) Sort multilevel index by chosen axis and primary level.
DataFrame.swaplevel(i, j[, axis]) Swap levels i and j in a MultiIndex on a particular axis
DataFrame.stack([level, dropna]) Pivot a level of the (possibly hierarchical) column labels, returning a
DataFrame.unstack([level]) Pivot a level of the (necessarily hierarchical) index labels, returning
DataFrame.T Transpose index and columns
DataFrame.to_panel() Transform long (stacked) format (DataFrame) into wide (3D, Panel)
DataFrame.transpose() Transpose index and columns
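
As an illustration of how pivot, stack, unstack and T relate to one another, the sketch below reshapes a small long-format table. The column names "date", "variable" and "value" are assumptions made for the example. Recent pandas releases may emit a warning about a changed stack implementation; the result for this data should be the same either way.

```python
import pandas as pd

# Hypothetical long-format data: one row per (date, variable) observation.
long_df = pd.DataFrame({
    "date": ["d1", "d1", "d2", "d2"],
    "variable": ["x", "y", "x", "y"],
    "value": [1, 2, 3, 4],
})

# One row per date, one column per variable.
wide = long_df.pivot(index="date", columns="variable", values="value")

# Move the column labels into an inner index level (result is a Series).
stacked = wide.stack()

# Move the inner index level back out to the columns.
roundtrip = stacked.unstack()

# Swap rows and columns.
transposed = wide.T
```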

Combining / joining / merging

DataFrame.append(other[, ignore_index, ...]) Append rows of other to the end of this frame, returning a new object.
DataFrame.join(other[, on, how, lsuffix, ...]) Join columns with other DataFrame either on index or on a key
DataFrame.merge(right[, how, on, left_on, ...]) Merge DataFrame objects by performing a database-style join operation by
DataFrame.update(other[, join, overwrite, ...]) Modify DataFrame in place using non-NA values from passed
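
The following sketch contrasts merge (join on a column), join (join on the index) and update (in-place overwrite with non-NA values). All frames and column names here are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frames that share a "key" column.
left = pd.DataFrame({"key": ["a", "b", "c"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "rval": [20, 30, 40]})

# Database-style inner join on the shared column.
merged = left.merge(right, how="inner", on="key")

# join aligns on the index, so the key is moved into the index first.
joined = left.set_index("key").join(right.set_index("key"), how="left")

# update modifies the caller in place and skips NaN values in the argument.
target = pd.DataFrame({"v": [1.0, 2.0, 3.0]})
target.update(pd.DataFrame({"v": [np.nan, 20.0, np.nan]}))
```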

Time series-related

DataFrame.asfreq(freq[, method, how, normalize]) Convert all TimeSeries inside to specified frequency using DateOffset
DataFrame.shift([periods, freq, axis]) Shift index by desired number of periods with an optional time freq
DataFrame.first_valid_index() Return label for first non-NA/null value
DataFrame.last_valid_index() Return label for last non-NA/null value
DataFrame.resample(rule[, how, axis, ...]) Convenience method for frequency conversion and resampling of regular time-series data.
DataFrame.to_period([freq, axis, copy]) Convert DataFrame from DatetimeIndex to PeriodIndex with desired
DataFrame.to_timestamp([freq, how, axis, copy]) Cast to DatetimeIndex of timestamps, at beginning of period
DataFrame.tz_convert(tz[, axis, copy]) Convert TimeSeries to target time zone. If it is time zone naive, it
DataFrame.tz_localize(tz[, axis, copy, ...]) Localize tz-naive TimeSeries to target time zone
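
A short sketch of several of the time series methods above, applied to a daily frame. The dates, the "price" column and the target time zone are arbitrary choices for the example, and the time zone conversion assumes that a time zone database is available to pandas.

```python
import numpy as np
import pandas as pd

# Hypothetical daily data whose first observation is missing.
idx = pd.date_range("2014-01-01", periods=4, freq="D")
df = pd.DataFrame({"price": [np.nan, 10.0, 11.0, 12.0]}, index=idx)

# Move the values one period forward; the index itself is unchanged.
lagged = df.shift(1)

# Label of the first row that holds a non-missing value.
first_label = df.first_valid_index()

# Attach a time zone to the naive index, then convert to another zone.
utc = df.tz_localize("UTC")
eastern = utc.tz_convert("America/New_York")

# Replace the DatetimeIndex with a monthly PeriodIndex.
monthly = df.to_period("M")
```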

Plotting

DataFrame.boxplot([column, by, ax, ...]) Make a box plot from DataFrame column/columns optionally grouped
DataFrame.hist(data[, column, by, grid, ...]) Draw histogram of the DataFrame’s series using matplotlib / pylab.
DataFrame.plot([frame, x, y, subplots, ...]) Make line, bar, or scatter plots of DataFrame series with the index on the x-axis

Serialization / IO / Conversion

DataFrame.from_csv(path[, header, sep, ...]) Read delimited file into DataFrame
DataFrame.from_dict(data[, orient, dtype]) Construct DataFrame from dict of array-like or dicts
DataFrame.from_items(items[, columns, orient]) Convert (key, value) pairs to DataFrame. The keys will be the axis
DataFrame.from_records(data[, index, ...]) Convert structured or record ndarray to DataFrame
DataFrame.info([verbose, buf, max_cols]) Concise summary of a DataFrame.
DataFrame.to_pickle(path) Pickle (serialize) object to input file path
DataFrame.to_csv(path_or_buf[, sep, na_rep, ...]) Write DataFrame to a comma-separated values (csv) file
DataFrame.to_hdf(path_or_buf, key, **kwargs) Write the object to an HDF5 file using HDFStore
DataFrame.to_dict([outtype]) Convert DataFrame to dictionary.
DataFrame.to_excel(excel_writer[, ...]) Write DataFrame to an Excel sheet
DataFrame.to_json([path_or_buf, orient, ...]) Convert the object to a JSON string.
DataFrame.to_html([buf, columns, col_space, ...]) Render a DataFrame as an HTML table.
DataFrame.to_latex([buf, columns, ...]) Render a DataFrame to a tabular environment table.
DataFrame.to_stata(fname[, convert_dates, ...]) Write DataFrame to a Stata binary dta file
DataFrame.to_records([index, convert_datetime64]) Convert DataFrame to record array. Index will be put in the
DataFrame.to_sparse([fill_value, kind]) Convert to SparseDataFrame
DataFrame.to_string([buf, columns, ...]) Render a DataFrame to a console-friendly tabular output.
DataFrame.to_clipboard([excel, sep]) Attempt to write text representation of object to the system clipboard
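
To illustrate a few of the conversion methods above without touching the file system, this sketch writes CSV text to an in-memory buffer. The data and the use of io.StringIO are choices made for the example only.

```python
import io

import pandas as pd

# Hypothetical frame built from a dict of lists (keys become columns).
df = pd.DataFrame.from_dict({"a": [1, 2], "b": [3, 4]})

# to_csv accepts a path or any writable buffer.
buf = io.StringIO()
df.to_csv(buf, index=False)
csv_text = buf.getvalue()

# Nested dict of the form {column -> {index label -> value}}.
as_dict = df.to_dict()

# NumPy record array; index=False leaves the row labels out.
records = df.to_records(index=False)

# Plain-text rendering suitable for a console.
text = df.to_string()
```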

Panel

Constructor

Panel([data, items, major_axis, minor_axis, ...]) Represents wide format panel data, stored as 3-dimensional array

Attributes and underlying data

Axes

  • items: axis 0; each item corresponds to a DataFrame contained inside
  • major_axis: axis 1; the index (rows) of each of the DataFrames
  • minor_axis: axis 2; the columns of each of the DataFrames
Panel.values Numpy representation of NDFrame
Panel.axes index(es) of the NDFrame
Panel.ndim Number of axes / array dimensions
Panel.shape tuple of axis dimensions

Conversion

Panel.astype(dtype[, copy, raise_on_error]) Cast object to input numpy.dtype
Panel.copy([deep]) Make a copy of this object
Panel.isnull() Return a boolean same-sized object indicating if the values are null
Panel.notnull() Return a boolean same-sized object indicating if the values are

Getting and setting

Panel.get_value(*args) Quickly retrieve single value at (item, major, minor) location
Panel.set_value(*args) Quickly set single value at (item, major, minor) location

Indexing, iteration, slicing

Panel.at
Panel.iat
Panel.ix
Panel.loc
Panel.iloc
Panel.__iter__() Iterate over info axis
Panel.iteritems() Iterate over (label, values) on info axis
Panel.pop(item) Return item and drop from frame.
Panel.xs(key[, axis, copy]) Return slice of panel along selected axis
Panel.major_xs(key[, copy]) Return slice of panel along major axis
Panel.minor_xs(key[, copy]) Return slice of panel along minor axis

For more information on .at, .iat, .ix, .loc, and .iloc, see the indexing documentation.

Binary operator functions

Panel.add(other[, axis]) Wrapper method for add
Panel.sub(other[, axis]) Wrapper method for sub
Panel.mul(other[, axis]) Wrapper method for mul
Panel.div(other[, axis]) Wrapper method for truediv
Panel.truediv(other[, axis]) Wrapper method for truediv
Panel.floordiv(other[, axis]) Wrapper method for floordiv
Panel.mod(other[, axis]) Wrapper method for mod
Panel.pow(other[, axis]) Wrapper method for pow
Panel.radd(other[, axis]) Wrapper method for radd
Panel.rsub(other[, axis]) Wrapper method for rsub
Panel.rmul(other[, axis]) Wrapper method for rmul
Panel.rdiv(other[, axis]) Wrapper method for rtruediv
Panel.rtruediv(other[, axis]) Wrapper method for rtruediv
Panel.rfloordiv(other[, axis]) Wrapper method for rfloordiv
Panel.rmod(other[, axis]) Wrapper method for rmod
Panel.rpow(other[, axis]) Wrapper method for rpow
Panel.lt(other) Wrapper for comparison method lt
Panel.gt(other) Wrapper for comparison method gt
Panel.le(other) Wrapper for comparison method le
Panel.ge(other) Wrapper for comparison method ge
Panel.ne(other) Wrapper for comparison method ne
Panel.eq(other) Wrapper for comparison method eq

Function application, GroupBy

Panel.apply(func[, axis]) Apply
Panel.groupby(function[, axis]) Group data on given axis, returning GroupBy object

Computations / Descriptive Stats

Panel.abs() Return an object with absolute value taken.
Panel.clip([lower, upper, out]) Trim values at input threshold(s)
Panel.clip_lower(threshold) Return copy of the input with values below given value truncated
Panel.clip_upper(threshold) Return copy of input with values above given value truncated
Panel.count([axis]) Return number of observations over requested axis.
Panel.cummax([axis, dtype, out, skipna]) Return cumulative max over requested axis.
Panel.cummin([axis, dtype, out, skipna]) Return cumulative min over requested axis.
Panel.cumprod([axis, dtype, out, skipna]) Return cumulative prod over requested axis.
Panel.cumsum([axis, dtype, out, skipna]) Return cumulative sum over requested axis.
Panel.max([axis, skipna, level, numeric_only]) This method returns the maximum of the values in the object.
Panel.mean([axis, skipna, level, numeric_only]) Return the mean of the values for the requested axis
Panel.median([axis, skipna, level, numeric_only]) Return the median of the values for the requested axis
Panel.min([axis, skipna, level, numeric_only]) This method returns the minimum of the values in the object.
Panel.pct_change([periods, fill_method, ...]) Percent change over given number of periods
Panel.prod([axis, skipna, level, numeric_only]) Return the product of the values for the requested axis
Panel.skew([axis, skipna, level, numeric_only]) Return unbiased skew over requested axis
Panel.sum([axis, skipna, level, numeric_only]) Return the sum of the values for the requested axis
Panel.std([axis, skipna, level, ddof]) Return unbiased standard deviation over requested axis
Panel.var([axis, skipna, level, ddof]) Return unbiased variance over requested axis

Reindexing / Selection / Label manipulation

Panel.add_prefix(prefix) Concatenate prefix string with panel items names.
Panel.add_suffix(suffix) Concatenate suffix string with panel items names
Panel.drop(labels[, axis, level, inplace]) Return new object with labels in requested axis removed
Panel.filter([items, like, regex, axis]) Restrict the info axis to set of items or wildcard
Panel.first(offset) Convenience method for subsetting initial periods of time series data
Panel.last(offset) Convenience method for subsetting final periods of time series data
Panel.reindex([items, major_axis, minor_axis]) Conform Panel to new index with optional filling logic, placing
Panel.reindex_axis(labels[, axis, method, ...]) Conform input object to new index with optional filling logic,
Panel.reindex_like(other[, method, copy, limit]) Return an object with indices matching those of other
Panel.rename([items, major_axis, minor_axis]) Alter axes input function or functions.
Panel.select(crit[, axis]) Return data corresponding to axis labels matching criteria
Panel.take(indices[, axis, convert, is_copy]) Analogous to ndarray.take
Panel.truncate([before, after, axis, copy]) Truncates a sorted NDFrame before and/or after some particular

Missing data handling

Panel.dropna([axis, how, inplace]) Drop 2D from panel, holding passed axis constant
Panel.fillna([value, method, axis, inplace, ...]) Fill NA/NaN values using the specified method

Reshaping, sorting, transposing

Panel.sort_index([axis, ascending]) Sort object by labels (along an axis)
Panel.swaplevel(i, j[, axis]) Swap levels i and j in a MultiIndex on a particular axis
Panel.transpose(*args, **kwargs) Permute the dimensions of the Panel
Panel.swapaxes(axis1, axis2[, copy]) Interchange axes and swap values axes appropriately
Panel.conform(frame[, axis]) Conform input DataFrame to align with chosen axis pair.

Combining / joining / merging

Panel.join(other[, how, lsuffix, rsuffix]) Join items with other Panel either on major and minor axes column
Panel.update(other[, join, overwrite, ...]) Modify Panel in place using non-NA values from passed

Time series-related

Panel.asfreq(freq[, method, how, normalize]) Convert all TimeSeries inside to specified frequency using DateOffset
Panel.shift(lags[, freq, axis]) Shift major or minor axis by specified number of leads/lags.
Panel.resample(rule[, how, axis, ...]) Convenience method for frequency conversion and resampling of regular time-series data.
Panel.tz_convert(tz[, axis, copy]) Convert TimeSeries to target time zone. If it is time zone naive, it
Panel.tz_localize(tz[, axis, copy, infer_dst]) Localize tz-naive TimeSeries to target time zone

Serialization / IO / Conversion

Panel.from_dict(data[, intersect, orient, dtype]) Construct Panel from dict of DataFrame objects
Panel.to_pickle(path) Pickle (serialize) object to input file path
Panel.to_excel(path[, na_rep, engine]) Write each DataFrame in Panel to a separate excel sheet
Panel.to_hdf(path_or_buf, key, **kwargs) Write the object to an HDF5 file using HDFStore
Panel.to_json([path_or_buf, orient, ...]) Convert the object to a JSON string.
Panel.to_sparse([fill_value, kind]) Convert to SparsePanel
Panel.to_frame([filter_observations]) Transform wide format into long (stacked) format as DataFrame
Panel.to_clipboard([excel, sep]) Attempt to write text representation of object to the system clipboard

Index

Many of these methods, or variants thereof, are available on the objects that contain an index (Series/DataFrame), and those should most likely be used before calling these methods directly.

Index Immutable ndarray implementing an ordered, sliceable set.

Modifying and Computations

Index.copy([names, name, dtype, deep]) Make a copy of this object.
Index.delete(loc) Make new Index with passed location deleted
Index.diff(other) Compute sorted set difference of two Index objects
Index.drop(labels) Make new Index with passed list of labels deleted
Index.equals(other) Determines if two Index objects contain the same elements.
Index.identical(other) Similar to equals, but check that other comparable attributes are
Index.insert(loc, item) Make new Index inserting new item at location
Index.order([return_indexer, ascending]) Return sorted copy of Index
Index.reindex(target[, method, level, ...]) For Index, simply returns the new index and the results of
Index.repeat(repeats[, axis]) Repeat elements of an array.
Index.set_names(names[, inplace]) Set new names on index.
Index.unique() Return array of unique values in the Index. Significantly faster than

Conversion

Index.astype(dtype)
Index.tolist() Overridden version of ndarray.tolist
Index.to_datetime([dayfirst]) For an Index containing strings or datetime.datetime objects, attempt
Index.to_series() return a series with both index and values equal to the index keys

Sorting

Index.argsort(*args, **kwargs) See docstring for ndarray.argsort
Index.order([return_indexer, ascending]) Return sorted copy of Index
Index.sort(*args, **kwargs)

Time-specific operations

Index.shift([periods, freq]) Shift Index containing datetime objects by input number of periods and

Combining / joining / merging

Index.append(other) Append a collection of Index objects together
Index.intersection(other) Form the intersection of two Index objects. Sortedness of the result is
Index.join(other[, how, level, return_indexers]) Internal API method. Compute join_index and indexers to conform data
Index.union(other) Form the union of two Index objects and sorts if possible
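
The set-like behaviour of union and intersection, as opposed to plain concatenation with append, can be seen in a small sketch. The labels are arbitrary.

```python
import pandas as pd

# Two hypothetical indexes that overlap in "b" and "c".
a = pd.Index(["a", "b", "c"])
b = pd.Index(["b", "c", "d"])

# Labels present in both indexes.
common = a.intersection(b)

# Labels present in either index, without duplicates.
both = a.union(b)

# Plain concatenation: duplicates are kept.
appended = a.append(b)
```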

Selecting

Index.get_indexer(target[, method, limit]) Compute indexer and mask for new index given the current index.
Index.get_indexer_non_unique(target, **kwargs) return an indexer suitable for taking from a non unique index
Index.get_level_values(level) Return vector of label values for requested level, equal to the length
Index.get_loc(key) Get integer location for requested label
Index.get_value(series, key) Fast lookup of value from 1-dimensional ndarray.
Index.isin(values) Compute boolean array of whether each index value is found in the
Index.slice_indexer([start, end, step]) For an ordered Index, compute the slice indexer for input labels and
Index.slice_locs([start, end]) For an ordered Index, compute the slice locations for input labels
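
The selection methods above translate labels into integer positions. A brief sketch on a sorted, unique index, with made-up labels:

```python
import pandas as pd

# Hypothetical sorted, unique index.
idx = pd.Index(["a", "b", "c"])

# Integer position of a single label.
pos = idx.get_loc("b")

# Positions of several labels at once; -1 marks a label that is absent.
indexer = idx.get_indexer(["c", "x"])

# Boolean array: is each index value among the given values?
mask = idx.isin(["a", "c"])

# Start and stop positions covering the label slice "a" through "b" (inclusive).
start, stop = idx.slice_locs("a", "b")
```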

DatetimeIndex

DatetimeIndex Immutable ndarray of datetime64 data, represented internally as int64, and

Selecting

DatetimeIndex.indexer_at_time(time[, asof]) Select values at particular time of day (e.g.
DatetimeIndex.indexer_between_time(...[, ...]) Select values between particular times of day (e.g., 9:00-9:30AM)

Time-specific operations

DatetimeIndex.normalize() Return DatetimeIndex with times set to midnight. Length is unaltered
DatetimeIndex.snap([freq]) Snap time stamps to nearest occurring frequency
DatetimeIndex.tz_convert(tz) Convert DatetimeIndex from one time zone to another (using pytz)
DatetimeIndex.tz_localize(tz[, infer_dst]) Localize tz-naive DatetimeIndex to given time zone (using pytz)

Conversion

DatetimeIndex.to_datetime([dayfirst])
DatetimeIndex.to_period([freq]) Cast to PeriodIndex at a particular frequency
DatetimeIndex.to_pydatetime() Return DatetimeIndex as object ndarray of datetime.datetime objects
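
A sketch of some of the DatetimeIndex methods listed above, using two hypothetical timestamps on the same day. Passing the times of day as strings is an assumption of this example; datetime.time objects are the other accepted form.

```python
import pandas as pd

# Two hypothetical timestamps: one in the morning, one in the afternoon.
idx = pd.DatetimeIndex(["2014-01-01 09:15", "2014-01-01 16:00"])

# Same dates with the time of day reset to 00:00.
midnight = idx.normalize()

# Integer positions of the entries that fall between 9:00 and 9:30.
positions = idx.indexer_between_time("09:00", "09:30")

# Daily periods that contain each timestamp.
periods = idx.to_period("D")

# Object ndarray of standard-library datetime.datetime values.
pydt = idx.to_pydatetime()
```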

GroupBy

GroupBy objects are returned by groupby calls: pandas.DataFrame.groupby(), pandas.Series.groupby(), etc.

Indexing, iteration

GroupBy.__iter__() Groupby iterator
GroupBy.groups dict {group name -> group labels}
GroupBy.indices dict {group name -> group indices}
GroupBy.get_group(name[, obj]) Constructs NDFrame from group with provided name
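
The iteration and lookup facilities above can be sketched as follows. The frame and the "team" grouping column are hypothetical.

```python
import pandas as pd

# Hypothetical data with two groups, "x" and "y".
df = pd.DataFrame({"team": ["x", "x", "y"], "score": [1.0, 3.0, 10.0]})
grouped = df.groupby("team")

# Iterating yields (group name, sub-frame) pairs.
sizes = {name: len(group) for name, group in grouped}

# groups maps each group name to the row labels in that group.
x_labels = list(grouped.groups["x"])

# indices maps each group name to the integer positions in that group.
y_positions = list(grouped.indices["y"])

# Retrieve the rows of a single group as a DataFrame.
x_rows = grouped.get_group("x")
```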

Function application

GroupBy.apply(func, *args, **kwargs) Apply function and combine results together in an intelligent way.
GroupBy.aggregate(func, *args, **kwargs)
GroupBy.transform(func, *args, **kwargs)
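
As the listing gives no description for aggregate and transform, the sketch below shows one way the three methods differ in the shape of their result. It assumes a single grouped column and invented data; the lambda functions are illustrative only.

```python
import pandas as pd

# Hypothetical data; a single column is selected after grouping.
df = pd.DataFrame({"team": ["x", "x", "y"], "score": [1.0, 3.0, 10.0]})
scores = df.groupby("team")["score"]

# aggregate reduces each group to one value (one row per group).
totals = scores.aggregate("sum")

# transform returns a result with the same length and index as the input.
centered = scores.transform(lambda s: s - s.mean())

# apply calls the function on each group and combines whatever it returns.
spread = scores.apply(lambda s: s.max() - s.min())
```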

Computations / Descriptive Stats

GroupBy.mean() Compute mean of groups, excluding missing values
GroupBy.median() Compute median of groups, excluding missing values
GroupBy.std([ddof]) Compute standard deviation of groups, excluding missing values
GroupBy.var([ddof]) Compute variance of groups, excluding missing values
GroupBy.ohlc() Compute open, high, low and close values of each group, excluding missing values