API Reference

Input/Output

Pickling

read_pickle(path) Load pickled pandas object (or any other pickled object) from the specified

Flat File

read_table(filepath_or_buffer[, sep, ...]) Read general delimited file into DataFrame
read_csv(filepath_or_buffer[, sep, dialect, ...]) Read CSV (comma-separated) file into DataFrame
read_fwf(filepath_or_buffer[, colspecs, widths]) Read a table of fixed-width formatted lines into DataFrame

Clipboard

read_clipboard(**kwargs) Read text from clipboard and pass to read_table.

Excel

read_excel(io[, sheetname]) Read an Excel table into a pandas DataFrame
ExcelFile.parse([sheetname, header, ...]) Read an Excel table into DataFrame

JSON

read_json([path_or_buf, orient, typ, dtype, ...]) Convert a JSON string to pandas object

HTML

read_html(io[, match, flavor, header, ...]) Read HTML tables into a list of DataFrame objects.

HDFStore: PyTables (HDF5)

read_hdf(path_or_buf, key, **kwargs) read from the store, close it if we opened it
HDFStore.put(key, value[, format, append]) Store object in HDFStore
HDFStore.append(key, value[, format, ...]) Append to Table in file. Node must already exist and be Table
HDFStore.get(key) Retrieve pandas object stored in file
HDFStore.select(key[, where, start, stop, ...]) Retrieve pandas object stored in file, optionally based on where

SQL

read_sql_table(table_name, con[, schema, ...]) Read SQL database table into a DataFrame.
read_sql_query(sql, con[, index_col, ...]) Read SQL query into a DataFrame.
read_sql(sql, con[, index_col, ...]) Read SQL query or database table into a DataFrame.

Google BigQuery

read_gbq(query[, project_id, index_col, ...]) Load data from Google BigQuery.
to_gbq(dataframe, destination_table[, ...]) Write a DataFrame to a Google BigQuery table.

STATA

read_stata(filepath_or_buffer[, ...]) Read Stata file into DataFrame
StataReader.data([convert_dates, ...]) Reads observations from Stata file, converting them into a dataframe
StataReader.data_label() Returns data label of Stata file
StataReader.value_labels() Returns a dict, associating each variable name a dict, associating
StataReader.variable_labels() Returns variable labels as a dict, associating each variable name
StataWriter.write_file()

General functions

Data manipulations

melt(frame[, id_vars, value_vars, var_name, ...]) “Unpivots” a DataFrame from wide format to long format, optionally leaving
pivot(index, columns, values) Produce ‘pivot’ table based on 3 columns of this DataFrame.
pivot_table(*args, **kwargs) Create a spreadsheet-style pivot table as a DataFrame. The levels in the
crosstab(*args, **kwargs) Compute a simple cross-tabulation of two (or more) factors.
cut(x, bins[, right, labels, retbins, ...]) Return indices of half-open bins to which each value of x belongs.
qcut(x, q[, labels, retbins, precision]) Quantile-based discretization function.
merge(left, right[, how, on, left_on, ...]) Merge DataFrame objects by performing a database-style join operation by
concat(objs[, axis, join, join_axes, ...]) Concatenate pandas objects along a particular axis with optional set logic along the other axes.
get_dummies(data[, prefix, prefix_sep, ...]) Convert categorical variable into dummy/indicator variables
factorize(values[, sort, order, na_sentinel]) Encode input values as an enumerated type or categorical variable
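
As a rough illustration of the wide-to-long reshaping that melt performs, the sketch below unpivots a small frame. The frame `wide` and its column names are made up for this example; only `id_vars` and `value_vars` come from the signature listed above.

```python
import pandas as pd

# Hypothetical wide-format frame: one row per id, one column per measurement
wide = pd.DataFrame({"id": [1, 2], "x": [10, 20], "y": [30, 40]})

# Keep "id" as the identifier column and stack "x" and "y" into
# (variable, value) pairs, giving one row per id/measurement combination
long = pd.melt(wide, id_vars=["id"], value_vars=["x", "y"])

print(long)
```

The result has the columns `id`, `variable` and `value`, where `variable` and `value` are the default names used when `var_name` is not passed.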

Top-level missing data

isnull(obj) Detect missing values (NaN in numeric arrays, None/NaN in object arrays)
notnull(obj) Detect non-missing values; a replacement for numpy.isfinite / ~numpy.isnan which is suitable for use on object arrays.
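
A minimal sketch of the two top-level functions applied to an object array, which is the case the summaries above single out. The array `values` is illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical object array mixing strings, None and NaN
values = np.array(["a", None, np.nan, "d"], dtype=object)

missing = pd.isnull(values)    # True where the entry is None or NaN
present = pd.notnull(values)   # element-wise complement of isnull

print(missing)
print(present)
```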

Top-level dealing with datetimelike

to_datetime(arg[, errors, dayfirst, utc, ...]) Convert argument to datetime
to_timedelta(arg[, unit, box, coerce]) Convert argument to timedelta
date_range([start, end, periods, freq, tz, ...]) Return a fixed frequency datetime index, with day (calendar) as the default
bdate_range([start, end, periods, freq, tz, ...]) Return a fixed frequency datetime index, with business day as the default
period_range([start, end, periods, freq, name]) Return a fixed frequency PeriodIndex, with day (calendar) as the default
timedelta_range([start, end, periods, freq, ...]) Return a fixed frequency timedelta index, with day as the default

Top-level evaluation

eval(expr[, parser, engine, truediv, ...]) Evaluate a Python expression as a string using various backends.

Standard moving window functions

rolling_count(arg, window[, freq, center, how]) Rolling count of number of non-NaN observations inside provided window.
rolling_sum(arg, window[, min_periods, ...]) Moving sum.
rolling_mean(arg, window[, min_periods, ...]) Moving mean.
rolling_median(arg, window[, min_periods, ...]) O(N log(window)) implementation using skip list
rolling_var(arg, window[, min_periods, ...]) Numerically stable implementation using Welford’s method.
rolling_std(arg, window[, min_periods, ...]) Moving standard deviation.
rolling_min(arg, window[, min_periods, ...]) Moving min of 1d array of dtype=float64 along axis=0 ignoring NaNs.
rolling_max(arg, window[, min_periods, ...]) Moving max of 1d array of dtype=float64 along axis=0 ignoring NaNs.
rolling_corr(arg1[, arg2, window, ...]) Moving sample correlation.
rolling_corr_pairwise(df1[, df2, window, ...]) Deprecated.
rolling_cov(arg1[, arg2, window, ...]) Unbiased moving covariance.
rolling_skew(arg, window[, min_periods, ...]) Unbiased moving skewness.
rolling_kurt(arg, window[, min_periods, ...]) Unbiased moving kurtosis.
rolling_apply(arg, window, func[, ...]) Generic moving function application.
rolling_quantile(arg, window, quantile[, ...]) Moving quantile.
rolling_window(arg[, window, win_type, ...]) Applies a moving window of type window_type and size window on the data.

Standard expanding window functions

expanding_count(arg[, freq]) Expanding count of number of non-NaN observations.
expanding_sum(arg[, min_periods, freq]) Expanding sum.
expanding_mean(arg[, min_periods, freq]) Expanding mean.
expanding_median(arg[, min_periods, freq]) O(N log(window)) implementation using skip list
expanding_var(arg[, min_periods, freq]) Numerically stable implementation using Welford’s method.
expanding_std(arg[, min_periods, freq]) Expanding standard deviation.
expanding_min(arg[, min_periods, freq]) Moving min of 1d array of dtype=float64 along axis=0 ignoring NaNs.
expanding_max(arg[, min_periods, freq]) Moving max of 1d array of dtype=float64 along axis=0 ignoring NaNs.
expanding_corr(arg1[, arg2, min_periods, ...]) Expanding sample correlation.
expanding_corr_pairwise(df1[, df2, ...]) Deprecated.
expanding_cov(arg1[, arg2, min_periods, ...]) Unbiased expanding covariance.
expanding_skew(arg[, min_periods, freq]) Unbiased expanding skewness.
expanding_kurt(arg[, min_periods, freq]) Unbiased expanding kurtosis.
expanding_apply(arg, func[, min_periods, ...]) Generic expanding function application.
expanding_quantile(arg, quantile[, ...]) Expanding quantile.

Exponentially-weighted moving window functions

ewma(arg[, com, span, halflife, ...]) Exponentially-weighted moving average
ewmstd(arg[, com, span, halflife, ...]) Exponentially-weighted moving std
ewmvar(arg[, com, span, halflife, ...]) Exponentially-weighted moving variance
ewmcorr(arg1[, arg2, com, span, halflife, ...]) Exponentially-weighted moving correlation
ewmcov(arg1[, arg2, com, span, halflife, ...]) Exponentially-weighted moving covariance

Series

Constructor

Series([data, index, dtype, name, copy, ...]) One-dimensional ndarray with axis labels (including time series).

Attributes

Axes
  • index: axis labels
Series.values Return Series as ndarray
Series.dtype return the dtype object of the underlying data
Series.ftype return if the data is sparse|dense
Series.shape return a tuple of the shape of the underlying data
Series.size return the number of elements in the underlying data
Series.nbytes return the number of bytes in the underlying data
Series.ndim return the number of dimensions of the underlying data, by definition 1
Series.strides return the strides of the underlying data
Series.itemsize return the size of the dtype of the item of the underlying data
Series.base return the base object if the memory of the underlying data is shared
Series.T return the transpose, which is by definition self

Conversion

Series.astype(dtype[, copy, raise_on_error]) Cast object to input numpy.dtype
Series.copy([deep]) Make a copy of this object
Series.isnull() Return a boolean same-sized object indicating if the values are null
Series.notnull() Return a boolean same-sized object indicating if the values are not null

Indexing, iteration

Series.get(key[, default]) Get item from object for given key (DataFrame column, Panel slice,
Series.at
Series.iat
Series.ix
Series.loc
Series.iloc
Series.__iter__()
Series.iteritems() Lazily iterate over (index, value) tuples

For more information on .at, .iat, .ix, .loc, and .iloc, see the indexing documentation.
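
The accessors differ mainly in whether they select by label or by integer position. The following is a small sketch on a made-up Series `s`; it is not a substitute for the indexing documentation.

```python
import pandas as pd

# Hypothetical Series with string labels
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

by_label = s.loc["b"]          # label-based selection
by_position = s.iloc[0]        # integer-position selection
scalar_label = s.at["c"]       # scalar access by label
scalar_position = s.iat[1]     # scalar access by integer position
label_slice = s.loc["a":"b"]   # label slices include both endpoints

print(by_label, by_position, scalar_label, scalar_position)
print(label_slice)
```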

Binary operator functions

Series.add(other[, level, fill_value, axis]) Binary operator add with support to substitute a fill_value for missing data
Series.sub(other[, level, fill_value, axis]) Binary operator sub with support to substitute a fill_value for missing data
Series.mul(other[, level, fill_value, axis]) Binary operator mul with support to substitute a fill_value for missing data
Series.div(other[, level, fill_value, axis]) Binary operator truediv with support to substitute a fill_value for missing data
Series.truediv(other[, level, fill_value, axis]) Binary operator truediv with support to substitute a fill_value for missing data
Series.floordiv(other[, level, fill_value, axis]) Binary operator floordiv with support to substitute a fill_value for missing data
Series.mod(other[, level, fill_value, axis]) Binary operator mod with support to substitute a fill_value for missing data
Series.pow(other[, level, fill_value, axis]) Binary operator pow with support to substitute a fill_value for missing data
Series.radd(other[, level, fill_value, axis]) Binary operator radd with support to substitute a fill_value for missing data
Series.rsub(other[, level, fill_value, axis]) Binary operator rsub with support to substitute a fill_value for missing data
Series.rmul(other[, level, fill_value, axis]) Binary operator rmul with support to substitute a fill_value for missing data
Series.rdiv(other[, level, fill_value, axis]) Binary operator rtruediv with support to substitute a fill_value for missing data
Series.rtruediv(other[, level, fill_value, axis]) Binary operator rtruediv with support to substitute a fill_value for missing data
Series.rfloordiv(other[, level, fill_value, ...]) Binary operator rfloordiv with support to substitute a fill_value for missing data
Series.rmod(other[, level, fill_value, axis]) Binary operator rmod with support to substitute a fill_value for missing data
Series.rpow(other[, level, fill_value, axis]) Binary operator rpow with support to substitute a fill_value for missing data
Series.combine(other, func[, fill_value]) Perform elementwise binary operation on two Series using given function
Series.combine_first(other) Combine Series values, choosing the calling Series’s values
Series.round([decimals, out]) Return the Series with each element rounded to the given number of decimals.
Series.lt(other)
Series.gt(other)
Series.le(other)
Series.ge(other)
Series.ne(other)
Series.eq(other)
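
To show what fill_value changes in the arithmetic methods above, here is a short sketch comparing the plain `+` operator with Series.add. The two Series and their labels are invented for the example.

```python
import pandas as pd

# Hypothetical Series whose indexes only partly overlap
left = pd.Series([1.0, 2.0], index=["a", "b"])
right = pd.Series([10.0, 20.0], index=["b", "c"])

# Plain operator: a label missing on either side yields NaN
plain = left + right

# Method form: a value missing on one side is replaced by 0 before adding
filled = left.add(right, fill_value=0)

print(plain)
print(filled)
```

With fill_value=0, the labels "a" and "c" keep the value from the side where they exist instead of becoming NaN.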

Function application, GroupBy

Series.apply(func[, convert_dtype, args]) Invoke function on values of Series. Can be ufunc (a NumPy function
Series.map(arg[, na_action]) Map values of Series using input correspondence (which can be
Series.groupby([by, axis, level, as_index, ...]) Group series using mapper (dict or key function, apply given function

Computations / Descriptive Stats

Series.abs() Return an object with absolute value taken.
Series.all([axis, out]) Returns True if all elements evaluate to True.
Series.any([axis, out]) Returns True if any of the elements evaluate to True.
Series.autocorr() Lag-1 autocorrelation
Series.between(left, right[, inclusive]) Return boolean Series equivalent to left <= series <= right. NA values
Series.clip([lower, upper, out]) Trim values at input threshold(s)
Series.clip_lower(threshold) Return copy of the input with values below given value truncated
Series.clip_upper(threshold) Return copy of input with values above given value truncated
Series.corr(other[, method, min_periods]) Compute correlation with other Series, excluding missing values
Series.count([level]) Return number of non-NA/null observations in the Series
Series.cov(other[, min_periods]) Compute covariance with Series, excluding missing values
Series.cummax([axis, dtype, out, skipna]) Return cumulative max over requested axis.
Series.cummin([axis, dtype, out, skipna]) Return cumulative min over requested axis.
Series.cumprod([axis, dtype, out, skipna]) Return cumulative prod over requested axis.
Series.cumsum([axis, dtype, out, skipna]) Return cumulative sum over requested axis.
Series.describe([percentile_width, ...]) Generate various summary statistics, excluding NaN values.
Series.diff([periods]) 1st discrete difference of object
Series.factorize([sort, na_sentinel]) Encode the object as an enumerated type or categorical variable
Series.kurt([axis, skipna, level, numeric_only]) Return unbiased kurtosis over requested axis
Series.mad([axis, skipna, level]) Return the mean absolute deviation of the values for the requested axis
Series.max([axis, skipna, level, numeric_only]) This method returns the maximum of the values in the object.
Series.mean([axis, skipna, level, numeric_only]) Return the mean of the values for the requested axis
Series.median([axis, skipna, level, ...]) Return the median of the values for the requested axis
Series.min([axis, skipna, level, numeric_only]) This method returns the minimum of the values in the object.
Series.mode() Returns the mode(s) of the dataset.
Series.pct_change([periods, fill_method, ...]) Percent change over given number of periods.
Series.prod([axis, skipna, level, numeric_only]) Return the product of the values for the requested axis
Series.quantile([q]) Return value at the given quantile, a la numpy.percentile.
Series.rank([method, na_option, ascending, pct]) Compute data ranks (1 through n).
Series.sem([axis, skipna, level, ddof]) Return unbiased standard error of the mean over requested axis.
Series.skew([axis, skipna, level, numeric_only]) Return unbiased skew over requested axis
Series.std([axis, skipna, level, ddof]) Return unbiased standard deviation over requested axis.
Series.sum([axis, skipna, level, numeric_only]) Return the sum of the values for the requested axis
Series.var([axis, skipna, level, ddof]) Return unbiased variance over requested axis.
Series.unique() Return array of unique values in the object.
Series.nunique([dropna]) Return number of unique elements in the object.
Series.value_counts([normalize, sort, ...]) Returns object containing counts of unique values.

Reindexing / Selection / Label manipulation

Series.align(other[, join, axis, level, ...]) Align two objects on their axes with the
Series.drop(labels[, axis, level, inplace]) Return new object with labels in requested axis removed
Series.drop_duplicates([take_last, inplace]) Return Series with duplicate values removed
Series.duplicated([take_last]) Return boolean Series denoting duplicate values
Series.equals(other) Determines if two NDFrame objects contain the same elements. NaNs in the
Series.first(offset) Convenience method for subsetting initial periods of time series data
Series.head([n]) Returns first n rows
Series.idxmax([axis, out, skipna]) Index of first occurrence of maximum of values.
Series.idxmin([axis, out, skipna]) Index of first occurrence of minimum of values.
Series.isin(values) Return a boolean Series showing whether each element
Series.last(offset) Convenience method for subsetting final periods of time series data
Series.reindex([index]) Conform Series to new index with optional filling logic, placing
Series.reindex_like(other[, method, copy, limit]) Return an object with indices matching those of other
Series.rename([index]) Alter axes input function or functions.
Series.reset_index([level, drop, name, inplace]) Analogous to the pandas.DataFrame.reset_index() function, see
Series.select(crit[, axis]) Return data corresponding to axis labels matching criteria
Series.take(indices[, axis, convert, is_copy]) return Series corresponding to requested indices
Series.tail([n]) Returns last n rows
Series.truncate([before, after, axis, copy]) Truncates a sorted NDFrame before and/or after some particular

Missing data handling

Series.dropna([axis, inplace]) Return Series without null values
Series.fillna([value, method, axis, ...]) Fill NA/NaN values using the specified method
Series.interpolate([method, axis, limit, ...]) Interpolate values according to different methods.

Reshaping, sorting

Series.argsort([axis, kind, order]) Overrides ndarray.argsort.
Series.order([na_last, ascending, kind, ...]) Sorts Series object, by value, maintaining index-value link.
Series.reorder_levels(order) Rearrange index levels using input order.
Series.sort([axis, ascending, kind, ...]) Sort values and index labels by value.
Series.sort_index([ascending]) Sort object by labels (along an axis)
Series.sortlevel([level, ascending, ...]) Sort Series with MultiIndex by chosen level. Data will be
Series.swaplevel(i, j[, copy]) Swap levels i and j in a MultiIndex
Series.unstack([level]) Unstack, a.k.a.
Series.searchsorted(v[, side, sorter]) Find indices where elements should be inserted to maintain order.

Combining / joining / merging

Series.append(to_append[, verify_integrity]) Concatenate two or more Series. The indexes must not overlap
Series.replace([to_replace, value, inplace, ...]) Replace values given in ‘to_replace’ with ‘value’.
Series.update(other) Modify Series in place using non-NA values from passed

Datetimelike Properties

Series.dt can be used to access the values of the series as datetimelike and return several properties. Due to implementation details the methods show up here as methods of the DatetimeProperties/PeriodProperties/TimedeltaProperties classes. These can be accessed like Series.dt.<property>.
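
A brief sketch of the accessor pattern described above, using a few of the properties listed below. The dates are arbitrary sample values.

```python
import pandas as pd

# Hypothetical Series of datetimes
s = pd.Series(pd.to_datetime(["2014-01-31", "2014-02-01"]))

years = s.dt.year              # year of each element
weekday = s.dt.dayofweek       # Monday=0, Sunday=6
month_end = s.dt.is_month_end  # True where the date is the last day of its month

print(years.tolist(), weekday.tolist(), month_end.tolist())
```

Each property returns a Series aligned with the original index, so the results can be used directly for filtering or grouping.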

Datetime Properties

DatetimeProperties.date Returns numpy array of datetime.date.
DatetimeProperties.time Returns numpy array of datetime.time.
DatetimeProperties.year The year of the datetime
DatetimeProperties.month The month as January=1, December=12
DatetimeProperties.day The days of the datetime
DatetimeProperties.hour The hours of the datetime
DatetimeProperties.minute The minutes of the datetime
DatetimeProperties.second The seconds of the datetime
DatetimeProperties.microsecond The microseconds of the datetime
DatetimeProperties.nanosecond The nanoseconds of the datetime
DatetimeProperties.weekofyear The week ordinal of the year
DatetimeProperties.dayofweek The day of the week with Monday=0, Sunday=6
DatetimeProperties.weekday The day of the week with Monday=0, Sunday=6
DatetimeProperties.dayofyear The ordinal day of the year
DatetimeProperties.quarter The quarter of the date
DatetimeProperties.is_month_start Logical indicating if first day of month (defined by frequency)
DatetimeProperties.is_month_end Logical indicating if last day of month (defined by frequency)
DatetimeProperties.is_quarter_start Logical indicating if first day of quarter (defined by frequency)
DatetimeProperties.is_quarter_end Logical indicating if last day of quarter (defined by frequency)
DatetimeProperties.is_year_start Logical indicating if first day of year (defined by frequency)
DatetimeProperties.is_year_end Logical indicating if last day of year (defined by frequency)

Datetime Methods

DatetimeProperties.to_period(*args, **kwargs) Cast to PeriodIndex at a particular frequency
DatetimeProperties.to_pydatetime()
DatetimeProperties.tz_localize(*args, **kwargs) Localize tz-naive DatetimeIndex to given time zone (using pytz/dateutil),
DatetimeProperties.tz_convert(*args, **kwargs) Convert tz-aware DatetimeIndex from one time zone to another (using pytz/dateutil)

Timedelta Properties

TimedeltaProperties.days The number of integer days for each element
TimedeltaProperties.hours The number of integer hours for each element
TimedeltaProperties.minutes The number of integer minutes for each element
TimedeltaProperties.seconds The number of integer seconds for each element
TimedeltaProperties.milliseconds The number of integer milliseconds for each element
TimedeltaProperties.microseconds The number of integer microseconds for each element
TimedeltaProperties.nanoseconds The number of integer nanoseconds for each element
TimedeltaProperties.components

Timedelta Methods

TimedeltaProperties.to_pytimedelta()

String handling

Series.str can be used to access the values of the series as strings and apply several methods to it. Due to implementation details the methods show up here as methods of the StringMethods class. These can be accessed like Series.str.<function/property>.

StringMethods.cat([others, sep, na_rep]) Concatenate arrays of strings with given separator
StringMethods.center(width) “Center” strings, filling left and right side with additional whitespace
StringMethods.contains(pat[, case, flags, ...]) Check whether given pattern is contained in each string in the array
StringMethods.count(pat[, flags]) Count occurrences of pattern in each string
StringMethods.decode(encoding[, errors]) Decode character string to unicode using indicated encoding
StringMethods.encode(encoding[, errors]) Encode character string to some other encoding using indicated encoding
StringMethods.endswith(pat[, na]) Return boolean array indicating whether each string ends with passed
StringMethods.extract(pat[, flags]) Find groups in each string using passed regular expression
StringMethods.findall(pat[, flags]) Find all occurrences of pattern or regular expression
StringMethods.get(i) Extract element from lists, tuples, or strings in each element in the array
StringMethods.join(sep) Join lists contained as elements in array, a la str.join
StringMethods.len() Compute length of each string in array.
StringMethods.lower() Convert strings in array to lowercase
StringMethods.lstrip([to_strip]) Strip whitespace (including newlines) from left side of each string in the
StringMethods.match(pat[, case, flags, na, ...]) Deprecated: Find groups in each string using passed regular expression.
StringMethods.pad(width[, side]) Pad strings with whitespace
StringMethods.repeat(repeats) Duplicate each string in the array by indicated number of times
StringMethods.replace(pat, repl[, n, case, ...]) Replace
StringMethods.rstrip([to_strip]) Strip whitespace (including newlines) from right side of each string in the
StringMethods.slice([start, stop, step]) Slice substrings from each element in array
StringMethods.slice_replace([i, j]) Slice substrings from each element in array
StringMethods.split([pat, n]) Split each string (a la re.split) in array by given pattern, propagating NA
StringMethods.startswith(pat[, na]) Return boolean array indicating whether each string starts with passed
StringMethods.strip([to_strip]) Strip whitespace (including newlines) from each string in the array
StringMethods.title() Convert strings to titlecased version
StringMethods.upper() Convert strings in array to uppercase
StringMethods.get_dummies([sep]) Split each string by sep and return a frame of dummy/indicator variables.
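
The methods listed above are reached through Series.str and can be chained. The sketch below uses a made-up Series of untidy strings to show the pattern.

```python
import pandas as pd

# Hypothetical Series with stray whitespace and mixed case
s = pd.Series(["  Apple", "banana  "])

cleaned = s.str.strip().str.lower()   # strip whitespace, then lowercase
has_an = cleaned.str.contains("an")   # pattern test on each string
lengths = cleaned.str.len()           # length of each string

print(cleaned.tolist(), has_an.tolist(), lengths.tolist())
```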

Categorical

If the Series is of dtype category, Series.cat can be used to change the categorical data. This accessor is similar to the Series.dt or Series.str and has the following usable methods and properties (all available as Series.cat.<method_or_property>).

Categorical.categories The categories of this categorical.
Categorical.ordered Whether the categories of this categorical are treated as ordered.
Categorical.rename_categories(new_categories) Renames categories.
Categorical.reorder_categories(new_categories) Reorders categories as specified in new_categories.
Categorical.add_categories(new_categories[, ...]) Add new categories.
Categorical.remove_categories(removals[, ...]) Removes the specified categories.
Categorical.remove_unused_categories([inplace]) Removes categories which are not used.
Categorical.set_categories(new_categories[, ...]) Sets the categories to the specified new_categories.
Categorical.codes The category codes of this categorical.

To create a Series of dtype category, use cat = s.astype("category").
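
As a small sketch of that conversion and of the Series.cat accessor, assuming a made-up Series of labels (the names `s`, `cat` and the new category names are illustrative):

```python
import pandas as pd

# Hypothetical Series of string labels
s = pd.Series(["low", "high", "low", "mid"])
cat = s.astype("category")

categories = list(cat.cat.categories)   # unique values, in sorted order
codes = cat.cat.codes.tolist()          # position of each value in categories

# Rename the categories positionally: high -> H, low -> L, mid -> M
renamed = cat.cat.rename_categories(["H", "L", "M"])

print(categories, codes, renamed.tolist())
```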

The following two Categorical constructors are considered API but should only be used when adding ordering information or special categories is needed at creation time of the categorical data:

Categorical(values[, categories, ordered, ...]) Represents a categorical variable in classic R / S-plus fashion
Categorical.from_codes(codes, categories[, ...]) Make a Categorical type from codes and categories arrays.

np.asarray(categorical) works by implementing the array interface. Be aware that this converts the Categorical back to a numpy array, so categories and order information are not preserved!

Categorical.__array__([dtype]) The numpy array interface.
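
The loss of category information can be seen in a short sketch. The values and the unused category "c" are invented for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical Categorical with an explicit category list, including an unused one
cat = pd.Categorical(["b", "a", "b"], categories=["b", "a", "c"])

# Conversion through the array interface yields a plain ndarray of the values
arr = np.asarray(cat)

print(list(cat.categories))  # the Categorical still knows about "c"
print(arr)                   # the ndarray only holds the observed values
```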

Plotting

Series.hist([by, ax, grid, xlabelsize, ...]) Draw histogram of the input series using matplotlib
Series.plot(data[, kind, ax, figsize, ...]) Make plots of Series using matplotlib / pylab.

Serialization / IO / Conversion

Series.from_csv(path[, sep, parse_dates, ...]) Read delimited file into Series
Series.to_pickle(path) Pickle (serialize) object to input file path
Series.to_csv(path[, index, sep, na_rep, ...]) Write Series to a comma-separated values (csv) file
Series.to_dict() Convert Series to {label -> value} dict
Series.to_frame([name]) Convert Series to DataFrame
Series.to_hdf(path_or_buf, key, **kwargs) activate the HDFStore
Series.to_sql(name, con[, flavor, schema, ...]) Write records stored in a DataFrame to a SQL database.
Series.to_msgpack([path_or_buf]) msgpack (serialize) object to input file path
Series.to_json([path_or_buf, orient, ...]) Convert the object to a JSON string.
Series.to_sparse([kind, fill_value]) Convert Series to SparseSeries
Series.to_dense() Return dense representation of NDFrame (as opposed to sparse)
Series.to_string([buf, na_rep, ...]) Render a string representation of the Series
Series.to_clipboard([excel, sep]) Attempt to write text representation of object to the system clipboard

DataFrame

Constructor

DataFrame([data, index, columns, dtype, copy]) Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns).

Attributes and underlying data

Axes

  • index: row labels
  • columns: column labels
DataFrame.as_matrix([columns]) Convert the frame to its Numpy-array representation.
DataFrame.dtypes Return the dtypes in this object
DataFrame.ftypes Return the ftypes (indication of sparse/dense and dtype)
DataFrame.get_dtype_counts() Return the counts of dtypes in this object
DataFrame.get_ftype_counts() Return the counts of ftypes in this object
DataFrame.select_dtypes([include, exclude]) Return a subset of a DataFrame including/excluding columns based on
DataFrame.values Numpy representation of NDFrame
DataFrame.axes
DataFrame.ndim Number of axes / array dimensions
DataFrame.shape

Conversion

DataFrame.astype(dtype[, copy, raise_on_error]) Cast object to input numpy.dtype
DataFrame.convert_objects([convert_dates, ...]) Attempt to infer better dtype for object columns
DataFrame.copy([deep]) Make a copy of this object
DataFrame.isnull() Return a boolean same-sized object indicating if the values are null
DataFrame.notnull() Return a boolean same-sized object indicating if the values are not null

Indexing, iteration

DataFrame.head([n]) Returns first n rows
DataFrame.at
DataFrame.iat
DataFrame.ix
DataFrame.loc
DataFrame.iloc
DataFrame.insert(loc, column, value[, ...]) Insert column into DataFrame at specified location.
DataFrame.__iter__() Iterate over info axis
DataFrame.iteritems() Iterator over (column, series) pairs
DataFrame.iterrows() Iterate over rows of DataFrame as (index, Series) pairs.
DataFrame.itertuples([index]) Iterate over rows of DataFrame as tuples, with index value
DataFrame.lookup(row_labels, col_labels) Label-based “fancy indexing” function for DataFrame.
DataFrame.pop(item) Return item and drop from frame.
DataFrame.tail([n]) Returns last n rows
DataFrame.xs(key[, axis, level, copy, ...]) Returns a cross-section (row(s) or column(s)) from the Series/DataFrame.
DataFrame.isin(values) Return boolean DataFrame showing whether each element in the
DataFrame.query(expr, **kwargs) Query the columns of a frame with a boolean expression.

For more information on .at, .iat, .ix, .loc, and .iloc, see the indexing documentation.

Binary operator functions

DataFrame.add(other[, axis, level, fill_value]) Binary operator add with support to substitute a fill_value for missing data in
DataFrame.sub(other[, axis, level, fill_value]) Binary operator sub with support to substitute a fill_value for missing data in
DataFrame.mul(other[, axis, level, fill_value]) Binary operator mul with support to substitute a fill_value for missing data in
DataFrame.div(other[, axis, level, fill_value]) Binary operator truediv with support to substitute a fill_value for missing data in
DataFrame.truediv(other[, axis, level, ...]) Binary operator truediv with support to substitute a fill_value for missing data in
DataFrame.floordiv(other[, axis, level, ...]) Binary operator floordiv with support to substitute a fill_value for missing data in
DataFrame.mod(other[, axis, level, fill_value]) Binary operator mod with support to substitute a fill_value for missing data in
DataFrame.pow(other[, axis, level, fill_value]) Binary operator pow with support to substitute a fill_value for missing data in
DataFrame.radd(other[, axis, level, fill_value]) Binary operator radd with support to substitute a fill_value for missing data in
DataFrame.rsub(other[, axis, level, fill_value]) Binary operator rsub with support to substitute a fill_value for missing data in
DataFrame.rmul(other[, axis, level, fill_value]) Binary operator rmul with support to substitute a fill_value for missing data in
DataFrame.rdiv(other[, axis, level, fill_value]) Binary operator rtruediv with support to substitute a fill_value for missing data in
DataFrame.rtruediv(other[, axis, level, ...]) Binary operator rtruediv with support to substitute a fill_value for missing data in
DataFrame.rfloordiv(other[, axis, level, ...]) Binary operator rfloordiv with support to substitute a fill_value for missing data in
DataFrame.rmod(other[, axis, level, fill_value]) Binary operator rmod with support to substitute a fill_value for missing data in
DataFrame.rpow(other[, axis, level, fill_value]) Binary operator rpow with support to substitute a fill_value for missing data in
DataFrame.lt(other[, axis, level]) Wrapper for flexible comparison methods lt
DataFrame.gt(other[, axis, level]) Wrapper for flexible comparison methods gt
DataFrame.le(other[, axis, level]) Wrapper for flexible comparison methods le
DataFrame.ge(other[, axis, level]) Wrapper for flexible comparison methods ge
DataFrame.ne(other[, axis, level]) Wrapper for flexible comparison methods ne
DataFrame.eq(other[, axis, level]) Wrapper for flexible comparison methods eq
DataFrame.combine(other, func[, fill_value, ...]) Add two DataFrame objects and do not propagate NaN values, so if for a
DataFrame.combineAdd(other) Add two DataFrame objects and do not propagate
DataFrame.combine_first(other) Combine two DataFrame objects and default to non-null values in frame
DataFrame.combineMult(other) Multiply two DataFrame objects and do not propagate NaN values, so if

Function application, GroupBy

DataFrame.apply(func[, axis, broadcast, ...]) Applies function along input axis of DataFrame.
DataFrame.applymap(func) Apply a function to a DataFrame that is intended to operate
DataFrame.groupby([by, axis, level, ...]) Group series using mapper (dict or key function, apply given function
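
A minimal sketch of apply and groupby, assuming a made-up frame with a `key` and a `val` column; it is meant only to show the call shapes.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"key": ["a", "b", "a", "b"], "val": [1, 2, 3, 4]})

# apply with the default axis calls the function once per column.
col_range = df[["val"]].apply(lambda col: col.max() - col.min())

# groupby splits the rows by key; sum then aggregates within each group.
sums = df.groupby("key")["val"].sum()
```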

Computations / Descriptive Stats

DataFrame.abs() Return an object with absolute value taken.
DataFrame.all([axis, bool_only, skipna, level]) Return whether all elements are True over requested axis.
DataFrame.any([axis, bool_only, skipna, level]) Return whether any element is True over requested axis.
DataFrame.clip([lower, upper, out]) Trim values at input threshold(s)
DataFrame.clip_lower(threshold) Return copy of the input with values below given value truncated
DataFrame.clip_upper(threshold) Return copy of input with values above given value truncated
DataFrame.corr([method, min_periods]) Compute pairwise correlation of columns, excluding NA/null values
DataFrame.corrwith(other[, axis, drop]) Compute pairwise correlation between rows or columns of two DataFrame
DataFrame.count([axis, level, numeric_only]) Return Series with number of non-NA/null observations over requested
DataFrame.cov([min_periods]) Compute pairwise covariance of columns, excluding NA/null values
DataFrame.cummax([axis, dtype, out, skipna]) Return cumulative max over requested axis.
DataFrame.cummin([axis, dtype, out, skipna]) Return cumulative min over requested axis.
DataFrame.cumprod([axis, dtype, out, skipna]) Return cumulative prod over requested axis.
DataFrame.cumsum([axis, dtype, out, skipna]) Return cumulative sum over requested axis.
DataFrame.describe([percentile_width, ...]) Generate various summary statistics, excluding NaN values.
DataFrame.diff([periods]) 1st discrete difference of object
DataFrame.eval(expr, **kwargs) Evaluate an expression in the context of the calling DataFrame
DataFrame.kurt([axis, skipna, level, ...]) Return unbiased kurtosis over requested axis
DataFrame.mad([axis, skipna, level]) Return the mean absolute deviation of the values for the requested axis
DataFrame.max([axis, skipna, level, ...]) This method returns the maximum of the values in the object.
DataFrame.mean([axis, skipna, level, ...]) Return the mean of the values for the requested axis
DataFrame.median([axis, skipna, level, ...]) Return the median of the values for the requested axis
DataFrame.min([axis, skipna, level, ...]) This method returns the minimum of the values in the object.
DataFrame.mode([axis, numeric_only]) Gets the mode of each element along the axis selected.
DataFrame.pct_change([periods, fill_method, ...]) Percent change over given number of periods.
DataFrame.prod([axis, skipna, level, ...]) Return the product of the values for the requested axis
DataFrame.quantile([q, axis, numeric_only]) Return values at the given quantile over requested axis, a la numpy.percentile.
DataFrame.rank([axis, numeric_only, method, ...]) Compute numerical data ranks (1 through n) along axis.
DataFrame.sem([axis, skipna, level, ddof]) Return unbiased standard error of the mean over requested axis.
DataFrame.skew([axis, skipna, level, ...]) Return unbiased skew over requested axis
DataFrame.sum([axis, skipna, level, ...]) Return the sum of the values for the requested axis
DataFrame.std([axis, skipna, level, ddof]) Return unbiased standard deviation over requested axis.
DataFrame.var([axis, skipna, level, ddof]) Return unbiased variance over requested axis.
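
The following sketch shows how a few of these reductions behave on a small made-up frame, including the effect of the ddof argument on var (the default of 1 gives the unbiased estimate).

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [10.0, 20.0, 30.0, 40.0]})

col_means = df.mean()                    # reduces down each column by default
row_sums = df.sum(axis=1)                # axis=1 reduces across each row
running = df.cumsum()                    # cumulative sum down each column
clipped = df.clip(lower=2.0, upper=3.0)  # values outside [2, 3] are trimmed
sample_var = df.var()                    # ddof=1 (unbiased) by default
population_var = df.var(ddof=0)          # divide by n instead of n - 1
```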

Reindexing / Selection / Label manipulation

DataFrame.add_prefix(prefix) Concatenate prefix string with panel items names.
DataFrame.add_suffix(suffix) Concatenate suffix string with panel items names
DataFrame.align(other[, join, axis, level, ...]) Align two objects on their axes with the
DataFrame.drop(labels[, axis, level, inplace]) Return new object with labels in requested axis removed
DataFrame.drop_duplicates(*args, **kwargs) Return DataFrame with duplicate rows removed, optionally only
DataFrame.duplicated(*args, **kwargs) Return boolean Series denoting duplicate rows, optionally only
DataFrame.equals(other) Determines if two NDFrame objects contain the same elements. NaNs in the
DataFrame.filter([items, like, regex, axis]) Restrict the info axis to set of items or wildcard
DataFrame.first(offset) Convenience method for subsetting initial periods of time series data
DataFrame.head([n]) Returns first n rows
DataFrame.idxmax([axis, skipna]) Return index of first occurrence of maximum over requested axis.
DataFrame.idxmin([axis, skipna]) Return index of first occurrence of minimum over requested axis.
DataFrame.last(offset) Convenience method for subsetting final periods of time series data
DataFrame.reindex([index, columns]) Conform DataFrame to new index with optional filling logic, placing
DataFrame.reindex_axis(labels[, axis, ...]) Conform input object to new index with optional filling logic,
DataFrame.reindex_like(other[, method, ...]) return an object with matching indices to myself
DataFrame.rename([index, columns]) Alter axes input function or functions.
DataFrame.reset_index([level, drop, ...]) For DataFrame with multi-level index, return new DataFrame with
DataFrame.select(crit[, axis]) Return data corresponding to axis labels matching criteria
DataFrame.set_index(keys[, drop, append, ...]) Set the DataFrame index (row labels) using one or more existing
DataFrame.tail([n]) Returns last n rows
DataFrame.take(indices[, axis, convert, is_copy]) Analogous to ndarray.take
DataFrame.truncate([before, after, axis, copy]) Truncates a sorted NDFrame before and/or after some particular
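
A short sketch of the label-manipulation methods, assuming a made-up frame with columns `k` and `v`. It shows set_index, reindex (labels absent from the original come back as NaN), rename, drop and idxmax.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"k": ["x", "y", "z"], "v": [3, 9, 5]})

indexed = df.set_index("k")                        # column k becomes the row labels
conformed = indexed.reindex(["z", "x", "w"])       # "w" is new, so its row is NaN
renamed = indexed.rename(columns={"v": "value"})   # returns a new object
dropped = indexed.drop(["y"])                      # removes the row labelled "y"
top = indexed.idxmax()                             # label of the max per column
```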

Missing data handling

DataFrame.dropna([axis, how, thresh, ...]) Return object with labels on given axis omitted where alternately any
DataFrame.fillna([value, method, axis, ...]) Fill NA/NaN values using the specified method
DataFrame.replace([to_replace, value, ...]) Replace values given in ‘to_replace’ with ‘value’.
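
A minimal sketch of the three methods on a made-up frame; the `how` argument of dropna decides whether any or all missing values in a row cause it to be dropped.

```python
import numpy as np
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]})

complete = df.dropna()              # drop rows containing any NA
not_empty = df.dropna(how="all")    # drop only rows that are entirely NA
filled = df.fillna(0)               # replace every NA with a constant
swapped = df.replace(1.0, 100.0)    # replace a specific value
```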

Reshaping, sorting, transposing

DataFrame.pivot([index, columns, values]) Reshape data (produce a “pivot” table) based on column values.
DataFrame.reorder_levels(order[, axis]) Rearrange index levels using input order.
DataFrame.sort([columns, axis, ascending, ...]) Sort DataFrame either by labels (along either axis) or by the values in
DataFrame.sort_index([axis, by, ascending, ...]) Sort DataFrame either by labels (along either axis) or by the values in
DataFrame.sortlevel([level, axis, ...]) Sort multilevel index by chosen axis and primary level.
DataFrame.swaplevel(i, j[, axis]) Swap levels i and j in a MultiIndex on a particular axis
DataFrame.stack([level, dropna]) Pivot a level of the (possibly hierarchical) column labels, returning a
DataFrame.unstack([level]) Pivot a level of the (necessarily hierarchical) index labels, returning
DataFrame.T Transpose index and columns
DataFrame.to_panel() Transform long (stacked) format (DataFrame) into wide (3D, Panel)
DataFrame.transpose() Transpose index and columns
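
The sketch below walks a made-up long-format frame through pivot, stack, unstack and T. Column names (`row`, `col`, `val`) are assumptions chosen for the example.

```python
import pandas as pd

# Illustrative long-format data: one (row, col, val) triple per line.
long = pd.DataFrame({
    "row": ["r1", "r1", "r2", "r2"],
    "col": ["c1", "c2", "c1", "c2"],
    "val": [1, 2, 3, 4],
})

wide = long.pivot(index="row", columns="col", values="val")  # long -> wide
stacked = wide.stack()     # column labels move into the index (a Series here)
back = stacked.unstack()   # innermost index level moves back to columns
flipped = wide.T           # swap index and columns
```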

Combining / joining / merging

DataFrame.append(other[, ignore_index, ...]) Append rows of other to the end of this frame, returning a new object.
DataFrame.join(other[, on, how, lsuffix, ...]) Join columns with other DataFrame either on index or on a key
DataFrame.merge(right[, how, on, left_on, ...]) Merge DataFrame objects by performing a database-style join operation by
DataFrame.update(other[, join, overwrite, ...]) Modify DataFrame in place using non-NA values from passed
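
A hedged sketch of merge, join and update on made-up frames. merge matches on a column, join matches on the index, and update modifies the calling frame in place.

```python
import numpy as np
import pandas as pd

# Illustrative data only.
left = pd.DataFrame({"key": ["a", "b", "c"], "l": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "r": [20, 30, 40]})

inner = left.merge(right, on="key")               # inner join is the default
outer = left.merge(right, on="key", how="outer")  # keep keys from both sides
joined = left.set_index("key").join(right.set_index("key"), how="left")

# update overwrites values in place using the non-NA values of the other frame.
target = pd.DataFrame({"v": [1.0, 2.0]})
target.update(pd.DataFrame({"v": [np.nan, 9.0]}))
```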

Time series-related

DataFrame.asfreq(freq[, method, how, normalize]) Convert all TimeSeries inside to specified frequency using DateOffset
DataFrame.shift([periods, freq, axis]) Shift index by desired number of periods with an optional time freq
DataFrame.first_valid_index() Return label for first non-NA/null value
DataFrame.last_valid_index() Return label for last non-NA/null value
DataFrame.resample(rule[, how, axis, ...]) Convenience method for frequency conversion and resampling of regular time-series data.
DataFrame.to_period([freq, axis, copy]) Convert DataFrame from DatetimeIndex to PeriodIndex with desired
DataFrame.to_timestamp([freq, how, axis, copy]) Cast to DatetimeIndex of timestamps, at beginning of period
DataFrame.tz_convert(tz[, axis, level, copy]) Convert the axis to target time zone.
DataFrame.tz_localize(*args, **kwargs) Localize tz-naive TimeSeries to target time zone
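
A small sketch of shift, the valid-index helpers and tz_localize, assuming a made-up daily series. Passing freq to shift moves the index rather than the values.

```python
import numpy as np
import pandas as pd

# Illustrative daily data with missing values at both ends.
idx = pd.date_range("2014-01-01", periods=4, freq="D")
df = pd.DataFrame({"v": [np.nan, 1.0, 2.0, np.nan]}, index=idx)

lagged = df.shift(1)              # values move down one row, index unchanged
moved = df.shift(1, freq="D")     # index moves forward one day, values unchanged
first = df.first_valid_index()    # label of the first non-NA row
last = df.last_valid_index()      # label of the last non-NA row
utc = df.tz_localize("UTC")       # attach a time zone to the naive index
```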

Plotting

DataFrame.boxplot([column, by, ax, ...]) Make a box plot from DataFrame column optionally grouped by some columns or
DataFrame.hist(data[, column, by, grid, ...]) Draw histogram of the DataFrame’s series using matplotlib / pylab.
DataFrame.plot(data[, x, y, kind, ax, ...]) Make plots of DataFrame using matplotlib / pylab.

Serialization / IO / Conversion

DataFrame.from_csv(path[, header, sep, ...]) Read delimited file into DataFrame
DataFrame.from_dict(data[, orient, dtype]) Construct DataFrame from dict of array-like or dicts
DataFrame.from_items(items[, columns, orient]) Convert (key, value) pairs to DataFrame. The keys will be the axis
DataFrame.from_records(data[, index, ...]) Convert structured or record ndarray to DataFrame
DataFrame.info([verbose, buf, max_cols, ...]) Concise summary of a DataFrame.
DataFrame.to_pickle(path) Pickle (serialize) object to input file path
DataFrame.to_csv(*args, **kwargs) Write DataFrame to a comma-separated values (csv) file
DataFrame.to_hdf(path_or_buf, key, **kwargs) activate the HDFStore
DataFrame.to_sql(name, con[, flavor, ...]) Write records stored in a DataFrame to a SQL database.
DataFrame.to_dict(*args, **kwargs) Convert DataFrame to dictionary.
DataFrame.to_excel(*args, **kwargs) Write DataFrame to an Excel sheet
DataFrame.to_json([path_or_buf, orient, ...]) Convert the object to a JSON string.
DataFrame.to_html([buf, columns, col_space, ...]) Render a DataFrame as an HTML table.
DataFrame.to_latex([buf, columns, ...]) Render a DataFrame to a tabular environment table. You can splice
DataFrame.to_stata(fname[, convert_dates, ...]) A class for writing Stata binary dta files from array-like objects
DataFrame.to_msgpack([path_or_buf]) msgpack (serialize) object to input file path
DataFrame.to_gbq(destination_table[, ...]) Write a DataFrame to a Google BigQuery table.
DataFrame.to_records([index, convert_datetime64]) Convert DataFrame to record array. Index will be put in the
DataFrame.to_sparse([fill_value, kind]) Convert to SparseDataFrame
DataFrame.to_dense() Return dense representation of NDFrame (as opposed to sparse)
DataFrame.to_string([buf, columns, ...]) Render a DataFrame to a console-friendly tabular output.
DataFrame.to_clipboard([excel, sep]) Attempt to write text representation of object to the system clipboard
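
As a sketch of a round trip through some of these converters, the example below writes a made-up frame to an in-memory CSV buffer and reads it back with the top-level read_csv; no files are touched.

```python
import io
import pandas as pd

# Illustrative data only.
df = pd.DataFrame.from_dict({"a": [1, 2], "b": [3, 4]})

# Write CSV text to an in-memory buffer, then parse it again.
buf = io.StringIO()
df.to_csv(buf, index=False)
restored = pd.read_csv(io.StringIO(buf.getvalue()))

as_dict = df.to_dict()                 # {column -> {index label -> value}}
records = df.to_records(index=False)   # NumPy record array, one record per row
```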

Panel

Constructor

Panel([data, items, major_axis, minor_axis, ...]) Represents wide format panel data, stored as 3-dimensional array

Attributes and underlying data

Axes

  • items: axis 0; each item corresponds to a DataFrame contained inside
  • major_axis: axis 1; the index (rows) of each of the DataFrames
  • minor_axis: axis 2; the columns of each of the DataFrames
Panel.values Numpy representation of NDFrame
Panel.axes index(es) of the NDFrame
Panel.ndim Number of axes / array dimensions
Panel.shape tuple of axis dimensions
Panel.dtypes Return the dtypes in this object
Panel.ftypes Return the ftypes (indication of sparse/dense and dtype)
Panel.get_dtype_counts() Return the counts of dtypes in this object
Panel.get_ftype_counts() Return the counts of ftypes in this object

Conversion

Panel.astype(dtype[, copy, raise_on_error]) Cast object to input numpy.dtype
Panel.copy([deep]) Make a copy of this object
Panel.isnull() Return a boolean same-sized object indicating if the values are null
Panel.notnull() Return a boolean same-sized object indicating if the values are not null

Getting and setting

Panel.get_value(*args, **kwargs) Quickly retrieve single value at (item, major, minor) location
Panel.set_value(*args, **kwargs) Quickly set single value at (item, major, minor) location

Indexing, iteration, slicing

Panel.at
Panel.iat
Panel.ix
Panel.loc
Panel.iloc
Panel.__iter__() Iterate over info axis
Panel.iteritems() Iterate over (label, values) on info axis
Panel.pop(item) Return item and drop from frame.
Panel.xs(key[, axis, copy]) Return slice of panel along selected axis
Panel.major_xs(key[, copy]) Return slice of panel along major axis
Panel.minor_xs(key[, copy]) Return slice of panel along minor axis

For more information on .at, .iat, .ix, .loc, and .iloc, see the indexing documentation.

Binary operator functions

Panel.add(other[, axis]) Wrapper method for add
Panel.sub(other[, axis]) Wrapper method for sub
Panel.mul(other[, axis]) Wrapper method for mul
Panel.div(other[, axis]) Wrapper method for truediv
Panel.truediv(other[, axis]) Wrapper method for truediv
Panel.floordiv(other[, axis]) Wrapper method for floordiv
Panel.mod(other[, axis]) Wrapper method for mod
Panel.pow(other[, axis]) Wrapper method for pow
Panel.radd(other[, axis]) Wrapper method for radd
Panel.rsub(other[, axis]) Wrapper method for rsub
Panel.rmul(other[, axis]) Wrapper method for rmul
Panel.rdiv(other[, axis]) Wrapper method for rtruediv
Panel.rtruediv(other[, axis]) Wrapper method for rtruediv
Panel.rfloordiv(other[, axis]) Wrapper method for rfloordiv
Panel.rmod(other[, axis]) Wrapper method for rmod
Panel.rpow(other[, axis]) Wrapper method for rpow
Panel.lt(other) Wrapper for comparison method lt
Panel.gt(other) Wrapper for comparison method gt
Panel.le(other) Wrapper for comparison method le
Panel.ge(other) Wrapper for comparison method ge
Panel.ne(other) Wrapper for comparison method ne
Panel.eq(other) Wrapper for comparison method eq

Function application, GroupBy

Panel.apply(func[, axis]) Applies function along input axis of the Panel
Panel.groupby(function[, axis]) Group data on given axis, returning GroupBy object

Computations / Descriptive Stats

Panel.abs() Return an object with absolute value taken.
Panel.clip([lower, upper, out]) Trim values at input threshold(s)
Panel.clip_lower(threshold) Return copy of the input with values below given value truncated
Panel.clip_upper(threshold) Return copy of input with values above given value truncated
Panel.count([axis]) Return number of observations over requested axis.
Panel.cummax([axis, dtype, out, skipna]) Return cumulative max over requested axis.
Panel.cummin([axis, dtype, out, skipna]) Return cumulative min over requested axis.
Panel.cumprod([axis, dtype, out, skipna]) Return cumulative prod over requested axis.
Panel.cumsum([axis, dtype, out, skipna]) Return cumulative sum over requested axis.
Panel.max([axis, skipna, level, numeric_only]) This method returns the maximum of the values in the object.
Panel.mean([axis, skipna, level, numeric_only]) Return the mean of the values for the requested axis
Panel.median([axis, skipna, level, numeric_only]) Return the median of the values for the requested axis
Panel.min([axis, skipna, level, numeric_only]) This method returns the minimum of the values in the object.
Panel.pct_change([periods, fill_method, ...]) Percent change over given number of periods.
Panel.prod([axis, skipna, level, numeric_only]) Return the product of the values for the requested axis
Panel.sem([axis, skipna, level, ddof]) Return unbiased standard error of the mean over requested axis.
Panel.skew([axis, skipna, level, numeric_only]) Return unbiased skew over requested axis
Panel.sum([axis, skipna, level, numeric_only]) Return the sum of the values for the requested axis
Panel.std([axis, skipna, level, ddof]) Return unbiased standard deviation over requested axis.
Panel.var([axis, skipna, level, ddof]) Return unbiased variance over requested axis.

Reindexing / Selection / Label manipulation

Panel.add_prefix(prefix) Concatenate prefix string with panel items names.
Panel.add_suffix(suffix) Concatenate suffix string with panel items names
Panel.drop(labels[, axis, level, inplace]) Return new object with labels in requested axis removed
Panel.equals(other) Determines if two NDFrame objects contain the same elements. NaNs in the
Panel.filter([items, like, regex, axis]) Restrict the info axis to set of items or wildcard
Panel.first(offset) Convenience method for subsetting initial periods of time series data
Panel.last(offset) Convenience method for subsetting final periods of time series data
Panel.reindex([items, major_axis, minor_axis]) Conform Panel to new index with optional filling logic, placing
Panel.reindex_axis(labels[, axis, method, ...]) Conform input object to new index with optional filling logic,
Panel.reindex_like(other[, method, copy, limit]) return an object with matching indices to myself
Panel.rename([items, major_axis, minor_axis]) Alter axes input function or functions.
Panel.select(crit[, axis]) Return data corresponding to axis labels matching criteria
Panel.take(indices[, axis, convert, is_copy]) Analogous to ndarray.take
Panel.truncate([before, after, axis, copy]) Truncates a sorted NDFrame before and/or after some particular

Missing data handling

Panel.dropna([axis, how, inplace]) Drop 2D from panel, holding passed axis constant
Panel.fillna([value, method, axis, inplace, ...]) Fill NA/NaN values using the specified method

Reshaping, sorting, transposing

Panel.sort_index([axis, ascending]) Sort object by labels (along an axis)
Panel.swaplevel(i, j[, axis]) Swap levels i and j in a MultiIndex on a particular axis
Panel.transpose(*args, **kwargs) Permute the dimensions of the Panel
Panel.swapaxes(axis1, axis2[, copy]) Interchange axes and swap values axes appropriately
Panel.conform(frame[, axis]) Conform input DataFrame to align with chosen axis pair.

Combining / joining / merging

Panel.join(other[, how, lsuffix, rsuffix]) Join items with other Panel either on major and minor axes column
Panel.update(other[, join, overwrite, ...]) Modify Panel in place using non-NA values from passed

Time series-related

Panel.asfreq(freq[, method, how, normalize]) Convert all TimeSeries inside to specified frequency using DateOffset
Panel.shift(*args, **kwargs) Shift major or minor axis by specified number of leads/lags.
Panel.resample(rule[, how, axis, ...]) Convenience method for frequency conversion and resampling of regular time-series data.
Panel.tz_convert(tz[, axis, level, copy]) Convert the axis to target time zone.
Panel.tz_localize(*args, **kwargs) Localize tz-naive TimeSeries to target time zone

Serialization / IO / Conversion

Panel.from_dict(data[, intersect, orient, dtype]) Construct Panel from dict of DataFrame objects
Panel.to_pickle(path) Pickle (serialize) object to input file path
Panel.to_excel(path[, na_rep, engine]) Write each DataFrame in Panel to a separate excel sheet
Panel.to_hdf(path_or_buf, key, **kwargs) activate the HDFStore
Panel.to_json([path_or_buf, orient, ...]) Convert the object to a JSON string.
Panel.to_sparse([fill_value, kind]) Convert to SparsePanel
Panel.to_frame([filter_observations]) Transform wide format into long (stacked) format as DataFrame whose
Panel.to_clipboard([excel, sep]) Attempt to write text representation of object to the system clipboard

Panel4D

Constructor

Panel4D([data, labels, items, major_axis, ...]) Represents a 4 dimensional structured

Attributes and underlying data

Axes

  • labels: axis 0; each label corresponds to a Panel contained inside
  • items: axis 1; each item corresponds to a DataFrame contained inside
  • major_axis: axis 2; the index (rows) of each of the DataFrames
  • minor_axis: axis 3; the columns of each of the DataFrames
Panel4D.values Numpy representation of NDFrame
Panel4D.axes index(es) of the NDFrame
Panel4D.ndim Number of axes / array dimensions
Panel4D.shape tuple of axis dimensions
Panel4D.dtypes Return the dtypes in this object
Panel4D.ftypes Return the ftypes (indication of sparse/dense and dtype)
Panel4D.get_dtype_counts() Return the counts of dtypes in this object
Panel4D.get_ftype_counts() Return the counts of ftypes in this object

Conversion

Panel4D.astype(dtype[, copy, raise_on_error]) Cast object to input numpy.dtype
Panel4D.copy([deep]) Make a copy of this object
Panel4D.isnull() Return a boolean same-sized object indicating if the values are null
Panel4D.notnull() Return a boolean same-sized object indicating if the values are not null

Index

Many of these methods, or variants thereof, are available on the objects that contain an index (Series/DataFrame); in most cases those should be used rather than calling these methods directly.

Index Immutable ndarray implementing an ordered, sliceable set.

Attributes

Index.values return the underlying data as an ndarray
Index.is_monotonic return if the index has monotonic (only equal or increasing) values
Index.is_unique
Index.dtype
Index.inferred_type
Index.is_all_dates
Index.shape return a tuple of the shape of the underlying data
Index.size return the number of elements in the underlying data
Index.nbytes return the number of bytes in the underlying data
Index.ndim return the number of dimensions of the underlying data, by definition 1
Index.strides return the strides of the underlying data
Index.itemsize return the size of the dtype of the item of the underlying data
Index.base return the base object if the memory of the underlying data is shared
Index.T return the transpose, which is by definition self

Modifying and Computations

Index.all([axis, out]) Returns True if all elements evaluate to True.
Index.any([axis, out]) Returns True if any of the elements of a evaluate to True.
Index.argmin([axis]) return an ndarray of the minimum argument indexer
Index.argmax([axis]) return an ndarray of the maximum argument indexer
Index.copy([names, name, dtype, deep]) Make a copy of this object.
Index.delete(loc) Make new Index with passed location(-s) deleted
Index.diff(*args, **kwargs)
Index.sym_diff(other[, result_name]) Compute the sorted symmetric difference of two Index objects.
Index.drop(labels) Make new Index with passed list of labels deleted
Index.drop_duplicates([take_last]) Return Index with duplicate values removed
Index.duplicated([take_last]) Return boolean Index denoting duplicate values
Index.equals(other) Determines if two Index objects contain the same elements.
Index.factorize([sort, na_sentinel]) Encode the object as an enumerated type or categorical variable
Index.identical(other) Similar to equals, but check that other comparable attributes are
Index.insert(loc, item) Make new Index inserting new item at location. Follows
Index.min() The minimum value of the object
Index.max() The maximum value of the object
Index.order([return_indexer, ascending]) Return sorted copy of Index
Index.reindex(target[, method, level, limit]) Create index with target’s values (move/add/delete values as necessary)
Index.repeat(n) return a new Index of the values repeated n times
Index.take(indexer[, axis]) return a new Index of the values selected by the indexer
Index.putmask(mask, value) return a new Index of the values set with the mask
Index.set_names(names[, level, inplace]) Set new names on index.
Index.unique() Return array of unique values in the object.
Index.nunique([dropna]) Return number of unique elements in the object.
Index.value_counts([normalize, sort, ...]) Returns object containing counts of unique values.

Conversion

Index.astype(dtype)
Index.tolist() return a list of the Index values
Index.to_datetime([dayfirst]) For an Index containing strings or datetime.datetime objects, attempt
Index.to_series(**kwargs) Create a Series with both index and values equal to the index keys

Sorting

Index.argsort(*args, **kwargs) return an ndarray indexer of the underlying data
Index.order([return_indexer, ascending]) Return sorted copy of Index
Index.sort(*args, **kwargs)

Time-specific operations

Index.shift([periods, freq]) Shift Index containing datetime objects by input number of periods and

Combining / joining / merging

Index.append(other) Append a collection of Index objects together
Index.intersection(other) Form the intersection of two Index objects. Sortedness of the result is
Index.join(other[, how, level, return_indexers]) Internal API method. Compute join_index and indexers to conform data
Index.union(other) Form the union of two Index objects and sorts if possible
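
A minimal sketch of the set-style operations on two made-up integer indexes. Note that append simply concatenates and keeps duplicates, while union and intersection treat the indexes as sets.

```python
import pandas as pd

# Illustrative indexes only.
left = pd.Index([1, 2, 3])
right = pd.Index([3, 4, 5])

both = left.union(right)            # every label that appears in either index
common = left.intersection(right)   # only labels present in both
chained = left.append(right)        # plain concatenation, duplicates kept
```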

Selecting

Index.get_indexer(target[, method, limit]) Compute indexer and mask for new index given the current index.
Index.get_indexer_non_unique(target, **kwargs) return an indexer suitable for taking from a non unique index
Index.get_level_values(level) Return vector of label values for requested level, equal to the length
Index.get_loc(key) Get integer location for requested label
Index.get_value(series, key) Fast lookup of value from 1-dimensional ndarray.
Index.isin(values[, level]) Compute boolean array of whether each index value is found in the
Index.slice_indexer([start, end, step]) For an ordered Index, compute the slice indexer for input labels and
Index.slice_locs([start, end]) For an ordered Index, compute the slice locations for input labels
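
The lookup methods above translate labels into integer positions. A small sketch, assuming a made-up sorted index of strings:

```python
import pandas as pd

# Illustrative index only.
idx = pd.Index(["a", "b", "c", "d"])

pos = idx.get_loc("c")                  # integer position of one label
indexer = idx.get_indexer(["b", "z"])   # -1 marks labels that are not found
member = idx.isin(["a", "d"])           # boolean array, one entry per label
bounds = idx.slice_locs("b", "c")       # (start, stop) positions, stop exclusive
```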

DatetimeIndex

DatetimeIndex Immutable ndarray of datetime64 data, represented internally as int64, and

Time/Date Components

DatetimeIndex.year The year of the datetime
DatetimeIndex.month The month as January=1, December=12
DatetimeIndex.day The days of the datetime
DatetimeIndex.hour The hours of the datetime
DatetimeIndex.minute The minutes of the datetime
DatetimeIndex.second The seconds of the datetime
DatetimeIndex.microsecond The microseconds of the datetime
DatetimeIndex.nanosecond The nanoseconds of the datetime
DatetimeIndex.date Returns numpy array of datetime.date.
DatetimeIndex.time Returns numpy array of datetime.time.
DatetimeIndex.dayofyear The ordinal day of the year
DatetimeIndex.weekofyear The week ordinal of the year
DatetimeIndex.week The week ordinal of the year
DatetimeIndex.dayofweek The day of the week with Monday=0, Sunday=6
DatetimeIndex.weekday The day of the week with Monday=0, Sunday=6
DatetimeIndex.quarter The quarter of the date
DatetimeIndex.tz
DatetimeIndex.freq get/set the frequency of the Index
DatetimeIndex.freqstr return the frequency object as a string if it's set, otherwise None
DatetimeIndex.is_month_start Logical indicating if first day of month (defined by frequency)
DatetimeIndex.is_month_end Logical indicating if last day of month (defined by frequency)
DatetimeIndex.is_quarter_start Logical indicating if first day of quarter (defined by frequency)
DatetimeIndex.is_quarter_end Logical indicating if last day of quarter (defined by frequency)
DatetimeIndex.is_year_start Logical indicating if first day of year (defined by frequency)
DatetimeIndex.is_year_end Logical indicating if last day of year (defined by frequency)
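
A short sketch of a few component accessors on a made-up two-element DatetimeIndex. Results are converted with list() because the container type returned by these accessors has varied between pandas versions.

```python
import pandas as pd

# Illustrative dates: a Friday at month end and the following Saturday.
dti = pd.DatetimeIndex(["2014-01-31", "2014-02-01"])

years = list(dti.year)
months = list(dti.month)                # January=1, December=12
weekdays = list(dti.dayofweek)          # Monday=0, Sunday=6
month_end = list(dti.is_month_end)
month_start = list(dti.is_month_start)
```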

Selecting

DatetimeIndex.indexer_at_time(time[, asof]) Select values at particular time of day (e.g.
DatetimeIndex.indexer_between_time(...[, ...]) Select values between particular times of day (e.g., 9:00-9:30AM)

Time-specific operations

DatetimeIndex.normalize() Return DatetimeIndex with times to midnight. Length is unaltered
DatetimeIndex.snap([freq]) Snap time stamps to nearest occurring frequency
DatetimeIndex.tz_convert(tz) Convert tz-aware DatetimeIndex from one time zone to another (using pytz/dateutil)
DatetimeIndex.tz_localize(*args, **kwargs) Localize tz-naive DatetimeIndex to given time zone (using pytz/dateutil),

Conversion

DatetimeIndex.to_datetime([dayfirst])
DatetimeIndex.to_period([freq]) Cast to PeriodIndex at a particular frequency
DatetimeIndex.to_pydatetime() Return DatetimeIndex as object ndarray of datetime.datetime objects
DatetimeIndex.to_series([keep_tz]) Create a Series with both index and values equal to the index keys
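
A minimal sketch of normalize together with two of the conversions, using a single made-up timestamp:

```python
import datetime
import pandas as pd

# Illustrative timestamp only.
dti = pd.DatetimeIndex(["2014-03-15 13:45"])

midnight = dti.normalize()       # same date, time set to 00:00
monthly = dti.to_period("M")     # PeriodIndex at monthly frequency
native = dti.to_pydatetime()     # object array of datetime.datetime
```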

TimedeltaIndex

TimedeltaIndex Immutable ndarray of timedelta64 data, represented internally as int64, and

Components

TimedeltaIndex.days The number of integer days for each element
TimedeltaIndex.hours The number of integer hours for each element
TimedeltaIndex.minutes The number of integer minutes for each element
TimedeltaIndex.seconds The number of integer seconds for each element
TimedeltaIndex.milliseconds The number of integer milliseconds for each element
TimedeltaIndex.microseconds The number of integer microseconds for each element
TimedeltaIndex.nanoseconds The number of integer nanoseconds for each element
TimedeltaIndex.components Return a dataframe of the components of the Timedeltas

Conversion

TimedeltaIndex.to_pytimedelta() Return TimedeltaIndex as object ndarray of datetime.timedelta objects
TimedeltaIndex.to_series(**kwargs) Create a Series with both index and values equal to the index keys
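
A hedged sketch using days, components and to_pytimedelta on made-up durations. It deliberately sticks to these three, on the assumption that the meaning or availability of the other per-unit accessors may differ between pandas versions.

```python
import datetime
import pandas as pd

# Illustrative durations only.
tdi = pd.to_timedelta(["1 days 02:03:04", "0 days 00:00:30"])

days = list(tdi.days)           # whole days in each element
parts = tdi.components          # DataFrame with one column per unit
native = tdi.to_pytimedelta()   # object array of datetime.timedelta
```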

GroupBy

GroupBy objects are returned by groupby calls: pandas.DataFrame.groupby(), pandas.Series.groupby(), etc.

Indexing, iteration

GroupBy.__iter__() Groupby iterator
GroupBy.groups dict {group name -> group labels}
GroupBy.indices dict {group name -> group indices}
GroupBy.get_group(name[, obj]) Constructs NDFrame from group with provided name
Grouper([key, level, freq, axis, sort]) A Grouper allows the user to specify a groupby instruction for a target object
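
A small sketch of iterating over a GroupBy object and inspecting its groups, assuming a made-up frame grouped by a single column `k`:

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"k": ["a", "b", "a"], "v": [1, 2, 3]})
g = df.groupby("k")

# Iteration yields (group name, sub-frame) pairs.
sizes = {name: len(frame) for name, frame in g}

labels_a = list(g.groups["a"])        # index labels belonging to group "a"
positions_a = list(g.indices["a"])    # integer positions belonging to group "a"
frame_a = g.get_group("a")            # the rows of group "a" as a DataFrame
```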

Function application

GroupBy.apply(func, *args, **kwargs) Apply function and combine results together in an intelligent way.
GroupBy.aggregate(func, *args, **kwargs)
GroupBy.transform(func, *args, **kwargs)
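
The three methods differ mainly in the shape of what they return. A sketch under made-up data: aggregate reduces each group to one value, transform returns a result aligned with the original rows, and apply combines whatever the function returns.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"k": ["a", "b", "a"], "v": [1, 2, 3]})
g = df.groupby("k")["v"]

totals = g.aggregate("sum")                     # one value per group
centered = g.transform(lambda s: s - s.mean())  # same length as the input
spread = g.apply(lambda s: s.max() - s.min())   # function result per group
```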

Computations / Descriptive Stats

GroupBy.count([axis])
GroupBy.cumcount(**kwargs) Number each item in each group from 0 to the length of that group - 1.
GroupBy.first() Compute first of group values
GroupBy.head([n]) Returns first n rows of each group.
GroupBy.last() Compute last of group values
GroupBy.max() Compute max of group values
GroupBy.mean() Compute mean of groups, excluding missing values
GroupBy.median() Compute median of groups, excluding missing values
GroupBy.min() Compute min of group values
GroupBy.nth(n[, dropna]) Take the nth row from each group if n is an int, or a subset of rows if n is a list of ints.
GroupBy.ohlc() Compute open, high, low and close values of each group, excluding missing values
GroupBy.prod() Compute prod of group values
GroupBy.size() Compute group sizes
GroupBy.sem([ddof]) Compute standard error of the mean of groups, excluding missing values
GroupBy.std([ddof]) Compute standard deviation of groups, excluding missing values
GroupBy.sum() Compute sum of group values
GroupBy.var([ddof]) Compute variance of groups, excluding missing values
GroupBy.tail([n]) Returns last n rows of each group
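
A minimal sketch of a few of these per-group computations on a made-up frame. size and first return one row per group, while cumcount and head keep the original row labels.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"k": ["a", "b", "a"], "v": [1, 2, 3]})
g = df.groupby("k")

sizes = g.size()          # number of rows in each group
order = g.cumcount()      # 0-based position of each row within its group
firsts = g.first()        # first value of each column per group
heads = g.head(1)         # first row of each group, original index kept
means = g["v"].mean()     # per-group mean
```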

The following methods are available in both SeriesGroupBy and DataFrameGroupBy objects, but may differ slightly: the DataFrameGroupBy version usually permits the specification of an axis argument, and often an argument indicating whether to restrict application to columns of a specific data type.

DataFrameGroupBy.bfill([axis, inplace, ...]) Synonym for NDFrame.fillna(method='bfill')
DataFrameGroupBy.cummax([axis, dtype, out, ...]) Return cumulative max over requested axis.
DataFrameGroupBy.cummin([axis, dtype, out, ...]) Return cumulative min over requested axis.
DataFrameGroupBy.cumprod([axis, dtype, out, ...]) Return cumulative prod over requested axis.
DataFrameGroupBy.cumsum([axis, dtype, out, ...]) Return cumulative sum over requested axis.
DataFrameGroupBy.describe([...]) Generate various summary statistics, excluding NaN values.
DataFrameGroupBy.all([axis, bool_only, ...]) Return whether all elements are True over requested axis.
DataFrameGroupBy.any([axis, bool_only, ...]) Return whether any element is True over requested axis.
DataFrameGroupBy.corr([method, min_periods]) Compute pairwise correlation of columns, excluding NA/null values
DataFrameGroupBy.cov([min_periods]) Compute pairwise covariance of columns, excluding NA/null values
DataFrameGroupBy.diff([periods]) 1st discrete difference of object
DataFrameGroupBy.ffill([axis, inplace, ...]) Synonym for NDFrame.fillna(method='ffill')
DataFrameGroupBy.fillna([value, method, ...]) Fill NA/NaN values using the specified method
DataFrameGroupBy.hist(data[, column, by, ...]) Draw histogram of the DataFrame’s series using matplotlib / pylab.
DataFrameGroupBy.idxmax([axis, skipna]) Return index of first occurrence of maximum over requested axis.
DataFrameGroupBy.idxmin([axis, skipna]) Return index of first occurrence of minimum over requested axis.
DataFrameGroupBy.irow(i[, copy])
DataFrameGroupBy.mad([axis, skipna, level]) Return the mean absolute deviation of the values for the requested axis
DataFrameGroupBy.pct_change([periods, ...]) Percent change over given number of periods.
DataFrameGroupBy.plot(data[, x, y, kind, ...]) Make plots of DataFrame using matplotlib / pylab.
DataFrameGroupBy.quantile([q, axis, ...]) Return values at the given quantile over requested axis, a la numpy.percentile.
DataFrameGroupBy.rank([axis, numeric_only, ...]) Compute numerical data ranks (1 through n) along axis.
DataFrameGroupBy.resample(rule[, how, axis, ...]) Convenience method for frequency conversion and resampling of regular time-series data.
DataFrameGroupBy.shift([periods, freq, axis]) Shift index by desired number of periods with an optional time freq
DataFrameGroupBy.skew([axis, skipna, level, ...]) Return unbiased skew over requested axis
DataFrameGroupBy.take(indices[, axis, ...]) Analogous to ndarray.take
DataFrameGroupBy.tshift([periods, freq, axis]) Shift the time index, using the index’s frequency if available
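
As a rough illustration of how one of these shared methods behaves on the two kinds of grouped object, the sketch below calls cumsum on a DataFrameGroupBy and on a SeriesGroupBy. The frame and its column names (key, x, y) are made up for the example, and only the default arguments are used, since the optional arguments vary between pandas versions.

```python
import pandas as pd

# Hypothetical data: "key" is the grouping column, "x" and "y" are values.
df = pd.DataFrame({
    "key": ["a", "a", "b", "b"],
    "x": [1, 2, 3, 4],
    "y": [10.0, 20.0, 30.0, 40.0],
})

# DataFrameGroupBy: the cumulative sum runs within each group for every
# non-grouping column, and the result is a DataFrame.
frame_cumsum = df.groupby("key").cumsum()

# SeriesGroupBy: selecting one column first yields a Series result.
series_cumsum = df.groupby("key")["x"].cumsum()

print(frame_cumsum)
print(series_cumsum)
```

Both results keep the index of the original frame, so they can be assigned back as new columns.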

The following methods are available only for SeriesGroupBy objects.

SeriesGroupBy.nlargest([n, take_last]) Return the largest n elements.
SeriesGroupBy.nsmallest([n, take_last]) Return the smallest n elements.
SeriesGroupBy.nunique([dropna]) Return number of unique elements in the object.
SeriesGroupBy.unique() Return array of unique values in the object.
SeriesGroupBy.value_counts([normalize, ...]) Returns object containing counts of unique values.
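
A minimal sketch of two of the SeriesGroupBy methods listed above, nunique and value_counts, using only their default arguments. The data and the column names (key, val) are invented for the example.

```python
import pandas as pd

# Hypothetical data: group "a" holds the values 1, 1, 2; group "b" holds 3, 3.
df = pd.DataFrame({
    "key": ["a", "a", "a", "b", "b"],
    "val": [1, 1, 2, 3, 3],
})

grouped = df.groupby("key")["val"]  # a SeriesGroupBy object

# Number of distinct values within each group, indexed by group key.
distinct = grouped.nunique()

# Occurrences of each value within each group, indexed by (key, val).
counts = grouped.value_counts()

print(distinct)
print(counts)
```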

The following methods are available only for DataFrameGroupBy objects.

DataFrameGroupBy.corrwith(other[, axis, drop]) Compute pairwise correlation between rows or columns of two DataFrame objects.
DataFrameGroupBy.boxplot(grouped[, ...]) Make box plots from DataFrameGroupBy data.

General utility functions

Working with options

describe_option(pat[, _print_desc]) Prints the description for one or more registered options.
reset_option(pat) Reset one or more options to their default value.
get_option(pat) Retrieves the value of the specified option.
set_option(pat, value) Sets the value of the specified option.
option_context(*args) Context manager to temporarily set options in the with statement context.
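
The option functions above might be combined as in the sketch below. It uses the display.max_rows option purely as an example; the values 5 and 20 are arbitrary.

```python
import pandas as pd

# Read the current value of an option.
original = pd.get_option("display.max_rows")

# Temporarily override the option; the previous value is restored
# when the with block exits.
with pd.option_context("display.max_rows", 5):
    inside = pd.get_option("display.max_rows")
after = pd.get_option("display.max_rows")

# Change the option until it is explicitly reset.
pd.set_option("display.max_rows", 20)
changed = pd.get_option("display.max_rows")

# Return the option to its default value.
pd.reset_option("display.max_rows")
restored = pd.get_option("display.max_rows")

print(original, inside, after, changed, restored)
```

option_context is the safer choice when a setting is only needed for a few statements, because the earlier value comes back even if an exception is raised inside the block.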