API Reference¶
This page gives an overview of all public pandas objects, functions and
methods. In general, all classes and functions exposed in the top-level
pandas.* namespace are regarded as public.
Some of the subpackages are public as well, including pandas.errors,
pandas.plotting, and pandas.testing. Certain functions in the
pandas.io and pandas.tseries submodules are also public (those
mentioned in the documentation). In addition, the pandas.api.types subpackage
holds some public functions related to data types in pandas.
Warning
The pandas.core, pandas.compat, and pandas.util top-level modules are considered to be PRIVATE. Stability of functionality in those modules is not guaranteed.
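As a rough illustration of the public namespaces named above, the following sketch imports only from pandas, pandas.api.types, pandas.errors and pandas.testing. The sample data and variable names are made up for the example.

```python
import io

import pandas as pd
from pandas.api.types import is_numeric_dtype
from pandas.errors import EmptyDataError
from pandas.testing import assert_series_equal

s = pd.Series([1, 2, 3])

# pandas.api.types: dtype introspection helpers
numeric = is_numeric_dtype(s)

# pandas.testing: assertion helpers (raises AssertionError on mismatch)
assert_series_equal(s, s.copy())

# pandas.errors: exception classes raised by pandas
try:
    pd.read_csv(io.StringIO(""))
    caught = False
except EmptyDataError:
    caught = True
```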
Input/Output¶
Pickling¶
read_pickle (path[, compression]) |
Load pickled pandas object (or any other pickled object) from the specified |
Flat File¶
read_table (filepath_or_buffer[, sep, ...]) |
Read general delimited file into DataFrame |
read_csv (filepath_or_buffer[, sep, ...]) |
Read CSV (comma-separated) file into DataFrame |
read_fwf (filepath_or_buffer[, colspecs, widths]) |
Read a table of fixed-width formatted lines into DataFrame |
read_msgpack (path_or_buf[, encoding, iterator]) |
Load msgpack pandas object from the specified |
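A minimal sketch of the flat-file readers, using in-memory buffers in place of real files. The column names and values are invented for illustration.

```python
import io

import pandas as pd

csv_text = "name,score\nalice,90\nbob,85\n"
tsv_text = "name\tscore\nalice\t90\nbob\t85\n"

# read_csv defaults to a comma separator
df_csv = pd.read_csv(io.StringIO(csv_text))

# read_table defaults to a tab separator
df_tab = pd.read_table(io.StringIO(tsv_text))

# A different delimiter can be passed through sep
df_semicolon = pd.read_csv(io.StringIO("a;b\n1;2\n"), sep=";")
```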
Clipboard¶
read_clipboard ([sep]) |
Read text from clipboard and pass to read_table. |
Excel¶
read_excel (io[, sheet_name, header, ...]) |
Read an Excel table into a pandas DataFrame |
ExcelFile.parse ([sheet_name, header, ...]) |
Parse specified sheet(s) into a DataFrame |
JSON¶
read_json ([path_or_buf, orient, typ, dtype, ...]) |
Convert a JSON string to pandas object |
json_normalize (data[, record_path, meta, ...]) |
“Normalize” semi-structured JSON data into a flat table |
build_table_schema (data[, index, ...]) |
Create a Table schema from data. |
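A small sketch of read_json, assuming the JSON text is a list of records and is wrapped in a file-like buffer. The payload is made up for the example.

```python
import io

import pandas as pd

json_text = '[{"a": 1, "b": "x"}, {"a": 2, "b": "y"}]'

# orient="records" states explicitly that each list item is one row
df = pd.read_json(io.StringIO(json_text), orient="records")
```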
HTML¶
read_html (io[, match, flavor, header, ...]) |
Read HTML tables into a list of DataFrame objects. |
HDFStore: PyTables (HDF5)¶
read_hdf (path_or_buf[, key, mode]) |
Read from the store, closing it if we opened it |
HDFStore.put (key, value[, format, append]) |
Store object in HDFStore |
HDFStore.append (key, value[, format, ...]) |
Append to Table in file. |
HDFStore.get (key) |
Retrieve pandas object stored in file |
HDFStore.select (key[, where, start, stop, ...]) |
Retrieve pandas object stored in file, optionally based on where |
HDFStore.info () |
print detailed information on the store |
Feather¶
read_feather (path[, nthreads]) |
Load a feather-format object from the file path |
Parquet¶
read_parquet (path[, engine]) |
Load a parquet object from the file path, returning a DataFrame. |
SAS¶
read_sas (filepath_or_buffer[, format, ...]) |
Read SAS files stored as either XPORT or SAS7BDAT format files. |
SQL¶
read_sql_table (table_name, con[, schema, ...]) |
Read SQL database table into a DataFrame. |
read_sql_query (sql, con[, index_col, ...]) |
Read SQL query into a DataFrame. |
read_sql (sql, con[, index_col, ...]) |
Read SQL query or database table into a DataFrame. |
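The SQL readers can be tried against an in-memory SQLite database from the standard library. This is only a sketch: the table name and rows are hypothetical, and read_sql_table is not shown.

```python
import sqlite3

import pandas as pd

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (x INTEGER, y INTEGER)")
conn.executemany("INSERT INTO points VALUES (?, ?)", [(1, 2), (3, 4)])
conn.commit()

# read_sql_query runs a SELECT and returns the result as a DataFrame
df = pd.read_sql_query("SELECT x, y FROM points ORDER BY x", conn)

# read_sql accepts the same query string
df_again = pd.read_sql("SELECT x, y FROM points ORDER BY x", conn)

conn.close()
```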
STATA¶
read_stata (filepath_or_buffer[, ...]) |
Read Stata file into DataFrame |
StataReader.data (**kwargs) |
Reads observations from Stata file, converting them into a dataframe |
StataReader.data_label () |
Returns data label of Stata file |
StataReader.value_labels () |
Returns a dict, associating each variable name a dict, associating |
StataReader.variable_labels () |
Returns variable labels as a dict, associating each variable name |
StataWriter.write_file () |
General functions¶
Data manipulations¶
melt (frame[, id_vars, value_vars, var_name, ...]) |
“Unpivots” a DataFrame from wide format to long format, optionally |
pivot (index, columns, values) |
Produce ‘pivot’ table based on 3 columns of this DataFrame. |
pivot_table (data[, values, index, columns, ...]) |
Create a spreadsheet-style pivot table as a DataFrame. |
crosstab (index, columns[, values, rownames, ...]) |
Compute a simple cross-tabulation of two (or more) factors. |
cut (x, bins[, right, labels, retbins, ...]) |
Return indices of half-open bins to which each value of x belongs. |
qcut (x, q[, labels, retbins, precision, ...]) |
Quantile-based discretization function. |
merge (left, right[, how, on, left_on, ...]) |
Merge DataFrame objects by performing a database-style join operation by columns or indexes. |
merge_ordered (left, right[, on, left_on, ...]) |
Perform merge with optional filling/interpolation designed for ordered data like time series data. |
merge_asof (left, right[, on, left_on, ...]) |
Perform an asof merge. |
concat (objs[, axis, join, join_axes, ...]) |
Concatenate pandas objects along a particular axis with optional set logic along the other axes. |
get_dummies (data[, prefix, prefix_sep, ...]) |
Convert categorical variable into dummy/indicator variables |
factorize (values[, sort, order, ...]) |
Encode input values as an enumerated type or categorical variable |
unique (values) |
Hash table-based unique. |
wide_to_long (df, stubnames, i, j[, sep, suffix]) |
Wide panel to long format. |
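The following sketch exercises a few of the data manipulation functions listed above (melt, merge, concat and cut) on small invented frames.

```python
import pandas as pd

# melt: wide -> long
wide = pd.DataFrame({"id": [1, 2], "x": [10, 20], "y": [30, 40]})
long_df = pd.melt(wide, id_vars="id", value_vars=["x", "y"])

# merge: database-style inner join on a shared column
left = pd.DataFrame({"key": ["a", "b", "c"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "rval": [20, 30, 40]})
inner = pd.merge(left, right, on="key", how="inner")

# concat: stack objects along the row axis
stacked = pd.concat([left, left], ignore_index=True)

# cut: bin values into right-closed intervals with labels
ages = pd.Series([5, 17, 30, 70])
groups = pd.cut(ages, bins=[0, 18, 65, 100], labels=["child", "adult", "senior"])
```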
Top-level missing data¶
isna (obj) |
Detect missing values (NaN in numeric arrays, None/NaN in object arrays) |
isnull (obj) |
Detect missing values (NaN in numeric arrays, None/NaN in object arrays) |
notna (obj) |
Detect non-missing values; a replacement for numpy.isfinite / ~numpy.isnan that is suitable for use on object arrays. |
notnull (obj) |
Detect non-missing values; a replacement for numpy.isfinite / ~numpy.isnan that is suitable for use on object arrays. |
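A short sketch of the top-level missing-data helpers on numeric and object data. The sample values are invented.

```python
import numpy as np
import pandas as pd

values = pd.Series([1.0, np.nan, 3.0])
objs = pd.Series(["a", None, "c"], dtype=object)

missing = pd.isna(values)      # True where the value is NaN
present = pd.notna(values)     # element-wise inverse of isna
missing_objs = pd.isna(objs)   # None counts as missing in object arrays
scalar_missing = pd.isna(None)
```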
Top-level conversions¶
to_numeric (arg[, errors, downcast]) |
Convert argument to a numeric type. |
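As an illustration of the errors and downcast parameters of to_numeric, here is a sketch with made-up input strings.

```python
import pandas as pd

raw = pd.Series(["1", "2.5", "oops"])

# errors="coerce" turns unparseable entries into NaN instead of raising
nums = pd.to_numeric(raw, errors="coerce")

# downcast="integer" picks the smallest integer dtype that fits the data
small = pd.to_numeric(pd.Series([1, 2, 3]), downcast="integer")
```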
Top-level dealing with datetimelike¶
to_datetime (arg[, errors, dayfirst, ...]) |
Convert argument to datetime. |
to_timedelta (arg[, unit, box, errors]) |
Convert argument to timedelta |
date_range ([start, end, periods, freq, tz, ...]) |
Return a fixed frequency DatetimeIndex, with day (calendar) as the default |
bdate_range ([start, end, periods, freq, tz, ...]) |
Return a fixed frequency DatetimeIndex, with business day as the default |
period_range ([start, end, periods, freq, name]) |
Return a fixed frequency PeriodIndex, with day (calendar) as the default |
timedelta_range ([start, end, periods, freq, ...]) |
Return a fixed frequency TimedeltaIndex, with day as the default |
infer_freq (index[, warn]) |
Infer the most likely frequency given the input index. |
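The datetimelike helpers above can be combined as in the following sketch. The dates are arbitrary; 2018-01-05 was a Friday, so the next business day is the following Monday.

```python
import pandas as pd

stamps = pd.to_datetime(["2018-01-01", "2018-01-02"])
days = pd.date_range(start="2018-01-01", periods=3, freq="D")

# Business-day range skips the weekend
bdays = pd.bdate_range(start="2018-01-05", periods=2)

delta = pd.to_timedelta("1 days 06:00:00")

# infer_freq needs at least three timestamps
freq = pd.infer_freq(days)
```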
Top-level dealing with intervals¶
interval_range ([start, end, periods, freq, ...]) |
Return a fixed frequency IntervalIndex |
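A minimal interval_range sketch, assuming the default unit step and right-closed intervals.

```python
import pandas as pd

# Three unit-width intervals: (0, 1], (1, 2], (2, 3]
idx = pd.interval_range(start=0, end=3)

first = idx[0]
right_edge_inside = 1 in first   # right endpoint is included
left_edge_inside = 0 in first    # left endpoint is excluded
```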
Series¶
Constructor¶
Series ([data, index, dtype, name, copy, ...]) |
One-dimensional ndarray with axis labels (including time series). |
Attributes¶
- Axes
- index: axis labels
Series.values |
Return Series as ndarray or ndarray-like |
Series.dtype |
return the dtype object of the underlying data |
Series.ftype |
return if the data is sparse|dense |
Series.shape |
return a tuple of the shape of the underlying data |
Series.nbytes |
return the number of bytes in the underlying data |
Series.ndim |
return the number of dimensions of the underlying data, |
Series.size |
return the number of elements in the underlying data |
Series.strides |
return the strides of the underlying data |
Series.itemsize |
return the size of the dtype of the item of the underlying data |
Series.base |
return the base object if the memory of the underlying data is |
Series.T |
return the transpose, which is by definition self |
Series.memory_usage ([index, deep]) |
Memory usage of the Series |
Conversion¶
Series.astype (dtype[, copy, errors]) |
Cast a pandas object to a specified dtype. |
Series.infer_objects () |
Attempt to infer better dtypes for object columns. |
Series.copy ([deep]) |
Make a copy of this object's data. |
Series.isna () |
Return a boolean same-sized object indicating if the values are NA. |
Series.notna () |
Return a boolean same-sized object indicating if the values are not NA. |
Indexing, iteration¶
Series.get (key[, default]) |
Get item from object for given key (DataFrame column, Panel slice, etc.). |
Series.at |
Fast label-based scalar accessor |
Series.iat |
Fast integer location scalar accessor. |
Series.loc |
Purely label-location based indexer for selection by label. |
Series.iloc |
Purely integer-location based indexing for selection by position. |
Series.__iter__ () |
Return an iterator of the values. |
Series.iteritems () |
Lazily iterate over (index, value) tuples |
For more information on .at, .iat, .loc, and .iloc, see the indexing documentation.
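To make the difference between the label-based and position-based accessors concrete, here is a small sketch on an invented Series.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=["a", "b", "c"])

by_label = s.loc["b"]        # label-based lookup
by_position = s.iloc[0]      # integer-position lookup
fast_label = s.at["c"]       # scalar access by label
fast_position = s.iat[1]     # scalar access by position

# Label slices include both endpoints; positional slices exclude the stop
label_slice = s.loc["a":"b"]
position_slice = s.iloc[0:2]

# get returns a default instead of raising for a missing key
fallback = s.get("z", default=-1)
```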
Binary operator functions¶
Series.add (other[, level, fill_value, axis]) |
Addition of series and other, element-wise (binary operator add). |
Series.sub (other[, level, fill_value, axis]) |
Subtraction of series and other, element-wise (binary operator sub). |
Series.mul (other[, level, fill_value, axis]) |
Multiplication of series and other, element-wise (binary operator mul). |
Series.div (other[, level, fill_value, axis]) |
Floating division of series and other, element-wise (binary operator truediv). |
Series.truediv (other[, level, fill_value, axis]) |
Floating division of series and other, element-wise (binary operator truediv). |
Series.floordiv (other[, level, fill_value, axis]) |
Integer division of series and other, element-wise (binary operator floordiv). |
Series.mod (other[, level, fill_value, axis]) |
Modulo of series and other, element-wise (binary operator mod). |
Series.pow (other[, level, fill_value, axis]) |
Exponential power of series and other, element-wise (binary operator pow). |
Series.radd (other[, level, fill_value, axis]) |
Addition of series and other, element-wise (binary operator radd). |
Series.rsub (other[, level, fill_value, axis]) |
Subtraction of series and other, element-wise (binary operator rsub). |
Series.rmul (other[, level, fill_value, axis]) |
Multiplication of series and other, element-wise (binary operator rmul). |
Series.rdiv (other[, level, fill_value, axis]) |
Floating division of series and other, element-wise (binary operator rtruediv). |
Series.rtruediv (other[, level, fill_value, axis]) |
Floating division of series and other, element-wise (binary operator rtruediv). |
Series.rfloordiv (other[, level, fill_value, ...]) |
Integer division of series and other, element-wise (binary operator rfloordiv). |
Series.rmod (other[, level, fill_value, axis]) |
Modulo of series and other, element-wise (binary operator rmod). |
Series.rpow (other[, level, fill_value, axis]) |
Exponential power of series and other, element-wise (binary operator rpow). |
Series.combine (other, func[, fill_value]) |
Perform elementwise binary operation on two Series using given function |
Series.combine_first (other) |
Combine Series values, choosing the calling Series’s values first. |
Series.round ([decimals]) |
Round each value in a Series to the given number of decimals. |
Series.lt (other[, level, fill_value, axis]) |
Less than of series and other, element-wise (binary operator lt). |
Series.gt (other[, level, fill_value, axis]) |
Greater than of series and other, element-wise (binary operator gt). |
Series.le (other[, level, fill_value, axis]) |
Less than or equal to of series and other, element-wise (binary operator le). |
Series.ge (other[, level, fill_value, axis]) |
Greater than or equal to of series and other, element-wise (binary operator ge). |
Series.ne (other[, level, fill_value, axis]) |
Not equal to of series and other, element-wise (binary operator ne). |
Series.eq (other[, level, fill_value, axis]) |
Equal to of series and other, element-wise (binary operator eq). |
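The method forms of the operators differ from the plain operators mainly through fill_value, which substitutes a value for labels missing on one side. A sketch with made-up data:

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20], index=["x", "y"])

plain = a + b                       # "z" has no partner, so the result is NaN
filled = a.add(b, fill_value=0)     # missing "z" in b is treated as 0

reflected = a.rsub(10)              # reflected form: 10 - a
larger = a.gt(2)                    # element-wise comparison
```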
Function application, GroupBy & Window¶
Series.apply (func[, convert_dtype, args]) |
Invoke function on values of Series. |
Series.aggregate (func[, axis]) |
Aggregate using callable, string, dict, or list of string/callables |
Series.transform (func, *args, **kwargs) |
Call function producing a like-indexed NDFrame |
Series.map (arg[, na_action]) |
Map values of Series using input correspondence (which can be |
Series.groupby ([by, axis, level, as_index, ...]) |
Group series using mapper (dict or key function, apply given function to group, return result as series) or by a series of columns. |
Series.rolling (window[, min_periods, freq, ...]) |
Provides rolling window calculations. |
Series.expanding ([min_periods, freq, ...]) |
Provides expanding transformations. |
Series.ewm ([com, span, halflife, alpha, ...]) |
Provides exponential weighted functions |
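A brief sketch of function application, grouping and window operations on a small invented Series with a repeated index.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], index=["a", "b", "a", "b"])

squared = s.apply(lambda v: v ** 2)       # call a function on each value
labels = s.map({1: "one", 2: "two"})      # unmapped values become NaN

by_label = s.groupby(level=0).sum()       # group on the index labels

rolling_mean = s.rolling(window=2).mean() # first window is incomplete -> NaN
running_sum = s.expanding().sum()         # cumulative window
```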
Computations / Descriptive Stats¶
Series.abs () |
Return an object with absolute value taken–only applicable to objects that are all numeric. |
Series.all ([axis, bool_only, skipna, level]) |
Return whether all elements are True over requested axis |
Series.any ([axis, bool_only, skipna, level]) |
Return whether any element is True over requested axis |
Series.autocorr ([lag]) |
Lag-N autocorrelation |
Series.between (left, right[, inclusive]) |
Return boolean Series equivalent to left <= series <= right. |
Series.clip ([lower, upper, axis, inplace]) |
Trim values at input threshold(s). |
Series.clip_lower (threshold[, axis, inplace]) |
Return copy of the input with values below given value(s) truncated. |
Series.clip_upper (threshold[, axis, inplace]) |
Return copy of input with values above given value(s) truncated. |
Series.corr (other[, method, min_periods]) |
Compute correlation with other Series, excluding missing values |
Series.count ([level]) |
Return number of non-NA/null observations in the Series |
Series.cov (other[, min_periods]) |
Compute covariance with Series, excluding missing values |
Series.cummax ([axis, skipna]) |
Return cumulative max over requested axis. |
Series.cummin ([axis, skipna]) |
Return cumulative minimum over requested axis. |
Series.cumprod ([axis, skipna]) |
Return cumulative product over requested axis. |
Series.cumsum ([axis, skipna]) |
Return cumulative sum over requested axis. |
Series.describe ([percentiles, include, exclude]) |
Generates descriptive statistics that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values. |
Series.diff ([periods]) |
1st discrete difference of object |
Series.factorize ([sort, na_sentinel]) |
Encode the object as an enumerated type or categorical variable |
Series.kurt ([axis, skipna, level, numeric_only]) |
Return unbiased kurtosis over requested axis using Fisher’s definition of kurtosis (kurtosis of normal == 0.0). |
Series.mad ([axis, skipna, level]) |
Return the mean absolute deviation of the values for the requested axis |
Series.max ([axis, skipna, level, numeric_only]) |
This method returns the maximum of the values in the object. |
Series.mean ([axis, skipna, level, numeric_only]) |
Return the mean of the values for the requested axis |
Series.median ([axis, skipna, level, ...]) |
Return the median of the values for the requested axis |
Series.min ([axis, skipna, level, numeric_only]) |
This method returns the minimum of the values in the object. |
Series.mode () |
Return the mode(s) of the dataset. |
Series.nlargest ([n, keep]) |
Return the largest n elements. |
Series.nsmallest ([n, keep]) |
Return the smallest n elements. |
Series.pct_change ([periods, fill_method, ...]) |
Percent change over given number of periods. |
Series.prod ([axis, skipna, level, numeric_only]) |
Return the product of the values for the requested axis |
Series.quantile ([q, interpolation]) |
Return value at the given quantile, a la numpy.percentile. |
Series.rank ([axis, method, numeric_only, ...]) |
Compute numerical data ranks (1 through n) along axis. |
Series.sem ([axis, skipna, level, ddof, ...]) |
Return unbiased standard error of the mean over requested axis. |
Series.skew ([axis, skipna, level, numeric_only]) |
Return unbiased skew over requested axis |
Series.std ([axis, skipna, level, ddof, ...]) |
Return sample standard deviation over requested axis. |
Series.sum ([axis, skipna, level, numeric_only]) |
Return the sum of the values for the requested axis |
Series.var ([axis, skipna, level, ddof, ...]) |
Return unbiased variance over requested axis. |
Series.unique () |
Return unique values in the object. |
Series.nunique ([dropna]) |
Return number of unique elements in the object. |
Series.is_unique |
Return boolean if values in the object are unique |
Series.is_monotonic |
Return boolean if values in the object are |
Series.is_monotonic_increasing |
Return boolean if values in the object are |
Series.is_monotonic_decreasing |
Return boolean if values in the object are |
Series.value_counts ([normalize, sort, ...]) |
Returns object containing counts of unique values. |
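A few of the descriptive statistics listed above, applied to an invented Series:

```python
import pandas as pd

s = pd.Series([3, 1, 4, 1, 5])

mean = s.mean()
median = s.median()
running = s.cumsum()
top_two = s.nlargest(2)
counts = s.value_counts()
distinct = s.nunique()

clipped = s.clip(lower=2, upper=4)   # values outside [2, 4] are trimmed
in_range = s.between(2, 4)           # inclusive on both ends by default
```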
Reindexing / Selection / Label manipulation¶
Series.align (other[, join, axis, level, ...]) |
Align two objects on their axes with the |
Series.drop ([labels, axis, index, columns, ...]) |
Return new object with labels in requested axis removed. |
Series.drop_duplicates ([keep, inplace]) |
Return Series with duplicate values removed |
Series.duplicated ([keep]) |
Return boolean Series denoting duplicate values |
Series.equals (other) |
Determines if two NDFrame objects contain the same elements. |
Series.first (offset) |
Convenience method for subsetting initial periods of time series data based on a date offset. |
Series.head ([n]) |
Return the first n rows. |
Series.idxmax ([axis, skipna]) |
Index label of the first occurrence of maximum of values. |
Series.idxmin ([axis, skipna]) |
Index label of the first occurrence of minimum of values. |
Series.isin (values) |
Return a boolean Series showing whether each element in the Series is exactly contained in the passed sequence of values. |
Series.last (offset) |
Convenience method for subsetting final periods of time series data based on a date offset. |
Series.reindex ([index]) |
Conform Series to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index. |
Series.reindex_like (other[, method, copy, ...]) |
Return an object with matching indices to myself. |
Series.rename ([index]) |
Alter Series index labels or name |
Series.rename_axis (mapper[, axis, copy, inplace]) |
Alter the name of the index or columns. |
Series.reset_index ([level, drop, name, inplace]) |
Analogous to the pandas.DataFrame.reset_index() function, see docstring there. |
Series.sample ([n, frac, replace, weights, ...]) |
Returns a random sample of items from an axis of object. |
Series.select (crit[, axis]) |
Return data corresponding to axis labels matching criteria |
Series.set_axis (labels[, axis, inplace]) |
Assign desired index to given axis |
Series.take (indices[, axis, convert, is_copy]) |
Return the elements in the given positional indices along an axis. |
Series.tail ([n]) |
Return the last n rows. |
Series.truncate ([before, after, axis, copy]) |
Truncates a sorted DataFrame/Series before and/or after some particular index value. |
Series.where (cond[, other, inplace, axis, ...]) |
Return an object of same shape as self and whose corresponding entries are from self where cond is True and otherwise are from other. |
Series.mask (cond[, other, inplace, axis, ...]) |
Return an object of same shape as self and whose corresponding entries are from self where cond is False and otherwise are from other. |
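where and mask are mirror images of each other, which the following sketch tries to show alongside isin, drop_duplicates and reindex. The data is invented.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])

kept = s.where(s > 2, other=0)    # keep values where cond is True
hidden = s.mask(s > 2, other=0)   # replace values where cond is True

members = s.isin([2, 4])
deduped = pd.Series([1, 1, 2]).drop_duplicates()

# reindex introduces NaN for labels absent from the original index
reindexed = pd.Series([10, 20], index=["a", "b"]).reindex(["a", "b", "c"])
```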
Missing data handling¶
Series.dropna ([axis, inplace]) |
Return Series without null values |
Series.fillna ([value, method, axis, ...]) |
Fill NA/NaN values using the specified method |
Series.interpolate ([method, axis, limit, ...]) |
Interpolate values according to different methods. |
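The three missing-data methods can be compared on one small Series. This sketch uses the default linear interpolation.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

dropped = s.dropna()          # remove missing entries
zero_filled = s.fillna(0)     # replace missing entries with a constant
interpolated = s.interpolate()  # linear interpolation between neighbours
```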
Reshaping, sorting¶
Series.argsort ([axis, kind, order]) |
Overrides ndarray.argsort. |
Series.reorder_levels (order) |
Rearrange index levels using input order. |
Series.sort_values ([axis, ascending, ...]) |
Sort by the values along either axis |
Series.sort_index ([axis, level, ascending, ...]) |
Sort object by labels (along an axis) |
Series.swaplevel ([i, j, copy]) |
Swap levels i and j in a MultiIndex |
Series.unstack ([level, fill_value]) |
Unstack, a.k.a. |
Series.searchsorted (value[, side, sorter]) |
Find indices where elements should be inserted to maintain order. |
Combining / joining / merging¶
Series.append (to_append[, ignore_index, ...]) |
Concatenate two or more Series. |
Series.replace ([to_replace, value, inplace, ...]) |
Replace values given in ‘to_replace’ with ‘value’. |
Series.update (other) |
Modify Series in place using non-NA values from passed Series. |
Datetimelike Properties¶
Series.dt can be used to access the values of the series as
datetimelike and return several properties.
These can be accessed like Series.dt.<property>.
Datetime Properties
Series.dt.date |
Returns numpy array of python datetime.date objects (namely, the date part of Timestamps without timezone information). |
Series.dt.time |
Returns numpy array of datetime.time. |
Series.dt.year |
The year of the datetime |
Series.dt.month |
The month as January=1, December=12 |
Series.dt.day |
The days of the datetime |
Series.dt.hour |
The hours of the datetime |
Series.dt.minute |
The minutes of the datetime |
Series.dt.second |
The seconds of the datetime |
Series.dt.microsecond |
The microseconds of the datetime |
Series.dt.nanosecond |
The nanoseconds of the datetime |
Series.dt.week |
The week ordinal of the year |
Series.dt.weekofyear |
The week ordinal of the year |
Series.dt.dayofweek |
The day of the week with Monday=0, Sunday=6 |
Series.dt.weekday |
The day of the week with Monday=0, Sunday=6 |
Series.dt.weekday_name |
The name of day in a week (ex: Friday) |
Series.dt.dayofyear |
The ordinal day of the year |
Series.dt.quarter |
The quarter of the date |
Series.dt.is_month_start |
Logical indicating if first day of month (defined by frequency) |
Series.dt.is_month_end |
Logical indicating if last day of month (defined by frequency) |
Series.dt.is_quarter_start |
Logical indicating if first day of quarter (defined by frequency) |
Series.dt.is_quarter_end |
Logical indicating if last day of quarter (defined by frequency) |
Series.dt.is_year_start |
Logical indicating if first day of year (defined by frequency) |
Series.dt.is_year_end |
Logical indicating if last day of year (defined by frequency) |
Series.dt.is_leap_year |
Logical indicating if the date belongs to a leap year |
Series.dt.daysinmonth |
The number of days in the month |
Series.dt.days_in_month |
The number of days in the month |
Series.dt.tz |
|
Series.dt.freq |
Datetime Methods
Series.dt.to_period (*args, **kwargs) |
Cast to PeriodIndex at a particular frequency |
Series.dt.to_pydatetime () |
|
Series.dt.tz_localize (*args, **kwargs) |
Localize tz-naive DatetimeIndex to given time zone (using |
Series.dt.tz_convert (*args, **kwargs) |
Convert tz-aware DatetimeIndex from one time zone to another (using |
Series.dt.normalize (*args, **kwargs) |
Return DatetimeIndex with times to midnight. |
Series.dt.strftime (*args, **kwargs) |
Return an array of formatted strings specified by date_format, which supports the same string format as the python standard library. |
Series.dt.round (*args, **kwargs) |
round the index to the specified freq |
Series.dt.floor (*args, **kwargs) |
floor the index to the specified freq |
Series.dt.ceil (*args, **kwargs) |
ceil the index to the specified freq |
Timedelta Properties
Series.dt.days |
Number of days for each element. |
Series.dt.seconds |
Number of seconds (>= 0 and less than 1 day) for each element. |
Series.dt.microseconds |
Number of microseconds (>= 0 and less than 1 second) for each element. |
Series.dt.nanoseconds |
Number of nanoseconds (>= 0 and less than 1 microsecond) for each element. |
Series.dt.components |
Return a dataframe of the components (days, hours, minutes, seconds, milliseconds, microseconds, nanoseconds) of the Timedeltas. |
Timedelta Methods
Series.dt.to_pytimedelta () |
|
Series.dt.total_seconds (*args, **kwargs) |
Total duration of each element expressed in seconds. |
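A sketch of the Series.dt accessor on both datetime and timedelta data. The timestamps are arbitrary; 2018-01-01 was a Monday and 2018-02-15 a Thursday.

```python
import pandas as pd

s = pd.Series(pd.to_datetime(["2018-01-01 10:30", "2018-02-15 23:45"]))

years = s.dt.year
months = s.dt.month
weekdays = s.dt.dayofweek            # Monday=0, Sunday=6
formatted = s.dt.strftime("%Y-%m-%d")

# Subtracting the midnight-normalized values yields timedeltas
since_midnight = s - s.dt.normalize()
seconds = since_midnight.dt.seconds
total = since_midnight.dt.total_seconds()
```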
String handling¶
Series.str can be used to access the values of the series as
strings and apply several methods to it. These can be accessed like
Series.str.<function/property>.
Series.str.capitalize () |
Convert strings in the Series/Index to be capitalized. |
Series.str.cat ([others, sep, na_rep]) |
Concatenate strings in the Series/Index with given separator. |
Series.str.center (width[, fillchar]) |
Filling left and right side of strings in the Series/Index with an additional character. |
Series.str.contains (pat[, case, flags, na, ...]) |
Return boolean Series/array whether given pattern/regex is contained in each string in the Series/Index. |
Series.str.count (pat[, flags]) |
Count occurrences of pattern in each string of the Series/Index. |
Series.str.decode (encoding[, errors]) |
Decode character string in the Series/Index using indicated encoding. |
Series.str.encode (encoding[, errors]) |
Encode character string in the Series/Index using indicated encoding. |
Series.str.endswith (pat[, na]) |
Return boolean Series indicating whether each string in the Series/Index ends with passed pattern. |
Series.str.extract (pat[, flags, expand]) |
For each subject string in the Series, extract groups from the first match of regular expression pat. |
Series.str.extractall (pat[, flags]) |
For each subject string in the Series, extract groups from all matches of regular expression pat. |
Series.str.find (sub[, start, end]) |
Return lowest indexes in each string in the Series/Index where the substring is fully contained between [start:end]. |
Series.str.findall (pat[, flags]) |
Find all occurrences of pattern or regular expression in the Series/Index. |
Series.str.get (i) |
Extract element from lists, tuples, or strings in each element in the Series/Index. |
Series.str.index (sub[, start, end]) |
Return lowest indexes in each string where the substring is fully contained between [start:end]. |
Series.str.join (sep) |
Join lists contained as elements in the Series/Index with passed delimiter. |
Series.str.len () |
Compute length of each string in the Series/Index. |
Series.str.ljust (width[, fillchar]) |
Filling right side of strings in the Series/Index with an additional character. |
Series.str.lower () |
Convert strings in the Series/Index to lowercase. |
Series.str.lstrip ([to_strip]) |
Strip whitespace (including newlines) from each string in the Series/Index from left side. |
Series.str.match (pat[, case, flags, na, ...]) |
Determine if each string matches a regular expression. |
Series.str.normalize (form) |
Return the Unicode normal form for the strings in the Series/Index. |
Series.str.pad (width[, side, fillchar]) |
Pad strings in the Series/Index with an additional character to specified side. |
Series.str.partition ([pat, expand]) |
Split the string at the first occurrence of sep, and return 3 elements containing the part before the separator, the separator itself, and the part after the separator. |
Series.str.repeat (repeats) |
Duplicate each string in the Series/Index by indicated number of times. |
Series.str.replace (pat, repl[, n, case, flags]) |
Replace occurrences of pattern/regex in the Series/Index with some other string. |
Series.str.rfind (sub[, start, end]) |
Return highest indexes in each string in the Series/Index where the substring is fully contained between [start:end]. |
Series.str.rindex (sub[, start, end]) |
Return highest indexes in each string where the substring is fully contained between [start:end]. |
Series.str.rjust (width[, fillchar]) |
Filling left side of strings in the Series/Index with an additional character. |
Series.str.rpartition ([pat, expand]) |
Split the string at the last occurrence of sep, and return 3 elements containing the part before the separator, the separator itself, and the part after the separator. |
Series.str.rstrip ([to_strip]) |
Strip whitespace (including newlines) from each string in the Series/Index from right side. |
Series.str.slice ([start, stop, step]) |
Slice substrings from each element in the Series/Index |
Series.str.slice_replace ([start, stop, repl]) |
Replace a slice of each string in the Series/Index with another string. |
Series.str.split ([pat, n, expand]) |
Split each string (a la re.split) in the Series/Index by given pattern, propagating NA values. |
Series.str.rsplit ([pat, n, expand]) |
Split each string in the Series/Index by the given delimiter string, starting at the end of the string and working to the front. |
Series.str.startswith (pat[, na]) |
Return boolean Series/array indicating whether each string in the Series/Index starts with passed pattern. |
Series.str.strip ([to_strip]) |
Strip whitespace (including newlines) from each string in the Series/Index from left and right sides. |
Series.str.swapcase () |
Convert strings in the Series/Index to be swapcased. |
Series.str.title () |
Convert strings in the Series/Index to titlecase. |
Series.str.translate (table[, deletechars]) |
Map all characters in the string through the given mapping table. |
Series.str.upper () |
Convert strings in the Series/Index to uppercase. |
Series.str.wrap (width, **kwargs) |
Wrap long strings in the Series/Index to be formatted in paragraphs with length less than a given width. |
Series.str.zfill (width) |
Filling left side of strings in the Series/Index with 0. |
Series.str.isalnum () |
Check whether all characters in each string in the Series/Index are alphanumeric. |
Series.str.isalpha () |
Check whether all characters in each string in the Series/Index are alphabetic. |
Series.str.isdigit () |
Check whether all characters in each string in the Series/Index are digits. |
Series.str.isspace () |
Check whether all characters in each string in the Series/Index are whitespace. |
Series.str.islower () |
Check whether all characters in each string in the Series/Index are lowercase. |
Series.str.isupper () |
Check whether all characters in each string in the Series/Index are uppercase. |
Series.str.istitle () |
Check whether all characters in each string in the Series/Index are titlecase. |
Series.str.isnumeric () |
Check whether all characters in each string in the Series/Index are numeric. |
Series.str.isdecimal () |
Check whether all characters in each string in the Series/Index are decimal. |
Series.str.get_dummies ([sep]) |
Split each string in the Series by sep and return a frame of dummy/indicator variables. |
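A handful of the Series.str methods, applied element-wise to invented strings. The last line illustrates that missing values stay missing.

```python
import pandas as pd

s = pd.Series(["apple pie", "Banana split"])

upper = s.str.upper()
lengths = s.str.len()
has_an = s.str.contains("an")
starts_b = s.str.startswith("B")

# Accessor calls can be chained: split, then take the first piece
first_word = s.str.split(" ").str.get(0)

# Missing entries propagate through string methods
with_gap = pd.Series(["a", None]).str.upper()
```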
Categorical¶
The dtype of a Categorical can be described by a pandas.api.types.CategoricalDtype.
api.types.CategoricalDtype ([categories, ordered]) |
Type for categorical data with the categories and orderedness |
If the Series is of dtype CategoricalDtype, Series.cat can be used to change the categorical
data. This accessor is similar to the Series.dt or Series.str and has the
following usable methods and properties:
Series.cat.categories |
The categories of this categorical. |
Series.cat.ordered |
Whether the categories have an ordered relationship |
Series.cat.codes |
Series.cat.rename_categories (*args, **kwargs) |
Renames categories. |
Series.cat.reorder_categories (*args, **kwargs) |
Reorders categories as specified in new_categories. |
Series.cat.add_categories (*args, **kwargs) |
Add new categories. |
Series.cat.remove_categories (*args, **kwargs) |
Removes the specified categories. |
Series.cat.remove_unused_categories (*args, ...) |
Removes categories which are not used. |
Series.cat.set_categories (*args, **kwargs) |
Sets the categories to the specified new_categories. |
Series.cat.as_ordered (*args, **kwargs) |
Sets the Categorical to be ordered |
Series.cat.as_unordered (*args, **kwargs) |
Sets the Categorical to be unordered |
To create a Series of dtype category, use cat = s.astype("category").
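The conversion and the Series.cat accessor might be used as in this sketch. The category labels are invented, and the ordered variant assumes a CategoricalDtype built with an explicit category order.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

s = pd.Series(["low", "high", "low", "mid"])

# Plain conversion: categories are inferred and sorted
cat = s.astype("category")
categories = cat.cat.categories.tolist()
codes = cat.cat.codes.tolist()

# Explicit dtype: custom category order with ordering enabled
level_type = CategoricalDtype(categories=["low", "mid", "high"], ordered=True)
ordered = s.astype(level_type)
below_high = ordered < "high"
```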
The following two Categorical constructors are considered API but should only be used when
adding ordering information or special categories is needed at creation time of the categorical data:
Categorical (values[, categories, ordered, ...]) |
Represents a categorical variable in classic R / S-plus fashion |
Categorical.from_codes (codes, categories[, ...]) |
Make a Categorical type from codes and categories arrays. |
np.asarray(categorical) works by implementing the array interface. Be aware that this converts
the Categorical back to a NumPy array, so categories and order information are not preserved!
Categorical.__array__ ([dtype]) |
The numpy array interface. |
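A short illustration of that loss of information, using hypothetical sample values: after the conversion only the values remain, while the unused category and the ordering are gone.

```python
import numpy as np
import pandas as pd

cat = pd.Categorical(["b", "a", "b"], categories=["a", "b", "c"], ordered=True)
arr = np.asarray(cat)

print(type(arr).__name__)          # a plain numpy ndarray
print(list(arr))                   # only the values survive
# The ndarray knows nothing about the unused category "c" or the ordering.
print(hasattr(arr, "categories"))
```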
Plotting¶
Series.plot is both a callable method and a namespace attribute for specific plotting methods of the form Series.plot.<kind>.
Series.plot ([kind, ax, figsize, ....]) |
Series plotting accessor and method |
Series.plot.area (**kwds) |
Area plot |
Series.plot.bar (**kwds) |
Vertical bar plot |
Series.plot.barh (**kwds) |
Horizontal bar plot |
Series.plot.box (**kwds) |
Boxplot |
Series.plot.density (**kwds) |
Kernel Density Estimate plot |
Series.plot.hist ([bins]) |
Histogram |
Series.plot.kde (**kwds) |
Kernel Density Estimate plot |
Series.plot.line (**kwds) |
Line plot |
Series.plot.pie (**kwds) |
Pie chart |
Series.hist ([by, ax, grid, xlabelsize, ...]) |
Draw histogram of the input series using matplotlib |
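The dual nature of Series.plot can be checked without drawing anything. The sketch below only inspects the accessor; actually producing a figure with s.plot() or s.plot.line() additionally requires matplotlib to be installed.

```python
import pandas as pd

s = pd.Series([1, 3, 2])

# The accessor object itself can be called, e.g. s.plot(kind="line") ...
print(callable(s.plot))

# ... and it also exposes one method per plot kind, e.g. s.plot.line().
kinds = ["area", "bar", "barh", "box", "density", "hist", "kde", "line", "pie"]
available = [k for k in kinds if hasattr(s.plot, k)]
print(available)
```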
Serialization / IO / Conversion¶
Series.from_csv (path[, sep, parse_dates, ...]) |
Read CSV file (DEPRECATED, please use pandas.read_csv() instead). |
Series.to_pickle (path[, compression, protocol]) |
Pickle (serialize) object to input file path. |
Series.to_csv ([path, index, sep, na_rep, ...]) |
Write Series to a comma-separated values (csv) file |
Series.to_dict ([into]) |
Convert Series to {label -> value} dict or dict-like object. |
Series.to_excel (excel_writer[, sheet_name, ...]) |
Write Series to an excel sheet |
Series.to_frame ([name]) |
Convert Series to DataFrame |
Series.to_xarray () |
Return an xarray object from the pandas object. |
Series.to_hdf (path_or_buf, key, **kwargs) |
Write the contained data to an HDF5 file using HDFStore. |
Series.to_sql (name, con[, flavor, schema, ...]) |
Write records stored in a DataFrame to a SQL database. |
Series.to_msgpack ([path_or_buf, encoding]) |
msgpack (serialize) object to input file path |
Series.to_json ([path_or_buf, orient, ...]) |
Convert the object to a JSON string. |
Series.to_sparse ([kind, fill_value]) |
Convert Series to SparseSeries |
Series.to_dense () |
Return dense representation of NDFrame (as opposed to sparse) |
Series.to_string ([buf, na_rep, ...]) |
Render a string representation of the Series |
Series.to_clipboard ([excel, sep]) |
Attempt to write text representation of object to the system clipboard. This can be pasted into Excel, for example. |
Series.to_latex ([buf, columns, col_space, ...]) |
Render an object to a tabular environment table. |
Sparse¶
SparseSeries.to_coo ([row_levels, ...]) |
Create a scipy.sparse.coo_matrix from a SparseSeries with MultiIndex. |
SparseSeries.from_coo (A[, dense_index]) |
Create a SparseSeries from a scipy.sparse.coo_matrix. |
DataFrame¶
Constructor¶
DataFrame ([data, index, columns, dtype, copy]) |
Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). |
Attributes and underlying data¶
Axes
- index: row labels
- columns: column labels
DataFrame.as_matrix ([columns]) |
Convert the frame to its Numpy-array representation. |
DataFrame.dtypes |
Return the dtypes in this object. |
DataFrame.ftypes |
Return the ftypes (indication of sparse/dense and dtype) in this object. |
DataFrame.get_dtype_counts () |
Return the counts of dtypes in this object. |
DataFrame.get_ftype_counts () |
Return the counts of ftypes in this object. |
DataFrame.select_dtypes ([include, exclude]) |
Return a subset of a DataFrame including/excluding columns based on their dtype. |
DataFrame.values |
Numpy representation of NDFrame |
DataFrame.axes |
Return a list with the row axis labels and column axis labels as the only members. |
DataFrame.ndim |
Number of axes / array dimensions |
DataFrame.size |
number of elements in the NDFrame |
DataFrame.shape |
Return a tuple representing the dimensionality of the DataFrame. |
DataFrame.memory_usage ([index, deep]) |
Memory usage of DataFrame columns. |
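A few of the attributes listed above, shown on a small made-up frame (column names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5], "c": ["x", "y", "z"]})

print(df.shape)   # (rows, columns)
print(df.ndim)    # always 2 for a DataFrame
print(df.size)    # rows * columns
print(df.dtypes)  # one dtype per column

# select_dtypes filters columns by dtype; "number" matches ints and floats.
numeric = df.select_dtypes(include=["number"])
print(list(numeric.columns))
```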
Conversion¶
DataFrame.astype (dtype[, copy, errors]) |
Cast a pandas object to a specified dtype. |
DataFrame.convert_objects ([convert_dates, ...]) |
Deprecated. |
DataFrame.infer_objects () |
Attempt to infer better dtypes for object columns. |
DataFrame.copy ([deep]) |
Make a copy of this object's data. |
DataFrame.isna () |
Return a boolean same-sized object indicating if the values are NA. |
DataFrame.notna () |
Return a boolean same-sized object indicating if the values are not NA. |
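The conversion methods above might be combined as in the following sketch (data and names are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"n": ["1", "2", "3"], "f": [1.0, np.nan, 3.0]})

as_int = df.astype({"n": "int64"})  # cast one column, leave the others alone
copied = df.copy()                  # deep copy by default
copied.loc[0, "f"] = 99.0           # does not affect df

missing_per_column = df.isna().sum()  # count of missing values per column
complete_columns = df.notna().all()   # True for columns without missing values
print(missing_per_column)
print(complete_columns)
```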
Indexing, iteration¶
DataFrame.head ([n]) |
Return the first n rows. |
DataFrame.at |
Fast label-based scalar accessor |
DataFrame.iat |
Fast integer location scalar accessor. |
DataFrame.loc |
Purely label-location based indexer for selection by label. |
DataFrame.iloc |
Purely integer-location based indexing for selection by position. |
DataFrame.insert (loc, column, value[, ...]) |
Insert column into DataFrame at specified location. |
DataFrame.__iter__ () |
Iterate over info axis |
DataFrame.iteritems () |
Iterator over (column name, Series) pairs. |
DataFrame.iterrows () |
Iterate over DataFrame rows as (index, Series) pairs. |
DataFrame.itertuples ([index, name]) |
Iterate over DataFrame rows as namedtuples, with index value as first element of the tuple. |
DataFrame.lookup (row_labels, col_labels) |
Label-based “fancy indexing” function for DataFrame. |
DataFrame.pop (item) |
Return item and drop from frame. |
DataFrame.tail ([n]) |
Return the last n rows. |
DataFrame.xs (key[, axis, level, drop_level]) |
Returns a cross-section (row(s) or column(s)) from the Series/DataFrame. |
DataFrame.isin (values) |
Return boolean DataFrame showing whether each element in the DataFrame is contained in values. |
DataFrame.where (cond[, other, inplace, ...]) |
Return an object of same shape as self and whose corresponding entries are from self where cond is True and otherwise are from other. |
DataFrame.mask (cond[, other, inplace, axis, ...]) |
Return an object of same shape as self and whose corresponding entries are from self where cond is False and otherwise are from other. |
DataFrame.query (expr[, inplace]) |
Query the columns of a frame with a boolean expression. |
For more information on .at, .iat, .loc, and .iloc, see the indexing documentation.
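As a quick orientation before turning to the indexing documentation, the four indexers differ as sketched below (the frame and its labels are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]}, index=["x", "y", "z"])

by_label = df.at["y", "b"]            # scalar access by row/column label
by_position = df.iat[1, 1]            # scalar access by integer position
label_slice = df.loc["x":"y", "a"]    # label slice: end point is included
position_slice = df.iloc[0:2, 0]      # positional slice: end point is excluded

print(by_label, by_position)
print(label_slice)
print(position_slice)
```

Here both slices select the same two rows, but for different reasons: .loc includes the end label "y", whereas .iloc stops before position 2.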
Binary operator functions¶
DataFrame.add (other[, axis, level, fill_value]) |
Addition of dataframe and other, element-wise (binary operator add). |
DataFrame.sub (other[, axis, level, fill_value]) |
Subtraction of dataframe and other, element-wise (binary operator sub). |
DataFrame.mul (other[, axis, level, fill_value]) |
Multiplication of dataframe and other, element-wise (binary operator mul). |
DataFrame.div (other[, axis, level, fill_value]) |
Floating division of dataframe and other, element-wise (binary operator truediv). |
DataFrame.truediv (other[, axis, level, ...]) |
Floating division of dataframe and other, element-wise (binary operator truediv). |
DataFrame.floordiv (other[, axis, level, ...]) |
Integer division of dataframe and other, element-wise (binary operator floordiv). |
DataFrame.mod (other[, axis, level, fill_value]) |
Modulo of dataframe and other, element-wise (binary operator mod). |
DataFrame.pow (other[, axis, level, fill_value]) |
Exponential power of dataframe and other, element-wise (binary operator pow). |
DataFrame.radd (other[, axis, level, fill_value]) |
Addition of dataframe and other, element-wise (binary operator radd). |
DataFrame.rsub (other[, axis, level, fill_value]) |
Subtraction of dataframe and other, element-wise (binary operator rsub). |
DataFrame.rmul (other[, axis, level, fill_value]) |
Multiplication of dataframe and other, element-wise (binary operator rmul). |
DataFrame.rdiv (other[, axis, level, fill_value]) |
Floating division of dataframe and other, element-wise (binary operator rtruediv). |
DataFrame.rtruediv (other[, axis, level, ...]) |
Floating division of dataframe and other, element-wise (binary operator rtruediv). |
DataFrame.rfloordiv (other[, axis, level, ...]) |
Integer division of dataframe and other, element-wise (binary operator rfloordiv). |
DataFrame.rmod (other[, axis, level, fill_value]) |
Modulo of dataframe and other, element-wise (binary operator rmod). |
DataFrame.rpow (other[, axis, level, fill_value]) |
Exponential power of dataframe and other, element-wise (binary operator rpow). |
DataFrame.lt (other[, axis, level]) |
Wrapper for flexible comparison methods lt |
DataFrame.gt (other[, axis, level]) |
Wrapper for flexible comparison methods gt |
DataFrame.le (other[, axis, level]) |
Wrapper for flexible comparison methods le |
DataFrame.ge (other[, axis, level]) |
Wrapper for flexible comparison methods ge |
DataFrame.ne (other[, axis, level]) |
Wrapper for flexible comparison methods ne |
DataFrame.eq (other[, axis, level]) |
Wrapper for flexible comparison methods eq |
DataFrame.combine (other, func[, fill_value, ...]) |
Add two DataFrame objects and do not propagate NaN values, so if for a |
DataFrame.combine_first (other) |
Combine two DataFrame objects and default to non-null values in frame calling the method. |
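What the method forms add over the plain operators is mainly the fill_value argument and the reflected variants. A minimal sketch with illustrative data:

```python
import numpy as np
import pandas as pd

left = pd.DataFrame({"a": [1.0, np.nan], "b": [3.0, 4.0]})
right = pd.DataFrame({"a": [10.0, 20.0], "b": [np.nan, 40.0]})

plain = left + right                    # NaN on either side gives NaN
filled = left.add(right, fill_value=0)  # a single missing operand counts as 0

# The r-prefixed methods swap the operands: left.rsub(1) computes 1 - left.
reflected = left.rsub(1)

# Comparison methods return a boolean frame of the same shape.
mask = left.gt(2)
print(filled)
print(mask)
```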
Function application, GroupBy & Window¶
DataFrame.apply (func[, axis, broadcast, ...]) |
Applies function along input axis of DataFrame. |
DataFrame.applymap (func) |
Apply a function to a DataFrame that is intended to operate elementwise, i.e. |
DataFrame.aggregate (func[, axis]) |
Aggregate using callable, string, dict, or list of string/callables |
DataFrame.transform (func, *args, **kwargs) |
Call function producing a like-indexed NDFrame |
DataFrame.groupby ([by, axis, level, ...]) |
Group series using mapper (dict or key function, apply given function to group, return result as series) or by a series of columns. |
DataFrame.rolling (window[, min_periods, ...]) |
Provides rolling window calculations. |
DataFrame.expanding ([min_periods, freq, ...]) |
Provides expanding transformations. |
DataFrame.ewm ([com, span, halflife, alpha, ...]) |
Provides exponential weighted functions |
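The following sketch touches the main entry points from the table above on a small illustrative frame; the column names and the lambda are assumptions made for the example:

```python
import pandas as pd

df = pd.DataFrame(
    {"key": ["a", "b", "a", "b"], "x": [1, 2, 3, 4], "y": [10, 20, 30, 40]}
)
num = df[["x", "y"]]

col_range = num.apply(lambda col: col.max() - col.min())  # one value per column
summary = num.aggregate(["min", "max"])                   # rows "min" and "max"
group_sum = df.groupby("key")["x"].sum()                  # one value per group
rolling_mean = num.rolling(window=2).mean()               # first row is NaN
running_total = num.expanding().sum()                     # cumulative sums

print(col_range)
print(summary)
print(group_sum)
```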
Computations / Descriptive Stats¶
DataFrame.abs () |
Return an object with absolute value taken–only applicable to objects that are all numeric. |
DataFrame.all ([axis, bool_only, skipna, level]) |
Return whether all elements are True over requested axis |
DataFrame.any ([axis, bool_only, skipna, level]) |
Return whether any element is True over requested axis |
DataFrame.clip ([lower, upper, axis, inplace]) |
Trim values at input threshold(s). |
DataFrame.clip_lower (threshold[, axis, inplace]) |
Return copy of the input with values below given value(s) truncated. |
DataFrame.clip_upper (threshold[, axis, inplace]) |
Return copy of input with values above given value(s) truncated. |
DataFrame.corr ([method, min_periods]) |
Compute pairwise correlation of columns, excluding NA/null values |
DataFrame.corrwith (other[, axis, drop]) |
Compute pairwise correlation between rows or columns of two DataFrame objects. |
DataFrame.count ([axis, level, numeric_only]) |
Return Series with number of non-NA/null observations over requested axis. |
DataFrame.cov ([min_periods]) |
Compute pairwise covariance of columns, excluding NA/null values |
DataFrame.cummax ([axis, skipna]) |
Return cumulative max over requested axis. |
DataFrame.cummin ([axis, skipna]) |
Return cumulative minimum over requested axis. |
DataFrame.cumprod ([axis, skipna]) |
Return cumulative product over requested axis. |
DataFrame.cumsum ([axis, skipna]) |
Return cumulative sum over requested axis. |
DataFrame.describe ([percentiles, include, ...]) |
Generates descriptive statistics that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values. |
DataFrame.diff ([periods, axis]) |
1st discrete difference of object |
DataFrame.eval (expr[, inplace]) |
Evaluate an expression in the context of the calling DataFrame instance. |
DataFrame.kurt ([axis, skipna, level, ...]) |
Return unbiased kurtosis over requested axis using Fisher’s definition of kurtosis (kurtosis of normal == 0.0). |
DataFrame.mad ([axis, skipna, level]) |
Return the mean absolute deviation of the values for the requested axis |
DataFrame.max ([axis, skipna, level, ...]) |
This method returns the maximum of the values in the object. |
DataFrame.mean ([axis, skipna, level, ...]) |
Return the mean of the values for the requested axis |
DataFrame.median ([axis, skipna, level, ...]) |
Return the median of the values for the requested axis |
DataFrame.min ([axis, skipna, level, ...]) |
This method returns the minimum of the values in the object. |
DataFrame.mode ([axis, numeric_only]) |
Gets the mode(s) of each element along the axis selected. |
DataFrame.pct_change ([periods, fill_method, ...]) |
Percent change over given number of periods. |
DataFrame.prod ([axis, skipna, level, ...]) |
Return the product of the values for the requested axis |
DataFrame.quantile ([q, axis, numeric_only, ...]) |
Return values at the given quantile over requested axis, a la numpy.percentile. |
DataFrame.rank ([axis, method, numeric_only, ...]) |
Compute numerical data ranks (1 through n) along axis. |
DataFrame.round ([decimals]) |
Round a DataFrame to a variable number of decimal places. |
DataFrame.sem ([axis, skipna, level, ddof, ...]) |
Return unbiased standard error of the mean over requested axis. |
DataFrame.skew ([axis, skipna, level, ...]) |
Return unbiased skew over requested axis |
DataFrame.sum ([axis, skipna, level, ...]) |
Return the sum of the values for the requested axis |
DataFrame.std ([axis, skipna, level, ddof, ...]) |
Return sample standard deviation over requested axis. |
DataFrame.var ([axis, skipna, level, ddof, ...]) |
Return unbiased variance over requested axis. |
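Most of the reductions above share the axis argument. A brief sketch with illustrative data shows the difference between reducing down columns and across rows, alongside two shape-preserving methods:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [2, 4, 6, 8]})

col_sum = df.sum()                   # column-wise by default (axis=0)
row_mean = df.mean(axis=1)           # axis=1 reduces across columns
running = df.cumsum()                # same shape as the input
clipped = df.clip(lower=2, upper=6)  # values outside [2, 6] are trimmed
corr = df.corr()                     # pairwise correlation between columns

print(col_sum)
print(row_mean)
print(df.describe())
```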
Reindexing / Selection / Label manipulation¶
DataFrame.add_prefix (prefix) |
Concatenate prefix string with panel items names. |
DataFrame.add_suffix (suffix) |
Concatenate suffix string with panel items names. |
DataFrame.align (other[, join, axis, level, ...]) |
Align two objects on their axes with the |
DataFrame.drop ([labels, axis, index, ...]) |
Return new object with labels in requested axis removed. |
DataFrame.drop_duplicates ([subset, keep, ...]) |
Return DataFrame with duplicate rows removed, optionally only |
DataFrame.duplicated ([subset, keep]) |
Return boolean Series denoting duplicate rows, optionally only |
DataFrame.equals (other) |
Determines if two NDFrame objects contain the same elements. |
DataFrame.filter ([items, like, regex, axis]) |
Subset rows or columns of dataframe according to labels in the specified index. |
DataFrame.first (offset) |
Convenience method for subsetting initial periods of time series data based on a date offset. |
DataFrame.head ([n]) |
Return the first n rows. |
DataFrame.idxmax ([axis, skipna]) |
Return index of first occurrence of maximum over requested axis. |
DataFrame.idxmin ([axis, skipna]) |
Return index of first occurrence of minimum over requested axis. |
DataFrame.last (offset) |
Convenience method for subsetting final periods of time series data based on a date offset. |
DataFrame.reindex ([labels, index, columns, ...]) |
Conform DataFrame to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index. |
DataFrame.reindex_axis (labels[, axis, ...]) |
Conform input object to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index. |
DataFrame.reindex_like (other[, method, ...]) |
Return an object with matching indices to myself. |
DataFrame.rename ([mapper, index, columns, ...]) |
Alter axes labels. |
DataFrame.rename_axis (mapper[, axis, copy, ...]) |
Alter the name of the index or columns. |
DataFrame.reset_index ([level, drop, ...]) |
For DataFrame with multi-level index, return new DataFrame with labeling information in the columns under the index names, defaulting to ‘level_0’, ‘level_1’, etc. |
DataFrame.sample ([n, frac, replace, ...]) |
Returns a random sample of items from an axis of object. |
DataFrame.select (crit[, axis]) |
Return data corresponding to axis labels matching criteria |
DataFrame.set_index (keys[, drop, append, ...]) |
Set the DataFrame index (row labels) using one or more existing columns. |
DataFrame.tail ([n]) |
Return the last n rows. |
DataFrame.take (indices[, axis, convert, is_copy]) |
Return the elements in the given positional indices along an axis. |
DataFrame.truncate ([before, after, axis, copy]) |
Truncates a sorted DataFrame/Series before and/or after some particular index value. |
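A compact sketch of several label-manipulation methods from the table above, on a made-up frame in which row "z" duplicates row "y":

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 2], "b": [4, 5, 5]}, index=["x", "y", "z"])

reindexed = df.reindex(["z", "x", "w"])  # "w" is new, so its row is all NaN
renamed = df.rename(columns={"a": "alpha"})
dropped = df.drop(columns=["b"])
deduped = df.drop_duplicates()           # row "z" repeats row "y"
by_col = df.set_index("a")               # column "a" becomes the index
restored = by_col.reset_index()          # and back to a regular column

print(reindexed)
print(deduped)
```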
Missing data handling¶
DataFrame.dropna ([axis, how, thresh, ...]) |
Return object with labels on given axis omitted where alternately any |
DataFrame.fillna ([value, method, axis, ...]) |
Fill NA/NaN values using the specified method |
DataFrame.replace ([to_replace, value, ...]) |
Replace values given in ‘to_replace’ with ‘value’. |
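The three methods above can be contrasted on a small illustrative frame in which the middle row is entirely missing:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]})

dropped_any = df.dropna()           # keep rows with no missing values
dropped_all = df.dropna(how="all")  # drop only rows where every value is missing
filled = df.fillna(0)               # replace each NaN with a constant
swapped = df.replace(3.0, -1.0)     # replace by value, not by position

print(dropped_any)
print(dropped_all)
print(filled)
```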
Reshaping, sorting, transposing¶
DataFrame.pivot ([index, columns, values]) |
Reshape data (produce a “pivot” table) based on column values. |
DataFrame.reorder_levels (order[, axis]) |
Rearrange index levels using input order. |
DataFrame.sort_values (by[, axis, ascending, ...]) |
Sort by the values along either axis |
DataFrame.sort_index ([axis, level, ...]) |
Sort object by labels (along an axis) |
DataFrame.nlargest (n, columns[, keep]) |
Get the rows of a DataFrame sorted by the n largest values of columns. |
DataFrame.nsmallest (n, columns[, keep]) |
Get the rows of a DataFrame sorted by the n smallest values of columns. |
DataFrame.swaplevel ([i, j, axis]) |
Swap levels i and j in a MultiIndex on a particular axis |
DataFrame.stack ([level, dropna]) |
Pivot a level of the (possibly hierarchical) column labels, returning a DataFrame (or Series in the case of an object with a single level of column labels) having a hierarchical index with a new inner-most level of row labels. |
DataFrame.unstack ([level, fill_value]) |
Pivot a level of the (necessarily hierarchical) index labels, returning a DataFrame having a new level of column labels whose inner-most level consists of the pivoted index labels. |
DataFrame.melt ([id_vars, value_vars, ...]) |
“Unpivots” a DataFrame from wide format to long format, optionally |
DataFrame.T |
Transpose index and columns |
DataFrame.to_panel () |
Transform long (stacked) format (DataFrame) into wide (3D, Panel) format. |
DataFrame.to_xarray () |
Return an xarray object from the pandas object. |
DataFrame.transpose (*args, **kwargs) |
Transpose index and columns |
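The relationship between pivot, stack, unstack and melt is easiest to see as a round trip. The sketch below uses made-up data and passes var_name and value_name explicitly so that the column names do not depend on defaults:

```python
import pandas as pd

long = pd.DataFrame({
    "date": ["d1", "d1", "d2", "d2"],
    "item": ["A", "B", "A", "B"],
    "value": [1, 2, 3, 4],
})

# Long to wide: one row per date, one column per item.
wide = long.pivot(index="date", columns="item", values="value")

stacked = wide.stack()         # columns become the inner index level (a Series)
unstacked = stacked.unstack()  # inverse of stack for this frame

# Wide back to long.
back = wide.reset_index().melt(id_vars="date", var_name="item", value_name="value")

print(wide)
print(back)
```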
Combining / joining / merging¶
DataFrame.append (other[, ignore_index, ...]) |
Append rows of other to the end of this frame, returning a new object. |
DataFrame.assign (**kwargs) |
Assign new columns to a DataFrame, returning a new object (a copy) with all the original columns in addition to the new ones. |
DataFrame.join (other[, on, how, lsuffix, ...]) |
Join columns with other DataFrame either on index or on a key column. |
DataFrame.merge (right[, how, on, left_on, ...]) |
Merge DataFrame objects by performing a database-style join operation by columns or indexes. |
DataFrame.update (other[, join, overwrite, ...]) |
Modify DataFrame in place using non-NA values from passed DataFrame. |
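A minimal sketch of merge, join, assign and update, with illustrative frames. Note that update is the only one of these that modifies the caller in place:

```python
import numpy as np
import pandas as pd

left = pd.DataFrame({"key": ["a", "b", "c"], "x": [1, 2, 3]})
right = pd.DataFrame({"key": ["a", "b", "d"], "y": [10, 20, 40]})

merged = left.merge(right, on="key", how="inner")  # keys present in both frames
joined = left.set_index("key").join(right.set_index("key"), how="left")
extended = left.assign(double=left["x"] * 2)       # `left` is not modified

# update changes the caller in place, skipping NaN in the other frame.
target = pd.DataFrame({"v": [1.0, 2.0, 3.0]})
target.update(pd.DataFrame({"v": [np.nan, 20.0, np.nan]}))

print(merged)
print(joined)
print(target)
```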
Time series-related¶
DataFrame.asfreq (freq[, method, how, ...]) |
Convert TimeSeries to specified frequency. |
DataFrame.asof (where[, subset]) |
The last row without any NaN is taken (or the last row without |
DataFrame.shift ([periods, freq, axis]) |
Shift index by desired number of periods with an optional time freq |
DataFrame.first_valid_index () |
Return index for first non-NA/null value. |
DataFrame.last_valid_index () |
Return index for last non-NA/null value. |
DataFrame.resample (rule[, how, axis, ...]) |
Convenience method for frequency conversion and resampling of time series. |
DataFrame.to_period ([freq, axis, copy]) |
Convert DataFrame from DatetimeIndex to PeriodIndex with desired |
DataFrame.to_timestamp ([freq, how, axis, copy]) |
Cast to DatetimeIndex of timestamps, at beginning of period |
DataFrame.tz_convert (tz[, axis, level, copy]) |
Convert tz-aware axis to target time zone. |
DataFrame.tz_localize (tz[, axis, level, ...]) |
Localize tz-naive TimeSeries to target time zone. |
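A short sketch of some of the time series methods above on a four-day illustrative index. The time zone name is an arbitrary choice for the example and assumes that time zone data is available on the system:

```python
import pandas as pd

idx = pd.date_range("2018-01-01", periods=4, freq="D")
df = pd.DataFrame({"v": [1.0, 2.0, 3.0, 4.0]}, index=idx)

shifted = df.shift(1)                   # values move down one row, index unchanged
first = shifted.first_valid_index()     # first label with a non-NA value
last = shifted.last_valid_index()       # last label with a non-NA value
two_day = df.resample("2D").sum()       # 2-day bins: 1+2 and 3+4
utc = df.tz_localize("UTC")             # attach a time zone to the naive index
paris = utc.tz_convert("Europe/Paris")  # same instants, different wall-clock time

print(shifted)
print(two_day)
print(paris.index[0])
```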
Plotting¶
DataFrame.plot is both a callable method and a namespace attribute for specific plotting methods of the form DataFrame.plot.<kind>.
DataFrame.plot ([x, y, kind, ax, ....]) |
DataFrame plotting accessor and method |
DataFrame.plot.area ([x, y]) |
Area plot |
DataFrame.plot.bar ([x, y]) |
Vertical bar plot |
DataFrame.plot.barh ([x, y]) |
Horizontal bar plot |
DataFrame.plot.box ([by]) |
Boxplot |
DataFrame.plot.density (**kwds) |
Kernel Density Estimate plot |
DataFrame.plot.hexbin (x, y[, C, ...]) |
Hexbin plot |
DataFrame.plot.hist ([by, bins]) |
Histogram |
DataFrame.plot.kde (**kwds) |
Kernel Density Estimate plot |
DataFrame.plot.line ([x, y]) |
Line plot |
DataFrame.plot.pie ([y]) |
Pie chart |
DataFrame.plot.scatter (x, y[, s, c]) |
Scatter plot |
DataFrame.boxplot ([column, by, ax, ...]) |
Make a box plot from DataFrame column optionally grouped by some columns or |
DataFrame.hist (data[, column, by, grid, ...]) |
Draw histogram of the DataFrame’s series using matplotlib / pylab. |
Serialization / IO / Conversion¶
DataFrame.from_csv (path[, header, sep, ...]) |
Read CSV file (DEPRECATED, please use pandas.read_csv() instead). |
DataFrame.from_dict (data[, orient, dtype]) |
Construct DataFrame from dict of array-like or dicts |
DataFrame.from_items (items[, columns, orient]) |
Convert (key, value) pairs to DataFrame. |
DataFrame.from_records (data[, index, ...]) |
Convert structured or record ndarray to DataFrame |
DataFrame.info ([verbose, buf, max_cols, ...]) |
Concise summary of a DataFrame. |
DataFrame.to_pickle (path[, compression, ...]) |
Pickle (serialize) object to input file path. |
DataFrame.to_csv ([path_or_buf, sep, na_rep, ...]) |
Write DataFrame to a comma-separated values (csv) file |
DataFrame.to_hdf (path_or_buf, key, **kwargs) |
Write the contained data to an HDF5 file using HDFStore. |
DataFrame.to_sql (name, con[, flavor, ...]) |
Write records stored in a DataFrame to a SQL database. |
DataFrame.to_dict ([orient, into]) |
Convert DataFrame to dictionary. |
DataFrame.to_excel (excel_writer[, ...]) |
Write DataFrame to an excel sheet |
DataFrame.to_json ([path_or_buf, orient, ...]) |
Convert the object to a JSON string. |
DataFrame.to_html ([buf, columns, col_space, ...]) |
Render a DataFrame as an HTML table. |
DataFrame.to_feather (fname) |
Write out the binary feather-format for DataFrames |
DataFrame.to_latex ([buf, columns, ...]) |
Render an object to a tabular environment table. |
DataFrame.to_stata (fname[, convert_dates, ...]) |
A class for writing Stata binary dta files from array-like objects |
DataFrame.to_msgpack ([path_or_buf, encoding]) |
msgpack (serialize) object to input file path |
DataFrame.to_gbq (destination_table, project_id) |
Write a DataFrame to a Google BigQuery table. |
DataFrame.to_records ([index, convert_datetime64]) |
Convert DataFrame to record array. |
DataFrame.to_sparse ([fill_value, kind]) |
Convert to SparseDataFrame |
DataFrame.to_dense () |
Return dense representation of NDFrame (as opposed to sparse) |
DataFrame.to_string ([buf, columns, ...]) |
Render a DataFrame to a console-friendly tabular output. |
DataFrame.to_clipboard ([excel, sep]) |
Attempt to write text representation of object to the system clipboard. This can be pasted into Excel, for example. |
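Several of the writers above return their output when no path or buffer is given, which makes a round trip easy to sketch without touching the file system. The frame below is illustrative:

```python
import io

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

as_lists = df.to_dict(orient="list")       # column name -> list of values
as_records = df.to_dict(orient="records")  # one dict per row

csv_text = df.to_csv(index=False)          # no path given, so a string is returned
round_trip = pd.read_csv(io.StringIO(csv_text))

print(as_lists)
print(as_records)
print(round_trip)
```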
Sparse¶
SparseDataFrame.to_coo () |
Return the contents of the frame as a sparse SciPy COO matrix. |
Panel¶
Constructor¶
Panel ([data, items, major_axis, minor_axis, ...]) |
Represents wide format panel data, stored as 3-dimensional array |
Attributes and underlying data¶
Axes
- items: axis 0; each item corresponds to a DataFrame contained inside
- major_axis: axis 1; the index (rows) of each of the DataFrames
- minor_axis: axis 2; the columns of each of the DataFrames
Panel.values |
Numpy representation of NDFrame |
Panel.axes |
Return index label(s) of the internal NDFrame |
Panel.ndim |
Number of axes / array dimensions |
Panel.size |
number of elements in the NDFrame |
Panel.shape |
Return a tuple of axis dimensions |
Panel.dtypes |
Return the dtypes in this object. |
Panel.ftypes |
Return the ftypes (indication of sparse/dense and dtype) in this object. |
Panel.get_dtype_counts () |
Return the counts of dtypes in this object. |
Panel.get_ftype_counts () |
Return the counts of ftypes in this object. |
Conversion¶
Panel.astype (dtype[, copy, errors]) |
Cast a pandas object to a specified dtype. |
Panel.copy ([deep]) |
Make a copy of this object's data. |
Panel.isna () |
Return a boolean same-sized object indicating if the values are NA. |
Panel.notna () |
Return a boolean same-sized object indicating if the values are not NA. |
Getting and setting¶
Panel.get_value (*args, **kwargs) |
Quickly retrieve single value at (item, major, minor) location |
Panel.set_value (*args, **kwargs) |
Quickly set single value at (item, major, minor) location |
Indexing, iteration, slicing¶
Panel.at |
Fast label-based scalar accessor |
Panel.iat |
Fast integer location scalar accessor. |
Panel.loc |
Purely label-location based indexer for selection by label. |
Panel.iloc |
Purely integer-location based indexing for selection by position. |
Panel.__iter__ () |
Iterate over info axis |
Panel.iteritems () |
Iterate over (label, values) on info axis |
Panel.pop (item) |
Return item and drop from frame. |
Panel.xs (key[, axis]) |
Return slice of panel along selected axis |
Panel.major_xs (key) |
Return slice of panel along major axis |
Panel.minor_xs (key) |
Return slice of panel along minor axis |
For more information on .at, .iat, .loc, and .iloc, see the indexing documentation.
Binary operator functions¶
Panel.add (other[, axis]) |
Addition of series and other, element-wise (binary operator add). |
Panel.sub (other[, axis]) |
Subtraction of series and other, element-wise (binary operator sub). |
Panel.mul (other[, axis]) |
Multiplication of series and other, element-wise (binary operator mul). |
Panel.div (other[, axis]) |
Floating division of series and other, element-wise (binary operator truediv). |
Panel.truediv (other[, axis]) |
Floating division of series and other, element-wise (binary operator truediv). |
Panel.floordiv (other[, axis]) |
Integer division of series and other, element-wise (binary operator floordiv). |
Panel.mod (other[, axis]) |
Modulo of series and other, element-wise (binary operator mod). |
Panel.pow (other[, axis]) |
Exponential power of series and other, element-wise (binary operator pow). |
Panel.radd (other[, axis]) |
Addition of series and other, element-wise (binary operator radd). |
Panel.rsub (other[, axis]) |
Subtraction of series and other, element-wise (binary operator rsub). |
Panel.rmul (other[, axis]) |
Multiplication of series and other, element-wise (binary operator rmul). |
Panel.rdiv (other[, axis]) |
Floating division of series and other, element-wise (binary operator rtruediv). |
Panel.rtruediv (other[, axis]) |
Floating division of series and other, element-wise (binary operator rtruediv). |
Panel.rfloordiv (other[, axis]) |
Integer division of series and other, element-wise (binary operator rfloordiv). |
Panel.rmod (other[, axis]) |
Modulo of series and other, element-wise (binary operator rmod). |
Panel.rpow (other[, axis]) |
Exponential power of series and other, element-wise (binary operator rpow). |
Panel.lt (other[, axis]) |
Wrapper for comparison method lt |
Panel.gt (other[, axis]) |
Wrapper for comparison method gt |
Panel.le (other[, axis]) |
Wrapper for comparison method le |
Panel.ge (other[, axis]) |
Wrapper for comparison method ge |
Panel.ne (other[, axis]) |
Wrapper for comparison method ne |
Panel.eq (other[, axis]) |
Wrapper for comparison method eq |
Function application, GroupBy¶
Panel.apply (func[, axis]) |
Applies function along axis (or axes) of the Panel |
Panel.groupby (function[, axis]) |
Group data on given axis, returning GroupBy object |
Computations / Descriptive Stats¶
Panel.abs () |
Return an object with absolute value taken–only applicable to objects that are all numeric. |
Panel.clip ([lower, upper, axis, inplace]) |
Trim values at input threshold(s). |
Panel.clip_lower (threshold[, axis, inplace]) |
Return copy of the input with values below given value(s) truncated. |
Panel.clip_upper (threshold[, axis, inplace]) |
Return copy of input with values above given value(s) truncated. |
Panel.count ([axis]) |
Return number of observations over requested axis. |
Panel.cummax ([axis, skipna]) |
Return cumulative max over requested axis. |
Panel.cummin ([axis, skipna]) |
Return cumulative minimum over requested axis. |
Panel.cumprod ([axis, skipna]) |
Return cumulative product over requested axis. |
Panel.cumsum ([axis, skipna]) |
Return cumulative sum over requested axis. |
Panel.max ([axis, skipna, level, numeric_only]) |
This method returns the maximum of the values in the object. |
Panel.mean ([axis, skipna, level, numeric_only]) |
Return the mean of the values for the requested axis |
Panel.median ([axis, skipna, level, numeric_only]) |
Return the median of the values for the requested axis |
Panel.min ([axis, skipna, level, numeric_only]) |
This method returns the minimum of the values in the object. |
Panel.pct_change ([periods, fill_method, ...]) |
Percent change over given number of periods. |
Panel.prod ([axis, skipna, level, numeric_only]) |
Return the product of the values for the requested axis |
Panel.sem ([axis, skipna, level, ddof, ...]) |
Return unbiased standard error of the mean over requested axis. |
Panel.skew ([axis, skipna, level, numeric_only]) |
Return unbiased skew over requested axis |
Panel.sum ([axis, skipna, level, numeric_only]) |
Return the sum of the values for the requested axis |
Panel.std ([axis, skipna, level, ddof, ...]) |
Return sample standard deviation over requested axis. |
Panel.var ([axis, skipna, level, ddof, ...]) |
Return unbiased variance over requested axis. |
Reindexing / Selection / Label manipulation¶
Panel.add_prefix (prefix) |
Concatenate prefix string with panel items names. |
Panel.add_suffix (suffix) |
Concatenate suffix string with panel items names. |
Panel.drop ([labels, axis, index, columns, ...]) |
Return new object with labels in requested axis removed. |
Panel.equals (other) |
Determines if two NDFrame objects contain the same elements. |
Panel.filter ([items, like, regex, axis]) |
Subset rows or columns of dataframe according to labels in the specified index. |
Panel.first (offset) |
Convenience method for subsetting initial periods of time series data based on a date offset. |
Panel.last (offset) |
Convenience method for subsetting final periods of time series data based on a date offset. |
Panel.reindex (*args, **kwargs) |
Conform Panel to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index. |
Panel.reindex_axis (labels[, axis, method, ...]) |
Conform input object to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index. |
Panel.reindex_like (other[, method, copy, ...]) |
Return an object with matching indices to myself. |
Panel.rename ([items, major_axis, minor_axis]) |
Alter axes input function or functions. |
Panel.sample ([n, frac, replace, weights, ...]) |
Returns a random sample of items from an axis of object. |
Panel.select (crit[, axis]) |
Return data corresponding to axis labels matching criteria |
Panel.take (indices[, axis, convert, is_copy]) |
Return the elements in the given positional indices along an axis. |
Panel.truncate ([before, after, axis, copy]) |
Truncates a sorted DataFrame/Series before and/or after some particular index value. |
Missing data handling¶
Panel.dropna ([axis, how, inplace]) |
Drop 2D from panel, holding passed axis constant |
Panel.fillna ([value, method, axis, inplace, ...]) |
Fill NA/NaN values using the specified method |
Reshaping, sorting, transposing¶
Panel.sort_index ([axis, level, ascending, ...]) |
Sort object by labels (along an axis) |
Panel.swaplevel ([i, j, axis]) |
Swap levels i and j in a MultiIndex on a particular axis |
Panel.transpose (*args, **kwargs) |
Permute the dimensions of the Panel |
Panel.swapaxes (axis1, axis2[, copy]) |
Interchange axes and swap values axes appropriately |
Panel.conform (frame[, axis]) |
Conform input DataFrame to align with chosen axis pair. |
Combining / joining / merging¶
Panel.join (other[, how, lsuffix, rsuffix]) |
Join items with other Panel either on major and minor axes column |
Panel.update (other[, join, overwrite, ...]) |
Modify Panel in place using non-NA values from passed Panel, or object coercible to Panel. |
Time series-related¶
Panel.asfreq (freq[, method, how, normalize, ...]) |
Convert TimeSeries to specified frequency. |
Panel.shift ([periods, freq, axis]) |
Shift index by desired number of periods with an optional time freq. |
Panel.resample (rule[, how, axis, ...]) |
Convenience method for frequency conversion and resampling of time series. |
Panel.tz_convert (tz[, axis, level, copy]) |
Convert tz-aware axis to target time zone. |
Panel.tz_localize (tz[, axis, level, copy, ...]) |
Localize tz-naive TimeSeries to target time zone. |
Serialization / IO / Conversion¶
Panel.from_dict (data[, intersect, orient, dtype]) |
Construct Panel from dict of DataFrame objects |
Panel.to_pickle (path[, compression, protocol]) |
Pickle (serialize) object to input file path. |
Panel.to_excel (path[, na_rep, engine]) |
Write each DataFrame in Panel to a separate excel sheet |
Panel.to_hdf (path_or_buf, key, **kwargs) |
Write the contained data to an HDF5 file using HDFStore. |
Panel.to_sparse (*args, **kwargs) |
NOT IMPLEMENTED: do not call this method, as sparsifying is not supported for Panel objects and will raise an error. |
Panel.to_frame ([filter_observations]) |
Transform wide format into long (stacked) format as DataFrame whose columns are the Panel’s items and whose index is a MultiIndex formed of the Panel’s major and minor axes. |
Panel.to_xarray () |
Return an xarray object from the pandas object. |
Panel.to_clipboard ([excel, sep]) |
Attempt to write text representation of object to the system clipboard. This can be pasted into Excel, for example. |
Index¶
Many of these methods or variants thereof are available on the objects that contain an index (Series/DataFrame) and those should most likely be used before calling these methods directly.
Index |
Immutable ndarray implementing an ordered, sliceable set. |
Attributes¶
Index.values |
return the underlying data as an ndarray |
Index.is_monotonic |
alias for is_monotonic_increasing (deprecated) |
Index.is_monotonic_increasing |
return if the index is monotonic increasing (only equal or |
Index.is_monotonic_decreasing |
return if the index is monotonic decreasing (only equal or |
Index.is_unique |
|
Index.has_duplicates |
|
Index.dtype |
|
Index.inferred_type |
|
Index.is_all_dates |
|
Index.shape |
return a tuple of the shape of the underlying data |
Index.nbytes |
return the number of bytes in the underlying data |
Index.ndim |
return the number of dimensions of the underlying data, |
Index.size |
return the number of elements in the underlying data |
Index.empty |
|
Index.strides |
return the strides of the underlying data |
Index.itemsize |
return the size of the dtype of the item of the underlying data |
Index.base |
return the base object if the memory of the underlying data is |
Index.T |
return the transpose, which is by definition self |
Index.memory_usage ([deep]) |
Memory usage of my values |
Modifying and Computations¶
Index.all (*args, **kwargs) |
Return whether all elements are True |
Index.any (*args, **kwargs) |
Return whether any element is True |
Index.argmin ([axis]) |
return a ndarray of the minimum argument indexer |
Index.argmax ([axis]) |
return a ndarray of the maximum argument indexer |
Index.copy ([name, deep, dtype]) |
Make a copy of this object. |
Index.delete (loc) |
Make new Index with passed location(-s) deleted |
Index.drop (labels[, errors]) |
Make new Index with passed list of labels deleted |
Index.drop_duplicates ([keep]) |
Return Index with duplicate values removed |
Index.duplicated ([keep]) |
Return boolean np.ndarray denoting duplicate values |
Index.equals (other) |
Determines if two Index objects contain the same elements. |
Index.factorize ([sort, na_sentinel]) |
Encode the object as an enumerated type or categorical variable |
Index.identical (other) |
Similar to equals, but check that other comparable attributes are |
Index.insert (loc, item) |
Make new Index inserting new item at location. |
Index.min () |
The minimum value of the object |
Index.max () |
The maximum value of the object |
Index.reindex (target[, method, level, ...]) |
Create index with target’s values (move/add/delete values as necessary) |
Index.repeat (repeats, *args, **kwargs) |
Repeat elements of an Index. |
Index.where (cond[, other]) |
New in version 0.19.0. |
Index.take (indices[, axis, allow_fill, ...]) |
return a new Index of the values selected by the indices |
Index.putmask (mask, value) |
return a new Index of the values set with the mask |
Index.set_names (names[, level, inplace]) |
Set new names on index. |
Index.unique () |
Return unique values in the object. |
Index.nunique ([dropna]) |
Return number of unique elements in the object. |
Index.value_counts ([normalize, sort, ...]) |
Returns object containing counts of unique values. |
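The methods in this table return new objects rather than changing the Index in place. A minimal sketch, assuming a reasonably recent pandas; the labels and variable names are illustrative only:

```python
import pandas as pd

idx = pd.Index(["a", "b", "b", "c"])

# Index objects are immutable, so every call below returns a new object.
unique_labels = idx.unique()        # distinct labels, in order of appearance
dup_mask = idx.duplicated()         # boolean ndarray, True for repeated labels
deduped = idx.drop_duplicates()     # keeps the first occurrence by default
inserted = idx.insert(1, "z")       # new label placed at position 1
deleted = idx.delete(0)             # label at position 0 removed
dropped = deduped.drop(["c"])       # removal by label rather than by position

print(unique_labels.tolist(), inserted.tolist(), dropped.tolist())
```

The original `idx` still holds all four labels after these calls.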
Missing Values¶
Index.fillna ([value, downcast]) |
Fill NA/NaN values with the specified value |
Index.dropna ([how]) |
Return Index without NA/NaN values |
Index.isna () |
Detect missing values |
Index.notna () |
Inverse of isna |
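As a short illustration of the missing-value methods above, the sketch below uses a float Index with one NaN. The data is made up for the example:

```python
import numpy as np
import pandas as pd

idx = pd.Index([1.0, np.nan, 3.0])

na_mask = idx.isna()        # True where a value is missing
present = idx.notna()       # element-wise inverse of isna
filled = idx.fillna(0.0)    # new Index with NaN replaced by 0.0
dropped = idx.dropna()      # new Index without the NaN entry

print(na_mask.tolist(), filled.tolist(), dropped.tolist())
```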
Conversion¶
Index.astype (dtype[, copy]) |
Create an Index with values cast to dtypes. |
Index.tolist () |
Return a list of the values. |
Index.to_datetime ([dayfirst]) |
DEPRECATED: use pandas.to_datetime() instead. |
Index.to_series (**kwargs) |
Create a Series with both index and values equal to the index keys |
Index.to_frame ([index]) |
Create a DataFrame with a column containing the Index. |
Sorting¶
Index.argsort (*args, **kwargs) |
Returns the indices that would sort the index and its underlying data. |
Index.sort_values ([return_indexer, ascending]) |
Return sorted copy of Index |
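A small sketch of how the two sorting methods relate, using illustrative integer labels: `argsort` gives the positions that would sort the Index, and `sort_values` returns the sorted copy (optionally with those positions).

```python
import pandas as pd

idx = pd.Index([30, 10, 20])

order = idx.argsort()                            # positions that sort the Index
ascending = idx.sort_values()                    # sorted copy
descending = idx.sort_values(ascending=False)    # reverse order
sorted_idx, indexer = idx.sort_values(return_indexer=True)

print(order.tolist(), ascending.tolist(), descending.tolist())
```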
Time-specific operations¶
Index.shift ([periods, freq]) |
Shift Index containing datetime objects by input number of periods and |
Combining / joining / set operations¶
Index.append (other) |
Append a collection of Index objects together |
Index.join (other[, how, level, ...]) |
this is an internal non-public method |
Index.intersection (other) |
Form the intersection of two Index objects. |
Index.union (other) |
Form the union of two Index objects and sorts if possible. |
Index.difference (other) |
Return a new Index with elements from the index that are not in other. |
Index.symmetric_difference (other[, result_name]) |
Compute the symmetric difference of two Index objects. |
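The set operations above can be sketched on two small integer indexes. The values are illustrative, and the results assume the default sorting behaviour of current pandas releases:

```python
import pandas as pd

left = pd.Index([1, 2, 3, 4])
right = pd.Index([3, 4, 5, 6])

both = left.union(right)                     # labels present in either Index
shared = left.intersection(right)            # labels present in both
only_left = left.difference(right)           # labels in left but not in right
exclusive = left.symmetric_difference(right) # labels in exactly one of the two
stacked = left.append(right)                 # plain concatenation, duplicates kept

print(both.tolist(), shared.tolist(), only_left.tolist(), exclusive.tolist())
```

Unlike the set operations, `append` does not deduplicate: `stacked` contains 3 and 4 twice.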
Selecting¶
Index.get_indexer (target[, method, limit, ...]) |
Compute indexer and mask for new index given the current index. |
Index.get_indexer_non_unique (target) |
Compute indexer and mask for new index given the current index. |
Index.get_level_values (level) |
Return an Index of values for requested level, equal to the length of the index. |
Index.get_loc (key[, method, tolerance]) |
Get integer location, slice or boolean mask for requested label. |
Index.get_value (series, key) |
Fast lookup of value from 1-dimensional ndarray. |
Index.isin (values[, level]) |
Compute boolean array of whether each index value is found in the passed set of values. |
Index.slice_indexer ([start, end, step, kind]) |
For an ordered Index, compute the slice indexer for input labels and |
Index.slice_locs ([start, end, step, kind]) |
Compute slice locations for input labels. |
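To show how the label-lookup methods in this table differ, here is a minimal sketch on a unique, sorted string Index (labels chosen for illustration):

```python
import pandas as pd

idx = pd.Index(["a", "b", "c", "d"])

pos = idx.get_loc("c")                   # integer position of a single label
positions = idx.get_indexer(["b", "z"])  # -1 marks labels that are not found
mask = idx.isin(["a", "d"])              # boolean membership array
bounds = idx.slice_locs("b", "c")        # (start, stop) positions; end label is included
window = idx.slice_indexer("b", "c")     # the same bounds as a slice object

print(pos, positions.tolist(), mask.tolist(), bounds, idx[window].tolist())
```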
Numeric Index¶
RangeIndex |
Immutable Index implementing a monotonic integer range. |
Int64Index |
Immutable ndarray implementing an ordered, sliceable set. |
UInt64Index |
Immutable ndarray implementing an ordered, sliceable set. |
Float64Index |
Immutable ndarray implementing an ordered, sliceable set. |
CategoricalIndex¶
CategoricalIndex |
Immutable Index implementing an ordered, sliceable set. |
Categorical Components¶
CategoricalIndex.codes |
|
CategoricalIndex.categories |
|
CategoricalIndex.ordered |
|
CategoricalIndex.rename_categories (*args, ...) |
Renames categories. |
CategoricalIndex.reorder_categories (*args, ...) |
Reorders categories as specified in new_categories. |
CategoricalIndex.add_categories (*args, **kwargs) |
Add new categories. |
CategoricalIndex.remove_categories (*args, ...) |
Removes the specified categories. |
CategoricalIndex.remove_unused_categories (...) |
Removes categories which are not used. |
CategoricalIndex.set_categories (*args, **kwargs) |
Sets the categories to the specified new_categories. |
CategoricalIndex.as_ordered (*args, **kwargs) |
Sets the Categorical to be ordered |
CategoricalIndex.as_unordered (*args, **kwargs) |
Sets the Categorical to be unordered |
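A brief sketch of the categorical components listed above. The categories are invented for the example, and each method returns a new CategoricalIndex:

```python
import pandas as pd

ci = pd.CategoricalIndex(["a", "b", "a"], categories=["a", "b", "c"])

codes = ci.codes                          # integer position of each value in categories
trimmed = ci.remove_unused_categories()   # drops "c", which never occurs
extended = ci.add_categories(["d"])       # appends a new, unused category
renamed = ci.rename_categories(["x", "y", "z"])  # positional rename
ordered = ci.as_ordered()                 # same values, ordered flag set

print(codes.tolist(), trimmed.categories.tolist(), renamed.tolist())
```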
IntervalIndex¶
IntervalIndex |
Immutable Index implementing an ordered, sliceable set. |
IntervalIndex Components¶
IntervalIndex.from_arrays (left, right[, ...]) |
Construct an IntervalIndex from a left and right array |
IntervalIndex.from_tuples (data[, closed, ...]) |
Construct an IntervalIndex from a list/array of tuples |
IntervalIndex.from_breaks (breaks[, closed, ...]) |
Construct an IntervalIndex from an array of splits |
IntervalIndex.from_intervals (data[, name, copy]) |
Construct an IntervalIndex from a 1d array of Interval objects |
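The constructors above are different routes to the same object. The sketch below builds one IntervalIndex three ways, assuming the default `closed='right'`; the break points are illustrative:

```python
import pandas as pd

# Three equivalent ways to describe the intervals (0, 1], (1, 2], (2, 3].
by_breaks = pd.IntervalIndex.from_breaks([0, 1, 2, 3])
by_tuples = pd.IntervalIndex.from_tuples([(0, 1), (1, 2), (2, 3)])
by_arrays = pd.IntervalIndex.from_arrays([0, 1, 2], [1, 2, 3])

# Look up which interval contains a scalar value.
position = by_breaks.get_loc(1.5)

print(by_breaks.closed, len(by_breaks), position)
```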
MultiIndex¶
MultiIndex |
A multi-level, or hierarchical, index object for pandas objects |
IndexSlice |
Create an object to more easily perform multi-index slicing |
MultiIndex Components¶
MultiIndex.from_arrays (arrays[, sortorder, ...]) |
Convert arrays to MultiIndex |
MultiIndex.from_tuples (tuples[, sortorder, ...]) |
Convert list of tuples to MultiIndex |
MultiIndex.from_product (iterables[, ...]) |
Make a MultiIndex from the cartesian product of multiple iterables |
MultiIndex.set_levels (levels[, level, ...]) |
Set new levels on MultiIndex. |
MultiIndex.set_labels (labels[, level, ...]) |
Set new labels on MultiIndex. |
MultiIndex.to_hierarchical (n_repeat[, n_shuffle]) |
Return a MultiIndex reshaped to conform to the shapes given by n_repeat and n_shuffle. |
MultiIndex.to_frame ([index]) |
Create a DataFrame with the levels of the MultiIndex as columns. |
MultiIndex.is_lexsorted () |
Return True if the labels are lexicographically sorted |
MultiIndex.droplevel ([level]) |
Return Index with requested level removed. |
MultiIndex.swaplevel ([i, j]) |
Swap level i with level j. |
MultiIndex.reorder_levels (order) |
Rearrange levels using input order. |
MultiIndex.remove_unused_levels () |
create a new MultiIndex from the current that removing |
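As a minimal sketch of the MultiIndex constructors and level-manipulation methods above (level names and values are illustrative):

```python
import pandas as pd

mi = pd.MultiIndex.from_product(
    [["a", "b"], [1, 2]], names=["letter", "number"]
)

# The same index rebuilt from its tuples.
same = pd.MultiIndex.from_tuples(list(mi), names=["letter", "number"])

swapped = mi.swaplevel(0, 1)                       # level order becomes number, letter
reordered = mi.reorder_levels(["number", "letter"])
letters = mi.droplevel("number")                   # flat Index of the remaining level
frame = mi.to_frame(index=False)                   # one column per level

print(list(mi), list(swapped.names), letters.tolist())
```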
DatetimeIndex¶
DatetimeIndex |
Immutable ndarray of datetime64 data, represented internally as int64, and which can be boxed to Timestamp objects that are subclasses of datetime and carry metadata such as frequency information. |
Time/Date Components¶
DatetimeIndex.year |
The year of the datetime |
DatetimeIndex.month |
The month as January=1, December=12 |
DatetimeIndex.day |
The days of the datetime |
DatetimeIndex.hour |
The hours of the datetime |
DatetimeIndex.minute |
The minutes of the datetime |
DatetimeIndex.second |
The seconds of the datetime |
DatetimeIndex.microsecond |
The microseconds of the datetime |
DatetimeIndex.nanosecond |
The nanoseconds of the datetime |
DatetimeIndex.date |
Returns numpy array of python datetime.date objects (namely, the date part of Timestamps without timezone information). |
DatetimeIndex.time |
Returns numpy array of datetime.time. |
DatetimeIndex.dayofyear |
The ordinal day of the year |
DatetimeIndex.weekofyear |
The week ordinal of the year |
DatetimeIndex.week |
The week ordinal of the year |
DatetimeIndex.dayofweek |
The day of the week with Monday=0, Sunday=6 |
DatetimeIndex.weekday |
The day of the week with Monday=0, Sunday=6 |
DatetimeIndex.weekday_name |
The name of day in a week (ex: Friday) |
DatetimeIndex.quarter |
The quarter of the date |
DatetimeIndex.tz |
|
DatetimeIndex.freq |
get/set the frequency of the Index |
DatetimeIndex.freqstr |
Return the frequency object as a string if it's set, otherwise None |
DatetimeIndex.is_month_start |
Logical indicating if first day of month (defined by frequency) |
DatetimeIndex.is_month_end |
Logical indicating if last day of month (defined by frequency) |
DatetimeIndex.is_quarter_start |
Logical indicating if first day of quarter (defined by frequency) |
DatetimeIndex.is_quarter_end |
Logical indicating if last day of quarter (defined by frequency) |
DatetimeIndex.is_year_start |
Logical indicating if first day of year (defined by frequency) |
DatetimeIndex.is_year_end |
Logical indicating if last day of year (defined by frequency) |
DatetimeIndex.is_leap_year |
Logical indicating if the date belongs to a leap year |
DatetimeIndex.inferred_freq |
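The component attributes above are computed element-wise. A small sketch with two illustrative dates that straddle a month boundary:

```python
import pandas as pd

dti = pd.DatetimeIndex(["2018-01-31", "2018-02-01"])

years = dti.year.tolist()
months = dti.month.tolist()
days = dti.day.tolist()
weekdays = dti.dayofweek.tolist()        # Monday=0, so Wednesday=2 and Thursday=3
ordinal_days = dti.dayofyear.tolist()
month_end = dti.is_month_end.tolist()
month_start = dti.is_month_start.tolist()

print(months, days, weekdays, month_end, month_start)
```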
Selecting¶
DatetimeIndex.indexer_at_time (time[, asof]) |
Select values at particular time of day (e.g. |
DatetimeIndex.indexer_between_time (...[, ...]) |
Select values between particular times of day (e.g., 9:00-9:30AM). |
Time-specific operations¶
DatetimeIndex.normalize () |
Return DatetimeIndex with times normalized to midnight. |
DatetimeIndex.strftime (date_format) |
Return an array of formatted strings specified by date_format, which supports the same string format as the python standard library. |
DatetimeIndex.snap ([freq]) |
Snap time stamps to nearest occurring frequency |
DatetimeIndex.tz_convert (tz) |
Convert tz-aware DatetimeIndex from one time zone to another (using |
DatetimeIndex.tz_localize (tz[, ambiguous, ...]) |
Localize tz-naive DatetimeIndex to given time zone (using |
DatetimeIndex.round (freq, *args, **kwargs) |
round the index to the specified freq |
DatetimeIndex.floor (freq) |
floor the index to the specified freq |
DatetimeIndex.ceil (freq) |
ceil the index to the specified freq |
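A hedged sketch of the time-specific operations above. It assumes the `'D'` and `'min'` frequency aliases and that time zone data for `Europe/Berlin` is available on the system; the timestamp is illustrative:

```python
import pandas as pd

dti = pd.DatetimeIndex(["2018-01-01 09:45:40"])

midnight = dti.normalize()        # time of day reset to 00:00:00
floored = dti.floor("D")          # rounded down to the day
ceiled = dti.ceil("D")            # rounded up to the next day
rounded = dti.round("min")        # nearest whole minute
labels = list(dti.strftime("%Y-%m-%d"))

# Attach a zone to naive timestamps, then convert to another zone.
utc = dti.tz_localize("UTC")
berlin = utc.tz_convert("Europe/Berlin")

print(floored[0], ceiled[0], rounded[0], berlin[0])
```

`tz_localize` only labels the existing wall-clock times with a zone, whereas `tz_convert` changes the wall-clock time while keeping the same instant.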
Conversion¶
DatetimeIndex.to_datetime ([dayfirst]) |
|
DatetimeIndex.to_period ([freq]) |
Cast to PeriodIndex at a particular frequency |
DatetimeIndex.to_perioddelta (freq) |
Calculates TimedeltaIndex of difference between index values and index converted to PeriodIndex at specified freq. |
DatetimeIndex.to_pydatetime () |
Return DatetimeIndex as object ndarray of datetime.datetime objects |
DatetimeIndex.to_series ([keep_tz]) |
Create a Series with both index and values equal to the index keys |
DatetimeIndex.to_frame ([index]) |
Create a DataFrame with a column containing the Index. |
TimedeltaIndex¶
TimedeltaIndex |
Immutable ndarray of timedelta64 data, represented internally as int64, and |
Components¶
TimedeltaIndex.days |
Number of days for each element. |
TimedeltaIndex.seconds |
Number of seconds (>= 0 and less than 1 day) for each element. |
TimedeltaIndex.microseconds |
Number of microseconds (>= 0 and less than 1 second) for each element. |
TimedeltaIndex.nanoseconds |
Number of nanoseconds (>= 0 and less than 1 microsecond) for each element. |
TimedeltaIndex.components |
Return a dataframe of the components (days, hours, minutes, seconds, milliseconds, microseconds, nanoseconds) of the Timedeltas. |
TimedeltaIndex.inferred_freq |
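The component attributes split each duration into non-overlapping parts. A minimal sketch with two illustrative durations:

```python
import datetime

import pandas as pd

tdi = pd.to_timedelta(["1 days 02:03:04", "0 days 00:00:30"])

days = tdi.days.tolist()            # whole days
seconds = tdi.seconds.tolist()      # remaining seconds within the day
parts = tdi.components              # DataFrame with one column per unit
native = tdi.to_pytimedelta()       # ndarray of datetime.timedelta objects

print(days, seconds, parts["hours"].tolist())
```

Here 2 hours, 3 minutes and 4 seconds are reported as 7384 in `seconds`, while `components` breaks the same remainder into separate hour, minute and second columns.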
Conversion¶
TimedeltaIndex.to_pytimedelta () |
Return TimedeltaIndex as object ndarray of datetime.timedelta objects |
TimedeltaIndex.to_series (**kwargs) |
Create a Series with both index and values equal to the index keys |
TimedeltaIndex.round (freq, *args, **kwargs) |
round the index to the specified freq |
TimedeltaIndex.floor (freq) |
floor the index to the specified freq |
TimedeltaIndex.ceil (freq) |
ceil the index to the specified freq |
TimedeltaIndex.to_frame ([index]) |
Create a DataFrame with a column containing the Index. |
PeriodIndex¶
PeriodIndex |
Immutable ndarray holding ordinal values indicating regular periods in time such as particular years, quarters, months, etc. |
Attributes¶
PeriodIndex.day |
The days of the period |
PeriodIndex.dayofweek |
The day of the week with Monday=0, Sunday=6 |
PeriodIndex.dayofyear |
The ordinal day of the year |
PeriodIndex.days_in_month |
The number of days in the month |
PeriodIndex.daysinmonth |
The number of days in the month |
PeriodIndex.end_time |
|
PeriodIndex.freq |
|
PeriodIndex.freqstr |
Return the frequency object as a string if it's set, otherwise None |
PeriodIndex.hour |
The hour of the period |
PeriodIndex.is_leap_year |
Logical indicating if the date belongs to a leap year |
PeriodIndex.minute |
The minute of the period |
PeriodIndex.month |
The month as January=1, December=12 |
PeriodIndex.quarter |
The quarter of the date |
PeriodIndex.qyear |
|
PeriodIndex.second |
The second of the period |
PeriodIndex.start_time |
|
PeriodIndex.week |
The week ordinal of the year |
PeriodIndex.weekday |
The day of the week with Monday=0, Sunday=6 |
PeriodIndex.weekofyear |
The week ordinal of the year |
PeriodIndex.year |
The year of the period |
Methods¶
PeriodIndex.asfreq ([freq, how]) |
Convert the PeriodIndex to the specified frequency freq. |
PeriodIndex.strftime (date_format) |
Return an array of formatted strings specified by date_format, which supports the same string format as the python standard library. |
PeriodIndex.to_timestamp ([freq, how]) |
Cast to DatetimeIndex |
PeriodIndex.tz_convert (tz) |
Convert tz-aware DatetimeIndex from one time zone to another (using |
PeriodIndex.tz_localize (tz[, infer_dst]) |
Localize tz-naive DatetimeIndex to given time zone (using |
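A short sketch of the PeriodIndex attributes and conversion methods above, using three illustrative monthly periods:

```python
import pandas as pd

pi = pd.period_range("2018-01", periods=3, freq="M")

months = pi.month.tolist()
month_lengths = pi.days_in_month.tolist()
stamps = pi.to_timestamp()                  # DatetimeIndex at the start of each period
last_days = pi.asfreq("D", how="end")       # daily periods at the end of each month
labels = list(pi.strftime("%Y-%m"))

print(months, month_lengths, stamps[0], last_days[0], labels)
```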
Scalars¶
Period¶
Attributes¶
Methods¶
Period.asfreq |
Convert Period to desired frequency, either at the start or end of the |
Period.now |
|
Period.strftime |
Returns the string representation of the Period , depending on the selected format . |
Period.to_timestamp |
Return the Timestamp representation of the Period at the target |
Timestamp¶
Timestamp |
Timestamp is the pandas equivalent of Python’s datetime and is interchangeable with it in most cases. |
Properties¶
Methods¶
Timestamp.astimezone |
Convert tz-aware Timestamp to another time zone. |
Timestamp.ceil |
return a new Timestamp ceiled to this resolution |
Timestamp.combine |
|
Timestamp.ctime |
Return ctime() style string. |
Timestamp.date |
Return date object with same year, month and day. |
Timestamp.dst |
Return self.tzinfo.dst(self). |
Timestamp.floor |
return a new Timestamp floored to this resolution |
Timestamp.freq |
|
Timestamp.freqstr |
|
Timestamp.fromordinal |
passed an ordinal, translate and convert to a ts |
Timestamp.fromtimestamp |
|
Timestamp.isocalendar |
Return a 3-tuple containing ISO year, week number, and weekday. |
Timestamp.isoformat |
|
Timestamp.isoweekday |
Return the day of the week represented by the date. |
Timestamp.normalize |
Normalize Timestamp to midnight, preserving tz information. |
Timestamp.now |
Return the current time in the local timezone. |
Timestamp.replace |
implements datetime.replace, handles nanoseconds |
Timestamp.round |
Round the Timestamp to the specified resolution |
Timestamp.strftime |
format -> strftime() style string. |
Timestamp.strptime |
string, format -> new datetime parsed from a string (like time.strptime()). |
Timestamp.time |
Return time object with same time but with tzinfo=None. |
Timestamp.timetuple |
Return time tuple, compatible with time.localtime(). |
Timestamp.timetz |
Return time object with same time and tzinfo. |
Timestamp.to_datetime64 |
Returns a numpy.datetime64 object with ‘ns’ precision |
Timestamp.to_julian_date |
Convert Timestamp to a Julian Date. |
Timestamp.to_period |
Return a period of which this timestamp is an observation. |
Timestamp.to_pydatetime |
Convert a Timestamp object to a native Python datetime object. |
Timestamp.today |
Return the current time in the local timezone. |
Timestamp.toordinal |
Return proleptic Gregorian ordinal. |
Timestamp.tz_convert |
Convert tz-aware Timestamp to another time zone. |
Timestamp.tz_localize |
Convert naive Timestamp to local time zone, or remove timezone from tz-aware Timestamp. |
Timestamp.tzname |
Return self.tzinfo.tzname(self). |
Timestamp.utcfromtimestamp |
|
Timestamp.utcnow |
|
Timestamp.utcoffset |
Return self.tzinfo.utcoffset(self). |
Timestamp.utctimetuple |
Return UTC time tuple, compatible with time.localtime(). |
Timestamp.weekday |
Return the day of the week represented by the date. |
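Because Timestamp subclasses the standard library datetime, it mixes pandas-specific methods with inherited ones. A hedged sketch on one illustrative value, assuming time zone data for `Europe/Berlin` is installed:

```python
import datetime

import pandas as pd

ts = pd.Timestamp("2018-03-15 13:45:10")

day_start = ts.floor("D")            # pandas rounding to a resolution
next_day = ts.ceil("D")
midnight = ts.normalize()            # same as flooring to the day here
native = ts.to_pydatetime()          # plain datetime.datetime
month = ts.to_period("M")            # the monthly Period containing ts
weekday = ts.weekday()               # inherited: Monday=0
iso_weekday = ts.isoweekday()        # inherited: Monday=1
berlin = ts.tz_localize("UTC").tz_convert("Europe/Berlin")

print(day_start, native, month, weekday, iso_weekday, berlin)
```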
Interval¶
Properties¶
Interval.closed |
|
Interval.closed_left |
|
Interval.closed_right |
|
Interval.left |
|
Interval.mid |
|
Interval.open_left |
|
Interval.open_right |
|
Interval.right |
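The Interval properties above describe the endpoints and which of them belong to the interval. A minimal sketch with an illustrative right-closed interval:

```python
import pandas as pd

iv = pd.Interval(0, 5, closed="right")   # represents (0, 5]

endpoints = (iv.left, iv.right)
midpoint = iv.mid
flags = (iv.closed_left, iv.closed_right, iv.open_left, iv.open_right)

# Membership respects the closed side: 5 is included, 0 is not.
contains_right = 5 in iv
contains_left = 0 in iv

print(endpoints, midpoint, iv.closed, flags, contains_right, contains_left)
```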
Timedelta¶
Properties¶
Timedelta.asm8 |
return a numpy timedelta64 array view of myself |
Timedelta.components |
Return a Components NamedTuple-like |
Timedelta.days |
Number of Days |
Timedelta.freq |
|
Timedelta.max |
|
Timedelta.microseconds |
Number of microseconds (>= 0 and less than 1 second). |
Timedelta.min |
|
Timedelta.nanoseconds |
Number of nanoseconds (>= 0 and less than 1 microsecond). |
Timedelta.resolution |
return a string representing the lowest resolution that we have |
Timedelta.seconds |
Number of seconds (>= 0 and less than 1 day). |
Timedelta.value |
Methods¶
Timedelta.ceil |
return a new Timedelta ceiled to this resolution |
Timedelta.floor |
return a new Timedelta floored to this resolution |
Timedelta.isoformat |
Format Timedelta as ISO 8601 Duration like P[n]Y[n]M[n]DT[n]H[n]M[n]S, where the `[n]`s are replaced by the values. |
Timedelta.round |
Round the Timedelta to the specified resolution |
Timedelta.to_pytimedelta |
return an actual datetime.timedelta object |
Timedelta.to_timedelta64 |
Returns a numpy.timedelta64 object with ‘ns’ precision |
Timedelta.total_seconds |
Total duration of timedelta in seconds (to ns precision) |
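A small sketch of the Timedelta properties and methods above. The duration is built from keyword arguments for clarity and is illustrative only:

```python
import datetime

import pandas as pd

td = pd.Timedelta(days=1, hours=2, minutes=30)

whole_days = td.days               # 1
day_seconds = td.seconds           # seconds beyond the whole days
total = td.total_seconds()         # full duration expressed in seconds
hours_part = td.components.hours   # hour field of the named-tuple-like breakdown
native = td.to_pytimedelta()       # datetime.timedelta equivalent
floored = td.floor("D")            # rounded down to whole days
ceiled = td.ceil("D")              # rounded up to whole days
iso_text = td.isoformat()          # ISO 8601 duration string

print(whole_days, day_seconds, total, hours_part, floored, ceiled, iso_text)
```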
Window¶
Rolling objects are returned by .rolling calls: pandas.DataFrame.rolling(), pandas.Series.rolling(), etc.
Expanding objects are returned by .expanding calls: pandas.DataFrame.expanding(), pandas.Series.expanding(), etc.
EWM objects are returned by .ewm calls: pandas.DataFrame.ewm(), pandas.Series.ewm(), etc.
Standard moving window functions¶
Rolling.count () |
rolling count of number of non-NaN |
Rolling.sum (*args, **kwargs) |
rolling sum |
Rolling.mean (*args, **kwargs) |
rolling mean |
Rolling.median (**kwargs) |
rolling median |
Rolling.var ([ddof]) |
rolling variance |
Rolling.std ([ddof]) |
rolling standard deviation |
Rolling.min (*args, **kwargs) |
rolling minimum |
Rolling.max (*args, **kwargs) |
rolling maximum |
Rolling.corr ([other, pairwise]) |
rolling sample correlation |
Rolling.cov ([other, pairwise, ddof]) |
rolling sample covariance |
Rolling.skew (**kwargs) |
Unbiased rolling skewness |
Rolling.kurt (**kwargs) |
Unbiased rolling kurtosis |
Rolling.apply (func[, args, kwargs]) |
rolling function apply |
Rolling.quantile (quantile, **kwargs) |
rolling quantile |
Window.mean (*args, **kwargs) |
window mean |
Window.sum (*args, **kwargs) |
window sum |
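A minimal sketch of the rolling functions above on an illustrative Series with a fixed window of three observations. Positions with fewer than `min_periods` observations (by default the window size) yield NaN:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
roll = s.rolling(window=3)

means = roll.mean()                                  # NaN, NaN, 2.0, 3.0, 4.0
sums = roll.sum()                                    # NaN, NaN, 6.0, 9.0, 12.0
ranges = roll.apply(lambda w: w.max() - w.min())     # custom statistic per window
partial_max = s.rolling(window=3, min_periods=1).max()  # allows shorter leading windows

print(means.tolist(), sums.tolist(), partial_max.tolist())
```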
Standard expanding window functions¶
Expanding.count (**kwargs) |
expanding count of number of non-NaN |
Expanding.sum (*args, **kwargs) |
expanding sum |
Expanding.mean (*args, **kwargs) |
expanding mean |
Expanding.median (**kwargs) |
expanding median |
Expanding.var ([ddof]) |
expanding variance |
Expanding.std ([ddof]) |
expanding standard deviation |
Expanding.min (*args, **kwargs) |
expanding minimum |
Expanding.max (*args, **kwargs) |
expanding maximum |
Expanding.corr ([other, pairwise]) |
expanding sample correlation |
Expanding.cov ([other, pairwise, ddof]) |
expanding sample covariance |
Expanding.skew (**kwargs) |
Unbiased expanding skewness |
Expanding.kurt (**kwargs) |
Unbiased expanding kurtosis |
Expanding.apply (func[, args, kwargs]) |
expanding function apply |
Expanding.quantile (quantile, **kwargs) |
expanding quantile |
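Expanding functions use every observation from the start of the data up to the current row. A short sketch on illustrative values:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

running_sum = s.expanding().sum()      # cumulative total
running_mean = s.expanding().mean()    # mean of all values seen so far
running_max = s.expanding(min_periods=2).max()   # NaN until two values are available

print(running_sum.tolist(), running_mean.tolist(), running_max.tolist())
```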
Exponentially-weighted moving window functions¶
EWM.mean (*args, **kwargs) |
exponential weighted moving average |
EWM.std ([bias]) |
exponential weighted moving stddev |
EWM.var ([bias]) |
exponential weighted moving variance |
EWM.corr ([other, pairwise]) |
exponential weighted sample correlation |
EWM.cov ([other, pairwise, bias]) |
exponential weighted sample covariance |
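As a sketch of what `EWM.mean` computes, the example below uses `adjust=False`, in which case the result follows the recursion y[t] = (1 - alpha) * y[t-1] + alpha * x[t] with y[0] = x[0]. The data and `alpha=0.5` are illustrative:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0])
alpha = 0.5

smoothed = s.ewm(alpha=alpha, adjust=False).mean()

# Reproduce the same values by hand with the recursive definition.
manual = [s.iloc[0]]
for x in s.iloc[1:]:
    manual.append((1 - alpha) * manual[-1] + alpha * x)

print(smoothed.tolist(), manual)
```

With the default `adjust=True`, pandas instead normalizes by the sum of the decaying weights, so the early values differ from this recursion.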
GroupBy¶
GroupBy objects are returned by groupby calls: pandas.DataFrame.groupby(), pandas.Series.groupby(), etc.
Indexing, iteration¶
GroupBy.__iter__ () |
Groupby iterator |
GroupBy.groups |
dict {group name -> group labels} |
GroupBy.indices |
dict {group name -> group indices} |
GroupBy.get_group (name[, obj]) |
Constructs NDFrame from group with provided name |
Grouper ([key, level, freq, axis, sort]) |
A Grouper allows the user to specify a groupby instruction for a target |
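A minimal sketch of inspecting and iterating over a GroupBy object. The frame, its column names `key` and `val`, and the variable names are illustrative:

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "b", "a", "b"], "val": [1, 2, 3, 4]})
grouped = df.groupby("key")

# Iteration yields (group name, sub-frame) pairs.
group_sizes = {name: len(frame) for name, frame in grouped}

labels_a = list(grouped.groups["a"])          # index labels belonging to group "a"
positions_b = grouped.indices["b"].tolist()   # integer positions of group "b"
frame_a = grouped.get_group("a")              # the rows of a single group

# A Grouper describes the same grouping as an object.
sums = df.groupby(pd.Grouper(key="key"))["val"].sum()

print(group_sizes, labels_a, positions_b, sums.to_dict())
```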
Function application¶
GroupBy.apply (func, *args, **kwargs) |
Apply function and combine results together in an intelligent way. |
GroupBy.aggregate (func, *args, **kwargs) |
|
GroupBy.transform (func, *args, **kwargs) |
|
GroupBy.pipe (func, *args, **kwargs) |
Apply a function with arguments to this GroupBy object, |
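The four function-application methods differ mainly in the shape of what they return. A hedged sketch on illustrative data: `aggregate` reduces each group to one value, `transform` returns a result aligned with the original rows, `apply` accepts an arbitrary function per group, and `pipe` passes the whole GroupBy object to a function.

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "b", "a", "b"], "val": [1, 2, 3, 4]})
by_key = df.groupby("key")["val"]

totals = by_key.aggregate("sum")                     # one value per group
demeaned = by_key.transform(lambda x: x - x.mean())  # same length as df
spans = by_key.apply(lambda x: x.max() - x.min())    # arbitrary per-group function

# pipe hands the GroupBy object itself to the callable.
piped = df.groupby("key").pipe(lambda g: g["val"].max() - g["val"].min())

print(totals.to_dict(), demeaned.tolist(), spans.to_dict(), piped.to_dict())
```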
Computations / Descriptive Stats¶
GroupBy.count () |
Compute count of group, excluding missing values |
GroupBy.cumcount ([ascending]) |
Number each item in each group from 0 to the length of that group - 1. |
GroupBy.first (**kwargs) |
Compute first of group values |
GroupBy.head ([n]) |
Returns first n rows of each group. |
GroupBy.last (**kwargs) |
Compute last of group values |
GroupBy.max (**kwargs) |
Compute max of group values |
GroupBy.mean (*args, **kwargs) |
Compute mean of groups, excluding missing values |
GroupBy.median (**kwargs) |
Compute median of groups, excluding missing values |
GroupBy.min (**kwargs) |
Compute min of group values |
GroupBy.ngroup ([ascending]) |
Number each group from 0 to the number of groups - 1. |
GroupBy.nth (n[, dropna]) |
Take the nth row from each group if n is an int, or a subset of rows if n is a list of ints. |
GroupBy.ohlc () |
Compute open, high, low and close values of each group, excluding missing values |
GroupBy.prod (**kwargs) |
Compute prod of group values |
GroupBy.size () |
Compute group sizes |
GroupBy.sem ([ddof]) |
Compute standard error of the mean of groups, excluding missing values |
GroupBy.std ([ddof]) |
Compute standard deviation of groups, excluding missing values |
GroupBy.sum (**kwargs) |
Compute sum of group values |
GroupBy.var ([ddof]) |
Compute variance of groups, excluding missing values |
GroupBy.tail ([n]) |
Returns last n rows of each group |
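A short sketch contrasting the reducing statistics above (one row per group) with the row-preserving helpers such as `cumcount`, `ngroup`, `head` and `tail`. The data is illustrative:

```python
import pandas as pd

df = pd.DataFrame(
    {"key": ["a", "b", "a", "b", "a"], "val": [1, 2, 3, 4, 8]}
)
grouped = df.groupby("key")

sizes = grouped.size()                  # rows per group
means = grouped["val"].mean()           # one mean per group
firsts = grouped["val"].first()         # first value in each group
within = grouped.cumcount()             # running position inside each group
group_ids = grouped.ngroup()            # integer id of each row's group
first_rows = grouped.head(1)            # first row of each group, original order kept
last_rows = grouped.tail(1)             # last row of each group

print(sizes.to_dict(), means.to_dict(), within.tolist(), group_ids.tolist())
```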
The following methods are available in both SeriesGroupBy and DataFrameGroupBy objects, but may differ slightly, usually in that the DataFrameGroupBy version permits the specification of an axis argument, and often an argument indicating whether to restrict application to columns of a specific data type.
DataFrameGroupBy.agg (arg, *args, **kwargs) |
Aggregate using callable, string, dict, or list of string/callables |
DataFrameGroupBy.all |
Return whether all elements are True over requested axis |
DataFrameGroupBy.any |
Return whether any element is True over requested axis |
DataFrameGroupBy.bfill ([limit]) |
Backward fill the values |
DataFrameGroupBy.corr |
Compute pairwise correlation of columns, excluding NA/null values |
DataFrameGroupBy.count () |
Compute count of group, excluding missing values |
DataFrameGroupBy.cov |
Compute pairwise covariance of columns, excluding NA/null values |
DataFrameGroupBy.cummax ([axis]) |
Cumulative max for each group |
DataFrameGroupBy.cummin ([axis]) |
Cumulative min for each group |
DataFrameGroupBy.cumprod ([axis]) |
Cumulative product for each group |
DataFrameGroupBy.cumsum ([axis]) |
Cumulative sum for each group |
DataFrameGroupBy.describe (**kwargs) |
Generates descriptive statistics that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values. |
DataFrameGroupBy.diff |
1st discrete difference of object |
DataFrameGroupBy.ffill ([limit]) |
Forward fill the values |
DataFrameGroupBy.fillna |
Fill NA/NaN values using the specified method |
DataFrameGroupBy.filter (func[, dropna]) |
Return a copy of a DataFrame excluding elements from groups that do not satisfy the boolean criterion specified by func. |
DataFrameGroupBy.hist |
Draw histogram of the DataFrame’s series using matplotlib / pylab. |
DataFrameGroupBy.idxmax |
Return index of first occurrence of maximum over requested axis. |
DataFrameGroupBy.idxmin |
Return index of first occurrence of minimum over requested axis. |
DataFrameGroupBy.mad |
Return the mean absolute deviation of the values for the requested axis |
DataFrameGroupBy.pct_change |
Percent change over given number of periods. |
DataFrameGroupBy.plot |
Class implementing the .plot attribute for groupby objects |
DataFrameGroupBy.quantile |
Return values at the given quantile over requested axis, a la numpy.percentile. |
DataFrameGroupBy.rank |
Compute numerical data ranks (1 through n) along axis. |
DataFrameGroupBy.resample (rule, *args, **kwargs) |
Provide resampling when using a TimeGrouper |
DataFrameGroupBy.shift ([periods, freq, axis]) |
Shift each group by periods observations |
DataFrameGroupBy.size () |
Compute group sizes |
DataFrameGroupBy.skew |
Return unbiased skew over requested axis |
DataFrameGroupBy.take |
Return the elements in the given positional indices along an axis. |
DataFrameGroupBy.tshift |
Shift the time index, using the index’s frequency if available. |
The following methods are available only for SeriesGroupBy objects.
SeriesGroupBy.nlargest |
Return the largest n elements. |
SeriesGroupBy.nsmallest |
Return the smallest n elements. |
SeriesGroupBy.nunique ([dropna]) |
Returns number of unique elements in the group |
SeriesGroupBy.unique |
Return unique values in the object. |
SeriesGroupBy.value_counts ([normalize, ...]) |
The following methods are available only for DataFrameGroupBy objects.
DataFrameGroupBy.corrwith |
Compute pairwise correlation between rows or columns of two DataFrame objects. |
DataFrameGroupBy.boxplot (grouped[, ...]) |
Make box plots from DataFrameGroupBy data. |
Resampling¶
Resampler objects are returned by resample calls: pandas.DataFrame.resample(), pandas.Series.resample().
Indexing, iteration¶
Resampler.__iter__ () |
Groupby iterator |
Resampler.groups |
dict {group name -> group labels} |
Resampler.indices |
dict {group name -> group indices} |
Resampler.get_group (name[, obj]) |
Constructs NDFrame from group with provided name |
Function application¶
Resampler.apply (arg, *args, **kwargs) |
Aggregate using callable, string, dict, or list of string/callables |
Resampler.aggregate (arg, *args, **kwargs) |
Aggregate using callable, string, dict, or list of string/callables |
Resampler.transform (arg, *args, **kwargs) |
Call function producing a like-indexed Series on each group and return |
Upsampling¶
Resampler.ffill ([limit]) |
Forward fill the values |
Resampler.backfill ([limit]) |
Backward fill the values |
Resampler.bfill ([limit]) |
Backward fill the values |
Resampler.pad ([limit]) |
Forward fill the values |
Resampler.nearest ([limit]) |
Fill values with nearest neighbor starting from center |
Resampler.fillna (method[, limit]) |
Fill missing values |
Resampler.asfreq ([fill_value]) |
return the values at the new freq, |
Resampler.interpolate ([method, axis, limit, ...]) |
Interpolate values according to different methods. |
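Upsampling creates rows that have no source data, and the methods above differ in how they fill them. A minimal sketch that converts an illustrative two-day series to daily frequency:

```python
import pandas as pd

index = pd.date_range("2018-01-01", periods=3, freq="2D")   # Jan 1, 3, 5
s = pd.Series([0.0, 2.0, 4.0], index=index)

daily = s.resample("D")

gaps = daily.asfreq()              # new rows are left as NaN
carried = daily.ffill()            # last known value carried forward
pulled_back = daily.bfill()        # next known value pulled backward
linear = daily.interpolate()       # linear interpolation between known points

print(carried.tolist(), pulled_back.tolist(), linear.tolist())
```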
Computations / Descriptive Stats¶
Resampler.count ([_method]) |
Compute count of group, excluding missing values |
Resampler.nunique ([_method]) |
Returns number of unique elements in the group |
Resampler.first ([_method]) |
Compute first of group values |
Resampler.last ([_method]) |
Compute last of group values |
Resampler.max ([_method]) |
Compute max of group values |
Resampler.mean ([_method]) |
Compute mean of groups, excluding missing values |
Resampler.median ([_method]) |
Compute median of groups, excluding missing values |
Resampler.min ([_method]) |
Compute min of group values |
Resampler.ohlc ([_method]) |
Compute open, high, low and close values of a group, excluding missing values |
Resampler.prod ([_method]) |
Compute prod of group values |
Resampler.size () |
Compute group sizes |
Resampler.sem ([_method]) |
Compute standard error of the mean of groups, excluding missing values |
Resampler.std ([ddof]) |
Compute standard deviation of groups, excluding missing values |
Resampler.sum ([_method]) |
Compute sum of group values |
Resampler.var ([ddof]) |
Compute variance of groups, excluding missing values |
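A short sketch of a few of the reductions above applied to the same Resampler. The values and the "3D" frequency are made up, and only sum, mean, size and ohlc are shown.

```python
import pandas as pd

# Hypothetical data: six daily values.
idx = pd.date_range("2018-01-01", periods=6, freq="D")
s = pd.Series([3.0, 1.0, 2.0, 6.0, 4.0, 5.0], index=idx)

r = s.resample("3D")  # two bins: [3, 1, 2] and [6, 4, 5]

totals = r.sum()   # one value per bin
means = r.mean()
sizes = r.size()   # number of rows in each bin

# ohlc returns a DataFrame with the first, max, min and last value of each bin.
bars = r.ohlc()
```

Note that ohlc is the only one of these that returns several columns for a single input Series.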
Style¶
Styler objects are returned by pandas.DataFrame.style.
Constructor¶
Styler (data[, precision, table_styles, ...]) |
Helps style a DataFrame or Series according to the data with HTML and CSS. |
Style Application¶
Styler.apply (func[, axis, subset]) |
Apply a function column-wise, row-wise, or table-wise, updating the HTML representation with the result. |
Styler.applymap (func[, subset]) |
Apply a function elementwise, updating the HTML representation with the result. |
Styler.where (cond, value[, other, subset]) |
Apply a function elementwise, updating the HTML representation with a style which is selected in accordance with the return value of a function. |
Styler.format (formatter[, subset]) |
Format the text display value of cells. |
Styler.set_precision (precision) |
Set the precision used to render. |
Styler.set_table_styles (table_styles) |
Set the table styles on a Styler. |
Styler.set_caption (caption) |
Set the caption on a Styler. |
Styler.set_properties ([subset]) |
Convenience method for setting one or more non-data-dependent properties on each cell. |
Styler.set_uuid (uuid) |
Set the uuid for a Styler. |
Styler.clear () |
“Reset” the styler, removing any previously applied styles. |
Builtin Styles¶
Styler.highlight_max ([subset, color, axis]) |
Highlight the maximum by shading the background |
Styler.highlight_min ([subset, color, axis]) |
Highlight the minimum by shading the background |
Styler.highlight_null ([null_color]) |
Shade the background null_color for missing values. |
Styler.background_gradient ([cmap, low, ...]) |
Color the background in a gradient according to the data in each column (optionally row). |
Styler.bar ([subset, axis, color, width, align]) |
Color the background proportional to the values in each column. |
Style Export and Import¶
Styler.render (**kwargs) |
Render the built up styles to HTML |
Styler.export () |
Export the styles applied to the current Styler. |
Styler.use (styles) |
Set the styles on the current Styler, possibly using styles from Styler.export . |
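Styler.apply expects a function that receives a column (or row) and returns one CSS string per cell. The sketch below shows one such function and previews its output on plain columns. The function name `highlight_max_css`, the data, and the colour are hypothetical. The Styler call itself is left as a comment because it needs the optional jinja2 dependency.

```python
import pandas as pd

def highlight_max_css(column, color="yellow"):
    """Return one CSS string per cell: shade the maximum, leave the rest unstyled."""
    is_max = column == column.max()
    return ["background-color: {}".format(color) if flag else "" for flag in is_max]

# Hypothetical data.
df = pd.DataFrame({"a": [1, 5, 3], "b": [9, 2, 4]})

# Preview the CSS strings that would be produced for each column.
preview = {name: highlight_max_css(df[name]) for name in df.columns}

# With jinja2 installed, the same function can be handed to the Styler:
# styler = df.style.apply(highlight_max_css)
```

Keeping the styling function free of any Styler dependency makes it easy to check in isolation before it is applied to a table.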
General utility functions¶
Working with options¶
describe_option (pat[, _print_desc]) |
Prints the description for one or more registered options. |
reset_option (pat) |
Reset one or more options to their default value. |
get_option (pat) |
Retrieves the value of the specified option. |
set_option (pat, value) |
Sets the value of the specified option. |
option_context (*args) |
Context manager to temporarily set options in the with statement context. |
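A minimal sketch of the option functions working together. It uses the "display.max_rows" and "display.precision" options, and the values 5 and 3 are arbitrary.

```python
import pandas as pd

default_rows = pd.get_option("display.max_rows")

# option_context changes an option only inside the with block.
with pd.option_context("display.max_rows", 5):
    inside = pd.get_option("display.max_rows")

after = pd.get_option("display.max_rows")  # back to the previous value

# set_option changes an option until it is reset.
pd.set_option("display.precision", 3)
changed = pd.get_option("display.precision")

pd.reset_option("display.precision")
restored = pd.get_option("display.precision")
```

option_context is usually the safer choice in library or test code, since the previous value is restored even if the block raises.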
Testing functions¶
testing.assert_frame_equal (left, right[, ...]) |
Check that left and right DataFrame are equal. |
testing.assert_series_equal (left, right[, ...]) |
Check that left and right Series are equal. |
testing.assert_index_equal (left, right[, ...]) |
Check that left and right Index are equal. |
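The assertion helpers raise AssertionError on a mismatch and return nothing on success. The sketch below compares two made-up frames that hold the same values with different dtypes, first strictly and then with the `check_dtype=False` keyword.

```python
import pandas as pd

# Hypothetical frames: same values, but column "a" is int in one and float in the other.
left = pd.DataFrame({"a": [1, 2], "b": [0.1, 0.2]})
right = pd.DataFrame({"a": [1.0, 2.0], "b": [0.1, 0.2]})

# The default comparison also checks dtypes, so it fails here.
try:
    pd.testing.assert_frame_equal(left, right)
    strict_equal = True
except AssertionError:
    strict_equal = False

# Relaxing the dtype check lets the comparison pass on values alone.
pd.testing.assert_frame_equal(left, right, check_dtype=False)
relaxed_equal = True
```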
Exceptions and warnings¶
errors.DtypeWarning |
Warning that is raised for a dtype incompatibility. |
errors.EmptyDataError |
Exception that is thrown in pd.read_csv (by both the C and Python engines) when empty data or header is encountered. |
errors.OutOfBoundsDatetime |
|
errors.ParserError |
Exception that is raised by an error encountered in pd.read_csv. |
errors.ParserWarning |
Warning that is raised in pd.read_csv whenever it is necessary to change parsers (generally from ‘c’ to ‘python’) contrary to the one specified by the user due to lack of support or functionality for parsing particular attributes of a CSV file with the requested engine. |
errors.PerformanceWarning |
Warning raised when there is a possible performance impact. |
errors.UnsortedIndexError |
Error raised when attempting to get a slice of a MultiIndex, and the index has not been lexsorted. |
errors.UnsupportedFunctionCall |
Exception raised when attempting to call a numpy function on a pandas object, but that function is not supported by the object. |
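The exception classes above can be caught by name through pandas.errors. As one possible pattern, the sketch below wraps pd.read_csv so that empty input yields an empty DataFrame instead of an error. The helper name `read_or_empty` is hypothetical.

```python
import io

import pandas as pd

def read_or_empty(buffer):
    """Read CSV data, falling back to an empty DataFrame when there is nothing to parse."""
    try:
        return pd.read_csv(buffer)
    except pd.errors.EmptyDataError:
        return pd.DataFrame()

empty = read_or_empty(io.StringIO(""))            # no header, no data
parsed = read_or_empty(io.StringIO("a,b\n1,2\n"))  # one header row, one data row
```

Whether an empty file should be treated as an empty table or as a failure depends on the application, so the fallback here is a design choice rather than a recommendation.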