pandas.DataFrame.diff¶
- DataFrame.diff(periods=1, axis=0)[source]¶
First discrete difference of element.
Calculates the difference of a DataFrame element compared with another element in the DataFrame (default is element in previous row).
- Parameters
- periodsint, default 1
Periods to shift for calculating difference, accepts negative values.
- axis{0 or ‘index’, 1 or ‘columns’}, default 0
Take difference over rows (0) or columns (1).
- Returns
- DataFrame
First differences of the Series.
See also
DataFrame.pct_change
Percent change over given number of periods.
DataFrame.shift
Shift index by desired number of periods with an optional time freq.
Series.diff
First discrete difference of object.
Notes
For boolean dtypes, this uses
operator.xor()
rather thanoperator.sub()
. The result is calculated according to current dtype in DataFrame, however dtype of the result is always float64.Examples
Difference with previous row
>>> df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6], ... 'b': [1, 1, 2, 3, 5, 8], ... 'c': [1, 4, 9, 16, 25, 36]}) >>> df a b c 0 1 1 1 1 2 1 4 2 3 2 9 3 4 3 16 4 5 5 25 5 6 8 36
>>> df.diff() a b c 0 NaN NaN NaN 1 1.0 0.0 3.0 2 1.0 1.0 5.0 3 1.0 1.0 7.0 4 1.0 2.0 9.0 5 1.0 3.0 11.0
Difference with previous column
>>> df.diff(axis=1) a b c 0 NaN 0 0 1 NaN -1 3 2 NaN -1 7 3 NaN -1 13 4 NaN 0 20 5 NaN 2 28
Difference with 3rd previous row
>>> df.diff(periods=3) a b c 0 NaN NaN NaN 1 NaN NaN NaN 2 NaN NaN NaN 3 3.0 2.0 15.0 4 3.0 4.0 21.0 5 3.0 6.0 27.0
Difference with following row
>>> df.diff(periods=-1) a b c 0 -1.0 0.0 -3.0 1 -1.0 -1.0 -5.0 2 -1.0 -1.0 -7.0 3 -1.0 -2.0 -9.0 4 -1.0 -3.0 -11.0 5 NaN NaN NaN
Overflow in input dtype
>>> df = pd.DataFrame({'a': [1, 0]}, dtype=np.uint8) >>> df.diff() a 0 NaN 1 255.0