pandas.DataFrame.where#

DataFrame.where(cond, other=<no_default>, *, inplace=False, axis=None, level=None)[source]#

Replace values where the condition is False.

This method allows conditional replacement of values. Where the condition evaluates to True, the original values are retained; where it evaluates to False, values are replaced with corresponding entries from other.

Parameters:

condbool Series/DataFrame, array-like, or callable: Where cond is True, keep the original value. Where False, replace with corresponding value from other. If cond is callable, it is computed on the Series/DataFrame and should return boolean Series/DataFrame or array. The callable must not change input Series/DataFrame (though pandas doesn’t check it).
otherscalar, array-like, Series/DataFrame, Index, or callable: Entries where cond is False are replaced with corresponding value from other. If other is callable, it is computed on the Series/DataFrame and should return scalar or Series/DataFrame. The callable must not change input Series/DataFrame (though pandas doesn’t check it). If not specified, entries will be filled with the corresponding NULL value (np.nan for numpy dtypes, pd.NA for extension dtypes).
inplacebool, default False: Whether to perform the operation in place on the data.
axisint, default None: Alignment axis if needed. For Series this parameter is unused and defaults to 0.
levelint, default None: Alignment level if needed.

Returns:

Series or DataFrame: When applied to a Series, the function will return a Series, and when applied to a DataFrame, it will return a DataFrame.

See also

DataFrame.mask(): Return an object of same shape as caller.
Series.mask(): Return an object of same shape as caller.

Notes

The where method is an application of the if-then idiom. For each element in the caller, if cond is True the element is used; otherwise the corresponding element from other is used. If the axis of cond does not align with the caller Series/DataFrame, the values of cond on misaligned index positions will be filled with False.

The signature for Series.where() or DataFrame.where() differs from numpy.where(). Roughly df1.where(m, df2) is equivalent to np.where(m, df1, df2).

For further details and examples see the where documentation in indexing.

The dtype of the object takes precedence. The fill value is casted to the object’s dtype, if this can be done losslessly.

Examples

>>> s = pd.Series(range(5))
>>> s.where(s > 0)
0    NaN
1    1.0
2    2.0
3    3.0
4    4.0
dtype: float64
>>> s.mask(s > 0)
0    0.0
1    NaN
2    NaN
3    NaN
4    NaN
dtype: float64

>>> s = pd.Series(range(5))
>>> t = pd.Series([True, False])
>>> s.where(t, 99)
0     0
1    99
2    99
3    99
4    99
dtype: int64
>>> s.mask(t, 99)
0    99
1     1
2    99
3    99
4    99
dtype: int64

>>> s.where(s > 1, 10)
  10
  10
  2
  3
  4
dtype: int64
>>> s.mask(s > 1, 10)
   0
   1
  10
  10
  10
dtype: int64

>>> df = pd.DataFrame(np.arange(10).reshape(-1, 2), columns=["A", "B"])
>>> df
   A  B
0  0  1
1  2  3
2  4  5
3  6  7
4  8  9
>>> m = df % 3 == 0
>>> df.where(m, -df)
   A  B
0  0 -1
1 -2  3
2 -4 -5
3  6 -7
4 -8  9
>>> df.where(m, -df) == np.where(m, df, -df)
      A     B
0  True  True
1  True  True
2  True  True
3  True  True
4  True  True
>>> df.where(m, -df) == df.mask(~m, -df)
      A     B
0  True  True
1  True  True
2  True  True
3  True  True
4  True  True