Nullable Boolean data type
Note
BooleanArray is currently experimental. Its API or implementation may change without warning.
Indexing with NA values
pandas allows indexing with NA values in a boolean array, which are treated as False.
In [1]: s = pd.Series([1, 2, 3])
In [2]: mask = pd.array([True, False, pd.NA], dtype="boolean")
In [3]: s[mask]
Out[3]:
0 1
dtype: int64
If you would prefer to keep the NA values, you can manually fill them with fillna(True).
In [4]: s[mask.fillna(True)]
Out[4]:
0 1
2 3
dtype: int64
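As a quick check of the default behaviour, the following minimal sketch rebuilds s and mask from the examples above and compares indexing with the raw mask against indexing with the NA entries explicitly filled with False. The variable names default_result and explicit_result are illustrative only.

```python
import pandas as pd

s = pd.Series([1, 2, 3])
mask = pd.array([True, False, pd.NA], dtype="boolean")

# Indexing with the raw mask drops the row whose mask entry is NA ...
default_result = s[mask]

# ... which should match filling the NA entries with False explicitly.
explicit_result = s[mask.fillna(False)]

print(default_result.equals(explicit_result))
```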
If you create a column of NA values (for example to fill them later) with df['new_col'] = pd.NA, the dtype of the new column is set to object. Performance on this column will be worse than with the appropriate type. It’s better to use df['new_col'] = pd.Series(pd.NA, dtype="boolean") (or another dtype that supports NA).
In [5]: df = pd.DataFrame()
In [6]: df['objects'] = pd.NA
In [7]: df.dtypes
Out[7]:
objects object
dtype: object
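For comparison, here is a minimal sketch of the recommended approach. The three-row frame and the column name flags are assumptions made for illustration, and the column is built from an explicit list of NA values so that its length matches the frame's index.

```python
import pandas as pd

# Hypothetical three-row frame; the index gives the new column its length.
df = pd.DataFrame(index=range(3))

# Create the all-NA column with the nullable boolean dtype up front.
df["flags"] = pd.Series([pd.NA] * len(df), index=df.index, dtype="boolean")
print(df.dtypes)

# Filling in a value later keeps the nullable boolean dtype.
df.loc[0, "flags"] = True
print(df["flags"].dtype)
```

Declaring the dtype at creation time avoids the object fallback shown above, so later assignments operate on a BooleanArray rather than on generic Python objects.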
Kleene logical operations
arrays.BooleanArray implements Kleene Logic (sometimes called three-value logic) for logical operations like & (and), | (or) and ^ (exclusive-or).
This table demonstrates the results for every combination. These operations are symmetrical, so swapping the left- and right-hand sides makes no difference to the result.
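Expression        Result
True & True       True
True & False      False
True & NA         NA
False & False     False
False & NA        False
NA & NA           NA
True | True       True
True | False      True
True | NA         True
False | False     False
False | NA        NA
NA | NA           NA
True ^ True       False
True ^ False      True
True ^ NA         NA
False ^ False     False
False ^ NA        NA
NA ^ NA           NA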
When an NA is present in an operation, the output value is NA only if the result cannot be determined solely based on the other input. For example, True | NA is True, because both True | True and True | False are True. In that case, we don’t actually need to consider the value of the NA.
On the other hand, True & NA is NA. The result depends on whether the NA really is True or False, since True & True is True, but True & False is False, so we can’t determine the output.
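The rules above can be checked with the scalar pd.NA, which follows the same Kleene logic. The sketch below enumerates every unordered pair of True, False and NA for the three operators and asserts that swapping the operands gives the same result. The helper same is a hypothetical name introduced here, needed because NA cannot be compared reliably with ==.

```python
import itertools
import operator

import pandas as pd

values = [True, False, pd.NA]
ops = {"&": operator.and_, "|": operator.or_, "^": operator.xor}


def same(a, b):
    # NA == anything evaluates to NA, so compare NA by identity instead.
    if a is pd.NA or b is pd.NA:
        return a is b
    return a == b


for symbol, op in ops.items():
    for left, right in itertools.combinations_with_replacement(values, 2):
        result = op(left, right)
        # Swapping the operands should not change the result.
        assert same(result, op(right, left))
        print(f"{left} {symbol} {right} -> {result}")
```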
This differs from how np.nan behaves in logical operations. pandas treats np.nan as always false in the output.
In or
In [8]: pd.Series([True, False, np.nan], dtype="object") | True
Out[8]:
0 True
1 True
2 False
dtype: bool
In [9]: pd.Series([True, False, np.nan], dtype="boolean") | True
Out[9]:
0 True
1 True
2 True
dtype: boolean
In and
In [10]: pd.Series([True, False, np.nan], dtype="object") & True
Out[10]:
0 True
1 False
2 False
dtype: bool
In [11]: pd.Series([True, False, np.nan], dtype="boolean") & True
Out[11]:
0 True
1 False
2 <NA>
dtype: boolean