pandas.Series.str.extract#
- Series.str.extract(pat, flags=0, expand=True)[source]#
- Extract capture groups in the regex pat as columns in a DataFrame. - For each subject string in the Series, extract groups from the first match of regular expression pat. - Parameters:
- patstr
- Regular expression pattern with capturing groups. 
- flagsint, default 0 (no flags)
- Flags from the - remodule, e.g.- re.IGNORECASE, that modify regular expression matching for things like case, spaces, etc. For more details, see- re.
- expandbool, default True
- If True, return DataFrame with one column per capture group. If False, return a Series/Index if there is one capture group or DataFrame if there are multiple capture groups. 
 
- Returns:
- DataFrame or Series or Index
- A DataFrame with one row for each subject string, and one column for each group. Any capture group names in regular expression pat will be used for column names; otherwise capture group numbers will be used. The dtype of each result column is always object, even when no match is found. If - expand=Falseand pat has only one capture group, then return a Series (if subject is a Series) or Index (if subject is an Index).
 
 - See also - extractall
- Returns all matches (not just the first match). 
 - Examples - A pattern with two groups will return a DataFrame with two columns. Non-matches will be NaN. - >>> s = pd.Series(['a1', 'b2', 'c3']) >>> s.str.extract(r'([ab])(\d)') 0 1 0 a 1 1 b 2 2 NaN NaN - A pattern may contain optional groups. - >>> s.str.extract(r'([ab])?(\d)') 0 1 0 a 1 1 b 2 2 NaN 3 - Named groups will become column names in the result. - >>> s.str.extract(r'(?P<letter>[ab])(?P<digit>\d)') letter digit 0 a 1 1 b 2 2 NaN NaN - A pattern with one group will return a DataFrame with one column if expand=True. - >>> s.str.extract(r'[ab](\d)', expand=True) 0 0 1 1 2 2 NaN - A pattern with one group will return a Series if expand=False. - >>> s.str.extract(r'[ab](\d)', expand=False) 0 1 1 2 2 NaN dtype: object