pandas.DataFrame.transform¶
- DataFrame.transform(func, axis=0, *args, **kwargs)[source]¶
- Call - funcon self producing a DataFrame with the same axis shape as self.- Parameters
- funcfunction, str, list-like or dict-like
- Function to use for transforming the data. If a function, must either work when passed a DataFrame or when passed to DataFrame.apply. If func is both list-like and dict-like, dict-like behavior takes precedence. - Accepted combinations are: - function 
- string function name 
- list-like of functions and/or function names, e.g. - [np.exp, 'sqrt']
- dict-like of axis labels -> functions, function names or list-like of such. 
 
- axis{0 or ‘index’, 1 or ‘columns’}, default 0
- If 0 or ‘index’: apply function to each column. If 1 or ‘columns’: apply function to each row. 
- *args
- Positional arguments to pass to func. 
- **kwargs
- Keyword arguments to pass to func. 
 
- Returns
- DataFrame
- A DataFrame that must have the same length as self. 
 
- Raises
- ValueErrorIf the returned DataFrame has a different length than self.
 
 - See also - DataFrame.agg
- Only perform aggregating type operations. 
- DataFrame.apply
- Invoke function on a DataFrame. 
 - Notes - Functions that mutate the passed object can produce unexpected behavior or errors and are not supported. See Mutating with User Defined Function (UDF) methods for more details. - Examples - >>> df = pd.DataFrame({'A': range(3), 'B': range(1, 4)}) >>> df A B 0 0 1 1 1 2 2 2 3 >>> df.transform(lambda x: x + 1) A B 0 1 2 1 2 3 2 3 4 - Even though the resulting DataFrame must have the same length as the input DataFrame, it is possible to provide several input functions: - >>> s = pd.Series(range(3)) >>> s 0 0 1 1 2 2 dtype: int64 >>> s.transform([np.sqrt, np.exp]) sqrt exp 0 0.000000 1.000000 1 1.000000 2.718282 2 1.414214 7.389056 - You can call transform on a GroupBy object: - >>> df = pd.DataFrame({ ... "Date": [ ... "2015-05-08", "2015-05-07", "2015-05-06", "2015-05-05", ... "2015-05-08", "2015-05-07", "2015-05-06", "2015-05-05"], ... "Data": [5, 8, 6, 1, 50, 100, 60, 120], ... }) >>> df Date Data 0 2015-05-08 5 1 2015-05-07 8 2 2015-05-06 6 3 2015-05-05 1 4 2015-05-08 50 5 2015-05-07 100 6 2015-05-06 60 7 2015-05-05 120 >>> df.groupby('Date')['Data'].transform('sum') 0 55 1 108 2 66 3 121 4 55 5 108 6 66 7 121 Name: Data, dtype: int64 - >>> df = pd.DataFrame({ ... "c": [1, 1, 1, 2, 2, 2, 2], ... "type": ["m", "n", "o", "m", "m", "n", "n"] ... }) >>> df c type 0 1 m 1 1 n 2 1 o 3 2 m 4 2 m 5 2 n 6 2 n >>> df['size'] = df.groupby('c')['type'].transform(len) >>> df c type size 0 1 m 3 1 1 n 3 2 1 o 3 3 2 m 4 4 2 m 4 5 2 n 4 6 2 n 4