v.0.7.0 (February 9, 2012)¶
New features¶
- New unified merge function for efficiently performing full gamut of database / relational-algebra operations. Refactored existing join methods to use the new infrastructure, resulting in substantial performance gains (GH220, GH249, GH267)
- New unified concatenation function for concatenating
Series, DataFrame or Panel objects along an axis. Can form union or
intersection of the other axes. Improves performance of
Series.append
andDataFrame.append
(GH468, GH479, GH273) - Can pass multiple DataFrames to
DataFrame.append to concatenate (stack) and multiple Series to
Series.append
too - Can pass list of dicts (e.g., a list of JSON objects) to DataFrame constructor (GH526)
- You can now set multiple columns in a
DataFrame via
__getitem__
, useful for transformation (GH342) - Handle differently-indexed output values in
DataFrame.apply
(GH498)
In [1]: df = pd.DataFrame(np.random.randn(10, 4))
In [2]: df.apply(lambda x: x.describe())
Out[2]:
0 1 2 3
count 10.000000 10.000000 10.000000 10.000000
mean 0.190912 -0.395125 -0.731920 -0.403130
std 0.730951 0.813266 1.112016 0.961912
min -0.861849 -2.104569 -1.776904 -1.469388
25% -0.411391 -0.698728 -1.501401 -1.076610
50% 0.380863 -0.228039 -1.191943 -1.004091
75% 0.658444 0.057974 -0.034326 0.461706
max 1.212112 0.577046 1.643563 1.071804
[8 rows x 4 columns]
- Add
reorder_levels
method to Series and DataFrame (GH534) - Add dict-like
get
function to DataFrame and Panel (GH521) - Add
DataFrame.iterrows
method for efficiently iterating through the rows of a DataFrame - Add
DataFrame.to_panel
with code adapted fromLongPanel.to_long
- Add
reindex_axis
method added to DataFrame - Add
level
option to binary arithmetic functions onDataFrame
andSeries
- Add
level
option to thereindex
andalign
methods on Series and DataFrame for broadcasting values across a level (GH542, GH552, others) - Add attribute-based item access to
Panel
and add IPython completion (GH563) - Add
logy
option toSeries.plot
for log-scaling on the Y axis - Add
index
andheader
options toDataFrame.to_string
- Can pass multiple DataFrames to
DataFrame.join
to join on index (GH115) - Can pass multiple Panels to
Panel.join
(GH115) - Added
justify
argument toDataFrame.to_string
to allow different alignment of column headers - Add
sort
option to GroupBy to allow disabling sorting of the group keys for potential speedups (GH595) - Can pass MaskedArray to Series constructor (GH563)
- Add Panel item access via attributes and IPython completion (GH554)
- Implement
DataFrame.lookup
, fancy-indexing analogue for retrieving values given a sequence of row and column labels (GH338) - Can pass a list of functions to aggregate with groupby on a DataFrame, yielding an aggregated result with hierarchical columns (GH166)
- Can call
cummin
andcummax
on Series and DataFrame to get cumulative minimum and maximum, respectively (GH647) value_range
added as utility function to get min and max of a dataframe (GH288)- Added
encoding
argument toread_csv
,read_table
,to_csv
andfrom_csv
for non-ascii text (GH717) - Added
abs
method to pandas objects - Added
crosstab
function for easily computing frequency tables - Added
isin
method to index objects - Added
level
argument toxs
method of DataFrame.
API Changes to integer indexing¶
One of the potentially riskiest API changes in 0.7.0, but also one of the most important, was a complete review of how integer indexes are handled with regard to label-based indexing. Here is an example:
In [3]: s = pd.Series(np.random.randn(10), index=range(0, 20, 2))
In [4]: s
Out[4]:
0 -1.294524
2 0.413738
4 0.276662
6 -0.472035
8 -0.013960
10 -0.362543
12 -0.006154
14 -0.923061
16 0.895717
18 0.805244
Length: 10, dtype: float64
In [5]: s[0]