v.0.6.0 (November 25, 2011)
New Features
- Added melt function to pandas.core.reshape
- Added level parameter to group by level in Series and DataFrame descriptive statistics (GH313)
- Added head and tail methods to Series, analogous to DataFrame (GH296)
- Added Series.isin function which checks if each value is contained in a passed sequence (GH289)
- Added float_format option to Series.to_string
- Added skip_footer (GH291) and converters (GH343) options to read_csv and read_table
- Added drop_duplicates and duplicated functions for removing duplicate DataFrame rows and checking for duplicate rows, respectively (GH319)
- Implemented operators ‘&’, ‘|’, ‘^’, ‘-’ on DataFrame (GH347)
- Added Series.mad, mean absolute deviation
- Added QuarterEnd DateOffset (GH321)
- Added dot to DataFrame (GH65)
- Added orient option to Panel.from_dict (GH359, GH301)
- Added orient option to DataFrame.from_dict
- Added passing list of tuples or list of lists to DataFrame.from_records (GH357)
- Added multiple levels to groupby (GH103)
- Allow multiple columns in by argument of DataFrame.sort_index (GH92, GH362)
- Added fast get_value and put_value methods to DataFrame (GH360)
- Added cov instance methods to Series and DataFrame (GH194, GH362)
- Added kind='bar' option to DataFrame.plot (GH348)
- Added idxmin and idxmax to Series and DataFrame (GH286)
- Added read_clipboard function to parse DataFrame from clipboard (GH300)
- Added nunique function to Series for counting unique elements (GH297)
- Made DataFrame constructor use Series name if no columns passed (GH373)
- Support regular expressions in read_table/read_csv (GH364)
- Added DataFrame.to_html for writing DataFrame to HTML (GH387)
- Added support for MaskedArray data in DataFrame, masked values converted to NaN (GH396)
- Added DataFrame.boxplot function (GH368)
- Can pass extra args, kwds to DataFrame.apply (GH376)
- Implement DataFrame.join with vector on argument (GH312)
- Added legend boolean flag to DataFrame.plot (GH324)
- Can pass multiple levels to stack and unstack (GH370)
- Can pass multiple values columns to pivot_table (GH381)
- Use Series name in GroupBy for result index (GH363)
- Added raw option to DataFrame.apply for better performance when only an ndarray is needed (GH309)
- Added proper, tested weighted least squares to standard and panel OLS (GH303)
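A few of the additions listed above can be tried in a short session. The following is only a sketch: it assumes a current pandas release, where melt is reachable as pd.melt (these notes place it in pandas.core.reshape), and all sample data is invented for illustration.

```python
import pandas as pd

# Hypothetical sample data, not taken from the release notes.
s = pd.Series([3, 1, 2, 3], index=["a", "b", "c", "d"])

# Series.isin: elementwise membership test against a passed sequence.
mask = s.isin([1, 3])

# Series.nunique: number of distinct values.
n_unique = s.nunique()

# idxmin / idxmax: index label of the minimum / maximum value.
# When the extreme value is repeated, the first matching label is returned.
label_of_min = s.idxmin()
label_of_max = s.idxmax()

df = pd.DataFrame({"k": ["x", "x", "y"], "v": [1, 1, 2]})

# duplicated flags rows that repeat an earlier row;
# drop_duplicates returns the frame without those rows.
dup_flags = df.duplicated()
deduped = df.drop_duplicates()

# melt: reshape from wide to long format, keeping "id" as an identifier.
wide = pd.DataFrame({"id": [1, 2], "a": [10, 20], "b": [30, 40]})
long_df = pd.melt(wide, id_vars=["id"])

print(mask.tolist(), n_unique, label_of_min, label_of_max)
print(dup_flags.tolist(), len(deduped), long_df.shape)
```

Some other entries in this list (for example Series.mad, get_value/put_value, and Panel.from_dict) may not be available in later pandas versions, so they are left out of the sketch.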
Performance Enhancements
- VBENCH Cythonized cache_readonly, resulting in substantial micro-performance enhancements throughout the code base (GH361)
- VBENCH Special Cython matrix iterator for applying arbitrary reduction operations with 3-5x better performance than np.apply_along_axis (GH309)
- VBENCH Improved performance of MultiIndex.from_tuples
- VBENCH Special Cython matrix iterator for applying arbitrary reduction operations
- VBENCH + DOCUMENT Add raw option to DataFrame.apply for getting better performance when
- VBENCH Faster cythonized count by level in Series and DataFrame (GH341)
- VBENCH? Significant GroupBy performance enhancement with multiple keys with many “empty” combinations
- VBENCH New Cython vectorized function map_infer speeds up Series.apply and Series.map significantly when passed elementwise Python function, motivated by (GH355)
- VBENCH Significantly improved performance of Series.order, which also makes np.unique called on a Series faster (GH327)
- VBENCH Vastly improved performance of GroupBy on axes with a MultiIndex (GH299)
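The raw option to DataFrame.apply mentioned above can be illustrated as follows. This is a minimal sketch assuming a current pandas and NumPy install; the frame contents and the value_range helper are hypothetical, and no timing claim is made here beyond what the notes state.

```python
import numpy as np
import pandas as pd

# Hypothetical 4x3 frame of floats.
df = pd.DataFrame(np.arange(12.0).reshape(4, 3), columns=["a", "b", "c"])

def value_range(col):
    # Works for both a Series and a plain ndarray, since both have max/min.
    return col.max() - col.min()

# Default: each column is passed to the function as a Series.
via_series = df.apply(value_range)

# raw=True: each column is passed as an ndarray, skipping Series construction.
via_ndarray = df.apply(value_range, raw=True)

print(via_series.tolist())
print(via_ndarray.tolist())
```

The two calls give the same values; raw=True is only appropriate when the applied function does not rely on the index or other Series-specific behaviour.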
Contributors
A total of 8 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.
- Adam Klein +
- Chang She +
- Dieter Vandenbussche
- Jeff Hammerbacher +
- Nathan Pinger +
- Thomas Kluyver
- Wes McKinney
- Wouter Overmeire +