v0.13.0 (January 3, 2014)
This is a major release from 0.12.0 and includes a number of API changes, several new features and enhancements along with a large number of bug fixes.
Highlights include:
- support for a new index type Float64Index, and other Indexing enhancements
- HDFStore has a new string based syntax for query specification
- support for new methods of interpolation
- updated timedelta operations
- a new string manipulation method extract
- Nanosecond support for Offsets
- isin for DataFrames
Several experimental features are added, including:
- new eval/query methods for expression evaluation
- support for msgpack serialization
- an i/o interface to Google’s BigQuery
There are several new or updated docs sections including:
- Comparison with SQL, which should be useful for those familiar with SQL but still learning pandas.
- Comparison with R, idiom translations from R to pandas.
- Enhancing Performance, ways to enhance pandas performance with eval/query.
Warning
In 0.13.0 Series has internally been refactored to no longer sub-class ndarray but instead subclass NDFrame, similar to the rest of the pandas containers. This should be a transparent change with only very limited API implications. See Internal Refactoring.
API changes
- read_excel now supports an integer in its sheetname argument giving the index of the sheet to read in (GH4301).
- Text parser now treats anything that reads like inf (“inf”, “Inf”, “-Inf”, “iNf”, etc.) as infinity (GH4220, GH4219), affecting read_table, read_csv, etc.
- pandas now is Python 2/3 compatible without the need for 2to3 thanks to @jtratner. As a result, pandas now uses iterators more extensively. This also led to the introduction of substantive parts of Benjamin Peterson’s six library into compat. (GH4384, GH4375, GH4372)
- pandas.util.compat and pandas.util.py3compat have been merged into pandas.compat. pandas.compat now includes many functions allowing 2/3 compatibility. It contains both list and iterator versions of range, filter, map and zip, plus other necessary elements for Python 3 compatibility. lmap, lzip, lrange and lfilter all produce lists instead of iterators, for compatibility with numpy, subscripting and pandas constructors. (GH4384, GH4375, GH4372)
- Series.get with negative indexers now returns the same as [] (GH4390)
- Changes to how Index and MultiIndex handle metadata (levels, labels, and names) (GH4039):

  # previously, you would have set levels or labels directly
  >>> index.levels = [[1, 2, 3, 4], [1, 2, 4, 4]]

  # now, you use the set_levels or set_labels methods
  >>> index = index.set_levels([[1, 2, 3, 4], [1, 2, 4, 4]])

  # similarly, for names, you can rename the object
  # but setting names is not deprecated
  >>> index = index.set_names(["bob", "cranberry"])

  # and all methods take an inplace kwarg - but return None
  >>> index.set_names(["bob", "cranberry"], inplace=True)
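As a rough illustration of the setter methods described above, the following sketch builds a small MultiIndex and derives renamed and re-leveled copies from it. The example index, its level values, and the variable names are assumptions made for illustration only, and the code is written against a recent pandas rather than 0.13.0 itself.

```python
import pandas as pd

# Hypothetical two-level index, used only for illustration.
index = pd.MultiIndex.from_product([[1, 2], ["a", "b"]], names=["num", "let"])

# set_names returns a new index; the original keeps its names.
renamed = index.set_names(["bob", "cranberry"])

# set_levels replaces the level values while keeping each entry's position.
releveled = index.set_levels([[10, 20], ["x", "y"]])

print(list(index.names))
print(list(renamed.names))
print(releveled.tolist())
```

Because the setters return new objects by default, the result has to be assigned to a name for the change to take effect.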
- All division with NDFrame objects is now truedivision, regardless of the future import. This means that operating on pandas objects will by default use floating point division, and return a floating point dtype. You can use // and floordiv to do integer division.

  Integer division

  In [3]: arr = np.array([1, 2, 3, 4])
  In [4]: arr2 = np.array([5, 3, 2, 1])
  In [5]: arr / arr2
  Out[5]: array([0, 0, 1, 4])
  In [6]: pd.Series(arr) // pd.Series(arr2)
  Out[6]:
  0    0
  1    0
  2    1
  3    4
  dtype: int64

  True Division

  In [7]: pd.Series(arr) / pd.Series(arr2)  # no future import required
  Out[7]:
  0    0.200000
  1    0.666667
  2    1.500000
  3    4.000000
  dtype: float64
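The distinction between the two operators can be checked with a short sketch. It reuses the values from the example above; the variable names are illustrative, and it assumes a recent pandas and NumPy.

```python
import numpy as np
import pandas as pd

num = pd.Series(np.array([1, 2, 3, 4]))
den = pd.Series(np.array([5, 3, 2, 1]))

# "/" is always true division on pandas objects and yields floats.
true_div = num / den

# "//" and the floordiv method both perform integer (floor) division.
floor_op = num // den
floor_method = num.floordiv(den)

print(true_div.dtype)
print(floor_op.tolist())
print(floor_method.equals(floor_op))
```

The method form is convenient when the operation needs extra arguments such as fill_value.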
- Infer and downcast dtype if downcast='infer' is passed to fillna/ffill/bfill (GH4604)
- __nonzero__ for all NDFrame objects will now raise a ValueError; this reverts to the earlier (GH1073, GH4633) behavior. See gotchas for a more detailed discussion.

  This prevents doing boolean comparison on entire pandas objects, which is inherently ambiguous. These all will raise a ValueError.

  >>> df = pd.DataFrame({'A': np.random.randn(10),
  ...                    'B': np.random.randn(10),
  ...                    'C': pd.date_range('20130101', periods=10)
  ...                    })
  >>> if df:
  ...     pass
  ...
  Traceback (most recent call last):
      ...
  ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

  >>> df1 = df
  >>> df2 = df
  >>> df1 and df2
  Traceback (most recent call last):
      ...
  ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

  >>> d = [1, 2, 3]
  >>> s1 = pd.Series(d)
  >>> s2 = pd.Series(d)
  >>> s1 and s2
  Traceback (most recent call last):
      ...
  ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
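One way to avoid the ambiguity is to state explicitly which question is being asked of the object. The sketch below uses a small made-up DataFrame (the column names and values are assumptions for illustration) and shows the attributes and reductions named in the error message.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame({"A": [1.0, -2.0, 3.0], "B": [0.5, 0.25, -1.0]})

# Using the frame directly in a boolean context raises ValueError.
try:
    if df:
        pass
    raised = False
except ValueError:
    raised = True

# Explicit alternatives: each one asks a single, well-defined question.
is_empty = df.empty                          # does the frame have no elements?
any_negative = bool((df < 0).any().any())    # is at least one value negative?
all_positive = bool((df > 0).all().all())    # is every value positive?

print(raised, is_empty, any_negative, all_positive)
```

The doubled .any() and .all() calls reduce first over each column and then over the resulting Series.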
- Added the .bool() method to NDFrame objects to facilitate evaluating of single-element boolean Series:

  In [1]: pd.Series([True]).bool()
  Out[1]: True
  In [2]: pd.Series([False]).bool()