v0.10.1 (January 22, 2013)
This is a minor release from 0.10.0 and includes new features, enhancements, and bug fixes. In particular, there is substantial new HDFStore functionality contributed by Jeff Reback.
An undesired API breakage with functions taking the inplace option has been reverted and deprecation warnings added.
API changes
- Functions taking an inplace option return the calling object as before. A deprecation message has been added.
- Groupby aggregations Max/Min no longer exclude non-numeric data (GH2700)
- Resampling an empty DataFrame now returns an empty DataFrame instead of raising an exception (GH2640)
- The file reader will now raise an exception when NA values are found in an explicitly specified integer column instead of converting the column to float (GH2631)
- DatetimeIndex.unique now returns a DatetimeIndex with the same name and timezone instead of an array (GH2563)
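A few of the changes listed above can be checked with a short script. This is a minimal sketch that assumes a recent pandas release, where these behaviors are still in place; it has not been run against 0.10.1 itself. The frame contents, column names, and CSV text are made up for illustration.

```python
import io

import pandas as pd

# Groupby max/min keep non-numeric columns in the result (GH2700).
# The frame below is illustrative sample data, not taken from the release notes.
frame = pd.DataFrame({"key": ["a", "a", "b"],
                      "label": ["x", "y", "z"],
                      "value": [1, 2, 3]})
group_max = frame.groupby("key").max()

# A column explicitly declared as integer that contains a missing value
# raises instead of being silently converted to float (GH2631).
csv_text = "a,b\n1,2\n,3\n"
try:
    pd.read_csv(io.StringIO(csv_text), dtype={"a": "int64"})
    reader_error = None
except ValueError as exc:
    reader_error = exc

# DatetimeIndex.unique returns a DatetimeIndex that keeps the name
# and the timezone of the original index (GH2563).
stamps = pd.DatetimeIndex(["2013-01-01", "2013-01-01", "2013-01-02"],
                          tz="UTC", name="when")
unique_stamps = stamps.unique()

print(group_max)
print(type(reader_error).__name__)
print(unique_stamps)
```

The exception is caught and stored only so the script runs to the end; in real code you would normally let it propagate or handle the bad column explicitly.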
New features
- MySQL support for database (contribution from Dan Allan)
HDFStore
You may need to upgrade your existing data files. Please visit the compatibility section in the main docs.
You can designate (and index) certain columns of a table that you want to be able to query on by passing a list to data_columns:
In [1]: store = pd.HDFStore('store.h5')
In [2]: df = pd.DataFrame(np.random.randn(8, 3),
...: index=pd.date_range('1/1/2000', periods=8),
...: columns=['A', 'B', 'C'])
...:
In [3]: df['string'] = 'foo'
In [4]: df.loc[df.index[4:6], 'string'] = np.nan
In [5]: df.loc[df.index[7:9], 'string'] = 'bar'
In [6]: df['string2'] = 'cool'
In [7]: df
Out[7]:
                   A         B         C string string2
2000-01-01  0.469112 -0.282863 -1.509059    foo    cool
2000-01-02 -1.135632  1.212112 -0.173215    foo    cool
2000-01-03  0.119209 -1.044236 -0.861849    foo    cool
2000-01-04 -2.104569 -0.494929  1.071804    foo    cool
2000-01-05  0.721555 -0.706771 -1.039575    NaN    cool
2000-01-06  0.271860 -0.424972  0.567020    NaN    cool
2000-01-07  0.276232 -1.087401 -0.673690    foo    cool
2000-01-08  0.113648 -1.478427  0.524988    bar    cool
# on-disk operations
In [8]: store.append('df', df, data_columns=['B', 'C', 'string', 'string2'])
In [9]: store.select('df', "B>0 and string=='foo'")
Out[9]:
                   A         B         C string string2
2000-01-02 -1.135632  1.212112 -0.173215    foo    cool
# this is the in-memory version of this type of selection
In [10]: df[(df.B > 0) & (df.string == 'foo')]