.. _10min_tut_01_tableoriented:

{{ header }}

What kind of data does pandas handle?
=====================================

pandas data table representation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. image:: ../../_static/schemas/01_table_dataframe.svg
   :align: center

A :class:`DataFrame` is a 2-dimensional data structure that can store data of
different types (including characters, integers, floating point values,
categorical data and more) in columns. It is similar to a spreadsheet, a SQL
table or the ``data.frame`` in R.

- The table has 3 columns, each of them with a column label. The column labels
  are respectively ``Name``, ``Age`` and ``Sex``.
- The column ``Name`` consists of textual data with each value a string, the
  column ``Age`` consists of numbers and the column ``Sex`` is textual data.

In spreadsheet software, the table representation of our data would look very
similar:

.. image:: ../../_static/schemas/01_table_spreadsheet.png
   :align: center

Each column in a ``DataFrame`` is a ``Series``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. image:: ../../_static/schemas/01_table_series.svg
   :align: center

.. note::
    If you are familiar with Python
    :ref:`dictionaries `, the selection of a
    single column is very similar to the selection of dictionary values based on
    the key.

You can create a ``Series`` from scratch as well:

.. ipython:: python

    import pandas as pd

    ages = pd.Series([22, 35, 58], name="Age")
    ages

A pandas ``Series`` has no column labels, as it is just a single column
of a ``DataFrame``. A ``Series`` does have row labels.

Do something with a DataFrame or Series
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As illustrated by the ``max()`` method, you can *do* things with a
``DataFrame`` or ``Series``. pandas provides a lot of functionality, each
piece of it a *method* you can apply to a ``DataFrame`` or ``Series``.
As methods are functions, do not forget to use parentheses ``()``.

Many pandas operations return a ``DataFrame`` or a ``Series``. The
:func:`~DataFrame.describe` method is an example of a pandas operation returning a
pandas ``Series`` or a pandas ``DataFrame``.

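
As a minimal sketch of this behaviour, assume a small hypothetical table with
the ``Name``, ``Age`` and ``Sex`` columns described earlier. The names and
values below are illustrative assumptions, not data from this tutorial. Called
on a ``DataFrame``, ``describe`` returns a ``DataFrame``; called on a single
column, it returns a ``Series``:

.. ipython:: python

    import pandas as pd

    # Hypothetical table; the names and values are illustrative assumptions.
    df = pd.DataFrame(
        {
            "Name": ["Alice", "Bob", "Carol"],
            "Age": [22, 35, 58],
            "Sex": ["female", "male", "female"],
        }
    )

    # Called on a DataFrame, describe() summarises the numerical columns
    # by default and returns a DataFrame.
    table_summary = df.describe()
    type(table_summary)

    # Called on a single column (a Series), describe() returns a Series.
    column_summary = df["Age"].describe()
    type(column_summary)

Only ``Age`` appears in ``table_summary`` here, because ``Name`` and ``Sex``
hold textual data and are skipped by default.
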
Check more options on ``describe`` in the user guide section about :ref:`aggregations with describe `.


.. note::
    This is just a starting point. Similar to spreadsheet
    software, pandas represents data as a table with columns and rows. Beyond
    the representation, pandas also supports the data manipulations and
    calculations you would do in spreadsheet software. Continue
    reading the next tutorials to get started!

REMEMBER

- Import the package with ``import pandas as pd``
- A table of data is stored as a pandas ``DataFrame``
- Each column in a ``DataFrame`` is a ``Series``
- You can do things by applying a method to a ``DataFrame`` or ``Series``

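
The points above can be sketched together in a few lines. This is only an
illustration: the variable names ``people``, ``age_column`` and ``oldest`` and
the row values are hypothetical and not part of the tutorial data.

.. ipython:: python

    import pandas as pd

    # A table of data stored as a DataFrame (illustrative values).
    people = pd.DataFrame(
        {
            "Name": ["Alice", "Bob", "Carol"],
            "Age": [22, 35, 58],
            "Sex": ["female", "male", "female"],
        }
    )

    # Selecting one column by its label returns a Series.
    age_column = people["Age"]
    type(age_column)

    # Applying a method to the Series; note the parentheses.
    oldest = age_column.max()
    oldest
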
A more extended explanation of ``DataFrame`` and ``Series`` is provided in the user guide's :ref:`introduction to data structures `.