pandas.io.json.json_normalize

pandas.io.json.json_normalize(data: Union[Dict, List[Dict]], record_path: Union[str, List, NoneType] = None, meta: Union[str, List, NoneType] = None, meta_prefix: Union[str, NoneType] = None, record_prefix: Union[str, NoneType] = None, errors: Union[str, NoneType] = 'raise', sep: str = '.', max_level: Union[int, NoneType] = None)

Normalize semi-structured JSON data into a flat table.

Parameters:

data : dict or list of dicts

Unserialized JSON objects.

record_path : str or list of str, default None

Path in each object to list of records. If not passed, data will be assumed to be an array of records.

meta : list of paths (str or list of str), default None

Fields to use as metadata for each record in resulting table.

meta_prefix : str, default None

If given, this string is prepended to the names of the metadata columns in the result. Nested meta paths are joined with sep, e.g. foo.bar if meta is [[‘foo’, ‘bar’]].

record_prefix : str, default None

If given, this string is prepended to the names of the record columns in the result, e.g. record_prefix=’Prefix.’ turns the column 0 into Prefix.0.

errors : {‘raise’, ‘ignore’}, default ‘raise’

Configures error handling.

‘ignore’ : will ignore KeyError if keys listed in meta are not always present.
‘raise’ : will raise KeyError if keys listed in meta are not always present.

New in version 0.20.0.

sep : str, default ‘.’

Nested records will generate names separated by sep, e.g. for sep=’.’, {‘foo’: {‘bar’: 0}} -> foo.bar.

New in version 0.20.0.

max_level : int, default None

Maximum number of levels (depth of dict) to normalize. If None, normalizes all levels.

New in version 0.25.0.

Returns:

frame : DataFrame
The normalized data as a flat table.

Examples

>>> from pandas.io.json import json_normalize
>>> data = [{'id': 1, 'name': {'first': 'Coleen', 'last': 'Volk'}},
...         {'name': {'given': 'Mose', 'family': 'Regner'}},
...         {'id': 2, 'name': 'Faye Raker'}]
>>> json_normalize(data)
    id        name name.family name.first name.given name.last
0  1.0         NaN         NaN     Coleen        NaN      Volk
1  NaN         NaN      Regner        NaN       Mose       NaN
2  2.0  Faye Raker         NaN        NaN        NaN       NaN
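
The effect of sep on the generated column names can be sketched as follows. This is a minimal illustration that reuses a shortened version of the data above; the try/except import is an assumption added so the snippet also runs on newer pandas releases, where the function is exposed as pandas.json_normalize.

```python
try:
    from pandas import json_normalize  # top-level location in newer pandas
except ImportError:
    from pandas.io.json import json_normalize  # location documented here

data = [{'id': 1, 'name': {'first': 'Coleen', 'last': 'Volk'}},
        {'id': 2, 'name': 'Faye Raker'}]

# Join nested keys with an underscore instead of the default '.'
flat = json_normalize(data, sep='_')

# Sort the names so the check does not depend on column order
columns = sorted(flat.columns)
print(columns)
```

Only keys that come from a nested dict are joined with sep; the top-level name column of the second record keeps its plain name.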

>>> data = [{'id': 1,
...          'name': "Cole Volk",
...          'fitness': {'height': 130, 'weight': 60}},
...         {'name': "Mose Reg",
...          'fitness': {'height': 130, 'weight': 60}},
...         {'id': 2, 'name': 'Faye Raker',
...          'fitness': {'height': 130, 'weight': 60}}]
>>> json_normalize(data, max_level=0)
                         fitness   id        name
0  {'height': 130, 'weight': 60}  1.0   Cole Volk
1  {'height': 130, 'weight': 60}  NaN    Mose Reg
2  {'height': 130, 'weight': 60}  2.0  Faye Raker

Normalizes nested data up to level 1.

>>> data = [{'id': 1,
...          'name': "Cole Volk",
...          'fitness': {'height': 130, 'weight': 60}},
...         {'name': "Mose Reg",
...          'fitness': {'height': 130, 'weight': 60}},
...         {'id': 2, 'name': 'Faye Raker',
...          'fitness': {'height': 130, 'weight': 60}}]
>>> json_normalize(data, max_level=1)
   fitness.height  fitness.weight   id        name
0             130              60  1.0   Cole Volk
1             130              60  NaN    Mose Reg
2             130              60  2.0  Faye Raker
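
Because the data above is nested only one level deep, max_level=1 gives the same result as full normalization. The sketch below uses a hypothetical record with a second level of nesting (the stats key and its values are invented for illustration) to show where the two differ. The try/except import is an assumption so the snippet also runs on newer pandas releases.

```python
try:
    from pandas import json_normalize  # top-level location in newer pandas
except ImportError:
    from pandas.io.json import json_normalize  # location documented here

# Hypothetical record nested two levels deep
data = [{'id': 1,
         'fitness': {'height': 130,
                     'stats': {'bmi': 22, 'hr': 60}}}]

# Stop after one level: 'fitness.stats' stays a dict inside a single cell
level1 = json_normalize(data, max_level=1)

# No limit: every nested dict is flattened into its own columns
full = json_normalize(data)

print(sorted(level1.columns))
print(sorted(full.columns))
```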

>>> data = [{'state': 'Florida',
...          'shortname': 'FL',
...          'info': {'governor': 'Rick Scott'},
...          'counties': [{'name': 'Dade', 'population': 12345},
...                       {'name': 'Broward', 'population': 40000},
...                       {'name': 'Palm Beach', 'population': 60000}]},
...         {'state': 'Ohio',
...          'shortname': 'OH',
...          'info': {'governor': 'John Kasich'},
...          'counties': [{'name': 'Summit', 'population': 1234},
...                       {'name': 'Cuyahoga', 'population': 1337}]}]
>>> result = json_normalize(data, 'counties', ['state', 'shortname',
...                                            ['info', 'governor']])
>>> result
         name  population    state  shortname  info.governor
0        Dade       12345  Florida         FL     Rick Scott
1     Broward       40000  Florida         FL     Rick Scott
2  Palm Beach       60000  Florida         FL     Rick Scott
3      Summit        1234     Ohio         OH    John Kasich
4    Cuyahoga        1337     Ohio         OH    John Kasich
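
The errors parameter matters when a key listed in meta is missing from some records. The following sketch uses a trimmed variant of the data above in which the Ohio record has no info key (an assumption made for illustration). The try/except import is likewise an assumption so the snippet also runs on newer pandas releases.

```python
import pandas as pd

try:
    from pandas import json_normalize  # top-level location in newer pandas
except ImportError:
    from pandas.io.json import json_normalize  # location documented here

data = [{'state': 'Florida',
         'info': {'governor': 'Rick Scott'},
         'counties': [{'name': 'Dade', 'population': 12345}]},
        {'state': 'Ohio',  # this record has no 'info' key
         'counties': [{'name': 'Summit', 'population': 1234}]}]

meta = ['state', ['info', 'governor']]

# Default errors='raise': the missing meta key raises KeyError
try:
    json_normalize(data, 'counties', meta)
    raised = False
except KeyError:
    raised = True

# errors='ignore': the missing meta value is filled with NaN instead
result = json_normalize(data, 'counties', meta, errors='ignore')
print(raised)
print(result)
```

Note that errors only applies to keys listed in meta; it is not described here as affecting a missing record_path.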

>>> data = {'A': [1, 2]}
>>> json_normalize(data, 'A', record_prefix='Prefix.')
    Prefix.0
0          1
1          2

Returns normalized data with columns prefixed with the given string.
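
record_prefix and meta_prefix can be combined to keep record columns and metadata columns apart. A minimal sketch, assuming a trimmed version of the counties data above; the prefix strings 'county.' and 'meta.' are arbitrary choices for illustration, and the try/except import is an assumption so the snippet also runs on newer pandas releases.

```python
try:
    from pandas import json_normalize  # top-level location in newer pandas
except ImportError:
    from pandas.io.json import json_normalize  # location documented here

data = [{'state': 'Florida',
         'counties': [{'name': 'Dade', 'population': 12345},
                      {'name': 'Broward', 'population': 40000}]}]

# Prefix the columns that come from the records and from the metadata
result = json_normalize(data, record_path='counties', meta=['state'],
                        record_prefix='county.', meta_prefix='meta.')

columns = list(result.columns)
print(columns)
```

Both prefixes are plain strings that are prepended as given, so any separator such as the trailing dot must be part of the prefix itself.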
