pandas.json_normalize

pandas.json_normalize(data: Union[Dict, List[Dict]], record_path: Union[str, List, NoneType] = None, meta: Union[str, List[Union[str, List[str]]], NoneType] = None, meta_prefix: Union[str, NoneType] = None, record_prefix: Union[str, NoneType] = None, errors: Union[str, NoneType] = 'raise', sep: str = '.', max_level: Union[int, NoneType] = None) → 'DataFrame'

Normalize semi-structured JSON data into a flat table.

Parameters
data : dict or list of dicts

Unserialized JSON objects.

record_path : str or list of str, default None

Path in each object to list of records. If not passed, data will be assumed to be an array of records.

meta : list of paths (str or list of str), default None

Fields to use as metadata for each record in resulting table.

meta_prefix : str, default None

String to prepend to the names of the meta columns, e.g. with meta_prefix=‘foo.bar.’ a meta field named field becomes foo.bar.field. If None, no prefix is added.

record_prefix : str, default None

String to prepend to the names of the record columns, e.g. with record_prefix=‘foo.bar.’ a record field named field becomes foo.bar.field. If None, no prefix is added.

errors : {‘raise’, ‘ignore’}, default ‘raise’

Configures error handling.

  • ‘ignore’ : will ignore KeyError if keys listed in meta are not always present.

  • ‘raise’ : will raise KeyError if keys listed in meta are not always present.

sep : str, default ‘.’

Nested records will generate names separated by sep, e.g. for sep=‘.’, {‘foo’: {‘bar’: 0}} -> foo.bar.

max_level : int, default None

Maximum number of levels (depth of dict) to normalize. If None, normalizes all levels.

New in version 0.25.0.

Returns
frame : DataFrame
Normalize semi-structured JSON data into a flat table.

Examples

>>> from pandas import json_normalize
>>> data = [{'id': 1, 'name': {'first': 'Coleen', 'last': 'Volk'}},
...         {'name': {'given': 'Mose', 'family': 'Regner'}},
...         {'id': 2, 'name': 'Faye Raker'}]
>>> json_normalize(data)
    id        name name.family name.first name.given name.last
0  1.0         NaN         NaN     Coleen        NaN      Volk
1  NaN         NaN      Regner        NaN       Mose       NaN
2  2.0  Faye Raker         NaN        NaN        NaN       NaN
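
As a small sketch of the sep parameter described above, the nested name keys can be joined with an underscore instead of a dot. The data is an abbreviated version of the example above and pd is the conventional alias for pandas. Column order may differ between pandas versions, so the column names are sorted before printing.

```python
import pandas as pd

# One record with a nested 'name' dict, as in the example above.
data = [{'id': 1, 'name': {'first': 'Coleen', 'last': 'Volk'}}]

# Join nested keys with '_' instead of the default '.'.
flat = pd.json_normalize(data, sep='_')

# Sort the names so the printout does not depend on column order.
print(sorted(flat.columns))
```
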
>>> data = [{'id': 1,
...          'name': "Cole Volk",
...          'fitness': {'height': 130, 'weight': 60}},
...         {'name': "Mose Reg",
...          'fitness': {'height': 130, 'weight': 60}},
...         {'id': 2, 'name': 'Faye Raker',
...          'fitness': {'height': 130, 'weight': 60}}]
>>> json_normalize(data, max_level=0)
                         fitness   id        name
0  {'height': 130, 'weight': 60}  1.0   Cole Volk
1  {'height': 130, 'weight': 60}  NaN    Mose Reg
2  {'height': 130, 'weight': 60}  2.0  Faye Raker

With max_level=1, nested data is normalized one level deep:

>>> data = [{'id': 1,
...          'name': "Cole Volk",
...          'fitness': {'height': 130, 'weight': 60}},
...         {'name': "Mose Reg",
...          'fitness': {'height': 130, 'weight': 60}},
...         {'id': 2, 'name': 'Faye Raker',
...          'fitness': {'height': 130, 'weight': 60}}]
>>> json_normalize(data, max_level=1)
   fitness.height  fitness.weight   id        name
0             130              60  1.0   Cole Volk
1             130              60  NaN    Mose Reg
2             130              60  2.0  Faye Raker
>>> data = [{'state': 'Florida',
...          'shortname': 'FL',
...          'info': {'governor': 'Rick Scott'},
...          'counties': [{'name': 'Dade', 'population': 12345},
...                       {'name': 'Broward', 'population': 40000},
...                       {'name': 'Palm Beach', 'population': 60000}]},
...         {'state': 'Ohio',
...          'shortname': 'OH',
...          'info': {'governor': 'John Kasich'},
...          'counties': [{'name': 'Summit', 'population': 1234},
...                       {'name': 'Cuyahoga', 'population': 1337}]}]
>>> result = json_normalize(data, 'counties', ['state', 'shortname',
...                                            ['info', 'governor']])
>>> result
         name  population    state  shortname  info.governor
0        Dade       12345  Florida         FL     Rick Scott
1     Broward       40000  Florida         FL     Rick Scott
2  Palm Beach       60000  Florida         FL     Rick Scott
3      Summit        1234     Ohio         OH    John Kasich
4    Cuyahoga        1337     Ohio         OH    John Kasich
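
The errors parameter matters when a key listed in meta is missing from some objects. The following sketch uses a trimmed-down, hypothetical variant of the data above in which the second object has no shortname key. It illustrates the documented behaviour and is not taken from the pandas test suite.

```python
import pandas as pd

# Hypothetical data: the second object lacks the 'shortname' key.
data = [
    {'state': 'Florida', 'shortname': 'FL',
     'counties': [{'name': 'Dade', 'population': 12345}]},
    {'state': 'Ohio',
     'counties': [{'name': 'Summit', 'population': 1234}]},
]

# With the default errors='raise', the missing meta key raises KeyError.
try:
    pd.json_normalize(data, 'counties', ['state', 'shortname'])
    raised = False
except KeyError:
    raised = True

# With errors='ignore', the missing value is filled with NaN instead.
result = pd.json_normalize(data, 'counties', ['state', 'shortname'],
                           errors='ignore')
print(raised)
print(result['shortname'].isna().tolist())
```
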
>>> data = {'A': [1, 2]}
>>> json_normalize(data, 'A', record_prefix='Prefix.')
    Prefix.0
0          1
1          2

Returns normalized data with columns prefixed with the given string.
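
meta_prefix works the same way for meta columns, which helps when a record field and a meta field share a name. The sketch below uses hypothetical data in which both the outer object and its nested items carry an id key. The prefixes ‘order.’ and ‘item.’ are arbitrary choices made for illustration.

```python
import pandas as pd

# Hypothetical data: 'id' appears both at the top level and in each record.
data = [{'id': 7, 'items': [{'id': 1}, {'id': 2}]}]

# Prefix record columns and meta columns separately so the two 'id'
# fields end up in distinct columns.
result = pd.json_normalize(data, 'items', ['id'],
                           record_prefix='item.', meta_prefix='order.')
print(list(result.columns))
```

Without a distinguishing prefix, the two id fields would map to the same column name.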