pandas.json_normalize
Normalize semi-structured JSON data into a flat table.
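Parameters
data : dict or list of dicts
    Unserialized JSON objects.
record_path : str or list of str, default None
    Path in each object to the list of records. If not passed, data is assumed to be an array of records.
meta : list of paths (str or list of str), default None
    Fields to use as metadata for each record in the resulting table.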
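meta_prefix : str, default None
    String prepended to the name of each metadata column. For example, meta_prefix='foo.bar.' turns the metadata column field into foo.bar.field. If None, no prefix is added.
record_prefix : str, default None
    String prepended to the name of each record column. For example, record_prefix='foo.bar.' turns the record column field into foo.bar.field. If None, no prefix is added.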
errors : {'raise', 'ignore'}, default 'raise'
    Configures error handling.
    'ignore' : ignore the KeyError if keys listed in meta are not always present; the missing values are filled with NaN.
    'raise' : raise a KeyError if keys listed in meta are not always present.
sep : str, default '.'
    Nested records generate column names joined by sep, e.g. for sep='.', {'foo': {'bar': 0}} -> foo.bar.
max_level : int, default None
    Maximum number of levels (depth of dict) to normalize. If None, normalizes all levels.
    New in version 0.25.0.
Examples
>>> data = [{'id': 1, 'name': {'first': 'Coleen', 'last': 'Volk'}},
...         {'name': {'given': 'Mose', 'family': 'Regner'}},
...         {'id': 2, 'name': 'Faye Raker'}]
>>> pandas.json_normalize(data)
    id        name name.family name.first name.given name.last
0  1.0         NaN         NaN     Coleen        NaN      Volk
1  NaN         NaN      Regner        NaN       Mose       NaN
2  2.0  Faye Raker         NaN        NaN        NaN       NaN
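As a further illustration of sep, the following sketch flattens the same kind of nested name record with an underscore separator instead of the default dot. It assumes a pandas version that exposes json_normalize at the top level (1.0 or later); the variable names are illustrative only.

```python
import pandas as pd

# One record with a nested 'name' dict, as in the example above.
data = [{"id": 1, "name": {"first": "Coleen", "last": "Volk"}}]

# sep is the string used to join nested keys into a column name.
flat = pd.json_normalize(data, sep="_")

print(sorted(flat.columns))  # ['id', 'name_first', 'name_last']
```

A separator other than '.' can be convenient when the columns are later accessed as attributes or exported to a system that treats dots specially.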
>>> data = [{'id': 1,
...          'name': "Cole Volk",
...          'fitness': {'height': 130, 'weight': 60}},
...         {'name': "Mose Reg",
...          'fitness': {'height': 130, 'weight': 60}},
...         {'id': 2, 'name': 'Faye Raker',
...          'fitness': {'height': 130, 'weight': 60}}]
>>> pandas.json_normalize(data, max_level=0)
                         fitness   id        name
0  {'height': 130, 'weight': 60}  1.0   Cole Volk
1  {'height': 130, 'weight': 60}  NaN    Mose Reg
2  {'height': 130, 'weight': 60}  2.0  Faye Raker
Normalizes nested data up to level 1.
>>> data = [{'id': 1,
...          'name': "Cole Volk",
...          'fitness': {'height': 130, 'weight': 60}},
...         {'name': "Mose Reg",
...          'fitness': {'height': 130, 'weight': 60}},
...         {'id': 2, 'name': 'Faye Raker',
...          'fitness': {'height': 130, 'weight': 60}}]
>>> pandas.json_normalize(data, max_level=1)
   fitness.height  fitness.weight   id        name
0             130              60  1.0   Cole Volk
1             130              60  NaN    Mose Reg
2             130              60  2.0  Faye Raker
>>> data = [{'state': 'Florida',
...          'shortname': 'FL',
...          'info': {'governor': 'Rick Scott'},
...          'counties': [{'name': 'Dade', 'population': 12345},
...                       {'name': 'Broward', 'population': 40000},
...                       {'name': 'Palm Beach', 'population': 60000}]},
...         {'state': 'Ohio',
...          'shortname': 'OH',
...          'info': {'governor': 'John Kasich'},
...          'counties': [{'name': 'Summit', 'population': 1234},
...                       {'name': 'Cuyahoga', 'population': 1337}]}]
>>> result = pandas.json_normalize(data, 'counties', ['state', 'shortname',
...                                                   ['info', 'governor']])
>>> result
         name  population    state shortname info.governor
0        Dade       12345  Florida        FL    Rick Scott
1     Broward       40000  Florida        FL    Rick Scott
2  Palm Beach       60000  Florida        FL    Rick Scott
3      Summit        1234     Ohio        OH   John Kasich
4    Cuyahoga        1337     Ohio        OH   John Kasich
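The error handling for meta keys that are not always present can be sketched in the same setting. The records below are made up for illustration, and the second one deliberately lacks the shortname key. The example assumes the option is passed through the errors keyword, with 'raise' as the default and 'ignore' as the alternative.

```python
import pandas as pd

# Hypothetical records: the second one has no 'shortname' key.
data = [
    {"state": "Florida", "shortname": "FL", "counties": [{"name": "Dade"}]},
    {"state": "Ohio", "counties": [{"name": "Summit"}]},
]

# Default behaviour (errors='raise'): a missing meta key raises KeyError.
try:
    pd.json_normalize(data, record_path="counties", meta=["state", "shortname"])
    raised = False
except KeyError:
    raised = True

# errors='ignore': the missing meta value is filled with NaN instead.
tolerant = pd.json_normalize(
    data,
    record_path="counties",
    meta=["state", "shortname"],
    errors="ignore",
)

print(raised)
print(tolerant)
```

Using 'ignore' trades strictness for robustness, so it is worth checking the resulting metadata columns for NaN afterwards.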
>>> data = {'A': [1, 2]}
>>> pandas.json_normalize(data, 'A', record_prefix='Prefix.')
   Prefix.0
0         1
1         2
Returns normalized data with columns prefixed with the given string.
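When a record path and meta fields are used together, a separate prefix can be given for each group of columns. This is a minimal sketch, assuming the record_prefix and meta_prefix keywords; the data and the prefix strings are made up for illustration.

```python
import pandas as pd

# Hypothetical nested data: each state carries a list of county records.
data = [
    {"state": "Florida", "counties": [{"name": "Dade"}, {"name": "Broward"}]},
    {"state": "Ohio", "counties": [{"name": "Summit"}]},
]

prefixed = pd.json_normalize(
    data,
    record_path="counties",
    meta=["state"],
    record_prefix="county.",  # prepended to columns taken from the records
    meta_prefix="meta.",      # prepended to columns taken from meta
)

print(list(prefixed.columns))  # ['county.name', 'meta.state']
```

Prefixes of this kind help avoid a name clash when a record field and a meta field share the same key.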