pandas.io.json.json_normalize¶
-
pandas.io.json.
json_normalize
(data, record_path=None, meta=None, meta_prefix=None, record_prefix=None, errors='raise', sep='.')[source]¶ “Normalize” semi-structured JSON data into a flat table
Parameters: data : dict or list of dicts
Unserialized JSON objects
record_path : string or list of strings, default None
Path in each object to list of records. If not passed, data will be assumed to be an array of records
meta : list of paths (string or list of strings), default None
Fields to use as metadata for each record in resulting table
record_prefix : string, default None
If True, prefix records with dotted (?) path, e.g. foo.bar.field if path to records is [‘foo’, ‘bar’]
meta_prefix : string, default None
errors : {‘raise’, ‘ignore’}, default ‘raise’
- ‘ignore’ : will ignore KeyError if keys listed in meta are not always present
- ‘raise’ : will raise KeyError if keys listed in meta are not always present
New in version 0.20.0.
sep : string, default ‘.’
Nested records will generate names separated by sep, e.g., for sep=’.’, { ‘foo’ : { ‘bar’ : 0 } } -> foo.bar
New in version 0.20.0.
Returns: frame : DataFrame
Examples
>>> from pandas.io.json import json_normalize >>> data = [{'id': 1, 'name': {'first': 'Coleen', 'last': 'Volk'}}, ... {'name': {'given': 'Mose', 'family': 'Regner'}}, ... {'id': 2, 'name': 'Faye Raker'}] >>> json_normalize(data) id name name.family name.first name.given name.last 0 1.0 NaN NaN Coleen NaN Volk 1 NaN NaN Regner NaN Mose NaN 2 2.0 Faye Raker NaN NaN NaN NaN
>>> data = [{'state': 'Florida', ... 'shortname': 'FL', ... 'info': { ... 'governor': 'Rick Scott' ... }, ... 'counties': [{'name': 'Dade', 'population': 12345}, ... {'name': 'Broward', 'population': 40000}, ... {'name': 'Palm Beach', 'population': 60000}]}, ... {'state': 'Ohio', ... 'shortname': 'OH', ... 'info': { ... 'governor': 'John Kasich' ... }, ... 'counties': [{'name': 'Summit', 'population': 1234}, ... {'name': 'Cuyahoga', 'population': 1337}]}] >>> result = json_normalize(data, 'counties', ['state', 'shortname', ... ['info', 'governor']]) >>> result name population info.governor state shortname 0 Dade 12345 Rick Scott Florida FL 1 Broward 40000 Rick Scott Florida FL 2 Palm Beach 60000 Rick Scott Florida FL 3 Summit 1234 John Kasich Ohio OH 4 Cuyahoga 1337 John Kasich Ohio OH