pandas.io.json.build_table_schema
- pandas.io.json.build_table_schema(data, index=True, primary_key=None, version=True)
Create a Table schema from data.
This method is a utility to generate a JSON-serializable schema representation of a pandas Series or DataFrame, compatible with the Table Schema specification. It enables structured data to be shared and validated in various applications, ensuring consistency and interoperability.
- Parameters:
- data : Series or DataFrame
The input data for which the table schema is to be created.
- index : bool, default True
Whether to include data.index in the schema.
- primary_key : bool or None, default None
Column names to designate as the primary key. The default None will set ‘primaryKey’ to the index level or levels if the index is unique.
- version : bool, default True
Whether to include a field pandas_version with the version of pandas that last revised the table schema. This version can be different from the installed pandas version.
- Returns:
- dict
A dictionary representing the Table schema.
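The effect of the optional parameters on the returned dictionary can be sketched as follows. This is a minimal illustration, not part of the original docstring: the frame, its column names, and the variable names `full` and `slim` are assumptions chosen for the example.

```python
import pandas as pd
from pandas.io.json import build_table_schema

# Illustrative frame; column and index names are assumptions for this sketch.
df = pd.DataFrame(
    {"A": [1, 2, 3], "B": [0.5, 1.5, 2.5]},
    index=pd.Index([10, 20, 30], name="idx"),
)

# Default call: the unique index is used as the primary key and the
# pandas_version field is included.
full = build_table_schema(df)

# Leave the index out, name an explicit primary key, and omit the version field.
slim = build_table_schema(df, index=False, primary_key=["A"], version=False)

print(full["primaryKey"])                        # ['idx']
print([field["name"] for field in slim["fields"]])  # ['A', 'B']
print(slim["primaryKey"])                        # ['A']
print("pandas_version" in slim)                  # False
```

With index=False and no explicit primary_key, the schema would carry no 'primaryKey' entry at all, since the default only derives it from the index.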
See also
DataFrame.to_json
Convert the object to a JSON string.
read_json
Convert a JSON string to pandas object.
Notes
See Table Schema for conversion types. Timedeltas are converted to ISO8601 duration format with 9 decimal places after the seconds field for nanosecond precision.
Categoricals are converted to the any dtype, and use the enum field constraint to list the allowed values. The ordered attribute is included in an ordered field.
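A short sketch of how these conversions appear in the field descriptors, assuming an illustrative frame with one timedelta column and one ordered categorical column (the names `elapsed`, `grade`, and `by_name` are made up for this example). It shows only the schema; the ISO8601 duration strings apply to the serialized values, which are not produced here.

```python
import pandas as pd
from pandas.io.json import build_table_schema

# Illustrative data; the column names and categories are assumptions.
df = pd.DataFrame(
    {
        "elapsed": pd.to_timedelta([1, 2, 3], unit="s"),
        "grade": pd.Categorical(
            ["low", "high", "low"],
            categories=["low", "mid", "high"],
            ordered=True,
        ),
    }
)

schema = build_table_schema(df, index=False, version=False)

# Index the field descriptors by column name for easier inspection.
by_name = {field["name"]: field for field in schema["fields"]}

print(by_name["elapsed"]["type"])       # duration
print(by_name["grade"]["type"])         # any
print(by_name["grade"]["constraints"])  # {'enum': ['low', 'mid', 'high']}
print(by_name["grade"]["ordered"])      # True
```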
Examples
>>> import pandas as pd
>>> from pandas.io.json._table_schema import build_table_schema
>>> df = pd.DataFrame(
...     {'A': [1, 2, 3],
...      'B': ['a', 'b', 'c'],
...      'C': pd.date_range('2016-01-01', freq='D', periods=3),
...      }, index=pd.Index(range(3), name='idx'))
>>> build_table_schema(df)
{'fields': [{'name': 'idx', 'type': 'integer'}, {'name': 'A', 'type': 'integer'}, {'name': 'B', 'type': 'string'}, {'name': 'C', 'type': 'datetime'}], 'primaryKey': ['idx'], 'pandas_version': '1.4.0'}
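Because the result is described as JSON-serializable, it can be passed to the standard library's json module. The sketch below assumes a small integer-only frame (an illustrative choice) and also shows that DataFrame.to_json with orient="table" returns a payload that carries a schema alongside the data.

```python
import json

import pandas as pd
from pandas.io.json import build_table_schema

# Illustrative frame; the names are assumptions for this sketch.
df = pd.DataFrame({"A": [1, 2, 3]}, index=pd.Index(range(3), name="idx"))

schema = build_table_schema(df)

# The schema consists of plain dicts, lists, and strings, so it survives
# a round trip through json.dumps and json.loads unchanged.
restored = json.loads(json.dumps(schema))
print(restored == schema)  # True

# to_json(orient="table") embeds a schema next to the serialized rows.
payload = json.loads(df.to_json(orient="table"))
print(sorted(payload))  # ['data', 'schema']
```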