pandas.read_gbq¶

pandas.read_gbq(query, project_id=None, index_col=None, col_order=None, reauth=False, auth_local_webserver=False, dialect=None, location=None, configuration=None, credentials=None, use_bqstorage_api=None, max_results=None, progress_bar_type=None)[source]¶

Load data from Google BigQuery.

This function requires the pandas-gbq package.

See the How to authenticate with Google BigQuery guide for authentication instructions.

Parameters

querystr

SQL-Like Query to return data values.

project_idstr, optional

Google BigQuery Account project ID. Optional when available from the environment.

index_colstr, optional

Name of result column to use for index in results DataFrame.

col_orderlist(str), optional

List of BigQuery column names in the desired order for results DataFrame.

reauthbool, default False

Force Google BigQuery to re-authenticate the user. This is useful if multiple accounts are used.

auth_local_webserverbool, default False

Use the local webserver flow instead of the console flow when getting user credentials.

New in version 0.2.0 of pandas-gbq.

dialectstr, default ‘legacy’

Note: The default value is changing to ‘standard’ in a future version.

SQL syntax dialect to use. Value can be one of:

'legacy': Use BigQuery’s legacy SQL dialect. For more information see BigQuery Legacy SQL Reference.
'standard': Use BigQuery’s standard SQL, which is compliant with the SQL 2011 standard. For more information see BigQuery Standard SQL Reference.

Changed in version 0.24.0.

locationstr, optional

Location where the query job should run. See the BigQuery locations documentation for a list of available locations. The location must match that of any datasets used in the query.

New in version 0.5.0 of pandas-gbq.

configurationdict, optional

Query config parameters for job processing. For example:

configuration = {‘query’: {‘useQueryCache’: False}}

For more information see BigQuery REST API Reference.

credentialsgoogle.auth.credentials.Credentials, optional

Credentials for accessing Google APIs. Use this parameter to override default credentials, such as to use Compute Engine google.auth.compute_engine.Credentials or Service Account google.oauth2.service_account.Credentials directly.

New in version 0.8.0 of pandas-gbq.

New in version 0.24.0.

use_bqstorage_apibool, default False

Use the BigQuery Storage API to download query results quickly, but at an increased cost. To use this API, first enable it in the Cloud Console. You must also have the bigquery.readsessions.create permission on the project you are billing queries to.

This feature requires version 0.10.0 or later of the pandas-gbq package. It also requires the google-cloud-bigquery-storage and fastavro packages.

New in version 0.25.0.

max_resultsint, optional

If set, limit the maximum number of rows to fetch from the query results.

New in version 0.12.0 of pandas-gbq.

New in version 1.1.0.

progress_bar_typeOptional, str

If set, use the tqdm library to display a progress bar while the data downloads. Install the tqdm package to use this feature.

Possible values of progress_bar_type include:

None: No progress bar.
'tqdm': Use the tqdm.tqdm() function to print a progress bar to sys.stderr.
'tqdm_notebook': Use the tqdm.tqdm_notebook() function to display a progress bar as a Jupyter notebook widget.
'tqdm_gui': Use the tqdm.tqdm_gui() function to display a progress bar as a graphical dialog box.

Note that his feature requires version 0.12.0 or later of the pandas-gbq package. And it requires the tqdm package. Slightly different than pandas-gbq, here the default is None.

New in version 1.0.0.

Returns

df: DataFrame: DataFrame representing results of query.