import pandas_gbq
credentials = ... # From google-auth or pydata-google-auth library.
# Update the in-memory credentials cache (added in pandas-gbq 0.7.0).
pandas_gbq.context.credentials = credentials
pandas_gbq.context.project = "your-project-id"
# The credentials and project_id arguments can be omitted.
df = pandas_gbq.read_gbq("SELECT my_col FROM `my_dataset.my_table`")
If running on Google Colab, pandas-gbq attempts to authenticate with the
google.colab.auth.authenticate_user() method. See the Getting started with
BigQuery on Colab notebook for an example of using this authentication
method with other libraries that use Google BigQuery.
To use Colab authentication, install version 1.8.0 or later of the
pydata-google-auth package.
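As a minimal sketch, a notebook could check for Colab itself before calling the method named above. The helper names `running_in_colab` and `authenticate_if_colab` are hypothetical, and the `sys.modules` check is a common heuristic rather than the way pandas-gbq detects Colab.

```python
import sys


def running_in_colab() -> bool:
    """Heuristic: treat a loaded google.colab package as a sign of Colab.

    Assumption: Colab runtimes load google.colab at startup. This is not
    taken from pandas-gbq itself.
    """
    return "google.colab" in sys.modules


def authenticate_if_colab() -> bool:
    """Call authenticate_user() when on Colab; report whether it was called."""
    if not running_in_colab():
        return False
    # Only importable inside a Colab runtime.
    from google.colab import auth

    auth.authenticate_user()
    return True
```

Outside Colab, `authenticate_if_colab()` returns `False` without importing anything from `google.colab`.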
Application Default Credentials via the google.auth.default() function.
If pandas-gbq can obtain default credentials but those credentials
cannot be used to query BigQuery, pandas-gbq will also try obtaining
user account credentials.
A common problem with default credentials when running on Google
Compute Engine is that the VM does not have sufficient access scopes
to query BigQuery.
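The fallback described above (try one credential source, check whether it can actually query, then move on to the next) can be sketched generically. This is an illustration only, not pandas-gbq's implementation: `first_usable_credentials`, `providers`, and `can_query` are hypothetical names, and plain strings stand in for credential objects.

```python
from typing import Callable, Iterable, Optional, TypeVar

C = TypeVar("C")


def first_usable_credentials(
    providers: Iterable[Callable[[], Optional[C]]],
    can_query: Callable[[C], bool],
) -> Optional[C]:
    """Return the first credentials that are obtained and pass can_query."""
    for provider in providers:
        credentials = provider()
        # A provider may yield nothing, or credentials that lack access
        # (for example, a VM without sufficient scopes for BigQuery).
        if credentials is not None and can_query(credentials):
            return credentials
    return None


# Stand-ins: default credentials exist but cannot query, so the user
# account credentials are selected instead.
providers = [lambda: "default-credentials", lambda: "user-credentials"]
chosen = first_usable_credentials(
    providers, lambda creds: creds != "default-credentials"
)
```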
User account credentials.
pandas-gbq loads cached credentials from a hidden user folder on the
operating system.
Windows: %APPDATA%\pandas_gbq\bigquery_credentials.dat
Linux/Mac/Unix: ~/.config/pandas_gbq/bigquery_credentials.dat
If pandas-gbq does not find cached credentials, it prompts you to open a
web browser, where you can grant pandas-gbq permissions to access your
cloud resources. These credentials are only used locally. See the
privacy policy for details.
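To check whether a cached credentials file exists on a given machine, the two locations listed above can be built with the standard library. This is a sketch under the assumption that the paths are exactly as documented here; `cached_credentials_path` is a hypothetical helper, not part of pandas-gbq.

```python
import ntpath
import os
import posixpath
from typing import Mapping


def cached_credentials_path(os_name: str, environ: Mapping[str, str], home: str) -> str:
    """Build the documented cache location for the given platform."""
    if os_name == "nt":
        # Windows: %APPDATA%\pandas_gbq\bigquery_credentials.dat
        return ntpath.join(
            environ["APPDATA"], "pandas_gbq", "bigquery_credentials.dat"
        )
    # Linux/Mac/Unix: ~/.config/pandas_gbq/bigquery_credentials.dat
    return posixpath.join(
        home, ".config", "pandas_gbq", "bigquery_credentials.dat"
    )


if __name__ == "__main__":
    path = cached_credentials_path(os.name, os.environ, os.path.expanduser("~"))
    print(path, "exists" if os.path.exists(path) else "not found")
```

Passing the platform name, environment, and home directory as arguments keeps the function easy to exercise for both operating systems from one machine.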
Using service account credentials is particularly useful when working on
remote servers without access to user input.
Create a service account key via the service account key creation page in
the Google Cloud Platform Console. Select the JSON key type and download the
key file.
To use service account credentials, set the credentials parameter to the
result of a call to:
google.oauth2.service_account.Credentials.from_service_account_file(),
which accepts a file path to the JSON file.
from google.oauth2 import service_account
import pandas_gbq
credentials = service_account.Credentials.from_service_account_file(
    'path/to/key.json',
)
sql = "SELECT my_col FROM `my_dataset.my_table`"  # Example query.
df = pandas_gbq.read_gbq(sql, project_id="YOUR-PROJECT-ID", credentials=credentials)
google.oauth2.service_account.Credentials.from_service_account_info(),
which accepts a dictionary corresponding to the JSON file contents.
from google.oauth2 import service_account
import pandas_gbq
credentials = service_account.Credentials.from_service_account_info(
    {
        "type": "service_account",
        "project_id": "YOUR-PROJECT-ID",
        "private_key_id": "6747200734a1f2b9d8d62fc0b9414c5f2461db0e",
        "private_key": "-----BEGIN PRIVATE KEY-----\nM...I==\n-----END PRIVATE KEY-----\n",
        "client_email": "[email protected]",
        "client_id": "12345678900001",
        "auth_uri": "https://accounts.google.com/o/oauth2/auth",
        "token_uri": "https://accounts.google.com/o/oauth2/token",
        "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
        "client_x509_cert_url": "https://www.googleapis.com/...iam.gserviceaccount.com"
    },
)
sql = "SELECT my_col FROM `my_dataset.my_table`"  # Example query.
df = pandas_gbq.read_gbq(sql, project_id="YOUR-PROJECT-ID", credentials=credentials)
Alternatively, you can set the GOOGLE_APPLICATION_CREDENTIALS environment
variable to the full path to the JSON file.
$ export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
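Before running a query, it can help to confirm from Python that the variable is set and points at a service account key. The sketch below uses only the standard library; `service_account_key_from_env` is a hypothetical helper, and rejecting any file whose "type" is not "service_account" is an assumption made for this section's use case.

```python
import json
import os
from typing import Mapping, Optional


def service_account_key_from_env(
    environ: Mapping[str, str] = os.environ,
) -> Optional[dict]:
    """Load the key file named by GOOGLE_APPLICATION_CREDENTIALS, if set."""
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return None  # Variable unset or empty.
    with open(path, encoding="utf-8") as key_file:
        info = json.load(key_file)
    # Service account key files carry "type": "service_account".
    if info.get("type") != "service_account":
        raise ValueError(f"{path} is not a service account key file")
    return info
```

The returned dictionary has the same shape as the argument accepted by from_service_account_info().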
Use the with_scopes() method to authorize with specific OAuth2 scopes, which
may be required in queries to federated data sources such as Google Sheets.
credentials = ...
credentials = credentials.with_scopes(
    [
        'https://www.googleapis.com/auth/drive',
        'https://www.googleapis.com/auth/cloud-platform',
    ],
)
df = pandas_gbq.read_gbq(..., credentials=credentials)
See the Getting started with authentication on Google Cloud Platform guide and
the Google Auth Library User Guide for more information on service accounts.
Use the pydata-google-auth library to authenticate with a user account (i.e.
a G Suite or Gmail account). The pydata_google_auth.get_user_credentials()
function loads credentials from a cache on disk or initiates an OAuth 2.0
flow if cached credentials are not found.
import pandas_gbq
import pydata_google_auth
SCOPES = [
    'https://www.googleapis.com/auth/cloud-platform',
    'https://www.googleapis.com/auth/drive',
]
credentials = pydata_google_auth.get_user_credentials(
    SCOPES,
    # Note, this doesn't work if you're running from a notebook on a
    # remote server, such as over SSH or with Google Colab. In those cases,
    # install the gcloud command line interface and authenticate with the
    # `gcloud auth application-default login` command and the `--no-browser`
    # option.
    auth_local_webserver=True,
)
df = pandas_gbq.read_gbq(
    "SELECT my_col FROM `my_dataset.my_table`",
    project_id='YOUR-PROJECT-ID',
    credentials=credentials,
)
Warning
Do not store credentials on disk when using shared computing resources
such as a GCE VM or Colab notebook. Use the pydata_google_auth.cache.NOOP
cache to avoid writing credentials to disk.
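A minimal sketch of following this warning, assuming the cache is passed through the `credentials_cache` argument of pydata_google_auth.get_user_credentials(). The wrapper name `get_uncached_user_credentials` is hypothetical, and the scope list is only an example.

```python
SCOPES = ["https://www.googleapis.com/auth/cloud-platform"]


def get_uncached_user_credentials(scopes=SCOPES):
    """Run the user-account flow without reading or writing a disk cache."""
    # Imported here so the module loads even where the library is absent.
    import pydata_google_auth

    return pydata_google_auth.get_user_credentials(
        scopes,
        # NOOP neither loads nor saves credentials, so nothing is
        # left behind on a shared machine.
        credentials_cache=pydata_google_auth.cache.NOOP,
    )
```

Because nothing is cached, the OAuth 2.0 flow runs again each time the function is called in a new session.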