Tree Schema integrates with BigQuery to extract metadata from your tables, collect sample values for your fields, and sync your schema descriptions from Tree Schema back to BigQuery.
Connecting to BigQuery
The following fields are required for BigQuery:
JSON Key File: paste the contents of the JSON key file for your user account or service account into this input field.
For details on how to create a service account, see the GCP documentation.
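Before pasting the key file, it can help to confirm that the content is well-formed. The sketch below is a minimal local check, assuming a service account key; the function name `check_key_file` and the chosen field list are illustrative and are not part of Tree Schema.

```python
import json

# Fields normally present in a Google service account key file.
EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_key_file(content):
    """Return a list of problems found in pasted key-file content (empty if none)."""
    try:
        key = json.loads(content)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value must be an object"]
    # Report every expected field that is absent or empty.
    problems = [f"missing field: {name}" for name in EXPECTED_FIELDS if not key.get(name)]
    kind = key.get("type")
    if kind and kind != "service_account":
        problems.append(f"unexpected type: {kind}")
    return problems
```

An empty list means the content parses as JSON and carries the fields this sketch looks for; it says nothing about whether the key is still active or has the permissions listed below.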
Tree Schema connects to BigQuery directly over HTTPS using the API provided by Google and cannot connect through a jump server.
The following permissions are required for Tree Schema to integrate with BigQuery.
bigquery.datasets.get: allows access to list and retrieve metadata about BigQuery datasets
bigquery.tables.list: allows polling to find the full list of tables within a dataset
bigquery.tables.get: allows access to retrieve metadata about BigQuery tables
In addition, these permissions are optional:
bigquery.tables.getData: Required to extract sample values from each field. Capturing sample values can also be turned off at your organization level.
bigquery.tables.update: Required to sync table descriptions made in Tree Schema back to BigQuery. Syncing descriptions to BigQuery can also be turned off at the data store level.
As an example, the following role can be created in Google’s IAM to provide all of the required access:
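A custom role holding the permissions listed above could be described as in the following sketch. The permission names come from this page; the role title, description, and the helper `build_role_definition` are illustrative assumptions, and the field names follow the IAM custom role format as understood here.

```python
import json

# Permissions Tree Schema requires, as listed in this document.
REQUIRED_PERMISSIONS = [
    "bigquery.datasets.get",
    "bigquery.tables.list",
    "bigquery.tables.get",
]


def build_role_definition(sample_values=True, sync_descriptions=True):
    """Build a custom role definition; the two optional permissions are opt-in flags."""
    permissions = list(REQUIRED_PERMISSIONS)
    if sample_values:
        permissions.append("bigquery.tables.getData")  # sample value extraction
    if sync_descriptions:
        permissions.append("bigquery.tables.update")  # description sync to BigQuery
    return {
        "title": "Tree Schema BigQuery Access",  # hypothetical title
        "description": "Metadata access for Tree Schema",  # hypothetical description
        "stage": "GA",
        "includedPermissions": permissions,
    }


print(json.dumps(build_role_definition(), indent=2))
```

The printed JSON could be saved to a file and, assuming the gcloud CLI is available, supplied through the `--file` flag of `gcloud iam roles create`. Passing `sample_values=False` or `sync_descriptions=False` leaves out the matching optional permission.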
Tree Schema exclusively uses the Google APIs both to access your metadata and to collect sample values for your data. Tree Schema does not execute any queries against BigQuery, including queries against INFORMATION_SCHEMA, because queries are billed by the amount of data scanned. Unfortunately, Google does not provide access to views via the API; therefore, metadata about views is not captured within Tree Schema.
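To illustrate what API-only collection can look like, the sketch below walks datasets and tables through list and get calls without starting a query job. It is not Tree Schema's implementation: `collect_metadata` is a hypothetical helper, and `client` is assumed to be a `google.cloud.bigquery.Client` from the google-cloud-bigquery package, or any stand-in exposing the same four methods.

```python
def collect_metadata(client, sample_rows=0):
    """Gather table metadata through API calls only; no query jobs are started."""
    catalog = []
    for dataset in client.list_datasets():  # bigquery.datasets.get
        for item in client.list_tables(dataset.reference):  # bigquery.tables.list
            table = client.get_table(item.reference)  # bigquery.tables.get
            entry = {
                "dataset": dataset.dataset_id,
                "table": table.table_id,
                "description": table.description,
                "fields": [(f.name, f.field_type, f.description) for f in table.schema],
            }
            if sample_rows:
                # Reads rows directly from the table; needs bigquery.tables.getData.
                rows = client.list_rows(table, max_results=sample_rows)
                entry["samples"] = [dict(row.items()) for row in rows]
            catalog.append(entry)
    return catalog
```

With the real library, a client could be created with `bigquery.Client.from_service_account_json("key.json")`, where the file path is a placeholder. Leaving `sample_rows` at 0 skips the only call that needs the optional `bigquery.tables.getData` permission.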
Tree Schema checks that your metadata in BigQuery and Tree Schema is kept in sync. This includes validating that the descriptions of your tables in BigQuery match the descriptions of your schemas in Tree Schema. When your data is in sync, you will see the successful synchronization icon on the README:
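The description check described above can be pictured with a short sketch. This is an assumption about the general shape of such a check, not Tree Schema's code: `sync_description` is a hypothetical helper, and `client` is assumed to be a `google.cloud.bigquery.Client` or a stand-in with the same `get_table` and `update_table` methods.

```python
def sync_description(client, table_id, tree_schema_description):
    """Push a description to BigQuery when it differs; return True if an update was sent."""
    table = client.get_table(table_id)
    # Treat a missing description and an empty one as the same value.
    if (table.description or "") == (tree_schema_description or ""):
        return False  # already in sync, nothing to write
    table.description = tree_schema_description
    # Only the description field is sent; needs bigquery.tables.update.
    client.update_table(table, ["description"])
    return True
```

A return value of `False` corresponds to the in-sync case, and `True` means an update was sent to BigQuery.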
If Tree Schema does not have the appropriate permissions to update the descriptions in BigQuery, you will see an error similar to the following: