Overview of Data Catalog with BigQuery

This document provides an overview of how Data Catalog relates to BigQuery.

Data Catalog is a fully managed, scalable metadata management service within Dataplex.

Data Catalog use cases

BigQuery uses Data Catalog for the following use cases:

  • Visualizing data lineage.
  • Searching for resources that you have access to.
  • Tagging resources with metadata.

For a more complete description of Data Catalog, see What is Data Catalog.

How Data Catalog works

Data Catalog can catalog metadata from BigQuery data sources. For a given BigQuery project, Data Catalog automatically catalogs BigQuery metadata about datasets, tables, views, and models. After this metadata is cataloged, you can add your own metadata to these data sources by using tags. Data Catalog handles two types of metadata: technical metadata and business metadata. To learn more about metadata, see Data Catalog metadata.
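As a rough illustration of adding your own metadata to a cataloged BigQuery table, the following Python sketch looks up the Data Catalog entry for a table and attaches a tag to it. It assumes the google-cloud-datacatalog client library is installed, that credentials are configured, and that a tag template already exists. The helper names (bigquery_linked_resource, attach_tag) and the example project, dataset, and table IDs are hypothetical and are not part of the original document.

```python
from typing import Dict


def bigquery_linked_resource(project_id: str, dataset_id: str, table_id: str) -> str:
    """Build the full resource name that identifies a BigQuery table."""
    return (
        f"//bigquery.googleapis.com/projects/{project_id}"
        f"/datasets/{dataset_id}/tables/{table_id}"
    )


def attach_tag(
    project_id: str,
    dataset_id: str,
    table_id: str,
    tag_template_name: str,
    string_fields: Dict[str, str],
):
    """Attach a tag with string fields to the entry for a BigQuery table.

    Assumption: tag_template_name is the full resource name of an existing
    tag template whose fields are all of type STRING.
    """
    # Imported lazily so the helper above works without the client library.
    from google.cloud import datacatalog_v1

    client = datacatalog_v1.DataCatalogClient()

    # Data Catalog already holds an entry for the table; look it up.
    entry = client.lookup_entry(
        request={
            "linked_resource": bigquery_linked_resource(
                project_id, dataset_id, table_id
            )
        }
    )

    # Build the tag from the template and the supplied field values.
    tag = datacatalog_v1.Tag()
    tag.template = tag_template_name
    for field_id, value in string_fields.items():
        tag.fields[field_id] = datacatalog_v1.TagField(string_value=value)

    return client.create_tag(parent=entry.name, tag=tag)
```

In this sketch the entry itself is never created by the caller. It is looked up, because Data Catalog catalogs BigQuery tables automatically; only the tag is new.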

Search and discovery

Data Catalog offers a predicate-based search experience for the technical and business metadata that's associated with a Data Catalog entry representing a BigQuery data source. To search for and discover a resource, you must have permission to read that resource's metadata. Data Catalog does not index the data within a resource; it indexes only the metadata that describes a BigQuery data source.
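To make the idea of a predicate-based search more concrete, the following Python sketch composes a query from a few predicates and runs it against one project. It assumes the google-cloud-datacatalog client library is installed and that credentials are configured. The helper names (build_search_query, search_bigquery_entries) and the choice of predicates are illustrative assumptions, not a complete description of the search syntax.

```python
from typing import List, Optional


def build_search_query(
    name_contains: Optional[str] = None,
    column_contains: Optional[str] = None,
    entry_type: Optional[str] = None,
    system: Optional[str] = "bigquery",
) -> str:
    """Join search predicates with spaces, which the search treats as AND."""
    parts = []
    if system:
        parts.append(f"system={system}")
    if entry_type:
        parts.append(f"type={entry_type}")
    if name_contains:
        parts.append(f"name:{name_contains}")
    if column_contains:
        parts.append(f"column:{column_contains}")
    return " ".join(parts)


def search_bigquery_entries(project_id: str, query: str) -> List[str]:
    """Return the linked resource names of entries that match the query.

    Only entries whose metadata the caller is permitted to read are returned.
    """
    # Imported lazily so build_search_query works without the client library.
    from google.cloud import datacatalog_v1

    client = datacatalog_v1.DataCatalogClient()

    # Restrict the search to a single project.
    scope = datacatalog_v1.SearchCatalogRequest.Scope(
        include_project_ids=[project_id]
    )

    results = client.search_catalog(request={"scope": scope, "query": query})
    return [result.linked_resource for result in results]
```

For example, build_search_query(name_contains="orders", entry_type="table") produces the query string system=bigquery type=table name:orders, which could then be passed to search_bigquery_entries together with a project ID.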

Data Catalog controls some metadata, such as user-generated tags. For all metadata sourced from BigQuery, Data Catalog is a read-only service that reflects the metadata and permissions provided by BigQuery. To add, update, or delete the metadata of a data entry, make the edits in BigQuery.

To learn more about Data Catalog search, see Search for BigQuery resources.

Access Data Catalog

You can access Data Catalog functionality by using the following interfaces:

What's next