datasets

Creates, updates, deletes, gets or lists a datasets resource.

Overview

Name: datasets
Type: Resource
Id: google.bigquery.datasets

Fields

The following fields are returned by SELECT queries:

Successful response

| Name | Datatype | Description |
|---|---|---|
| id | string | Output only. The fully-qualified unique name of the dataset in the format projectId:datasetId. The dataset name without the project name is given in the datasetId field. When creating a new dataset, leave this field blank and specify the datasetId field instead. |
| access | array | Optional. An array of objects that define dataset access for one or more entities. You can set this property when inserting or updating a dataset to control who is allowed to access the data. If unspecified at dataset creation time, BigQuery adds default dataset access for the following entities: access.specialGroup: projectReaders with access.role: READER; access.specialGroup: projectWriters with access.role: WRITER; access.specialGroup: projectOwners with access.role: OWNER; access.userByEmail: [dataset creator email] with access.role: OWNER. If you patch a dataset, this field is overwritten by the patched dataset's access field. To add entities, you must supply the entire existing access array in addition to any new entities that you want to add. |
| creationTime | string (int64) | Output only. The time when this dataset was created, in milliseconds since the epoch. |
| datasetReference | object | Required. A reference that identifies the dataset. (id: DatasetReference) |
| defaultCollation | string | Optional. Defines the default collation specification of future tables created in the dataset. If a table is created in this dataset without table-level default collation, the table inherits the dataset default collation, which is applied to the string fields that do not have explicit collation specified. A change to this field affects only tables created afterwards and does not alter existing tables. Supported values: 'und:ci' (undetermined locale, case insensitive) and '' (empty string, which defaults to case-sensitive behavior). |
| defaultEncryptionConfiguration | object | The default encryption key for all tables in the dataset. After this property is set, the encryption key of all newly created tables in the dataset is set to this value unless the table creation request or query explicitly overrides the key. (id: EncryptionConfiguration) |
| defaultPartitionExpirationMs | string (int64) | The default partition expiration, expressed in milliseconds. When new time-partitioned tables are created in a dataset where this property is set, the table inherits this value, propagated as the TimePartitioning.expirationMs property on the new table. If you set TimePartitioning.expirationMs explicitly when creating a table, the defaultPartitionExpirationMs of the containing dataset is ignored. When creating a partitioned table, if defaultPartitionExpirationMs is set, the defaultTableExpirationMs value is ignored and the table does not inherit a table expiration deadline. |
| defaultRoundingMode | string | Optional. Defines the default rounding mode specification of new tables created within this dataset. During table creation, if this field is specified, the table within this dataset inherits the default rounding mode of the dataset. Setting the default rounding mode on a table overrides this option. Existing tables in the dataset are unaffected. If columns are defined during that table creation, they immediately inherit the table's default rounding mode, unless otherwise specified. |
| defaultTableExpirationMs | string (int64) | Optional. The default lifetime of all tables in the dataset, in milliseconds. The minimum lifetime value is 3600000 milliseconds (one hour). To clear an existing default expiration with a PATCH request, set to 0. Once this property is set, all newly created tables in the dataset have an expirationTime property set to the creation time plus the value in this property. Changing the value affects only new tables, not existing ones. When the expirationTime for a given table is reached, that table is deleted automatically. If a table's expirationTime is modified or removed before the table expires, or if you provide an explicit expirationTime when creating a table, that value takes precedence over the default expiration time indicated by this property. |
| description | string | Optional. A user-friendly description of the dataset. |
| etag | string | Output only. A hash of the resource. |
| externalCatalogDatasetOptions | object | Optional. Options defining open source compatible datasets living in the BigQuery catalog. Contains metadata of the open source database, schema, or namespace represented by the current dataset. (id: ExternalCatalogDatasetOptions) |
| externalDatasetReference | object | Optional. Reference to a read-only external dataset defined in data catalogs outside of BigQuery. Filled out when the dataset type is EXTERNAL. (id: ExternalDatasetReference) |
| friendlyName | string | Optional. A descriptive name for the dataset. |
| isCaseInsensitive | boolean | Optional. TRUE if the dataset and its table names are case-insensitive, otherwise FALSE. By default, this is FALSE, which means the dataset and its table names are case-sensitive. This field does not affect routine references. |
| kind | string | Output only. The resource type. (default: bigquery#dataset) |
| labels | object | The labels associated with this dataset. You can use these to organize and group your datasets. You can set this property when inserting or updating a dataset. See Creating and Updating Dataset Labels for more information. |
| lastModifiedTime | string (int64) | Output only. The time when this dataset was last modified, in milliseconds since the epoch. |
| linkedDatasetMetadata | object | Output only. Metadata about the LinkedDataset. Filled out when the dataset type is LINKED. (id: LinkedDatasetMetadata) |
| linkedDatasetSource | object | Optional. The source dataset reference when the dataset is of type LINKED. For all other dataset types it is not set. This field cannot be updated once it is set. Any attempt to update this field using the Update and Patch API operations is ignored. (id: LinkedDatasetSource) |
| location | string | The geographic location where the dataset should reside. See https://cloud.google.com/bigquery/docs/locations for supported locations. |
| maxTimeTravelHours | string (int64) | Optional. Defines the time travel window in hours. The value can be from 48 to 168 hours (2 to 7 days). The default value is 168 hours if this is not set. |
| resourceTags | object | Optional. The tags attached to this dataset. Tag keys are globally unique. The tag key is expected to be in the namespaced format, for example "123456789012/environment", where 123456789012 is the ID of the parent organization or project resource for this tag key. The tag value is expected to be the short name, for example "Production". See Tag definitions for more details. |
| restrictions | object | Optional. Output only. Restriction config for all tables and the dataset. If set, restricts certain accesses on the dataset and all its tables based on the config. See Data egress for more details. (id: RestrictionConfig) |
| satisfiesPzi | boolean | Output only. Reserved for future use. |
| satisfiesPzs | boolean | Output only. Reserved for future use. |
| selfLink | string | Output only. A URL that can be used to access the resource again. You can use this URL in Get or Update requests to the resource. |
| storageBillingModel | string | Optional. Updates storage_billing_model for the dataset. |
| tags | array | Output only. Tags for the dataset. To provide tags as inputs, use the resourceTags field. |
| type | string | Output only. Same as type in ListFormatDataset. The type of the dataset, one of: DEFAULT (only accessible by owner and authorized accounts), PUBLIC (accessible by everyone), LINKED (linked dataset), EXTERNAL (dataset with definition in external metadata catalog). |
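
The precedence rules described for defaultTableExpirationMs, defaultPartitionExpirationMs, and an explicit table expirationTime can be modelled in a few lines of Python. This is only an illustrative sketch of the documented rules, not BigQuery's implementation; the function name and its arguments are hypothetical.

```python
from typing import Optional

MIN_TABLE_EXPIRATION_MS = 3_600_000  # documented minimum lifetime: one hour


def effective_table_expiration(
    creation_time_ms: int,
    default_table_expiration_ms: Optional[int],
    explicit_expiration_time_ms: Optional[int] = None,
    is_partitioned: bool = False,
    default_partition_expiration_ms: Optional[int] = None,
) -> Optional[int]:
    """Return a new table's expirationTime in epoch ms, or None if it never expires."""
    # An explicit expirationTime on the table wins over the dataset default.
    if explicit_expiration_time_ms is not None:
        return explicit_expiration_time_ms
    # For partitioned tables, a dataset-level partition default means the
    # table-level default is ignored and no table expiration is inherited.
    if is_partitioned and default_partition_expiration_ms is not None:
        return None
    # None means the default was never set; 0 is the value used to clear it.
    if not default_table_expiration_ms:
        return None
    if default_table_expiration_ms < MIN_TABLE_EXPIRATION_MS:
        raise ValueError("defaultTableExpirationMs must be at least 3600000")
    return creation_time_ms + default_table_expiration_ms
```

The sketch covers only tables created after the dataset default is set, since the description states that existing tables are unaffected by later changes.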

Methods

The following methods are available for this resource:

| Name | Accessible by | Required Params | Optional Params | Description |
|---|---|---|---|---|
| get | select | projectId, +datasetId | accessPolicyVersion, datasetView | Returns the dataset specified by datasetId. |
| list | select | projectId | all, filter, maxResults, pageToken | Lists all datasets in the specified project to which the user has been granted the READER dataset role. |
| insert | insert | projectId | accessPolicyVersion | Creates a new empty dataset. |
| patch | update | projectId, +datasetId | accessPolicyVersion, updateMode | Updates information in an existing dataset. The update method replaces the entire dataset resource, whereas the patch method only replaces fields that are provided in the submitted dataset resource. This method supports RFC5789 patch semantics. |
| update | replace | projectId, +datasetId | accessPolicyVersion, updateMode | Updates information in an existing dataset. The update method replaces the entire dataset resource, whereas the patch method only replaces fields that are provided in the submitted dataset resource. |
| delete | delete | projectId, +datasetId | deleteContents | Deletes the dataset specified by the datasetId value. Before you can delete a dataset, you must delete all its tables, either manually or by specifying deleteContents. Immediately after deletion, you can create another dataset with the same name. |
| undelete | exec | projectId, +datasetId | | Undeletes a dataset that is within the time travel window, identified by datasetId. If a time is specified, the dataset version deleted at that time is undeleted; otherwise the last live version is undeleted. |
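
The difference between patch and update described above can be pictured with plain dictionaries. The following Python sketch is a deliberately simplified model: it treats the dataset as a flat dictionary, ignores output-only fields and null handling, and the function names are hypothetical rather than part of any API.

```python
from typing import Any, Dict


def apply_update(current: Dict[str, Any], submitted: Dict[str, Any]) -> Dict[str, Any]:
    """Model of update: the submitted resource replaces the stored one entirely."""
    return dict(submitted)


def apply_patch(current: Dict[str, Any], submitted: Dict[str, Any]) -> Dict[str, Any]:
    """Model of patch: only the submitted top-level fields are replaced."""
    merged = dict(current)
    merged.update(submitted)
    return merged


# Hypothetical stored dataset and a change that touches one field.
current = {"description": "sales data", "friendlyName": "Sales", "labels": {"env": "dev"}}
change = {"description": "sales data (EU)"}

patched = apply_patch(current, change)   # friendlyName and labels survive
updated = apply_update(current, change)  # friendlyName and labels are dropped
```

In this model a patch replaces each submitted field wholesale, which matches the note that a patched access array overwrites the existing one.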

Parameters

Parameters can be passed in the WHERE clause of a query. Check the Methods section to see which parameters are required or optional for each operation.

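| Name | Datatype | Description |
|---|---|---|
| +datasetId | string | |
| projectId | string | |
| accessPolicyVersion | integer (int32) | |
| all | boolean | |
| datasetView | string | |
| deleteContents | boolean | |
| filter | string | |
| maxResults | integer (uint32) | |
| pageToken | string | |
| updateMode | string | |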

SELECT examples

Returns the dataset specified by datasetId.

SELECT
id,
access,
creationTime,
datasetReference,
defaultCollation,
defaultEncryptionConfiguration,
defaultPartitionExpirationMs,
defaultRoundingMode,
defaultTableExpirationMs,
description,
etag,
externalCatalogDatasetOptions,
externalDatasetReference,
friendlyName,
isCaseInsensitive,
kind,
labels,
lastModifiedTime,
linkedDatasetMetadata,
linkedDatasetSource,
location,
maxTimeTravelHours,
resourceTags,
restrictions,
satisfiesPzi,
satisfiesPzs,
selfLink,
storageBillingModel,
tags,
type
FROM google.bigquery.datasets
WHERE projectId = '{{ projectId }}' -- required
AND +datasetId = '{{ +datasetId }}' -- required
AND accessPolicyVersion = '{{ accessPolicyVersion }}'
AND datasetView = '{{ datasetView }}';
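
The creationTime and lastModifiedTime columns come back as string-encoded int64 values in milliseconds since the epoch. A minimal Python sketch for turning them into timezone-aware datetimes follows; the row dictionary is a hypothetical stand-in for one result row, and the helper name is illustrative.

```python
from datetime import datetime, timezone


def epoch_ms_to_datetime(value: str) -> datetime:
    """Convert a string-encoded int64 of epoch milliseconds to an aware UTC datetime."""
    ms = int(value)
    # Split into whole seconds and leftover milliseconds to avoid float rounding.
    seconds, millis = divmod(ms, 1000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.replace(microsecond=millis * 1000)


# Hypothetical row as it might be returned by the SELECT above.
row = {"creationTime": "1700000000123", "lastModifiedTime": "1700000500000"}
created = epoch_ms_to_datetime(row["creationTime"])
modified = epoch_ms_to_datetime(row["lastModifiedTime"])
print(created.isoformat(), modified.isoformat())
```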

INSERT examples

Creates a new empty dataset.

INSERT INTO google.bigquery.datasets (
data__access,
data__datasetReference,
data__defaultCollation,
data__defaultEncryptionConfiguration,
data__defaultPartitionExpirationMs,
data__defaultRoundingMode,
data__defaultTableExpirationMs,
data__description,
data__externalCatalogDatasetOptions,
data__externalDatasetReference,
data__friendlyName,
data__isCaseInsensitive,
data__labels,
data__linkedDatasetSource,
data__location,
data__maxTimeTravelHours,
data__resourceTags,
data__storageBillingModel,
projectId,
accessPolicyVersion
)
SELECT
'{{ access }}',
'{{ datasetReference }}',
'{{ defaultCollation }}',
'{{ defaultEncryptionConfiguration }}',
'{{ defaultPartitionExpirationMs }}',
'{{ defaultRoundingMode }}',
'{{ defaultTableExpirationMs }}',
'{{ description }}',
'{{ externalCatalogDatasetOptions }}',
'{{ externalDatasetReference }}',
'{{ friendlyName }}',
{{ isCaseInsensitive }},
'{{ labels }}',
'{{ linkedDatasetSource }}',
'{{ location }}',
'{{ maxTimeTravelHours }}',
'{{ resourceTags }}',
'{{ storageBillingModel }}',
'{{ projectId }}',
'{{ accessPolicyVersion }}'
RETURNING
id,
access,
creationTime,
datasetReference,
defaultCollation,
defaultEncryptionConfiguration,
defaultPartitionExpirationMs,
defaultRoundingMode,
defaultTableExpirationMs,
description,
etag,
externalCatalogDatasetOptions,
externalDatasetReference,
friendlyName,
isCaseInsensitive,
kind,
labels,
lastModifiedTime,
linkedDatasetMetadata,
linkedDatasetSource,
location,
maxTimeTravelHours,
resourceTags,
restrictions,
satisfiesPzi,
satisfiesPzs,
selfLink,
storageBillingModel,
tags,
type
;
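
Before filling in the template above, it can help to assemble and check the values in code. The Python sketch below assumes that object-typed fields such as datasetReference and labels are supplied as JSON strings, which is an assumption about how the template placeholders are filled and is not stated on this page. The helper name is hypothetical; the 48 to 168 hour range comes from the maxTimeTravelHours description.

```python
import json
from typing import Dict, Optional


def build_insert_values(
    project_id: str,
    dataset_id: str,
    location: str,
    labels: Optional[Dict[str, str]] = None,
    max_time_travel_hours: Optional[int] = None,
) -> Dict[str, str]:
    """Return placeholder values for a minimal dataset INSERT (hypothetical helper)."""
    # Documented range for the time travel window: 48 to 168 hours.
    if max_time_travel_hours is not None and not 48 <= max_time_travel_hours <= 168:
        raise ValueError("maxTimeTravelHours must be between 48 and 168")
    values = {
        "projectId": project_id,
        # Assumption: object fields are passed as JSON-encoded strings.
        "datasetReference": json.dumps({"projectId": project_id, "datasetId": dataset_id}),
        "location": location,
    }
    if labels:
        values["labels"] = json.dumps(labels)
    if max_time_travel_hours is not None:
        # int64 fields are string-encoded in this resource.
        values["maxTimeTravelHours"] = str(max_time_travel_hours)
    return values


vals = build_insert_values("my-project", "my_dataset", "EU",
                           labels={"env": "dev"}, max_time_travel_hours=72)
```

Only fields you intend to set need to appear in the INSERT column list; id is left blank on creation, as noted in the Fields section.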

UPDATE examples

Updates information in an existing dataset. The update method replaces the entire dataset resource, whereas the patch method only replaces fields that are provided in the submitted dataset resource. This method supports RFC5789 patch semantics.

UPDATE google.bigquery.datasets
SET
data__access = '{{ access }}',
data__datasetReference = '{{ datasetReference }}',
data__defaultCollation = '{{ defaultCollation }}',
data__defaultEncryptionConfiguration = '{{ defaultEncryptionConfiguration }}',
data__defaultPartitionExpirationMs = '{{ defaultPartitionExpirationMs }}',
data__defaultRoundingMode = '{{ defaultRoundingMode }}',
data__defaultTableExpirationMs = '{{ defaultTableExpirationMs }}',
data__description = '{{ description }}',
data__externalCatalogDatasetOptions = '{{ externalCatalogDatasetOptions }}',
data__externalDatasetReference = '{{ externalDatasetReference }}',
data__friendlyName = '{{ friendlyName }}',
data__isCaseInsensitive = {{ isCaseInsensitive }},
data__labels = '{{ labels }}',
data__linkedDatasetSource = '{{ linkedDatasetSource }}',
data__location = '{{ location }}',
data__maxTimeTravelHours = '{{ maxTimeTravelHours }}',
data__resourceTags = '{{ resourceTags }}',
data__storageBillingModel = '{{ storageBillingModel }}'
WHERE
projectId = '{{ projectId }}' -- required
AND +datasetId = '{{ +datasetId }}' -- required
AND accessPolicyVersion = '{{ accessPolicyVersion }}'
AND updateMode = '{{ updateMode }}'
RETURNING
id,
access,
creationTime,
datasetReference,
defaultCollation,
defaultEncryptionConfiguration,
defaultPartitionExpirationMs,
defaultRoundingMode,
defaultTableExpirationMs,
description,
etag,
externalCatalogDatasetOptions,
externalDatasetReference,
friendlyName,
isCaseInsensitive,
kind,
labels,
lastModifiedTime,
linkedDatasetMetadata,
linkedDatasetSource,
location,
maxTimeTravelHours,
resourceTags,
restrictions,
satisfiesPzi,
satisfiesPzs,
selfLink,
storageBillingModel,
tags,
type;
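
Because a patch overwrites the access field, adding an entity means resubmitting the existing array together with the new entries. A minimal Python sketch of that merge is shown below. The entries and the e-mail address are made-up examples, the helper name is hypothetical, and passing the result as a JSON string for data__access is an assumption.

```python
import json
from typing import Any, Dict, List


def merge_access(
    existing: List[Dict[str, Any]], additions: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Return the existing access entries plus any additions not already present."""
    merged = list(existing)  # copy so the caller's list is not mutated
    for entry in additions:
        if entry not in merged:
            merged.append(entry)
    return merged


# Hypothetical access array previously read from the dataset.
existing = [
    {"role": "OWNER", "specialGroup": "projectOwners"},
    {"role": "READER", "specialGroup": "projectReaders"},
]
new_entries = [{"role": "READER", "userByEmail": "analyst@example.com"}]

# Serialized value that could be supplied for data__access.
access_value = json.dumps(merge_access(existing, new_entries))
```

Reading the current access array with a SELECT immediately before the update keeps the window for losing concurrent changes small.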

REPLACE examples

Updates information in an existing dataset. The update method replaces the entire dataset resource, whereas the patch method only replaces fields that are provided in the submitted dataset resource.

REPLACE google.bigquery.datasets
SET
data__access = '{{ access }}',
data__datasetReference = '{{ datasetReference }}',
data__defaultCollation = '{{ defaultCollation }}',
data__defaultEncryptionConfiguration = '{{ defaultEncryptionConfiguration }}',
data__defaultPartitionExpirationMs = '{{ defaultPartitionExpirationMs }}',
data__defaultRoundingMode = '{{ defaultRoundingMode }}',
data__defaultTableExpirationMs = '{{ defaultTableExpirationMs }}',
data__description = '{{ description }}',
data__externalCatalogDatasetOptions = '{{ externalCatalogDatasetOptions }}',
data__externalDatasetReference = '{{ externalDatasetReference }}',
data__friendlyName = '{{ friendlyName }}',
data__isCaseInsensitive = {{ isCaseInsensitive }},
data__labels = '{{ labels }}',
data__linkedDatasetSource = '{{ linkedDatasetSource }}',
data__location = '{{ location }}',
data__maxTimeTravelHours = '{{ maxTimeTravelHours }}',
data__resourceTags = '{{ resourceTags }}',
data__storageBillingModel = '{{ storageBillingModel }}'
WHERE
projectId = '{{ projectId }}' -- required
AND +datasetId = '{{ +datasetId }}' -- required
AND accessPolicyVersion = '{{ accessPolicyVersion }}'
AND updateMode = '{{ updateMode }}'
RETURNING
id,
access,
creationTime,
datasetReference,
defaultCollation,
defaultEncryptionConfiguration,
defaultPartitionExpirationMs,
defaultRoundingMode,
defaultTableExpirationMs,
description,
etag,
externalCatalogDatasetOptions,
externalDatasetReference,
friendlyName,
isCaseInsensitive,
kind,
labels,
lastModifiedTime,
linkedDatasetMetadata,
linkedDatasetSource,
location,
maxTimeTravelHours,
resourceTags,
restrictions,
satisfiesPzi,
satisfiesPzs,
selfLink,
storageBillingModel,
tags,
type;

DELETE examples

Deletes the dataset specified by the datasetId value. Before you can delete a dataset, you must delete all its tables, either manually or by specifying deleteContents. Immediately after deletion, you can create another dataset with the same name.

DELETE FROM google.bigquery.datasets
WHERE projectId = '{{ projectId }}' -- required
AND +datasetId = '{{ +datasetId }}' -- required
AND deleteContents = '{{ deleteContents }}';

Lifecycle Methods

Undeletes a dataset that is within the time travel window, identified by datasetId. If a time is specified, the dataset version deleted at that time is undeleted; otherwise the last live version is undeleted.

EXEC google.bigquery.datasets.undelete
@projectId='{{ projectId }}', -- required
@+datasetId='{{ +datasetId }}' -- required
@@json=
'{
"deletionTime": "{{ deletionTime }}"
}';
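
The @@json body can be generated rather than typed by hand. The Python sketch below assumes that deletionTime is expressed as an RFC 3339 UTC timestamp, which is not stated on this page and should be checked against the API reference. The helper name is hypothetical.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def undelete_body(deletion_time: Optional[datetime] = None) -> str:
    """Build the JSON body for undelete; with no time, the last live version is targeted."""
    if deletion_time is None:
        return "{}"
    if deletion_time.tzinfo is None:
        raise ValueError("deletion_time must be timezone-aware")
    # Assumption: the API expects an RFC 3339 timestamp in UTC.
    stamp = deletion_time.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return json.dumps({"deletionTime": stamp})


body = undelete_body(datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc))
print(body)
```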