clusters

Creates, updates, deletes, gets or lists a clusters resource.

Overview

Name: clusters
Type: Resource
Id: google.dataproc.clusters

Fields

The following fields are returned by SELECT queries:

Successful response

| Name | Datatype | Description |
|---|---|---|
| clusterName | string | Required. The cluster name, which must be unique within a project. The name must start with a lowercase letter, and can contain up to 51 lowercase letters, numbers, and hyphens. It cannot end with a hyphen. The name of a deleted cluster can be reused. |
| clusterUuid | string | Output only. A cluster UUID (Universally Unique Identifier). Dataproc generates this value when it creates the cluster. |
| config | object | Optional. The cluster config for a cluster of Compute Engine instances. Note that Dataproc may set default values, and values may change when clusters are updated. Exactly one of ClusterConfig or VirtualClusterConfig must be specified. (id: ClusterConfig) |
| labels | object | Optional. The labels to associate with this cluster. Label keys must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). Label values may be empty, but, if present, must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). No more than 32 labels can be associated with a cluster. |
| metrics | object | Output only. Contains cluster daemon metrics such as HDFS and YARN stats. Beta feature: this report is available for testing purposes only. It may be changed before final release. (id: ClusterMetrics) |
| projectId | string | Required. The Google Cloud Platform project ID that the cluster belongs to. |
| status | object | Output only. Cluster status. (id: ClusterStatus) |
| statusHistory | array | Output only. The previous cluster status. |
| virtualClusterConfig | object | Optional. The virtual cluster config is used when creating a Dataproc cluster that does not directly control the underlying compute resources, for example, when creating a Dataproc-on-GKE cluster (https://cloud.google.com/dataproc/docs/guides/dpgke/dataproc-gke-overview). Dataproc may set default values, and values may change when clusters are updated. Exactly one of config or virtual_cluster_config must be specified. (id: VirtualClusterConfig) |
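
Several of the fields above (config, labels, metrics, status) are returned as objects, so a query often needs to pull a single value out of one. The following is a minimal sketch of how that might look. It assumes that a `JSON_EXTRACT` function is available in the StackQL build in use and that the `status` object carries a `state` key, as described for ClusterStatus in the Dataproc API reference. Neither assumption is confirmed by this page.

```sql
-- Sketch only: list clusters in a region with a flattened status column.
-- JSON_EXTRACT and the '$.state' path are assumptions, not taken from this page.
SELECT
clusterName,
JSON_EXTRACT(status, '$.state') AS state
FROM google.dataproc.clusters
WHERE projectId = '{{ projectId }}' -- required
AND region = '{{ region }}'; -- required
```

If the function is not available, selecting `status` as a whole and parsing it on the client side gives the same information.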

Methods

The following methods are available for this resource:

| Name | Accessible by | Required Params | Optional Params | Description |
|---|---|---|---|---|
| projects_regions_clusters_get | select | projectId, region, clusterName | | Gets the resource representation for a cluster in a project. |
| projects_regions_clusters_list | select | projectId, region | filter, pageSize, pageToken | Lists all regions/{region}/clusters in a project alphabetically. |
| projects_regions_clusters_create | insert | projectId, region | requestId, actionOnFailedPrimaryWorkers | Creates a cluster in a project. The returned Operation.metadata will be ClusterOperationMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#clusteroperationmetadata). |
| projects_regions_clusters_patch | update | projectId, region, clusterName | gracefulDecommissionTimeout, updateMask, requestId | Updates a cluster in a project. The returned Operation.metadata will be ClusterOperationMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#clusteroperationmetadata). The cluster must be in a RUNNING state or an error is returned. |
| projects_regions_clusters_delete | delete | projectId, region, clusterName | clusterUuid, requestId, gracefulTerminationTimeout | Deletes a cluster in a project. The returned Operation.metadata will be ClusterOperationMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#clusteroperationmetadata). |
| projects_regions_clusters_stop | exec | projectId, region, clusterName | | Stops a cluster in a project. |
| projects_regions_clusters_start | exec | projectId, region, clusterName | | Starts a cluster in a project. |
| projects_regions_clusters_repair | exec | projectId, region, clusterName | | Repairs a cluster. |
| projects_regions_clusters_diagnose | exec | projectId, region, clusterName | | Gets cluster diagnostic information. The returned Operation.metadata will be ClusterOperationMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#clusteroperationmetadata). After the operation completes, Operation.response contains DiagnoseClusterResults (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#diagnoseclusterresults). |
| projects_regions_clusters_inject_credentials | exec | projectsId, regionsId, clustersId | | Injects encrypted credentials into all of the VMs in a cluster. The target cluster must be a personal auth cluster assigned to the user who is issuing the RPC. |

Parameters

Parameters can be passed in the WHERE clause of a query. Check the Methods section to see which parameters are required or optional for each operation.

| Name | Datatype | Description |
|---|---|---|
| clusterName | string | |
| clustersId | string | |
| projectId | string | |
| projectsId | string | |
| region | string | |
| regionsId | string | |
| actionOnFailedPrimaryWorkers | string | |
| clusterUuid | string | |
| filter | string | |
| gracefulDecommissionTimeout | string (google-duration) | |
| gracefulTerminationTimeout | string (google-duration) | |
| pageSize | integer (int32) | |
| pageToken | string | |
| requestId | string | |
| updateMask | string (google-fieldmask) | |

SELECT examples

Gets the resource representation for a cluster in a project.

SELECT
clusterName,
clusterUuid,
config,
labels,
metrics,
projectId,
status,
statusHistory,
virtualClusterConfig
FROM google.dataproc.clusters
WHERE projectId = '{{ projectId }}' -- required
AND region = '{{ region }}' -- required
AND clusterName = '{{ clusterName }}'; -- required
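
The Methods section also lists projects_regions_clusters_list, which takes only projectId and region as required parameters and accepts filter, pageSize, and pageToken as optional ones. A query for that method might look like the sketch below. It assumes that leaving out clusterName routes the SELECT to the list method, and that optional parameters are passed in the WHERE clause the same way the UPDATE and DELETE examples on this page pass theirs.

```sql
-- Sketch only: list clusters in a region (projects_regions_clusters_list).
SELECT
clusterName,
clusterUuid,
labels,
status
FROM google.dataproc.clusters
WHERE projectId = '{{ projectId }}' -- required
AND region = '{{ region }}' -- required
AND filter = '{{ filter }}' -- optional
AND pageSize = '{{ pageSize }}' -- optional
AND pageToken = '{{ pageToken }}'; -- optional
```

The Dataproc API reference gives `status.state = ACTIVE AND labels.env = staging` as the general shape of a filter expression. Treat that as an illustration and check the reference for the exact syntax supported.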

INSERT examples

Creates a cluster in a project. The returned Operation.metadata will be ClusterOperationMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#clusteroperationmetadata).

INSERT INTO google.dataproc.clusters (
data__projectId,
data__clusterName,
data__config,
data__virtualClusterConfig,
data__labels,
projectId,
region,
requestId,
actionOnFailedPrimaryWorkers
)
SELECT
'{{ projectId }}',
'{{ clusterName }}',
'{{ config }}',
'{{ virtualClusterConfig }}',
'{{ labels }}',
'{{ projectId }}',
'{{ region }}',
'{{ requestId }}',
'{{ actionOnFailedPrimaryWorkers }}'
RETURNING
name,
done,
error,
metadata,
response
;

UPDATE examples

Updates a cluster in a project. The returned Operation.metadata will be ClusterOperationMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#clusteroperationmetadata). The cluster must be in a RUNNING state or an error is returned.

UPDATE google.dataproc.clusters
SET
data__projectId = '{{ projectId }}',
data__clusterName = '{{ clusterName }}',
data__config = '{{ config }}',
data__virtualClusterConfig = '{{ virtualClusterConfig }}',
data__labels = '{{ labels }}'
WHERE
projectId = '{{ projectId }}' -- required
AND region = '{{ region }}' -- required
AND clusterName = '{{ clusterName }}' -- required
AND gracefulDecommissionTimeout = '{{ gracefulDecommissionTimeout }}'
AND updateMask = '{{ updateMask }}'
AND requestId = '{{ requestId }}'
RETURNING
name,
done,
error,
metadata,
response;
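
The template above sets every writable field at once, whereas a patch usually changes one field and names it in updateMask. As an illustration, the sketch below resizes the primary worker group to 5 instances. The mask path `config.worker_config.num_instances` follows the example given in the Dataproc API reference for this method. Passing data__config as a JSON string, and the worker count itself, are assumptions made for this example.

```sql
-- Sketch only: change the primary worker count of a running cluster.
-- The JSON body and the value 5 are illustrative assumptions.
UPDATE google.dataproc.clusters
SET
data__config = '{"workerConfig": {"numInstances": 5}}'
WHERE
projectId = '{{ projectId }}' -- required
AND region = '{{ region }}' -- required
AND clusterName = '{{ clusterName }}' -- required
AND updateMask = 'config.worker_config.num_instances'
RETURNING
name,
done,
error,
metadata;
```

Only the fields named in updateMask should be affected, so the other data__ columns are left out here.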

DELETE examples

Deletes a cluster in a project. The returned Operation.metadata will be ClusterOperationMetadata (https://cloud.google.com/dataproc/docs/reference/rpc/google.cloud.dataproc.v1#clusteroperationmetadata).

DELETE FROM google.dataproc.clusters
WHERE projectId = '{{ projectId }}' --required
AND region = '{{ region }}' --required
AND clusterName = '{{ clusterName }}' --required
AND clusterUuid = '{{ clusterUuid }}'
AND requestId = '{{ requestId }}'
AND gracefulTerminationTimeout = '{{ gracefulTerminationTimeout }}';

Lifecycle Methods

Stops a cluster in a project.

EXEC google.dataproc.clusters.projects_regions_clusters_stop
@projectId='{{ projectId }}', -- required
@region='{{ region }}', -- required
@clusterName='{{ clusterName }}' -- required
@@json=
'{
"clusterUuid": "{{ clusterUuid }}",
"requestId": "{{ requestId }}"
}';
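
Starts a cluster in a project.

The start method takes the same required parameters as stop, so a call can be sketched by following the stop example. The request body shown here (clusterUuid and requestId) is assumed to mirror the stop body, and the placement of `@@json` without a preceding comma simply follows the stop example. Neither is confirmed for this method by this page.

```sql
-- Sketch only: start a stopped cluster.
-- The @@json body fields are assumed to match those of the stop method.
EXEC google.dataproc.clusters.projects_regions_clusters_start
@projectId='{{ projectId }}', -- required
@region='{{ region }}', -- required
@clusterName='{{ clusterName }}' -- required
@@json=
'{
"clusterUuid": "{{ clusterUuid }}",
"requestId": "{{ requestId }}"
}';
```

Each separating comma sits before the `--` comment, because anything after `--` on a line is treated as a comment.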