pipelines
Creates, updates, deletes, gets, or lists a pipelines resource.
Overview
Name | pipelines |
---|---|
Type | Resource |
Id | google.datapipelines.pipelines |
Fields
The following fields are returned by SELECT
queries:
- get
- list
Successful response
Name | Datatype | Description |
---|---|---|
name | string | The pipeline name. For example: projects/PROJECT_ID/locations/LOCATION_ID/pipelines/PIPELINE_ID . * PROJECT_ID can contain letters ([A-Za-z]), numbers ([0-9]), hyphens (-), colons (:), and periods (.). For more information, see Identifying projects. * LOCATION_ID is the canonical ID for the pipeline's location. The list of available locations can be obtained by calling google.cloud.location.Locations.ListLocations . Note that the Data Pipelines service is not available in all regions. It depends on Cloud Scheduler, an App Engine application, so it's only available in App Engine regions. * PIPELINE_ID is the ID of the pipeline. Must be unique for the selected project and location. |
createTime | string (google-datetime) | Output only. Immutable. The timestamp when the pipeline was initially created. Set by the Data Pipelines service. |
displayName | string | Required. The display name of the pipeline. It can contain only letters ([A-Za-z]), numbers ([0-9]), hyphens (-), and underscores (_). |
jobCount | integer (int32) | Output only. Number of jobs. |
lastUpdateTime | string (google-datetime) | Output only. Immutable. The timestamp when the pipeline was last modified. Set by the Data Pipelines service. |
pipelineSources | object | Immutable. The sources of the pipeline (for example, Dataplex). The keys and values are set by the corresponding sources during pipeline creation. |
scheduleInfo | object | Internal scheduling information for a pipeline. If this information is provided, periodic jobs will be created per the schedule. If not, users are responsible for creating jobs externally. (id: GoogleCloudDatapipelinesV1ScheduleSpec) |
schedulerServiceAccountEmail | string | Optional. A service account email to be used with the Cloud Scheduler job. If not specified, the default Compute Engine service account will be used. |
state | string | Required. The state of the pipeline. When the pipeline is created, the state is set to 'PIPELINE_STATE_ACTIVE' by default. State changes can be requested by setting the state to stopping, paused, or resuming. State cannot be changed through UpdatePipeline requests. |
type | string | Required. The type of the pipeline. This field affects the scheduling of the pipeline and the type of metrics to show for the pipeline. |
workload | object | Workload information for creating new jobs. (id: GoogleCloudDatapipelinesV1Workload) |
Methods
The following methods are available for this resource:
Name | Accessible by | Required Params | Optional Params | Description |
---|---|---|---|---|
get | select | projectsId, locationsId, pipelinesId | | Looks up a single pipeline. Returns a "NOT_FOUND" error if no such pipeline exists. Returns a "FORBIDDEN" error if the caller doesn't have permission to access it. |
list | select | projectsId, locationsId | filter, pageSize, pageToken | Lists pipelines. Returns a "FORBIDDEN" error if the caller doesn't have permission to access it. |
create | insert | projectsId, locationsId | | Creates a pipeline. For a batch pipeline, you can pass scheduler information. Data Pipelines uses the scheduler information to create an internal scheduler that runs jobs periodically. If the internal scheduler is not configured, you can use RunPipeline to run jobs. |
patch | update | projectsId, locationsId, pipelinesId | updateMask | Updates a pipeline. If successful, the updated Pipeline is returned. Returns NOT_FOUND if the pipeline doesn't exist. If UpdatePipeline does not return successfully, you can retry the UpdatePipeline request until you receive a successful response. |
delete | delete | projectsId, locationsId, pipelinesId | | Deletes a pipeline. If a scheduler job is attached to the pipeline, it will be deleted. |
stop | exec | projectsId, locationsId, pipelinesId | | Freezes pipeline execution permanently. If there's a corresponding scheduler entry, it's deleted, and the pipeline state is changed to "ARCHIVED". However, pipeline metadata is retained. |
run | exec | projectsId, locationsId, pipelinesId | | Creates a job for the specified pipeline directly. You can use this method when the internal scheduler is not configured and you want to trigger the job directly or through an external system. Returns a "NOT_FOUND" error if the pipeline doesn't exist. Returns a "FORBIDDEN" error if the user doesn't have permission to access the pipeline or run jobs for the pipeline. |
Parameters
Parameters can be passed in the WHERE
clause of a query. Check the Methods section to see which parameters are required or optional for each operation.
Name | Datatype | Description |
---|---|---|
locationsId | string | |
pipelinesId | string | |
projectsId | string | |
filter | string | |
pageSize | integer (int32) | |
pageToken | string | |
updateMask | string (google-fieldmask) |
SELECT
examples
- get
- list
Looks up a single pipeline. Returns a "NOT_FOUND" error if no such pipeline exists. Returns a "FORBIDDEN" error if the caller doesn't have permission to access it.
SELECT
name,
createTime,
displayName,
jobCount,
lastUpdateTime,
pipelineSources,
scheduleInfo,
schedulerServiceAccountEmail,
state,
type,
workload
FROM google.datapipelines.pipelines
WHERE projectsId = '{{ projectsId }}' -- required
AND locationsId = '{{ locationsId }}' -- required
AND pipelinesId = '{{ pipelinesId }}'; -- required
Lists pipelines. Returns a "FORBIDDEN" error if the caller doesn't have permission to access it.
SELECT
name,
createTime,
displayName,
jobCount,
lastUpdateTime,
pipelineSources,
scheduleInfo,
schedulerServiceAccountEmail,
state,
type,
workload
FROM google.datapipelines.pipelines
WHERE projectsId = '{{ projectsId }}' -- required
AND locationsId = '{{ locationsId }}' -- required
AND filter = '{{ filter }}'
AND pageSize = '{{ pageSize }}'
AND pageToken = '{{ pageToken }}';
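As a sketch, a paginated listing can bind the optional parameters to concrete values. The project ID, location, and filter expression below are hypothetical; the exact filter grammar is defined by the Data Pipelines API:

```sql
-- Page through pipelines 50 at a time; filter value is illustrative only
SELECT
name,
displayName,
state,
jobCount
FROM google.datapipelines.pipelines
WHERE projectsId = 'my-project' -- hypothetical project ID
AND locationsId = 'us-central1' -- hypothetical App Engine region
AND filter = 'type:BATCH' -- hypothetical filter expression
AND pageSize = '50';
```

Subsequent pages would be fetched by passing the `pageToken` returned by the previous query.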
INSERT
examples
- create
- Manifest
Creates a pipeline. For a batch pipeline, you can pass scheduler information. Data Pipelines uses the scheduler information to create an internal scheduler that runs jobs periodically. If the internal scheduler is not configured, you can use RunPipeline to run jobs.
INSERT INTO google.datapipelines.pipelines (
data__name,
data__displayName,
data__type,
data__state,
data__workload,
data__scheduleInfo,
data__schedulerServiceAccountEmail,
data__pipelineSources,
projectsId,
locationsId
)
SELECT
'{{ name }}',
'{{ displayName }}',
'{{ type }}',
'{{ state }}',
'{{ workload }}',
'{{ scheduleInfo }}',
'{{ schedulerServiceAccountEmail }}',
'{{ pipelineSources }}',
'{{ projectsId }}',
'{{ locationsId }}'
RETURNING
name,
createTime,
displayName,
jobCount,
lastUpdateTime,
pipelineSources,
scheduleInfo,
schedulerServiceAccountEmail,
state,
type,
workload
;
# Description fields are for documentation purposes
- name: pipelines
props:
- name: projectsId
value: string
description: Required parameter for the pipelines resource.
- name: locationsId
value: string
description: Required parameter for the pipelines resource.
- name: name
value: string
description: >
The pipeline name. For example: `projects/PROJECT_ID/locations/LOCATION_ID/pipelines/PIPELINE_ID`. * `PROJECT_ID` can contain letters ([A-Za-z]), numbers ([0-9]), hyphens (-), colons (:), and periods (.). For more information, see [Identifying projects](https://cloud.google.com/resource-manager/docs/creating-managing-projects#identifying_projects). * `LOCATION_ID` is the canonical ID for the pipeline's location. The list of available locations can be obtained by calling `google.cloud.location.Locations.ListLocations`. Note that the Data Pipelines service is not available in all regions. It depends on Cloud Scheduler, an App Engine application, so it's only available in [App Engine regions](https://cloud.google.com/about/locations#region). * `PIPELINE_ID` is the ID of the pipeline. Must be unique for the selected project and location.
- name: displayName
value: string
description: >
Required. The display name of the pipeline. It can contain only letters ([A-Za-z]), numbers ([0-9]), hyphens (-), and underscores (_).
- name: type
value: string
description: >
Required. The type of the pipeline. This field affects the scheduling of the pipeline and the type of metrics to show for the pipeline.
valid_values: ['PIPELINE_TYPE_UNSPECIFIED', 'PIPELINE_TYPE_BATCH', 'PIPELINE_TYPE_STREAMING']
- name: state
value: string
description: >
Required. The state of the pipeline. When the pipeline is created, the state is set to 'PIPELINE_STATE_ACTIVE' by default. State changes can be requested by setting the state to stopping, paused, or resuming. State cannot be changed through UpdatePipeline requests.
valid_values: ['STATE_UNSPECIFIED', 'STATE_RESUMING', 'STATE_ACTIVE', 'STATE_STOPPING', 'STATE_ARCHIVED', 'STATE_PAUSED']
- name: workload
value: object
description: >
Workload information for creating new jobs.
- name: scheduleInfo
value: object
description: >
Internal scheduling information for a pipeline. If this information is provided, periodic jobs will be created per the schedule. If not, users are responsible for creating jobs externally.
- name: schedulerServiceAccountEmail
value: string
description: >
        Optional. A service account email to be used with the Cloud Scheduler job. If not specified, the default Compute Engine service account will be used.
- name: pipelineSources
value: object
description: >
Immutable. The sources of the pipeline (for example, Dataplex). The keys and values are set by the corresponding sources during pipeline creation.
UPDATE
examples
- patch
Updates a pipeline. If successful, the updated Pipeline is returned. Returns NOT_FOUND
if the pipeline doesn't exist. If UpdatePipeline does not return successfully, you can retry the UpdatePipeline request until you receive a successful response.
UPDATE google.datapipelines.pipelines
SET
data__name = '{{ name }}',
data__displayName = '{{ displayName }}',
data__type = '{{ type }}',
data__state = '{{ state }}',
data__workload = '{{ workload }}',
data__scheduleInfo = '{{ scheduleInfo }}',
data__schedulerServiceAccountEmail = '{{ schedulerServiceAccountEmail }}',
data__pipelineSources = '{{ pipelineSources }}'
WHERE
projectsId = '{{ projectsId }}' --required
AND locationsId = '{{ locationsId }}' --required
AND pipelinesId = '{{ pipelinesId }}' --required
AND updateMask = '{{ updateMask }}'
RETURNING
name,
createTime,
displayName,
jobCount,
lastUpdateTime,
pipelineSources,
scheduleInfo,
schedulerServiceAccountEmail,
state,
type,
workload;
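Because patch accepts an updateMask, a partial update need only set the fields being changed. A minimal sketch that renames a pipeline (the project, location, pipeline IDs, and new display name below are hypothetical) could look like:

```sql
-- Update only displayName; updateMask restricts the write to that field
UPDATE google.datapipelines.pipelines
SET
data__displayName = 'nightly-batch-renamed'
WHERE
projectsId = 'my-project' --required
AND locationsId = 'us-central1' --required
AND pipelinesId = 'my-pipeline' --required
AND updateMask = 'displayName'
RETURNING
name,
displayName,
lastUpdateTime;
```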
DELETE
examples
- delete
Deletes a pipeline. If a scheduler job is attached to the pipeline, it will be deleted.
DELETE FROM google.datapipelines.pipelines
WHERE projectsId = '{{ projectsId }}' --required
AND locationsId = '{{ locationsId }}' --required
AND pipelinesId = '{{ pipelinesId }}'; --required
Lifecycle Methods
- stop
- run
Freezes pipeline execution permanently. If there's a corresponding scheduler entry, it's deleted, and the pipeline state is changed to "ARCHIVED". However, pipeline metadata is retained.
EXEC google.datapipelines.pipelines.stop
@projectsId='{{ projectsId }}', --required
@locationsId='{{ locationsId }}', --required
@pipelinesId='{{ pipelinesId }}'; --required
Creates a job for the specified pipeline directly. You can use this method when the internal scheduler is not configured and you want to trigger the job directly or through an external system. Returns a "NOT_FOUND" error if the pipeline doesn't exist. Returns a "FORBIDDEN" error if the user doesn't have permission to access the pipeline or run jobs for the pipeline.
EXEC google.datapipelines.pipelines.run
@projectsId='{{ projectsId }}', --required
@locationsId='{{ locationsId }}', --required
@pipelinesId='{{ pipelinesId }}'; --required
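When the internal scheduler is not configured, one way to confirm that a directly triggered job was created is to follow the run with a query on the pipeline's jobCount. The IDs below are hypothetical:

```sql
-- Trigger a job for the pipeline directly
EXEC google.datapipelines.pipelines.run
@projectsId='my-project',
@locationsId='us-central1',
@pipelinesId='my-pipeline';

-- Then confirm the pipeline's job count and state
SELECT
jobCount,
state
FROM google.datapipelines.pipelines
WHERE projectsId = 'my-project'
AND locationsId = 'us-central1'
AND pipelinesId = 'my-pipeline';
```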