documents
Creates, updates, deletes, gets, or lists a documents resource.
Overview
Name | documents |
Type | Resource |
Id | google.discoveryengine.documents |
Fields
The following fields are returned by SELECT queries:
- projects_locations_collections_data_stores_branches_documents_get
- projects_locations_collections_data_stores_branches_documents_list
- projects_locations_data_stores_branches_documents_get
- projects_locations_data_stores_branches_documents_list
Successful response
Name | Datatype | Description |
---|---|---|
id | string | Immutable. The identifier of the document. Id should conform to RFC-1034 standard with a length limit of 128 characters. |
name | string | Immutable. The full resource name of the document. Format: projects/{project}/locations/{location}/collections/{collection}/dataStores/{data_store}/branches/{branch}/documents/{document_id} . This field must be a UTF-8 encoded string with a length limit of 1024 characters. |
aclInfo | object | Access control information for the document. (id: GoogleCloudDiscoveryengineV1DocumentAclInfo) |
content | object | The unstructured data linked to this document. Content must be set if this document is under a CONTENT_REQUIRED data store, and can only be set in that case. (id: GoogleCloudDiscoveryengineV1DocumentContent) |
derivedStructData | object | Output only. Derived data that is not in the original input document. |
indexStatus | object | Output only. The index status of the document. If the document is indexed successfully, the index_time field is populated. Otherwise, if the document is not indexed due to errors, the error_samples field is populated. Otherwise, if indexing is in progress, the pending_message field is populated. (id: GoogleCloudDiscoveryengineV1DocumentIndexStatus) |
indexTime | string (google-datetime) | Output only. The last time the document was indexed. If this field is set, the document can be returned in search results. If it is not populated, the document has never been indexed. |
jsonData | string | The JSON string representation of the document. It should conform to the registered Schema or an INVALID_ARGUMENT error is thrown. |
parentDocumentId | string | The identifier of the parent document. Currently supports a document hierarchy of at most two levels. Id should conform to the RFC-1034 standard with a length limit of 63 characters. |
schemaId | string | The identifier of the schema located in the same data store. |
structData | object | The structured JSON data for the document. It should conform to the registered Schema or an INVALID_ARGUMENT error is thrown. |
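The indexStatus description above amounts to a three-way check. The following is a minimal Python sketch of that check, assuming the row has already been decoded into a dict and that the nested keys use the camelCase forms `indexTime`, `errorSamples`, and `pendingMessage` (an assumption derived from the snake_case names in the description). `index_state` is a hypothetical helper, not part of any library.

```python
from typing import Any, Mapping


def index_state(document: Mapping[str, Any]) -> str:
    """Classify a document row by its indexStatus (hypothetical helper).

    Assumes camelCase keys inside indexStatus; adjust if your rows differ.
    """
    status = document.get("indexStatus") or {}
    if status.get("indexTime"):
        return "indexed"   # indexed successfully
    if status.get("errorSamples"):
        return "failed"    # not indexed due to errors
    if status.get("pendingMessage"):
        return "pending"   # indexing still in progress
    return "unknown"       # none of the documented fields is populated
```

The checks run in the same order as the description, so a row that carries both an index time and error samples is reported as indexed.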
Methods
The following methods are available for this resource:
Parameters
Parameters can be passed in the WHERE clause of a query. Check the Methods section to see which parameters are required or optional for each operation.
Name | Datatype | Description |
---|---|---|
branchesId | string | |
collectionsId | string | |
dataStoresId | string | |
documentsId | string | |
locationsId | string | |
projectsId | string | |
allowMissing | boolean | |
documentId | string | |
pageSize | integer (int32) | |
pageToken | string | |
updateMask | string (google-fieldmask) | |
SELECT examples
- projects_locations_collections_data_stores_branches_documents_get
- projects_locations_collections_data_stores_branches_documents_list
- projects_locations_data_stores_branches_documents_get
- projects_locations_data_stores_branches_documents_list
Gets a Document.
SELECT
id,
name,
aclInfo,
content,
derivedStructData,
indexStatus,
indexTime,
jsonData,
parentDocumentId,
schemaId,
structData
FROM google.discoveryengine.documents
WHERE projectsId = '{{ projectsId }}' -- required
AND locationsId = '{{ locationsId }}' -- required
AND collectionsId = '{{ collectionsId }}' -- required
AND dataStoresId = '{{ dataStoresId }}' -- required
AND branchesId = '{{ branchesId }}' -- required
AND documentsId = '{{ documentsId }}'; -- required
Gets a list of Documents.
SELECT
id,
name,
aclInfo,
content,
derivedStructData,
indexStatus,
indexTime,
jsonData,
parentDocumentId,
schemaId,
structData
FROM google.discoveryengine.documents
WHERE projectsId = '{{ projectsId }}' -- required
AND locationsId = '{{ locationsId }}' -- required
AND collectionsId = '{{ collectionsId }}' -- required
AND dataStoresId = '{{ dataStoresId }}' -- required
AND branchesId = '{{ branchesId }}' -- required
AND pageSize = '{{ pageSize }}'
AND pageToken = '{{ pageToken }}';
Gets a Document.
SELECT
id,
name,
aclInfo,
content,
derivedStructData,
indexStatus,
indexTime,
jsonData,
parentDocumentId,
schemaId,
structData
FROM google.discoveryengine.documents
WHERE projectsId = '{{ projectsId }}' -- required
AND locationsId = '{{ locationsId }}' -- required
AND dataStoresId = '{{ dataStoresId }}' -- required
AND branchesId = '{{ branchesId }}' -- required
AND documentsId = '{{ documentsId }}'; -- required
Gets a list of Documents.
SELECT
id,
name,
aclInfo,
content,
derivedStructData,
indexStatus,
indexTime,
jsonData,
parentDocumentId,
schemaId,
structData
FROM google.discoveryengine.documents
WHERE projectsId = '{{ projectsId }}' -- required
AND locationsId = '{{ locationsId }}' -- required
AND dataStoresId = '{{ dataStoresId }}' -- required
AND branchesId = '{{ branchesId }}' -- required
AND pageSize = '{{ pageSize }}'
AND pageToken = '{{ pageToken }}';
INSERT examples
- projects_locations_collections_data_stores_branches_documents_create
- projects_locations_data_stores_branches_documents_create
- Manifest
Creates a Document.
INSERT INTO google.discoveryengine.documents (
data__structData,
data__jsonData,
data__name,
data__id,
data__schemaId,
data__content,
data__parentDocumentId,
data__aclInfo,
projectsId,
locationsId,
collectionsId,
dataStoresId,
branchesId,
documentId
)
SELECT
'{{ structData }}',
'{{ jsonData }}',
'{{ name }}',
'{{ id }}',
'{{ schemaId }}',
'{{ content }}',
'{{ parentDocumentId }}',
'{{ aclInfo }}',
'{{ projectsId }}',
'{{ locationsId }}',
'{{ collectionsId }}',
'{{ dataStoresId }}',
'{{ branchesId }}',
'{{ documentId }}'
RETURNING
id,
name,
aclInfo,
content,
derivedStructData,
indexStatus,
indexTime,
jsonData,
parentDocumentId,
schemaId,
structData
;
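The jsonData field carries the document as a JSON string, whereas structData carries it as an object. The following is a minimal Python sketch of preparing a value for the `{{ jsonData }}` placeholder. Both helpers are hypothetical, and `sql_quote` assumes standard SQL escaping (doubling single quotes); confirm that your client escapes literals the same way.

```python
import json
from typing import Any, Mapping


def to_json_data(fields: Mapping[str, Any]) -> str:
    """Serialize document fields to the string form used by jsonData."""
    # Compact separators and sorted keys keep the output stable.
    return json.dumps(fields, separators=(",", ":"), sort_keys=True)


def sql_quote(value: str) -> str:
    """Wrap a value in single quotes, doubling any embedded single quote.

    Assumption: standard SQL string-literal escaping.
    """
    return "'" + value.replace("'", "''") + "'"
```

For example, `sql_quote(to_json_data({"title": "A"}))` yields a literal that can replace `'{{ jsonData }}'` in the statement above. The serialized data must still conform to the registered schema.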
Creates a Document.
INSERT INTO google.discoveryengine.documents (
data__structData,
data__jsonData,
data__name,
data__id,
data__schemaId,
data__content,
data__parentDocumentId,
data__aclInfo,
projectsId,
locationsId,
dataStoresId,
branchesId,
documentId
)
SELECT
'{{ structData }}',
'{{ jsonData }}',
'{{ name }}',
'{{ id }}',
'{{ schemaId }}',
'{{ content }}',
'{{ parentDocumentId }}',
'{{ aclInfo }}',
'{{ projectsId }}',
'{{ locationsId }}',
'{{ dataStoresId }}',
'{{ branchesId }}',
'{{ documentId }}'
RETURNING
id,
name,
aclInfo,
content,
derivedStructData,
indexStatus,
indexTime,
jsonData,
parentDocumentId,
schemaId,
structData
;
# Description fields are for documentation purposes
- name: documents
  props:
    - name: projectsId
      value: string
      description: Required parameter for the documents resource.
    - name: locationsId
      value: string
      description: Required parameter for the documents resource.
    - name: collectionsId
      value: string
      description: Required parameter for the documents resource.
    - name: dataStoresId
      value: string
      description: Required parameter for the documents resource.
    - name: branchesId
      value: string
      description: Required parameter for the documents resource.
    - name: structData
      value: object
      description: >
        The structured JSON data for the document. It should conform to the registered Schema or an `INVALID_ARGUMENT` error is thrown.
    - name: jsonData
      value: string
      description: >
        The JSON string representation of the document. It should conform to the registered Schema or an `INVALID_ARGUMENT` error is thrown.
    - name: name
      value: string
      description: >
        Immutable. The full resource name of the document. Format: `projects/{project}/locations/{location}/collections/{collection}/dataStores/{data_store}/branches/{branch}/documents/{document_id}`. This field must be a UTF-8 encoded string with a length limit of 1024 characters.
    - name: id
      value: string
      description: >
        Immutable. The identifier of the document. Id should conform to [RFC-1034](https://tools.ietf.org/html/rfc1034) standard with a length limit of 128 characters.
    - name: schemaId
      value: string
      description: >
        The identifier of the schema located in the same data store.
    - name: content
      value: object
      description: >
        The unstructured data linked to this document. Content can only be set and must be set if this document is under a `CONTENT_REQUIRED` data store.
    - name: parentDocumentId
      value: string
      description: >
        The identifier of the parent document. Currently supports at most two level document hierarchy. Id should conform to [RFC-1034](https://tools.ietf.org/html/rfc1034) standard with a length limit of 63 characters.
    - name: aclInfo
      value: object
      description: >
        Access control information for the document.
    - name: documentId
      value: string
UPDATE examples
- projects_locations_collections_data_stores_branches_documents_patch
- projects_locations_data_stores_branches_documents_patch
Updates a Document.
UPDATE google.discoveryengine.documents
SET
data__structData = '{{ structData }}',
data__jsonData = '{{ jsonData }}',
data__name = '{{ name }}',
data__id = '{{ id }}',
data__schemaId = '{{ schemaId }}',
data__content = '{{ content }}',
data__parentDocumentId = '{{ parentDocumentId }}',
data__aclInfo = '{{ aclInfo }}'
WHERE
projectsId = '{{ projectsId }}' --required
AND locationsId = '{{ locationsId }}' --required
AND collectionsId = '{{ collectionsId }}' --required
AND dataStoresId = '{{ dataStoresId }}' --required
AND branchesId = '{{ branchesId }}' --required
AND documentsId = '{{ documentsId }}' --required
AND allowMissing = {{ allowMissing }}
AND updateMask = '{{ updateMask }}'
RETURNING
id,
name,
aclInfo,
content,
derivedStructData,
indexStatus,
indexTime,
jsonData,
parentDocumentId,
schemaId,
structData;
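The updateMask parameter has the type google-fieldmask, which in its JSON form is a comma-separated list of field paths. The following is a minimal Python sketch that derives a mask from the fields being changed. `build_update_mask` is a hypothetical helper; rejecting the fields that the field table marks as immutable or output only is this sketch's own choice, and the camelCase paths are an assumption to check against the API reference.

```python
from typing import Any, Mapping

# Fields the field table marks as immutable or output only.
NOT_UPDATABLE = {"id", "name", "derivedStructData", "indexStatus", "indexTime"}


def build_update_mask(changes: Mapping[str, Any]) -> str:
    """Return a comma-separated field mask for the given changes."""
    rejected = sorted(set(changes) & NOT_UPDATABLE)
    if rejected:
        raise ValueError(f"fields cannot be updated: {', '.join(rejected)}")
    # Sorting makes the mask deterministic regardless of dict order.
    return ",".join(sorted(changes))
```

The returned string can stand in for `{{ updateMask }}` so that only the listed fields are touched by the update.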
Updates a Document.
UPDATE google.discoveryengine.documents
SET
data__structData = '{{ structData }}',
data__jsonData = '{{ jsonData }}',
data__name = '{{ name }}',
data__id = '{{ id }}',
data__schemaId = '{{ schemaId }}',
data__content = '{{ content }}',
data__parentDocumentId = '{{ parentDocumentId }}',
data__aclInfo = '{{ aclInfo }}'
WHERE
projectsId = '{{ projectsId }}' --required
AND locationsId = '{{ locationsId }}' --required
AND dataStoresId = '{{ dataStoresId }}' --required
AND branchesId = '{{ branchesId }}' --required
AND documentsId = '{{ documentsId }}' --required
AND allowMissing = {{ allowMissing }}
AND updateMask = '{{ updateMask }}'
RETURNING
id,
name,
aclInfo,
content,
derivedStructData,
indexStatus,
indexTime,
jsonData,
parentDocumentId,
schemaId,
structData;
DELETE examples
- projects_locations_collections_data_stores_branches_documents_delete
- projects_locations_data_stores_branches_documents_delete
Deletes a Document.
DELETE FROM google.discoveryengine.documents
WHERE projectsId = '{{ projectsId }}' --required
AND locationsId = '{{ locationsId }}' --required
AND collectionsId = '{{ collectionsId }}' --required
AND dataStoresId = '{{ dataStoresId }}' --required
AND branchesId = '{{ branchesId }}' --required
AND documentsId = '{{ documentsId }}'; -- required
Deletes a Document.
DELETE FROM google.discoveryengine.documents
WHERE projectsId = '{{ projectsId }}' --required
AND locationsId = '{{ locationsId }}' --required
AND dataStoresId = '{{ dataStoresId }}' --required
AND branchesId = '{{ branchesId }}' --required
AND documentsId = '{{ documentsId }}'; -- required
Lifecycle Methods
- projects_locations_collections_data_stores_branches_documents_import
- projects_locations_collections_data_stores_branches_documents_purge
- projects_locations_data_stores_branches_documents_import
- projects_locations_data_stores_branches_documents_purge
Bulk import of multiple Documents. Request processing may be synchronous. Non-existing items are created. Note: It is possible for a subset of the Documents to be successfully updated.
EXEC google.discoveryengine.documents.projects_locations_collections_data_stores_branches_documents_import
@projectsId='{{ projectsId }}', -- required
@locationsId='{{ locationsId }}', -- required
@collectionsId='{{ collectionsId }}', -- required
@dataStoresId='{{ dataStoresId }}', -- required
@branchesId='{{ branchesId }}' -- required
@@json=
'{
"inlineSource": "{{ inlineSource }}",
"gcsSource": "{{ gcsSource }}",
"bigquerySource": "{{ bigquerySource }}",
"fhirStoreSource": "{{ fhirStoreSource }}",
"spannerSource": "{{ spannerSource }}",
"cloudSqlSource": "{{ cloudSqlSource }}",
"firestoreSource": "{{ firestoreSource }}",
"alloyDbSource": "{{ alloyDbSource }}",
"bigtableSource": "{{ bigtableSource }}",
"errorConfig": "{{ errorConfig }}",
"reconciliationMode": "{{ reconciliationMode }}",
"updateMask": "{{ updateMask }}",
"autoGenerateIds": {{ autoGenerateIds }},
"idField": "{{ idField }}",
"forceRefreshContent": {{ forceRefreshContent }}
}';
Permanently deletes all selected Documents in a branch. This process is asynchronous. Depending on the number of Documents to be deleted, this operation can take hours to complete. Before the delete operation completes, some Documents might still be returned by DocumentService.GetDocument or DocumentService.ListDocuments. To get a list of the Documents to be deleted, set PurgeDocumentsRequest.force to false.
EXEC google.discoveryengine.documents.projects_locations_collections_data_stores_branches_documents_purge
@projectsId='{{ projectsId }}', -- required
@locationsId='{{ locationsId }}', -- required
@collectionsId='{{ collectionsId }}', -- required
@dataStoresId='{{ dataStoresId }}', -- required
@branchesId='{{ branchesId }}' -- required
@@json=
'{
"gcsSource": "{{ gcsSource }}",
"inlineSource": "{{ inlineSource }}",
"filter": "{{ filter }}",
"errorConfig": "{{ errorConfig }}",
"force": {{ force }}
}';
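The purge description above says that setting force to false returns the documents that would be deleted instead of deleting them. The following is a minimal Python sketch that builds the `@@json` body for such a preview. `purge_body` is a hypothetical helper, and the default filter value `"*"` is an assumption; check the API reference for the filters the service supports.

```python
import json


def purge_body(filter_expr: str = "*", force: bool = False) -> str:
    """Build the JSON request body for a purge call (hypothetical helper).

    With force=False the call is a preview; pass force=True to delete.
    """
    return json.dumps({"filter": filter_expr, "force": force})
```

Defaulting `force` to False keeps the destructive path opt-in, which suits an operation described as permanent.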
Bulk import of multiple Documents. Request processing may be synchronous. Non-existing items are created. Note: It is possible for a subset of the Documents to be successfully updated.
EXEC google.discoveryengine.documents.projects_locations_data_stores_branches_documents_import
@projectsId='{{ projectsId }}', -- required
@locationsId='{{ locationsId }}', -- required
@dataStoresId='{{ dataStoresId }}', -- required
@branchesId='{{ branchesId }}' -- required
@@json=
'{
"inlineSource": "{{ inlineSource }}",
"gcsSource": "{{ gcsSource }}",
"bigquerySource": "{{ bigquerySource }}",
"fhirStoreSource": "{{ fhirStoreSource }}",
"spannerSource": "{{ spannerSource }}",
"cloudSqlSource": "{{ cloudSqlSource }}",
"firestoreSource": "{{ firestoreSource }}",
"alloyDbSource": "{{ alloyDbSource }}",
"bigtableSource": "{{ bigtableSource }}",
"errorConfig": "{{ errorConfig }}",
"reconciliationMode": "{{ reconciliationMode }}",
"updateMask": "{{ updateMask }}",
"autoGenerateIds": {{ autoGenerateIds }},
"idField": "{{ idField }}",
"forceRefreshContent": {{ forceRefreshContent }}
}';
Permanently deletes all selected Documents in a branch. This process is asynchronous. Depending on the number of Documents to be deleted, this operation can take hours to complete. Before the delete operation completes, some Documents might still be returned by DocumentService.GetDocument or DocumentService.ListDocuments. To get a list of the Documents to be deleted, set PurgeDocumentsRequest.force to false.
EXEC google.discoveryengine.documents.projects_locations_data_stores_branches_documents_purge
@projectsId='{{ projectsId }}', -- required
@locationsId='{{ locationsId }}', -- required
@dataStoresId='{{ dataStoresId }}', -- required
@branchesId='{{ branchesId }}' -- required
@@json=
'{
"gcsSource": "{{ gcsSource }}",
"inlineSource": "{{ inlineSource }}",
"filter": "{{ filter }}",
"errorConfig": "{{ errorConfig }}",
"force": {{ force }}
}';