Retrieving data ingestion error diagnostics
ÃÛ¶¹ÊÓƵ Experience Platform provides two methods for uploading and ingesting data. You can either use batch ingestion, which allows you to insert data using various file types (such as CSVs), or streaming ingestion, which allows you to insert their data to Platform using streaming endpoints in real time.
This document provides information on monitoring batch ingestion, managing partial batch ingestion errors, as well as a reference for partial batch ingestion types.
Getting started
This guide requires a working understanding of the following components of ÃÛ¶¹ÊÓƵ Experience Platform:
- Experience Data Model (XDM) System: The standardized framework by which Experience Platform organizes customer experience data.
- ÃÛ¶¹ÊÓƵ Experience Platform Data Ingestion: The methods by which data can be sent to Experience Platform.
Reading sample API calls
This tutorial provides example API calls to demonstrate how to format your requests. These include paths, required headers, and properly formatted request payloads. Sample JSON returned in API responses is also provided. For information on the conventions used in documentation for sample API calls, see the section on how to read example API calls in the Experience Platform troubleshooting guide.
Gather values for required headers
In order to make calls to Platform APIs, you must first complete the . Completing the authentication tutorial provides the values for each of the required headers in all Experience Platform API calls, as shown below:
Authorization: Bearer {ACCESS_TOKEN}
x-api-key: {API_KEY}
x-gw-ims-org-id: {ORG_ID}
All resources in Experience Platform, including those belonging to the Schema Registry, are isolated to specific virtual sandboxes. All requests to Platform APIs require a header that specifies the name of the sandbox the operation will take place in:
x-sandbox-name: {SANDBOX_NAME}
Downloading error diagnostics download-diagnostics
ÃÛ¶¹ÊÓƵ Experience Platform allows users to download the error diagnostics of the input files. The diagnostics will be retained within Platform for up to 30 days.
List input files list-files
The following request retrieves a list of all the files provided in a finalized batch.
API format
GET /batches/{BATCH_ID}/meta?path=input_files
{BATCH_ID}
Request
curl -X GET https://platform.adobe.io/data/foundation/export/batches/af838510-2233-11ea-acf0-f3edfcded2d2/meta?path=input_files \
-H 'Authorization: Bearer {ACCESS_TOKEN}' \
-H 'x-api-key: {API_KEY}' \
-H 'x-gw-ims-org-id: {ORG_ID}' \
-H 'x-sandbox-name: {SANDBOX_NAME}'
Response
A successful response will return JSON objects detailing where the diagnostics were saved.
{
"_page": {
"count": 1,
"limit": 100
},
"data": [
{
"_links": {
"self": {
"href": "https://platform.adobe.io/data/foundation/export/batches/af838510-2233-11ea-acf0-f3edfcded2d2/meta?path=input_files/fileMetaData1.json"
}
},
"length": "1337",
"name": "fileMetaData1.json"
},
{
"_links": {
"self": {
"href": "https://platform.adobe.io/data/foundation/export/batches/af838510-2233-11ea-acf0-f3edfcded2d2}/meta?path=input_files/fileMetaData2.json"
}
},
"length": "1042",
"name": "fileMetaData2.json"
}
]
}
Retrieve input file diagnostics retrieve-diagnostics
Once you have retrieved a list of all the different input files, you can retrieve the diagnostics of the individual file by using the following request.
API format
GET /batches/{BATCH_ID}/meta?path=input_files/{FILE}
{BATCH_ID}
{FILE}
Request
curl -X GET https://platform.adobe.io/data/foundation/export/batches/af838510-2233-11ea-acf0-f3edfcded2d2/meta?path=input_files/fileMetaData1.json \
-H 'Authorization: Bearer {ACCESS_TOKEN}' \
-H 'x-api-key: {API_KEY}' \
-H 'x-gw-ims-org-id: {ORG_ID}' \
-H 'x-sandbox-name: {SANDBOX_NAME}'
Response
A successful response will return JSON objects containing path
objects detailing where the diagnostics were saved. The response will return the path
objects in format.
{"path": "F1.json"}
{"path": "etc/F2.json"}
Retrieve batch ingestion errors retrieve-errors
If batches contain failures, you should retrieve error information about these failures so you can re-ingest the data.
Check status check-status
To check the status of the ingested batch, you must supply the batch’s ID in the path of a GET request. To learn more about using this API call, please read the catalog endpoint guide.
API format
GET /catalog/batches/{BATCH_ID}
GET /catalog/batches/{BATCH_ID}?{FILTER}
{BATCH_ID}
id
value of the batch you want to check the status of.{FILTER}
&
). For more information, please read the guide on filtering Catalog data.Request
curl -X GET https://platform.adobe.io/data/foundation/catalog/batches/af838510-2233-11ea-acf0-f3edfcded2d2 \
-H 'Authorization: Bearer {ACCESS_TOKEN}' \
-H 'x-api-key: {API_KEY}' \
-H 'x-gw-ims-org-id: {ORG_ID}' \
-H 'x-sandbox-name: {SANDBOX_NAME}'
Response without errors
A successful response returns with detailed information about the batch’s status.
{
"af838510-2233-11ea-acf0-f3edfcded2d2": {
"status": "success",
"tags": {
"acp_enableErrorDiagnostics": true,
"acp_partialIngestionPercent": 5
},
"relatedObjects": [
{
"type": "dataSet",
"id": "5deac2648a19d218a888d2b1"
}
],
"id": "af838510-2233-11ea-acf0-f3edfcded2d2",
"externalId": "af838510-2233-11ea-acf0-f3edfcded2d2",
"inputFormat": {
"format": "parquet"
},
"imsOrg": "{ORG_ID}",
"started": 1576741718543,
"metrics": {
"inputByteSize": 568,
"inputFileCount": 4,
"inputRecordCount": 519,
"outputRecordCount": 497,
"failedRecordCount": 0
},
"completed": 1576741722026,
"created": 1576741597205,
"createdClient": "{API_KEY}",
"createdUser": "{USER_ID}",
"updatedUser": "{USER_ID}",
"updated": 1576741722644,
"version": "1.0.5"
}
}
metrics.failedRecordCount
inputRecordCount
from the outputRecordCount
. This value is generated on all batches regardless if errorDiagnostics
is enabled.Response with errors
If the batch has one or more errors and has error diagnostics enabled, the response returns more information about the errors, both within the payload itself as well as a downloadable error file. Note that the status of a batch containing errors may still have a status of success.
{
"01E8043CY305K2MTV5ANH9G1GC": {
"status": "success",
"tags": {
"acp_enableErrorDiagnostics": true,
"acp_partialIngestionPercent": 5
},
"relatedObjects": [
{
"type": "dataSet",
"id": "5deac2648a19d218a888d2b1"
}
],
"id": "01E8043CY305K2MTV5ANH9G1GC",
"externalId": "01E8043CY305K2MTV5ANH9G1GC",
"inputFormat": {
"format": "parquet"
},
"imsOrg": "{ORG_ID}",
"started": 1576741718543,
"metrics": {
"inputByteSize": 568,
"inputFileCount": 4,
"inputRecordCount": 519,
"outputRecordCount": 514,
"failedRecordCount": 5
},
"completed": 1576741722026,
"created": 1576741597205,
"createdClient": "{API_KEY}",
"createdUser": "{USER_ID}",
"updatedUser": "{USER_ID}",
"updated": 1576741722644,
"version": "1.0.5",
"errors": [
{
"code": "INGEST-1212-400",
"description": "Encountered 5 errors in the data. Successfully ingested 514 rows. Please review the associated diagnostic files for more details."
},
{
"code": "INGEST-1401-400",
"description": "The row has corrupted data and cannot be read or parsed. Fix the corrupted data and try again.",
"recordCount": 2
},
{
"code": "INGEST-1555-400",
"description": "A required field is either missing or has a value of null. Add the required field to the input row and try again.",
"recordCount": 3
}
]
}
}
metrics.failedRecordCount
inputRecordCount
from the outputRecordCount
. This value is generated on all batches regardless if errorDiagnostics
is enabled.errors.recordCount
errorDiagnostics
is enabled.code language-json |
---|
|
Next steps next-steps
This tutorial covered how to monitor partial batch ingestion errors. For more information on batch ingestion, please read the batch ingestion developer guide.
Appendix appendix
This section provides supplemental information about ingestion error types.
Partial batch ingestion error types partial-ingestion-types
Partial batch ingestion has three different error types when ingesting data:
Unreadable files unreadable
If the batch ingested has unreadable files, the batch’s errors will be attached on the batch itself. More information on retrieving the failed batch can be found in the retrieving failed batches guide.
Invalid schemas or headers schemas-headers
If the batch ingested has an invalid schema or invalid headers, the batch’s errors will be attached on the batch itself. More information on retrieving the failed batch can be found in the retrieving failed batches guide.
Unparsable rows unparsable
If the batch you ingested has unparsable rows, you can use the following request to view a list of files that contain errors.
API format
GET /export/batches/{BATCH_ID}/meta?path=row_errors
{BATCH_ID}
id
value of the batch you are retrieving error information from.Request
curl -X GET https://platform.adobe.io/data/foundation/export/batches/01EFZ7W203PEKSAMVJC3X99VHQ/meta?path=row_errors \
-H 'Authorization: Bearer {ACCESS_TOKEN}' \
-H 'x-api-key: {API_KEY}' \
-H 'x-gw-ims-org-id: {ORG_ID}' \
-H 'x-sandbox-name: {SANDBOX_NAME}'
Response
A successful response returns a list of the files that have errors.
{
"data": [
{
"name": "conversion_errors_0.json",
"length": "1162",
"_links": {
"self": {
"href": "https://platform.adobe.io:443/data/foundation/export/batches/01EFZ7W203PEKSAMVJC3X99VHQ/meta?path=row_errors%2Fconversion_errors_0.json"
}
}
},
{
"name": "parsing_errors_0.json",
"length": "153",
"_links": {
"self": {
"href": "https://platform.adobe.io:443/data/foundation/export/batches/01EFZ7W203PEKSAMVJC3X99VHQ/meta?path=row_errors%2Fparsing_errors_0.json"
}
}
}
],
"_page": {
"limit": 100,
"count": 2
}
}
You can then retrieve detailed information about the errors using the diagnostics retrieval endpoint.
A sample response of retrieving the error file can be seen below:
{
"_corrupt_record": "{missingQuotes: 'v1'}",
"_errors": [{
"code": "1401",
"message": "Row is corrupted and cannot be read, please fix and resend."
}],
"_filename": "parsing_errors_0.json"
}