Partial Batch Ingestion Overview | Adobe Experience Platform

Partial batch ingestion

Last update: Fri Jan 06 2023 00:00:00 GMT+0000 (Coordinated Universal Time)

Topics:
Data Ingestion

CREATED FOR:

Developer

Partial batch ingestion is the ability to ingest data containing errors, up to a certain threshold. With this capability, users can successfully ingest all their correct data into Adobe Experience Platform while all their incorrect data is batched separately, along with details as to why it is invalid.

This document provides a tutorial for managing partial batch ingestion.

Getting started

This tutorial requires a working knowledge of the various Adobe Experience Platform services involved with partial batch ingestion. Before beginning this tutorial, please review the documentation for the following services:

Batch ingestion: The method by which Platform ingests and stores data from data files, such as CSV and Parquet.
Experience Data Model (XDM): The standardized framework by which Platform organizes customer experience data.

The following sections provide additional information that you need in order to make calls to Platform APIs successfully.

Reading sample API calls

This guide provides example API calls to demonstrate how to format your requests. These include paths, required headers, and properly formatted request payloads. Sample JSON returned in API responses is also provided. For information on the conventions used in documentation for sample API calls, see the section on how to read example API calls in the Experience Platform troubleshooting guide.

Gather values for required headers

In order to make calls to Platform APIs, you must first complete the authentication tutorial. Completing the authentication tutorial provides the values for each of the required headers in all Experience Platform API calls, as shown below:

Authorization: Bearer {ACCESS_TOKEN}
x-api-key: {API_KEY}
x-gw-ims-org-id: {ORG_ID}

All resources in Experience Platform are isolated to specific virtual sandboxes. All requests to Platform APIs require a header that specifies the name of the sandbox the operation will take place in:

x-sandbox-name: {SANDBOX_NAME}

NOTE

For more information on sandboxes in Platform, see the sandbox overview documentation.
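
As an illustration of how the four headers above fit together, here is a minimal Python sketch that assembles them into one dictionary. The function name `build_platform_headers` and its parameter names are hypothetical and not part of any Platform SDK; only the header names come from this guide.

```python
def build_platform_headers(access_token, api_key, org_id, sandbox_name):
    """Assemble the headers this guide lists for Platform API calls.

    The function and parameter names are illustrative assumptions;
    the header names are the ones shown in this guide.
    """
    values = {
        "access_token": access_token,
        "api_key": api_key,
        "org_id": org_id,
        "sandbox_name": sandbox_name,
    }
    # Fail early if any credential is empty, rather than sending a bad request.
    missing = [name for name, value in values.items() if not value]
    if missing:
        raise ValueError("Missing header values: " + ", ".join(missing))

    return {
        "Authorization": f"Bearer {access_token}",
        "x-api-key": api_key,
        "x-gw-ims-org-id": org_id,
        "x-sandbox-name": sandbox_name,
    }
```

The returned dictionary can be passed to whichever HTTP client you use. Keeping the sandbox name alongside the authentication values makes it harder to forget the sandbox header on an individual request.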

Enable a batch for partial batch ingestion in the API

NOTE

This section describes enabling a batch for partial batch ingestion using the API. For instructions on using the UI, see the section on enabling a batch for partial batch ingestion in the UI.

You can create a new batch with partial ingestion enabled.

To create a new batch, follow the steps in the batch ingestion developer guide. Once you reach the Create batch step, add the following fields within the request body:

{
    "enableErrorDiagnostics": true,
    "partialIngestionPercent": 5
}

Property | Description
enableErrorDiagnostics | A flag that allows Platform to generate detailed error messages about your batch.
partialIngestionPercent | The percentage of acceptable errors before the entire batch fails. In this example, up to 5% of the records in the batch can contain errors before the batch fails.
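
The following Python sketch shows one way to add these two fields to an existing Create batch request body. The helper name `with_partial_ingestion`, the `datasetId` placeholder, and the 0 to 100 range check are assumptions made for illustration; only `enableErrorDiagnostics` and `partialIngestionPercent` come from this guide.

```python
import json


def with_partial_ingestion(batch_body, error_percent=5):
    """Return a copy of a Create batch body with partial ingestion enabled.

    `batch_body` is whatever body the batch ingestion developer guide
    tells you to send; this helper only adds the two fields above.
    """
    # Assumption: the value is a percentage, so it should lie in [0, 100].
    if not 0 <= error_percent <= 100:
        raise ValueError("error_percent must be between 0 and 100")

    body = dict(batch_body)  # shallow copy so the caller's dict is untouched
    body["enableErrorDiagnostics"] = True
    body["partialIngestionPercent"] = error_percent
    return body


if __name__ == "__main__":
    # Hypothetical placeholder body; replace with your real request body.
    base_body = {"datasetId": "{DATASET_ID}"}
    print(json.dumps(with_partial_ingestion(base_body), indent=4))
```

Building the body in one place keeps the threshold explicit and makes it easy to reuse the same setting across batches.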

Enable a batch for partial batch ingestion in the UI

NOTE

This section describes enabling a batch for partial batch ingestion using the UI. If you have already enabled a batch for partial batch ingestion using the API, you can skip ahead to the next section.

To enable a batch for partial ingestion through the Platform UI, you can create a new batch through source connections, create a new batch in an existing dataset, or create a new batch through the "Map CSV to XDM schema" flow.

Create a new source connection

To create a new source connection, follow the listed steps in the Sources overview. Once you reach the Dataflow detail step, take note of the Partial ingestion and Error diagnostics fields.

The Partial ingestion toggle allows you to enable or disable the use of partial batch ingestion.

The Error diagnostics toggle only appears when the Partial ingestion toggle is off. This feature allows Platform to generate detailed error messages about your ingested batches. If the Partial ingestion toggle is turned on, enhanced error diagnostics are automatically enforced.

The Error threshold allows you to set the percentage of acceptable errors before the entire batch will fail. By default, this value is set to 5%.
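
To make the threshold arithmetic concrete, here is a small Python sketch that computes the error percentage of a batch and compares it with the threshold. The function names are hypothetical. The comparison assumes that a batch whose error rate is exactly equal to the threshold still succeeds, which is one reading of "acceptable errors"; Platform's actual boundary behavior and rounding are not specified in this guide.

```python
def error_percent(total_records, failed_records):
    """Return the share of failed records as a percentage of the batch."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    if not 0 <= failed_records <= total_records:
        raise ValueError("failed_records must be between 0 and total_records")
    return 100.0 * failed_records / total_records


def batch_would_fail(total_records, failed_records, threshold_percent=5.0):
    """Return True when the error rate exceeds the threshold.

    Assumption: an error rate exactly equal to the threshold is accepted.
    """
    return error_percent(total_records, failed_records) > threshold_percent
```

Under these assumptions, a batch of 1,000 records with 50 invalid records has an error rate of 5% and is accepted at the default threshold, while 51 invalid records (5.1%) would cause the whole batch to fail.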

Use an existing dataset

To use an existing dataset, start by selecting a dataset. The sidebar on the right populates with information about the dataset.

The Partial ingestion toggle allows you to enable or disable the use of partial batch ingestion.

The Error threshold allows you to set the percentage of acceptable errors before the entire batch will fail. By default, this value is set to 5%.

Now, you can upload data using the Add data button, and it will be ingested using partial ingestion.

Use the "Map CSV to XDM schema" flow

To use the "Map CSV to XDM schema" flow, follow the listed steps in the Map a CSV file tutorial. Once you reach the Add data step, take note of the Partial ingestion and Error diagnostics fields.

The Partial ingestion toggle allows you to enable or disable the use of partial batch ingestion.

The Error threshold allows you to set the percentage of acceptable errors before the entire batch will fail. By default, this value is set to 5%.

Next steps

This tutorial covered how to create or modify a dataset to enable partial batch ingestion. For more information on batch ingestion, please read the batch ingestion developer guide.

For information on monitoring partial ingestion errors, please read the batch ingestion error diagnostics guide.
