Create datasets
In this lesson, you will create datasets to receive your data. You will be excited to know that this is the shortest lesson in the tutorial!
All data that is successfully ingested into Adobe Experience Platform is persisted in the data lake as datasets. A dataset is a storage and management construct for a collection of data, typically a table, that contains a schema (columns) and fields (rows). Datasets also contain metadata that describes various aspects of the data they store.
Data Architects will need to create datasets outside of this tutorial.
Before you begin the exercises, watch this short video to learn more about datasets:
Permissions required
In the Configure Permissions lesson, you set up all the access controls required to complete this lesson.
Create datasets in the UI
In this exercise, we will create datasets in the UI. Let’s start with the loyalty data:
- Go to Datasets in the Platform user interface’s left navigation
- Select the Create dataset button
- On the next screen, select Create dataset from schema
- On the next screen, select your Luma Loyalty Schema and then select the Next button
- Name the dataset Luma Loyalty Dataset and select the Finish button
- When the dataset has saved, you will be taken to a screen like this:
That’s it! I told you this was going to be quick. Create these other datasets using the same steps:
- Luma Offline Purchase Events Dataset for your Luma Offline Purchase Events Schema
- Luma Web Events Dataset for your Luma Web Events Schema
- Luma Product Catalog Dataset for your Luma Product Catalog Schema
Create a dataset using the API
Now create the Luma CRM Dataset using the API.
If you would rather create the Luma CRM Dataset in the user interface, that’s fine. Name it Luma CRM Dataset and use the Luma CRM Schema.
Get the id of the schema to be used in the dataset
First we need to get the $id of the Luma CRM Schema:
- Open Postman
- If you don’t have an access token, open the request OAuth: Request Access Token and select Send to request a new access token, just like you did in the Postman lesson.
- Open the request Schema Registry API > Schemas > Retrieve a list of schemas within the specified container.
- Select the Send button
- You should get a 200 response
- Look in the response for the Luma CRM Schema item and copy the $id value
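If you would rather pick the value out programmatically than by eye, the lookup can be sketched in Python as below. This is only an illustration, not part of the Postman collection: it assumes the list response keeps its schemas in a `results` array whose items carry `title` and `$id` keys, and the `find_schema_id` helper and the sample response (including its `$id` values) are hypothetical.

```python
from typing import Optional


def find_schema_id(list_response: dict, title: str) -> Optional[str]:
    """Return the $id of the first schema whose title matches, or None."""
    # Assumption: the parsed response body holds the schemas under "results".
    for schema in list_response.get("results", []):
        if schema.get("title") == title:
            return schema.get("$id")
    return None


# Hypothetical, trimmed response body used only for illustration.
sample_response = {
    "results": [
        {
            "title": "Luma Loyalty Schema",
            "$id": "https://ns.adobe.com/example/schemas/aaa",
        },
        {
            "title": "Luma CRM Schema",
            "$id": "https://ns.adobe.com/example/schemas/bbb",
        },
    ]
}

print(find_schema_id(sample_response, "Luma CRM Schema"))
```

Returning `None` for a missing title keeps the "schema not found" case explicit, which is easier to act on than an empty string.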
Create the dataset
Now you can create the dataset:
- Download the Catalog Service API collection to your Luma Tutorial Assets folder.
- Import the collection into Postman
- Select the request Catalog Service API > Datasets > Create a new dataset.
- Paste the following as the Body of the request, replacing the id value with your own:
    {
      "name": "Luma CRM Dataset",
      "schemaRef": {
        "id": "REPLACE_WITH_YOUR_OWN_ID",
        "contentType": "application/vnd.adobe.xed-full+json;version=1"
      },
      "fileDescription": {
        "persisted": true,
        "containerFormat": "parquet",
        "format": "parquet"
      }
    }
- Select the Send button
- You should get a 201 Created response containing the id of your new dataset!
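As a rough illustration of what the request body contains, the following Python sketch builds the same JSON from a schema id and refuses to proceed if the placeholder was left in place. The `build_dataset_payload` helper and the example `$id` are hypothetical; only the field names and values are taken from the body shown above, and nothing is sent to Platform.

```python
import json

PLACEHOLDER = "REPLACE_WITH_YOUR_OWN_ID"


def build_dataset_payload(name: str, schema_id: str) -> dict:
    """Build the body for the Create a new dataset request."""
    # Guard against sending the sample body without substituting the id.
    if not schema_id or schema_id == PLACEHOLDER:
        raise ValueError("Replace the placeholder with the $id of your own schema")
    return {
        "name": name,
        "schemaRef": {
            "id": schema_id,
            "contentType": "application/vnd.adobe.xed-full+json;version=1",
        },
        "fileDescription": {
            "persisted": True,
            "containerFormat": "parquet",
            "format": "parquet",
        },
    }


# Hypothetical $id, standing in for the value copied from your own schema.
payload = build_dataset_payload(
    "Luma CRM Dataset", "https://ns.adobe.com/example/schemas/bbb"
)
body = json.dumps(payload, indent=2)
print(body)
```

The printed text can be pasted into the Body of the Postman request in place of the sample.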
If the Create a new dataset request fails, check for these common issues:
- 400: There was a problem retrieving xdm schema: Make sure you have replaced the id in the sample above with the id of your own Luma CRM Schema
- No auth token: Run the OAuth: Request Access Token request to generate a new token
- 401: Not Authorized to PUT/POST/PATCH/DELETE for this path : /global/schemas/: Update the CONTAINER_ID environment variable from global to tenant
- 403: PALM Access Denied. POST access is denied for this resource from access control: Verify your user permissions in the Admin Console
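If you script this request instead of using Postman, the issues listed above could be turned into a small lookup like the sketch below. The `suggest_fix` helper is hypothetical, and treating a missing token as a `None` status code is an assumption made only for this example.

```python
from typing import Optional

# Fixes taken from the issues listed above, keyed by HTTP status code.
FIXES = {
    400: "Replace the schemaRef id in the body with the $id of your own Luma CRM Schema.",
    401: "Update the CONTAINER_ID environment variable from global to tenant.",
    403: "Verify your user permissions in the Admin Console.",
}


def suggest_fix(status_code: Optional[int]) -> str:
    """Map a failed response to the fix suggested in this lesson."""
    # Assumption: None means the request was never sent because no token exists.
    if status_code is None:
        return "Run the OAuth: Request Access Token request to generate a new token."
    return FIXES.get(status_code, "No known fix; inspect the response body.")


print(suggest_fix(401))
```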
Go back to the Datasets screen in the Platform user interface to verify the successful creation of all five datasets!
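The same check can be sketched in Python against a dataset listing. This assumes the listing is an object keyed by dataset id, with each value carrying a `name` field; the `missing_datasets` helper, the sample listing, and its ids are hypothetical.

```python
EXPECTED_DATASETS = [
    "Luma Loyalty Dataset",
    "Luma Offline Purchase Events Dataset",
    "Luma Web Events Dataset",
    "Luma Product Catalog Dataset",
    "Luma CRM Dataset",
]


def missing_datasets(listing: dict, expected: list) -> list:
    """Return the expected dataset names that do not appear in the listing."""
    # Assumption: the listing maps dataset ids to objects with a "name" key.
    present = {entry.get("name") for entry in listing.values()}
    return [name for name in expected if name not in present]


# Hypothetical listing in which the CRM dataset has not been created yet.
sample_listing = {
    "id-1": {"name": "Luma Loyalty Dataset"},
    "id-2": {"name": "Luma Offline Purchase Events Dataset"},
    "id-3": {"name": "Luma Web Events Dataset"},
    "id-4": {"name": "Luma Product Catalog Dataset"},
}

print(missing_datasets(sample_listing, EXPECTED_DATASETS))
```

An empty list means all five datasets from this lesson are present.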
Now that all of our schemas, identities, and datasets are in place, we can enable them for Real-Time Customer Profile.