Validate data in the data lake
Learn how to validate that data has been successfully ingested into the data lake using Adobe Experience Platform’s Query Service. For detailed product documentation, see the Query Editor UI guide.
Transcript
Hi everyone. Today, we are going to discuss how to validate data in the data lake. The first thing is to request the names and IDs of the dataset and schema used to ingest the data for the profile. For that, keep the following details handy: schema name, dataset name, and dataset ID. The second thing is, why do we need dataset and schema information for Query Service? Firstly, for the table name to look up when building the query. This can be found on the dataset overview page. Now log in to your AEP profile. AEP stands for Adobe Experience Platform. In the left panel of the dashboard, scroll down and click the tab named datasets. There you will find a browse option. Click it and search for your desired dataset. Now click the dataset. There, on the right panel, you can easily locate the table name. There’s also an option to copy it for building the query.
Secondly, we can preview the last successful batch ingestion, with a default limit of a hundred rows, to see which XDM field holds the data that we want to look up via Query Service. To preview the dataset, use the preview dataset option in the top right corner. Click it to see the XDM field that holds the ECID value. Now you can view the XDM schema, and when scrolling down on the right panel, we can locate the XDM field path that will be used in the SQL statement.
Now let’s try to build an SQL statement using XDM paths. So why do we need to run the query? Because the query in AEP works on the data in ADLS. So if we need to confirm the data in ADLS, we need to query the data in the query tab. If records are returned, it proves that the data exists in ADLS. If not, ADLS does not have any data. Go back to your AEP profile and navigate to the query tab on the left panel. Then, in the top right corner, you can see an option to create a query. Click it and write your SQL statement.
The statement is select aapsupport.identification.ecid as ECID from Gupta event dataset for website limit 10. You can see the desired results over here. Lastly, let’s move on to the explanation of the SQL statement. The first line, select aapsupport.identification.ecid as ECID, selects the XDM field that holds the ECID values and places them in a column named ECID using the as keyword. The as keyword is used to rename a column of the table with an alias. An alias only exists for the duration of the query. The second line, from guftart event datasets for website, points to the dataset table in question. The last line, limit 10, limits the output to 10 rows. Please contact the AEP support team for any further assistance. Hope this was helpful. Thank you.
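The three-line statement described in the transcript can be sketched in Python as a small helper that assembles the select, from, and limit lines. This is only an illustration under stated assumptions: the function names `build_validation_query` and `data_exists` are hypothetical and not part of any Adobe SDK, the table name `your_dataset_table` is a placeholder for the name copied from the dataset page, and the field path is used as spoken in the transcript.

```python
import re

# Hypothetical helpers for illustration; not part of any Adobe SDK.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_validation_query(field_path, alias, table_name, limit=10):
    """Assemble SELECT <field_path> AS <alias> FROM <table_name> LIMIT <limit>."""
    # Check each dot-separated segment so only plain identifiers reach the SQL text.
    for part in field_path.split(".") + [alias, table_name]:
        if not _IDENTIFIER.match(part):
            raise ValueError(f"not a plain identifier: {part!r}")
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        f"SELECT {field_path} AS {alias}\n"  # pick the XDM field and alias the column
        f"FROM {table_name}\n"               # the dataset table copied from the UI
        f"LIMIT {limit}"                     # cap the number of rows returned
    )


def data_exists(rows):
    """Per the transcript: any returned record means the data is in the data lake."""
    return len(rows) > 0


query = build_validation_query(
    field_path="aapsupport.identification.ecid",  # XDM path as spoken in the transcript
    alias="ECID",
    table_name="your_dataset_table",  # assumption: placeholder, replace with your table name
)
print(query)
```

The resulting text can be pasted into the query editor. Once the rows come back, `data_exists(rows)` mirrors the rule from the transcript: a non-empty result confirms that the data was ingested, and an empty one means nothing was found for that field in that table.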