Data Governance Overview | Adobe Experience Platform


Data Governance overview

Last update: Wed Jan 24 2024 00:00:00 GMT+0000 (Coordinated Universal Time)

Topics:
Data Governance

CREATED FOR:

User
Developer
Admin

One of the core capabilities of Adobe Experience Platform is to bring data from multiple enterprise systems together to better allow marketers to identify, understand, and engage customers. This data may be subject to usage restrictions defined by your organization or by legal regulations. It is therefore important to ensure that your data operations within Platform are compliant with data usage policies.

Manage customer data and ensure compliance with regulations, restrictions, and policies applicable to data use with Adobe Experience Platform Data Governance. Data governance plays a key role within Experience Platform at various levels, including cataloging, data lineage, data usage labeling, data usage policies, and controlling usage of data for marketing actions.

NOTE

In Experience Platform, data governance is only concerned with how data is used or activated, regardless of the user performing the action. For information on how to control access to specific data fields for certain Platform users within your organization, see the documentation on attribute-based access control instead.

Data governance roles

As a concept, data governance is neither automatic, nor does it occur in a vacuum. What began as a role for one individual, typically recognized as a data steward, has grown considerably as the data governance ecosystem has expanded. Today, data governance requires continual management and monitoring in order to be successful. Effective data governance relies on data stewards having tools with which data can be properly labeled, usage policies can be created, and compliance with those policies can be enforced.

While data governance should be the responsibility of every individual in the organization, here are some of the essential roles within the data governance cycle:

Graphic to convey the four data governance roles, with quotes about the duties of each role.

Data steward

Data stewards are the heart of data governance. This role is responsible for interpreting regulations, contractual restrictions, and policies, and applying them directly to the data. Informed by their understanding of these regulations, restrictions, and policies, the role of a data steward includes:

Reviewing data, datasets, and data samples to apply and manage metadata usage labeling.
Creating data policies and applying them to datasets and fields.
Communicating data policies to the organization.

Marketer

Marketers are the end point of data governance. They request data from the data governance infrastructure created by data stewards, scientists, and engineers. Marketers encompass a number of different specialties under the marketing umbrella, including the following:

Marketing Analysts request data to enable understanding of customers, both as individuals and in groups (also known as segments).
Marketing Specialists and Experience Designers use data to design new customer experiences.

Data Governance framework

The Data Governance framework simplifies and streamlines the process of categorizing data and creating data usage policies. Once data labels have been applied and data usage policies are in place, marketing actions can be evaluated to ensure the correct use of data.

There are three key elements to the Data Governance framework: Labels, Policies, and Enforcement.

Labels: Classify data that reflects privacy-related considerations and contractual conditions to be compliant with regulations and organization policies.
Policies: Describe what kinds of marketing actions are allowed or not allowed to be taken on specific data.
Enforcement: Uses the policy framework to advise and enforce policies across different data access patterns.

Data usage labels

Data Governance enables data stewards to apply usage labels at the schema field level to categorize data according to the type of policies that apply.

The Data Governance framework includes predefined data usage labels that can be used to categorize data in three ways:

The three data usage label categories.

Contract “C” Data Labels: Label and categorize data that has contractual obligations or is related to customer data governance policies.
Identity “I” Data Labels: Label and categorize data that can identify or contact a specific person.
Sensitive “S” Data Labels: Label and categorize data related to sensitive data such as geographic data.
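
As a minimal sketch of the three categories above, the helper below maps a label name to its category by its leading letter. The function name and the assumption that predefined labels are a letter followed by a number (such as C5 or I1) are illustrative only and are not part of any Experience Platform API.

```python
# Hypothetical helper: classify a label name by its leading letter.
CATEGORY_BY_PREFIX = {"C": "Contract", "I": "Identity", "S": "Sensitive"}


def label_category(label: str) -> str:
    """Return the category of a predefined-style label such as "C5" or "I1"."""
    prefix, number = label[:1].upper(), label[1:]
    if prefix in CATEGORY_BY_PREFIX and number.isdigit():
        return CATEGORY_BY_PREFIX[prefix]
    # Anything that does not look like letter + number is treated as custom.
    return "Custom"


print(label_category("I1"))  # Identity
```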

NOTE

See the guide on supported data usage labels for a complete list of available labels, and definitions for each label type.

Labels can be applied at any time, providing flexibility in how you choose to govern data. Best practice encourages labeling data when it is ingested into Experience Platform, or as soon as data becomes available in Platform.

See the overview on data usage labels for more information on how data usage labels are used to help enforce data governance compliance.

Data usage policies

For data usage labels to effectively support data compliance, data usage policies must be implemented. Data usage policies are rules that describe the kinds of marketing actions that you are allowed to, or restricted from, performing on data within Experience Platform.

An example of a marketing action might be exporting a dataset to a third-party service. Suppose a policy is in place declaring that Personally Identifiable Information (PII) cannot be exported, and an “I” label (identity data) has been applied at the field level in the dataset’s schema. Policy Service then prevents any action that would export this dataset to a third-party destination. Should one of these action attempts occur, Policy Service sends a message telling you that a data usage policy has been violated.
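
The export example above can be sketched in a few lines. This is an in-memory illustration only: the `Policy` class, `find_violations` function, field names, and the `exportToThirdParty` action name are hypothetical and do not reflect how Policy Service is implemented or called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    name: str
    marketing_action: str     # the action this policy restricts
    denied_labels: frozenset  # labels that must not be used for that action


def find_violations(field_labels, action, policies):
    """Return (policy name, field name) pairs that block the given action."""
    violations = []
    for policy in policies:
        if policy.marketing_action != action:
            continue
        for field_name, labels in field_labels.items():
            # A field violates the policy if it carries any denied label.
            if labels & policy.denied_labels:
                violations.append((policy.name, field_name))
    return violations


# Hypothetical schema: the email field carries an identity label.
schema_labels = {"email": {"I1"}, "loyaltyTier": set()}
no_pii_export = Policy("No PII export", "exportToThirdParty", frozenset({"I1"}))

print(find_violations(schema_labels, "exportToThirdParty", [no_pii_export]))
# [('No PII export', 'email')]
```

Because the check is keyed on the marketing action, the same labeled field remains usable for actions that no policy restricts.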

There are two types of policies available:

Data governance policy: Restrict data activation based on the marketing action being performed and the data usage labels carried by the data in question.
Consent policy: Filter the profiles that can be activated to destinations based on your customers’ consent or preferences.
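
To illustrate how a consent policy differs from a data governance policy, the sketch below filters profiles rather than blocking an action. The profile structure and the `marketing_email` consent key are assumptions made for this example, not the actual consent schema.

```python
def filter_by_consent(profiles, required_consent):
    """Keep only profiles whose consent record explicitly grants the consent."""
    return [
        profile
        for profile in profiles
        if profile.get("consents", {}).get(required_consent) is True
    ]


# Hypothetical profiles; a missing consent record is treated as no consent.
profiles = [
    {"id": "a", "consents": {"marketing_email": True}},
    {"id": "b", "consents": {"marketing_email": False}},
    {"id": "c"},
]

activated = filter_by_consent(profiles, "marketing_email")
print([profile["id"] for profile in activated])  # ['a']
```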

Once data usage labels have been applied, data stewards can create policies using the Policy Service API or the Experience Platform user interface. For more information on data usage policies and marketing actions, see the policies overview.

IMPORTANT

All data usage policies (including core policies provided by Adobe) are disabled by default. For an individual policy to be considered for enforcement, you must manually enable that policy.
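
The disabled-by-default behavior can be pictured as follows. This is a sketch under assumptions: the `PolicyRecord` class and its boolean `enabled` flag are hypothetical, and the real Policy Service may represent policy status differently.

```python
from dataclasses import dataclass


@dataclass
class PolicyRecord:
    name: str
    enabled: bool = False  # mirrors "disabled by default"


def enforceable(policies):
    """Return the names of policies that would be considered for enforcement."""
    return [policy.name for policy in policies if policy.enabled]


core = PolicyRecord("Restrict email targeting")  # never enabled, so ignored
custom = PolicyRecord("No PII export")
custom.enabled = True                            # explicit, manual opt-in

print(enforceable([core, custom]))  # ['No PII export']
```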

Next steps

This document provided a high-level introduction to Data Governance and the Data Governance framework. You can now continue to the data usage labels user guide and start adding usage labels to your experience data.

Appendix

The following section provides additional information regarding Data Governance.

Data Governance terminology

The following table outlines key terms related to Data Governance and the Data Governance framework.

Term | Definition
Contract labels | Contract “C” labels are used to categorize data that has contractual obligations or is related to your organization’s data governance policies.
Cross-site data | Cross-site data is the combination of data from several sites. Cross-site data includes both on-site and off-site data, or a combination of data from several off-site sources.
Data governance | Data governance encompasses the strategies and technologies used to ensure that data is in compliance with regulations and corporate policies with respect to data usage.
Data steward | The data steward is the person responsible for the management, oversight, and enforcement of an organization’s data assets. A data steward also ensures that data governance policies are safeguarded and maintained to be compliant with government regulations and organization policies.
Data usage labels | Data usage labels provide users the ability to categorize data that reflects privacy-related considerations and contractual conditions to be compliant with regulations and corporate policies.
Dataset labels | Labels can be added to a schema. All fields within a dataset inherit the schema’s labels.
Field labels | Field labels are data governance labels that are either inherited from a schema or applied directly to a field. Data governance labels applied to a field are not inherited up to the schema level.
Geofence | A geofence is a virtual geographic boundary, defined by GPS or RFID technology, that enables software to trigger a response when a mobile device enters or leaves a particular area.
Identity labels | Identity “I” labels are used to categorize data that can identify or contact a specific person.
Interest-based targeting | Interest-based targeting, also known as personalization, occurs if data collected on-site meets the following three conditions: (1) it is used to make inferences about a user’s interests, (2) it is used in another context, such as on another site or app (off-site), and (3) it is used to select which content or ads are served based on those inferences.
Marketing action | A marketing action, in the context of the data governance framework, is an action that an Experience Platform data consumer takes, for which there is a need to check for violations of data usage policies.
Policy | In the data governance framework, a policy is a rule that describes what kinds of marketing actions are allowed or not allowed to be taken on specific data.
Schema labels | Manage the labels for data governance, consent, and access control at the schema level. This propagates the labels to every dataset that uses that schema.
Sensitive labels | Sensitive “S” labels are used to categorize data that you, and your organization, consider sensitive.
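
The inheritance direction described in the terminology above (schema labels flow down to fields, field labels do not flow up) can be sketched as a small function. The function and field names are hypothetical, and the label values are examples only.

```python
def effective_field_labels(schema_labels, field_labels):
    """Give each field the schema's labels plus its own; nothing flows upward."""
    return {
        name: set(schema_labels) | set(own_labels)
        for name, own_labels in field_labels.items()
    }


schema_labels = {"C2"}                                  # applied at the schema level
field_labels = {"email": {"I1"}, "loyaltyTier": set()}  # applied per field

result = effective_field_labels(schema_labels, field_labels)
print(sorted(result["email"]))        # ['C2', 'I1']
print(sorted(result["loyaltyTier"]))  # ['C2']
print(sorted(schema_labels))          # ['C2'], the field's I1 did not move up
```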

Additional resources

The following video is intended to support your understanding of the Data Governance framework.


Transcript

In this video, I’ll give you an overview of the governance features in Adobe Experience Platform. Let’s first look at the data governance challenges facing enterprises today. On one hand, you want to provide personalized customer experiences by leveraging customer data, data science, analytics, segmentation, and activation. On the other hand, you also want to ensure that all data usage complies with policies based on legal, contractual, and privacy obligations. Complying with such policies becomes hard when data usage is siloed from data stewardship. The governance features in Adobe Experience Platform allow you to address these challenges and break down such enterprise silos. Adobe Experience Platform provides an easily extensible governance framework that is deeply embedded in data usage workflows. We call this the Data Usage Labeling and Enforcement framework, or DULE, and it provides features for you to take complete control over governing your data, from the point it’s collected at data sources to when it’s syndicated to destinations outside Platform. The framework is built on three key aspects: labels, policies, and enforcement. First, you can classify data using governance labels. Second, you can author governance policies to define usage restrictions. Third, policies can be enforced when the data is used. With these three pillars, you can be assured that all data usage is in compliance and does not violate any policies.

Now let’s take a look at each of these aspects, starting with data classification using labels.

In an increasingly privacy-conscious landscape, you want to ensure that your data is appropriately tagged and classified to reflect corporate policies, contractual obligations, compliance requirements, and regional regulations. Governance labels enable you to do this and are critical to differentiating known and unknown data about your customers. Once labeled, the classifications are propagated as data flows through the system. Labels are also building blocks for usage policies and enable services to identify and restrict data usage. Different types of governance labels are offered in Adobe Experience Platform to capture the rich and complex set of restrictions you want to apply to the data. Three categories of labels are provided out of the box. Contract labels can be used to categorize data, indicating contractual obligations. For example, you can use the C6 contract label on data that is contractually restricted from use in onsite targeting. Identity labels can be used to categorize data that can be used to identify or contact a specific person. For example, you can use the I1 identity label on data from your CRM system that directly identifies an individual, like the email address or phone number. Sensitive labels can be used to categorize sensitive data like geolocation. Additionally, you can also create custom labels based on your business needs.

Data classification is designed to be a convenient experience that minimizes repetitive tasks. You can apply the governance labels at the source level on datasets. Granularity to apply labels at the dataset field level is also provided. This means that when you want to use the data, restrictions relevant for individual fields can be honored while fields without restrictions can continue to be used. Any segments or profiles that are created from source datasets will automatically inherit the labels. Not only does this reduce the chance of making mistakes, but any updated restrictions are propagated and reflected in data assets downstream in real time.

Now that we’ve covered data classification, let’s look at policy management, the second aspect of the governance framework. Adobe Experience Platform provides data stewards with the ability to define data usage policies based on their corporate, legal, and privacy guidelines. A governance policy describes what kinds of data usage actions are not allowed based on the classifications applicable to the data. Two features are provided that act as building blocks to define a governance policy: governance labels and marketing actions. Let’s look at an example. You may want to define a policy that states that directly identifiable data should not be used for onsite targeting. To enable this, you can use a marketing action identifying onsite advertising and the governance label I1 to create a policy rule.

The policy engine offers flexibility to use a Boolean expression of labels when authoring the rule. This can be a combination of one or more Boolean AND/OR expressions involving labels. You can define your own marketing actions and also enable or disable policies to have complete control over policy definitions. Governance policies come in two flavors. Core policies are provided out of the box, and use predefined labels and marketing actions in their definition. These are defined in coordination with privacy and legal guidelines, and help you get started with common restrictions to be applied for customer experience use cases. You can see some of the core policies on the right side. For example, a policy to restrict email targeting is based on labels C4 and C5, and prevents data with that classification from being used for email targeting. Custom policies provide flexibility to define your own restrictions using the policy engine, tailored to your specific use cases.
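
A Boolean expression of labels, as described above, can be sketched as a small recursive evaluator. The tuple representation used here is an assumption chosen for brevity; it is not the format the policy engine uses.

```python
def matches(expr, labels):
    """Evaluate a label expression against a set of labels.

    `expr` is either a label name (str) or a tuple of the form
    ("AND" | "OR", operand, operand, ...), where operands may be nested.
    """
    if isinstance(expr, str):
        return expr in labels
    operator, *operands = expr
    results = (matches(operand, labels) for operand in operands)
    if operator == "AND":
        return all(results)
    if operator == "OR":
        return any(results)
    raise ValueError(f"unknown operator: {operator}")


# Hypothetical rule: deny when C4 is present together with I1 or I2.
rule = ("AND", "C4", ("OR", "I1", "I2"))

print(matches(rule, {"C4", "I2"}))  # True
print(matches(rule, {"C4"}))        # False
```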

The third aspect of the governance framework is policy enforcement. Once policies are defined and enabled, applications using data can enforce these policies at the point where data is used for specific marketing actions. The applications should go through the following steps to enable governance enforcement. First, all the classifications for the data requested for usage need to be retrieved. Next, all the marketing actions for which the data is requested are retrieved. Once the labels and actions are retrieved, any violations against active governance policies can be checked. If any policies are violated, data usage can be controlled to honor the policies. Policy enforcement is built into Adobe Experience Platform with both simplicity and extensibility as key considerations. All four steps to enable enforcement can be performed using governance APIs. This means that any custom applications built on top of Platform can restrict data usage, and do it in a way that allows those applications to define the enforcement experience to provide to their users. Platform services and applications will provide an enforcement experience out of the box. This automates the steps for enforcement under the hood and is embedded in the usage flow of the applications. Platform’s approach to governance is not just about restricting usage, but also about helping you make informed choices on what you can do to mitigate the violation and continue delivering customer experiences. Whenever a policy violation happens, enough context is provided about why this happened. This includes a list of violated policies, as well as a comprehensive lineage analysis. By analyzing this lineage, you can identify how the activation you are trying to perform caused a policy violation. Based on this analysis, you can decide on the course of action and subsequently remediate violations by making appropriate updates to data relationships.
That’s an overview of the governance features in Adobe Experience Platform.
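
The four enforcement steps described in the transcript above can be sketched as a single function. Everything here is hypothetical: the catalogs are plain dictionaries standing in for whatever the governance APIs would return, and the dataset, destination, and action names are invented for illustration.

```python
def enforce(dataset, destination, label_catalog, action_catalog, active_policies):
    """Walk the four enforcement steps for one activation request."""
    # Step 1: retrieve the classifications (labels) on the requested data.
    labels = label_catalog[dataset]
    # Step 2: retrieve the marketing actions the data is requested for.
    actions = action_catalog[destination]
    # Step 3: check for violations against active governance policies.
    violated = [
        name
        for name, (action, denied_labels) in active_policies.items()
        if action in actions and labels & denied_labels
    ]
    # Step 4: control data usage and report which policies were violated.
    return {"allowed": not violated, "violated_policies": violated}


label_catalog = {"crm_profiles": {"I1", "C5"}, "web_events": set()}
action_catalog = {"ad_network": {"crossSiteTargeting"}}
active_policies = {"Restrict cross-site targeting": ("crossSiteTargeting", {"C5"})}

print(enforce("crm_profiles", "ad_network", label_catalog, action_catalog, active_policies))
# {'allowed': False, 'violated_policies': ['Restrict cross-site targeting']}
```

Returning the list of violated policies, rather than a bare yes or no, reflects the transcript's point that a violation should come with enough context to decide how to remediate it.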

The following video provides guidance on how to apply data usage labels to your schemas or the entirety of a dataset in Experience Platform.


Transcript

Hi, it’s Daniel. In this video, I’ll show you how to apply data usage labels to your schemas and datasets in Adobe Experience Platform. Data usage labels enable you to tag and classify data to reflect corporate policies, contractual obligations, compliance requirements, regional regulations, user access policies, and customer consent preferences. This is critical to differentiating known and unknown data about your customers and applying appropriate controls on data usage based on the nature of the data.

You can label data in both schemas and datasets. Let’s start with schemas. I’ll open the Platform interface and click Schemas in the left nav to see a list of all of my schemas. We’ll see the Luma Loyalty schema listed here. I’ll select the schema to see more details. The interface shows the schema structure. We’re interested in labeling data, so let’s click the Labels tab at the top. In this view, you can see all of the fields in this schema and which labels they contain. There’s a filter option to help you filter to specific field groups, identity fields, and the already applied labels. In the Luma Loyalty schema, I have an email address field that can directly identify an individual. Let’s apply an appropriate classification to this field.

Select a checkbox next to the field name and then click the button to edit the governance labels for the field.

In the dialog, you can choose labels from any of the three categories provided out of the box: contract labels, identity labels, and sensitive labels. You can even create a new label from within this workflow. We’ll apply an I1 label to indicate that email address is directly identifiable data. Click Save to finish. Another workflow you can use is to select a field from the schema’s structure view and then select Apply labels.

When you label a field in a schema, that label is inherited across all schemas in Platform using that same field group. So that should save you the trouble of having to track down all other locations that use that same field group.

For example, here’s another schema which uses the same Personal Contact Details field group as my loyalty schema. If I look at the email address field, you will see the label is already applied. Keep in mind, though, that sometimes several field groups use the same field at the same X path, and labeling the field in just one group won’t impact the use of that field in another group. Here’s an example of that. If I open my Luma context schema and open the personal email address field, which was added by the XDM Business Person Details group, note that the label has not been applied.

Labels can also be applied at the dataset level. In this case, let’s pretend we purchased a dataset from a third party and it has a contractual restriction preventing the data from being used for any cross-site targeting outside of the customer’s websites and applications. This dataset might be based on schema field groups used by other datasets which don’t have this contractual restriction. So in this case, it doesn’t make sense to apply labels at the schema level, but instead to the dataset itself. To do this, click the Data Governance tab. You might notice labels that were inherited from your schema. You can toggle these in or out of view. Scroll through the list of labels and apply the C5 label to indicate the restriction for cross-site targeting. One thing to keep in mind is that labels applied at the dataset level don’t play a role in access control policies, so they will impact data governance and consent policies, but not user access.

In previous versions of Adobe Experience Platform, labels could be applied on individual dataset fields to capture fine-grained classifications not applicable to the entire dataset. That has been discontinued in favor of the more powerful schema field labeling. But here’s an example of what that looked like, and you may still see some individual fields labeled in your dataset. We suggest that you remove them over time and replace them with equivalent labels on the schema fields. To remove a label from a dataset, first make sure you’ve already labeled the schema field equivalent, and then in your dataset, just click the X to remove the label from the field. Since the labels are applied on the schema and dataset metadata, you can start classification as soon as you have the datasets modeled in Platform. You don’t need to wait until the actual data ingestion starts. Once data is classified, data usage can be restricted by defining governance, consent, and user access policies. If policies have already been created before you’ve labeled your fields, you may experience a governance policy violation message like this one when applying a label. This means that there’s already a usage in place which will be violated by the application of this label. So use the data lineage diagram to understand what other configurations you might need to change before being able to add the label. Also, be aware that since the schema labels are also used for access control, a field might disappear from your view after you’ve applied a label if you don’t have permission to view fields with that label. If that happens to you, I suggest you open the schema and use the filter to see what labels are being used, even if you can’t see which fields they’ve been applied to, and then make sure you’re in a permissions role that has the same labels.
You should now be able to classify data by applying labels on individual fields in this schema and to an entire data set. Thanks for watching.
