Configure bot detection for datastreams

Last update: Wed Nov 13 2024 00:00:00 GMT+0000 (Coordinated Universal Time)

Topics:
Datastreams

CREATED FOR:

Developer

Nonhuman traffic from automated programs, web scrapers, spiders, and scripted scanners can make it difficult to identify events from human visitors. This type of traffic can negatively affect important business metrics, leading to incorrect traffic reporting.

Bot detection allows you to flag events sent through the Web SDK and Server API as being generated by known spiders and bots.

By configuring bot detection for your datastreams, you can identify specific IP addresses, IP ranges, and request headers to classify as bot events. This helps provide a more accurate measurement of user activity on your site or mobile application.

When a request to the Edge Network matches any of the bot detection rules, the XDM schema is updated with a bot score (always set to 1), as shown below:

{
  "botDetection": {
    "score": 1
  }
}

This bot scoring helps the solutions receiving the request correctly identify bot traffic.
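
For a downstream consumer, the check can be sketched as follows. This is a minimal illustration in Python, assuming the event payload is available as a parsed dictionary with the `botDetection` object at the top level, as in the snippet above. The function name `is_bot_event` is hypothetical and not part of any Adobe SDK.

```python
from collections.abc import Mapping
from typing import Any


def is_bot_event(xdm: Mapping[str, Any]) -> bool:
    """Return True when the payload carries a bot score of 1.

    Assumption: the botDetection object sits at the top level of the
    payload, as in the example above.
    """
    bot_detection = xdm.get("botDetection")
    if not isinstance(bot_detection, Mapping):
        return False
    return bot_detection.get("score") == 1


flagged = {"botDetection": {"score": 1}}
unflagged = {"web": {"webPageDetails": {"name": "home"}}}

print(is_bot_event(flagged))    # True
print(is_bot_event(unflagged))  # False
```

Because bot detection never drops requests, a consumer that wants to exclude bot traffic would need to apply a filter like this itself.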

IMPORTANT

Bot detection does not drop any bot requests. It only updates the XDM schema with the bot scoring, and forwards the event to the datastream service which you configured.

Adobe solutions may handle bot scoring in different ways. For example, Adobe Analytics uses its own bot filtering service and does not use the score set by the Edge Network. The two services use the same IAB bot list, so the bot scoring is identical.

Bot detection rules can take up to 15 minutes to propagate across the Edge Network after being created.

Prerequisites

For bot detection to work on your datastream, you must add the Bot Detection Information field group to your schema. See the XDM schema documentation to learn how to add field groups to a schema.

Configure bot detection for datastreams

You can configure bot detection after creating a datastream configuration. See the documentation on how to create and configure a datastream, then follow the instructions below to add bot detection capabilities to your datastream.

Go to the datastreams list and select the datastream to which you want to add bot detection.

Datastreams user interface showing the list of datastreams.

In the datastream details page, select the Bot Detection option on the right rail.

Bot detection option highlighted in the datastreams user interface.

The Bot Detection Rules page is shown.

Bot detection settings in the datastream settings page.

From the Bot Detection Rules page, you can configure bot detection in two ways:

Using the IAB/ABC International Spiders and Bots List.
Creating your own bot detection rules.

Use the IAB/ABC International Spiders and Bots List

The IAB/ABC International Spiders and Bots List is a third-party, industry-standard list of internet spiders and bots. This list helps you identify automated traffic such as search engine crawlers, monitoring tools, and other nonhuman traffic that you may not want to include in your analytics counts.

To configure your datastream to use the IAB/ABC International Spiders and Bots List:

Toggle the Use IAB/ABC International Spiders and Bots List for bot detection on this datastream option.
Select Save to apply the bot detection settings to your datastream.

IAB spiders and bot list enabled.

Create bot detection rules

In addition to using the IAB/ABC International Spiders and Bots List, you can define your own bot detection rules for each datastream.

You can create bot detection rules based on IP addresses and IP address ranges.

If you need more granular bot detection rules, you can combine the IP conditions with request header conditions. Bot detection rules can use the following headers:

| HTTP header | Description |
| --- | --- |
| user-agent | A header which lets servers and network peers identify the application, operating system, vendor, and/or version of the requesting user agent. |
| content-type | Indicates the original media type of the resource (prior to any content encoding applied for sending). |
| referer | Identifies the address of the web page from which the resource has been requested. |
| sec-ch-ua | Provides the brand and significant version for each brand associated with the browser in a comma-separated list. |
| sec-ch-ua-mobile | Indicates whether the browser is on a mobile device. It can also be used by a desktop browser to indicate a preference for a mobile user experience. |
| sec-ch-ua-platform | Provides the platform or operating system on which the user agent is running. For example: "Windows" or "Android". |
| sec-ch-ua-platform-version | Provides the version of the operating system on which the user agent is running. |
| sec-ch-ua-arch | Provides the user agent's underlying CPU architecture, such as ARM or x86. |
| sec-ch-ua-model | Indicates the device model on which the browser is running. |
| sec-ch-ua-bitness | Provides the "bitness" of the user agent's underlying CPU architecture. This is the size in bits of an integer or memory address, typically 64 or 32 bits. |
| sec-ch-ua-wow64 | Indicates whether a user agent binary is running in 32-bit mode on 64-bit Windows. |
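
To illustrate what some of these header values look like in a request, the sketch below decodes two client hint headers in Python. It relies on the structured-field encoding that browsers use for client hints (`?1` and `?0` for booleans, double-quoted strings for platform names). The helper names are hypothetical, and how the Edge Network itself parses these headers is not described here.

```python
from typing import Optional


def parse_ch_boolean(value: str) -> Optional[bool]:
    """Decode a structured-field boolean such as sec-ch-ua-mobile."""
    value = value.strip()
    if value == "?1":
        return True
    if value == "?0":
        return False
    return None  # not a valid structured-field boolean


def parse_ch_string(value: str) -> str:
    """Strip the surrounding double quotes from a value such as sec-ch-ua-platform."""
    value = value.strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    return value


# Sample header values as a desktop Chromium-based browser might send them.
headers = {
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": '"Windows"',
}

print(parse_ch_boolean(headers["sec-ch-ua-mobile"]))   # False (not mobile)
print(parse_ch_string(headers["sec-ch-ua-platform"]))  # Windows
```

When you write a header condition, keep in mind that the raw value includes this encoding, so a condition on `sec-ch-ua-mobile` compares against `?0` or `?1` rather than `true` or `false`.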

To create a bot detection rule, follow the steps below:

Select Add New Rule.
Type a name for the rule in the Rule Name field.

Select Add new IP condition to add a new IP-based rule. You can define the rule by IP address or by IP address range.

Bot detection rule screen with the IP address field highlighted.

Bot detection rule screen with the IP range field highlighted.

TIP
The IP conditions are based on a logical `OR` operation. A request is marked as originating from a bot if it matches any of the IP conditions that you defined.

If you want to add header conditions to your rule, select Add header conditions group, and then select the headers which you want the rule to use.

Then, add the conditions to be used for the selected header.
After configuring the desired bot detection rules, select Save to have the rules applied to your datastream.

Bot detection rule examples

To help you get started with bot detection, you can use the examples detailed below to create bot detection rules.

Bot detection based on one IP address

To mark all requests originating from a specific IP address as bot traffic, create a new bot detection rule which evaluates a single IP address, as shown in the image below.

Bot detection rule based on one IP address.

Bot detection based on two IP addresses

To mark all requests originating from either of two specific IP addresses as bot traffic, create a new bot detection rule which evaluates two IP addresses, as shown in the image below.

Bot detection rule based on two IP addresses.

Bot detection based on a range of IP addresses

To mark all requests originating from any IP address in a specific range as bot traffic, create a new bot detection rule which evaluates an entire IP address range, as shown in the image below.

Bot detection rule based on IP range.
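
The three IP-based examples above reduce to the same check: a request is flagged when its source address equals any listed address or falls inside any listed range. A minimal Python sketch using the standard `ipaddress` module follows. The addresses come from the reserved documentation ranges, and expressing a range as a start/end pair is an assumption made for illustration; check the UI for the format it actually accepts.

```python
from ipaddress import ip_address

# Hypothetical values, taken from the reserved documentation address blocks.
BOT_ADDRESSES = [ip_address("192.0.2.10"), ip_address("192.0.2.11")]
BOT_RANGES = [(ip_address("198.51.100.0"), ip_address("198.51.100.255"))]


def matches_ip_conditions(source_ip: str) -> bool:
    """Return True if the address matches ANY IP condition (logical OR)."""
    addr = ip_address(source_ip)
    if addr in BOT_ADDRESSES:
        return True
    # Only compare addresses of the same IP version; ordering is not
    # defined between IPv4 and IPv6 objects.
    return any(
        start.version == addr.version and start <= addr <= end
        for start, end in BOT_RANGES
    )


print(matches_ip_conditions("192.0.2.11"))     # True  (listed address)
print(matches_ip_conditions("198.51.100.42"))  # True  (inside the range)
print(matches_ip_conditions("203.0.113.5"))    # False (no condition matches)
```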

Bot detection based on an IP address and a request header

To mark all requests originating from a specific IP address and containing a specific request header as bot traffic, create a new bot detection rule as shown in the image below.

This rule checks if the request originates from a specific IP address and if the referer request header starts with www.adobe.com.

Bot detection rule based on IP address and request header.

Bot detection based on multiple conditions

You can create bot detection rules based on:

Multiple different conditions: Different conditions are evaluated as a logical AND operation, meaning that the conditions need to be met simultaneously in order for the request to be identified as originating from a bot.
Multiple conditions of the same type: Conditions of the same type are evaluated as a logical OR operation, meaning that if any of the conditions are met, the request is identified as originating from a bot.

The rule shown in the image below identifies a bot-originating request if the following conditions are met:

The request originates from either one of the two IP addresses, the referer header starts with www.adobe.com, and the sec-ch-ua-mobile header identifies the request as originating from a desktop browser.

Bot detection rule based on multiple conditions.
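
The evaluation logic described in this section can be sketched in Python as follows. This is an illustrative model, not the Edge Network implementation: the `Rule` class, its field names, and the sample addresses are hypothetical. It assumes that "conditions of the same type" means all IP conditions form one group and all conditions on a given header form another. The `?0` value is how browsers encode "not mobile" in `sec-ch-ua-mobile`.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Mapping

HeaderCheck = Callable[[str], bool]


@dataclass
class Rule:
    # IP conditions, compared as plain strings to keep the sketch short.
    ip_addresses: List[str] = field(default_factory=list)
    # Lowercase header name -> checks on that header's value.
    header_conditions: Dict[str, List[HeaderCheck]] = field(default_factory=dict)

    def matches(self, source_ip: str, headers: Mapping[str, str]) -> bool:
        # A rule with no conditions flags nothing in this model.
        if not self.ip_addresses and not self.header_conditions:
            return False
        # Same type (IP): logical OR across the listed addresses.
        if self.ip_addresses and source_ip not in self.ip_addresses:
            return False
        # Different types: logical AND across groups,
        # with OR inside each header's own group.
        for name, checks in self.header_conditions.items():
            value = headers.get(name)
            if value is None or not any(check(value) for check in checks):
                return False
        return True


rule = Rule(
    ip_addresses=["192.0.2.10", "192.0.2.11"],
    header_conditions={
        "referer": [lambda v: v.startswith("www.adobe.com")],
        "sec-ch-ua-mobile": [lambda v: v == "?0"],
    },
)

desktop = {"referer": "www.adobe.com/products", "sec-ch-ua-mobile": "?0"}
mobile = {"referer": "www.adobe.com/products", "sec-ch-ua-mobile": "?1"}

print(rule.matches("192.0.2.11", desktop))   # True: every group is satisfied
print(rule.matches("192.0.2.11", mobile))    # False: mobile header group fails
print(rule.matches("203.0.113.5", desktop))  # False: IP group fails
```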
