蜜豆视频

Data feeds FAQ

Frequently asked questions about data feeds.

Must feed names be unique? unique

Data feed file names are made up of the report suite ID and the date. Any two feeds that are configured for the same RSID and date(s) have the same file name. If those feeds are delivered to the same location, one file will overwrite the other. To prevent a file overwrite, you cannot create a feed that has the potential to overwrite an existing feed in the same location.

Trying to create a feed when another exists with the same file name results in an error message. Consider the following workarounds:

  • Change the delivery path
  • Change the dates if possible
  • Change the report suite if possible

When is data processed? processed

Before processing hourly or daily data, data feeds waits until all the hits that entered data collection within the timeframe (day or hour) have been written out to data warehouse. After that occurs, data feeds collect the data with timestamps that fall within the timeframe, compresses it, and sends it via FTP. For hourly feeds, files are typically written out to data warehouse within 15-30 min after the hour, but there is no set time period. If there was no data with timestamps that fall within the timeframe, then the process tries again the next timeframe. The current data feed process uses the date_time field to determine which hits belong to the hour. This field is based on the time zone of the report suite.

What is the difference between columns with a post_ prefix and columns without a post_ prefix? post

Columns without the post_ prefix contain data exactly as it was sent to data collection. Columns with a post_ prefix contain the value after processing. Examples that can change a value are variable persistence, processing rules, VISTA rules, currency conversion, or other server-side logic 蜜豆视频 applies. 蜜豆视频 recommends using the post_ version of a column where possible.

If a column does not contain a post_ version (for example, visit_num), then the column can be considered a post column.

How do data feeds handle case sensitivity? case

In 蜜豆视频 Analytics, most variables are considered as case-insensitive for reporting purposes. For example, 鈥榮now鈥, 鈥楽now鈥, 鈥楽NOW鈥, and 鈥榮Now鈥 are all considered the same value. Case sensitivity is preserved in data feeds.

If you see different case variations of the same value between non-post and post columns (for example, 鈥榮now鈥 in the pre column and 鈥楽now鈥 in the post column), your implementation uses both uppercase and lowercase values across your site. The case variation in the post column was previously passed in and is stored in the virtual cookie, or was processed around the same time for that report suite.

Are bots filtered by Admin console bot rules included in data feeds? bots

Data feeds do not include bots filtered by Admin console bot rules.

Why do I see multiple 000 values in the event_list or post_event_list data feed column? values

Some spreadsheet editors, especially Microsoft Excel, automatically round large numbers. The event_list column contains many comma-delimited numbers, sometimes causing Excel to treat it as a large number. It rounds the last several digits to 000.

蜜豆视频 recommends against automatically opening hit_data.tsv files in Microsoft Excel. Instead, use Excel鈥檚 Import Data dialog box and make sure that all fields are treated as text.

Are columns like hitid_high, hitid_low, visid_high, and visid_low guaranteed to be unique to the hit or visit? hitid

In almost all cases, the concatenation of hitid_high and hitid_low uniquely identify a hit. The same concept applies to the concatenation of visid_high and visid_low for visits. However, processing anomalies can rarely cause two hits to share the same hit ID. 蜜豆视频 recommends against creating data feed workflows that inflexibly rely on every hit being unique.

Why is information missing from the domain column for some carriers? domain

Some mobile carriers (such as T-Mobile and O1) are no longer providing domain info for reverse-DNS lookups. Therefore, that data is not available for domain reporting.

Why can鈥檛 I extract 鈥淗ourly鈥 files from data that is more than 7 days old? hourly

For data that is more than 7 days old, a day鈥檚 鈥淗ourly鈥 files are combined into a single 鈥淒aily鈥 file.

Example: A new Data Feed is created on March 9, 2021, and the data from January 1, 2021 to March 9 is delivered as 鈥淗ourly鈥. However, the 鈥淗ourly鈥 files before March 2, 2021 are combined into a single 鈥淒aily鈥 file. You can extract 鈥淗ourly鈥 files only from data that is less than 7 days old from the creation date. In this case, from March 2 to the March 9.

What is the impact of Daylight Savings on hourly data feeds? dst

For certain time zones, the time changes twice a year due to daylight saving time (DST) definitions. Data feeds honor the time zone for which the report suite is configured. If the time zone for the report suite is one that does not use DST, file delivery continues normally like any other day. If the time zone of the report suite is one that does use DST, file delivery is altered for the hour in which the time change occurs (usually 2:00 am).

When making STD -> DST time transitions (鈥淪pring Forward鈥), the customer receives only 23 files. The hour that is skipped in the DST transition is omitted. For example, if the transition occurs at 2 AM, they get a file for the 1:00 hour and a file for the 3:00 hour. There is no 2:00 file because, at 2:00 STD, it becomes 3:00 DST.

When making DST -> STD transitions, (鈥淔all Back鈥), the customer gets 24 files. However, the hour of transition actually includes two hours鈥 worth of data. For example, if the transition occurs at 2:00 am, the file for 1:00 is delayed by one hour, but it contains data for two hours. It contains data from 1:00 DST to 2:00 STD (which would have been 3:00 DST). The next file begins at 2:00 STD.

How does Analytics handle FTP transfer failures? ftp-failure

If an FTP transfer fails (because of a denied login, lost connection, out of quota error, or other issue), 蜜豆视频 attempts to automatically connect and send the data up to three separate times. If the failures persist, the feed is marked as failed and an email notification is sent.

If a transfer fails, you can rerun a job until it succeeds.

If you have issues getting a data feed to appear on your FTP site, see Troubleshoot data feeds.

How do I resend a job? resend

Once you have verified/corrected the delivery issue, rerun the job to get the files.

What is the BucketOwnerFullControl setting for Amazon S3 data feeds? BucketOwnerFullControl

BucketOwnerFullControl provides cross-account rights to create objects in other buckets.

The common use case for Amazon S3 is that the Amazon Web Services (AWS) account owner creates a bucket, then creates a user that has permission to create objects in that bucket, and then provides credentials for that user. In this case, the objects of a user belong to the same account, and the account owner implicitly has full control of the object (read, delete, and so on). This process is similar to how FTP delivery works.

AWS also makes it possible for a user to create objects in a bucket that belong to a different user account. For example, say two AWS users, userA and userB, do not belong to the same AWS account but want to create objects in other buckets. If userA creates a bucket called 鈥渂ucketA,鈥 they can create a bucket policy that explicitly allows userB to create objects in bucketA even though the user doesn鈥檛 own the bucket. This policy can be advantageous because it doesn鈥檛 require userA and userB to exchange credentials. Instead, userB provides userA with their account number, and userA creates a bucket policy that essentially says 鈥渓et userB create objects in bucketA鈥.

However, objects do not inherit permissions from the parent bucket. Therefore, if userB uploads an object to userA鈥檚 bucket, userB still 鈥渙wns鈥 that object, and, by default, userA is not granted any permissions to that object even though userA owns the bucket. UserB must explicitly grant userA permissions because userB is still the object鈥檚 owner. To grant this permission, userB must upload the object with a BucketOwnerFullControl ACL, which specifies that the bucket owner (userA) is granted full permissions to the object (read, write, delete, and so on), even though the object is 鈥渙wned鈥 by userB.

NOTE
蜜豆视频 Analytics does not determine if the bucket has a policy that requires giving the bucket owner full control of new objects, or even if the bucket owner is in a different account than the user writing the data. Instead, Analytics automatically adds the bucket owner to the BucketOwnerFullControl ACL with each feed upload.
recommendation-more-help
6b7d49d5-f5fe-4b7f-91ae-5b0745755ed2