Text parser

You can use the Text parser tool to parse text for use in other Adobe Workfront Fusion scenario modules. The Text parser does not require a connection.

Access requirements

You must have the following access to use the functionality in this article:

Adobe Workfront package
Any

Adobe Workfront license
New: Standard
Or
Current: Work or higher

Adobe Workfront Fusion license
No Workfront Fusion license requirement.

Product
New:
  • Select or Prime Workfront package: Your organization must purchase Adobe Workfront Fusion.
  • Ultimate Workfront package: Workfront Fusion is included.
Or
Current: Your organization must purchase Adobe Workfront Fusion.

For more detail about the information in this table, see Access requirements in documentation.

For information on Adobe Workfront Fusion licenses, see Adobe Workfront Fusion licenses.

Text parser API information

The Text parser connector uses the following:

API tag
v2

Text parser modules and their fields

When you configure Text parser modules, Adobe Workfront Fusion displays the fields listed below. A bolded title in a module indicates a required field.

If you see the map button above a field or function, you can use it to set variables and functions for that field. For more information, see Map information from one module to another.

Transformers

Get Elements from HTML

Retrieves the desired elements from HTML code.

Continue the execution of the route even if the module finds no matches
Enable this option to ensure that the module does not stop the scenario if it returns no results.
Element type

Select the type of element you want to retrieve from the HTML code.

  • Image
  • Link
  • iFrame element(s)
HTML
Enter or map the HTML code you want to retrieve the specified element types from.
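
To illustrate what this module does conceptually, here is a minimal Python sketch that pulls image, link, or iFrame addresses out of HTML. It is not the module's implementation: the mapping of element types to tags and attributes, and the names `ELEMENT_TYPES`, `ElementCollector`, and `get_elements_from_html`, are assumptions made for this example.

```python
from html.parser import HTMLParser

# Assumed mapping from the module's element types to an HTML tag and
# the attribute that holds the address. Not taken from the product.
ELEMENT_TYPES = {
    "image": ("img", "src"),
    "link": ("a", "href"),
    "iframe": ("iframe", "src"),
}


class ElementCollector(HTMLParser):
    """Collects one attribute value from every tag of a given type."""

    def __init__(self, tag, attribute):
        super().__init__()
        self.tag = tag
        self.attribute = attribute
        self.found = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lowercase.
        if tag == self.tag:
            value = dict(attrs).get(self.attribute)
            if value is not None:
                self.found.append(value)


def get_elements_from_html(html, element_type):
    tag, attribute = ELEMENT_TYPES[element_type]
    collector = ElementCollector(tag, attribute)
    collector.feed(html)
    collector.close()
    return collector.found


sample = '<p><a href="https://example.com">Home</a><img src="logo.png"></p>'
print(get_elements_from_html(sample, "link"))    # ['https://example.com']
print(get_elements_from_html(sample, "image"))   # ['logo.png']
print(get_elements_from_html(sample, "iframe"))  # []
```

The empty list in the last call corresponds to the "no matches" case that the first option in this module lets a scenario continue past.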

Get Elements from text

Parses elements from text based on the given pattern.

Input text
Enter or map the text you want to parse.
Pattern
Select the pattern that reflects the elements you want to parse from the text.
Ignore Duplicate Occurrences
Check this box to ignore duplicate occurrences of a text element.
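
The following Python sketch shows the general idea of parsing elements by pattern and ignoring duplicates. This article does not list the patterns the module offers, so the simplified e-mail pattern and the function name `get_elements_from_text` are purely hypothetical.

```python
import re

# Hypothetical pattern: a deliberately simple e-mail matcher.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def get_elements_from_text(text, pattern, ignore_duplicates=False):
    # The pattern has no capturing groups, so findall returns whole matches.
    matches = pattern.findall(text)
    if ignore_duplicates:
        # dict.fromkeys keeps the first occurrence and preserves order.
        matches = list(dict.fromkeys(matches))
    return matches


text = "Contact ann@example.com or bob@example.org; ann@example.com is preferred."
print(get_elements_from_text(text, EMAIL_PATTERN))
# ['ann@example.com', 'bob@example.org', 'ann@example.com']
print(get_elements_from_text(text, EMAIL_PATTERN, ignore_duplicates=True))
# ['ann@example.com', 'bob@example.org']
```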

HTML to Text

HTML
Enter the HTML code you want to convert to plain text.
Line break
Select the type of newline (line break).
Uppercase headings
Enable this option to convert text enclosed in the heading tags (such as <h2> </h2>) into uppercase text.
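
As a rough illustration of the Line break and Uppercase headings options, the Python sketch below converts a small HTML fragment to plain text. It is an approximation for explanation only: the set of block-level tags and the names `TextExtractor` and `html_to_text` are assumptions, and the module's actual output may differ.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
# Assumed set of tags that end a line of text.
BLOCK_TAGS = HEADING_TAGS | {"p", "div", "li"}


class TextExtractor(HTMLParser):
    def __init__(self, line_break="\n", uppercase_headings=False):
        super().__init__()
        self.line_break = line_break
        self.uppercase_headings = uppercase_headings
        self.heading_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self.heading_depth += 1
        elif tag == "br":
            self.parts.append(self.line_break)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self.heading_depth = max(0, self.heading_depth - 1)
        if tag in BLOCK_TAGS:
            self.parts.append(self.line_break)

    def handle_data(self, data):
        # Uppercase only the text that sits inside a heading tag.
        if self.uppercase_headings and self.heading_depth:
            data = data.upper()
        self.parts.append(data)


def html_to_text(html, line_break="\n", uppercase_headings=False):
    extractor = TextExtractor(line_break, uppercase_headings)
    extractor.feed(html)
    extractor.close()
    return "".join(extractor.parts).strip()


sample = "<h2>Release notes</h2><p>Fixed a <b>bug</b>.</p>"
print(repr(html_to_text(sample)))
# 'Release notes\nFixed a bug.'
print(repr(html_to_text(sample, line_break="\r\n", uppercase_headings=True)))
# 'RELEASE NOTES\r\nFixed a bug.'
```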

Match Pattern

The Match pattern module enables you to find and extract string elements matching a search pattern from a given text. This module uses regular expressions (also known as regex or regexp).

A regular expression is a sequence of characters in which each character is either a metacharacter, which has a special meaning, or a regular character, which has a literal meaning. These characters and metacharacters identify a pattern that can be used to search text. For example, if you wanted to search for names, you could set up a regular expression to search for a pattern that consists of two consecutive words that begin with capital letters. Regular expressions are a powerful tool for searching and manipulating text.

A discussion of regular expressions is beyond the scope of this article. We recommend the following resources:

  • For the complete list of metacharacters, see in MDN web docs.
  • For a tutorial on how to create regular expressions, we recommend .
  • For experimenting with regular expressions, we recommend the website. Select the ECMAScript (JavaScript) FLAVOR in the left panel.
Pattern

Enter the regular expression pattern.

Example: [+-]?(\d+(\.\d+)?|\.\d+)([eE][+-]?\d+)? extracts all numerals in the provided text.

Note:

The pattern should contain at least one capture group in parentheses (). If the pattern does not contain any capture groups, the output bundle is empty.

Global match
Enable this option to retrieve all matches in the text. Each match is output in a separate bundle. If this option is disabled, the module retrieves only the first entry.
Case sensitive
Enable this option for this module to treat text as case-sensitive.
Multiline
Enable this option to ensure that the beginning and end metacharacters (^ and $) match the beginning or end of each line, not just the very beginning or end of the whole input string.
Singleline
Enable this option to ensure that the period (.) matches newline characters (\n).
Continue the execution of the route even if the module returns no results
Enable this option to ensure that the module does not stop the scenario if it returns no results.
Text
Enter or map the text that you want to match against the pattern.
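
The options above can be approximated in Python, shown here only as a sketch. Python's `re` module stands in for the ECMAScript flavor mentioned earlier, and the function `match_pattern` and its parameter names are assumptions, not part of Workfront Fusion. The function returns the capture groups of each match, which loosely mirrors the note that a pattern needs at least one capture group to produce useful output.

```python
import re

# The example pattern from the Pattern field.
NUMBER_PATTERN = r"[+-]?(\d+(\.\d+)?|\.\d+)([eE][+-]?\d+)?"


def match_pattern(pattern, text, global_match=False, case_sensitive=True,
                  multiline=False, singleline=False):
    """Return the capture groups of each match, one tuple per match."""
    flags = 0
    if not case_sensitive:
        flags |= re.IGNORECASE
    if multiline:
        flags |= re.MULTILINE  # ^ and $ also match at line boundaries
    if singleline:
        flags |= re.DOTALL     # . also matches newline characters
    matches = [m.groups() for m in re.finditer(pattern, text, flags)]
    # Without Global match, keep only the first match.
    return matches if global_match else matches[:1]


numbers = match_pattern(NUMBER_PATTERN, "12, -3.5 and 6.02e23", global_match=True)
print([groups[0] for groups in numbers])  # ['12', '3.5', '6.02']

log = "Status: open\nstatus: Closed"
print(match_pattern(r"^status: (\w+)$", log, global_match=True,
                    case_sensitive=False, multiline=True))
# [('open',), ('Closed',)]
print(match_pattern(r"open(.+)Closed", log, singleline=True))
# [('\nstatus: ',)]
print(match_pattern(r"\d+", "no groups here: 42"))
# [()]  (a match was found, but there is no group to report)
```

Note that the first capture group of the number pattern excludes the sign, which is why `-3.5` is reported as `3.5`. To keep the sign, the whole expression would need to be wrapped in an additional pair of parentheses.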

Replace

Searches the entered text for a specified value or regular expression and replaces the result with the new value.

Pattern
Enter the search term. You can also use a regular expression. For more details about regular expressions, see the Match Pattern module.
New value
Enter the value that you want to replace the search term with.
Global match
Enable this option to replace all matches in the text. If this option is disabled, the module replaces only the first match.
Case sensitive
Enable this option for this module to treat text as case-sensitive.
Multiline
Enable this option to ensure that the beginning and end metacharacters (^ and $) match the beginning or end of each line, not just the very beginning or end of the whole input string.
Singleline
Enable this option to ensure that the period (.) matches newline characters (\n).
Text
Enter the text to be searched.
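
A minimal Python sketch of the same idea follows. It assumes that Global match controls whether every occurrence or only the first one is replaced. The function name `replace_text` and its parameters are hypothetical, and Python's `re.sub` stands in for the module's ECMAScript regular expressions.

```python
import re


def replace_text(pattern, new_value, text, global_match=False, case_sensitive=True):
    flags = 0 if case_sensitive else re.IGNORECASE
    # In re.sub, count=0 replaces every match and count=1 only the first.
    count = 0 if global_match else 1
    return re.sub(pattern, new_value, text, count=count, flags=flags)


print(replace_text(r"\s+", " ", "a  b   c"))
# a b   c
print(replace_text(r"\s+", " ", "a  b   c", global_match=True))
# a b c
print(replace_text("draft", "final", "Draft one, draft two",
                   global_match=True, case_sensitive=False))
# final one, final two
```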

Data Scraping

Data scraping, sometimes called web scraping, data extraction, or web harvesting, is the process of collecting data from websites and storing it in your local database or spreadsheets. If you want to scrape data from a website and you are not familiar with regular expressions, you may use a data scraping tool.

If the data scraping tool provides a REST API, you can connect to it via our universal HTTP modules and Webhooks modules.

Text parser troubleshooting

Use this information if you cannot get a Text parser module to produce any output.

Example:

The module should parse the file type of a document named “filename.docx”, where the extension of the filename varies from DOCX to PDF to CSV.

The expression that you may choose to use in this case is \..+

This regular expression would normally result in a full match.

However, implementing this expression in your text parser does not result in a match:

No match

The reason for this is that, without a capture group, the output contains only the “i” item, which shows the number of each match. In this case there are 2 matches, so “i” takes the numerical values 1 and 2. This value is useful if you ever need to match, or pass through a filter, only a specific match (for example, the second one): the numerical value identifies which match it is.

Match

To get the match values, you need to add parentheses around the part that you want to parse. For example, to extract only “docx” from “filename.docx” with the regular expression used in this scenario, apply the parentheses as follows: \.(.+)

This captures the DOCX, places it in a group, and leaves the “.” out of it.
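
The difference between the two patterns can be checked outside of Workfront Fusion. The Python sketch below is only an illustration, with Python's `re` standing in for the module's regular expression engine and the pattern without parentheses written as `\..+`.

```python
import re

filename = "filename.docx"

without_group = re.search(r"\..+", filename)
with_group = re.search(r"\.(.+)", filename)

print(without_group.group(0))  # .docx  (the full match includes the period)
print(without_group.groups())  # ()     (no capture group, nothing to output)
print(with_group.groups())     # ('docx',)
```

With a filename that contains several periods, such as `archive.tar.gz`, the group in `\.(.+)` captures everything after the first period (`tar.gz`), so a stricter pattern may be needed in that case.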

Get matches

In the output shown in the picture below, the capturing group will match any character (except for line terminators).

Output

Another workaround that also incorporates regex is to use the replace function:

{{replace("abcdefghijklmno pqr stuvw xyz.docx"; "/.*\./"; ".")}}

Then replace abcdefghijklmno pqr stuvw xyz.docx with your actual filename variable.
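
As a rough equivalent of this workaround, the Python sketch below assumes the pattern is meant to be `.*\.`, that is, everything up to and including the last period, which is then replaced with a single period. The variable names are illustrative only.

```python
import re

filename = "abcdefghijklmno pqr stuvw xyz.docx"

# The greedy .* consumes everything up to the last period in the name.
extension = re.sub(r".*\.", ".", filename, count=1)
print(extension)  # .docx
```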
