Extract substring

This operation extracts a substring from the cells of a selected column and writes it to a new column.

Suppose your loaded data contains a column with comments and you want to extract certain information from these comments. To do this, you need to place this information into a separate column, map the fields of the new column, and apply the required analysis modules.

Configuring the operation

Add a new operation and select Extract substring from the list.
For detailed instructions on creating operations, see Repository & Data Management > ETL in the Cloud > Operations.
In the operation editor, specify the following:

Source column
From the drop-down list, select the column from which you need to extract data.
New column
Enter a name for your new column. This column will contain the extracted data.
Strategy
Select the strategy used to extract data from the source column you specified above:

Position
This strategy extracts characters occupying specific positions in the substring.
For example, if Start position is set to 1 and Extract string length is set to 3, the program will extract characters occupying positions 1 through 3.
Pattern
This strategy uses a regular expression to extract data. 
You will need to specify a regular expression in the Pattern expression field (see Using regular expressions below).
Click Show regexp hints to see help on regular expressions.

Use filter
You can use filters if you need to apply an operation only to records that meet certain criteria (for example, only to those records where the "Employee" field is not empty).
For details, see Using filters with operations. 
Use the Run preview option if you want to view the result of the operation before performing it. Only part of the table will be displayed in preview mode.

Click Save. This will close the operation editor window, with the created operation appearing on the top panel.
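
The Position strategy can be pictured with a short JavaScript sketch. The function name extractByPosition is hypothetical and not part of the product. The sketch only assumes that positions are counted from 1, as in the example where Start position 1 and Extract string length 3 yield the characters in positions 1 through 3.

```javascript
// Hypothetical helper that mimics the Position strategy (1-based positions).
function extractByPosition(value, startPosition, length) {
  // Assumption: positions below 1 or non-positive lengths yield an empty result.
  if (startPosition < 1 || length < 1) {
    return "";
  }
  // Convert the 1-based start position to JavaScript's 0-based index.
  const startIndex = startPosition - 1;
  return value.slice(startIndex, startIndex + length);
}

// Start position 1, length 3: characters in positions 1 through 3.
console.log(extractByPosition("INV-2021-0042", 1, 3)); // INV
```

How the product itself treats out-of-range positions is not stated here, so the empty-string fallback is only an illustration.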

Using regular expressions

The Extract substring operation can use a regular expression to extract text from a column. Regular expressions allow you to specify complicated search patterns.

The tables below list the available operators and quantifiers and provide examples of their use.

Important. Text search with regular expressions is case-sensitive.

Note: https://regex101.com/ offers an online service that helps you create and parse regular expressions, explaining the purpose of each operator. There you can also test your regular expression and get detailed information on how it works. Simply enter your regular expression in the REGULAR EXPRESSION field and see it parsed and explained in the EXPLANATION pane.

Operators

Operator	Description	Example
String of characters	Matches the specified string of characters.	Regular expression: marketing Column contains job titles: "accountant", "marketing specialist", "marketing manager" Regular expression will find: " ", "marketing", "marketing"
String of characters in round brackets ( )	Matches the specified string of characters. The round brackets allow additional operations to be applied to the string.	Regular expression: (marketing) Column contains job titles: "accountant", "marketing specialist", "marketing manager" Regular expression will find: " ", "marketing", "marketing"
String of characters in round brackets (?: )	Matches the specified string of characters. The matched substring will not be saved in the array of results.	Regular expression: (?:marketing)
Any character . (period)	Matches any character: a letter, a digit, or a special symbol.	Regular expression: (Mark .) Column contains names of employees: "Mark Stanford", "Dianne Millington", "Mark Wood" Regular expression will find: "Mark S", " ", "Mark W"
Characters in square brackets [ ]	Matches any one of the specified characters. The search will stop as soon as a matching character is found. To specify a range of possible characters, use "-": [0-9] stands for any digit, [a-z] stands for any lower-case letter, [A-Z] stands for any upper-case letter, [a-zA-Z] stands for any letter.	Regular expression: [0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9] Column contains text that mentions social security numbers: "No. 934-80-1840", "SSN 870-28-7383", "SSN unknown" Regular expression will find: "934-80-1840", "870-28-7383", " "
Special symbols \	Matches any one of the following special symbols literally: \ ^ $ . | ? * + ( ) [ { Use the backslash to escape a special symbol.	Regular expression: \+ Column contains records marked with: "+", "-" Regular expression will find: "+", " "
Number of occurrences { }	Specifies the number of times the preceding expression may occur.	Regular expression: [0-9]{4} Column contains text that mentions extension phone numbers: "Phone 1111", "Extension 2222", "No extension number" Regular expression will find: "1111", "2222", " "
Alternatives (more than one) |	Combines multiple alternatives and matches any one of them.	Regular expression: (M|m)arketing Column contains job titles: "accountant", "marketing manager", "Marketing Manager" Regular expression will find: " ", "marketing", "Marketing"
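
The operator examples can be tried out in JavaScript. This is only a sketch: the helper extractFirstMatch is hypothetical, it assumes that a cell without a match yields an empty value (as the " " entries in the table suggest), and JavaScript's regular expression engine may differ in details from the one used by the product.

```javascript
// Hypothetical helper: returns the first match in each cell, or "" if there is none.
function extractFirstMatch(cells, pattern) {
  return cells.map((cell) => {
    const match = cell.match(pattern);
    return match ? match[0] : "";
  });
}

const jobTitles = ["accountant", "marketing manager", "Marketing Manager"];

// A plain string of characters is case-sensitive, so "Marketing" is not found.
console.log(extractFirstMatch(jobTitles, /marketing/));
// -> "", "marketing", ""

// Alternatives accept either capitalization.
console.log(extractFirstMatch(jobTitles, /(M|m)arketing/));
// -> "", "marketing", "Marketing"

// Square brackets with ranges pick out a number in the form NNN-NN-NNNN.
const notes = ["No. 934-80-1840", "SSN unknown"];
console.log(
  extractFirstMatch(notes, /[0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]/)
);
// -> "934-80-1840", ""

// A backslash turns the special symbol + into a literal character.
console.log(extractFirstMatch(["+", "-"], /\+/));
// -> "+", ""
```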

Quantifiers

A quantifier specifies the allowed number of occurrences for the immediately preceding character, group of characters or class of characters.

Quantifier	Description	Example
? (alternative notation: {0,1})	Repeats the preceding character zero or one times, making it optional. To make a group of characters optional, enclose the characters in round brackets.	Regular expression: Mark (Stanford)? Column contains names of employees: "Mark Stanford", "Dianne Millington", "Frank Wood", "Mark Hamilton" Regular expression will find: "Mark Stanford", " ", " ", "Mark"
* (alternative notation: {0,})	Repeats the preceding character zero or more times.	Regular expression: Mark Stanf.* Column contains names of employees, some of them with typos: "Mark Stanford", "Dianne Millington", "Mark Stanfard", "Mark Stanfort" Regular expression will find: "Mark Stanford", " ", "Mark Stanfard", "Mark Stanfort"
+ (alternative notation: {1,})	Repeats the preceding character one or more times.	Regular expression: [Cc]hapter [0-9]+ Column contains text that mentions book chapters: "Chapter 3 describes...", "chapter 15 provides information about..." Regular expression will find: "Chapter 3", "chapter 15"
{m}	Repeats the preceding pattern exactly m times.	Regular expression: \d{4} Column contains text that mentions vehicle registration year: "Vehicle registration year 2019" Regular expression will find: "2019"
{m,}	Repeats the preceding pattern at least m times.	Regular expression: \d{10,} Column contains text that mentions customers' phone numbers: "Phone 89991234567", "Tel. 9991234567" Regular expression will find: "89991234567", "9991234567"
{m,n}	Repeats the preceding pattern m to n times (m cannot be greater than n).	Regular expression: \d{2,4} Column contains text that mentions vehicle registration year: "Vehicle registration year 2019", "Vehicle registration year '19" Regular expression will find: "2019", "19"
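
The quantifier examples from the table can be checked with a few lines of JavaScript. The helper firstMatch is hypothetical and exists only for this sketch; it assumes that the first match in a cell is the extracted value and that JavaScript's regular expression syntax is close enough to the product's for these patterns.

```javascript
// Hypothetical helper: returns the first match in a cell, or "" if there is none.
function firstMatch(cell, pattern) {
  const match = cell.match(pattern);
  return match ? match[0] : "";
}

// {m}: exactly four digits.
console.log(firstMatch("Vehicle registration year 2019", /\d{4}/)); // 2019

// {m,n}: two to four digits, so a two-digit year is also found.
console.log(firstMatch("Vehicle registration year '19", /\d{2,4}/)); // 19

// {m,}: at least ten digits.
console.log(firstMatch("Tel. 9991234567", /\d{10,}/)); // 9991234567

// +: one or more digits after the word "chapter" in either capitalization.
console.log(
  firstMatch("chapter 15 provides information about...", /[Cc]hapter [0-9]+/)
); // chapter 15
```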

Quantifiers may be:

greedy
non-greedy (also called lazy or reluctant)

Greedy is the default behavior and means that the quantifier will match as much as it can, i.e. the regular expression will initially look for the maximum number of matching characters and then will give back one character at a time if the remainder of the pattern cannot be matched.

let regexp = /".+"/g;
let str = 'a "witch" and her "broom" is one';
alert( str.match(regexp) ); // "witch" and her "broom"

Conversely, a non-greedy quantifier will attempt to match the minimum number of occurrences.

let regexp = /".+?"/g;
let str = 'a "witch" and her "broom" is one';
alert( str.match(regexp) ); // "witch", "broom"

An example of a regular expression using multiple operators and quantifiers

Suppose you have a document that contains a date and time in the following format: Friday, June 18, 2021 8:45:30 PM

To detect the time, the following regular expression can be used:
([0-1]?[0-9]|[2][0-3]):([0-5][0-9])(:[0-5][0-9])?
This regular expression will find hours, minutes, and seconds (if indicated).
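
As a rough check, the expression can be run against the sample text in JavaScript. The function extractTime and the shape of its result are assumptions made for this sketch, not something the product provides.

```javascript
// The time expression from the text above.
const timePattern = /([0-1]?[0-9]|[2][0-3]):([0-5][0-9])(:[0-5][0-9])?/;

// Hypothetical helper: returns the matched time and its parts, or null if no time is found.
function extractTime(text) {
  const match = text.match(timePattern);
  if (!match) {
    return null;
  }
  return {
    time: match[0],
    hours: match[1],
    minutes: match[2],
    // The third group includes the leading colon and is undefined when seconds are absent.
    seconds: match[3] ? match[3].slice(1) : null,
  };
}

console.log(extractTime("Friday, June 18, 2021 8:45:30 PM").time); // 8:45:30
console.log(extractTime("Meeting at 14:05").time); // 14:05
```

The digits in "18" and "2021" are skipped because none of them is followed by a colon, which is what the expression requires after the hours.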

9/5/2024 4:23:54 PM