How to Create a Script for Document Separation
ABBYY FineReader Server 14 allows you to configure document separation with the help of a script. This script is executed in addition to one of the built-in document separation methods: first, the document separation is performed in accordance with the selected method, and then the script is applied. Scripts can be used for adjusting or enhancing a selected separation rule using recognized text and barcodes.
The script runs separately on each recognized page, so it cannot separate documents by comparing two pages with each other. Based on the content and properties of a page, the document separation script can mark the page as the beginning of a new document, mark the page for deletion (e.g. if it is a separation page or a blank page), or discard the job.
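As an illustration of a per-page decision, the following minimal sketch marks a page for deletion when no text was recognized on it. The helper name markBlankPage is hypothetical, and the sketch assumes that a blank page yields an empty or whitespace-only Text value, which may not hold for noisy scans. Only the Text and IsForDeletion properties named in this topic are used.

```javascript
// Hypothetical helper, not part of the product API.
// "page" is assumed to expose Text and IsForDeletion, like the RecognizedPage object.
function markBlankPage(page) {
    // Convert defensively in case Text is missing or not a string.
    var text = String(page.Text || "");
    // Assumption: a page counts as blank when only whitespace was recognized.
    var isBlank = text.replace(/\s+/g, "") === "";
    if (isBlank) {
        page.IsForDeletion = true;
    }
    return isBlank;
}
```

In the Script Editor, such a helper would presumably be called as markBlankPage(this), since "this" refers to the RecognizedPage object.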
Important! Document separation scripts are triggered at the Processing Station. Thus, for the script to work properly with shared resources, you should run the Processing Stations under a user account which has the necessary access rights for these resources.
To create and use the script, do the following:
- Open the 3. Document Separation tab of the Workflow Properties dialog box.
- Select one of the built-in document separation methods (the default setting is Create one document for each job) and click Script....
- In the Script Editor dialog box that opens, select the scripting language and enter the script code. The reference "this" or "Me" refers to the RecognizedPage object.
- To check the script, click the Check button. To save the script, click OK.
Important! When processing multi-page documents, the program splits each document into several portions, to be processed simultaneously, each by a separate processor core. The number of pages in each portion is set in the PagesSlice attribute (the default setting is 25).
- If you modify the workflow properties or load settings from an XML file so that PagesSlice=25 and then add a document separation script to the workflow, documents will not be split and the PagesSlice attribute will be automatically set to -1. If the value of the PagesSlice attribute is other than 25, it will remain unchanged.
- If you modify the workflow properties or load settings from an XML file so that PagesSlice=-1 and there is no document separation script in the workflow, the PagesSlice attribute will be automatically set to 25 and documents will be split into portions of 25 pages.
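The two adjustment rules above can be summarized as a small decision function. This is only a model of the described behavior, not product code: the function name effectivePagesSlice and both constants are hypothetical, and the value 25 is taken from the default and the portion size stated above.

```javascript
// Hypothetical model of how the PagesSlice value is adjusted; not a product API.
var DEFAULT_PAGES_SLICE = 25; // default portion size stated in this topic
var NO_SPLITTING = -1;        // value at which documents are not split

function effectivePagesSlice(pagesSlice, hasSeparationScript) {
    // A separation script with the default value: splitting is turned off.
    if (hasSeparationScript && pagesSlice === DEFAULT_PAGES_SLICE) {
        return NO_SPLITTING;
    }
    // No separation script with splitting turned off: the default is restored.
    if (!hasSeparationScript && pagesSlice === NO_SPLITTING) {
        return DEFAULT_PAGES_SLICE;
    }
    // Any other value remains unchanged.
    return pagesSlice;
}
```

For example, under this model a workflow with PagesSlice=10 keeps the value 10 whether or not a separation script is added.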
For details on using scripts, see Using Scripts in ABBYY FineReader Server.
Sample
The sample script provided below is written in JScript and uses separation pages to distribute recognized pages among documents of three groups (articles, resumes, brochures). If the text on a recognized page matches one of the predefined separator texts, the script assigns the corresponding custom text to the page (the CustomText property of the RecognizedPage object), marks the page as the first page of a document (the IsStartingPage property), and marks it for deletion (the IsForDeletion property). As a result, the script separates the documents into three groups and deletes the separation pages.
Note. This script is used in the "Scripting Demo" demo processing scenario, whose settings are stored in ScriptingDemoWorkflow.xml in the Samples folder (Start > Programs > ABBYY FineReader Server 14 > Examples).
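// Text, CustomText, IsStartingPage and IsForDeletion are properties of the RecognizedPage object.
var pageText = Text;

// Compare the recognized text with the text of each separator sheet.
var isArticle = pageText == "Separator sheet Document type: article";
var isResume = pageText == "Separator sheet Document type: resume";
var isBrochure = pageText == "Separator sheet Document type: brochure";

// Label the page with its document group.
if( isArticle ) {
    CustomText = "Article";
} else if( isResume ) {
    CustomText = "Resume";
} else if( isBrochure ) {
    CustomText = "Brochure";
}

// A separator sheet starts a new document and is itself deleted.
if( isArticle || isResume || isBrochure ) {
    IsStartingPage = true;
    IsForDeletion = true;
}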
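The sample compares the recognized text with the separator text exactly, so an extra line break or a difference in letter case would prevent a match. The following variant is a sketch of a more tolerant comparison; it assumes that such differences can occur in the recognized text, which depends on the recognition settings. The helper names normalize, classifySeparator and applySeparator and the SEPARATOR_TYPES table are hypothetical; only the Text, CustomText, IsStartingPage and IsForDeletion properties come from the sample.

```javascript
// Hypothetical lookup table: separator type (lowercase) -> custom text.
var SEPARATOR_TYPES = { "article": "Article", "resume": "Resume", "brochure": "Brochure" };

// Collapse whitespace runs, trim both ends and convert to lowercase.
function normalize(text) {
    return String(text || "").replace(/\s+/g, " ").replace(/^ | $/g, "").toLowerCase();
}

// Return the custom text for a separator sheet, or null for any other page.
function classifySeparator(text) {
    var prefix = "separator sheet document type: ";
    var normalized = normalize(text);
    if (normalized.indexOf(prefix) !== 0) {
        return null;
    }
    var type = normalized.substring(prefix.length);
    return SEPARATOR_TYPES.hasOwnProperty(type) ? SEPARATOR_TYPES[type] : null;
}

// Apply the same three property changes as the sample script.
// "page" is assumed to behave like the RecognizedPage object.
function applySeparator(page) {
    var label = classifySeparator(page.Text);
    if (label !== null) {
        page.CustomText = label;
        page.IsStartingPage = true;
        page.IsForDeletion = true;
    }
    return label;
}
```

In the Script Editor, the last step would presumably be the call applySeparator(this). Keeping the separator types in a table means that a new document group only requires a new table entry.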
3/26/2024 1:49:49 PM