Web Services API Best Practices

This section lists recommendations for maximizing the performance of the ABBYY FineReader Server 14 Web Services API.

The code samples in this section are written in C# and use the Web Services API SOAP protocol.

All recommendations listed here also apply to the REST API, although some parameters may have different names. For more information, see REST API.

Processing a single job via the API

To process a single file via the API using the default settings, do the following:

  1. Call the StartProcessFile method by passing the file name and the contents of the file for processing (the FileName and FileContents fields of the FileContainer object). The response of the method will contain the ID of the created job.
  2. Periodically call the GetJobStateInfo method using the obtained job ID until it returns either the JS_Complete value (job processing completed) or the JS_NoSuchJob value (job deleted manually).
  3. Call the GetJobResult method to get an object containing the processing results (including file contents).

Note. If you need to change the processing parameters or process several files in a single job, use a combination of the CreateTicket and StartProcessTicket methods instead of using StartProcessFile.

Note. If you do not need the file contents when getting processing results, use the GetJobResultEx method with a DoNotSendFiles flag instead of the GetJobResult method.
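
The polling in step 2 can be sketched as a small helper. This is a minimal sketch, not part of the product API: JobPolling, WaitForFinalState, and the getState delegate are hypothetical names, and the delegate is assumed to wrap a GetJobStateInfo call for one job ID and return the state name as a string. The timeout is an addition of this sketch so that the loop cannot run forever.

```csharp
using System;
using System.Threading;

public static class JobPolling
{
    // Calls getState repeatedly until it reports a final state or the timeout expires.
    // getState is a hypothetical delegate, assumed to wrap GetJobStateInfo for one job ID
    // and to return the state name, e.g. "JS_Complete".
    public static string WaitForFinalState(
        Func<string> getState, TimeSpan pollInterval, TimeSpan timeout)
    {
        var deadline = DateTime.UtcNow + timeout;
        while (true)
        {
            var state = getState();
            // The two final states named in step 2.
            if (state == "JS_Complete" || state == "JS_NoSuchJob")
                return state;
            if (DateTime.UtcNow >= deadline)
                throw new TimeoutException("The job did not reach a final state in time.");
            Thread.Sleep(pollInterval);
        }
    }
}
```

With a generated SOAP client, the delegate would typically call GetJobStateInfo with the job ID returned by StartProcessFile. Call GetJobResult only when the helper returns JS_Complete.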

To process several jobs, you can repeat the steps above for each job in turn; however, this is not always the best solution.

ABBYY FineReader Server uses built-in algorithms to form job queues and distribute jobs across available processing stations. We recommend that you follow the steps outlined below if you need to process several jobs.

Processing multiple jobs via the API

Starting from Release 2 Update 2:

  1. Choose the number of jobs to keep active at the same time (N), e.g. 10. Base your choice on your hardware specifications. See also: Performance Guide.
  2. Send the first N jobs for processing by calling StartProcessFile N times with different parameters, which will return N job IDs. Unlike the sequential approach, this places all N jobs in the queue at the same time, letting FineReader Server 14 automatically distribute them across your processing stations.
  3. Periodically call the GetJobStateInfos method, passing to it an array of N job IDs, until at least one job is returned as JS_Complete or JS_NoSuchJob. Compared to the sequential approach, this reduces the number of status requests by a factor of N.
  4. Get the results for completed jobs.
  5. Replace the finished jobs in the queue with new ones, so that the number of active jobs stays at N (or fewer if you do not have enough jobs available).

The GetJobStateInfos method was implemented in Release 2 Update 2. If your FineReader Server 14 version is Release 2 Update 1 or older, we recommend that you update your software. If you are unable to do so, we still recommend using this approach; in this case, however, you will need to call GetJobStateInfo N times instead of calling GetJobStateInfos once.
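
The steps above amount to a sliding window of N active jobs. The following is a minimal sketch of that loop under stated assumptions, not the product's own implementation: JobWindow, Run, and the three delegates are hypothetical names. startJob is assumed to wrap StartProcessFile and return the job ID, getStates is assumed to wrap GetJobStateInfos and return a map from job ID to state name, and handleFinished is assumed to fetch or log the outcome.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class JobWindow
{
    // Keeps at most maxActive jobs on the server until every input is processed.
    // All delegates are hypothetical wrappers around the Web Services API calls.
    public static void Run<TInput>(
        IEnumerable<TInput> inputs,
        int maxActive,
        Func<TInput, string> startJob,
        Func<IReadOnlyList<string>, IDictionary<string, string>> getStates,
        Action<string, string> handleFinished,
        TimeSpan pollInterval)
    {
        if (maxActive < 1) throw new ArgumentOutOfRangeException(nameof(maxActive));
        var pending = new Queue<TInput>(inputs);
        var active = new List<string>();

        while (pending.Count > 0 || active.Count > 0)
        {
            // Steps 2 and 5: top the window up to N active jobs.
            while (active.Count < maxActive && pending.Count > 0)
                active.Add(startJob(pending.Dequeue()));

            // Step 3: one status request covers every active job.
            var states = getStates(active);
            var finished = active
                .Where(id => states.TryGetValue(id, out var s)
                             && (s == "JS_Complete" || s == "JS_NoSuchJob"))
                .ToList();

            // Step 4: hand each finished job to the caller, then free its slot.
            foreach (var id in finished)
            {
                handleFinished(id, states[id]);
                active.Remove(id);
            }

            // Wait before polling again only if nothing finished this round.
            if (finished.Count == 0)
                Thread.Sleep(pollInterval);
        }
    }
}
```

On versions without GetJobStateInfos, the getStates delegate could loop over the IDs and call GetJobStateInfo for each one; the rest of the loop stays the same.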

Use unique names for jobs and input files

Files that have been sent for processing via the API are copied to the workflow input folder. The names of these files correspond to the values specified in the FileName property of the FileContainer objects.

Additionally, an XML file containing processing settings is placed in the input directory. The name of the file corresponds to the value of the Name property of the XmlTicket object (alternatively, if the field is empty, the value of the FileName property of the first FileContainer object).

Sending several requests with identical file names at the same time will result in all of them except one being completed with the following error:

The file 'filename.ext' is already in the Input folder. Please wait until the workflow retrieves it, or use a different name.

This is why we recommend using unique file names. For example, you can use a GUID as a prefix:

// Send multiple files for processing using the SOAP protocol of the ABBYY FineReader Server Web Services API.
// WebServiceSoapClient, InputFile, and FileContainer are assumed to come from the SOAP proxy
// generated for the Web Services API. The class name below is illustrative.
using System;
using System.Collections.Generic;
using System.IO;

public static class ProcessingSample
{
    public static string SendFilesForProcessing(
        WebServiceSoapClient client, string serverLocation,
        string workflowName, string[] inputFilePaths)
    {
        // Generate one unique prefix for this job
        var prefix = Guid.NewGuid().ToString();
        var inputFiles = new List<InputFile>();
        foreach (var inputFilePath in inputFilePaths)
        {
            inputFiles.Add(new InputFile
            {
                FileData = new FileContainer
                {
                    // Prepend the prefix to each input file name
                    FileName = $"{prefix}_{Path.GetFileName(inputFilePath)}",
                    FileContents = File.ReadAllBytes(inputFilePath)
                }
            });
        }
        var ticket = client.CreateTicket(serverLocation, workflowName);
        // Also prepend the prefix to the name of the job, or leave the name empty
        ticket.Name = $"{prefix} (From Web API)";
        ticket.InputFiles = inputFiles.ToArray();
        // Send the job to the server and return its ID
        return client.StartProcessTicket(serverLocation, workflowName, ticket);
    }
}

Passing files

Use one of the following methods to send files for processing:

  • If your files are relatively small (several dozen megabytes), use the FileContents property of the FileContainer object. This will pass the file contents as a byte array.
  • Starting from Release 1 Update 9, use the POST /api/workflows/{workflowName}/input/multipart method to pass files using multipart streaming, which does not require holding the full file in memory (for more information, please see Workflows controller).
  • You can also pass file-sharing service addresses instead of the file contents themselves by using the LocationPath property of the FileContainer object.

Important! Passing files via the LocationPath property is not recommended and exists only for RS4 compatibility purposes. Avoid using it in new projects. This mode is disabled by default because it may introduce a security vulnerability unless proper access control is used for the application pool identity account. For more information, please see FileContainerObject.
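
For the multipart option in the list above, a request could be built roughly as follows. This is a sketch under assumptions, not the documented client: MultipartUpload and BuildRequest are hypothetical names, baseUrl stands for the root address of your REST API, and the form field name "file" is a placeholder. Check the Workflows controller reference for the actual field names and any required headers.

```csharp
using System;
using System.IO;
using System.Net.Http;

public static class MultipartUpload
{
    // Builds a multipart POST request for the endpoint named in the text.
    // Assumption: the form field name "file" is a placeholder, not a documented name.
    public static HttpRequestMessage BuildRequest(
        Uri baseUrl, string workflowName, Stream content, string fileName)
    {
        var form = new MultipartFormDataContent();
        // StreamContent reads from the stream while the request is being sent,
        // so the file does not have to be loaded into a byte array first.
        form.Add(new StreamContent(content), "file", fileName);

        var path = $"api/workflows/{Uri.EscapeDataString(workflowName)}/input/multipart";
        return new HttpRequestMessage(HttpMethod.Post, new Uri(baseUrl, path))
        {
            Content = form
        };
    }
}
```

A caller would open the file with File.OpenRead, pass the stream to BuildRequest, and send the request with HttpClient.SendAsync. Using a unique file name, as recommended earlier, applies here as well.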

Avoid using the following synchronous methods: ProcessFile and ProcessTicket

These methods are only meant to be used for backwards compatibility. Using these methods unnecessarily overloads the server and may lead to TimeoutException errors. Wherever possible, use a combination of StartProcessFile / StartProcessTicket, GetJobState, and GetJobResult instead.

Use the IsTemporary property for temporary jobs

Starting from Release 2:

If your API workflow does not involve storing the processing results on the server for longer than 24 hours, you should always set the IsTemporary property of the XmlTicket object to true. For more information, please see XmlTicket Object.

26.03.2024 13:49:49
