How to Configure Microsoft Search IFilter
ABBYY FineReader Server 14 includes the Microsoft Search IFilter component, which allows the program to interact with the following search engines developed by Microsoft: Microsoft Search Server, Microsoft Office SharePoint Server, and Microsoft Windows Search.
A Microsoft search engine indexes documents in specified folders. If these folders contain documents in image formats that must be indexed, these documents are sent to ABBYY FineReader Server through IFilter. ABBYY FineReader Server recognizes the documents as part of a special hidden workflow and exports the results as text. When the recognition results become available, IFilter returns them to the Microsoft engine for indexing and the documents become searchable.
IFilter is based on the ABBYY FineReader Server 14 – IFilter Backend service (Start > Control Panel > Administrative Tools > Services > ABBYY FineReader Server 14 – IFilter Backend) and performs two main functions:
- It receives image files from the Microsoft search system crawler and sends them to ABBYY FineReader Server for OCR.
- It returns the recognized text to the Microsoft search system for indexing.
For this feature to work correctly with Microsoft SharePoint, the following conditions must be met:
- A search service application must be created in SharePoint Server.
- The user who installs IFilter must have permission to connect to the databases used by the search service application in SharePoint Server and permission to modify data in these databases.
To configure Microsoft Search IFilter, do the following:
- Configure Microsoft Search Server, Microsoft Office SharePoint Server or Microsoft Windows Search for file indexing:
- Specify the folders with image files to be indexed.
- Specify the formats of the files to be indexed. Microsoft Search IFilter is automatically registered for the following file extensions: .jpg, .jpeg, .jpe, .tif, .tiff, .pdf, .bmp, .pcx, .dcx, .png, .djvu, .j2k, .jp2.
- Note. ABBYY FineReader Server 14 IFilter can be used to index PDF documents in SharePoint 2013. In order for this functionality to be available, the September 2014 Cumulative Update for SharePoint Server 2013 must be installed (support.microsoft.com/kb/2883068).
- Note. When you run a search in SharePoint 2013 or 2016, all *.jpg, *.jpe, and *.jpeg files are processed by ABBYY FineReader Server 14 IFilter.
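To check in advance which files in a folder fall under the automatically registered extensions listed above, a small script can help. The following is a minimal sketch only: the helper name is_registered_extension is hypothetical and is not part of ABBYY FineReader Server or of any Microsoft search product.

```python
from pathlib import PurePath

# Extensions for which Microsoft Search IFilter is registered automatically,
# copied from the list in this section.
REGISTERED_EXTENSIONS = frozenset({
    ".jpg", ".jpeg", ".jpe", ".tif", ".tiff", ".pdf", ".bmp",
    ".pcx", ".dcx", ".png", ".djvu", ".j2k", ".jp2",
})


def is_registered_extension(filename: str) -> bool:
    """Hypothetical helper: True if the file extension is in the list above.

    The comparison ignores letter case and looks only at the last suffix.
    """
    return PurePath(filename).suffix.lower() in REGISTERED_EXTENSIONS
```

For example, is_registered_extension("scan.TIFF") is true, while is_registered_extension("notes.docx") is false.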
Note. After Microsoft Search IFilter is installed and Microsoft Office SharePoint Server is configured for file indexing (i.e. all the necessary folders and file formats are specified), you must restart the SharePoint Server search service. To do this, open the command line (click Start > Run..., and in the dialog box that opens enter "cmd", then click OK) and run the following commands sequentially: "net stop osearch" and "net start osearch".
Note. When installing ABBYY FineReader Server 14 IFilter for SharePoint 2013 and 2016, all the configuration is done automatically by the installation program.
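If the search service restart described above has to be repeated often, the two commands can be wrapped in a script. The sketch below assumes it is run on the SharePoint server with administrator rights. The function names are hypothetical, and only the "net stop osearch" and "net start osearch" commands come from this guide. By default it only reports the commands without running them.

```python
import subprocess

SERVICE_NAME = "osearch"  # service name taken from the note above


def restart_commands(service: str = SERVICE_NAME) -> list[list[str]]:
    """Return the stop and start commands in the order they must run."""
    return [["net", "stop", service], ["net", "start", service]]


def restart_search_service(dry_run: bool = True) -> list[str]:
    """Run the commands sequentially; with dry_run=True only list them.

    check=True makes a failed stop abort the script before the start
    command is attempted.
    """
    issued = []
    for command in restart_commands():
        issued.append(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
    return issued
```

Calling restart_search_service(dry_run=False) runs both commands for real; leave dry_run at its default to review them first.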
- Configure the ABBYY FineReader Server 14 IFilter. To do this, in the IFilter Settings dialog box (Start > Programs > ABBYY FineReader Server 14 > IFilter Settings), specify the following parameters:
- In the Server Manager location field, enter the DNS name or the IP address of the computer on which the Server Manager is installed.
- Click Test Connection to check if the connection is established. If there is a connection, "Connected" will appear to the left of the button.
- Select recognition languages. English is used by default. If several copies of IFilter on the network use the same FineReader Server, each copy can have its own recognition languages. For example, each Windows Search user can choose the recognition languages they need on their own PC.
- We recommend enabling the Process photos option to ensure the processing of *.jpg, *.jpe, and *.jpeg files.
- Change the temporary IFilter folder if necessary. This folder stores the IFilter database file and holds the files with recognition results until they are transferred to the Microsoft search system.
- File recognition for IFilter is carried out using a special hidden workflow. To set the workflow parameters, click the Microsoft Search IFilter link in the Remote Administration Console. In the IFilter Workflow Settings dialog box, set the following parameters:
- Change the temporary folders for input and output files if necessary.
Important! The Input and Output folders must be shared, and the paths to them must be specified in UNC format. The user account under which the ABBYY FineReader Server 14 IFilter Backend service runs (on the computer where the ABBYY FineReader Server 14 IFilter is installed) must have read/write permissions for these folders.
- Set up an image recognition schedule. Images can be recognized continuously or according to a schedule. For more information on creating schedules, see Creating a Schedule.
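The UNC requirement for the Input and Output folders can be checked before the paths are entered in the IFilter Workflow Settings dialog box. The following is a minimal sketch that relies only on the Python standard library. The helper name is_unc_path and the server and share names in the example are hypothetical. The check looks at the form of the path only, not at whether the share exists or which permissions are set on it.

```python
from pathlib import PureWindowsPath


def is_unc_path(path: str) -> bool:
    """Hypothetical helper: True if the path has the form \\\\server\\share\\...

    PureWindowsPath parses Windows paths on any operating system; for a
    UNC path its drive part is "\\\\server\\share", while for a local path
    it is a drive letter such as "C:" or an empty string.
    """
    return PureWindowsPath(path).drive.startswith("\\\\")
```

For example, is_unc_path(r"\\ocr-host\ifilter\input") is true, while is_unc_path(r"C:\ifilter\input") is false.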
Note. IFilter stores information about previously recognized files in a database and uses it to check if a file has already been recognized. If a file has been created or modified after the last indexing session, it will be sent to ABBYY FineReader Server for recognition. Files that have not changed since the last indexing will not be re-recognized.
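The check described in this note can be pictured with the following conceptual sketch. It is not ABBYY's implementation: the format of the IFilter database is not documented here, so a plain dictionary that maps file paths to the modification time recorded at the last recognition stands in for it, and the function name is hypothetical.

```python
def needs_recognition(path: str, modified: float,
                      recognized: dict[str, float]) -> bool:
    """True if the file is new or was modified after its last recognition.

    `recognized` is a stand-in for the IFilter database (an assumption):
    it maps a file path to the modification time seen when the file was
    last sent for recognition.
    """
    last_seen = recognized.get(path)
    return last_seen is None or modified > last_seen
```

A file that is missing from the dictionary or carries a newer modification time is sent for recognition; an unchanged file is skipped.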
Note. The Microsoft search system waits for a response from ABBYY FineReader Server 14 IFilter for a limited period of time. This period may not be long enough for ABBYY FineReader Server to recognize a large file. In this case, indexing is performed in two stages: on the first request from the Microsoft system, the file is sent to FineReader Server for OCR; on the next request, which occurs some time later, the recognized text is returned to the search system for indexing. For this reason, the contents of new image files may be added to the index and become available for searching only after a delay of up to several hours.
Note. If a recognition language is changed in the IFilter settings, the indexing results must be updated. To do this, remove all files from the <Temporary File Storage>\Results folder and run indexing again.
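Clearing the Results folder can be scripted. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the path of the temporary file storage must be supplied by the administrator, and the script only deletes files. It leaves the folder itself in place and does not start re-indexing.

```python
from pathlib import Path


def clear_results(temp_storage: str) -> int:
    """Delete all files under <temp_storage>/Results; return how many.

    Files in subfolders are deleted as well; the folders themselves are
    kept. A missing Results folder is treated as already empty.
    """
    results = Path(temp_storage) / "Results"
    if not results.is_dir():
        return 0
    removed = 0
    for entry in results.rglob("*"):
        if entry.is_file():
            entry.unlink()
            removed += 1
    return removed
```

After the script finishes, run indexing again in the Microsoft search system so that the files are recognized with the new languages.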
Note. If you use ABBYY FineReader Server 14 IFilter for indexing files in Windows Server 2012, you should manually add the Windows Search Service feature in your operating system.
To enable the Windows Search Service in Windows Server 2012, complete the following steps:
- Start the Server Manager.
- Open the Manage menu and select Add Roles and Features.
- On the Installation Type tab, select the Role-based or feature-based installation option.
- On the Server Selection tab, select the server or virtual drive on which you want to install the Windows Search Service.
- On the Features tab, select Windows Search Service.
- Open the Confirmation tab and make sure that the Windows Search Service is listed. Click the Install button.
ABBYY FineReader Server 14 IFilter events can be logged to an OCRServerIFilter.part0.log file (followed by OCRServerIFilter.part1.log, and so on) in the %ProgramData%\ABBYY FineReader Server 14 folder. Logging is disabled by default. To enable logging, select the corresponding option in the IFilter Settings dialog box. The log includes information about:
- starting and stopping IFilter
- files transferred to ABBYY FineReader Server for recognition
- IFilter errors
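Because the log is split into numbered parts, it helps to read the parts in numeric order, so that part10 comes after part2 rather than after part1. The following is a minimal sketch: the function name is hypothetical, and it relies only on the file naming pattern given above, making no assumption about the format of the log entries.

```python
import re
from pathlib import Path

# File name pattern taken from this section: OCRServerIFilter.partN.log
LOG_NAME = re.compile(r"^OCRServerIFilter\.part(\d+)\.log$")


def list_log_parts(log_dir: str) -> list[Path]:
    """Return the IFilter log parts in log_dir, sorted by part number."""
    found = []
    for entry in Path(log_dir).iterdir():
        match = LOG_NAME.match(entry.name)
        if match and entry.is_file():
            found.append((int(match.group(1)), entry))
    found.sort(key=lambda item: item[0])
    return [path for _, path in found]
```

On the computer where IFilter is installed, the folder to pass in is the %ProgramData%\ABBYY FineReader Server 14 folder mentioned above.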
Information about critical errors in Microsoft Search IFilter is logged in the system event viewer (Start > Control Panel > Administrative Tools > Event Viewer > Application).
11/29/2022 5:26:42 PM