Setting up your classifier and loading images

For each classifier, you need to create a training batch. To create a training batch:

  1. Switch to the classification training batch view by doing one of the following:
    • clicking the button on the toolbar;
    • selecting Open Classifier Training Batches in the View or Classification Training menu;
    • using the Ctrl+Alt+Shift+B shortcut.
  2. Right-click inside the pane with the list of batches and click New Batch... on the shortcut menu.
  3. Specify a classification profile by selecting Change Classification Profile in the Classification Training menu or in the classifier batch shortcut menu.
  4. Specify the precision/recall balance by selecting Change Recall/Precision Priority in the Classification Training menu or in the classifier batch shortcut menu.
  5. Add training images to the batch by doing one of the following:
    • Clicking the Load Images from Folders button on the toolbar or selecting Load Images from Folders in the File menu. Then choose the folder that contains the subfolders with the appropriate images. Each subfolder should contain only images that belong to a single class.
    • Clicking Load Images... in the batch context menu or in the File menu.

Note: Image pre-processing is only available for pages that have been added to the batch using the Load Images... option. Images that have been added using Load Images from Folders are processed according to the Project properties.

Note: If you selected the Image or Combined classification profile, the training images should have the same color scheme (e.g. black-and-white) as the images that you want to classify.

  6. Assign a reference class to the images using one of the following methods:
    • By clicking Set Class... in the Classification Training menu (you can also right-click a document and then click Set Class... on the shortcut menu). You can use sections or section variants as reference classes. Select a reference class from the list or create a new one and link it with a section or a section variant. The reference class must have the same name as the corresponding Document Definition.
    • By subfolder name. If the images have been added to the batch using the Load Images from Folders feature, you can assign classes based on the image’s subfolder name. Select the appropriate images and click the button on the toolbar. Alternatively, select Classification Training → Set Class Based on Subfolder Name.

Note: If the class name is the same as the name of the corresponding Document Definition section, the two will be linked automatically when you click Autocorrect... in the Class Mapping dialog box. For more details, see Mapping classes to Document Definition.
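
The folder layout expected by Load Images from Folders (one subfolder per class) can be checked before loading. The following is a minimal Python sketch that runs outside the program and does not use any product API. The function name classes_from_subfolders and the list of image extensions are assumptions made for illustration only.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to the formats you actually use.
IMAGE_EXTENSIONS = {".tif", ".tiff", ".png", ".jpg", ".jpeg", ".bmp"}


def is_image(path):
    """Return True if the path is a file with one of the assumed extensions."""
    return path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS


def classes_from_subfolders(root):
    """Preview which class each image would get from its subfolder name.

    Returns (labeled, unlabeled):
      labeled   - {relative image path: subfolder name}
      unlabeled - images lying directly in root, which have no subfolder
                  and therefore no class name to derive.
    """
    root = Path(root)
    labeled = {}
    unlabeled = []
    for entry in sorted(root.iterdir()):
        if entry.is_dir():
            # Every image in this subfolder belongs to the class named after it.
            for image in sorted(entry.iterdir()):
                if is_image(image):
                    labeled[image.relative_to(root).as_posix()] = entry.name
        elif is_image(entry):
            unlabeled.append(entry.name)
    return labeled, unlabeled
```

Calling classes_from_subfolders on the top-level training folder returns the class preview together with any stray images that should be moved into a class subfolder before loading.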

Assigning reference classes based on classification results

If you have a large number of unsorted images that you are planning to use for classification training, you can use some of them to train your classifier and assign reference classes based on classification results to the rest:

  • Assign reference classes to some of the images manually using Set Class....
  • Launch the classification training (the Train button on the toolbar; alternatively, Train in the Classification Training menu or in the shortcut menu).
  • Select the remaining images without a reference class and launch the Classify feature by clicking the button found on the toolbar. Alternatively, you can select Classify in the Classification Training menu or in the context menu. A result class will be assigned to the images based on the trained classifier.
  • For the selected images, click Set Class Based on Classification Result in the Classification Training menu or in the shortcut menu. The images will then be assigned the same reference class as their result class. For images that were assigned an incorrect result class, you will have to set the reference class manually.
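
The logic of the procedure above can be illustrated with a short Python sketch. It is a toy model only: the single numeric feature per image, the nearest-mean classifier, and all function names are assumptions made for illustration and do not reflect how the program's classifier works internally.

```python
from collections import defaultdict


def train(labeled):
    """Toy stand-in for Train: compute the mean feature value of each class."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for feature, cls in labeled:
        sums[cls] += feature
        counts[cls] += 1
    return {cls: sums[cls] / counts[cls] for cls in sums}


def classify(model, feature):
    """Toy stand-in for Classify: pick the class with the nearest mean."""
    return min(model, key=lambda cls: abs(model[cls] - feature))


def assign_reference_classes(images, manual, corrections=None):
    """Assign a reference class to every image.

    images      - {image name: feature value} (hypothetical features)
    manual      - {image name: class} set by hand (Set Class...)
    corrections - {image name: class} for result classes found to be wrong
    """
    corrections = corrections or {}
    # Train on the manually labeled subset only.
    model = train([(images[name], cls) for name, cls in manual.items()])
    reference = dict(manual)
    for name, feature in images.items():
        if name in reference:
            continue
        result_class = classify(model, feature)
        # Adopt the result class unless a manual correction exists.
        reference[name] = corrections.get(name, result_class)
    return reference
```

The corrections argument models the last step: any image whose result class is wrong still needs its reference class set by hand.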

Note: If you want a classifier to use existing recognition results when matching Document Definitions, select the Prefer settings from batch type option on the Classification tab in the properties of the classifier training batch (batch shortcut menu → Properties…). The program then synchronizes the full-text recognition settings, so documents do not have to be recognized again, which considerably reduces the time required for classification.

01.12.2020 7:03:59

