Creating Document Definitions for documents that do not require automatic data extraction
A Document Definition without automatic data extraction may be useful in the following cases:
- When processing documents that must always be present in a document set but from which no data can or should be extracted, such as handwritten statements or notices.
- When you need to classify and sort documents without extracting their data.
In cases like these, you can create a Document Definition whose documents skip the analysis stage. This reduces the workload on the verification operator and speeds up processing.
To create a Document Definition without automatic field extraction, the following conditions must be met:
- Your documents must have no anchor identifiers (otherwise, they will be treated as fixed forms).
- There must be no FlexiLayout loaded.
- Field training must be disabled.
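The three conditions above can be read as a single yes/no check. The following is a minimal sketch of that logic only, not part of the product's API: the `DefinitionSettings` class, its fields, and the `skips_automatic_extraction` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class DefinitionSettings:
    """Hypothetical summary of a Document Definition's settings (illustrative only)."""
    anchor_count: int             # number of anchor identifiers defined
    flexilayout_loaded: bool      # True if any FlexiLayout is loaded
    field_training_enabled: bool  # True if field training is switched on


def skips_automatic_extraction(settings: DefinitionSettings) -> bool:
    """Return True only when all three conditions from the list above hold."""
    return (
        settings.anchor_count == 0
        and not settings.flexilayout_loaded
        and not settings.field_training_enabled
    )


# A definition with no anchors, no FlexiLayout, and training disabled qualifies.
classification_only = DefinitionSettings(0, False, False)
# A single remaining anchor is enough to disqualify the definition.
fixed_form = DefinitionSettings(1, False, False)
```

With these assumed names, `skips_automatic_extraction(classification_only)` evaluates to `True` and `skips_automatic_extraction(fixed_form)` to `False`, which mirrors the rule that every condition must be met at once.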
In the Document Definition Wizard, select Documents that do not require automatic data extraction. A Document Definition with only one section is created, and documents processed with this Document Definition skip the analysis stage.
Typically, documents of this kind do not require OCR. However, if you plan to use manual indexing or want your verification operators to be able to enter text by clicking it on the image, your documents must have a text layer. To add a text layer to your documents, create at least one field with the Can have region option selected.
Note: Alternatively, you can have a text layer added at the export stage by selecting the Create searchable PDF option.
Any existing Document Definition can be modified to prevent automatic field extraction: remove all anchors and FlexiLayouts, and disable field training.
12.04.2024 18:16:02