
Hypotheses for Table elements

The program looks for tables and divides them into columns and rows by relying on the Separators and White Gaps on the image. Table headers and footers are also used to facilitate table detection. Headers and footers serve as the top and bottom boundaries of the table body; no information is extracted from them. The header contains the names of the columns, which may be used to divide the table into columns. Once a table has been detected, the program formulates hypotheses for the entire table element and for its columns, rows, and cells.

A Table hypothesis has the following properties:

Property Description
Element name The full name of the element.
Page The number of the page on which the element was detected.
Surrounding rect The coordinates of the rectangle which surrounds the region of the hypothesis.
Width The width of the region of the hypothesis.
Height The height of the region of the hypothesis.
Header found Shows whether the table header has been found.
Footer found Shows whether the table footer has been found.
Body found Shows whether the table body has been found.
Order name The name of the detected order of columns in the table.
Detected Shows whether the object described by the element has been found (true) or whether a null hypothesis has been formulated (false).
From the best path Shows whether the found hypothesis belongs to the best path in the tree of hypotheses (true) or not (false).
Pre-search quality How well the hypothesis matches the properties of the element specified by the settings in the Properties dialog box and by the code in the Advanced pre-search relations field.
Post-search quality The quality of the hypothesis after the conditions in the Advanced post-search relations field have been applied.
Chain quality The quality of the chain of hypotheses, from the first subelement of the group to the current subelement. Chain quality is calculated by multiplying the qualities of all the subelements in the chain and is used to compare rival chains of hypotheses.
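
The Chain quality calculation described above can be illustrated with a short sketch. This is not the program's own code: the function name, the chain names, and the quality values are hypothetical, and the sketch assumes that each quality is a number between 0 and 1. It multiplies the qualities of the subelements in each chain and then compares two rival chains by the resulting product.

```python
from math import prod


def chain_quality(qualities):
    """Multiply the qualities of all subelements in a chain."""
    return prod(qualities)


# Hypothetical rival chains. Each list holds illustrative subelement
# qualities (assumed to lie between 0 and 1), ordered from the first
# subelement of the group to the current subelement.
rival_chains = {
    "chain_a": [0.9, 0.8, 1.0],
    "chain_b": [1.0, 0.5, 1.0],
}

# Compute the chain quality of every rival chain.
scores = {name: chain_quality(values) for name, values in rival_chains.items()}

# The chain with the highest product wins the comparison.
best = max(scores, key=scores.get)
print(best, scores[best])
```

Because the qualities are multiplied, a single low-quality subelement lowers the quality of the whole chain, which is why chain_b loses here even though two of its three subelements have the maximum quality.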

A Table header hypothesis has the following properties:

Property Description
Element name The full name of the element.
Page The number of the page on which the element was detected.
Surrounding rect The coordinates of the rectangle which surrounds the region of the hypothesis.
Width The width of the region of the hypothesis.
Height The height of the region of the hypothesis.
Column name list Shows the found table columns.
Detected Shows whether the object described by the element has been found (true) or whether a null hypothesis has been formulated (false).
From the best path Shows whether the found hypothesis belongs to the best path in the tree of hypotheses (true) or not (false).
Pre-search quality How well the hypothesis matches the properties of the element specified by the settings in the Properties dialog box and by the code in the Advanced pre-search relations field.
Post-search quality The quality of the hypothesis after the conditions in the Advanced post-search relations field have been applied.
Chain quality The quality of the chain of hypotheses, from the first subelement of the group to the current subelement. Chain quality is calculated by multiplying the qualities of all the subelements in the chain and is used to compare rival chains of hypotheses.

A Table footer hypothesis has the following properties:

Property Description
Element name The full name of the element.
Page The number of the page on which the element was detected.
Surrounding rect The coordinates of the rectangle which surrounds the region of the hypothesis.
Width The width of the region of the hypothesis.
Height The height of the region of the hypothesis.
Detected Shows whether the object described by the element has been found (true) or whether a null hypothesis has been formulated (false).
From the best path Shows whether the found hypothesis belongs to the best path in the tree of hypotheses (true) or not (false).
Pre-search quality How well the hypothesis matches the properties of the element specified by the settings in the Properties dialog box and by the code in the Advanced pre-search relations field.
Post-search quality The quality of the hypothesis after the conditions in the Advanced post-search relations field have been applied.
Chain quality The quality of the chain of hypotheses, from the first subelement of the group to the current subelement. Chain quality is calculated by multiplying the qualities of all the subelements in the chain and is used to compare rival chains of hypotheses.

A Table body hypothesis has the following properties:

Property Description
Element name The full name of the element.
Page The number of the page on which the element was detected.
Surrounding rect The coordinates of the rectangle which surrounds the region of the hypothesis.
Width The width of the region of the hypothesis.
Height The height of the region of the hypothesis.
Order name Shows the name of the found column order.
Found columns Shows the names of the found columns.
Rows number Shows the number of rows found in the table.
Detected Shows whether the object described by the element has been found (true) or whether a null hypothesis has been formulated (false).
From the best path Shows whether the found hypothesis belongs to the best path in the tree of hypotheses (true) or not (false).
Pre-search quality How well the hypothesis matches the properties of the element specified by the settings in the Properties dialog box and by the code in the Advanced pre-search relations field.
Post-search quality The quality of the hypothesis after the conditions in the Advanced post-search relations field have been applied.
Chain quality The quality of the chain of hypotheses, from the first subelement of the group to the current subelement. Chain quality is calculated by multiplying the qualities of all the subelements in the chain and is used to compare rival chains of hypotheses.

More:

Working with tables

Search area

Additional search constraints

15.09.2020 9:42:43

