Tables


The Document Template Designer window allows you to identify various types of tables and adapt the data extraction methods to them quickly and simply. Document tables can include tables that span multiple pages, horizontal repeating tables, and other complex tables. Tables that are identified by the system (with or without user involvement) are structured into UXML so that they are available for export to external applications.

 

The following table is an example of a simple table that you may encounter in documents. More complex examples are given in the next sections.

 

table raw

 

If automatic recognition is turned off, or if the system does not recognize a table, you can force table recognition by clicking the Table button in the Toolbox and painting the table area in the document. Once the area is selected, the Table Type window is displayed, allowing you to select recognition options and other advanced parsing options.

 

NOTE: To be able to select tables, as well as text boxes, graphs and anchors, in the Document Template Designer window at the second stage of the template design, the file type of the template you are designing must have the Enable Advanced Parsing option selected during the first stage (when designing the template's FRP). For more information, see section Creating a New File Recognition Template.

 

Add table

 

Depending on the table type and structure, you can select one of the following recognition and advanced parsing options:

 

Automatic - Automatically detects the table header. Tables with no header are also allowed. Such tables require an adjacent keyword or anchor to ensure reliable reconstruction.

Table- automatic

 

Force Header - Identifies the indicated number of rows in the table as the header. The header is explicitly searched for during the reconstruction phase, making this kind of table more independent of other objects in the pattern.

Table- Force Header

 

Separate by Header - Use this option to define the table width by the width of the header. Rows that run beyond the header width are cut. A header is required for such a table.

 

By Border - Use this option to indicate the table borders (limits) exactly as they appear in the document.

 

Horizontal Repeating - Indicates a series of tables with a horizontally repeating pattern. A header is required for such a table.

Table- Horizontal repeating
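
The cutting behavior described for Separate by Header can be pictured with a minimal sketch. This is not the product's parser: it assumes a hypothetical fixed-width text layout in which the header and rows are plain strings, and the function name clip_rows_to_header is invented for illustration only.

```python
from typing import List


def clip_rows_to_header(header: str, rows: List[str]) -> List[str]:
    """Cut every row to the horizontal span occupied by the header text.

    Hypothetical model: character positions stand in for page coordinates.
    """
    if not header.strip():
        # The option requires a header, so an empty one is rejected.
        raise ValueError("a header is required for this option")
    start = len(header) - len(header.lstrip())  # first header character
    end = len(header.rstrip())                  # one past the last header character
    # Anything left of `start` or right of `end` is cut from each row.
    return [row[start:end].rstrip() for row in rows]


header = "Item   Qty  Price"
rows = [
    "Bolt     4   1.20 see note",  # runs beyond the header width
    "Nut     10   0.15",
]
clipped = clip_rows_to_header(header, rows)
```

In this sketch the trailing " see note" text of the first row falls outside the header span and is dropped, while the second row is unchanged.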

 

Advanced options:

 

Search for similar - Used in cases where a table spans multiple pages and all sections have headers. In this case, the system searches for a similar header in the continuation ("spilled") part of the table.

Table - search for similar

 

Merge rows - When selected, the system identifies empty cells as being part of the cell above or below and merges them together. It is recommended to select this option even when the training table does not have empty cells as they may appear in other documents of the same type.  

Table- merge rows        

 

Merge columns - When selected, the system identifies inconsistencies in cell size and may correct them by merging columns together. It is recommended to select this option even if it is not necessary for the training table.

 

Table- merge columns

 

Maintain minimum fill ratio - Minimum Fill Ratio is the percentage of cells that are allowed to remain empty in a found table. Table headers are not taken into account.

 

Table- fill min

 

Horizontally bound - Use this option to prevent the parser, when reconstructing the table, from jumping over the table's vertical borders to other objects that might be placed next to it.

 

Maintain original cells - When this option is selected, the parser attempts to parse the table with the original structure (multi-row cells, and so on), thus reducing the chance that unnecessary rows/columns will be created. This option is relevant for documents that have visually separated borders, such as when there are black border lines in the table. This feature is turned on by default; however, it may be important to turn it off in cases when a table has a complex structure or does not have borders between columns.

 

Table - Maintain Original Cells
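
Two of the advanced options above, Merge rows and Maintain minimum fill ratio, can be illustrated with a short sketch. It is an assumption-laden model, not the product's implementation: the table is a hypothetical list of rows of strings, a row containing any empty cell is treated as a continuation of the row above (the real system may also merge downward), and the function names merge_rows and empty_cell_percentage are invented for illustration.

```python
from typing import List

Row = List[str]


def merge_rows(rows: List[Row]) -> List[Row]:
    """Fold rows that contain empty cells into the row above (assumed rule)."""
    merged: List[Row] = []
    for row in rows:
        has_empty = any(cell.strip() == "" for cell in row)
        if merged and has_empty:
            above = merged[-1]
            # Append non-empty continuation text to the cell above it.
            merged[-1] = [
                f"{a} {c}".strip() if c.strip() else a
                for a, c in zip(above, row)
            ]
        else:
            merged.append(list(row))
    return merged


def empty_cell_percentage(rows: List[Row], header_rows: int = 1) -> float:
    """Percentage of empty cells, ignoring the header rows."""
    cells = [cell for row in rows[header_rows:] for cell in row]
    if not cells:
        return 0.0
    empty = sum(1 for cell in cells if cell.strip() == "")
    return 100.0 * empty / len(cells)


table = [
    ["Item", "Description", "Qty"],   # header row, not counted
    ["A1", "Steel bolt,", "4"],
    ["", "zinc plated", ""],          # continuation of the row above
    ["B2", "Nut", "10"],
]
before = empty_cell_percentage(table)   # 2 of 9 body cells are empty
after_merge = merge_rows(table)
after = empty_cell_percentage(after_merge)
```

In this example, merging folds the continuation row into the "A1" row, so the share of empty body cells drops to zero. A found table could then be compared against whatever percentage is configured for the fill ratio option.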

 

After you select the parsing option and click OK, the table is processed and constructed in a new table grid. The table is highlighted in yellow and, if a header is recognized, the header is highlighted in a darker color. You can also extend the header to more than one row by right-clicking the table and adjusting the Header rows number.

 

Table created

Additional columns can be added to the table by clicking the + button in the top-left corner and dragging the column borders to adjust the column width. To delete rows, resize the table; rows that fall outside the table grid are deleted.