How to extract floating tables which do not have a fixed position?
The default table extraction tool of Docparser lets you define the horizontal positions of your table columns with only a couple of clicks. This method works well if your tables appear at more or less the same position in your documents (e.g. invoices or purchase orders). The tables can vary in vertical position and length, but they must keep the same horizontal position in the document for this method to work.
But what if tables are randomly placed somewhere inside a document?
If you want to parse floating tables with a variable horizontal and vertical position, you might want to try the following workaround:
As a first step, you need to isolate the text block that contains the table by using two text filters which define the start and end position of your table. The method of extracting data from variable positions inside a text is described in more detail in this article.
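To make the idea concrete, here is a minimal Python sketch of what isolating a text block between a start and an end marker looks like conceptually. It is only an illustration and not how Docparser implements its text filters; the function name `extract_block`, the marker strings and the sample text are all hypothetical.

```python
def extract_block(text, start_marker, end_marker):
    """Return the text between start_marker and end_marker.

    Hypothetical helper for illustration only. Returns None if
    either marker cannot be found.
    """
    start = text.find(start_marker)
    if start == -1:
        return None
    # Begin right after the start marker.
    start += len(start_marker)
    # Look for the end marker only after the start marker.
    end = text.find(end_marker, start)
    if end == -1:
        return None
    return text[start:end].strip("\n")


# Made-up document text: the table sits between a header line and a total line.
sample = (
    "Some introduction text\n"
    "Item Qty Price\n"
    "Widget    2    9.50\n"
    "Gadget    1    24.00\n"
    "Total due: 43.00\n"
)
block = extract_block(sample, "Item Qty Price\n", "Total due")
```

Because the block is located by its surrounding text rather than by coordinates, it does not matter where on the page the table appears, as long as the start and end markers are reliable.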
Once the text block containing the table is extracted, add the "Convert To Table Rows" filter via "Add Text Filter > More Filters > Convert To Table Rows". This filter offers a variety of options for splitting each line into multiple columns. Once you have obtained a first table representation of your data, you can add more table filter rules to continue splitting columns, group rows, remove unwanted rows, etc.
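As a rough illustration of what turning text lines into table rows means, the Python sketch below splits each line into columns wherever two or more whitespace characters occur in a row. This splitting rule is an assumption chosen for the example; the actual options of the "Convert To Table Rows" filter are configured inside Docparser and may differ. The function name `lines_to_rows` and the sample data are hypothetical.

```python
import re


def lines_to_rows(block, min_gap=2):
    """Split each non-empty line into columns on runs of whitespace.

    Hypothetical helper for illustration only. A run of at least
    `min_gap` whitespace characters is treated as a column separator,
    so single spaces inside a cell are preserved.
    """
    separator = re.compile(r"\s{%d,}" % min_gap)
    rows = []
    for line in block.splitlines():
        line = line.strip()
        if not line:
            # Skip blank lines (a simple form of removing unwanted rows).
            continue
        rows.append(separator.split(line))
    return rows


# Made-up table text, as it might look after the block has been isolated.
table_text = "Widget A    2    9.50\n\nGadget B    1    24.00"
rows = lines_to_rows(table_text)
# rows is [["Widget A", "2", "9.50"], ["Gadget B", "1", "24.00"]]
```

Further clean-up steps, such as splitting a column again or dropping rows, would then operate on this list of rows in the same way additional table filter rules refine the result in Docparser.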