How to parse repeating values based on anchor keywords?
Docparser offers various filters to extract repeating data sets from documents. This article describes how you can extract single repeating text values (e.g. names or street names) from a document that contains multiple data sets. This method assumes that an anchor keyword is located close to each data value that you want to extract.
A common case for this parsing filter would be a list of data entries, where each individual data field is located after a label. For example, data which looks like this:
Street: 123, High Street
Street: 456, Upper Street
Street: 789, Lower Street
In the first step of the parsing rule editor, choose the template "Repeating Text Values".
Click on "Continue" and enter an anchor keyword which is located next to the values you want to extract. The Repeating Text Values filter will then return each line which contains the keyword you entered. In the example above we could enter "Street:" as the anchor keyword.
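To make the matching step concrete, here is a minimal Python sketch of the idea: keep every line that contains the anchor keyword. It only illustrates the behaviour described above and is not Docparser's actual implementation; the function name `lines_with_anchor` and the variable names are hypothetical.

```python
from typing import List


def lines_with_anchor(text: str, anchor: str) -> List[str]:
    """Return every line of `text` that contains `anchor` (hypothetical helper)."""
    return [line.strip() for line in text.splitlines() if anchor in line]


# Sample data taken from the example in this article.
sample = (
    "Street: 123, High Street\n"
    "Street: 456, Upper Street\n"
    "Street: 789, Lower Street"
)

matches = lines_with_anchor(sample, "Street:")
for line in matches:
    print(line)
```

Note that the colon is part of the anchor keyword here. Without it, the word "Street" at the end of each address would match as well, which is harmless in this example but could pull in unwanted lines in a real document.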
Finally, you can add a horizontal offset value to move the position where the data extraction starts to the left or to the right. You can also add a row offset value in case the value you want to capture is not located on the same line as the anchor keyword. In addition, you can define after how many characters the data value ends.
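The three options can be pictured with a short Python sketch. This is an approximation under stated assumptions, not Docparser's implementation: the function `extract_values` and its parameters `col_offset`, `row_offset` and `max_chars` are hypothetical names, and the sketch assumes that extraction starts directly after the anchor keyword and that offsets are counted in characters and lines.

```python
from typing import List, Optional


def extract_values(
    text: str,
    anchor: str,
    col_offset: int = 0,              # assumed: shifts the start right (+) or left (-)
    row_offset: int = 0,              # assumed: 0 = same line, 1 = next line, ...
    max_chars: Optional[int] = None,  # assumed: None = read to the end of the line
) -> List[str]:
    """Collect one value per line that contains `anchor` (hypothetical helper)."""
    lines = text.splitlines()
    values = []
    for i, line in enumerate(lines):
        pos = line.find(anchor)
        if pos == -1:
            continue  # no anchor keyword on this line
        target = i + row_offset
        if not 0 <= target < len(lines):
            continue  # the row offset points outside the document
        # Start right after the anchor, shifted by the horizontal offset.
        start = max(0, pos + len(anchor) + col_offset)
        end = None if max_chars is None else start + max_chars
        values.append(lines[target][start:end].strip())
    return values


sample = (
    "Street: 123, High Street\n"
    "Street: 456, Upper Street\n"
    "Street: 789, Lower Street"
)

full_values = extract_values(sample, "Street:")
house_numbers = extract_values(sample, "Street:", max_chars=4)

# Label on one line, value on the next: use a row offset of 1 and shift
# the start back to the beginning of the line.
stacked = "Street:\n123, High Street\nStreet:\n456, Upper Street"
next_line_values = extract_values(stacked, "Street:", col_offset=-7, row_offset=1)
```

With the sample data, `full_values` holds the three complete addresses, `house_numbers` holds only the leading numbers because the value is cut off after four characters, and `next_line_values` shows how a row offset picks up values printed below their label.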