How to parse repeating values based on anchor keywords?

Docparser offers various filters to extract repeating data sets from documents. This article describes how to extract single repeating text values (e.g. names or street names) from a document that contains multiple data sets. The method assumes that an anchor keyword is located close to each data value you want to extract.

A common use case for this parsing filter is a list of data entries in which each data field follows a label. For example, data that looks like this:

Name: John
Street: 123, High Street

Name: Jane
Street: 456, Upper Street

Name: Mark
Street: 789, Lower Street

Alternative methods for parsing repeating text data are our table extraction tool and our filter for parsing repeating text blocks.

In the first step of the parsing rule editor, choose the template "Repeating Text Values".

Click on "Continue" and enter an anchor keyword that is located next to the values you want to extract. The Repeating Text Values filter will then return each line that contains the keyword you entered. In the example above, we could enter "Name:" as the anchor keyword.
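
To illustrate the idea outside of Docparser, here is a minimal Python sketch of anchor-based line matching. It is not Docparser's implementation; the function name `lines_with_anchor` and the `sample` text are assumptions made for this example only.

```python
def lines_with_anchor(text: str, anchor: str) -> list[str]:
    """Return every line of `text` that contains `anchor` (hypothetical helper)."""
    return [line.strip() for line in text.splitlines() if anchor in line]


# Sample data copied from the example earlier in this article.
sample = """Name: John
Street: 123, High Street

Name: Jane
Street: 456, Upper Street

Name: Mark
Street: 789, Lower Street"""

matches = lines_with_anchor(sample, "Name:")
print(matches)  # ['Name: John', 'Name: Jane', 'Name: Mark']
```

Each returned line still contains the anchor keyword itself. The offset options in the next step control which part of the line is kept.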

Finally, you can add a horizontal offset value to move the position where the data extraction starts to the left or to the right. You can also add a row offset value in case the value you want to capture is not located on the same line as the anchor keyword. In addition, you can define after how many characters the data value ends.
