What it is, what it isn’t, how it works, and how it impacts AP
In Part 1 of this two-part blog series, we began to uncover clues to solve the mystery of OCR, answering the questions: What is it? What isn’t it? How does it even work? And how does it make a difference in the AP workflow?
We’ve already uncovered the clue about what OCR is: A computer looking at an image or file and being able to identify what is on it.
And we learned not to confuse OCR with data extraction technology. (As a reminder, smart data extraction understands and processes the text from the OCR to transform it into relevant data.)
We also realized the importance of a P2P automation solution having complete technology, combining OCR, smart data extraction, and machine learning, because OCR by itself does not know what to do with the information it reads.
Finally, in Part 1, we introduced the three predominant types of extraction technology, with the first type—human verified or outsourced extraction—being our third clue.
In a template-based data extraction tool, a user has to predefine exactly “where” on a document a specific piece of information can be found and “what” the tool should do with the data it finds. The extraction itself can be fairly quick; however, it can become an administrative burden because templates must be managed and updated as documents change. Fourth Clue: In this scenario, humans have to constantly manage the templates: read them, interpret them, and update them. This might defeat the intent of transitioning to AP automation because you are not saving time; it might even be more time-intensive and, in some respects, duplicate effort.
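To make the “where” and “what” concrete, here is a minimal Python sketch of template-based extraction. Everything in it is hypothetical and for illustration only: the FieldRule class, the TEMPLATE list, the extract function, and the sample invoice text are assumptions, not part of any real product. It treats the OCR output as plain lines of text and reads each field from a fixed line and column range.

```python
from dataclasses import dataclass


@dataclass
class FieldRule:
    name: str   # "what": the target field for the value
    line: int   # "where": zero-based line in the OCR output
    start: int  # first column of the value
    end: int    # column just past the value


# Hypothetical template for one supplier's invoice layout.
TEMPLATE = [
    FieldRule("invoice_number", line=0, start=9, end=17),
    FieldRule("total_amount", line=2, start=7, end=14),
]


def extract(ocr_lines, template):
    """Read each field from the fixed position its rule points at."""
    result = {}
    for rule in template:
        text = ocr_lines[rule.line] if rule.line < len(ocr_lines) else ""
        result[rule.name] = text[rule.start:rule.end].strip()
    return result


# The layout the template was built for.
original = [
    "Invoice: INV-1001",
    "Date: 2018-06-29",
    "Total: 1250.00",
]

# The same supplier later adds a header line, shifting everything down.
changed = ["ACME Corp"] + original

print(extract(original, TEMPLATE))  # fields land where the template expects
print(extract(changed, TEMPLATE))   # same rules, now reading the wrong lines
```

With the original layout, both fields come out correctly. Once a single header line is added, the same rules read the wrong lines, and someone has to update the template by hand. That is the administrative burden in miniature.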
Data extraction using A.I. or machine learning is able to “understand” what information on a document needs to be used and, more importantly, what should be done with that information to make it relevant data. For example, technology utilizing machine learning is able to populate the Total Amount of an invoice without being taught or shown where to grab the data. Because the tool has seen thousands of examples, it is able to draw on past experience to reach conclusions. Smart!
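The idea of learning from examples instead of fixed positions can be illustrated with a deliberately tiny Python sketch. It is a toy, not real machine learning and not how any particular product works: the learn_total_labels and find_total functions and the sample invoices are all hypothetical. It simply counts which labels appeared next to the known total in past invoices, then uses those counts to find the total on an invoice with a layout it has never seen.

```python
from collections import Counter


def learn_total_labels(examples):
    """Count which label sat next to the known total in each past invoice."""
    counts = Counter()
    for lines, total in examples:
        for line in lines:
            label, sep, value = line.partition(":")
            if sep and value.strip() == total:
                counts[label.strip().lower()] += 1
    return counts


def find_total(lines, label_counts):
    """Return the value whose label was seen most often with a total, or None."""
    best_value, best_score = None, 0
    for line in lines:
        label, sep, value = line.partition(":")
        score = label_counts.get(label.strip().lower(), 0)
        if sep and score > best_score:
            best_value, best_score = value.strip(), score
    return best_value


# Hypothetical past invoices, each paired with its verified total.
examples = [
    (["Invoice: INV-1001", "Total: 1250.00"], "1250.00"),
    (["Ref: 77", "Subtotal: 90.00", "Amount Due: 99.00"], "99.00"),
    (["Total: 40.00", "Date: 2018-06-29"], "40.00"),
]
label_counts = learn_total_labels(examples)

# A new layout: the total is on a different line than in any example.
new_invoice = ["ACME Corp", "Date: 2018-07-01", "Ref: 12", "Amount Due: 310.50"]
print(find_total(new_invoice, label_counts))
```

Nobody told the sketch which line holds the total. It found the value because it had seen the “Amount Due” label paired with a total before. A real system draws on far more examples and far richer signals, but the principle of drawing on past experience is the same.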
Final Clue: When it comes to the Yooz AP automation solution, our smart data extraction technology leverages OCR to read information from scans or photos of paper invoices, or from invoice images received via email. It then interprets the information, extracts the relevant data, and applies it to the appropriate fields in the application, where it is reviewed and sent for approval. Finally, the data is exported to an ERP. If there are pieces of data that cannot be read or interpreted, Yooz learns over time how to extract those missing pieces. This is referred to as machine learning, and it is powered by A.I.
With constant enhancements, no end user ever has to teach the software. The staff transitions from manual data entry and third-party verification to simply reviewing the extracted data for accuracy. If there is a miss, the reviewer can click inside the Yooz application to quickly correct it and flag the miss. Utilizing machine learning optimizations, the system will become more intelligent over time, reducing the number of mistakes.
Today’s organizations are focused on speed, efficiency, and leveraging technology to solve business problems. When looking at options, take the time to first set your business goals and determine what challenges need to be solved. Then find a solution that solves as many or all of those critical needs.
Sure, you can ask, “Do you have OCR?” But don’t stop there. Keep digging until you have a complete understanding of each solution you are considering and, more importantly, what best suits your business.
1. Hewlett Packard Enterprise Development LP. 2018. Retrieved June 29, 2018, from https://dev.havenondemand.com/apis/ocrdocument#overview