When we published our blog series about solving the mystery of OCR, we gave a series of clues. The first clue was the official definition of optical character recognition (OCR): the mechanical or electronic conversion of images of typed, printed, or even handwritten text into machine-encoded text. The text can come from a scanned document, a photo of a document, a scene photo (for example, the text on signs and billboards in a landscape photo), or subtitle text superimposed on an image (such as from a television broadcast).
It looks like this [1]:
Simply put, it’s a computer looking at an image or file and identifying the text on it.
Our second clue informs the topic of this blog series: Don’t confuse OCR with smart data extraction.
OCR is a technology that turns a picture into words. The next layer, smart data extraction, understands and processes the text from the OCR to transform it into relevant data. In other words, it takes the words and does something with them. This is critical to know because OCR by itself does not know what to do with the information it reads. This is where the “smart” in smart data extraction comes in.
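To make the difference concrete, here is a minimal sketch in Python. It assumes the OCR step has already run and produced plain text; the sample invoice, the field names, and the patterns are all made up for illustration. Simple pattern matching stands in for the extraction layer here, which is far cruder than what a real smart data extraction product would do.

```python
import re

# Hypothetical OCR output: just words, with no meaning attached yet.
ocr_text = """ACME Supplies Inc.
Invoice No: INV-10423
Invoice Date: 06/29/2018
Total Due: $1,250.00"""

# Illustrative patterns only (an assumption for this sketch);
# real invoices vary far more than these three rules can handle.
FIELD_PATTERNS = {
    "invoice_number": r"Invoice No:\s*(\S+)",
    "invoice_date": r"Invoice Date:\s*(\d{2}/\d{2}/\d{4})",
    "total_due": r"Total Due:\s*\$([\d,]+\.\d{2})",
}


def extract_fields(text):
    """Map raw OCR text to named fields; a field that is not found is None."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text)
        fields[name] = match.group(1) if match else None
    return fields


invoice = extract_fields(ocr_text)
print(invoice)
```

The OCR text on its own is only a string of characters. It becomes usable data once each value is tied to a field such as an invoice number or a total due.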
Some accounts payable (AP) automation solution providers might claim to use OCR technology. Be careful and dig deeper. Many rely on human, or manual, extraction that is outsourced to a third party, also called third-party verification. OCR extraction that layers human verification uses people to put the data read by the OCR into predefined fields. In this scenario, data entry is done by an outsourced firm, and because people are populating the data, it takes time: typically 24 to 72 business hours. Kind of defeats the purpose of moving from a manual AP process to an automated process to save time, right?
What you really want to know when investigating invoice and payment processing automation solution providers is whether the solution offers a complete technology that combines three things: OCR (converting images to text), smart data extraction (transforming the text into relevant data), and machine learning (remembering the data and populating it into the applicable data fields each time the data is recognized).
In other words, you are looking for a system that gets smarter the more you use it!
In Part 2 of this blog series, you’ll learn more about how smart data extraction puts the “smart” in AP automation.
[1] Hewlett Packard Enterprise Development LP. 2018. Retrieved June 29, 2018, from https://dev.havenondemand.com/apis/ocrdocument#overview