Monday, October 14, 2013

Data validation and the quality of information obtained in capture processes

When we talk about capture software, functionalities such as document classification and data extraction are usually the star features. That makes sense, given that they’re the two functionalities which allow businesses to obtain information which would otherwise be inaccessible in their documents.

The precision of the results of these processes depends on a number of factors beyond the power and quality of the software, such as the quality of the documents being processed. In many cases, these documents are images that have been scanned from photocopies of photocopies, and their quality is so poor that even the human eye has trouble reading the information. Under these conditions, machines and existing technologies can’t do much more than the human eye can. Missing or imprecise information means that the systems using this data end up working with errors. In the case of invoices, for example, if the extracted data is incorrect (let’s suppose that the extracted invoice total is €500, when it should really be €600), our accounting software is going to process an incorrect amount. That’s where data validation, whether manual or automatic, becomes important.

Validating the information obtained by the capture software is one way of guaranteeing the quality of the information before sending it on to feed other systems. 

Data Validation Options for Capture Software

  • Notification for documents in which data extraction or classification falls below a set confidence threshold: In other words, if the system isn’t 99% sure about the extraction or classification of a document, it will alert the user.
  • Help with previewing the document: Being able to zoom in on the document as we’re checking it helps us to locate and identify data in scanned images. 
  • Manual validation: ability for users to correct incorrect data obtained by the system. 
  • Automatic validation: This option allows the capture software to connect to other systems and databases that hold information which can corroborate the extracted data. Let’s say that the name of a patient has been extracted from a clinical report: the system can search for the patient’s name in the hospital’s computer system, checking that the record exists and checking other associated data, such as the patient’s social security number.
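
To make the first and last options more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how Athento or any particular capture product works: the `ExtractedField` class, the `needs_review` and `validate_patient` functions, and the in-memory `PATIENT_REGISTRY` (standing in for the hospital's computer system) are all hypothetical names, and the sample patient data is made up.

```python
from dataclasses import dataclass

# The 99% level used in the example above.
CONFIDENCE_THRESHOLD = 0.99


@dataclass
class ExtractedField:
    """One value produced by a (hypothetical) capture engine."""
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the engine


# Hypothetical stand-in for the hospital's system:
# patient name -> social security number. Sample data only.
PATIENT_REGISTRY = {
    "Jane Doe": "123-45-6789",
}


def needs_review(field, threshold=CONFIDENCE_THRESHOLD):
    """Return True when the field should be flagged for a human to check."""
    return field.confidence < threshold


def validate_patient(name, ssn, registry=PATIENT_REGISTRY):
    """Cross-check an extracted name and SSN against the registry."""
    expected_ssn = registry.get(name)
    if expected_ssn is None:
        return "unknown patient"
    if expected_ssn != ssn:
        return "ssn mismatch"
    return "ok"


if __name__ == "__main__":
    field = ExtractedField("patient_name", "Jane Doe", confidence=0.97)
    if needs_review(field):
        print(f"Please check '{field.name}': confidence {field.confidence:.0%}")
    print(validate_patient("Jane Doe", "123-45-6789"))
```

In a real deployment the registry lookup would be a query against the external system, and any result other than "ok" would typically send the document to manual validation.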
In the case of Athento version 2.0, the system provides validation views for processed documents. These views allow users to correct wrong data or fill in data that could not be extracted. What’s more, the system lets users inspect the document with a zoom tool (a magnifying glass), so that those responsible for validation can see the data more clearly.

