Determining OCR and ICR recognition rates

Our imaging columnist takes a look at how meaningful OCR and ICR recognition rates really are.

By Peter Webb

Last month's imaging column prompted howls of outrage from the industry. The following was typical: "How can you possibly say that vendor claims for scanner throughputs are particularly misleading? These guys are saints compared to the OCR/ICR industry!" Scanner throughputs may be exaggerated, but in a lot of cases optical character recognition (OCR) and intelligent character recognition (ICR) recognition rates are actually completely meaningless. So, naturally, this month's column will discuss OCR and ICR rates.

First, a few definitions. We will follow general industry usage and use OCR to refer to reading machine print, and ICR to refer to reading handwriting or hand print.

OCR of full pages basically works. If you have large slabs of laser-printed material that you want turned into text, OCR will do it for you. Different systems handle headings, diagrams, funny fonts, columns, signatures, footnotes, and annotations differently, and you can expect these real-world complications to cause problems. Of course, just because you can OCR a laser-printed paper document doesn't always mean that it's the best approach - if you own the copyright (you do own the copyright, don't you?) it may be easier to get the author to email you the original file.

GOING ONLINE

OCR of forms-based information generally works. But who fills in forms with a typewriter these days? These applications all seem to be moving onto the Internet, and the technology gets less interesting every day.

ICR of full pages of information - reading customers' handwritten correspondence - doesn't really work. Please, don't email or ring to tell me about some new breakthrough coming out of the lab which reads unconstrained handwriting. I have heard about this breakthrough about once a month for the last 20 years, and I am just a little skeptical.

ICR of forms-based information is the remaining category. A lot of this stuff is moving to the Web (particularly in banking and finance), but it is still a big application. ICR is possible and practical for some niche applications.

ICR and OCR vendors love to talk about the technology used in their devices - "neural net based", "polynomial analysis", and "feature extraction". None of this has any real practical impact upon the user or purchaser. The important thing (the only thing) is how well it works.

HIT AND MISS

In considering OCR/ICR, the most important characteristic is the recognition rate. Vendors are fond of quoting rates like "98 per cent" or "99 per cent" accuracy. But on which characters, scanned at what resolution, and in whose handwriting? It's like a bow-and-arrow manufacturer proudly advertising that they can hit the bull's-eye 98 per cent of the time, with fine print at the bottom reading "actual results will depend upon the size of the target and how far away you stand".

Even the assumption that there are two possible results from OCR/ICR - a correct read or an incorrect read - is usually wrong. Most devices allow a confidence level to be set, such that if the OCR/ICR reader is unsure it flags the character for manual entry. Setting the confidence too high just means that everything gets flagged for manual entry, and the unit isn't doing anything. Setting the confidence too low is like walking up to your least accurate data entry person and saying "I know you can't reliably tell 4's from 6's, and 1's from 7's, and you think my name is Pefer VVebb, but that doesn't matter - if you're not sure about some character, just guess".
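
If you want to see what that knob actually does, a few lines of Python are enough. This is a minimal sketch with made-up characters and confidence scores - no real ICR engine is involved - showing how the threshold splits reads between automatic acceptance and manual keying:

```python
def route_characters(reads, threshold):
    """Split (character, confidence) pairs into auto-accepted and flagged."""
    accepted = [r for r in reads if r[1] >= threshold]
    flagged = [r for r in reads if r[1] < threshold]
    return accepted, flagged

# Hypothetical engine output: the character guess plus a confidence score.
reads = [("4", 0.99), ("6", 0.62), ("1", 0.95), ("7", 0.58), ("3", 0.90)]

for threshold in (0.5, 0.8, 0.99):
    accepted, flagged = route_characters(reads, threshold)
    print(f"threshold {threshold}: {len(accepted)} auto-accepted, "
          f"{len(flagged)} flagged for manual entry")
```

Set the threshold at 0.99 and four of the five characters go to manual entry; set it at 0.5 and everything sails through, guesses included.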

So the recognition rate of ICR units depends upon the acceptable trade-off between correct reads, outright mistakes, and characters flagged for manual processing.

But let's, for the sake of analysis, assume that this wonderful new ICR reader is 99 per cent accurate - it will read 99 per cent of characters correctly and only one per cent incorrectly. Pretty good, only one per cent incorrect. Can't we just do that one per cent manually? Well, no, because we don't know which one per cent they are, and we would have to rekey everything to find the errors. Can we live with a one per cent error rate?

If the customer number on the withdrawal slip is 10 characters long, and there is a one per cent chance of an error on each character, there is roughly a 10 per cent chance of reading at least one character of the account number incorrectly (check digits can help here, of course). If the withdrawal amount is five digits long, there is about a five per cent chance we will make the cheque out for the wrong amount. And if the name and street address run to 50-odd characters, there is about a 40 per cent chance of sending it to the wrong place. Combine all of these, and it's clear that any ICR process that needs to capture more than a handful of characters is virtually guaranteed to make at least one error on every form.
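
You can check the arithmetic yourself. Here is a short Python sketch, assuming a flat one per cent per-character error rate and independent errors (real engines make correlated mistakes on similar shapes, so if anything this flatters them):

```python
def field_error_rate(per_char_error, field_length):
    """Chance of at least one misread character in a field."""
    return 1 - (1 - per_char_error) ** field_length

for label, length in [("10-character account number", 10),
                      ("5-digit amount", 5),
                      ("50-character name and address", 50)]:
    print(f"{label}: {field_error_rate(0.01, length):.1%} chance of an error")
```

This prints roughly 9.6, 4.9 and 39.5 per cent - the 10, 5 and 40 per cent figures above.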

So ICR is still very much a niche technology. It's a bit like having a work experience person in. You find work which is really simple to do, and where mistakes don't particularly matter. That's why full page OCR to load full text databases is quite successful. If you want to know the company policy concerning employees being hurt by large herbivores falling from a great height, then OCR might allow you to find documents mentioning the word "rhinoceros" or "hippopotamus" more easily. The occasional mistake - misreading a word as "rhihocerous" - doesn't matter unless it's the only large airborne herbivore mentioned on the page.

And in the meantime, you might want to be nice to your data entry staff. I suspect they will be with us for a long time yet ...

Peter Webb is a principal consultant with Opticon. He can be contacted on 0413-737509 or at pwebb@opticon-aust.com.au.
