Understanding the Basics of Optical Character Recognition

Optical character recognition (OCR) is a computer process that converts an image of text, such as a scanned document, into machine-readable text. A device scans the document and saves the scan as an image file; OCR software then converts the text in that image into characters a computer can read and process. OCR is fast and is widely used for document scanning.

This ability to quickly scan and verify the contents of documents makes OCR effective for authenticating ID documents. Security personnel use scanners equipped with OCR technology to scan driver's licences, international passports, work ID cards, and other identity documents.

How Optical Character Recognition Works

Devices and software applications that support optical character recognition work in the four steps outlined below:

  1. Image capture

The scanning device captures an image of the document to be authenticated, and the OCR software analyses that image. The software classifies the dark areas of the image as text and the light areas as background.
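The dark-versus-light classification can be illustrated with a simple brightness threshold. This is only a minimal sketch: the function name `binarize`, the threshold of 128, and the tiny sample "scan" are assumptions made for illustration, and real OCR software typically chooses its threshold adaptively.

```python
# Hypothetical helper: split a greyscale image into text and background pixels.
def binarize(gray, threshold=128):
    """Return a grid of 1s (dark, treated as text) and 0s (light, background).

    `gray` is a list of rows; each value is a brightness from 0 (black)
    to 255 (white). The threshold of 128 is an illustrative assumption.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


# A tiny 3x4 "scan": one dark vertical stroke on a light page.
scan = [
    [250, 30, 245, 240],
    [248, 25, 250, 242],
    [251, 20, 247, 239],
]

binary = binarize(scan)
for row in binary:
    # Show text pixels as '#' and background pixels as '.'
    print("".join("#" if pixel else "." for pixel in row))
```

Every later step works on this two-level image rather than on the original scan.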

  2. Pre-processing

In this step, the OCR software cleans the scanned image and removes defects so that the text can be read accurately. It can do this in several ways, such as rotating the image slightly to straighten it (deskewing), smoothing the edges of the text, removing stray digital spots (despeckling), and cleaning up lines and boxes in the scanned image.
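As one example of this clean-up, the sketch below removes isolated spots from a black-and-white image. It assumes the image is a grid of 1s (dark) and 0s (light); the function name `despeckle` and the rule "a dark pixel with no dark neighbours is noise" are illustrative simplifications, not how any particular OCR product works.

```python
def despeckle(binary):
    """Clear dark pixels (1) that have no dark pixel among their 8 neighbours."""
    height, width = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]  # copy so the input is left untouched
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1:
                continue
            # Look at the surrounding pixels, clipped at the image border.
            has_dark_neighbour = any(
                binary[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_dark_neighbour:
                cleaned[y][x] = 0  # an isolated spot is treated as noise
    return cleaned


# A vertical stroke plus one stray spot at the right-hand edge.
noisy = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 1, 0, 0, 0],
]

clean = despeckle(noisy)
for row in clean:
    print("".join("#" if pixel else "." for pixel in row))
```

The stroke survives because each of its pixels touches another dark pixel, while the stray spot is cleared.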

  3. Text recognition

OCR software stores numbers, letters, and other characters as reference symbols (called glyphs) in a database, which may be hosted in the cloud. It recognizes text on ID documents using two main methods: pattern matching and feature extraction.

Pattern matching isolates each character image (glyph) on the ID document and compares it with the glyphs stored in the software’s database to find a match. For this method to work, the stored glyph must be in the same font and at a similar scale as the glyph obtained from the scanned image. The method is therefore effective only on ID documents printed in fonts the software already knows.
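A toy version of pattern matching is sketched below. The three 3x5 templates, the names `TEMPLATES`, `similarity`, and `match_glyph`, and the pixel-agreement score are all assumptions chosen for illustration; they stand in for the much larger glyph database a real system would use.

```python
# Hypothetical stored glyphs: '#' is ink, '.' is background.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def similarity(glyph, template):
    """Fraction of pixels that agree between two bitmaps of the same size."""
    total = sum(len(row) for row in template)
    same = sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )
    return same / total


def match_glyph(glyph):
    """Return the (character, score) pair of the best-matching template."""
    scores = ((char, similarity(glyph, tpl)) for char, tpl in TEMPLATES.items())
    return max(scores, key=lambda pair: pair[1])


# A scanned "T" with one noisy pixel in the fourth row.
scanned = ["###", ".#.", ".#.", "##.", ".#."]

best_char, score = match_glyph(scanned)
print(best_char, round(score, 2))
```

Because the comparison is pixel by pixel, a glyph in a different font or size would score poorly against every template, which is the limitation described above.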

Feature extraction breaks each glyph obtained from an ID document down further into features such as lines, line intersections, line directions, and closed loops. The software then uses these features to find the closest-matching glyph in its database.
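To make one of these features concrete, the sketch below counts the closed loops in a glyph, which is enough to tell a "1" (no loops) from a "0" (one loop) or an "8" (two loops). The function name `count_closed_loops` and the small bitmaps are assumptions for illustration; a real system would combine many features, not just this one.

```python
def count_closed_loops(glyph):
    """Count background regions ('.') that are fully enclosed by ink ('#')."""
    height, width = len(glyph), len(glyph[0])
    seen = [[False] * width for _ in range(height)]

    def flood(start_y, start_x):
        # Mark every background pixel reachable from the start (4-connected).
        stack = [(start_y, start_x)]
        seen[start_y][start_x] = True
        while stack:
            y, x = stack.pop()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                inside = 0 <= ny < height and 0 <= nx < width
                if inside and not seen[ny][nx] and glyph[ny][nx] == ".":
                    seen[ny][nx] = True
                    stack.append((ny, nx))

    # Background touching the border is "outside" the glyph, not a loop.
    for y in range(height):
        for x in range(width):
            on_border = y in (0, height - 1) or x in (0, width - 1)
            if on_border and glyph[y][x] == "." and not seen[y][x]:
                flood(y, x)

    # Each remaining unvisited background region is one enclosed loop.
    loops = 0
    for y in range(height):
        for x in range(width):
            if glyph[y][x] == "." and not seen[y][x]:
                loops += 1
                flood(y, x)
    return loops


ONE = [".#.", "##.", ".#.", ".#.", "###"]
ZERO = ["###", "#.#", "#.#", "#.#", "###"]
EIGHT = ["###", "#.#", "###", "#.#", "###"]

for name, glyph in (("1", ONE), ("0", ZERO), ("8", EIGHT)):
    print(name, count_closed_loops(glyph))
```

Unlike pixel-by-pixel comparison, a loop count stays the same when the font or size of the character changes, which is why feature extraction copes better with unfamiliar fonts.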

  4. Post-processing

Finally, the software saves the extracted text as a computer file. In some cases, the OCR-powered device also makes the before-and-after versions of the scanned document available for inspection.

Endnote

OCR technology is valuable in document scanning because it is efficient and quick. A device equipped with it can recognize someone’s name, date of birth, and other biodata on an ID document and compare them with the information stored in a database in seconds to find a match.

This shortens lines at security checkpoints and allows people to be processed faster. It also increases accuracy and reduces the number of personnel needed at a checkpoint, since less manual input is required. This has encouraged banks, airports, offices, healthcare facilities, logistics companies, and many other businesses to adopt it.
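The database comparison mentioned above can be sketched in a few lines. The field names (`name`, `date_of_birth`), the sample record, and the helpers `normalise` and `fields_match` are hypothetical; a production system would also need to tolerate OCR misreads and differing date formats.

```python
def normalise(value):
    """Collapse repeated whitespace and ignore letter case."""
    return " ".join(value.split()).upper()


def fields_match(extracted, record, fields=("name", "date_of_birth")):
    """Return True only if every listed field is present in both and agrees."""
    return all(
        field in extracted
        and field in record
        and normalise(extracted[field]) == normalise(record[field])
        for field in fields
    )


# Text as it might come out of an OCR pass, with untidy spacing and case.
extracted = {"name": "  jane   doe ", "date_of_birth": "1990-04-12"}
# Hypothetical record held in the checkpoint's database.
record = {"name": "Jane Doe", "date_of_birth": "1990-04-12"}

print(fields_match(extracted, record))
```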

Author

Dave

Apps UK
International House
12 Constance Street
London, E16 2DQ