The Samsung SDS R&D Center Vietnam launched an AI-based OCR solution on CMC Cloud (C.OPE2N,

What is an AI-based OCR solution?

Optical Character Recognition (OCR) is software that converts images of text (printed or handwritten text captured by a scanner or mobile device) into editable documents. In particular, OCR can digitize many kinds of unstructured documents, such as invoices, passports, and business cards. Since its inception, OCR technology has helped many businesses accelerate their digital transformation, making better use of human resources and reducing operating costs.

Samsung SDS Vietnam has developed AI-based OCR software and deployed it on the CMC Cloud platform (C.OPE2N). The key features and functions of the software are described in the following sections.

Key features of the AI-based OCR solution on CMC Cloud

The AI-based OCR solution from SDSRV (Samsung SDS R&D Center Vietnam) focuses on the financial and banking sectors.

Handwritten text recognition: the AI-based OCR solution recognizes both printed and handwritten text with high accuracy: up to 99% for printed characters, 95% for handwritten numeric fields (such as dates, phone numbers, and identity card numbers), and 85% for other handwriting such as full names and addresses.

Key information extraction (KIE): the AI-based OCR solution identifies the meaning of each line of data, which makes it easy to extract the key information from a document and integrate it with the customer's existing database system.
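
To make the idea of key information extraction concrete, here is a minimal sketch in Python. It is not the SDSRV implementation, which is AI-based and not described in this article; it swaps in simple regular expressions to show the same input and output shape: recognized text lines go in, a record of named fields comes out. The sample lines, the field names, and the function extract_key_information are all assumptions made for illustration.

```python
import re

# Hypothetical OCR output: one recognized text line per list entry.
ocr_lines = [
    "Full name: NGUYEN VAN A",
    "Date of birth: 01/02/1990",
    "Phone number: 0912345678",
]

# Hypothetical label patterns. A real KIE model learns the meaning of each
# line instead of relying on hand-written rules like these.
FIELD_PATTERNS = {
    "full_name": re.compile(r"^Full name:\s*(.+)$", re.IGNORECASE),
    "date_of_birth": re.compile(r"^Date of birth:\s*(\d{2}/\d{2}/\d{4})$", re.IGNORECASE),
    "phone_number": re.compile(r"^Phone number:\s*(\d{9,11})$", re.IGNORECASE),
}


def extract_key_information(lines):
    """Return a dict of field name -> value for every line that matches a pattern."""
    record = {}
    for line in lines:
        for field, pattern in FIELD_PATTERNS.items():
            match = pattern.match(line.strip())
            if match:
                record[field] = match.group(1).strip()
                break  # each line contributes at most one field
    return record


record = extract_key_information(ocr_lines)
print(record)
```

A record in this shape (field name mapped to value) is what makes integration with an existing database straightforward, since each key can be mapped to a column.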

Document classification: the AI-based OCR solution classifies various types of documents with high accuracy. For example, a bank or insurance company can use this feature to build a system that automatically classifies any type of unstructured data.

Functions of the OCR solution on CMC Cloud

Function 1: Extract information from general papers (Basic OCR)

Given an image, the OCR solution extracts all of the text into a txt file. Users do not need to retype the content of the image manually; instead, they can copy or edit the returned results.

Function 2: Extract information from identity cards

From a color image of a Vietnamese identity card, the extracted information can be saved as a CSV file. Notably, this function supports any type of ID card, such as paper cards and electronic chip cards.

Key information includes: ID Number, Full Name, Date of Birth, Hometown, Address, Gender, Nationality, Expiry Date, Place of Issue.
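
As an illustration of what saving the extracted fields as CSV might look like, the sketch below writes one record with Python's standard csv module. The article does not specify the actual column names or layout of the exported file, so the snake_case keys, the function id_card_to_csv, and the sample values are all assumptions.

```python
import csv
import io

# Column names derived from the field list above; the exact names and order
# used by the real export are not documented here, so these are assumptions.
ID_CARD_FIELDS = [
    "id_number", "full_name", "date_of_birth", "hometown", "address",
    "gender", "nationality", "expiry_date", "place_of_issue",
]


def id_card_to_csv(record):
    """Serialize one extracted ID card record to CSV text (header plus one row)."""
    buffer = io.StringIO()
    # restval="" leaves a field empty when the extraction did not find it.
    writer = csv.DictWriter(buffer, fieldnames=ID_CARD_FIELDS, restval="")
    writer.writeheader()
    writer.writerow(record)
    return buffer.getvalue()


# Made-up sample values, for illustration only.
sample = {
    "id_number": "001234567890",
    "full_name": "NGUYEN VAN A",
    "date_of_birth": "01/02/1990",
    "address": "12 Example Street, Ha Noi",
    "nationality": "Viet Nam",
}

csv_text = id_card_to_csv(sample)
print(csv_text)
```

Keeping every value as a string preserves leading zeros in the ID number, and the csv module quotes values that contain commas, such as the address.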

Function 3: Extract information from driver's licenses

From a color image of a Vietnamese driver’s license, the extracted information can be saved as a CSV file.

Information fields include: ID number, Full name, Date of birth, Address, Nationality, Rank (license class), Expiry date, Date range.

Function 4: Extract information from invoices

From a scanned image of a one-page Vietnamese VAT invoice, the extracted information can be saved as a CSV file.

Information fields include: Form number, Serial, Invoice number, Date of issue, Seller company name, Tax code of the seller, Address of the seller, Phone number of the seller, Buyer company name, Address of the buyer, Buyer's tax code, VAT rate, Total amount, Total amount in words.

Function 5: Text extraction in template

If users have a form and want to extract text from specific areas of it, this function allows them to:

- Define a template by selecting a standard image and text areas to extract. The defined templates can be saved and used later.
- Define anchors for matching an input image with a template.
- Extract information from an image according to a defined template.
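
The three steps above can be sketched as a small data structure. This is only an illustration of the template idea, not the product's actual format or matching method, neither of which is described here. It assumes a single anchor point and a pure translation between the standard image and the input image; real anchor matching would likely also need to handle rotation and scale. The names Box, Template, and locate_regions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A rectangular text area, in pixels."""
    left: int
    top: int
    width: int
    height: int

    def shifted(self, dx, dy):
        """Return the same box moved by (dx, dy)."""
        return Box(self.left + dx, self.top + dy, self.width, self.height)


@dataclass(frozen=True)
class Template:
    name: str
    anchor: tuple   # (x, y) of the anchor in the standard image
    regions: dict   # field name -> Box in the standard image


def locate_regions(template, anchor_in_input):
    """Shift every template region by the anchor's offset in the input image."""
    dx = anchor_in_input[0] - template.anchor[0]
    dy = anchor_in_input[1] - template.anchor[1]
    return {field: box.shifted(dx, dy) for field, box in template.regions.items()}


# Step 1: define a template on a standard image (coordinates are made up).
template = Template(
    name="registration_form",
    anchor=(100, 50),  # Step 2: the anchor used to match input images
    regions={
        "full_name": Box(220, 140, 400, 40),
        "phone": Box(220, 200, 300, 40),
    },
)

# Step 3: suppose the anchor was found at (112, 45) in a new scan; the
# shifted boxes are the areas that would then be passed to text recognition.
located = locate_regions(template, (112, 45))
print(located)
```

Because the template stores only coordinates relative to the standard image, it can be saved once and reused for every input image of the same form.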

Example 1: Creating a template

Example 2: Extracting information

Give it a try

After a simple account registration, you can try the OCR solution for free by visiting the following link: