[GUIDE] Bank card recognition achieved by HUAWEI ML Kit

By Freemind R, Official Huawei Rep on 21st May 2020, 10:05 AM
1 About This Document
For an overview of the machine learning services, see the introduction on the Huawei Developer website.

Huawei HMS divides its machine learning services into four major categories: text-related, language-related, image-related, and face/body-related services. The text-related category includes text recognition, document recognition, bank card recognition, and general card recognition. What are the differences and associations between these sub-services? I will try to explain.

2 Application Scenario Differences
Text service SDKs are classified into device APIs and cloud APIs. Device APIs process and analyze data only on the device and use the device's computing resources, such as the CPU and GPU. Cloud APIs send data to the cloud and use server resources there for processing and analysis. All the services have device-side APIs except the document recognition service, which needs to process a large amount of data and therefore runs on the cloud. To narrow the scope of the analysis, this document describes only the device-side APIs.

2.1 Scenario Comparison

2.1.1 Text recognition: This is the all-rounder. As long as the content is text, it can be recognized.

Text OCR application scenarios

Text OCR does not provide a UI. The UI is implemented by developers.

2.1.2 Bank card recognition: This is the specialist, like a student who excels in only one subject.

A default alignment box is provided for bank cards. You can quickly extract the bank card number by aligning the card with the box.

Bank card identification

2.1.3 General card recognition: This sits between the two services above, with some specialization in one field. It can extract text from any card, and it provides a card alignment box that prompts users to align the card to be recognized.

2.2 How to Choose
Use bank card recognition for bank cards, general card recognition for other types of cards, and text recognition for all other scenarios.

3 Service Integration Differences
Compilation Dependency Differences
To make the comparison easier to follow, here are a few concepts first:

Basic SDK: The APIs provided for developers. All APIs are exposed through the basic SDK.

Plug-in: The calibration box mentioned in the scenario comparison above. It provides an interface that checks the input quality of each image frame and, if the frame does not meet the requirements, prompts the user to reposition the card.

Model package: This is the core of the HMS ML Kit services. It contains the interpreter model files generated by training on a large number of samples on a machine learning platform.

Compilation Dependency Summary

To summarize the compilation dependencies, every service needs to integrate its basic SDK and a model package. Bank card recognition and general card recognition additionally have plug-ins, which are the calibration boxes mentioned above. As for models, bank card recognition uses a dedicated model package, while general card recognition and text recognition use a general model package.
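
As a rough illustration, the summary above can be written down as a small Java enum. This is only a sketch for comparison purposes: TextService, Model, and components() are names made up here and are not part of the HMS ML Kit SDK, and the strings are plain descriptions rather than real artifact names.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: these names are hypothetical, not HMS ML Kit APIs.
enum TextService {
    TEXT_RECOGNITION(false, Model.GENERAL),
    GENERAL_CARD_RECOGNITION(true, Model.GENERAL),
    BANK_CARD_RECOGNITION(true, Model.DEDICATED_BANK_CARD);

    enum Model { GENERAL, DEDICATED_BANK_CARD }

    final boolean needsPlugin; // true if the service has a calibration-box plug-in
    final Model model;         // which kind of model package the service uses

    TextService(boolean needsPlugin, Model model) {
        this.needsPlugin = needsPlugin;
        this.model = model;
    }

    /** Lists what has to be integrated for this service, as plain descriptions. */
    List<String> components() {
        List<String> parts = new ArrayList<>();
        parts.add("basic SDK"); // every service needs its basic SDK
        if (needsPlugin) {
            parts.add("plug-in");
        }
        parts.add(model == Model.GENERAL
                ? "general model package"
                : "dedicated bank card model package");
        return parts;
    }
}
```

For example, TextService.BANK_CARD_RECOGNITION.components() lists the basic SDK, the plug-in, and the dedicated model package, while text recognition and general card recognition report the same Model value.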

Development Differences

First, let's see how to integrate the services. The detailed steps are not described here. You can view the development steps of the corresponding services on Huawei Developers.
Text recognition
Create an analyzer.
    MLTextAnalyzer analyzer = MLAnalyzerFactory.getInstance().getLocalTextAnalyzer(setting);

Create a frame object from the image bitmap.
    MLFrame frame = MLFrame.fromBitmap(bitmap);

Send the frame object to the analyzer for recognition.
    Task<MLText> task = analyzer.asyncAnalyseFrame(frame);

Handle the result.
    task.addOnSuccessListener(new OnSuccessListener<MLText>() {
        @Override
        public void onSuccess(MLText text) {
            // Recognition succeeded.
        }
    }).addOnFailureListener(new OnFailureListener() {
        @Override
        public void onFailure(Exception e) {
            // Recognition failed.
        }
    });

Bank Card recognition

Start the UI to recognize the bank card.
    private void startCaptureActivity(MLBcrCapture.Callback callback)

Implement the callback to process the recognition result.
    private MLBcrCapture.Callback callback = new MLBcrCapture.Callback() {
        @Override
        public void onSuccess(MLBcrCaptureResult bankCardResult) {
            // Recognition succeeded.
        }
    };

General Card recognition

Start the UI to recognize the general card.
    private void startCaptureActivity(Object object, MLGcrCapture.Callback callback)

Implement the callback to process the recognition result.
    private MLGcrCapture.Callback callback = new MLGcrCapture.Callback() {
        @Override
        public int onResult(MLGcrCaptureResult cardResult) {
            // Process the successful recognition result here.
            // Processing is complete, so exit recognition.
            return MLGcrCaptureResult.CAPTURE_STOP;
        }
    };

Development Summary

According to the preceding comparison, the processing logic is similar across the services, except that text recognition provides no UI. In each case the image to be recognized is passed to the SDK and the result is returned through a callback. The core difference is the structure of the returned data.

Bank card recognition returns content that has already been processed: you can obtain the bank card number directly through the interface without considering how it was extracted. Text recognition and general card recognition, by contrast, return the full recognition result, which contains text content organized into blocks, lines, and words. To obtain the information you need, you have to extract it from the full result yourself. For example, you can use a regular expression to match x consecutive digits to identify a card number, or match the content that follows a recognized keyword.
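
The regular-expression approach can be sketched in plain Java as follows. This is a minimal example and not part of the SDK: RecognizedTextParser and its methods are hypothetical names, the input is assumed to be the full recognized text already joined into one string, and "x digits" is assumed here to mean 16 to 19 digits, optionally grouped with single spaces or hyphens. Adjust the pattern to the cards you actually need to handle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for post-processing full recognition text.
final class RecognizedTextParser {
    // Assumption: a card number is 16 to 19 digits, optionally separated
    // by single spaces or hyphens, and not part of a longer digit run.
    private static final Pattern CARD_NUMBER =
            Pattern.compile("(?<!\\d)\\d(?:[ -]?\\d){15,18}(?!\\d)");

    /** Returns every card-number-like match with separators removed. */
    static List<String> findCardNumbers(String fullText) {
        List<String> numbers = new ArrayList<>();
        Matcher m = CARD_NUMBER.matcher(fullText);
        while (m.find()) {
            numbers.add(m.group().replaceAll("[ -]", ""));
        }
        return numbers;
    }

    /** Returns the rest of the line after a keyword and an optional colon. */
    static Optional<String> valueAfterKeyword(String fullText, String keyword) {
        Pattern p = Pattern.compile(Pattern.quote(keyword) + "\\s*:?\\s*(\\S.*)");
        Matcher m = p.matcher(fullText);
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }
}
```

Note that a pattern like this can still pick up unrelated digit runs, for example when other numbers sit on the same line as the card number, so in practice you may want to combine it with the line or block structure of the result.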

4 Technical Difference Analysis

Based on the preceding analysis, the text-related services differ in scenarios and service integration, but they are also related. For example, text recognition and general card recognition use the same general machine learning model. The following explains the differences from a technical perspective.

As described in the compilation dependency analysis, the basic SDK and a model package need to be integrated for every text service, and some services also need a plug-in to generate the calibration box. So what is the model package?

You may already be familiar with machine learning, which is usually divided into stages such as training sample collection, feature extraction, data modeling, and prediction. The model is essentially a "mapping function" learned from the training samples through feature extraction and the other stages. In HUAWEI HMS ML Kit, this mapping function alone is not enough: something has to execute it, and that is the interpreter framework. In addition, some algorithms need to pre-process and post-process the image, for example converting an image frame into the corresponding feature vector. For ease of understanding, all of this is collectively referred to as a model file. To run on a mobile phone, the model files must be further optimized, for example by improving their running speed on the device and reducing their size.
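
To make the idea of a "model file" a little more concrete, here is a toy sketch of the parts described above: pre-processing, the mapping function, post-processing, and something that runs them in order. All names are hypothetical and the linear model is only a stand-in. This is not how ML Kit is implemented, just an illustration of how the pieces fit together.

```java
import java.util.function.Function;

// Conceptual sketch only: none of these types exist in HMS ML Kit.
final class ModelFilePipeline {
    /** Pre-processing: scale 0..255 grayscale pixels into a 0..1 feature vector. */
    static double[] preprocess(int[] pixels) {
        double[] features = new double[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            features[i] = pixels[i] / 255.0;
        }
        return features;
    }

    /** The "mapping function": a toy linear model with one weight row per class. */
    static Function<double[], double[]> linearModel(double[][] weights) {
        return features -> {
            double[] scores = new double[weights.length];
            for (int c = 0; c < weights.length; c++) {
                for (int i = 0; i < features.length; i++) {
                    scores[c] += weights[c][i] * features[i];
                }
            }
            return scores;
        };
    }

    /** Post-processing: turn raw scores into the label with the highest score. */
    static String postprocess(double[] scores, String[] labels) {
        int best = 0;
        for (int c = 1; c < scores.length; c++) {
            if (scores[c] > scores[best]) {
                best = c;
            }
        }
        return labels[best];
    }

    /** Plays the interpreter role: executes the three stages in order. */
    static String run(int[] pixels, Function<double[], double[]> model, String[] labels) {
        return postprocess(model.apply(preprocess(pixels)), labels);
    }
}
```

A real model file bundles far more than this, but the split is the same: what is learned (the weights), what executes it, and the processing around it.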

Differences and association analysis

Now, let's look at the differences and relationships between the text services, which the following figure summarizes.

Text recognition

Training is carried out on a general text data set. Its advantages are a wide application range and high flexibility: as long as the content is text, it can be recognized.

General card recognition

It uses the same data set as text recognition, so the model files are identical, but a general card plug-in is added. The plug-in's main function is to ensure that the user positions the card in the center of the camera view. It also detects reflective and blurred images and, if the requirements are not met, prompts the user to readjust, which improves the recognition accuracy for cards.

Bank Card OCR

The bank card recognition service is trained on a dedicated bank card data set. The characters on a bank card differ greatly from those in common print, and they are also embossed, so it is difficult to achieve high accuracy with the general model. Training therefore uses dedicated data sets of bank cards and ID cards to improve the accuracy of ID card and bank card recognition. In addition, targeted pre-processing is performed for bank cards. For example, the image quality and tilt angle are detected dynamically in real time, and an alignment box is generated to constrain the position of the card. If the image is blurred, reflective, or not aligned with the calibration box, the user is prompted to re-align it.
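
The post does not say how the plug-ins measure image quality. Purely as an illustration of what a blur check on a frame can look like, the sketch below uses a common heuristic, the variance of the Laplacian: sharp images have strong edges and therefore a high variance, while blurred ones have a low variance. FrameQuality is a hypothetical name, the input is assumed to be a grayscale image as a 2D array, and the threshold is something you would have to tune yourself.

```java
// Illustrative blur heuristic; not the method used by the ML Kit plug-ins.
final class FrameQuality {
    /** Variance of the 4-neighbour Laplacian over the interior pixels. */
    static double laplacianVariance(int[][] gray) {
        int h = gray.length;
        int w = h == 0 ? 0 : gray[0].length;
        if (h < 3 || w < 3) {
            throw new IllegalArgumentException("image must be at least 3x3");
        }
        int n = (h - 2) * (w - 2);
        double sum = 0;
        double sumSq = 0;
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                double lap = gray[y - 1][x] + gray[y + 1][x]
                        + gray[y][x - 1] + gray[y][x + 1]
                        - 4.0 * gray[y][x];
                sum += lap;
                sumSq += lap * lap;
            }
        }
        double mean = sum / n;
        return sumSq / n - mean * mean;
    }

    /** Treats the frame as blurred when the variance is below the threshold. */
    static boolean isBlurred(int[][] gray, double threshold) {
        return laplacianVariance(gray) < threshold;
    }
}
```

A completely flat image gives a variance of zero, while a high-contrast pattern gives a large value, so a frame that falls under the chosen threshold could trigger a "please re-align" prompt like the one described above.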

We will share more practical experience with Huawei's machine learning services in later posts, so keep an eye on our forum.

If you have any questions, you can look for answers on the HUAWEI Developer Forum.