Development and Implementation of an ID Document Information Recognition App on the Android Platform (Graduation Thesis)
2021-04-21 22:01:59
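Abstract (Chinese version)
In today's society, mobile payment may have emptied people's wallets of cash, but those wallets still hold all kinds of ID documents and cards. Because such documents are so widely used, document recognition has become an important task. Document recognition means capturing identity information automatically from a photo, so that users no longer need to type in the details of ID cards, driver's licenses, vehicle registration certificates, and similar documents by hand. It can be integrated on multiple platforms such as Android, iOS, Java, and Linux, and its varied output formats can meet the customization needs of different applications. Document recognition uses a scanner, a digital camera, or a phone camera to capture images of documents (second-generation ID cards, passports, driver's licenses, vehicle registration certificates, and so on), quickly reads the information on the image, automatically identifies each field, and stores the result in a document information database. Because the ID card is the most important document for every individual, this thesis takes the ID card as its object of study. Because Android is widely used, this thesis targets the Android platform. A mobile information recognition system can be applied wherever document information is needed, including traffic management, security systems, education, telecommunications, public security, and Internet finance. It can improve the efficiency and accuracy of information capture and save considerable manpower and material resources, so its prospects are promising.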
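The Android-based document recognition system developed in this thesis works in two major steps: ① image preprocessing; ② text recognition. Image preprocessing is further divided into five parts: image normalization, grayscale conversion, binarization, erosion and dilation, and key information extraction. Text recognition is divided into recognition of the ID number and recognition of the Chinese-character fields on the ID card. Because the ID number is composed from only 11 characters (the ten digits and X), which are structurally simple and few in kind, this thesis recognizes it with the Tesseract-OCR engine. Chinese characters are structurally complex and numerous, so this thesis recognizes them with a convolutional neural network.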
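The middle three preprocessing steps named above (grayscale conversion, binarization, erosion and dilation) can be illustrated with a minimal sketch on plain Java arrays. This is not the thesis's implementation: the class name `IdPreprocess`, the BT.601 luma weights, the fixed threshold, and the 3x3 structuring element are all assumptions chosen for illustration, and normalization and key information extraction are omitted.

```java
/** Illustrative preprocessing sketch on plain arrays (no Android or OpenCV dependency). */
final class IdPreprocess {

    /** Converts packed 0xRRGGBB pixels to 8-bit gray using the common BT.601 luma weights (an assumption). */
    static int[][] toGray(int[][] rgb) {
        int h = rgb.length, w = rgb[0].length;
        int[][] gray = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int p = rgb[y][x];
                int r = (p >> 16) & 0xFF, g = (p >> 8) & 0xFF, b = p & 0xFF;
                // Integer arithmetic with rounding: 0.299 R + 0.587 G + 0.114 B
                gray[y][x] = (299 * r + 587 * g + 114 * b + 500) / 1000;
            }
        }
        return gray;
    }

    /** Marks every pixel darker than the threshold as ink (true); the threshold is a caller-supplied assumption. */
    static boolean[][] binarize(int[][] gray, int threshold) {
        int h = gray.length, w = gray[0].length;
        boolean[][] ink = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                ink[y][x] = gray[y][x] < threshold;
            }
        }
        return ink;
    }

    /** Shared 3x3 morphology loop; pixels outside the image count as background. */
    private static boolean[][] morph(boolean[][] src, boolean erode) {
        int h = src.length, w = src[0].length;
        boolean[][] out = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Erosion starts true and ANDs the neighbourhood; dilation starts false and ORs it.
                boolean result = erode;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        int ny = y + dy, nx = x + dx;
                        boolean v = ny >= 0 && ny < h && nx >= 0 && nx < w && src[ny][nx];
                        result = erode ? (result && v) : (result || v);
                    }
                }
                out[y][x] = result;
            }
        }
        return out;
    }

    /** Erosion: a pixel stays ink only if its whole 3x3 neighbourhood is ink. */
    static boolean[][] erode(boolean[][] src) { return morph(src, true); }

    /** Dilation: a pixel becomes ink if any pixel in its 3x3 neighbourhood is ink. */
    static boolean[][] dilate(boolean[][] src) { return morph(src, false); }
}
```

Dilating the binarized image merges the strokes of neighbouring characters into connected blobs, which is one common way to locate text lines before cropping fields; eroding afterwards shrinks the blobs back. On a real device the input would come from the camera bitmap rather than a hand-built array.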
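This thesis develops an ID card information recognition system on the Android platform. The system successfully recognizes the information on the front of the ID card: the recognition rate exceeds 98% for the ID number, 90% for individual Chinese characters, and 78% for the Chinese-character information on the card as a whole.
Keywords: document recognition; image processing; neural network; OCR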
Abstract
In today's society, people may no longer carry cash in their wallets because of mobile payment, but they still carry various ID documents and cards. Because such documents are so widely used, document recognition has become an important task. Document recognition means entering identity information automatically from a photo, so that users no longer have to type in the details of ID cards, driver's licenses, vehicle registration certificates, and other documents by hand. It can be integrated on multiple platforms such as Android, iOS, Java, and Linux, and its varied output formats can meet the customization needs of different applications. Document recognition uses a scanner, a digital camera, or a phone camera to capture images of documents (second-generation ID cards, passports, driver's licenses, vehicle registration certificates, etc.), quickly reads the information on the document image, automatically identifies each field, and stores the result in a document information database. Because the ID card is the most important document for everyone, this thesis takes the ID card as its object of study. Because Android is widely used, the platform chosen is Android. A mobile information recognition system can be applied wherever document information is needed, including traffic management, security systems, education, telecommunications, public security, and Internet finance. It can improve the efficiency and accuracy of information capture and save considerable manpower and material resources, so its prospects are promising.
The Android-based document recognition system developed in this thesis works in two steps: ① image preprocessing; ② character recognition. Image preprocessing is divided into five parts: image normalization, grayscale conversion, binarization, erosion and dilation, and key information extraction. Character recognition is divided into recognition of the ID number and recognition of the Chinese characters on the ID card. Because the ID number is composed from only 11 characters (the ten digits and X), which are structurally simple and few in kind, this thesis uses the Tesseract-OCR engine to recognize it. Because Chinese characters are structurally complex and numerous, a convolutional neural network is used to recognize the Chinese-character information on the ID card.
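Because the ID number uses only the 11 characters mentioned above, OCR output for that field can be sanity-checked against this alphabet. The following is a minimal sketch of such a check, not something the thesis describes: the class `IdNumberFilter` and its `clean` method are hypothetical names, and the policy of rejecting any string with an unexpected character is an assumption.

```java
/** Hypothetical post-OCR check for the ID number field. */
final class IdNumberFilter {

    /** The 11 characters an ID number may contain: the ten digits and X. */
    static final String ALPHABET = "0123456789X";

    /**
     * Removes whitespace, upper-cases a trailing or embedded 'x', and returns the cleaned
     * string; returns null if any character outside the 11-character alphabet remains.
     */
    static String clean(String ocrText) {
        StringBuilder sb = new StringBuilder();
        for (char c : ocrText.toCharArray()) {
            if (Character.isWhitespace(c)) {
                continue; // OCR engines often insert spaces between digit groups
            }
            char u = Character.toUpperCase(c);
            if (ALPHABET.indexOf(u) < 0) {
                return null; // e.g. the letter O misread for the digit 0
            }
            sb.append(u);
        }
        return sb.toString();
    }
}
```

A null result signals that the field should be re-captured or re-recognized rather than silently accepted.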
This thesis develops an ID card information recognition system on the Android platform. The system successfully recognizes the information on the front of the ID card: the recognition rate is above 98% for the ID number, above 90% for individual Chinese characters, and above 78% for the Chinese-character information on the card as a whole.
Keywords: Document Recognition; Image Processing; Neural Network; OCR
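Contents
Abstract (Chinese version)
Abstract
1 Introduction
1.1 Research Background and Significance
1.2 State of Research in China and Abroad
1.3 Main Research Content
1.4 Organization of the Thesis
2 Supporting Technologies
2.1 Image Preprocessing
2.1.1 Grayscale Conversion
2.1.2 Binarization
2.1.3 Dilation and Erosion
2.1.4 Key Information Extraction
2.2 Tesseract-OCR
2.2.1 Overview of Tesseract-OCR
2.2.2 Building a Dedicated Character Set
2.3 Convolutional Neural Networks
2.3.1 Introduction to Convolutional Neural Networks
2.3.2 Basic Network Structure
3 Application Design and Development
3.1 Overall Design
3.2 System Block Diagram
3.3 Interface Design
4 System Testing
4.1 Hardware Factor Tests
4.2 Lighting Factor Tests
4.3 Overall Test
5 Conclusion and Outlook
References
Acknowledgements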
1 Introduction
1.1 Research Background and Significance
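In recent years, with the rapid development of the mobile Internet, the mobile phone, the smart device people use most, has taken on more and more functions, and people depend on it more and more. Besides entertainment apps, a growing number of government-related bodies have released their own apps so that people can follow real-time information and act on it promptly, for example 《掌上电力》 and 《交管12123》, as well as mobile banking apps. Most of these apps require users to enter their ID card information (real-name verification). Typing this information by hand is slow and makes for a poor user experience. Software that recognizes document information automatically would improve that experience.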
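At present, the common mobile operating systems include iOS from Apple, Android from Google, Windows Phone from Microsoft, and the BlackBerry operating system from BlackBerry. The Android operating system [1] is the mobile platform that Google officially launched in 2007. Compared with other systems, its open-source nature has won it support from many handset makers and developers. In the first quarter of 2016, Android's share of the Chinese market reached 77%. This thesis therefore targets the Android system.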
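Today ID card information is mostly captured in one of two ways: ① manual entry; ② a conventional card reader. The first consumes considerable manpower and material resources, is tied to a fixed location, and is error-prone. The second is highly accurate but places equally high demands on the operating environment: it requires external hardware and an external power supply, and it must also be connected to a computer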