
Computer Science
Stefan Larsson, Filip Mellqvist

Automatic Number Plate Recognition for Android

Bachelor's Project

Automatic Number Plate Recognition for Android

Stefan Larsson, Filip Mellqvist

© 2019 The author(s) and Karlstad University

This report is submitted in partial fulfillment of the requirements for the Bachelor's degree in Computer Science. All material in this report which is not our own work has been identified and no material is included for which a degree has previously been conferred.

Stefan Larsson
Filip Mellqvist

Approved, 04-06-2019

Advisor: Tobias Pulls
Examiner: Stefan Alfredsson

Abstract

This thesis describes how we utilize machine learning and image preprocessing to create a system that can extract a license plate number by taking a picture of a car with an Android smartphone. This project was provided by ÅF on behalf of one of their customers, who wanted to make the workflow of their employees more efficient.

The two main techniques of this project are object detection to detect license plates and optical character recognition to then read them. In between are several different image preprocessing techniques to make the images as readable as possible. These techniques mainly include skewing and color distorting the image. The object detection consists of a convolutional neural network using the You Only Look Once technique, trained by us using Darkflow.

When using our final product to read license plates of expected quality in our evaluation phase, we found that 94.8% of them were read correctly. Without our image preprocessing, this was reduced to only 7.95%.

Contents

1 Introduction
  1.1 Purpose of our project
  1.2 Dissertation layout
  1.3 Project outcome

2 Background
  2.1 ÅF
  2.2 Machine Learning
    2.2.1 Neural Networks
    2.2.2 Convolutional Neural Networks
    2.2.3 K-means clustering
  2.3 Computer Vision
    2.3.1 What is Computer Vision?
    2.3.2 Low- and High-level Vision
    2.3.3 Binary Image and Adaptive Threshold
    2.3.4 The HSV model
    2.3.5 Image Classification and Object Detection
    2.3.6 Optical Character Recognition
  2.4 Tools
    2.4.1 OpenCV
    2.4.2 Tesseract-OCR
    2.4.3 Pytesseract
    2.4.4 You Only Look Once
    2.4.5 Anaconda
  2.5 OpenALPR
  2.6 Summary

3 Project Design
  3.1 Android application
  3.2 OCR Service
  3.3 Database
  3.4 Summary

4 Project Implementation
  4.1 Building an object detector
    4.1.1 Setting up an environment
    4.1.2 Training
    4.1.3 Porting to an Android device
  4.2 OCR service
    4.2.1 The two colors of the plates
    4.2.2 Prepare the image
    4.2.3 Identify the contour
    4.2.4 Identify the corners
    4.2.5 Skew the plate
    4.2.6 Refining the image
    4.2.7 Read the image
  4.3 Summary

5 Evaluation
  5.1 Android performance
  5.2 Evaluation of object detection
  5.3 OCR service
    5.3.1 OCR performance
    5.3.2 Precision vs. time
    5.3.3 Evaluating the impact of preprocessing
  5.4 Summary

6 Conclusion
  6.1 Project summary
  6.2 Future work
  6.3 Concluding remarks

References

List of Figures

1.1 A simple model showing our whole system.
2.1 An image with red and green channel (image: CC BY-SA 3.0 [9]). The colors are represented on a plot grouped into segments using k-means (image: public domain [11]).
2.2 Performing canny edge detection on a pair of headphones.
2.3 Local and global threshold applied on an image with both bright and dark areas. Local adaptive thresholding on the image in the middle and global fixed thresholding on the image to the right.
2.4 The HSV cylinder showing the connection of the values: Hue, Saturation and Value (image: CC BY-SA 3.0 [10]).
2.5 How an Image Classifier and an Object Detector would see a cup.
3.1 The planned flow of our system.
4.1 Annotating a picture of a license plate.
4.2 A screenshot of the final application where the object detector is more than 90% confident that a license plate is found.
4.3 The result of k-means color quantization where k = 3.
4.4 The desired HSV mask applied. The quadrilateral is highlighted and the characters are easy to identify.
4.5 HSV masks with six different ranges (values in upper left corner). Left, three lower values; right, three higher values.
4.6 The final accepted mask, acquired when the saturation value ranges from 0-60 (second value in each array) and the value (third value) ranges from 195-255.
4.7 Desired contour drawn on the source image.
4.8 The corners of the contour drawn on the source image.
4.9 A quadrilateral with the edges A, B, C and D and the corners E, F, G and H.
4.10 The result of grabbing the corners of the source image and skewing them to the desired corners.
5.1 A flowchart of the image being put into the OCR service pipeline, first being preprocessed and then read by the OCR software, later to be matched or not.
5.2 A simplified flowchart of the binarization process where the image is put into the adaptive threshold, and if not matched by the OCR software, will go into the manual threshold together with a calculated threshold value.

List of Tables

5.1 Comparing the running time of our app to the specifications of our devices.
5.2 Comparison of size and time between two generic images of minimum and maximum potential dimensions with no need for iteration.
5.3 A table showing the outcomes of the varying multipliers with the four most essential numbers for our evaluation.
5.4 A table showing x, which is the number the attempt will get raised to every iteration, together with the four most important numbers for the evaluation.
5.5 A table comparing the two methods for tweaking the threshold in the binarization of the image.
5.6 A table comparing the accuracy and speed of the OCR software with and without preprocessing.
5.7 A table comparing the accuracy and speed of the OCR software with and without preprocessing on a subset of optimized images.

Listings

4.1 Create and prepare an Anaconda virtual environment called tgpu2.
4.2 Clone Darkflow repository.
4.3 Download Darkflow dependencies with Anaconda.
4.4 Installing Darkflow using pip.
4.5 Initiating a Darkflow training session.
4.6 OpenCV k-means on image.
4.7 Reading the image with OpenCV.
4.8 Creating HSV ranges, that is, the upper and lower limit.
4.9 Creating the mask with the input of the image, the lower values in HSV, as well as the upper ones.
4.10 Creating dynamic HSV ranges. The upper and lower limit respectively will part progressively.
4.11 Getting coordinates of all points that will make up the contours of the mask.
4.12 Confirms quadrilateral with arcLength() and approxPolyDP().
4.13 Locating the corners, iterating through every point in the contour.
4.14 The function that utilizes OpenCV's functions getPerspectiveTransform() and warpPerspective() to skew the image.
4.15 The function where the binarized image is returned together with the calculated optimal threshold.
4.16 The implementation of Pytesseract, configured to read the license plate.

List of Abbreviations

ML - Machine Learning
OCR - Optical Character Recognition
CNN - Convolutional Neural Network
OpenCV - Open Source Computer Vision
PyTesseract - Python-Tesseract
YOLO - You Only Look Once
HSV - Hue, Saturation, Value
RGB - Red, Green, Blue

1 Introduction

Machine learning, neural networks and artificial intelligence are all concepts which have exploded in popularity over the past few years. Machine learning allows computers to automatically analyze immense amounts of data and make decisions based on its patterns. This can be invaluable because the amounts of data used are often much too large for any human to analyze, comprehend and draw a conclusion from. It can be used almost anywhere, from self-driving cars to brewing the perfect pint of beer.¹ Computer vision is a field in computer science that has had great success due to the increasing popularity of machine learning. Instead of having a human look at images and decide what they depict, we are able to teach computers to recognize patterns of previous images and see the resemblance in new images. Computer vision can also be used to read alphanumeric characters in images and turn them into text.

1.1 Purpose of our project

The purpose of this project is to develop a system for our employer ÅF that will change the workflow for one of their customers. This customer has employees who often file damage reports on cars they own using an application created by ÅF. In its current state, the workflow consists of taking several pictures of the car in question and then opening a text editor to manually add the number of the license plate for the application to download information about it. The idea is to change the workflow so that information about the car is automatically gathered as part of the process of taking pictures. This would be done by creating a system where a computer is able to read the license plate directly off an image taken by the employees, and that is what our project consists of.

¹ …create-next-pint/, [2019-05-09].

1.2 Dissertation layout

Chapter 2 provides background on machine learning and computer vision, explaining the different concepts and tools used, as well as giving some background on our employer. In Chapter 3 we explain the overall design of our solution, consisting of a mobile app and a backend service. Chapter 4 is a detailed explanation of how we implemented the project and what tools were used to accomplish this. In Chapter 5 we evaluate the accuracy of our object detection, the performance of the application, and both the accuracy and performance of the OCR service. In Chapter 6 we discuss the project as a whole, problems that arose, and how the system could be improved further in the future.

1.3 Project outcome

The project outcome came very close to what we expected while planning it. We have an Android application capable of detecting and cropping license plates as well as an OCR service that is able to read a surprisingly high number of license plates. Our application's performance did not reach the levels we thought were necessary before creating it, but as we tested it with a real phone we realized that our estimated requirements were too high. A high-level image of our system is provided in Figure 1.1.

Figure 1.1: A simple model showing our whole system.

2 Background

This chapter begins by giving some information about our employer and then focuses on machine learning and computer vision. It brings up theory about the subjects as well as tools and techniques usable in practical implementations.

2.1 ÅF

ÅF AB, formerly known as Ångpanneföreningen (The Steam Boiler Association), is a Swedish company founded in 1895. They are an engineering and consulting company whose main areas are industry, infrastructure, energy and digital solutions. [25] During this project we worked under their Digital Solutions division.

2.2 Machine Learning

Machine Learning (ML) is an automated process of data analysis. [29] By providing a machine learning algorithm with appropriate data, the algorithm will use it to detect patterns and be able to make decisions by itself. Giving software the ability of decision making allows for automation of various tasks where a human was previously required.

Machine learning algorithms are categorized into three different groups depending on what style of learning they use. [7] Supervised learning, which is the most common practice, is done by giving the machine labeled data which it learns from. When trained, the algorithm is expected to put those labels on unlabeled data. Unsupervised learning is when you use unlabeled data to train. This is done to allow the computer to learn the underlying structure of the data by itself. The last category is called semi-supervised machine learning and is a mix between the two, used primarily because labeling data is time consuming.

2.2.1 Neural Networks

A Neural Network (NN) is a learning system commonly used when creating a supervised ML model, inspired by the neuron connections in human brains. [15] It consists of three types of layers: an input layer, one or more hidden layers and an output layer. The order of the layers is important, as the input layer executes first, then the hidden layer(s) and finally the output layer. Each of those layers is built up by one or more nodes (commonly referred to as neurons or units). These nodes have different purposes depending on which layer they are in. Nodes in the input layer take some input from an external source and forward it to the nodes in the hidden layer. The hidden nodes then apply a function to the input, generate an output and send it to either the next hidden layer if one exists or the output layer. The nodes in the output layer also apply a function to their input, although their output is the final result of the network and is sent as the output of the NN. Each node gets an initial random weight associated with it which is used in its respective function. When training the network this weight is adjusted to fit the data. The rate at which the weight is adjusted can be improved by using an optimizer when training. Optimizers are algorithms which optimize the training of neural networks. These algorithms can improve both the speed of training as well as the accuracy of the final NN. There are many different optimizers—Adam, RMSProp and AdaGrad to name a few—and they all apply different functions to improve training. [6] A minimal sketch of this layer structure is shown below.
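To make the input-hidden-output structure concrete, here is a minimal sketch of a forward pass in Python with NumPy. It is illustrative only and not taken from the thesis code: the layer sizes, the sigmoid activation and all variable names are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random initial weights, matching the input -> hidden -> output
# structure described above (assumed sizes: 3 inputs, 4 hidden nodes,
# 1 output node).
W_hidden = rng.normal(size=(3, 4))
W_output = rng.normal(size=(4, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    # Each layer applies a function (here a weighted sum followed by a
    # sigmoid) to its input and forwards the result to the next layer.
    hidden = sigmoid(x @ W_hidden)
    return sigmoid(hidden @ W_output)

# One input sample with three features.
x = np.array([[0.5, -1.2, 3.0]])
print(forward(x))  # the network's output: a single value in (0, 1)
```

Training would repeatedly adjust W_hidden and W_output to fit labeled data; this adjustment is what optimizers such as Adam are designed to speed up.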

2.2.2 Convolutional Neural Networks

A Convolutional Neural Network (CNN) is a subclass of neural networks designed for applying machine learning to images. [28] An image contains a very large amount of data. A typical photo taken with a modern cellphone may have a resolution of 2000x1000 pixels. Every pixel also has to store three values representing each color in the red, green and blue color wheel. This gives us an array of size 2000x1000x3 for a single image. While this array could be put through a regular NN to detect patterns, it would be inefficient and the end result would likely have insufficient precision. CNNs reduce the computational complexity by scaling down the size of an image while extracting and keeping important features. This does not only increase speed but also increases precision by reducing the amount of noise in an image. While still using nodes as in the original definition of an NN, the layers of a CNN are different and specifically made for imagery.

2.2.3 K-means clustering

K-means clustering is a way of grouping data into segments, where k is a variable which tells the algorithm how many groups or clusters the final result should consist of. [12] The algorithm takes k starting positions and divides the data into k partitions, then calculates the mean of each partition. This is done n times, and for every iteration the new partitions are calculated with regard to the current means. When using k-means to quantize colors, the same algorithm is applied to a set of colors. The colors are divided into k segments, and each color is converted into the average color of its respective group. In Figure 2.1, we can see k-means being applied to an image with values ranging from 0-255 in two channels, combining into a color made up of red and green. The k value is set to 16; hence 16 dots. Each dot is set to the average color of its associated group. A short sketch of this kind of color quantization is given below.

Figure 2.1: An image with red and green channel (image: CC BY-SA 3.0 [9]). The colors are represented on a plot grouped into segments using k-means (image: public domain [11]).
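As a rough illustration of k-means color quantization with OpenCV (this is not the thesis's Listing 4.6; the file names and parameter choices are hypothetical):

```python
import cv2
import numpy as np

# Load an image (hypothetical path) and flatten it into a list of BGR
# pixels, one row per pixel, as cv2.kmeans expects float32 samples.
img = cv2.imread("plate.jpg")
pixels = img.reshape((-1, 3)).astype(np.float32)

k = 16  # number of color clusters, as in Figure 2.1
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
_, labels, centers = cv2.kmeans(pixels, k, None, criteria, 10,
                                cv2.KMEANS_RANDOM_CENTERS)

# Replace every pixel with the mean color of the cluster it fell into.
quantized = centers[labels.flatten()].astype(np.uint8).reshape(img.shape)
cv2.imwrite("plate_quantized.jpg", quantized)
```

The thesis later applies k-means with k = 3 to license plate images (Figure 4.3, Listing 4.6) as part of its preprocessing.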

2.3 Computer Vision

Computer vision is an interdisciplinary field where engineers strive to make machines mimic the human visual system, to enable them to perceive the world in the same way as humans do.
