Number Plate Recognition using Python
In this tutorial, we will learn how to recognize license number plates using the Python programming language. We will use OpenCV to identify the license number plates and the Python pytesseract library to extract the characters and digits from each plate. By the end of the tutorial, we will have built a Python program that automatically recognizes license number plates.
Understanding the Automatic License Number Plate Recognition System
Automatic License Number Plate Recognition (ANPR) systems come in all shapes and sizes.
What makes ANPR more complicated is that it may need to operate in real time.
For instance, let us consider an ANPR system that is mounted on a toll road. It has to be able to detect the number plate of each vehicle passing by, OCR the characters on the plate, and then store this data in a database so the vehicle's owner can be billed for the toll.
A few compounding factors make ANPR extremely challenging, including the difficulty of finding a dataset that can be used to train a custom ANPR model. The large, robust datasets used to train state-of-the-art ANPR models are closely guarded and rarely (if ever) released publicly.
Pre-requisites of the project
We will use the Python OpenCV library. OpenCV is an open-source library for computer vision and machine learning that offers a common infrastructure for computer vision applications. We will also use Pytesseract for the project. Pytesseract is a Python wrapper for the Tesseract OCR engine; it reads an image and extracts the text available in it.
We can install the OpenCV library using the pip installer with the following command:
pip install opencv-python
The Pytesseract wrapper is installed in the same way:
pip install pytesseract
Note that pytesseract only wraps the Tesseract OCR engine; the Tesseract program itself must also be installed on the system.
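After installation, a quick check that Python can find the required modules may save some debugging later. The sketch below is only an illustration; the helper name missing_modules and the list of module names are assumptions and are not part of the original project.

```python
import importlib.util

# Import names of the packages used in this tutorial (assumed list).
REQUIRED_MODULES = ("cv2", "pytesseract", "matplotlib")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means every module was found.
print("Missing modules:", missing_modules())
```

This check only confirms that the Python packages are importable; it does not verify that the Tesseract program itself is installed.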
Features of OpenCV
Understanding the Python code
Since we have covered the theory part for the project, let us get into the coding part. We have divided the whole source code of the project into different steps for better understanding and clarity.
Step 1: Importing the required modules
First of all, we have to import OpenCV (cv2) and pytesseract, along with matplotlib, glob, and os.
Note: Each image file must be named after the exact number on the license plate it shows. For instance, if the number of the license plate is "FTY348U", then the name of the image file will be "FTY348U.jpg".
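Because the file name carries the true plate number, the ground-truth label can be recovered from the path alone. A minimal sketch follows; the function name plate_from_filename and the folder name in the example are assumptions made for illustration.

```python
import os


def plate_from_filename(path):
    """Return the file name without folder or extension, e.g. 'FTY348U'."""
    file_name = os.path.basename(path)        # 'FTY348U.jpg'
    plate, _extension = os.path.splitext(file_name)
    return plate


print(plate_from_filename("plates/FTY348U.jpg"))
```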
Step 2: Performing OCR using the Tesseract Engine on Number Plates
For the following step, we have to perform OCR with the help of the Tesseract Engine on License Number plates. The same can be observed in the following snippet of code.
In the above snippet of code, we specified the path to the license plate image files using the os module and defined two empty lists, NP_list and predicted_NP. We appended the actual number plate (taken from the file name) to NP_list using the append() function. We then used OpenCV to read each number plate image into the NP_img variable and passed it to the Tesseract OCR engine through the pytesseract wrapper. Finally, we appended the result, predicted_res, to predicted_NP so that it can be compared with the actual number.
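The steps described above might look roughly like the sketch below. It is a reconstruction under assumptions, not the original listing: the function names, the ".jpg" pattern, and the Tesseract options "--oem 3 --psm 6" are illustrative choices. OpenCV and pytesseract are imported inside the OCR function so that the file-handling part can be tried on its own.

```python
import glob
import os


def read_plate_text(image_path):
    """Run Tesseract on one image and return the text without whitespace."""
    import cv2          # imported here so the rest works without OpenCV
    import pytesseract

    NP_img = cv2.imread(image_path)
    if NP_img is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    # The engine and page-segmentation options are assumptions.
    predicted_res = pytesseract.image_to_string(
        NP_img, lang="eng", config="--oem 3 --psm 6"
    )
    return "".join(predicted_res.split())


def predict_plates(folder, ocr=read_plate_text):
    """Return (NP_list, predicted_NP) for every .jpg file in the folder."""
    NP_list, predicted_NP = [], []
    for path in sorted(glob.glob(os.path.join(folder, "*.jpg"))):
        # The file name (without extension) is the actual plate number.
        NP_list.append(os.path.splitext(os.path.basename(path))[0])
        predicted_NP.append(ocr(path))
    return NP_list, predicted_NP
```

Passing the OCR step as a parameter is a design choice of this sketch: it allows the loop to be tested with a stand-in function before Tesseract is set up.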
We now have the predicted plates, but we do not yet know how good the predictions are. To inspect the data and the predictions, we will print them side by side, as shown below. We will also estimate the accuracy of each prediction ourselves, without relying on a library function.
Original Number Plate    Predicted Number Plate    Accuracy
---------------------    ----------------------    --------
DL3CAM0857               DL3CAM0857                100 %
MD06NYW                  MDOGNNS                   40 %
TN21TC706                TN21TC706                 100 %
TN63DB5481               TN63DB5481                100 %
UP14DR4070               UP14DR4070                100 %
W5KHN                    WSKHN                     80 %
In the above snippet of code, we defined a function that calculates the accuracy of each prediction. Within the function, we used a for-loop to iterate through the list of original number plates and the predicted ones and checked whether they matched. We also took the length of the number into account when computing the accuracy so that the results are more meaningful.
We can observe that the Tesseract OCR engine predicts most of the license plates correctly: four of the six plates are read with 100% accuracy. For the plates that were predicted incorrectly, we will apply image processing techniques to the image files and pass them to the Tesseract OCR engine again, which can increase the accuracy for those plates.
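A character-by-character accuracy of this kind can be sketched as follows. The function name plate_accuracy is an assumption, and so is the rule that predictions of a different length score 0; the original function may round or handle lengths differently.

```python
def plate_accuracy(original, predicted):
    """Return the percentage of positions where both plates agree."""
    if original == predicted:
        return 100.0
    # Assumption: plates of different length cannot be aligned, so score 0.
    if len(original) != len(predicted):
        return 0.0
    matches = sum(o == p for o, p in zip(original, predicted))
    return 100.0 * matches / len(original)


def print_report(NP_list, predicted_NP):
    """Print each original plate, its prediction, and the accuracy."""
    for original, predicted in zip(NP_list, predicted_NP):
        accuracy = plate_accuracy(original, predicted)
        print(f"{original:<22}{predicted:<24}{accuracy:.0f} %")
```

For example, comparing W5KHN with WSKHN gives four matching characters out of five, that is, 80.0.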
Step 3: Techniques for Image Processing
Let us consider the following snippet of code to understand the technique of Image Processing.
In the above snippet of code, we imported the image module from the matplotlib library and used a for-loop to pick the image from the designated folder. We then used the imread() function to read that image and the pyplot module of matplotlib to display it.
Next, let us consider an example that prepares the image for OCR.
In the above snippet of code, we used some functions of the OpenCV module to resize the image, convert it to grayscale, and denoise it.
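Selecting one plate image from the folder and displaying it could be sketched as below. The helper names find_plate_image and show_plate and the ".jpg" pattern are assumptions; matplotlib is imported inside the display function so that the lookup can be used on its own.

```python
import glob
import os


def find_plate_image(folder, plate_number):
    """Return the path of the image named after plate_number, or None."""
    for path in glob.glob(os.path.join(folder, "*.jpg")):
        if os.path.splitext(os.path.basename(path))[0] == plate_number:
            return path
    return None


def show_plate(path):
    """Read the image with matplotlib and display it without axes."""
    import matplotlib.image as mpimg
    import matplotlib.pyplot as plt

    img = mpimg.imread(path)
    plt.imshow(img)
    plt.axis("off")
    plt.title(os.path.basename(path))
    plt.show()
```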
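The three operations named above (resize, grayscale, denoise) can be sketched with standard OpenCV calls. The scale factor of 2, the cubic interpolation, and the use of a 5x5 Gaussian blur for denoising are assumptions chosen for illustration; the text does not say which settings were used.

```python
def preprocess_plate(image):
    """Enlarge a BGR image, convert it to grayscale, and smooth out noise."""
    import cv2  # imported here so the function can be defined without OpenCV

    # Enlarge the plate so the characters cover more pixels.
    resized = cv2.resize(image, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
    # Tesseract generally does not need color information.
    gray = cv2.cvtColor(resized, cv2.COLOR_BGR2GRAY)
    # A light Gaussian blur suppresses small speckles (assumed denoising step).
    return cv2.GaussianBlur(gray, (5, 5), 0)
```

The input is expected to be an image as returned by cv2.imread(); the result is a single-channel image twice as wide and twice as tall.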
Once the above steps are complete, we can pass the transformed license plate file to the Tesseract OCR engine and view the predicted result.
The same can be observed in the following snippet of code.
In the above snippet of code, we passed the final processed image to the Tesseract OCR engine to extract the number from the license number plate.
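Reading the processed image might look like the sketch below. The function names, the Tesseract options, and the clean-up rule (keep only the letters A to Z and the digits 0 to 9, upper-cased) are assumptions; they reflect the idea that a plate contains only such characters.

```python
import re


def clean_plate_text(raw_text):
    """Keep only upper-case letters and digits from the OCR output."""
    return re.sub(r"[^A-Z0-9]", "", raw_text.upper())


def read_processed_plate(processed_image):
    """Pass a preprocessed image to Tesseract and return the cleaned text."""
    import pytesseract  # imported here so clean_plate_text works without it

    raw_text = pytesseract.image_to_string(
        processed_image, lang="eng", config="--oem 3 --psm 6"
    )
    return clean_plate_text(raw_text)
```

Cleaning the output this way removes the spaces, line breaks, and punctuation that OCR output often contains, but it cannot correct a misread character.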
Similarly, we can apply this image processing to all other license number plates whose accuracy is below 100%. Thus, the number plate recognition model is ready.
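Repeating the procedure for every plate that was not predicted exactly could be sketched as follows. The function retry_mismatches and its reprocess parameter are hypothetical: reprocess stands for any callable that takes a plate number, runs the image processing and OCR on the matching file, and returns the new prediction.

```python
def retry_mismatches(NP_list, predicted_NP, reprocess):
    """Return a new prediction list in which wrong predictions are retried."""
    updated = list(predicted_NP)
    for index, (actual, predicted) in enumerate(zip(NP_list, predicted_NP)):
        if actual != predicted:
            # Only plates that were misread go through the extra processing.
            updated[index] = reprocess(actual)
    return updated
```

Plates that were already correct are left untouched, so the extra processing cost is paid only where it can help.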