Recognizing Handwritten Digits

Hello! This article documents my second project, Recognizing Handwritten Digits, for my internship program at Suven Consultants & Technology Pvt. Ltd.

The main objective of this project is to understand the classic problem of recognizing handwritten digits by computer. This is a classic classification problem in machine learning that can be traced back to the first automatic machines that needed to recognize individual characters in handwritten documents. Think, for example, of the ZIP codes on letters at the post office and the automation needed to recognize these five digits. Perfect recognition of these codes is necessary in order to sort mail automatically and efficiently.

Other applications that may come to mind include OCR (Optical Character Recognition) software, which must read handwritten text or pages of printed books into general electronic documents in which each character is well defined.

But the problem of handwriting recognition goes further back in time, more precisely to the early 20th century (the 1920s), when Emanuel Goldberg (1881–1970) began his studies on this issue and suggested that a statistical approach would be an optimal choice.

Implementation -


Understanding the ‘digits’ data set

The digits data set has a number of attributes.
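The article does not show how the data set is loaded; assuming scikit-learn's built-in `load_digits` helper, a minimal sketch:

```python
# Load the 'digits' data set bundled with scikit-learn.
from sklearn.datasets import load_digits

digits = load_digits()
print(dir(digits))  # lists the attributes of the Bunch object
```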

dir(digits)
>> ['DESCR', 'data', 'feature_names', 'frame', 'images', 'target', 'target_names']

The target_names attribute lists the digit labels available in the data set.

digits.target_names # list of digit labels in the data set
>> array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

digits.images is an array with 3 dimensions.

  • The first dimension indexes images, and we see that we have 1797 images in total.
  • The next two dimensions are the x and y coordinates of the pixels in each image. Each image has 8x8 = 64 pixels.

In other words, this array could be represented in 3D as a pile of images with 8x8 pixels each.

digits.images.shape
>> (1797, 8, 8)

Let’s look at the data of the first 8x8 image. Each slot in the array corresponds to a pixel, and the value in the slot is the amount of black in the pixel.

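The screenshot of the pixel values did not survive; printing the first image directly shows the same thing. A sketch, assuming the data set was loaded with scikit-learn's `load_digits`:

```python
# Inspect the pixel values of the first 8x8 image; each entry is the
# grey level (0 = white, up to 16 = black) of one pixel.
from sklearn.datasets import load_digits

digits = load_digits()
print(digits.images[0])  # 8x8 array of pixel intensities
print(digits.target[0])  # its label: 0
```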

The data set is a collection of images represented as an array with 3 dimensions. Machine learning algorithms work efficiently with one-dimensional arrays, i.e. vectors. So ‘flattening’ the data set means reshaping each 8x8 image into a 64-element vector for further computation.

n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))

Final shape of our data -

digits.data.shape
>> (1797, 64)

I chose K-Nearest Neighbors (KNN) as my classifier. Reference: https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm

model = KNeighborsClassifier(n_neighbors = 5)
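The snippets below refer to x_train and y_train, but the article omits the train/test split. A typical sketch using scikit-learn's `train_test_split` (the variable names and 25% split ratio are assumptions):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))

# Hold out 25% of the samples for evaluation.
x_train, x_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.25, random_state=42)

model = KNeighborsClassifier(n_neighbors=5)
```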

Now this is where the ‘learning’ happens: the model is fit to the training data using the fit() method from the sklearn library.

model.fit(x_train, y_train)

A vital step is checking the model's performance. Usually project managers set a cut-off accuracy for the model; if it does not meet those numbers, the hyper-parameters are changed and the model is trained again.

Prediction is done using the predict() method in sklearn. To get an honest estimate of performance, predictions are made on the held-out test data rather than the training set.

model.predict(x_test)

I wrote a score evaluation function that counts the number of correct predictions with respect to the target labels and converts the value to a percentage.

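The screenshot of the scoring function did not survive; a minimal sketch of such a function (the name and details are assumptions):

```python
import numpy as np

def score(predictions, targets):
    """Percentage of predictions that match the true labels."""
    correct = np.sum(predictions == targets)
    return 100.0 * correct / len(targets)
```

For example, 3 correct answers out of 4 would score 75.0.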

Saving the trained model -

Lastly, using the ‘joblib’ library we may save the model. In the future, if we need to use the model, we will not have to train it again.

from joblib import dump

file_name = 'digit_model.joblib'
dump(model, file_name)
>> ['digit_model.joblib']
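To use the saved model later, joblib's `load` is the counterpart of `dump`. A self-contained round-trip sketch (the file name matches the one above; the model here is refit only to make the example runnable):

```python
from joblib import dump, load
from sklearn.datasets import load_digits
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()
data = digits.images.reshape((len(digits.images), -1))
model = KNeighborsClassifier(n_neighbors=5).fit(data, digits.target)

dump(model, 'digit_model.joblib')      # save to disk
restored = load('digit_model.joblib')  # ...and reload later
predictions = restored.predict(data[:5])  # predict without retraining
```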

Conclusion -

The model is able to predict handwritten digits with an average accuracy of 98%.
