
Implement multilayer perceptron for digit recognition

This post demonstrates how to implement a multilayer perceptron for digit recognition.

The detailed derivations of the algorithm can be found in this script.
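For a quick reference without opening the script, here is a minimal sketch of the loss computation for a one-hidden-layer network: an affine layer, a sigmoid hidden activation (assumed here; the actual implementation may use a different nonlinearity), a softmax output, and L2 weight decay. The function below is illustrative and is not the dnn_play API.

import numpy as np

def mlp_loss_sketch(W1, b1, W2, b2, X, y, reg):
    """One-hidden-layer MLP loss: softmax cross-entropy + L2 weight decay (illustrative)."""
    # Forward pass: affine -> sigmoid -> affine -> softmax
    h = 1.0 / (1.0 + np.exp(-(X.dot(W1) + b1)))        # hidden activations, shape (N, H)
    scores = h.dot(W2) + b2                             # class scores, shape (N, C)

    # Numerically stable softmax
    scores -= scores.max(axis=1, keepdims=True)
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)

    # Mean cross-entropy over the batch plus weight decay on the weight matrices
    N = X.shape[0]
    data_loss = -np.log(probs[np.arange(N), y]).mean()
    reg_loss = 0.5 * reg * (np.sum(W1 * W1) + np.sum(W2 * W2))
    return data_loss + reg_loss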

Main workflow

  • Prepare the training/validation/test datasets.
  • Set the weight decay and numerical parameters.
  • Check that the gradients of the loss function are correct.
  • Train the model.
  • Estimate the accuracy of the predictions.

IPython notebook

In [1]:
%load_ext autoreload
%autoreload 2
%matplotlib inline


import numpy as np
import matplotlib.pyplot as plt

from dnn_play.classifiers.mlp import MLP, mlp_loss, rel_err_gradients

from dnn_play.utils.data_utils import load_mnist
from dnn_play.utils.visualize_utils import display_network

# Plot settings
plt.rcParams['figure.figsize'] = (10.0, 10.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
In [2]:
# Load MNIST data
(X_train, y_train), (X_val, y_val), (X_test, y_test) = load_mnist()
#(X_train, y_train), (X_val, y_val), (X_test, y_test) = load_mnist(n_train=900, n_val=100, n_test=100)

print("X_train shape = {} y_train shape = {}".format(X_train.shape, y_train.shape))
print("X_val   shape = {} y_val  shape = {}".format(X_val.shape, y_val.shape))
print("X_test  shape = {} y_test shape = {}".format(X_test.shape, y_test.shape))
X_train shape = (55000, 784) y_train shape = (55000,)
X_val   shape = (5000, 784) y_val  shape = (5000,)
X_test  shape = (10000, 784) y_test shape = (10000,)
In [3]:
# Number of layer units
input_size  = X_train.shape[1] # Dimension of features
hidden_size_L1 = 200
output_size = np.max(y_train) + 1 # Number of classes

# Network configuration
layer_units = (input_size, hidden_size_L1, output_size)

# Hyperparameters
reg = 1e-4 # Regularization (weight decay)

# Numerical parameters
max_iters = 400

# Define the classifier
clf = MLP(layer_units)

# Initialize weights
weights = clf.init_weights()

loss, grad = mlp_loss(weights, X_train, y_train, 0.0)

# Note there are 10 classes.
# As a rough sanity check, our loss should be something close to -log(0.1).
print('loss: %f' % loss)
print('sanity check: %f' % (-np.log(0.1)))
loss: 2.302589
sanity check: 2.302585
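The sanity check works because the weights are initialized to small random values, so the softmax output is roughly uniform over the 10 classes and the expected cross-entropy per example is about -log(1/10):

import numpy as np
print(-np.log(1.0 / 10))   # ~2.302585, matching the initial loss above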
In [4]:
# Gradient checking
if rel_err_gradients() < 1e-8:
    print("Gradient check passed!")
else:
    print("Gradient check failed!")  
Gradient check passed!
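The idea behind rel_err_gradients is standard numerical gradient checking: perturb each parameter slightly, estimate the derivative with central differences, and compare against the analytic gradient from backpropagation. A minimal sketch of such a check (the helper below is hypothetical, not the dnn_play implementation):

import numpy as np

def max_relative_error(f, x, analytic_grad, h=1e-5, n_checks=10):
    """Compare an analytic gradient against central-difference estimates (illustrative).

    f: scalar loss as a function of the flat parameter vector x.
    """
    worst = 0.0
    for i in np.random.choice(x.size, size=n_checks, replace=False):
        old = x[i]
        x[i] = old + h; f_plus = f(x)
        x[i] = old - h; f_minus = f(x)
        x[i] = old                                    # restore the parameter
        numeric = (f_plus - f_minus) / (2.0 * h)
        denom = max(abs(numeric), abs(analytic_grad[i]), 1e-12)
        worst = max(worst, abs(numeric - analytic_grad[i]) / denom)
    return worst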
In [5]:
"""
Training
"""
reg = 1e-4 # Regularization (weight decay)

#clf = MLP(layer_units, weights = weights)
weights, loss_history, train_acc_history, val_acc_history = clf.fit(X_train, y_train, X_val, y_val, 
                                                                    reg=reg, max_iters=max_iters, verbose=True)
    
iter:   20, loss: 1.586643, train_acc: 0.431182, val_acc: 0.446200
iter:   40, loss: 0.361330, train_acc: 0.902291, val_acc: 0.924200
iter:   60, loss: 0.226814, train_acc: 0.945418, val_acc: 0.957600
iter:   80, loss: 0.169513, train_acc: 0.966964, val_acc: 0.972000
iter:  100, loss: 0.138802, train_acc: 0.980491, val_acc: 0.976800
iter:  120, loss: 0.121456, train_acc: 0.986982, val_acc: 0.979400
iter:  140, loss: 0.112092, train_acc: 0.991691, val_acc: 0.980600
iter:  160, loss: 0.107356, train_acc: 0.993564, val_acc: 0.981600
iter:  180, loss: 0.104478, train_acc: 0.994745, val_acc: 0.983200
iter:  200, loss: 0.102320, train_acc: 0.995327, val_acc: 0.984200
iter:  220, loss: 0.100476, train_acc: 0.995291, val_acc: 0.983800
iter:  240, loss: 0.099097, train_acc: 0.995145, val_acc: 0.985000
iter:  260, loss: 0.097932, train_acc: 0.995582, val_acc: 0.983800
iter:  280, loss: 0.097004, train_acc: 0.995782, val_acc: 0.984200
iter:  300, loss: 0.096228, train_acc: 0.996291, val_acc: 0.984600
iter:  320, loss: 0.095559, train_acc: 0.996364, val_acc: 0.985200
iter:  340, loss: 0.094926, train_acc: 0.996327, val_acc: 0.984400
iter:  360, loss: 0.094436, train_acc: 0.996382, val_acc: 0.985000
iter:  380, loss: 0.094027, train_acc: 0.996436, val_acc: 0.984800
iter:  400, loss: 0.093674, train_acc: 0.996491, val_acc: 0.985400
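clf.fit hides the optimization details; conceptually it amounts to a loop of the following form. This is only a rough sketch assuming plain full-batch gradient descent with a fixed, made-up learning rate, and assuming the gradient has the same list-of-dicts layout as weights; the actual optimizer in dnn_play may differ.

learning_rate = 0.1                                   # assumed value, for illustration only
for it in range(1, max_iters + 1):
    loss, grad = mlp_loss(weights, X_train, y_train, reg)
    for layer, g in zip(weights, grad):               # assumes one {'W': ..., 'b': ...} dict per layer
        layer['W'] -= learning_rate * g['W']
        layer['b'] -= learning_rate * g['b']
    if it % 20 == 0:
        train_acc = np.mean(clf.predict(X_train) == y_train)
        val_acc = np.mean(clf.predict(X_val) == y_val)
        print("iter: {:4d}, loss: {:.6f}, train_acc: {:.6f}, val_acc: {:.6f}".format(
              it, loss, train_acc, val_acc))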
In [6]:
# Plot the loss function and train / validation accuracies
plt.subplot(2, 1, 1)
plt.plot(loss_history)
plt.title('Loss history')
plt.xlabel('Epoch')
plt.ylabel('Loss')

plt.subplot(2, 1, 2)
plt.plot(train_acc_history)
plt.plot(val_acc_history)
plt.legend(['Training accuracy', 'Validation accuracy'], loc='lower right')
plt.xlabel('Epoch')
plt.ylabel('Classification accuracy')
Out[6]:
<matplotlib.text.Text at 0x106b5f630>
In [7]:
# Visualize the weights 

W0 = weights[0]['W']
image = display_network(W0)
plt.imshow(image, cmap = plt.cm.gray)
Out[7]:
<matplotlib.image.AxesImage at 0x106be5fd0>
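Each hidden unit's incoming weights form a 784-dimensional vector that can be reshaped back into a 28x28 patch, and display_network tiles those patches into a single grid image. A rough sketch of such a tiling (illustrative only, assuming the columns of W0 index the hidden units; the real display_network may differ):

def tile_weights(W, patch_shape=(28, 28), grid=(10, 20)):
    """Tile the columns of W (flattened patches) into one large image (illustrative)."""
    ph, pw = patch_shape
    rows, cols = grid                                  # grid[0] * grid[1] must equal the number of columns
    image = np.zeros((rows * ph, cols * pw))
    for k in range(rows * cols):
        patch = W[:, k].reshape(ph, pw)
        patch = (patch - patch.min()) / (patch.max() - patch.min() + 1e-12)   # rescale for display
        r, c = divmod(k, cols)
        image[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw] = patch
    return image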
In [8]:
# Make predictions
pred = clf.predict(X_test)
acc = np.mean(y_test == pred)

print("Accuracy: {:5.2f}% \n".format(acc*100))
Accuracy: 98.14% 

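Prediction itself is just a forward pass followed by an argmax over the class scores; since softmax is monotonic, the unnormalized scores suffice. A minimal sketch in the spirit of the loss function above (illustrative, not the actual clf.predict):

def predict_sketch(W1, b1, W2, b2, X):
    """Return the most likely class for each row of X (illustrative)."""
    h = 1.0 / (1.0 + np.exp(-(X.dot(W1) + b1)))        # hidden activations
    scores = h.dot(W2) + b2                             # class scores
    return np.argmax(scores, axis=1)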
In [9]:
# View some images and predictions
n_images = 4
images = X_test[:n_images].reshape((n_images, 28, 28))
pred = clf.predict(X_test[:n_images])

for i in range(n_images):
    plt.subplot(1, n_images, i+1)
    plt.imshow(images[i], cmap = plt.cm.gray)
    plt.title('Predicted digit: {}'.format(pred[i]))
    plt.axis('off')

    

[Figure: multilayer perceptron]

If you are interested in all the code used in this demonstration, please check the repository.


Published

Dec 23, 2015

Category

Machine learning

Tags

  • cv