Use mxnet to classify images of birds (third episode)

After using CNN in previous article, it still can’t recognize the correct name of birds if the little creature stand on the corner (instead of the center) of the whole picture. Then I started to think about the problem: how to let neural-network ignore the position of the bird in picture, but only focus on its exists? Eventually I recollected the “max pooling”:


By choose the max feature value from 2×2 pad, it will amplify the most important feature without affected by backgrounds. For example, if we split a picture into 2×2 chassis (4 plates) and the bird only stand in the first plate, the “max pooling” will choose only the first plate for next processing. Those trees, pools, leaves and other trivial issues in other three plates will be omitted.

Then I modify the structure of CNN again:

and using “0.3” for my learning rate, as “0.3” is better to against overfitting.

For one week (Chinese New Year Festival), I was studying “Neural Networks and Deep Learning”. This book is briefly awesome! A lot of doubts about Neural Networks for me have been explained and resolved. In third chapter, the author Michael Nielsen suggests a method, which really enlightened me, to defeat overfitting: artificially expanding training data. The example is rotating the MNIST handwritten digital picture by 15 degrees:

In my case, I decided to crop different parts of bird picture if the picture is a rectangle:

by using the python PIL (Picture Processing Library):

The effect of using “max pooling” and “expanding training data” is significant:

Use mxnet to classify images of birds (second episode)

Using one convolutional-layer and two fully-connected-layers cost too much memory and also have bad performance for training, therefore I modify the model to two convolutional-layers and two narrow fully-connected-layers:

and training it by using learning rate “0.1” instead of “0.01” which may cause “overfit” in neural network.
Finally, the model occupied only 6MB disk space (It was more than 200MB before).

Now I could build a web site on a virtual machine of AliCloud (which is sponsored by Allen Mei, my old colleague) to let users uploading birds’ image and classifying it freely. To thank my sponsor, I named the web site “Allen’s bird” 🙂

In this web, I use angularjs and ngImgCrop plugin from “Alex Kaul”. They are powerful and convenient.

The append() operation of np.array() is very slow.
After replacing np.array() by normal python array, the training script could run much faster now.

Use mxnet to classify images of birds (first episode)

Recently, I was trying to classify images of birds by using machine learning technology. The most familiar deep learning library for me is the mxnet, so I use its python interface to build my Birds-Classification-System.
For having not sufficient number of images for all kinds of bird, I just collect three types of them: “Loggerhead Shrike”, “Anhinga”, and “Eastern Meadowlark”.

Loggerhead Shrike Anhinga Eastern Meadowlark

After collecting more than 800 images of the three kinds of bird, I started to write my python code by learning the “Handwritten Digital Sample” of mxnet step by step.
Firstly, using PIL (Python Image Library) to preprocess these images – chop them from rectangle to square with 100 pixels length of edge:

Then put all images into a numpy array and label them:

Now I can build the Convolutional Neural Network model easily by using the powerful mxnet. The CNN will slice all pictures to 8×8 pixels small chunk with 2 pixels step, therefore enhance the small features of these birds, such as black-eye-mask of Loggerhead-Shrike, yellow neck of Eastern-Meadowlark, etc.

Training the data:

Using GPU for training is extremely fast – it only cost me 5 minutes to train all 800 images, although adjusting the parameters of CNN cost me more than 3 days 🙂

Firstly I use Fully Connected Neural Network, it costs a lot of time for training but prone to overfit. After using the CNN with BatchNorm() in mxnet, the speed of training and affect of classification advanced significantly.
CNN(Convolutional Neural Network) is really a ace in deep learning area for images!

The type of variables in Python

Haven’t written python code for more than one year, I met this simple problem:

Even the code have print out the value of “a” and “b” as 2 and 1, the condition check “if a >= b:” is false!

Spending more than 10 minutes, I eventually get the reason: the type of “a” is “int” but “b” is “string” (and the interpreter of Python will not report any warning about this “inconsistency”). I should have been taking enough care of the type of these variables.
Seems “print” can’t reveal adequate details of a variable, therefore it is highly suggested we using “pprint” instead of “print”.

The result will be

A example of Mesos Python Framework to calculate Pi

I have written an example of Mesos Framework by python. It simply calculate “Pi” by using Mento-Carlo algorithm
The whole source code is at
At beginning, I use python threading in “launchTask()”:

But I found out that the executors only spend a small part of CPU resource in slave machines (about 100%~150%, which is too low in a 8-cores server). First, I thought it may be limited by cgroup, but later, the answer is revealed: multi-threaded python application is actually single thread because of GIL.

With no choice, I have to change my code: the thread will launch a new process, then the new process will calculate the result and finally return the result to thread. It works fine in my Mesos cluster.

Override the __init__ of BaseHTTPRequestHandler in python

Currently, I am writing some python program for the front of our storage system. By using ‘BaseHTTPRequestHandler’ and ‘ThreadedHTTPServer’, we could implement a simple multi-thread http server quickly. But after add ‘__init__()’ for our ‘MyHandler’, it doesn’t work correctly now:

Then I found this statement in python docs:

But thanks for the code example, we could override ‘__init__’ of BaseHTTPRequestHandler’ this way: