Recently I needed to train a DNN model for an object detection task. In the past I had used the object detection framework from TensorFlow's models repository, but there are two reasons I couldn't use it for my own project:
First, it's too big. There are more than two hundred Python files in this subproject. It contains different types of models, such as ResNet and MobileNet, and different types of detection algorithms, such as R-CNN and SSD (Single Shot MultiBox Detector). A huge codebase is usually hard to understand and customize, and building a new project by referencing a big old one can cost a lot of time.
Second, it uses too many configuration files, so even understanding those configs is tedious work.
Then I found a better project on GitHub: SSD-Tensorflow. It's simple: fewer than fifty Python files. And you can start training with a single command:

DATASET_DIR=./train_tfrecords
TRAIN_DIR=./logs/
CHECKPOINT_PATH=./logs/
CUDA_VISIBLE_DEVICES=0,1 python train_ssd_network.py \
  --train_dir=${TRAIN_DIR} \
  --dataset_dir=${DATASET_DIR} \
  --dataset_name=pascalvoc_2007 \
  --dataset_split_name=train \
  --model_name=ssd_512_vgg \
  --checkpoint_path=${CHECKPOINT_PATH} \
  --save_summaries_secs=60 \
  --save_interval_secs=600 \
  --weight_decay=0.0005 \
  --optimizer=rmsprop \
  --learning_rate=0.001 \
  --end_learning_rate=0.000001 \
  --learning_rate_decay_factor=0.96 \
  --num_epochs_per_decay=5.0 \
  --batch_size=8 \
  --gpu_memory_fraction=1.0 \
  --num_clones=2

No configuration files, no ProtoBuf formats.
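The only data preparation is converting the raw images and annotations into the TFRecords that DATASET_DIR points to. The repository ships a conversion script for Pascal VOC; from memory its invocation looks roughly like this (the flags are my recollection, so verify them against the project's README):

DATASET_DIR=./VOC2007/train/
OUTPUT_DIR=./train_tfrecords
python tf_convert_data.py \
    --dataset_name=pascalvoc \
    --dataset_dir=${DATASET_DIR} \
    --output_name=voc_2007_train \
    --output_dir=${OUTPUT_DIR}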
Although simple and easy to use, this project is still easy to extend: we can add a new dataset by changing the code in the 'datasets' directory and add a new network in the 'nets' directory.
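The dataset side follows the usual TF-slim pattern: each dataset module exposes a get_split() that returns a slim Dataset describing the TFRecords, and a small factory maps the --dataset_name flag to the right module. Registering a new dataset might look roughly like the sketch below; it is written in the spirit of datasets/dataset_factory.py, and 'my_dataset' is a hypothetical module, so check the repository for the exact code:

# datasets/dataset_factory.py -- illustrative sketch, not the repository's exact code
from datasets import pascalvoc_2007
from datasets import my_dataset   # hypothetical new dataset module

datasets_map = {
    'pascalvoc_2007': pascalvoc_2007,
    'my_dataset': my_dataset,      # register the new dataset here
}

def get_dataset(name, split_name, dataset_dir, file_pattern=None, reader=None):
    """Look up the dataset module by name and return a slim Dataset for the split."""
    if name not in datasets_map:
        raise ValueError('Unknown dataset name: %s' % name)
    return datasets_map[name].get_split(split_name, dataset_dir,
                                        file_pattern, reader)

Once registered, passing --dataset_name=my_dataset to the training command above would pick it up. Now let's look at the basic implementation of its VGG backbone for SSD: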

# nets/ssd_vgg_512.py
    with tf.variable_scope(scope, 'ssd_512_vgg', [inputs], reuse=reuse):
        # Original VGG-16 blocks.
        net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
        end_points['block1'] = net
        net = slim.max_pool2d(net, [2, 2], scope='pool1')
        # Block 2.
        net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
        end_points['block2'] = net
        net = slim.max_pool2d(net, [2, 2], scope='pool2')
        # Block 3.
        net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
        end_points['block3'] = net
        net = slim.max_pool2d(net, [2, 2], scope='pool3')
        # Block 4.
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
        end_points['block4'] = net
        net = slim.max_pool2d(net, [2, 2], scope='pool4')
        # Block 5.
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
        end_points['block5'] = net
        net = slim.max_pool2d(net, [3, 3], 1, scope='pool5')
...

Quite straightforward, isn't it?
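One detail worth noting: the end_points dictionary collected above is what turns this backbone into a detector. Later in the network definition (omitted here), selected feature maps are passed through multibox layers that predict class scores and box offsets for a set of default anchor boxes at each location. The sketch below only illustrates that idea; it is not the repository's code, and multibox_head and its arguments are made-up names (it assumes TensorFlow 1.x with tf.contrib.slim):

# A simplified, illustrative multibox prediction head (not the repository's code).
import tensorflow as tf
slim = tf.contrib.slim

def multibox_head(feat, num_classes, num_anchors, scope):
    """Predict per-anchor class logits and box offsets from one feature map."""
    with tf.variable_scope(scope):
        # 4 box offsets (cx, cy, w, h) for each anchor at each spatial location.
        loc = slim.conv2d(feat, num_anchors * 4, [3, 3],
                          activation_fn=None, scope='conv_loc')
        # num_classes logits for each anchor at each spatial location.
        cls = slim.conv2d(feat, num_anchors * num_classes, [3, 3],
                          activation_fn=None, scope='conv_cls')
    return cls, loc

# For example, predictions from the conv4 feature map might look like:
# cls4, loc4 = multibox_head(end_points['block4'], num_classes=21,
#                            num_anchors=4, scope='block4_box')

Each selected feature map handles objects at a different scale, which is the core idea behind SSD.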