February 2018 – Robin on Linux

My choice between Raspberry Pi, Arduino, Pyboard and Micro:bit

I want to teach my child programming. But getting a child to sit still and keep looking at a computer screen is not easy, because children usually can’t focus on a dull development IDE for more than ten minutes. So I looked for a microcontroller board that can do things kids find interesting, such as reading temperature and humidity from the environment or controlling the small motors of a toy car.
There are many microcontroller and single-board computer products on the market, so I had to compare them and pick the most suitable one.

Raspberry Pi

Raspberry Pi is very powerful. It can do almost anything that a personal computer or laptop can do. But it is too difficult for a child to learn. Another reason I gave up on it is the price: $35 for the bare board alone, without any peripherals.

Arduino

Arduino is cheap enough. But it is normally programmed in C/C++. Using C requires solid computer science knowledge, such as memory models and data structures. Imagine implementing a properly working dictionary in C: for a schoolchild, it is like building a spaceship in the backyard.
At this point I could narrow my choices down to boards that support Python, or rather MicroPython. Python is easy to understand, reads much more straightforwardly, and can also be used interactively, one command at a time. In short, it is much easier to learn than C.
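To make the contrast concrete, here is a minimal sketch of the dictionary example in Python. The word list and the function name `count_words` are made up for illustration. The only point is that the dictionary is built into the language, so there is no memory management and no hand-written hash table.

```python
def count_words(words):
    """Count how often each word appears, using Python's built-in dict."""
    counts = {}
    for word in words:
        # dict.get returns 0 when the word has not been seen yet
        counts[word] = counts.get(word, 0) + 1
    return counts


# A made-up list of toys, just to have something to count
toys = ["car", "robot", "car", "ball", "car"]
print(count_words(toys))
```

The same lines can be typed one by one at the Python prompt, which is handy when a child wants to see the result of every step.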
So let’s take a look at the boards that support MicroPython.

Pyboard

Pyboard is simple and cheap enough, and it supports MicroPython. Its only drawback is that its hardware interface is hard to use for someone who doesn’t know hardware well.

Micro:bit

Launched in 2015 by the BBC, the Micro:bit is, in my opinion, the most suitable board for a child learning to program, and even for learning about IoT (Internet of Things). It is cheap: only $18. It can be programmed in MicroPython and in JavaScript (with a graphical IDE), so a child can control it with a few lines of Python code. It also supports a bunch of peripherals such as micro motors, a thermometer, a hygrometer, Bluetooth and Wi-Fi. Children can use it as the core controller of an intelligent toy car. The Micro:bit even has Android and iOS apps for operating it, which is perfect for a small child under 7.
So this is my choice: Micro:bit.
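As a rough idea of what “a few lines of Python” can look like, the sketch below scrolls the board’s temperature reading across the LED display. It assumes the standard `microbit` MicroPython module (`display`, `temperature`, `sleep`). The helper `format_reading` and the two-second delay are my own choices for illustration. The `try`/`except` is only there so that the file can also be opened on a normal computer, where the `microbit` module does not exist.

```python
try:
    # Only available when running MicroPython on the Micro:bit itself
    from microbit import display, temperature, sleep
    ON_DEVICE = True
except ImportError:
    ON_DEVICE = False


def format_reading(celsius):
    """Turn a temperature in degrees Celsius into a short text for the display."""
    return str(celsius) + "C"


if ON_DEVICE:
    while True:
        # temperature() returns the on-board sensor reading in degrees Celsius
        display.scroll(format_reading(temperature()))
        sleep(2000)  # sleep() takes milliseconds, so this waits two seconds
```

On the board this loops forever and shows a new reading every two seconds. On a desktop nothing is displayed, because the hardware part is skipped.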

Technical Meeting with Nvidia Corporation

Last week I went to Nvidia Corporation in Santa Clara, California, with my colleagues to attend a technical meeting about cutting-edge hardware and software for deep learning.

The new office building of NVIDIA

On the first day, team leaders from Nvidia introduced their development plans for new hardware and software. The hardware topics were Tesla V100, NVLink, and HGX (the next generation of DGX). The software topics were CUDA 9.2, NCCL 2.0, and TensorRT 3.0.
Here are some notes from their presentations:

The next generation of the Tesla P4 GPU will have tensor cores, 16GB of memory, and an H.264 decoder (performance comparable to the Tesla P100) for better inference performance, especially for image and video processing.
Software support for tensor cores (mainly in the Tesla V100 GPU) has been integrated into TensorFlow 1.5.
TensorRT can fuse three layers of a deep learning model (Conv layer, Bias layer, ReLU layer) into one CBR layer and eliminate concatenation layers, to accelerate inference.
The ‘nvidia-smi’ tool can show the ‘util’ of a GPU. But a utilization of ‘80%’ only means that the GPU was running some task (no matter how many CUDA cores were in use) for 0.8 seconds out of a one-second period. Therefore it is not an accurate metric of the real GPU load. NVPROF is a much more powerful and accurate tool for GPU profiling.
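The point about ‘nvidia-smi’ can be illustrated with a little arithmetic. The sketch below is not how the driver is implemented. It only models the definition from the note above (the share of the sampling period during which at least one kernel was running) and compares it with a hypothetical “share of SM time used” figure. The function names, the kernel list, and the GPU with 80 SMs (streaming multiprocessors) are assumptions made for illustration.

```python
def reported_util(kernels, period=1.0):
    """Fraction of the period in which at least one kernel was running.

    kernels: list of (start, end, sms_used) tuples, times in seconds.
    """
    # Clip every kernel to the sampling period and sort by start time
    spans = sorted((max(0.0, s), min(period, e)) for s, e, _ in kernels)
    busy, covered_until = 0.0, 0.0
    for start, end in spans:
        start = max(start, covered_until)  # skip time that was already counted
        if end > start:
            busy += end - start
            covered_until = end
    return busy / period


def sm_time_share(kernels, total_sms, period=1.0):
    """Fraction of the available SM-seconds actually used (hypothetical metric)."""
    used = sum(
        max(0.0, min(period, e) - max(0.0, s)) * sms for s, e, sms in kernels
    )
    return used / (total_sms * period)


# One small kernel occupying a single SM for 0.8 s of a 1 s period,
# on an assumed GPU with 80 SMs.
kernels = [(0.0, 0.8, 1)]
print(f"reported utilization: {reported_util(kernels):.0%}")
print(f"share of SM time used: {sm_time_share(kernels, total_sms=80):.0%}")
```

In this made-up case the first number is 80% while the second is only 1%, which is why a high ‘util’ value alone says little about how busy the GPU really is.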

The TITAN V GPU

On the second day, many teams from Alibaba (my company) asked Nvidia various questions. Here are some of the questions and answers:

Q: Some deep learning compilers, such as XLA (from Google) and TVM (from AWS), can compile Python code directly to an intermediate representation for GPUs. How will Nvidia work with these application-oriented compilers?
A: The Google XLA team will be shut down and moved to optimizing TPU performance only. Nvidia will keep focusing on libraries such as CUDA, cuDNN and TensorRT, and will not build frameworks like TensorFlow or MXNet.

Q: Many new types of hardware have been launched for deep learning: Google’s TPU and ASICs developed by other companies. How will Nvidia keep its price-performance advantage over these new competitors?
A: ASICs are not programmable. If deep learning models change, the ASIC ends up in the trash. For example, the TPU has ReLU/Conv instructions, but when a new type of activation function comes along, it will no longer work. Furthermore, customers can only run TPUs on Google’s cloud, which means they have no choice but to put their data in the cloud.

The DGX server

We also visited the demo room with Nvidia’s state-of-the-art hardware for autonomous driving and deep learning. It was a productive meeting, and we learned a lot.

The car used as the autonomous driving test platform

Me standing in front of the NVIDIA logo
