Tag Archives: TPU

Read paper “In-Datacenter Performance Analysis of a Tensor Processing Unit”

Paper reference: "In-Datacenter Performance Analysis of a Tensor Processing Unit"

Applications use floating point (16-bit or 32-bit) for NN (Neural Network) training; then a step called quantization transforms the floating-point numbers into narrow integers (often just 8 bits), which are usually good enough for inference. MLP (Multi-Layer Perceptrons), CNN (Convolutional Neural Networks), and RNN (Recurrent Neural… Read more »
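
As a rough illustration of the quantization step mentioned above, here is a minimal sketch of symmetric linear quantization from float32 to int8 in NumPy. This is not the TPU's actual scheme; the scale rule and function names are assumptions made for the example.

import numpy as np

def quantize_int8(x: np.ndarray):
    # Symmetric linear quantization: map the largest magnitude to the int8 range.
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover approximate float32 values from the 8-bit codes.
    return q.astype(np.float32) * scale

# Weights trained in float32, quantized to 8 bits for inference.
w = np.random.randn(4, 4).astype(np.float32)
w_q, s = quantize_int8(w)
w_approx = dequantize_int8(w_q, s)
print(np.max(np.abs(w - w_approx)))  # small quantization error

The point of the sketch is only that an 8-bit code plus one scale factor can approximate the original floating-point values closely enough for inference, which is the trade-off the paper's quantization step relies on.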