Design and Analysis of a Hardware CNN Accelerator
I am learning CNN algorithms so that I can implement them on an FPGA.
To that end, I have started reviewing related papers, and this is the first one I reviewed.
The basic concept of this paper comes from Google's TPU. The authors optimized quantization and the CONV layers to make the system more efficient.
The following is a summary of the paper.