Please check out DnnWeaver v1.0 below.
H. Sharma, J. Park, D. Mahajan, E. Amaro, J. K. Kim, C. Shao, A. Mishra, H. Esmaeilzadeh, "From High-Level Deep Neural Models to FPGAs", in the Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2016.
DnnWeaver consists of four components:
The Translator component of DnnWeaver creates a macro-dataflow graph of the DNN model using our Instruction Set Architecture (ISA). Each node of the macro-dataflow graph represents a layer of the DNN model. DnnWeaver then uses the macro-dataflow graph to generate a static execution schedule for the accelerator.
The Design Planner optimizes the following parameters to specialize the accelerator for the given DNN model:
The Design Weaver generates the accelerator core from our hand-optimized Verilog templates, according to the parameters chosen by the Design Planner.
The Integrator is the final component of DnnWeaver and appends the memory interface code to the accelerator. As part of the initial release, DnnWeaver includes the memory interface code for the Xilinx Zynq ZC7020 board.
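To make the Translator step more concrete, the following is a minimal Python sketch of a macro-dataflow graph in which each node is one DNN layer, together with a static execution order derived from it. It does not reproduce DnnWeaver's ISA or its actual scheduler: the `LayerNode` class, the `static_schedule` function, and the layer names are all hypothetical, and a plain topological sort (Kahn's algorithm) stands in for the real scheduling logic.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class LayerNode:
    """One node of the macro-dataflow graph: a single DNN layer (hypothetical structure)."""
    name: str
    kind: str
    inputs: list = field(default_factory=list)  # names of the layers that feed this one


def static_schedule(nodes):
    """Return layer names ordered so that every layer runs after its producers.

    Uses Kahn's topological sort as a stand-in for a real scheduler.
    """
    pending = {n.name: len(n.inputs) for n in nodes}  # unmet dependencies per layer
    consumers = {n.name: [] for n in nodes}
    for n in nodes:
        for src in n.inputs:
            consumers[src].append(n.name)

    ready = deque(name for name, count in pending.items() if count == 0)
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for dst in consumers[name]:
            pending[dst] -= 1
            if pending[dst] == 0:
                ready.append(dst)

    if len(order) != len(nodes):
        raise ValueError("graph has a cycle; no static schedule exists")
    return order


# Toy LeNet-style chain; the layer names are illustrative only.
graph = [
    LayerNode("conv1", "convolution"),
    LayerNode("pool1", "pooling", ["conv1"]),
    LayerNode("conv2", "convolution", ["pool1"]),
    LayerNode("pool2", "pooling", ["conv2"]),
    LayerNode("ip1", "inner_product", ["pool2"]),
]
schedule = static_schedule(graph)
print(schedule)
```

Because the order is computed once from the graph structure, it can be fixed before execution, which is the sense in which the schedule is static in this sketch.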
DnnWeaver supports a wide range of Convolutional Neural Networks. The following are the benchmark models we used for the experimental results in our paper:
Benchmark Deep Neural Network (DNN) Models

| # | Benchmark Name | Data Set | Description | Number of Layers | Model Size (MB) | Number of Operations (GOps) | Lines of Code (prototxt) |
|---|---|---|---|---|---|---|---|
| 1 | LeNet | MNIST | Hand-written Digit Recognition | 7 | 0.8 | 2 | 128 |
| 2 | Cifar-10 Full | Cifar-10 | Object Recognition | 12 | 0.2 | 12 | 156 |
| 3 | Network-in-Network (NiN) | ILSVRC 2012 | Object Detection and Classification | 28 | 14.5 | 1106 | 516 |
| 4 | Djinn-ASR | Djinn and Tonic | Speech-to-text Decoder | 13 | 48.4 | 25 | 105 |
| 5 | AlexNet | ILSVRC 2012 | Object Detection and Classification | 20 | 119.0 | 1147 | 278 |
| 6 | VGG-CNN-S | ILSVRC 2012 | Object Detection and Classification | 19 | 196.0 | 2666 | 200 |
| 7 | Overfeat | ILSVRC 2012 | Object Detection and Classification | 16 | 278.0 | 2798 | 196 |
| 8 | VGG-16 | ILSVRC 2012 | Object Detection and Classification | 36 | 324.0 | 16362 | 347 |