Matrix multiplication is implemented using a systolic array architecture.
On every clock cycle, feed packed weight data to the Input pins and input data (activations) to the Bidirectional pins. Strobe the Enable pin to start receiving the results of the matrix multiplication on the Output pins.
An MCU is required to feed the weights and input data into the accelerator and to fetch the results.
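
To make the systolic-array idea concrete, the sketch below models a generic output-stationary array in software: activations enter from the left edge, weights enter from the top edge, and each processing element multiplies the pair it currently holds and adds the product to its own accumulator. This is only an illustration of the general technique. The dataflow, array size, and the name `systolic_matmul` are assumptions and are not taken from this design.

```python
def systolic_matmul(a, b):
    """Cycle-by-cycle model of a generic output-stationary systolic array.

    Assumption: this mirrors the textbook dataflow, not necessarily the
    dataflow used inside this particular accelerator.
    """
    n, k, m = len(a), len(b), len(b[0])
    acc = [[0] * m for _ in range(n)]    # one accumulator per processing element
    a_reg = [[0] * m for _ in range(n)]  # activations, shifted right each cycle
    b_reg = [[0] * m for _ in range(n)]  # weights, shifted down each cycle

    for t in range(n + m + k - 2):
        # Shift activations one column to the right and inject a new value
        # at the left edge. Row i is delayed (skewed) by i cycles.
        for i in range(n):
            for j in range(m - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
            s = t - i
            a_reg[i][0] = a[i][s] if 0 <= s < k else 0

        # Shift weights one row down and inject a new value at the top edge.
        # Column j is delayed (skewed) by j cycles.
        for j in range(m):
            for i in range(n - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]
            s = t - j
            b_reg[0][j] = b[s][j] if 0 <= s < k else 0

        # Every processing element performs one multiply-accumulate.
        for i in range(n):
            for j in range(m):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]

    return acc
```

Calling `systolic_matmul(a, b)` with two nested lists returns the same product as an ordinary triple-loop matrix multiplication; the skewed injection is what lets matching operands meet in the right processing element on the right cycle.
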

| # | Input | Output | Bidirectional |
|---|---|---|---|
| 0 | packed weights LSB | result LSB | (in) activations LSB |
| 1 | packed weights | result | (in) activations |
| 2 | packed weights | result | (in) activations |
| 3 | packed weights | result | (in) activations |
| 4 | packed weights | result | (in) activations |
| 5 | packed weights | result | (in) activations |
| 6 | packed weights | result | (in) activations |
| 7 | packed weights MSB | result MSB | (in) activations MSB |
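
The bit ordering in the table (pin 0 carries the LSB, pin 7 the MSB) can be sketched on the host side as follows. The helper names `byte_to_pin_levels` and `pin_levels_to_byte` are hypothetical and only illustrate the mapping; how the MCU actually drives or samples the pins depends on its own GPIO API, which is not covered here.

```python
def byte_to_pin_levels(value):
    """Split one byte into 8 pin levels; list index = pin number (0 is the LSB)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in 8 bits")
    return [(value >> pin) & 1 for pin in range(8)]


def pin_levels_to_byte(levels):
    """Rebuild a byte from 8 sampled pin levels; levels[0] is the LSB."""
    if len(levels) != 8:
        raise ValueError("expected exactly 8 pin levels")
    value = 0
    for pin, level in enumerate(levels):
        value |= (level & 1) << pin
    return value
```

For example, `byte_to_pin_levels(0x81)` sets only pins 0 and 7 high, and `pin_levels_to_byte` applied to the eight Output pins recovers one result byte.
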