IEEE Micro, volume 38, number 5, 2018.


Many recent works take advantage of highly parallel analog in-situ computation in memristor crossbars to accelerate the many vector-matrix multiplication operations in deep neural networks (DNNs). However, these in-situ accelerators have two significant shortcomings: the analog-to-digital converters (ADCs) account for a large fraction of chip power and area, and the accelerators adopt a homogeneous design in which every resource is provisioned for the worst case. By addressing both problems, the new architecture, called Newton, moves closer to achieving optimal energy per neuron for crossbar accelerators. We introduce new techniques that apply at different levels of the tile hierarchy, some leveraging heterogeneity and others relying on divide-and-conquer numeric algorithms to reduce computations and ADC pressure. Finally, we place constraints on how a workload is mapped to tiles, which helps reduce resource provisioning in tiles. For many convolutional-neural-network (CNN) dataflows and structures, Newton achieves a 77-percent decrease in power, a 51-percent improvement in energy efficiency, and 2.1× higher throughput/area, relative to the state-of-the-art In-Situ Analog Arithmetic in Crossbars (ISAAC) accelerator.


Bib Entry

  author = {Nag, Anirban and Shafiee, Ali and Balasubramonian, Rajeev and Srikumar, Vivek and Walker, Ross and Strachan, John Paul and Muralimanohar, Naveen},
  title = {{Newton: Gravitating Towards the Physical Limits of Crossbar Acceleration}},
  journal = {IEEE Micro},
  year = {2018},
  volume = {38}