Abstract
Recent success in deep neural networks has generated strong interest in hardware accelerators to improve speed and energy consumption. This paper presents a new type of photonic accelerator based on coherent detection that is scalable to large (N ≳ 10⁶) networks and can be operated at high (gigahertz) speeds and very low (subattojoule) energies per multiply and accumulate (MAC), using the massive spatial multiplexing enabled by standard free-space optical components. In contrast to previous approaches, both weights and inputs are optically encoded so that the network can be reprogrammed and trained on the fly. Simulations of the network using models for digit and image classification reveal a “standard quantum limit” for optical neural networks, set by photodetector shot noise. This bound, which can be as low as 50 zJ/MAC, suggests that performance below the thermodynamic (Landauer) limit for digital irreversible computation is theoretically possible in this device. The proposed accelerator can implement both fully connected and convolutional networks. We also present a scheme for backpropagation and training that can be performed in the same hardware. This architecture will enable a new class of ultralow-energy processors for deep learning.
Received 12 November 2018
Revised 21 February 2019
DOI: https://doi.org/10.1103/PhysRevX.9.021032
Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.
Popular Summary
The rapid growth of artificial intelligence based on deep neural networks has outpaced available processing power, which is limited by on-chip energy consumption. Optics can significantly improve the energy consumption of neural networks, but current approaches are limited to small systems. At present, the goal of a large-scale, reconfigurable optical neural network remains unrealized. We present a new approach based on photoelectric multiplication that is scalable to systems containing millions of neurons while retaining the speed and energy efficiency of optics.
Optical homodyne detection is a common technique that combines the quadratic response of photodetectors (the photocurrent scales with optical intensity, i.e., with the square of the field) and linear optical interference to compute the product of two input fields. We show that this “quantum photoelectric multiplication” process can be generalized to matrix-vector and matrix-matrix products using standard free-space optical components (such as lenses, beam splitters, and photodetector arrays) for matrix dimensions ranging from thousands to millions.
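The photoelectric multiplication described above can be sketched numerically: interfering each weight row with the input vector on a 50:50 beamsplitter gives output intensities |w + x|²/2 and |w − x|²/2, and the difference of the two photocurrents, integrated over the fan-out, yields the matrix-vector product. The NumPy sketch below is our illustration of this identity, not the paper's simulation code; the function name is ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def homodyne_matvec(W, x):
    """Matrix-vector product via simulated balanced homodyne detection.

    Each weight row W[i] interferes with the input vector x on a 50:50
    beamsplitter; the two output-port intensities are |W[i]+x|^2/2 and
    |W[i]-x|^2/2.  Their difference is 2*W[i]*x elementwise, so summing
    (time-integrating) the photocurrent difference recovers W @ x.
    """
    plus = np.abs(W + x) ** 2 / 2   # intensity at the '+' output port
    minus = np.abs(W - x) ** 2 / 2  # intensity at the '-' output port
    return (plus - minus).sum(axis=1) / 2  # integrated charge difference

W = rng.normal(size=(4, 8))  # weight matrix (real field amplitudes)
x = rng.normal(size=8)       # input vector (real field amplitudes)
assert np.allclose(homodyne_matvec(W, x), W @ x)
```

The cancellation of the |W|² and |x|² terms in the intensity difference is exactly what makes balanced detection compute a product rather than a sum of powers.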
Digital computers are irreversible devices subject to a thermodynamic limit for energy per operation. However, optical interference is reversible, so this “Landauer limit” does not apply. Instead, quantum-limited noise gives rise to a “standard quantum limit” for optical neural networks. In simulations, we show that this limit is low enough to suggest sub-Landauer performance is theoretically possible with optics.
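The standard-quantum-limit scaling can be illustrated with a toy model (our construction, not the paper's analysis): shot noise adds a zero-mean error to each homodyne output whose variance grows with the number of accumulated MACs and shrinks with the optical energy (photon number) spent per MAC.

```python
import numpy as np

rng = np.random.default_rng(42)

def shot_noise_matvec(W, x, photons_per_mac):
    """Toy model of a shot-noise-limited optical matrix-vector product.

    Each output element accumulates noise from N = len(x) MACs, with
    variance taken inversely proportional to the photon number per MAC.
    This Gaussian approximation (in arbitrary units) is a simplified
    stand-in for the "standard quantum limit" scaling.
    """
    signal = W @ x
    sigma = np.sqrt(x.size / photons_per_mac)  # shot-noise std
    return signal + sigma * rng.normal(size=signal.shape)

W = rng.normal(size=(16, 64))
x = rng.normal(size=64)
exact = W @ x
err_low = np.abs(shot_noise_matvec(W, x, photons_per_mac=0.1) - exact).mean()
err_high = np.abs(shot_noise_matvec(W, x, photons_per_mac=100.0) - exact).mean()
assert err_high < err_low  # more energy per MAC -> less shot noise
```

In this picture, the energy per MAC is a free parameter, and the network's classification accuracy, rather than a fixed precision target, sets how few photons one can afford.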
Matrix products are the rate-limiting step in neural-network inference, accounting for the vast majority of computing time. Performing this step optically will boost the performance of neural-network processors by orders of magnitude and enable a new class of ultra-low-energy processors for deep learning.