Look at the tip of a standard #2 pencil. Now, imagine performing over one trillion multiplication operations in the area of that pencil tip every second. This can be accomplished using today’s 7nm semiconductor technology. Combining this massive compute capability with deep neural networks in small, low-cost, battery-powered devices will help us address challenges from Covid-19 to Alzheimer’s disease.
The neural networks behind stand-out systems like AlphaGo, Alexa, GPT-3, and AlphaFold require this kind of computational power to do their magic. Normally they run on data-center servers, GPUs, and massive power supplies. But soon they’ll run on devices that consume less power than a single LED bulb on a strand of holiday lights.
A new class of machine learning called TinyML is bringing these big, math-heavy neural networks to sensors, wearables, and phones. Neural networks rely heavily on multiplication, and emerging hardware implements multiplication using low-precision numbers (8 bits or fewer). This enables chip designers to build many more multipliers in a much smaller area and power envelope than the usual 32-bit, single-precision, floating-point multipliers allow. Research has shown that, in many real-world cases, using low-precision numbers inside neural networks has little to no impact on accuracy. This approach is poised to deliver ultra-efficient neural network inference wherever it’s needed most.
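To make the low-precision idea concrete, here is a minimal sketch of a dot product, the core operation inside a neural network layer, computed once in ordinary floating point and once using 8-bit integers. It assumes symmetric linear quantization with a single scale per vector, which is one common scheme and not necessarily what any particular chip uses. The function names `quantize` and `quantized_dot` and the sample values are made up for illustration.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers using one shared scale (symmetric scheme).

    Illustrative assumption: real hardware and toolchains may use other schemes.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def quantized_dot(weights, activations, num_bits=8):
    """Dot product in which every multiplication uses small integers."""
    q_weights, w_scale = quantize(weights, num_bits)
    q_activations, a_scale = quantize(activations, num_bits)
    # Integer multiply-accumulate: the part low-precision hardware makes cheap.
    accumulator = sum(w * a for w, a in zip(q_weights, q_activations))
    # A single floating-point rescale at the end recovers the real-valued result.
    return accumulator * w_scale * a_scale


# Hypothetical sample values, chosen only for illustration.
weights = [0.42, -1.30, 0.07, 0.88, -0.55]
activations = [1.10, 0.25, -0.90, 0.60, 0.33]

exact = sum(w * a for w, a in zip(weights, activations))
approx = quantized_dot(weights, activations)

print(f"float32-style result: {exact:.4f}")
print(f"8-bit integer result: {approx:.4f}")
print(f"absolute difference:  {abs(exact - approx):.4f}")
```

In this sketch every multiplication operates on integers between -127 and 127, and the only floating-point work is one rescaling step at the end. The two results differ only by a small rounding error, which is the intuition behind why low precision often costs little accuracy.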
Let me give one example. In addressing the Covid-19 pandemic, a major bottleneck developed around testing and identifying infected patients. Recent research suggests that a collection of neural networks trained on thousands of “forced cough” audio clips may be able to detect whether the cougher has the illness, even when the individual is asymptomatic. The neural networks used are computationally expensive, requiring trillions of multiplication operations per second. TinyML could run such cough-analyzing neural networks on small, battery-powered devices such as phones and wearables.
My hope for AI in 2021 is that sophisticated healthcare applications enabled by large neural networks running on small devices will usher in a new era of personalized healthcare that improves the lives of billions of people.
Matthew Mattina leads the Machine Learning Research Lab at Arm as a distinguished engineer and senior director.