Scientists from Peking University and other Chinese research institutes have created the world’s first tensor processing unit (TPU) based on carbon nanotubes. This innovative processor has the potential to greatly enhance the energy efficiency of artificial intelligence algorithms (source: Nature).
The inspiration behind this groundbreaking development came from the rapid progress of applications like ChatGPT and Sora, as well as Google’s advancements in TPU technology. The researchers observed that traditional silicon semiconductor technologies are struggling to efficiently process the massive amounts of data required for modern AI systems.
The newly developed chip comprises roughly 3,000 carbon nanotube field-effect transistors organized into a 3×3 array of computing units. These units form a systolic array architecture that can execute 2-bit integer multiply-accumulate and matrix multiplication operations in parallel.
The chip’s architecture is designed to maintain a continuous systolic flow of input data, reducing reads and writes to static RAM (SRAM) and thereby increasing energy efficiency.
Each computing unit within the chip receives data from neighboring blocks, calculates a partial result independently, and forwards its operands to adjacent units. These computing blocks are optimized for 2-bit multiply-accumulate (MAC) and matrix multiplication operations on both signed and unsigned integers.
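The dataflow described above can be illustrated with a small simulation. The sketch below models a 3×3 output-stationary systolic array in plain Python/NumPy: operands enter at the array edges with a one-cycle skew per row and column, flow to right and bottom neighbors each cycle, and each cell performs one 2-bit MAC per cycle into a local accumulator. This is an illustrative model of the general systolic-array technique, not the chip’s actual circuit.

```python
import numpy as np

def systolic_matmul(A, B):
    """Cycle-by-cycle sketch of a 3x3 output-stationary systolic array.

    Rows of A enter from the left and columns of B from the top, each
    skewed by one cycle per row/column so that A[i, k] and B[k, j]
    arrive at cell (i, j) on the same cycle. Every cycle, each cell
    multiplies its two operands, adds the product to a local
    accumulator, and the operands move on to the right/bottom neighbor.
    """
    n = A.shape[0]
    acc = np.zeros((n, n), dtype=np.int64)    # one accumulator per cell
    a_reg = np.zeros((n, n), dtype=np.int64)  # operands flowing rightwards
    b_reg = np.zeros((n, n), dtype=np.int64)  # operands flowing downwards

    for t in range(3 * n - 2):                # enough cycles to drain the array
        # operands advance to the neighboring cells (right and down)
        a_reg[:, 1:] = a_reg[:, :-1]
        b_reg[1:, :] = b_reg[:-1, :]
        # inject skewed inputs at the array edges (zeros once exhausted)
        for i in range(n):
            k = t - i
            a_reg[i, 0] = A[i, k] if 0 <= k < n else 0
            b_reg[0, i] = B[k, i] if 0 <= k < n else 0
        acc += a_reg * b_reg                  # one MAC per cell per cycle
    return acc

# 2-bit operands (values 0..3), as in the chip's computing blocks
A = np.array([[1, 2, 3], [0, 1, 2], [3, 3, 1]])
B = np.array([[2, 0, 1], [1, 3, 2], [0, 2, 3]])
assert np.array_equal(systolic_matmul(A, B), A @ B)
```

Because the operands meet inside the array and results accumulate in place, each input value is read from memory only once, which is exactly the property that lets a systolic design cut SRAM traffic.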
Using the carbon tensor processor, the researchers constructed a five-layer convolutional neural network capable of image recognition with an impressive 88% accuracy while consuming only 295 μW of power, marking it as the most energy-efficient technology of its kind.
Modeling studies have indicated that a carbon nanotube TPU fabricated at a 180-nanometer technology node could achieve a clock frequency of 850 MHz, with an energy efficiency surpassing 1 TOPS/W (trillions of operations per second per watt).
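The TOPS/W figure of merit is simply throughput divided by power. The back-of-envelope sketch below shows how it is computed; the throughput numbers (a 3×3 MAC array at 850 MHz, counting each MAC as two operations) are illustrative assumptions for this example, not figures from the paper.

```python
def tops_per_watt(ops_per_second, power_watts):
    """Energy efficiency in TOPS/W: trillions of operations/second per watt."""
    return ops_per_second / 1e12 / power_watts

# Hypothetical illustration: a 3x3 array of MAC units at 850 MHz,
# counting each MAC as two operations (one multiply + one add).
throughput = 3 * 3 * 850e6 * 2       # 1.53e10 ops/s

# At that throughput, reaching 1 TOPS/W means the array may draw at
# most throughput / 1e12 watts:
power_budget = throughput / 1e12     # 0.0153 W, i.e. 15.3 mW
print(tops_per_watt(throughput, power_budget))  # prints 1.0
```

This also shows why low operand precision matters: 2-bit MACs are far cheaper than 32-bit ones, so the same power budget buys many more operations per second.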
Looking ahead, the researchers aim to enhance the performance and energy efficiency of their approach by using aligned semiconducting carbon nanotube films, shrinking transistor dimensions, boosting computing-block throughput, and adopting CMOS logic.
Furthermore, the researchers are considering integrating the carbon TPU with silicon chips to create three-dimensional structures, positioning a silicon processor beneath a carbon TPU layer to enhance overall computational capability.
The tightly connected computing blocks and systolic data flow architecture of the processor have the potential to significantly reduce memory access, a critical factor in cutting down energy consumption. This development is particularly crucial as the demand for computing power in AI algorithms continues to grow.