Scientists in China have developed a tensor processing unit (TPU) that uses carbon-based transistors instead of silicon – and they say it's extremely energy efficient.
1 TOPS per watt seems more or less in line with what you can get out of an NVIDIA RTX 4090. In fact, with the right kind of data, it looks like they’re pushing 2-3 TOPS per watt these days. INT8 with 50% sparsity can do 1.3 POPS (1300 TOPS), and the 4090 has a maximum power draw of 450 watts, so that works out to about 3 TOPS per watt.
Yeah, I think whoever wrote this article got very math-confused.
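For anyone who wants to check the arithmetic, here is a quick sketch. The 1300 TOPS and 450 W figures are the ones quoted in this comment; the function and variable names are just made up for illustration, and this treats peak throughput and maximum board power as if they happened at the same time, which is a simplification.

```python
def tops_per_watt(tops: float, watts: float) -> float:
    """Return efficiency in TOPS/W for a given throughput and power draw."""
    if watts <= 0:
        raise ValueError("power draw must be positive")
    return tops / watts


# Figures quoted in the comment (sparse INT8 peak, maximum power draw).
rtx4090_sparse_int8_tops = 1300.0  # 1.3 POPS expressed in TOPS
rtx4090_max_power_w = 450.0

efficiency = tops_per_watt(rtx4090_sparse_int8_tops, rtx4090_max_power_w)
print(f"{efficiency:.2f} TOPS/W")  # prints "2.89 TOPS/W"
```

So "about 3 TOPS per watt" is a fair rounding, and the 1 TOPS/W claim for the carbon-based chip sits in the same ballpark rather than far ahead of it.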
An actual quote from the authors says this:
“The system simulation results show that the carbon-based transistor using the 180 nm technology node can reach 850 MHz and the energy efficiency exceeds 1TOPS/w, which shows obvious advantages over other device technologies at the same technology node.”
So it’s (in simulations) way more efficient than 180nm silicon, a node that silicon reached around 1999. If it can be brought down to 10 or even 5nm or less, which they think is theoretically possible, it will probably see insane efficiency gains.
I can’t find anything public (damn paywall) about the operations per second achieved by the actual tested chip, which had only 3000 transistors and was capable of 2-bit operations. Without knowing that, we can’t know the actual empirical efficiency. But it’s so early-days that the simulated result for an 8-bit version is probably more useful information anyhow, assuming it’s accurate.