Sohu AI Chip: World’s First Dedicated Transformer Chip Processes 500,000 Tokens Per Second

June 26, 2024 – Etched, a startup founded less than two years ago by Harvard dropouts Gavin Uberti and Chris Zhu, has announced the completion of a $120 million Series A funding round. The funds will go toward developing and marketing Sohu, which the company bills as the world’s first application-specific integrated circuit (ASIC) built for Transformer models.

The Sohu chip stands out for etching the Transformer architecture directly into the silicon. According to Uberti, the chip is manufactured on Taiwan Semiconductor Manufacturing Company’s (TSMC) 4-nanometer process. Etched says this specialization delivers better inference performance than GPUs and other general-purpose AI chips while consuming less energy.

In benchmark tests running the Llama 70B model, Etched reports that Sohu processed over 500,000 tokens per second. The company says this throughput enables products that are impractical to build on GPUs.

According to Etched, the Sohu chip enables a range of advanced capabilities, including real-time voice agents, processing thousands of words of text in milliseconds, tree search for better code generation, parallel comparison of hundreds of responses, multicast speculative decoding, and real-time generation of new content. The company also positions the chip for running trillion-parameter models in the future.
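
As a rough sanity check on the claim of processing thousands of words in milliseconds, the arithmetic can be sketched as below. This is a back-of-the-envelope estimate, not a benchmark: the ratio of 1.3 tokens per word is an assumed rule of thumb, the function name is hypothetical, and the calculation optimistically treats the full 500,000 tokens per second as available to a single request.

```python
def seconds_to_process(num_words, tokens_per_word=1.3, tokens_per_second=500_000):
    """Estimate the time to process a text at a given aggregate throughput.

    tokens_per_word is an assumed average, not a measured value;
    tokens_per_second defaults to the throughput figure quoted in the article.
    """
    tokens = num_words * tokens_per_word
    return tokens / tokens_per_second


if __name__ == "__main__":
    for words in (1_000, 3_000, 10_000):
        ms = seconds_to_process(words) * 1_000  # convert seconds to milliseconds
        print(f"{words:>6} words ~ {ms:.1f} ms")
```

Under these assumptions, a few thousand words would take on the order of single-digit to low double-digit milliseconds, which is consistent with the millisecond-level figure quoted above.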

