ByteDance, the owner of TikTok, announced a 1.71-fold improvement in the efficiency of training large language models (LLMs), an advance that could reduce Chinese technology companies' dependence on powerful NVIDIA GPUs.
Developers from its Doubao team attributed the breakthrough to COMET, an optimized Mixture-of-Experts (MoE) system that distributes computational resources more effectively. In a paper published on the arXiv platform, they noted that the technology is already deployed in ByteDance's production clusters of more than 10,000 GPUs, delivering significant savings in computing power.
MoE is widely used to scale LLMs toward trillions of parameters while keeping computational cost roughly fixed: only a few experts are activated per token. Previously, however, the method suffered from poor overlap between communication and computation, which reduced efficiency. ByteDance's new approach eliminates these bottlenecks in data transfer, speeding up training.
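To make the mechanism concrete, below is a minimal, illustrative PyTorch sketch of top-k MoE routing. It is not ByteDance's COMET code, and the class and parameter names (TinyMoE, n_experts, top_k) are ours; it only shows why per-token compute stays roughly fixed as experts are added, and where the expert-dispatch communication step arises in a distributed setup.

```python
# Illustrative sketch of Mixture-of-Experts routing (not ByteDance's COMET).
# Each token is sent to only top_k of n_experts, so compute per token stays
# roughly constant even as total parameters grow with the expert count.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, n_experts)  # router scores per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                               # x: (tokens, d_model)
        scores = self.gate(x)                           # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # In a real multi-GPU deployment this loop becomes an all-to-all
        # "dispatch" of tokens to the devices hosting each expert -- the
        # communication step that COMET reportedly overlaps with compute.
        for e, expert in enumerate(self.experts):
            rows, slots = (idx == e).nonzero(as_tuple=True)
            if rows.numel() == 0:
                continue
            out[rows] += weights[rows, slots].unsqueeze(1) * expert(x[rows])
        return out

tokens = torch.randn(16, 64)
print(TinyMoE()(tokens).shape)  # torch.Size([16, 64])
```

In this single-device toy the dispatch is just tensor indexing; at cluster scale the same step is network traffic between GPUs, which is the bottleneck the paper targets.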
The achievement could weaken NVIDIA's position in the Chinese market, where its high-performance chips fall under strict US export restrictions. Similar developments have previously caused swings in NVIDIA's market value: in January, after the success of China's DeepSeek, the company lost nearly $600 billion in a single day, although the stock later recovered.
ByteDance plans to open-source the new system to stimulate further advances in machine learning. Meanwhile, other Chinese tech giants are accelerating their AI development. Recently, a group of American researchers, including Fei-Fei Li, introduced a new reasoning model trained in just 26 minutes on 16 NVIDIA H100 GPUs, built on top of an Alibaba base model.