This Chinese Super-Scale Intelligence Model, ‘Wu Dao 2.0’, Reportedly Has 1.75 Trillion Parameters, Surpassing All Prior Models in Size to Mark a New Milestone in Deep Learning

Deep learning is one area of technology where ambition seems to have no limits. According to a recent announcement by the Beijing Academy of Artificial Intelligence (BAAI) in China, the field has reached yet another milestone with its “Wu Dao” AI system. GPT-3 sparked new interest among AI researchers in super-scale pre-trained models. With 175 billion parameters, it achieved exceptional results across natural language processing (NLP) tasks. What it lacks, however, is any real cognitive ability or common sense. Despite their size, such models therefore still struggle with tasks such as open-ended dialogue and visual reasoning. With Wu Dao, the researchers aim to address this gap. It is China’s first attempt at a home-grown super-scale intelligent model system.

What is Wu Dao?

Wu Dao is primarily a multimodal AI system, which means it can handle a wide range of tasks, including the generation of:

  • Audio
  • Text
  • Images

According to some reports, it can even power virtual tools. The model is said to be more sophisticated than those introduced by Google and OpenAI. What most sets it apart from the others is its size: Wu Dao 2.0 reportedly has 1.75 trillion parameters, ten times the 175 billion of GPT-3, OpenAI’s most prominent model. More than 100 scientists from various organizations worked together on the new model. Wu Dao was trained on 1.2 TB of text in both English and Chinese, so it can work with both languages. It can simulate conversations, understand pictures, write original poems, and create new recipes.
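To put the two parameter counts side by side, here is a small back-of-the-envelope sketch. The counts come from the article; the assumption of 2 bytes per parameter (16-bit weights) is ours and is not stated by BAAI, so the storage figure is only a rough illustration.

```python
# Parameter counts quoted in the article.
WU_DAO_PARAMS = 1.75e12  # 1.75 trillion
GPT3_PARAMS = 175e9      # 175 billion


def size_ratio(larger: float, smaller: float) -> float:
    """How many times bigger one parameter count is than another."""
    return larger / smaller


def weight_storage_tb(params: float, bytes_per_param: int = 2) -> float:
    """Rough storage for the weights alone, in terabytes (1 TB = 1e12 bytes).

    bytes_per_param=2 assumes 16-bit weights; this is an assumption,
    not a figure from the announcement.
    """
    return params * bytes_per_param / 1e12


ratio = size_ratio(WU_DAO_PARAMS, GPT3_PARAMS)
print(f"Wu Dao 2.0 is about {ratio:.0f}x the size of GPT-3")
print(f"Weights alone at 16-bit: about {weight_storage_tb(WU_DAO_PARAMS):.1f} TB")
```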

Four Subdivisions

Wu Dao will work with the help of four related research projects, namely:

Wu Dao – Wen Yuan

This project serves as an open-source Chinese Pre-trained Model (CPM). It is claimed that CPM will give a boost to the CPM-Distill model and reduce language confusion (most likely perplexity, the standard language-model metric) by 38 percent. This should translate into better efficiency on downstream tasks.
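If the "language confusion" figure refers to perplexity (an assumption on our part; the reports do not define the term), the metric can be sketched as follows. Perplexity is the exponential of the average negative log-likelihood a model assigns to each token, so lower is better. The per-token probabilities below are made up purely for illustration and are not CPM results.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood), using natural logs."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


def relative_reduction(before, after):
    """Fractional drop from `before` to `after` (0.38 means 38 percent lower)."""
    return (before - after) / before


# Hypothetical probabilities two models assign to the same four tokens.
baseline = [math.log(p) for p in (0.10, 0.20, 0.05, 0.25)]
improved = [math.log(p) for p in (0.20, 0.30, 0.10, 0.30)]

ppl_before = perplexity(baseline)
ppl_after = perplexity(improved)
print(f"baseline perplexity: {ppl_before:.2f}")
print(f"improved perplexity: {ppl_after:.2f}")
print(f"relative reduction:  {relative_reduction(ppl_before, ppl_after):.0%}")
```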

Wu Dao – Wen Lan

This is the first general-purpose Chinese multimodal pre-training model, and it is intended to understand connotative language. It relies on weak correlations between images and text, learned with a contrastive cross-modal learning algorithm. Its image and text encoders can also be replaced, which reportedly lets it run 20 times faster than other models, notably UNITER.
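As a rough illustration of what contrastive cross-modal learning means, the sketch below implements a generic symmetric InfoNCE-style loss over paired image and text embeddings. This is a common textbook formulation, not Wen Lan's actual algorithm, and the function names and the temperature value are our own assumptions. The loss is low when each image embedding is most similar to its own caption and high when pairs are mismatched.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cross_entropy_diag(matrix):
    """Mean cross-entropy where the correct class for row k is column k."""
    total = 0.0
    for k, row in enumerate(matrix):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[k]
    return total / len(matrix)


def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE-style loss; image_embs[i] is paired with text_embs[i]."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    # Scaled cosine similarity between every image and every text.
    sims = [[dot(i, t) / temperature for t in txts] for i in imgs]
    sims_t = [list(col) for col in zip(*sims)]
    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy_diag(sims) + cross_entropy_diag(sims_t))


images = [[1.0, 0.0], [0.0, 1.0]]
matched_texts = [[0.9, 0.1], [0.1, 0.9]]
swapped_texts = [[0.1, 0.9], [0.9, 0.1]]
print("matched pairs loss:", contrastive_loss(images, matched_texts))
print("swapped pairs loss:", contrastive_loss(images, swapped_texts))
```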

Wu Dao – Wen Hui

Through this project, BAAI has proposed a new pre-training paradigm, the General Language Model (GLM). It is claimed to be the only model to achieve the best results on both language understanding and generation tasks, overtaking standard pre-trained models such as BERT and RoBERTa. A vector-based fine-tuning method, P-tuning, is used to improve results on these tasks. According to reports, the inverse prompting algorithm used in this project achieves performance close to that of humans.
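The core idea behind vector-based prompt tuning can be shown with a toy example: the pretrained model stays frozen, and only a handful of continuous prompt vectors placed in front of the input are trained. The sketch below is a deliberately simplified stand-in, not the published P-tuning method (which also uses a prompt encoder); the "model" is a hypothetical fixed linear scorer, and every name in it is our own.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def frozen_model(sequence, weights):
    """Toy frozen model: score = weights . mean(sequence vectors)."""
    dim = len(weights)
    mean = [sum(vec[d] for vec in sequence) / len(sequence) for d in range(dim)]
    return dot(weights, mean)


def squared_error(prompt, token_embs, weights, target):
    return (frozen_model(prompt + token_embs, weights) - target) ** 2


def tune_prompt(prompt, token_embs, weights, target, lr=0.5, steps=50):
    """Gradient descent on the prompt vectors only.

    `weights` and `token_embs` are read but never modified, mirroring a
    frozen pretrained model with frozen input embeddings.
    """
    prompt = [list(v) for v in prompt]  # work on a copy
    seq_len = len(prompt) + len(token_embs)
    for _ in range(steps):
        error = frozen_model(prompt + token_embs, weights) - target
        for vec in prompt:
            for d in range(len(weights)):
                # d(error^2)/d(vec[d]) = 2 * error * weights[d] / seq_len
                vec[d] -= lr * 2 * error * weights[d] / seq_len
    return prompt


weights = [0.5, -1.0, 2.0]                          # frozen "model" parameters
token_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]     # frozen input embeddings
initial_prompt = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # trainable prompt vectors
target = 1.0

tuned_prompt = tune_prompt(initial_prompt, token_embs, weights, target)
print("error before:", squared_error(initial_prompt, token_embs, weights, target))
print("error after: ", squared_error(tuned_prompt, token_embs, weights, target))
```

Because only the prompt vectors change, the same frozen model can be reused across tasks with a different small set of vectors per task.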

Wu Dao – Wen Su

This is FastMoE, an open-source Mixture-of-Experts (MoE) system. It supports the PyTorch framework and a wide range of hardware. Converting a model to MoE reportedly requires a single line of code. Training is said to be 47 times faster than the plain PyTorch implementation. The model can even handle very long and complex biomolecular structures.
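For readers unfamiliar with the idea, a mixture-of-experts layer sends each input to only a few "expert" sub-networks chosen by a small gating function, which is how total parameter counts can grow without every parameter being used on every input. The sketch below is a generic top-k gating example in plain Python. It does not use or reproduce the FastMoE API, and the experts and gate weights are hypothetical toy values.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(gate_logits, k=2):
    """Pick the k highest-scoring experts; renormalize their weights to sum to 1."""
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return [(i, probs[i] / total) for i in top]


def moe_forward(x, gate_weights, experts, k=2):
    """Run only the selected experts and mix their outputs by gate weight.

    Assumes every expert returns a vector with the same length as x.
    """
    gate_logits = [dot(row, x) for row in gate_weights]
    routes = top_k_route(gate_logits, k)
    out = [0.0] * len(x)
    for idx, weight in routes:
        y = experts[idx](x)
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, routes


def make_expert(scale):
    """Toy expert: multiplies its input by a fixed scale."""
    return lambda v: [scale * c for c in v]


experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[5.0, 0.0], [0.0, 5.0], [1.0, 0.0], [-5.0, 0.0]]

output, routes = moe_forward([1.0, 0.0], gate_weights, experts, k=2)
print("selected experts and weights:", routes)
print("mixed output:", output)
```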

Currently, the research team at BAAI is discussing model applications with organizations such as Sogou, 360, Alibaba, Zhipu, and Xinhua News Agency. There are also plans to build APIs that support high concurrency and high-speed inference for enterprises and individual users.


Amreen Bawa is a consulting intern at MarktechPost. Along with pursuing BA Hons in Social Sciences from Panjab University, Chandigarh, she is also a keen learner and writer, having special interest in the application and scope of artificial intelligence in various facets of life.
