AI company 01.AI (Zero One Infinite), founded by Kai-Fu Lee, chairman and CEO of Innovation Works, has completed a new round of financing led by Alibaba Cloud at a valuation of over US$1 billion, according to media reports. The startup also released its first open-source bilingual large model, "Yi," today.
"Yi" comes in two base models with parameter sizes of 6B and 34B. It also introduces a long context window, which serves as the model’s "memory." Yi currently supports a 200K context window, capable of processing about 400,000 characters of text, which is also currently the longest context window among large models globally.
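The two context-window figures above imply a rough ratio of about two characters per token. The following is a minimal sketch of that arithmetic only. It assumes "200K" means 200,000 tokens, and the function name and the fixed ratio are illustrative assumptions, not part of Yi's tokenizer or API.

```python
import math

# Figures quoted in the article; "200K" is assumed to mean 200,000 tokens.
CONTEXT_TOKENS = 200_000
APPROX_CHARS = 400_000

# Implied average number of characters per token (an approximation only).
CHARS_PER_TOKEN = APPROX_CHARS / CONTEXT_TOKENS


def fits_in_context(num_chars: int) -> bool:
    """Rough check of whether a text of `num_chars` characters fits in the window.

    Hypothetical helper: a real check would run the model's tokenizer instead
    of relying on an average ratio.
    """
    estimated_tokens = math.ceil(num_chars / CHARS_PER_TOKEN)
    return estimated_tokens <= CONTEXT_TOKENS


print(CHARS_PER_TOKEN)
print(fits_in_context(350_000), fits_in_context(450_000))
```

Actual token counts depend on the tokenizer and on the language of the text, so the ratio should be read as an order-of-magnitude guide.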
Kai-Fu Lee mentioned that, because of the GPU shortage, the team has to manage scale carefully when growing the model from 6B to larger sizes, reducing the cost of trial and error rather than blindly pursuing "size." By refining its AI infrastructure (AI Infra), the team cut Yi-34B’s training costs by 40%: "If other competitors need 2,000 GPUs, we only need 1,200."
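The quoted GPU numbers are consistent with the 40% figure, as a quick check shows. This sketch assumes that training cost scales linearly with the number of GPUs over the same training time, which is a simplification introduced here and not a claim from the article. The function names are illustrative.

```python
def gpus_after_reduction(baseline_gpus: int, cost_reduction: float) -> int:
    """GPUs needed after cutting cost by `cost_reduction` (a fraction in [0, 1]).

    Assumes cost is proportional to GPU count, which is a simplification.
    """
    return round(baseline_gpus * (1 - cost_reduction))


def implied_reduction(baseline_gpus: int, optimized_gpus: int) -> float:
    """Fractional saving implied by moving from baseline to optimized GPU count."""
    return 1 - optimized_gpus / baseline_gpus


print(gpus_after_reduction(2000, 0.40))
print(f"{implied_reduction(2000, 1200):.0%}")
```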
Yi’s training data mainly comes from crawled public corpora and databases. Kai-Fu Lee explained that the difficulty with training data lies in its high repetition rate and low quality. Through data cleaning, the team retained about 3T of high-quality data from more than 100T of raw data. Because Chinese corpora are of lower quality, English currently makes up a larger share of Yi’s training data than Chinese.
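The article does not describe how the cleaning was done. As a general illustration of the two problems mentioned (repetition and low quality), here is a minimal sketch that removes exact duplicates by hashing normalized text and drops documents that fail a crude quality heuristic. The thresholds and function names are hypothetical and are not 01.AI's pipeline. Production systems typically add near-duplicate detection and learned quality classifiers.

```python
import hashlib


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants hash identically."""
    return " ".join(text.lower().split())


def looks_low_quality(text: str, min_chars: int = 20,
                      min_letter_ratio: float = 0.6) -> bool:
    """Crude heuristic (assumed thresholds): too short, or mostly symbols/digits."""
    if len(text) < min_chars:
        return True
    letters = sum(ch.isalpha() or ch.isspace() for ch in text)
    return letters / len(text) < min_letter_ratio


def clean_corpus(docs: list[str]) -> list[str]:
    """Keep the first copy of each distinct document that passes the heuristic."""
    seen: set[str] = set()
    kept: list[str] = []
    for doc in docs:
        norm = normalize(doc)
        digest = hashlib.sha256(norm.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # exact duplicate after normalization
        seen.add(digest)
        if looks_low_quality(norm):
            continue  # fails the quality heuristic
        kept.append(doc)
    return kept


docs = [
    "Large language models learn from text.",
    "large  language models learn from TEXT.",  # duplicate after normalization
    "$$$ 1234 #### 5678 !!!! 9999 ????",        # mostly symbols and digits
    "ok",                                       # too short
]
cleaned = clean_corpus(docs)
print(len(cleaned), "of", len(docs), "documents kept")
```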
So how capable is Yi? For its evaluation, 01.AI drew on several of the datasets used to evaluate Meta’s open-source model Llama 2, such as PIQA, SIQA, HellaSwag, and WinoGrande, to assess Yi across multiple dimensions, including "common sense reasoning," "reading comprehension," and "math and coding abilities."
The results show that Yi-6B reaches the average level of domestic and international open-source models in common sense reasoning and reading comprehension but is weaker in math and coding. Yi-34B leads significantly in common sense reasoning and reading comprehension and is at the forefront in math and coding. In contrast to the 7B and 13B parameter sizes commonly seen on the market, 01.AI offers 6B and 34B models. Kai-Fu Lee believes that 34B is a "golden ratio" size that is scarce among open-source large models: it reaches the "emergence" threshold and meets accuracy requirements while still allowing manufacturers to use efficient single-card inference and keeping training costs manageable.
Kai-Fu Lee candidly stated that, before completing the financing, 01.AI had already taken on tens of millions of dollars in debt to cover computing power and other training costs. This also reflects Kai-Fu Lee’s determination to go all-in on AI.
As the founder of 01.AI, Kai-Fu Lee is also one of the leading figures in Chinese artificial intelligence. He has served as a global vice president at Microsoft and as global vice president and president of Greater China at Google, and he founded the angel investment and enterprise incubation platform Innovation Works in 2009.
In March 2023, Kai-Fu Lee personally entered the large model race, issuing a "hero post" (an open call for talent) to build the new company 01.AI: "01.AI welcomes outstanding talents with AI 2.0 technical strength and faith in AGI to join us, to build a new AI 2.0 platform together, and accelerate the arrival of AGI." By July, 01.AI already had dozens of core members in place from domestic and international companies such as Alibaba, Baidu, Google, and Microsoft. At the press conference, Kai-Fu Lee said, "(The team) wrote the first line of code in June and July."
Kai-Fu Lee believes that the biggest business opportunity in the AI 2.0 era will appear in consumer-facing (To C) super applications. He noted that the first versions of the Internet era’s Super Apps, WeChat and TikTok, were not yet Super Apps but did accurately capture user needs. 01.AI’s goal is to build the next WeChat or TikTok of the AI 2.0 era.
As for 01.AI’s business planning, Kai-Fu Lee told 36Kr that companies unable to commercialize in the AI 1.0 era were eliminated early, and that the biggest challenge for those that did commercialize is making revenue sustainable and able to grow. Many AI 1.0 companies depend on large headcounts to generate revenue, which is not high-quality income.
He emphasized that revenue growth should be driven by technology, not by headcount. "With this principle, 01.AI will focus on consumer applications." Because domestic users’ awareness of and willingness to pay are still being cultivated, 01.AI will consider both the domestic market and overseas markets for its applications.