May 17, 2023, New York – Google's latest large language model, unveiled last week, was reportedly trained on nearly five times as much data as its 2022 predecessor. The new model, PaLM 2, performs better at programming, mathematics, and creative writing.
At its I/O developer conference, Google introduced PaLM 2, its newest general-purpose large language model. Internal documents reveal that the model was trained on 3.6 trillion tokens. Tokens, the words and word fragments a model ingests, are the raw material of training: they teach the model to predict the next word likely to appear in a given string.
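The idea of learning next-word prediction from token sequences can be illustrated with a toy sketch. This is not how PaLM 2 works internally (it is a large neural network, not a lookup table); the bigram counter below is only a minimal, hypothetical illustration of the training signal described above.

```python
from collections import Counter, defaultdict

# Toy corpus; real models train on trillions of tokens, not a few words.
corpus = "the model predicts the next word the model learns patterns".split()

# Count bigrams: how often each token follows the one before it.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the continuation seen most often after `word` in training."""
    candidates = follows[word]
    return candidates.most_common(1)[0][0] if candidates else None

print(predict_next("the"))  # prints "model" ("model" follows "the" most often)
```

A real language model replaces the counting table with learned parameters, but the objective is the same: given the tokens so far, assign high probability to the token that actually comes next.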
The previous version, PaLM, was released in 2022 with a token count of 780 billion.
While Google aims to showcase the prowess of its AI technology, integrated into search, email, word processing, and spreadsheet applications, the company remains reticent about disclosing the scale and other details of its training data. OpenAI, supported by Microsoft, also keeps details of its latest GPT-4 large language model under wraps.
Both companies say they withhold such information because of commercial competition. Google and OpenAI are each racing to attract users who want answers directly from chatbots rather than from traditional search engines.
However, as the artificial intelligence arms race intensifies, researchers in the field are advocating for increased transparency.
Since releasing PaLM 2, Google has emphasized that the new model is smaller than its predecessor, a sign that its technology is becoming more efficient even as it takes on more sophisticated tasks. Internal documents indicate that PaLM 2 has about 340 billion parameters, a measure of a model's complexity; the original PaLM has 540 billion.
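"Parameters" here are simply the learned weights of the network, and counting them is mechanical. The layer sizes below are made up for illustration; real LLMs stack many transformer layers to reach hundreds of billions of weights.

```python
# Hypothetical sizes for a tiny fully connected network (input, hidden, output).
layer_sizes = [512, 2048, 512]

def count_parameters(sizes):
    """Weights (n_in * n_out) plus biases (n_out) for each dense layer."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

print(count_parameters(layer_sizes))  # prints 2099712, about 2.1 million
```

By this measure a reported 340-billion-parameter model is smaller and cheaper to serve than a 540-billion-parameter one, which is the efficiency claim Google is making.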
Google has not yet commented on these reports.
In a blog post about PaLM 2, Google explains that the model uses a technique called "compute-optimal scaling," which makes large language models more efficient and better-performing overall: faster inference, fewer parameters to serve, and lower serving costs.
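Google has not published PaLM 2's exact scaling recipe, but the widely cited rule of thumb for compute-optimal training comes from DeepMind's Chinchilla work: train on roughly 20 tokens per parameter, with training cost estimated at about 6 FLOPs per parameter per token. The sketch below applies that public heuristic, not Google's disclosed method, to the reported parameter count.

```python
def chinchilla_tokens(n_params):
    """Chinchilla rule of thumb: ~20 training tokens per parameter
    for a compute-optimal model (an assumption, not Google's recipe)."""
    return 20 * n_params

def training_flops(n_params, n_tokens):
    """Standard estimate: ~6 FLOPs per parameter per training token."""
    return 6 * n_params * n_tokens

n = 340e9  # PaLM 2 parameter count per the leaked internal documents
d = chinchilla_tokens(n)
print(f"{d:.1e} tokens, {training_flops(n, d):.1e} FLOPs")
```

The intuition is that for a fixed compute budget, a smaller model trained on more tokens can match or beat a larger model trained on fewer, which is consistent with Google's claim that PaLM 2 is smaller yet more capable.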
In announcing PaLM 2, Google confirmed earlier media reports that the model was trained on text in 100 languages, enabling it to perform a broader range of tasks. It is already used in 25 features and products, including the company's experimental chatbot, Bard. The model comes in four sizes, from smallest to largest: Gecko, Otter, Bison, and Unicorn.
Based on publicly disclosed information, PaLM 2 is more powerful than any previously disclosed model. Meta's LLaMA large language model, announced in February, was trained on 1.4 trillion tokens. The last time OpenAI disclosed a training scale was for GPT-3, at 300 billion tokens. OpenAI said GPT-4 demonstrated "human-level performance" on various professional tests when it was released in March.
LaMDA, the conversational large language model Google introduced two years ago, was trained on 1.5 trillion tokens and was also promoted alongside Bard this February.
As new artificial intelligence applications rapidly enter the mainstream, controversies surrounding the underlying technology escalate.
El Mahdi El Mhamdi, a senior research scientist at Google, resigned in February this year, citing a lack of transparency in AI technology as a primary reason. This Tuesday, OpenAI CEO Sam Altman attended a congressional hearing on privacy and technology, emphasizing the need for a new framework to address the potential challenges of artificial intelligence.
“For a new technology, we need a new framework,” Altman stated. “Certainly, companies like ours should bear a lot of responsibility for the tools we introduce.”