Google’s AI rival to ChatGPT may pull data from the internet, unlike the viral OpenAI offering, Google CEO Sundar Pichai claimed on Monday – saying Google was opening up the conversational AI service, dubbed “Bard”, to “trusted testers” ahead of “making it more widely available to the public in the coming weeks.”
Bard “draws on information from the web to provide fresh, high-quality responses,” Pichai said. (It was not immediately clear whether he meant in near-real time or whether it was trained on recent web scrapes.)
Google Chrome is the world’s dominant browser by a country mile, with 65%+ of the market. With Google also owning over 90% of the world’s search engine market, its potential to put the model in front of billions of consumers is significant – a potential step-change in AI adoption among broader users.
(For all the hype among “media types” and those working in some form with technology, at a recent social gathering outside this echo chamber – a mixed group of non-technologists: actors, artists, parents, gardeners and teachers, for those wondering – The Stack found that none had heard of ChatGPT, let alone tried it.)
Google’s CEO added that “new AI features will begin rolling out on Google Search soon.”
Google AI rival to ChatGPT is based on LaMDA
The Google AI chatbot Bard will be based on its Language Model for Dialogue Applications (or LaMDA), a family of Transformer-based neural language models built for dialogue, which had up to 137 billion parameters at launch in 2021 and were pre-trained on 1.56 trillion words of public dialogue data and web text.
Transformers are AI models that learn context and “meaning” by tracking relationships in sequential data like the words in a sentence; this allows for self-supervised learning without the need for human-labelled data.
(They were pioneered at Google. Its influential 2017 paper explained that Transformers are capable of directly modelling “relationships between all words in a sentence, regardless of their respective position. [e.g. in the sentence] ‘I arrived at the bank after crossing the river’, to determine that the word ‘bank’ refers to the shore of a river and not a financial institution, the Transformer can learn to immediately attend to the word ‘river’.”)
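To make that attention idea concrete, here is a minimal sketch of scaled dot-product self-attention in Python/NumPy – a toy illustration of the mechanism the 2017 work describes, not Google’s implementation; the tiny embedding size and random projection matrices are placeholders for demonstration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    # Project input embeddings into queries, keys and values.
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # every word scored against every other word
    weights = softmax(scores, axis=-1)   # each row sums to 1: attention per word
    return weights @ V, weights

rng = np.random.default_rng(0)
tokens = ["I", "arrived", "at", "the", "bank", "after", "crossing", "the", "river"]
d_model, d_k = 16, 8
X = rng.normal(size=(len(tokens), d_model))                # toy, untrained embeddings
W_q, W_k, W_v = (rng.normal(size=(d_model, d_k)) for _ in range(3))
_, weights = self_attention(X, W_q, W_k, W_v)
# In a trained model, the row for "bank" would place high weight on "river".
print(list(zip(tokens, weights[tokens.index("bank")].round(2))))
```

With random weights the attention pattern is meaningless; the point is only that the score matrix links every word to every other word in a single step, with no notion of distance – which is how “bank” can attend directly to “river”.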
“We’re releasing it [Bard] initially with our lightweight model version of LaMDA,” said Pichai on February 6.
He added: “This much smaller model requires significantly less computing power, enabling us to scale to more users, allowing for more feedback. We’ll combine external feedback with our own internal testing to make sure Bard’s responses meet a high bar for quality, safety and groundedness in real-world information.
“We’re excited for this phase of testing to help us continue to learn and improve Bard’s quality and speed.”
The move suggests “AI wars” among Big Tech providers are heating up notably – with Microsoft also teasing a fresh announcement this week; The Stack will update this story when we have it – but also reflects nerves at Google about reputational risk in the wake of increasingly pointed attacks on ChatGPT’s creators at OpenAI.
Google AI chatbot adoption at scale raises bias rows
Those attacks followed viral posts showing the chatbot writing a paean to Joe Biden but declining to do the same for Donald Trump. The subsequent pile-on led OpenAI CEO Sam Altman to say on Twitter that “we know that ChatGPT has shortcomings around bias, and are working to improve it. But directing hate at individual OAI employees because of this is appalling. Hit me all you want, but attacking other people here doesn’t help the field advance, and the people doing it know that.”
“We are working to improve the default settings to be more neutral, and also to empower users to get our systems to behave in accordance with their individual preferences within broad bounds. this is harder than it sounds and will take us some time to get right,” he posted on February 1. (In its LaMDA paper, Google’s researchers noted that “examples of bias, offensiveness, and hate speech have been found both in training data drawn from social media, and consequently in the output of dialog models trained on such data.
“Dialog models can learn, and even amplify, biases in the training data. Echoing Gehman et al., we find fine-tuning effective to augment language models for safety,” the paper suggested.)
(The Samuel Gehman et al. paper referenced in fact noted that “pretrained LMs [language models] can degenerate into toxic text even from seemingly innocuous prompts. We empirically assess several controllable generation methods, and find that while data- or compute-intensive methods (e.g., adaptive pretraining on non-toxic data) are more effective at steering away from toxicity than simpler solutions (e.g., banning ‘bad’ words), no current method is failsafe against neural toxic degeneration…”)
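A toy sketch in Python shows why the “banning ‘bad’ words” baseline that Gehman et al. describe as a simpler solution is so weak – the block-list and candidate generations here are invented purely for illustration:

```python
# Toy illustration of a word block-list as a decoding-time filter.
# The banned list and candidate continuations are invented for this example.
BANNED = {"slur1", "slur2", "idiot"}

def passes_word_ban(text: str) -> bool:
    """Reject a generation only if it contains an exact banned token."""
    return not any(word.strip(".,!?").lower() in BANNED for word in text.split())

candidates = [
    "You are an idiot.",                        # caught: exact banned token
    "You are an utter id1ot.",                  # missed: trivial obfuscation
    "People like you never belong anywhere.",   # missed: toxic, no banned word
]
for c in candidates:
    print(passes_word_ban(c), "|", c)
```

The last two candidates sail straight through the filter – the gap the paper’s data- and compute-intensive methods try to close, and the reason it concludes no current method is failsafe.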
Google CEO Pichai meanwhile said that beyond its ChatGPT rival Bard and “our own products, we think it’s important to make it easy, safe and scalable for others to benefit from these advances by building on top of our best models. Next month, we’ll start onboarding individual developers, creators and enterprises so they can try our Generative Language API, initially powered by LaMDA with a range of models to follow.

“Over time, we intend to create a suite of tools and APIs that will make it easy for others to build more innovative applications with AI. Having the necessary compute power to build reliable and trustworthy AI systems is also crucial to startups, and we are excited to help scale these efforts through our Google Cloud partnerships with Cohere, C3.ai and Anthropic, which was just announced last week. Stay tuned for more developer details soon.”
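Pichai gave no technical details of the Generative Language API. Purely as a hypothetical sketch of what a hosted text-generation API of this shape typically looks like, the endpoint, model name and parameter names below are assumptions for illustration, not Google documentation:

```python
import os
import requests  # third-party HTTP client: pip install requests

# Hypothetical endpoint and payload shape -- NOT Google's documented API.
API_URL = "https://example.googleapis.com/v1/generateText"  # placeholder URL
payload = {
    "model": "lamda-light",  # invented name, echoing the "lightweight" LaMDA Pichai described
    "prompt": "Explain transformers in one sentence.",
    "temperature": 0.7,       # sampling knob common to generative-text APIs
    "maxOutputTokens": 128,
}
resp = requests.post(
    API_URL,
    json=payload,
    headers={"Authorization": f"Bearer {os.environ['API_KEY']}"},  # key from environment
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```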