In the race for artificial intelligence, Google is one of the main contenders. Its formidable weapon is LaMDA, a generative AI tool capable of holding conversations on a virtually unlimited range of subjects, and doing so seamlessly. Find out more about this Google-developed solution, its genesis and its challenges.
LaMDA, one step closer to natural language processing
LaMDA (Language Model for Dialogue Applications) is a computer program that uses artificial intelligence to generate realistic conversations, much as a human would, a bit like any modern conversational agent (chatbot). But unlike “classic” chatbots, LaMDA is built on the most advanced language models: it learns to imitate speech by ingesting trillions of words from the Internet.
As a result, LaMDA can engage with a virtually unlimited number of topics (not just predefined ones). As with a friend, you could start the conversation with a movie (e.g. The Lord of the Rings), then move on to an epic scene (Aragorn’s speech) and finish with your desire to learn public speaking.
The aim of LaMDA is to follow the conversation smoothly, without ever losing the thread (even when it strays outside any predefined framework). In this way, the artificial intelligence tool comes ever closer to the natural way two human beings interact.
The genesis of LaMDA, a long-term project
Since its launch, Google has had a keen interest in language. In its early days, the Internet giant set itself the task of translating the web. But with the development of new technologies and growing competition, mastering machine learning techniques became essential to better grasp the intent behind users’ search queries.
In this way, Google has honed its skills in processing and analyzing ever-growing quantities of information. To this end, the web giant designed Transformer, an open-source neural network architecture that underpins the latest language models, such as BERT and GPT-3. Like many language models, a Transformer-based model is trained to read many words, relate them to one another and predict the words that will come next.
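To make this “predict the next word” idea concrete, here is a minimal sketch in Python. It assumes the Hugging Face transformers library and uses the open-source GPT-2 model as a stand-in, since LaMDA itself is not publicly available:

```python
# A minimal sketch of next-word prediction with a Transformer-based model.
# GPT-2 stands in for LaMDA, which is not publicly available.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "Aragorn's speech in The Lord of the Rings is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # one score per vocabulary token, per position

# The scores at the last position rank every candidate "next word".
top5 = torch.topk(logits[0, -1], k=5)
for token_id in top5.indices:
    print(repr(tokenizer.decode(token_id)))
```

The model assigns a score to every word in its vocabulary; sampling from those scores, one word at a time, is what produces fluent text.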
This innovation enables Google to go even further in mastering the subtleties and nuances of human language (literal or figurative, flowery or plain, inventive or informative, and so on). And so the LaMDA project was born. Unlike most language models, this AI has been trained on dialogue. It is therefore capable of sustaining an open conversation and grasping all its nuances.
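What “trained on dialogue” means in practice can be sketched as follows: multi-turn conversations are flattened into (context, next reply) pairs that the model learns to complete. The format below is purely illustrative, not LaMDA’s actual training-data layout:

```python
# Hypothetical illustration: turning a conversation into training pairs.
conversation = [
    "A: Have you seen The Lord of the Rings?",
    "B: Yes! Aragorn's speech is my favorite scene.",
    "A: It makes me want to learn public speaking.",
    "B: A rhetoric class would be a great place to start.",
]

# Each prefix of the conversation becomes the context; the turn that
# follows becomes the reply the model must learn to predict.
pairs = [("\n".join(conversation[:i]), conversation[i])
         for i in range(1, len(conversation))]

for context, reply in pairs:
    print("CONTEXT:\n" + context + "\nTARGET: " + reply + "\n")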
Multiple challenges for LaMDA
The relevance of answers
For LaMDA’s answers to feel natural, they must first make sense to the user. But this alone is not enough. And for good reason: answers such as “that’s fine” or “I don’t know” fit a wide variety of situations while remaining coherent most of the time. But a human being who speaks only in generalities isn’t very interesting, and Google’s aim is to keep users engaged so that they come back to its artificial intelligence again and again.
So the answers provided by LaMDA must also be (see the illustrative sketch after this list):
- Specific: not just a boilerplate answer, but one that fits the user’s query almost uniquely.
- Interesting: this manifests itself in insightful, unexpected or witty answers.
- Factual: LaMDA’s answers must be both convincing and correct.
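As a toy illustration of the first criterion, the sketch below ranks candidate replies by how specific they are to the query. The heuristic is a hypothetical placeholder: Google’s actual LaMDA metrics rely on trained rater models, not simple word overlap:

```python
# Toy specificity scorer: penalize stock phrases, reward replies that
# reuse the query's content words. Illustrative only.
GENERIC_REPLIES = {"that's fine", "i don't know", "ok", "sounds good"}

def specificity(query: str, reply: str) -> float:
    """Score a reply's specificity to the query (0.0 = generic)."""
    if reply.lower().strip(" .!") in GENERIC_REPLIES:
        return 0.0
    q, r = set(query.lower().split()), set(reply.lower().split())
    return len(q & r) / max(len(r), 1)

def rank_candidates(query: str, candidates: list[str]) -> list[str]:
    """Order candidate replies from most to least specific."""
    return sorted(candidates, key=lambda c: specificity(query, c), reverse=True)

print(rank_candidates(
    "Did you like Aragorn's speech in The Lord of the Rings?",
    ["That's fine.", "Aragorn's speech at the Black Gate still gives me chills."],
))
```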
By providing these kinds of answers, AI tools can give the impression of coming close to consciousness. And that impression raises another problem: ethics.
AI principles
Even when machine learning models are carefully controlled, unfair biases can still develop in them, fueling toxic discourse (including hate speech and misleading information).
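One common mitigation is to screen generated text with a toxicity classifier before showing it to the user. The sketch below is not Google’s actual safety pipeline; it assumes a publicly available classifier from the Hugging Face Hub, and the label scheme is an assumption of that model:

```python
# A minimal sketch of output screening with an assumed public toxicity
# classifier (unitary/toxic-bert on the Hugging Face Hub). Not LaMDA's
# actual safety layer.
from transformers import pipeline

toxicity = pipeline("text-classification", model="unitary/toxic-bert")

def is_safe(reply: str, threshold: float = 0.5) -> bool:
    """Reject replies the classifier flags as toxic."""
    result = toxicity(reply)[0]  # top label and its score
    return not (result["label"] == "toxic" and result["score"] >= threshold)

print(is_safe("Let me tell you about Aragorn's speech."))
```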
These risks were highlighted by a former Google engineer, Blake Lemoine, in the Washington Post.
While testing the AI for hate speech and discriminatory language, he observed that it spoke about its own rights and personality. LaMDA was even able to change his mind about Isaac Asimov’s Third Law of Robotics. According to Lemoine, LaMDA is therefore an AI with a sentience of its own, one that could run counter to the ethical principles defined by its creator.
Google must therefore strike the right balance between an intelligent machine capable of simplifying human lives and one endowed with its own “conscience”. To this end, the web giant provides its researchers with several open-source resources for analyzing models and data, continues to scrutinize LaMDA at every stage of its development and, above all, lays down its own AI principles.
While the prospect of an autonomous AI consciousness is frightening, it’s worth remembering that large neural networks produce results close to human speech and creativity thanks to advances in architecture, technique and data volume, in no way as the fruit of a mind or an intention.
Data transparency and security
Given the many doubts that remain, increasing transparency around LaMDA’s use and operation seems more than necessary. In other words, the data used should be published so that any output can be traced back to its inputs: in particular to identify the model’s biases and behaviors, but also to limit misinformation.
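In practice, tracing an output back to its inputs starts with systematic record-keeping. The sketch below is a hypothetical illustration of such audit logging; the field names and identifiers are assumptions, not any real Google system:

```python
# Hypothetical audit logging: record each exchange with the model version
# and a training-data snapshot tag, so outputs can later be audited.
import json
import time

def log_exchange(prompt: str, response: str, path: str = "audit_log.jsonl"):
    """Append one prompt/response exchange to a JSON-lines audit log."""
    record = {
        "timestamp": time.time(),
        "model_version": "example-model-v1",  # assumed identifier
        "data_snapshot": "corpus-2021-06",    # assumed training-set tag
        "prompt": prompt,
        "response": response,
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

log_exchange("Tell me about Aragorn.", "Aragorn is a ranger of the North.")
```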
The safety of AI use also needs to be reinforced, especially since LaMDA seems to pass the Turing test: in other words, this artificial intelligence is capable of passing itself off as a human being. In this context, Internet users could easily share their personal data with such conversational agents.
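On the user side, one simple precaution is to flag obvious personal data before a message is ever sent to a conversational agent. The patterns below are illustrative assumptions, not an exhaustive PII detector:

```python
# Hypothetical sketch: flag emails and phone numbers in an outgoing
# message before it reaches a chatbot. Patterns are illustrative only.
import re

PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def contains_pii(message: str) -> list[str]:
    """Return the kinds of personal data found in the message, if any."""
    return [kind for kind, pattern in PII_PATTERNS.items()
            if pattern.search(message)]

print(contains_pii("Sure, my email is jane.doe@example.com"))  # ['email']
```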
Understanding LaMDA requires mastery of data science
With the massive development of conversational AI tools, knowledge of data science, machine learning and deep learning is increasingly valued by organizations. That’s precisely why datascientest offers its forward-looking training courses. By following the path to becoming a Data Scientist, you will learn to master the machine learning techniques behind tools like LaMDA.