Can artificial intelligence really help us talk to the animals?

Lucy Chen
Jul 31, 2022


To better understand my cat and his behavior, I have tried different ways of interacting with him, taught him a few tricks, and experimented with apps such as “Meow Talk,” a cat-translator app built on machine learning. But how much can AI really help us talk to and understand our animals?

A dolphin handler makes the signal for “together” with her hands, followed by “create”. The two trained dolphins disappear underwater, exchange sounds and then emerge, flip onto their backs and lift their tails. They have devised a new trick of their own and performed it in tandem. “It doesn’t prove that there’s language,” says Aza Raskin. “But it certainly makes a lot of sense that, if they had access to a rich, symbolic way of communicating, that would make this task much easier.”

Raskin is the co-founder and president of Earth Species Project (ESP), a California non-profit group with a bold ambition: to decode non-human communication using a form of artificial intelligence (AI) called machine learning, and make all the knowhow publicly available, thereby deepening our connection with other living species and helping to protect them. A 1970 album of whale song galvanised the movement that led to commercial whaling being banned. What could a Google Translate for the animal kingdom spawn?

The organisation, founded in 2017 with the help of major donors such as LinkedIn co-founder Reid Hoffman, published its first scientific paper last December. The goal is to unlock communication within our lifetimes. “The end we are working towards is, can we decode animal communication, discover non-human language,” says Raskin. “Along the way and equally important is that we are developing technology that supports biologists and conservation now.”

Understanding animal vocalisations has long been the subject of human fascination and study. Various primates give alarm calls that differ according to predator; dolphins address one another with signature whistles; and some songbirds can take elements of their calls and rearrange them to communicate different messages. But most experts stop short of calling these systems a language, as no animal communication meets all the criteria linguists use to define one.

Until recently, decoding has mostly relied on painstaking observation. But interest has burgeoned in applying machine learning to deal with the huge amounts of data that can now be collected by modern animal-borne sensors. “People are starting to use it,” says Elodie Briefer, an associate professor at the University of Copenhagen who studies vocal communication in mammals and birds. “But we don’t really understand yet how much we can do.”

Briefer co-developed an algorithm that analyses pig grunts to tell whether the animal is experiencing a positive or negative emotion. Another, called DeepSqueak, judges whether rodents are in a stressed state based on their ultrasonic calls. A further initiative — Project CETI (which stands for the Cetacean Translation Initiative) — plans to use machine learning to translate the communication of sperm whales.

Yet ESP says its approach is different, because it is not focused on decoding the communication of one species, but all of them. While Raskin acknowledges there will be a higher likelihood of rich, symbolic communication among social animals — for example primates, whales and dolphins — the goal is to develop tools that could be applied to the entire animal kingdom.

This process starts with the development of an algorithm to represent words as points in a many-dimensional geometric space. In this representation, the distance and direction between points (words) describe how they relate to each other in meaning (their semantic relationship). For example, “king” has a relationship to “man” with the same distance and direction that “queen” has to “woman”.
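
To make the “distance and direction” idea concrete, here is a minimal sketch using off-the-shelf tools rather than anything from ESP: the gensim library and a small pretrained GloVe embedding, both my own choices for illustration. The semantic relationship shows up as simple vector arithmetic.

```python
# A minimal sketch of the "distance and direction" idea using off-the-shelf
# word vectors (gensim + pretrained GloVe), not ESP's own models.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")  # small pretrained English embedding

# The offset man -> king should roughly match woman -> queen,
# i.e. king - man + woman ~= queen.
result = vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)  # typically [('queen', ...)] for this embedding
```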

It was later noticed that these “shapes” are similar for different languages. And then, in 2017, two groups of researchers working independently found a technique that made it possible to achieve translation by aligning the shapes. To get from English to Urdu, align their shapes and find the point in Urdu closest to the word’s point in English. “You can translate most words decently well,” says Raskin.
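
The alignment step can be sketched in the same spirit. The toy example below is a stand-in, not the 2017 method itself: it builds two artificial “languages” whose embedding spaces differ only by a rotation, recovers that rotation from a handful of seed word pairs with an orthogonal Procrustes fit, and then “translates” by nearest neighbour. The data, seed pairs and dimensions are all invented.

```python
# Toy alignment of two embedding spaces, standing in for the cross-lingual
# alignment techniques the article alludes to.
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 50))                 # "English" vectors, one row per word
R_true = np.linalg.qr(rng.normal(size=(50, 50)))[0]
Y = X @ R_true                                  # "Urdu" vectors: same shape, rotated

# Solve for the rotation that best maps one space onto the other,
# here using 300 known word pairs as anchors.
R, _ = orthogonal_procrustes(X[:300], Y[:300])

# "Translate" word 500: map its English vector, then take the nearest Urdu vector.
query = X[500] @ R
similarity = Y @ query / (np.linalg.norm(Y, axis=1) * np.linalg.norm(query))
print(int(np.argmax(similarity)))               # 500: both spaces agree on this word
```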

ESP’s aspiration is to create these kinds of representations of animal communication — working on both individual species and many species at once — and then explore questions such as whether there is overlap with the universal human shape. Animals don’t only communicate vocally. Bees, for example, let others know of a flower’s location via a “waggle dance”. There will be a need to translate across different modes of communication too.

ESP recently published a paper (and shared its code) on the so-called “cocktail party problem” in animal communication, in which it is difficult to discern which individual in a group of animals of the same species is vocalising in a noisy social environment.

“To our knowledge, no one has done this end-to-end detangling [of animal sound] before,” says Raskin. The AI-based model developed by ESP, which was tried on dolphin signature whistles, macaque coo calls and bat vocalisations, worked best when the calls came from individuals that the model had been trained on; but with larger datasets it was able to disentangle mixtures of calls from animals not in the training cohort.
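
ESP’s published model is not reproduced here, but a common way to approach this kind of problem is mask-based source separation: predict one spectrogram mask per caller, and train with a permutation-invariant loss because the network cannot know in advance which output slot belongs to which animal. The PyTorch sketch below shows that general shape, with toy tensor sizes standing in for real recordings.

```python
# A hedged sketch of mask-based source separation with a permutation-invariant
# loss; toy shapes only, not ESP's published architecture.
import torch
import torch.nn as nn
from itertools import permutations

class MaskSeparator(nn.Module):
    def __init__(self, freq_bins=257, n_sources=2, hidden=128):
        super().__init__()
        self.rnn = nn.LSTM(freq_bins, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, freq_bins * n_sources)
        self.n_sources, self.freq_bins = n_sources, freq_bins

    def forward(self, mix_spec):                   # (batch, time, freq) mixture spectrogram
        h, _ = self.rnn(mix_spec)
        masks = torch.sigmoid(self.out(h))         # one mask per source, per frame
        masks = masks.view(*mix_spec.shape[:2], self.n_sources, self.freq_bins)
        return masks * mix_spec.unsqueeze(2)       # estimated spectrogram per caller

def pit_loss(estimates, targets):
    """Permutation-invariant MSE: score every source ordering, keep the best."""
    losses = [((estimates[:, :, list(p)] - targets) ** 2).mean()
              for p in permutations(range(targets.shape[2]))]
    return torch.stack(losses).min()

# Toy data: 4 mixtures, 100 frames, 257 frequency bins, 2 overlapping callers.
mix = torch.rand(4, 100, 257)
true_sources = torch.rand(4, 100, 2, 257)
model = MaskSeparator()
loss = pit_loss(model(mix), true_sources)
loss.backward()                                    # in training this would drive an optimiser
```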

Another project involves using AI to generate novel animal calls, with humpback whales as a test species. The novel calls — made by splitting vocalisations into micro-phonemes (distinct units of sound lasting a hundredth of a second) and using a language model to “speak” something whale-like — can then be played back to the animals to see how they respond.
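
One way to picture that pipeline, without claiming it matches ESP’s implementation, is the toy sketch below: chop a recording into roughly 10 ms frames, quantise the frames into discrete units with k-means, fit a simple sequence model over the units, and sample a new sequence. All of the data and parameters here are invented, and a real system would use richer acoustic features and a neural language model rather than a Markov chain.

```python
# Toy "micro-phoneme" pipeline: discretise audio frames, model the unit
# sequence, sample something new. Invented data throughout.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
sample_rate = 16_000
audio = rng.normal(size=sample_rate * 60)          # stand-in for a minute of whale recording
frame = sample_rate // 100                         # ~10 ms frames
frames = audio[: len(audio) // frame * frame].reshape(-1, frame)

units = KMeans(n_clusters=64, n_init=10, random_state=0).fit_predict(frames)

# First-order Markov model over the unit sequence, with add-one smoothing.
counts = np.ones((64, 64))
for a, b in zip(units[:-1], units[1:]):
    counts[a, b] += 1
transitions = counts / counts.sum(axis=1, keepdims=True)

# Sample a novel sequence of units; mapping them back to audio (via the
# cluster centroids) would give something to play back to the animals.
state, generated = units[0], [units[0]]
for _ in range(200):
    state = rng.choice(64, p=transitions[state])
    generated.append(state)
```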

A further project aims to develop an algorithm that ascertains how many call types a species has at its command by applying self-supervised machine learning, which does not require any labelling of data by human experts to learn patterns.
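
As a rough illustration of how a repertoire size might be estimated once calls have been embedded, the sketch below clusters a set of call embeddings (random stand-ins for what a self-supervised encoder would produce) and keeps the number of clusters with the best silhouette score. This is only one of many possible criteria, and not necessarily the one ESP uses.

```python
# Estimate the number of call types by clustering call embeddings; the
# embeddings here are synthetic stand-ins with 5 hidden call types.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
centres = rng.normal(scale=5.0, size=(5, 32))
call_embeddings = np.vstack([c + rng.normal(size=(100, 32)) for c in centres])

scores = {}
for k in range(2, 11):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(call_embeddings)
    scores[k] = silhouette_score(call_embeddings, labels)

estimated_call_types = max(scores, key=scores.get)
print(estimated_call_types)   # typically 5 for this synthetic setup
```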

Another project seeks to automatically understand the functional meanings of vocalisations. It is being pursued with the laboratory of Ari Friedlaender, a professor of ocean sciences at the University of California, Santa Cruz. The lab studies how wild marine mammals, which are difficult to observe directly, behave underwater and runs one of the world’s largest tagging programmes. Small electronic “biologging” devices attached to the animals capture their location, type of motion and even what they see (the devices can incorporate video cameras). The lab also has data from strategically placed sound recorders in the ocean.

ESP aims to first apply self-supervised machine learning to the tag data to automatically gauge what an animal is doing (for example whether it is feeding, resting, travelling or socialising) and then add the audio data to see whether functional meaning can be given to calls tied to that behaviour. (Playback experiments could then be used to validate any findings, along with calls that have been decoded previously.) This technique will be applied to humpback whale data initially — the lab has tagged several animals in the same group so it is possible to see how signals are given and received. Friedlaender says he was “hitting the ceiling” in terms of what currently available tools could tease out of the data. “Our hope is that the work ESP can do will provide new insights,” he says.
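
The joining step can be pictured with a small, entirely invented example: infer a behavioural state for each window of tag data (plain clustering stands in for the self-supervised step here), then attach each detected call to whatever the animal was doing at that moment, so that calls can be compared across behaviours. The column names and numbers below are hypothetical.

```python
# Toy link between biologging windows and detected calls; invented columns and data.
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# One row per 10-second window of tag data (depth, movement summaries, ...).
tag = pd.DataFrame({
    "time_s": np.arange(0.0, 6000.0, 10.0),
    "depth_m": rng.uniform(0, 200, 600),
    "accel_var": rng.uniform(0, 1, 600),
})

# Unsupervised stand-in for the behaviour-inference step: group windows into states.
tag["state"] = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(
    tag[["depth_m", "accel_var"]]
)
# A biologist would then name the states, e.g. resting, feeding, travelling, socialising.

# Calls detected in the audio recordings, with their times.
calls = pd.DataFrame({
    "time_s": np.sort(rng.uniform(0, 6000, 50)),
    "call_id": np.arange(50),
})

# Attach each call to the most recent tag window, i.e. its behavioural context.
calls_with_behaviour = pd.merge_asof(calls, tag, on="time_s", direction="backward")
print(calls_with_behaviour.groupby("state")["call_id"].count())
```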

The problem, he explains, is that while many animals can have sophisticated, complex societies, they have a much smaller repertoire of sounds than humans. The result is that the exact same sound can be used to mean different things in different contexts, and meaning can only be established by studying that context: who the calling individual is, how they are related to others, where they fall in the hierarchy, and who they have interacted with.

There is also doubt about the underlying premise: that the “shape” of animal communication will overlap in a meaningful way with human communication. Applying computer-based analyses to human language, with which we are so intimately familiar, is one thing, says Robert Seyfarth, a professor emeritus of psychology at the University of Pennsylvania who has spent decades studying primate vocal communication. But it can be “quite different” doing it to other species. “It is an exciting idea, but it is a big stretch,” says Kevin Coffey, a neuroscientist at the University of Washington who co-created the DeepSqueak algorithm.

Raskin acknowledges that AI alone may not be enough to unlock communication with other species. But he refers to research that has shown many species communicate in ways “more complex than humans have ever imagined”. The stumbling blocks have been our ability to gather sufficient data and analyse it at scale, and our own limited perception; machine learning, in his view, supplies the tools that let us take off the human glasses and understand entire communication systems.

Originally published at https://www.theguardian.com on July 31, 2022.
