When we were recently discussing an Artificial Intelligence algorithm that won a fine arts contest at a state fair, I brought up a related topic. A number of scientists have been working on systems that use AI to “decode” the languages of a wide variety of animals, perhaps one day offering the possibility of understanding whatever they have that passes for speech. As it turns out, that work has been quietly progressing for a while now, and some teams are reporting progress. At the Daily Beast, nature filmmaker Tom Mustill provides a detailed report on some of the researchers he’s worked with and how they are using breakthroughs in Artificial Intelligence to build three-dimensional structures of the songs of whales and dolphins by analyzing thousands of hours of recordings of their speech.

The old method of building translators and speech recognition software relied on “natural language processing,” which is time-consuming and laborious. Now, with AI tools such as artificial neural networks and pattern recognition algorithms, researchers can feed massive collections of speech – including the speech of dolphins and whales – into the system and let it build three-dimensional structures of the sounds, finding places where they either match or are dissimilar. In this fashion, patterns similar to those found in human speech across multiple languages can be identified in the songs of the whales.
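The “match or dissimilar” step boils down to representing each sound as a vector of features and measuring how close those vectors are. The sketch below is only an illustration of that idea: the feature vectors are invented, not derived from any real whale recordings, and real systems would extract features from spectrograms or learned embeddings.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity: 1.0 means the feature vectors point the
    same way (a close match), values near 0 mean dissimilar."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical feature vectors for three recorded calls.
call_a = np.array([0.9, 0.1, 0.4])
call_b = np.array([0.8, 0.2, 0.5])  # a similar-sounding call
call_c = np.array([0.1, 0.9, 0.0])  # a very different call

print(cosine(call_a, call_b))  # close to 1.0: the calls "match"
print(cosine(call_a, call_c))  # much lower: dissimilar
```

Run over thousands of hours of recordings, pairwise similarities like these are what let clustering algorithms group calls into the kind of three-dimensional structure the article describes.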

A young researcher named Mikel Artetxe at the University of the Basque Country discovered that he could ask an AI to turn the word galaxies of different languages around, superimposing one onto another. And eventually, as if manipulating an absurdly complex game of Tetris, their shapes would match, the constellations of words would align, and if you looked at the same place in the German word galaxy where “king” sits in the English one, you would find “König.”
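Artetxe’s actual method learns this alignment with no bilingual data at all. To show just the geometric idea of rotating one “word galaxy” onto another, the sketch below uses the simpler supervised step (orthogonal Procrustes) on toy vectors; the vocabularies and embeddings here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "word galaxies": tiny embedding spaces for two languages.
en_vocab = ["king", "queen", "water", "fish"]
X = rng.normal(size=(4, 3))  # pretend English embeddings

# Simulate a German space with the same geometry, but rotated, so
# corresponding words occupy corresponding places in the galaxy.
de_vocab = ["König", "Königin", "Wasser", "Fisch"]
true_rotation = np.linalg.qr(rng.normal(size=(3, 3)))[0]
Y = X @ true_rotation

# Orthogonal Procrustes: find the rotation W minimizing ||XW - Y||.
# (Unsupervised methods discover such a mapping without the known
# word pairs used here; this sketch uses them only to check it.)
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# After alignment, English "king" lands nearest to German "König".
aligned = X @ W
idx = int(np.argmin(np.linalg.norm(aligned[0] - Y, axis=1)))
print(de_vocab[idx])  # → König
```

The superimposing-Tetris image in the quote is exactly this: once the rotation is found, looking up a word’s position in one galaxy reads off its counterpart in the other.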

No examples of translation or other knowledge about either language were required for this to work. This was automatic translation with no dictionary or human input. As Britt and Aza put it, “Imagine being given two entirely unknown languages and that just by analyzing each for long enough you could discover how to translate between them.” It was a transformation of natural language processing.

And then came other new tools, too. Unsupervised learning techniques that worked on audio, in recordings of raw human speech, automatically identified what sounds were meaningful units—words. Other tools could look at word units and infer from their relationships how they were constructed into phrases and sentences—syntax.
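The real tools work on raw audio. As a loose character-level analogy, the toy sketch below (the text stream and every threshold in it are invented) shows the flavor of signal unsupervised segmentation exploits: recurring chunks in an unsegmented stream stand out by frequency, and frequent, long chunks make good candidate “words.”

```python
from collections import Counter

# An unsegmented "stream" with no word boundaries, like raw speech.
stream = "thedogsawthecatthecatsawthedog" * 3

# Count every substring of length 2-6 as a candidate unit.
counts = Counter(
    stream[i:i + n]
    for n in range(2, 7)
    for i in range(len(stream) - n + 1)
)

# Score candidates by frequency * length, then greedily segment new
# input using the best-scoring units.
candidates = sorted(counts, key=lambda s: counts[s] * len(s), reverse=True)

def segment(s, units):
    out = []
    while s:
        for u in units:
            if s.startswith(u):
                out.append(u)
                s = s[len(u):]
                break
        else:  # no known unit matched; emit a single character
            out.append(s[0])
            s = s[1:]
    return out

print(segment("thecatsawthedog", candidates[:10]))
```

This is only the first half of what the article describes; inferring syntax from how the discovered units pattern together is a further step.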

I freely admit that most of this goes right over my head, but it’s certainly fascinating. By using all we’ve learned from advancements in translating human languages, these researchers were able to feed in a totally unknown language (that of the whales) and apply the same techniques. The AI was able to recognize which sounds fit the definition of words or phrases and map them into a “constellation” of all of the sounds that were recorded.

But as the researchers themselves seem to admit, this exciting advancement still misses one huge part of the larger puzzle. The AI can produce this “map” of all of the words and phrases that the whales are “speaking,” and identify the underlying structure. But that doesn’t get us any closer to knowing what the words mean. Without that knowledge, we should be able to produce structurally correct sentences in the language that whales would recognize, but those sentences would still almost certainly be gibberish.

The problem is that almost all of the whale songs they have recorded were taken when the whales were not in sight. If you don’t know what the whale is doing at the time, how are you to determine what it might be “talking” about? You could learn a foreign human language by, say, walking while saying “walking” and waiting for the other person to supply their word for it. Not so with whales that cannot be observed.

The other question I have, strictly from the perspective of a layman, is how much all of these animals really have to talk about. What is the evolutionary benefit of making noise outside of times when you are trying to attract a mate and reproduce? (And many animals have mating rituals that are conducted in silence.) If you’re a predator, making noise could alert your prey to your presence. Conversely, prey animals could give away their location to predators. So there has to be some sort of more complex “communication” going on there, right?

But going back to the example of the birds I mentioned in the previous post, I’ve been watching and listening to crows where I live for years. They have many different patterns of “caws” that they express, varying in duration and the number of calls that are strung together. In the vast majority of cases I observe, they sound like some sort of meaningful conversation may be taking place, but they’re just sitting there on the phone lines and not doing anything. If they gave the same call every time they were about to swoop down on some piece of food they’d spotted, that might at least tell us something. But if they’re chatting about something that involves no physical activity, how are we supposed to ever figure this out?

In the case of the whales, assuming we do ever get a functional translator, we can probably make at least one guess. They’re probably talking about what a bunch of jerks the humans are for nearly harpooning them to the brink of extinction. And I suppose they’d have a point.

Source: