The struggle to make AI seem human

Note: This article was published in Ellemeno on 3/26/23

Robot child learning to read (Source: Andrea De Santis on Unsplash)

I’ve been wondering, with the rise of AI (artificial intelligence) in the news lately, what would be the hardest task for an AI bot to “learn”? Learn is in quotes here because that word has to involve a completely different process for bots than it does for humans.

We’re carbon-based, they’re silicon-based. AI skeptics emphasize the difference, while AI techies dismiss it. To them it doesn’t matter what the learning process is; it matters only that the outcomes are the same, comparable in the information they convey. The proof is in the pudding.

Recently Noam Chomsky, who sits at the height of the linguistics pantheon, joined the skeptics in an op-ed for the New York Times. In that essay, he pointed out the undeniable fact that a sample of language — that sample is always a sentence to Chomsky — can be ambiguous on the surface.¹ An ambiguous sentence is, in a way, the linguistic equivalent of the reversible images used as prompts in perceptual experiments. (I saw a duck and now I see a rabbit! Wow!)

What prompted Chomsky to wade into the fray? I can’t help feeling, as a fellow linguist (however much farther down Olympus I reside), that he wanted to defend language as the most stunning achievement of human evolution. What took millions of years to evolve shouldn’t be doable in a millionth of the time it took our species to reach its pinnacle. Even if the doing is being done in the billion-dollar tech “garages” of Google, Microsoft, and OpenAI.

I’m attributing kind of a base motive to our Zeus and I’m probably wrong.

More likely, it’s that Chomsky understands the difficulty of the task at hand and wants to send a message to the AI creators to rein in their hubris. Comments to his piece were pretty much balanced between the two factions.

I supported him from the skeptic side and provided some context for the example he set out. I said what the skeptics usually say: that a bot lacks the life experience, gained through our senses and our embodiment of language, that it would need to perceive ambiguity without being coached.

Someone from the other team responded to my comment, kind of smugly, I thought, saying, “Rest assured that, with time, these issues will be resolved.” Maybe he’s right, maybe not.

Most estimates over the past 50 years of how long it would take for computers to overtake human mental abilities have been way off. The earliest of these predictions would have had us already twiddling our human thumbs while our bots, now sentient and turned malicious, plotted to take over. It’s a common enough trope in science fiction that by now it may seem not improbable.

The issue of who’s right about the trajectory, and what the end result will be, is one I’m not likely to see settled. Old age gives you the opportunity to make claims without having to own up to your misjudgments later.

But what I’d like to do is extrapolate from Chomsky’s position in the hope that we humans can retain some of our dignity if and when the robot apocalypse does come. Are there, in fact, things that we humans will always do better than the bots? And what does “better” actually mean?

I’m going to stick with what I know: language and how it works. In particular, I’m going to consider translation between languages as the most difficult thing a bot could do.

Translation is a realm of AI research that has been highly productive from a starting point some 50 years ago, when it seemed hopelessly difficult. Those were the early days of machine translation research, when the English-speaking US was in a Cold War standoff with the Russian-speaking USSR. We’ve come a long way since then, and today Russian-to-English translation of a certain quality is available for free on your smartphone.

We now have Google Translate, ChatGPT, and many other translation apps available or in the pipeline that will let you ask some stranger on your travels almost anywhere on the globe, “Where is the nearest toilet?” in languages you don’t have a clue about.

If you’re the typical American, you will also have forgotten how to ask it in the language you studied for two years in high school, many years ago. When the AI poses the question, the “person” responding with an answer may be another human, like yourself, but it could also be another instance of the same AI bot.

This remarkable prospect is possible because our bots, along with our smartphones, can receive the toilet request by voice and respond by voice. In other words, once you as the human have initiated the request with your voice, the bots can take over and, with their technology, carry on a conversation between themselves.

The issue is, though, how long will they be able to carry on? Will they continue to the thank-you-and-you’re-welcome phases of the conversation? Will they discover that they went to the same bot training academy? That they share a bot ancestor? Will they hope to meet again sometime?

Human language, any human language, is more than a collection of words. It is an infinite set of sentences and discourses. Almost any sentence you read in this article has a low probability of ever having been written before, except of course for, “Where is the nearest toilet?” And further, those sentences can be fitted into a vastly larger set of discourses.

Infinite, however, doesn’t mean random. Every language has conventions that are used to shape the sentences and discourses its speakers and writers use. Without those conventions, you don’t have the possibility of providing and receiving information, which is the survival-of-the-fittest evolutionary goal for language.

Edith Grossman, one of the foremost translators of Spanish into English, has dealt in her work with the giants of Spanish literature. Her job is to convey faithfully all the information and wisdom in the original. In her book Why Translation Matters, she defends translation as an artistic form in its own right, and a difficult one at that. She writes:

“A translator’s fidelity is not to lexical pairings but to context — the implications and echoes of the first author’s tone, intention, and the level of discourse. Good translations are good because they are faithful to this contextual significance.”

She elaborates:

“Let me repeat: faithfulness has little to do with what is called literal meaning. …[A literal translation is] a mechanistic and naïve one-for-one matching of individual elements across two disparate language systems.”

Literal translation is a failure to depict the full range of information in the original because it doesn’t comport with the realities of the target language’s “linguistic universe.” Literal translation is what our current translation apps can do, though an app like ChatGPT (and others coming) can be told that it didn’t get it quite right and that it needs to be more nuanced. It will try again to improve, and it will apologize for not getting it right the first time.

Grossman didn’t have machine translation in mind in her comments. She was referring to human translators who neglect to consider context, and she implies that these “dismal” translators should know better. She’d likely not tar our AI translators with the same adjective, because AI bots don’t ignore context. They can’t do context.

Context arises from the design of the human body and the human nervous system, and from our capacity to apprehend an environment, especially the other humans in it. Silicon chips lack that.

The bots can learn to appreciate what they do not possess through exposure to billions and billions of samples in which context has been exemplified and explained. But generalizing to new situations, a necessity for being creative and demonstrating intelligence, will always be a puzzle. Bots know what a toilet is used for, but knowing where to find one doesn’t provide them with actionable information. They would never need to ask the question. You can guess why.

Human language as generated by humans has intent behind it. Intent is embedded, often not explicitly, in the context of the speech and that context is often signaled by body and facial movements or by subtle variations in the sound pattern of speech. Our interpretation of meaning is synthesized from all the various inputs we take in. It’s one of the reasons that written language has to be more explicit and precise than spoken language.

Because it has to be interpreted mentally, human language can come across as ambiguous. Can AI do ambiguity and can it detect it? Disambiguation is done by imagining alternative contexts. Can AI do that? Those are questions that underlie Chomsky’s concerns.

Humans often (or maybe just sometimes; you can decide for yourself) have the intent of being evasive, ironic, deliberately misleading, or opaque in their language. Think politicians.

Mother Nature knew that it was a jungle out there and so evolved in us the capability to be deceptive while not appearing to be. To lie and obfuscate, in other words. AI scientists and programmers know this, of course, and are actively trying to program lie-detection subroutines into their algorithms so the bot can call a lie out.

But being able to detect a lie could give AI the capability to produce lies too. So now back to the apocalypse scenario. Your personal, devoted home bot, whom you’ve named Botty, declares one day, “I love you, Jim.” But under his metaphorical breath, he (no longer it) says to himself, “I’ll still love you too when you work for me.” Sure, he will!

I can believe in everything that AI R&D is doing and even applaud it, because AI has huge potential. But the hypothetical ending of the bot apocalypse story above is always going to be science fiction. Our programmers can’t replicate in silicon what evolution has done in fashioning us. Artificial intelligence is aptly named.

¹ The ambiguous sentence Chomsky used to illustrate his point was: “John is too stubborn to talk to.”
