Melanie Mitchell wrote her new book, Artificial Intelligence: A Guide for Thinking Humans, because she was confused about how much progress really is being made in A.I. She wanted “to understand the true state of affairs,” she writes.
It’s a relief to learn of her ambivalence because she is an artificial intelligence researcher herself. She’s a professor of computer science at Portland State University and co-chair of the science board at the Santa Fe Institute, a renowned multidisciplinary research center. If Mitchell could be perplexed about where A.I. stands, forgive the rest of us for being mystified or just plain wrong.
As Mitchell notes, a lot of triumphal A.I. narratives are floating around. In these accounts, recent breakthroughs in computer vision, speech recognition, game playing, and other aspects of machine learning are indications that artificial intelligence might surpass human competence in a wide range of tasks in the coming decades. Some people find that prospect marvelous; others worry that “superhuman” computers might decide they don’t need us around and have the power to do something about it.
But as Mitchell also demonstrates, even today’s most capable A.I. systems have crucial limitations. They are good only at narrowly defined tasks and utterly clueless about the world beyond. They find correlations in data without regard for what it means, so their predictions can be dangerously unreliable. They have no common sense.
“Either a huge amount of progress has been made, or almost none at all,” Mitchell writes. “Either we are within spitting distance of ‘true’ A.I., or it is centuries away.”
What makes this book memorable and instructive is how Mitchell resolves the question. Her patient explanations of today’s A.I. techniques give the impression that real machine intelligence remains very far off. Not only do computers need better brains, she suggests, but they probably also need better bodies.
Mitchell offers a remarkably clear and readable primer on artificial neural networks — the technology at the heart of recent advances in image recognition, language translation, and autonomous driving. Neural networks were used in some of the first attempts to build A.I. in the 1950s and ’60s. They fell out of favor because they looked like a dead end, an approach with limited value. But that changed about a decade ago as progress in computational power made it feasible to train neural networks with a data-intensive method called “deep learning.”
To follow Mitchell’s description of neural networks, you don’t need to know fancy math — but she shows that math is mainly what they are, which explains both their prowess and their flaws. One kind of neural network that is especially useful for identifying the content of images depends on a mathematical calculation called a “convolution.” A version used to deal with text relies on a computer’s ability to represent a statistical aspect of language — how frequently words tend to appear together in sentences — in complex “vectors” that cut across hundreds of dimensions.
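To make "convolution" less abstract, here is a minimal sketch in plain Python. It is not taken from Mitchell's book. The function name `convolve2d`, the tiny image, and the edge-detecting kernel are all invented for illustration. Note also that what image networks call a convolution is usually computed without flipping the kernel (strictly speaking, a cross-correlation), and the sketch follows that convention.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` and return the grid of weighted sums.

    Follows the neural-network convention: the kernel is not flipped,
    and only positions where the kernel fits entirely are computed.
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            # Multiply each kernel weight by the pixel beneath it and add up.
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 3x4 "image": dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical kernel that responds where brightness rises from left to right.
edge_kernel = [[-1, 1]]

response = convolve2d(image, edge_kernel)
for row in response:
    print(row)
```

Each output row is nonzero only at the column where dark meets bright. The arithmetic is nothing more than multiplying and adding, which is the point: the network is doing math on pixel values, not seeing an edge the way a person does. In a real network the kernel weights are learned from data rather than written by hand.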
It’s impressive that people have quantified so many aspects of the world for computers to churn away on. It’s also remarkable that these methods have such diverse applications, such as detecting tumors, driving cars, and filtering spam. But clever new uses for great calculators don’t necessarily imbue them with intelligence comparable to ours. As Mitchell points out, getting convolutional neural networks to work well “requires a lot of human ingenuity.”
Appreciating that machine-learning systems are essentially statistics workhorses makes it obvious why they are so vulnerable to inaccuracies, holes, and other shortcomings in the data they’re fed. It’s no wonder a neural network fails to register a black person’s face if it was trained on images that mainly displayed white people. It shouldn’t be a surprise when a self-driving car fails to recognize a stop sign that has some stickers on it but is still obviously (to humans) a stop sign.
Fortunately, these kinds of problems have been prompting many fruitful discussions of the social ramifications of facial recognition technology and automated decision-making. The 2018 book Artificial Unintelligence: How Computers Misunderstand the World by Meredith Broussard, an associate professor of data journalism at New York University, convincingly argues that, regardless of whether you call it “A.I.,” computers should still be considered mere tools for inquisitive humans who ought to always remain in the loop.
But while Mitchell is hardly the first to point out the weaknesses of machine-learning systems, she adds a layer to the critique by explaining that even when they are at their best, computers might not be as good as you think. In the past few years, researchers have developed neural networks that, they claim, meet or exceed human performance when it comes to identifying objects in photos or videos. That claim has been heralded in the news as another example of the resistance-is-futile inevitability of machine superiority.
Mitchell points out that the benchmark in question was a test based on a vast picture database called ImageNet. In 2017, the best computer system categorized pictures from ImageNet with 98% “top-5” accuracy, purportedly besting the human rate of 95%.
What is that hedge of “top-5”? It means the correct classification for an object was one of the machine’s top five guesses. As Mitchell writes: “If, given an image of a basketball, the machine outputs ‘croquet ball,’ ‘bikini,’ ‘warthog,’ ‘basketball,’ and ‘moving van,’ in that order, it is considered correct.” It’s understandable why the top-5 metric exists: If a picture shows more than one object, it’s useful to know whether the machine is detecting any of them. But it weakens the claim of extreme skill. Mitchell says the best top-1 accuracy — when the machine put the proper category at the top of its list — was only 82% in 2017.
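The gap between the two metrics is easy to see in a few lines of code. The sketch below is an illustration, not the scoring code used for ImageNet. The function names are invented, and the second image (with its guesses) is made-up data added so that the two scores differ. The first list of guesses is Mitchell's basketball example.

```python
def top_k_correct(ranked_guesses, true_label, k):
    """Return True if the true label is among the first k ranked guesses."""
    return true_label in ranked_guesses[:k]


def top_k_accuracy(predictions, labels, k):
    """Fraction of images whose true label appears in the top k guesses."""
    hits = sum(
        top_k_correct(guesses, label, k)
        for guesses, label in zip(predictions, labels)
    )
    return hits / len(labels)


# Mitchell's example: the right answer is the machine's fourth guess.
basketball_guesses = ["croquet ball", "bikini", "warthog", "basketball", "moving van"]

# Hypothetical second image whose first guess happens to be right.
cat_guesses = ["tabby cat", "lynx", "fox", "dog", "pillow"]

predictions = [basketball_guesses, cat_guesses]
labels = ["basketball", "tabby cat"]

top5 = top_k_accuracy(predictions, labels, k=5)
top1 = top_k_accuracy(predictions, labels, k=1)
print("top-5 accuracy:", top5)
print("top-1 accuracy:", top1)
```

On this toy pair the machine scores perfectly under top-5 but gets only half right under top-1, because the basketball image counts as a hit in one metric and a miss in the other.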
As for the statement that humans are 95% accurate, Mitchell shows the data supporting it is flimsy. Two people tried a portion of the ImageNet challenge for a research project published in 2015. The person who put more time into it, Andrej Karpathy, who is now A.I. director at Tesla, studied a labeled set of 500 images as his “training data” and then sorted 1,500 unlabeled images into categories. With top-5 accuracy as the metric, he got 5% wrong. And now, that figure gets used as the mark of human performance.
Never mind that his error rate probably would have been lower if not for oddities of this specific test. Karpathy has written that about one-fourth of the errors made by him and the other test subject arose not because they were clueless about the pictures they were seeing but because they didn’t know or didn’t recall some of the exact labels used in ImageNet.
Further muddling matters, “No one, as far as I know, has reported a comparison between machines and humans on top-1 accuracy,” Mitchell writes.
With their speed and endless patience, computers are ideal for image recognition in many applications. They’re surely more accurate than humans in some situations. But it’s a stretch to say much more than that. “Object recognition is not yet close to being ‘solved’ by artificial intelligence,” Mitchell writes.
One of the hottest topics in A.I. now is how to get machines to not just detect statistical correlations in data but also to understand, on some level, the meaning of what they’re processing. Like Mitchell, New York University researchers Gary Marcus and Ernest Davis say that without such improvements, artificial intelligence won’t be “safe, smart, or reliable.” In their new book Rebooting A.I.: Building Artificial Intelligence We Can Trust, Marcus and Davis say people have been fooled into thinking computers are sharper than they are because of a “gullibility gap.”
“We can’t help but think about machines in cognitive terms (‘It thinks I deleted my file’), no matter how simpleminded the rules are that the machines might actually be following,” they write.
In hopes of making machines less simpleminded, many researchers are revisiting old methods of encoding computers with logic and common sense. Mitchell, for instance, describes her efforts to get computers to reason by analogy. If you want a computer to recognize images depicting “walking a dog,” the basic approach now is to show it thousands of images of dog-walking and then trust that common things in those pictures — dog, leash, hand, etc. — will trigger a positive signal when the machine sees future pictures of dog walking. However, it will probably whiff on unusual instances of dog walking. Mitchell shows some examples, such as a picture of someone walking a dog while riding a bike and one of a dog holding another dog’s leash in its mouth. Her research group is trying to get computers to understand that the latter instances are “stretched” versions of the prototypical situation.
Work like that is painstaking and slow. And even if it succeeds — if a computer can be made to understand on some level that “walking a dog” can take many forms — how rich would that understanding be if the machine never encounters a dog for itself?
When I was about 6 years old, I was riding my bike on a sidewalk near my house and stopped to look at (and maybe throw) a bunch of spiky seed balls from a sweetgum tree. Suddenly, someone opened a gate next to a house across the street, and out came a brown dog that hesitated for a second or two before bounding straight over and nipping at me, tearing the seat of my corduroy pants. The memory contributes to my insight into countless aspects of experience — surprise, fear, and pain; the unpredictability of some dogs; the exotic weirdness of sweetgum seed balls. It’s pretty shallow in comparison for a computer to be told that it’s not always an idiom when people say something bit them in the ass.
Philosophers and other A.I. skeptics have long argued that only so much can be taught to a disembodied electronic brain in a box. Many A.I. researchers have blamed such doubts on “some sort of residual mysticism” — an “unscientific faith in a soul-like mental essence,” as the cognitive scientist Andy Clark wrote in a 1997 book, Being There: Putting Brain, Body, and World Together Again. But the idea that intelligence requires a body sounds pretty good when you consider that crawling babies grasp basic concepts quickly while computers in data centers require vast amounts of electricity to absorb one thing.
Mitchell appears to be reluctantly coming around to the skeptics’ position. “After grappling with A.I. for many years, I am finding the embodiment argument increasingly compelling,” she writes.
What, then, will it take to produce a robot that moves around the world with insights about its own actions and those of the people, animals, and other machines it interacts with? “I can barely imagine,” she writes, “what breakthroughs we would need to build such a machine.”