When Arvind Narayanan gave a presentation at the Massachusetts Institute of Technology called “How to recognize AI snake oil” in 2019, he was surprised to find his academic talk going viral on Twitter, with the slide deck eventually downloaded tens of thousands of times, and numerous requests filling his inbox.
The overwhelming response has since led Narayanan, a professor of computer science at Princeton University, to expand his talk into a book he is co-writing with graduate student Sayash Kapoor. With the sensation caused by ChatGPT and generative AI, the book's subject is more timely than ever.
What is ‘AI snake oil’ and how do you distinguish it from the real thing?
Narayanan explains that AI is an umbrella term for a set of loosely related technologies without a precise definition. To help demystify the term, he has devised a scheme that classifies AI applications, from the genuine to the dubious, across three categories: AI relating to perception, AI automating human judgement, and predictive AI.
The first category includes technologies such as the song identification app Shazam, facial recognition, and speech-to-text. The second refers to AI used for making content recommendations, automating content moderation in social media, or detecting spam or copyright violations online. The third refers to predictive AI systems in tasks from hiring to setting bail to gauging business risk.
“The third category is really where most of the snake oil is – using AI to predict what a person might do in the future,” says Narayanan. “And then use that prediction to make decisions about them that might, in fact, give or deny them important life opportunities.”
Unlike AI used for tasks such as speech transcription or image recognition, he explains, predictive AI has no ground-truth data, or 'gold standard', against which its results can be evaluated, because the outcomes haven't happened yet. “The future is fundamentally unpredictable,” he says.
Whether screening job candidates, predicting recidivism or estimating the risk of a motor vehicle accident, Narayanan's research has found that purported AI tools fare little better than flipping a coin, and no better than long-established statistical methods such as regression analysis.
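To make that comparison concrete, here is a minimal, purely illustrative sketch (not drawn from Narayanan's research) of the kind of baseline check he advocates: before trusting a predictive tool, benchmark it against a simple, long-established method such as logistic regression on the same data. The synthetic data set, the choice of models and the metric below are all assumptions made for illustration.

```python
# Illustrative sketch: compare a "complex" predictive model against a plain
# logistic-regression baseline on the same (here, synthetic) data.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a risk-prediction dataset (e.g. hiring or recidivism outcomes)
X, y = make_classification(n_samples=5000, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)
complex_model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# If the complex model barely beats the baseline, the "AI" label adds little value.
print("logistic regression AUC:", roc_auc_score(y_test, baseline.predict_proba(X_test)[:, 1]))
print("gradient boosting AUC:  ", roc_auc_score(y_test, complex_model.predict_proba(X_test)[:, 1]))
```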
Generative AI reflects genuine progress
Where, then, does that leave ChatGPT? Narayanan views generative AI, including ChatGPT, as an outgrowth of perception-related AI, going beyond perceiving and classifying content to generating images or text on request. Because it reflects that genuine progress, he believes generative AI holds more promise than AI pitched as a substitute for human judgement or as a way to predict the future.
“The potential is clearly there but a lot of work still lies ahead to figure out which applications are even the right ones,” he says. In that vein, Narayanan points to AI tools he uses himself, such as GitHub Copilot, which can turn natural language prompts into code and translate code between programming languages.
At the same time, he highlights some of the flaws that have recently surfaced, most notably Microsoft’s new AI-powered Bing becoming erratic and telling lies in lengthy exchanges with journalists and early testers. That suggests to him that generative AI won’t necessarily upend search overnight.
Narayanan, who has a lively Twitter account (@random_walker), has also referred to ChatGPT itself as a “bullshit generator”. That isn’t a scientific term. “I just wanted to remind people that chatbots aren’t trained to be accurate,” he explains. “They’re trained to sound convincing, but fundamentally chatbots aren’t built with an ability to evaluate the truth or falsehood of statements.”
As such, he suggests ChatGPT and its rivals shouldn’t be viewed as trusted sources in areas where accuracy is vital, like providing health information. “I don’t think that problem is fundamentally insoluble. A lot of researchers are working on it, but it’s just not there yet,” he says.
Trying to slow the AI gold rush
Within the business realm, Narayanan suggests companies move carefully to incorporate generative AI into their operations. That means starting with the simplest tasks to be automated for productivity gains, “then once you have experience where you start to understand the limitations, gradually build up from there to try more complex tasks”.
That approach might involve customising a general foundation model rather than training a specialised one from scratch on a particular data set. “The reason people are currently excited is because they feel that foundation models are perhaps a more general and quicker way to get to business-specific objectives than to train a model on a particular data set,” says Narayanan.
The recent release of the ChatGPT API by OpenAI is likely to spur a rush of companies and startups harnessing the technology to add chatbots or other AI-powered features to their applications, so as not to fall behind the curve.
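For a sense of how low the barrier has become, here is a minimal sketch of the kind of request developers began wiring into their products after the API's release. The model name, message format and client interface reflect the openai Python package as it stood at launch; they are shown as an illustration of the pattern rather than a current reference, since the library has since been revised.

```python
# Illustrative sketch of a ChatGPT API call using the openai package as released
# in early 2023 (the 0.x client). Assumes an OPENAI_API_KEY environment variable.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # the model behind ChatGPT at the API's launch
    messages=[
        {"role": "system", "content": "You are a helpful customer-support assistant."},
        {"role": "user", "content": "How do I reset my password?"},
    ],
)
print(response["choices"][0]["message"]["content"])
```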
In contrast, Narayanan has praised Google for taking a cautious approach, significantly delaying the public release of its AI chatbot amid ethical considerations and internal debate. But in the wake of ChatGPT and Microsoft's Bing relaunch, the search giant is playing catch-up: it announced its Bard chatbot in February and plans to build AI into all its major products within months, according to a Bloomberg report.
Calling for regulation to enforce responsible AI
Welcome to the AI arms race. “A lot of the last five years of progress in responsible AI is in fact eroding at this moment,” says Narayanan, who co-authored a textbook on machine learning and fairness. He also led the Princeton Web Transparency and Accountability Project, uncovering how companies collect and use people’s personal information.
To limit the dangers of an AI free-for-all, Narayanan says government regulation will have to play a role in ensuring new AI systems perform as advertised and don’t abet discrimination, disinformation or other harms. Indeed, governments are scrambling to figure out how to address the proliferation of AI across all aspects of society.
But Narayanan emphasises that existing laws, such as those dealing with discrimination or fraud, can already be applied to problems emerging from the rise of AI. In that vein, the US Federal Trade Commission recently issued a warning to businesses about exaggerating what AI products can do or whether they use AI at all.
And since business as an institution enjoys a measure of public trust, Narayanan says it’s especially important companies don’t overpromise what AI can deliver. “Unfortunately, when they overhype some of these technologies and confuse public discourse, they’re doing everyone a big disservice,” he says.