The nifty app CamFind has come a long way with its artificial intelligence. It uses image recognition to identify an object when you point your smartphone camera at it. But back in 2015 its algorithms were less advanced: The app mostly used contract workers in the Philippines to quickly type what they saw through a user’s phone camera, CamFind’s co-founder confirmed to me recently.[1] You wouldn’t have guessed that from a press release it put out that year, which touted industry-leading “deep learning technology” but didn’t mention any human labelers.
The practice of hiding human input in AI systems remains an open secret among those who work in machine learning and AI. A 2019 analysis of tech startups in Europe by London-based MMC Ventures even found that 40% of purported AI startups showed no evidence of actually using artificial intelligence in their products.
This so-called AI washing shouldn’t be surprising. Global investment in AI companies has steadily risen over the past decade and more than doubled in the past year, according to market intelligence firm PitchBook. Calling your startup an “AI company” can lead to a funding premium of as much as 50% compared with other software companies, according to the MMC Ventures analysis.
Yet ignoring the workers who power these systems is leading to unfair labor practices and skewing the public’s understanding of how machine learning actually works.
In Silicon Valley, many startups have succeeded by following the “fake it ‘til you make it” mantra. For AI companies, hiring people to prop up algorithms can become a stopgap, which on occasion becomes permanent. Humans have been discovered secretly transcribing receipts, setting up calendar appointments or carrying out bookkeeping services on behalf of “AI systems” that got all the credit. In 2019, a whistleblower lawsuit against a British firm claimed customers paid for AI software that analyzed social media while staff members were doing that work instead.
There’s a reason this happens so often. Building AI systems requires many hours of humans training algorithms, and some companies have fallen into the gray area between training and operating. A common explanation is that human workers are providing “validation” or “oversight” to algorithms, like a quality-control check. But in some cases, these workers are doing more cognitively intensive tasks because the algorithms they oversee don’t work well enough on their own.
That can bolster unrealistic expectations about what AI can do. “It’s part of this quixotic dream of super-intelligence,” says Ian Hogarth, an angel investor, visiting professor at University College London and co-author of an annual State of AI report that was released on Tuesday. For the hidden workers, working conditions can also be “anti-human,” he says. That can lead to inequalities and poor AI performance.
For instance, Cathy O’Neil has noted that Facebook’s machine-learning algorithms don’t work well enough in stopping harmful content. (I agree.) The company could double its 15,000 content moderators, as suggested by a recent academic study. But Facebook could also bring its existing moderators out of the shadows. The contract workers are required to sign strict NDAs and aren’t allowed to talk about their work with friends and family, according to Cori Crider, the founder of tech advocacy group Foxglove Legal, which has helped several former moderators take legal action against Facebook over allegations of psychological damage. Facebook has said content reviewers could take breaks when they needed and were not pressured to make hasty decisions.
Moderation work is mentally and emotionally exhausting, and Crider says contractors are “optimized to within an inch of their lives” with an array of targets to hit. Keeping these workers hidden only exacerbates the problem.
A similar issue affects Amazon.com Inc.’s MTurk platform, which posts small tasks for freelancers. In their book “Ghost Workers,” Microsoft Corp. researchers Mary Gray and Siddharth Suri say these freelancers are part of an invisible workforce labelling, editing and sorting much of what we see on the internet. AI doesn’t work without these “humans in the loop,” they say, yet people are largely undervalued.
And a recent paper from academics at Princeton University and Cornell University called out data-labelling companies like Scale AI Inc. and Sama Inc. that pay workers in Southeast Asia and sub-Saharan Africa $8 a day. Sure, that’s a living wage in those regions, but in the long term it also perpetuates income inequality. A spokeswoman for Sama said the company has helped more than 55,000 people lift themselves out of poverty, and that higher local wages could negatively impact local markets, leading to higher costs for food and housing. Scale AI did not respond to a request for comment.
“Microwork comes with no rights, security, or routine and pays a pittance — just enough to keep a person alive yet socially paralyzed,” writes Phil Jones, a researcher for the British employment think tank Autonomy, adding that it is disingenuous to paint such work as beneficial to a person’s skills. Data labelling is so monotonous that Finland has outsourced it to prisoners.
Improving the employment status of these workers would make their lives better and also improve AI’s development, since feeding algorithms with inconsistent data can hurt future performance. Foxglove’s Crider says Facebook needs to make its content moderators full-time staff if it really wants to fix its content problems (most of them work for agencies like Accenture Plc). The Princeton and Cornell researchers say labelers need a more visible role in the development of AI and more equitable pay.
One glimmer in the darkness: Freelancers who do microtasks on Amazon’s MTurk platform have recently been holding worker forums to approach Amazon on issues like rejected work, according to one of their representatives. They aren’t creating a union per se, but their work is a unique attempt at organizing, giving AI’s ghost workers a voice they haven’t had until now. Here’s hoping the idea catches on more broadly.
[1] The process was internally referred to as a “hybrid” approach to image recognition, according to Bradford Folkens, the co-founder and current CEO of CamFind parent company CloudSight Inc. When its computer-vision algorithm had a high enough confidence level about a result, it would send that result directly to the user. When it was below a certain threshold, the humans would type out the result and save that for future training. He says the CEO at the time “probably didn’t feel the need to keep reiterating” that CamFind used humans because it had published many patents about this approach.
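For readers curious what such a “hybrid” routing step might look like, here is a minimal Python sketch of the confidence-threshold idea described in the note above. It is an illustration only, not CamFind’s or CloudSight’s actual code: the class name `HybridLabeler`, the 0.8 threshold and the `ask_human` callback are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical cutoff; the column does not say what threshold was used.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class HybridLabeler:
    """Illustrative router: the model answers when confident, a human otherwise."""

    threshold: float = CONFIDENCE_THRESHOLD
    # Human-typed labels saved for future training, as the note describes.
    training_queue: List[Tuple[str, str]] = field(default_factory=list)

    def route(
        self,
        image_id: str,
        model_label: str,
        model_confidence: float,
        ask_human: Callable[[str], str],
    ) -> Tuple[str, str]:
        """Return (label, source), where source is "model" or "human"."""
        if model_confidence >= self.threshold:
            # Confident enough: send the model's result directly to the user.
            return model_label, "model"
        # Below the threshold: a person types the label, and it is kept for training.
        human_label = ask_human(image_id)
        self.training_queue.append((image_id, human_label))
        return human_label, "human"


labeler = HybridLabeler()
confident = labeler.route("img-1", "coffee mug", 0.95, lambda _id: "unused")
uncertain = labeler.route("img-2", "coffee mug", 0.40, lambda _id: "teapot")
```

In this sketch, how often a person ends up doing the work depends entirely on where the threshold sits and how well the model performs, which is the gray area between training and operating that the column describes.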
This column does not necessarily reflect the opinion of the editorial board or Bloomberg LP and its owners.