Amazon Mechanical Turk (MTurk) is a crowdsourcing website with which businesses can hire remotely located "crowdworkers" to perform discrete on-demand tasks that computers are currently unable to do as economically. It is operated under Amazon Web Services, and is owned by Amazon.[1] Employers, known as requesters, post jobs known as Human Intelligence Tasks (HITs), such as identifying specific content in an image or video, writing product descriptions, or answering survey questions. Workers, colloquially known as Turkers or crowdworkers, browse among existing jobs and complete them in exchange for a fee set by the requester. To place jobs, requesters use an open application programming interface (API), or the more limited MTurk Requester site.[2] As of April 2019, requesters could register from 49 approved countries.[3]
History
The service was conceived by Venky Harinarayan in a U.S. patent disclosure in 2001.[4] Amazon coined the term artificial artificial intelligence for processes that outsource parts of a computer program to humans, for tasks that humans carry out much faster than computers. Jeff Bezos is reported to have proposed the development of Amazon's Mechanical Turk to realize this process.[5]
The name Mechanical Turk was inspired by "The Turk", an 18th-century chess-playing automaton made by Wolfgang von Kempelen that toured Europe, and beat both Napoleon Bonaparte and Benjamin Franklin. It was later revealed that this "machine" was not an automaton, but a human chess master hidden in the cabinet beneath the board and controlling the movements of a humanoid dummy. Analogously, the Mechanical Turk online service uses remote human labor hidden behind a computer interface to help employers perform tasks that are not possible using a true machine.
MTurk launched publicly on November 2, 2005. Its user base grew quickly. In early- to mid-November 2005, there were tens of thousands of jobs, all uploaded to the system by Amazon itself for some of its internal tasks that required human intelligence. HIT types expanded to include transcribing, rating, image tagging, surveys, and writing.
In March 2007, there were reportedly more than 100,000 workers in over 100 countries.[6] This increased to over 500,000 registered workers from over 190 countries in January 2011.[7] That year, Techlist published an interactive map pinpointing the locations of 50,000 of their MTurk workers around the world.[8] By 2018, research demonstrated that while over 100,000 workers were available on the platform at any time, only around 2,000 were actively working.[9]
Overview
A user of Mechanical Turk can be either a "Worker" (contractor) or a "Requester" (employer). Workers have access to a dashboard that displays three sections: total earnings, HIT status, and HIT totals. Workers set their own hours and are not under any obligation to accept any particular task.
Amazon classifies Workers as contractors rather than employees and does not pay payroll taxes. Classifying Workers as contractors allows Amazon to avoid obligations such as minimum wage, overtime, and workers' compensation, a common practice among "gig economy" platforms. Workers are legally required to report their income as self-employment income.
In 2013, the average wage for the multiple microtasks assigned, if performed quickly, was about one dollar an hour, with each task averaging a few cents.[10] However, calculating average hourly earnings on a microtask site is extremely difficult. Several sources of data show average hourly earnings in the range of $5–$9 per hour[11][12][13][14] among a substantial number of Workers, while the most experienced, active, and proficient workers may earn over $20 per hour.[15]
Workers can have a postal address anywhere in the world. Payment for completing tasks can be redeemed on Amazon.com via gift certificate (gift certificates are the only payment option available to international workers, apart from India) or can be transferred to a Worker's U.S. bank account.
Requesters can ask that Workers fulfill qualifications before engaging in a task, and they can establish a test designed to verify the qualification. They can also accept or reject the result sent by the Worker, which affects the Worker's reputation. As of April 2019, Requesters paid Amazon a minimum 20% commission on the price of successfully completed jobs, with increased amounts for additional services.[6] Requesters can use the Amazon Mechanical Turk API to programmatically integrate the results of the work directly into their business processes and systems. When employers set up a job, they must specify:
how much they are paying for each HIT accomplished,
how many workers they want to work on each HIT,
the maximum time a worker has to work on a single task,
how much time the workers have to complete the work,
as well as the specific details about the job they want to be completed.
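The settings listed above can be sketched as a small data structure. This is a minimal illustration, not Amazon's implementation: `HitSpec` and its field names are hypothetical, the keyword names in `to_create_hit_kwargs` follow what the AWS SDK for Python (boto3) is understood to accept for `create_hit`, and the cost estimate assumes only the 20% minimum commission mentioned above, with no fees for additional services. The task content itself (the question shown to workers) is left out.

```python
from dataclasses import dataclass
from decimal import Decimal

# Assumption: the 20% minimum commission quoted in the text, applied to the
# reward. Fees for additional services are not modelled.
MIN_COMMISSION_RATE = Decimal("0.20")


@dataclass(frozen=True)
class HitSpec:
    """Hypothetical container for the settings a requester chooses."""

    title: str
    description: str
    reward_usd: Decimal          # payment for each completed assignment
    workers_per_hit: int         # how many workers may complete each HIT
    seconds_per_assignment: int  # maximum time a worker has for one task
    lifetime_seconds: int        # how long the HIT stays available to workers

    def to_create_hit_kwargs(self) -> dict:
        """Map the settings onto boto3-style create_hit keyword arguments.

        The task content (Question or HITLayoutId) is omitted here and
        would have to be added before a real call.
        """
        return {
            "Title": self.title,
            "Description": self.description,
            "Reward": str(self.reward_usd),  # amount is passed as a string
            "MaxAssignments": self.workers_per_hit,
            "AssignmentDurationInSeconds": self.seconds_per_assignment,
            "LifetimeInSeconds": self.lifetime_seconds,
        }

    def estimated_cost(self, n_hits: int) -> Decimal:
        """Rewards plus minimum commission if every assignment is approved."""
        rewards = self.reward_usd * self.workers_per_hit * n_hits
        return (rewards * (1 + MIN_COMMISSION_RATE)).quantize(Decimal("0.01"))


# Hypothetical job: 1,000 images, 3 workers each, 5 cents per assignment.
spec = HitSpec(
    title="Label images",
    description="Tag the objects visible in each image",
    reward_usd=Decimal("0.05"),
    workers_per_hit=3,
    seconds_per_assignment=300,
    lifetime_seconds=86400,
)
print(spec.estimated_cost(n_hits=1000))  # 180.00
```

Using `Decimal` rather than floating point keeps the cent-level arithmetic exact, which matters when rewards are only a few cents each.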
Location of Turkers
Workers have been primarily located in the United States since the platform's inception[16] with demographics generally similar to the overall Internet population in the U.S.[17] Within the U.S. workers are fairly evenly spread across states, proportional to each state’s share of the U.S. population.[18] As of 2019[update], between 15 and 30 thousand people in the U.S. complete at least one HIT each month and about 4,500 new people join MTurk each month.[19]
Cash payments for Indian workers were introduced in 2010, which shifted the demographics of workers, although they remained primarily within the United States.[20] A website showing worker demographics in May 2015 showed that 80% of workers were located in the United States, with the remaining 20% located elsewhere in the world, most of whom were in India.[21] In May 2019, approximately 60% were in the U.S., 40% elsewhere (approximately 30% in India).[22] In early 2023 about 90% of workers were from the U.S. and about half of the remainder from India.[23]
Uses
Human-subject research
Since 2010, numerous researchers have explored the viability of Mechanical Turk to recruit subjects for social science experiments. Researchers have generally found that while samples of respondents obtained through Mechanical Turk do not perfectly match all relevant characteristics of the U.S. population, they are also not wildly misrepresentative.[24][25] As a result, thousands of papers that rely on data collected from Mechanical Turk workers are published each year, including hundreds in top ranked academic journals.
A challenge with using MTurk for human-subject research has been maintaining data quality. A study published in 2021 found that the types of quality control approaches used by researchers (such as checking for bots, VPN users, or workers willing to submit dishonest responses) can meaningfully influence survey results. The authors demonstrated this through the impact on three common behavioral/mental healthcare screening tools.[26] Even though managing data quality requires work from researchers, there is a large body of research showing how to gather high quality data from MTurk.[27] The cost of using MTurk is considerably lower than many other means of conducting surveys, so many researchers continue to use it.
The general consensus among researchers is that the service works best for recruiting a diverse sample; it is less successful with studies that require more precisely defined populations or that require a representative sample of the population as a whole.[28] Many papers have been published on the demographics of the MTurk population.[18][29][30] MTurk workers tend to be younger, more educated, more liberal, and slightly less wealthy than the U.S. population overall.[31]
Machine learning
Supervised machine learning algorithms require large amounts of human-annotated data to be trained successfully. Machine learning researchers have hired Workers through Mechanical Turk to produce datasets such as SQuAD, a question answering dataset.[32]
Missing persons searches
Since 2007, the service has been used to search for prominent missing individuals. This use was first suggested during the search for James Kim, but his body was found before any technical progress was made. That summer, computer scientist Jim Gray disappeared on his yacht and Amazon's Werner Vogels, a personal friend, made arrangements for DigitalGlobe, which provides satellite data for Google Maps and Google Earth, to put recent photography of the Farallon Islands on Mechanical Turk. A front-page story on Digg attracted 12,000 searchers who worked with imaging professionals on the same data. The search was unsuccessful.[33]
In September 2007, a similar arrangement was repeated in the search for aviator Steve Fossett. Satellite data was divided into 85-square-metre (910 sq ft) sections, and Mechanical Turk users were asked to flag images with "foreign objects" that might be a crash site or other evidence that should be examined more closely.[34] This search was also unsuccessful. The satellite imagery was mostly within a 50-mile radius,[35] but the crash site was eventually found by hikers about a year later, 65 miles away.[36]
Artistic works
MTurk has also been used as a tool for artistic creation. One of the first artists to work with Mechanical Turk was xtine burrough, with The Mechanical Olympics (2008),[37][38] Endless Om (2015), and Mediations on Digital Labor (2015).[39] Another work was artist Aaron Koblin's Ten Thousand Cents (2008).
Third-party programming
Programmers have developed browser extensions and scripts designed to simplify the process of completing jobs. Amazon has stated that it disapproves of scripts that completely automate the process and preclude the human element. The concern is that the task completion process (e.g. answering a survey) could be gamed with random responses, making the collected data worthless.[40] Accounts using so-called automated bots have been banned. There are also services that extend the capabilities of MTurk.
API
Amazon makes available an application programming interface (API) for the MTurk system. The MTurk API lets a programmer submit jobs, retrieve completed work, and approve or reject that work.[41] In 2017, Amazon launched support for AWS Software Development Kits (SDKs), making nine new SDKs available to MTurk users. MTurk is accessible via the API from the following languages: Python, JavaScript, Java, .NET, Go, Ruby, PHP, and C++.[42] Web sites and web services can use the API to integrate MTurk work into other web applications, providing users with alternatives to the interface Amazon has built for these functions.
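The retrieve-and-review part of that workflow might look like the following minimal sketch. It assumes a `client` object that behaves like the MTurk client in the AWS SDK for Python (`boto3.client("mturk")`), using the `list_assignments_for_hit`, `approve_assignment`, and `reject_assignment` operations. The function `review_submissions`, the `is_acceptable` callback, and the feedback message are hypothetical names chosen for illustration, and pagination of results is ignored for brevity.

```python
def review_submissions(client, hit_id, is_acceptable):
    """Approve or reject every submitted assignment for one HIT.

    `client` is expected to behave like boto3.client("mturk").
    `is_acceptable` is a hypothetical caller-supplied check that takes the
    raw answer payload and returns True or False.
    Only the first page of results is processed (no NextToken handling).
    """
    response = client.list_assignments_for_hit(
        HITId=hit_id,
        AssignmentStatuses=["Submitted"],
    )
    decisions = {}
    for assignment in response.get("Assignments", []):
        assignment_id = assignment["AssignmentId"]
        if is_acceptable(assignment["Answer"]):
            client.approve_assignment(AssignmentId=assignment_id)
            decisions[assignment_id] = "approved"
        else:
            # A rejection carries feedback explaining the decision.
            client.reject_assignment(
                AssignmentId=assignment_id,
                RequesterFeedback="Answer did not meet the task instructions.",
            )
            decisions[assignment_id] = "rejected"
    return decisions
```

Passing the client in as an argument keeps the function testable with a stand-in object. With boto3 installed and AWS credentials configured, a real client could be supplied instead, ideally one pointed at the requester sandbox while experimenting, since rejections affect a Worker's reputation.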
Use case examples
Processing photos / videos
Amazon Mechanical Turk provides a platform for processing images, a task well-suited to human intelligence. Requesters have created tasks that ask workers to label objects found in an image, select the most relevant picture in a group of pictures, screen inappropriate content, classify objects in satellite images, or digitize text from images such as scanned forms filled out by hand.[43]
Data cleaning / verification
Companies with large online catalogues use Mechanical Turk to identify duplicates and verify details of item entries. For example: removing duplicates in yellow pages directory listings, checking restaurant details (e.g. phone number and hours), and finding contact information from web pages (e.g. author name and email).[10][43]
Information collection
The diversity and size of Mechanical Turk's workforce allow information to be collected at a scale that would be difficult to achieve outside of a crowd platform. Mechanical Turk allows Requesters to amass a large number of responses to various types of surveys, from basic demographics to academic research. Other uses include writing comments, descriptions, and blog entries for websites, and searching for data elements or specific fields in large government and legal documents.[43]
Data processing
Companies use Mechanical Turk's crowd labor to understand and respond to different types of data. Common uses include editing and transcription of podcasts, translation, and matching search engine results.[10][43]
Research validity
The validity of research conducted with the Mechanical Turk worker pool has long been debated among experts.[44] This is largely because questions of validity[45] are complex: they involve not only questions of whether the research methods were appropriate and whether the study was well-executed, but also questions about the goal of the project, how the researchers used MTurk, who was sampled, and what conclusions were drawn.
Most experts agree that MTurk is better suited for some types of research than others. MTurk appears well-suited for questions that seek to understand whether two or more things are related to each other (called correlational research; e.g., are happy people healthier?) and questions that attempt to show one thing causes another thing (experimental research; e.g., does being happy make people healthier?). These categories capture most of the research conducted by behavioral scientists, and most correlational and experimental findings found in nationally representative samples replicate on MTurk.[46]
The type of research that is not well-suited for MTurk is often called "descriptive research." Descriptive research seeks to describe how or what people think, feel, or do; one example is public opinion polling. MTurk is not well-suited to such research because it does not select a representative sample of the general population. Instead, MTurk is a nonprobability, convenience sample. Descriptive research is best conducted with a probability-based, representative sample of the population researchers want to understand. When compared to the general population, people on MTurk are younger, more highly educated, more liberal, and less religious.[47][18][30]
Mechanical Turk has been criticized by journalists and activists for its interactions with and use of labor.
Computer scientist Jaron Lanier noted how the design of Mechanical Turk "allows you to think of the people as software components" in a way that conjures "a sense of magic, as if you can just pluck results out of the cloud at an incredibly low cost".[48] A similar point is made in the book Ghostwork by Mary L. Gray and Siddharth Suri.[49]
Critics of MTurk argue that workers are forced onto the site by precarious economic conditions and then exploited by requesters with low wages and a lack of power when disputes occur. Journalist Alana Semuels’s article "The Internet Is Enabling a New Kind of Poorly Paid Hell" in The Atlantic is typical of such criticisms of MTurk.[50]
Some academic papers have obtained findings that support or serve as the basis for such common criticisms,[51] but others contradict them.[52] A recent academic commentary argued that study participants on sites like MTurk should be clearly warned about the circumstances in which they might later be denied payment as a matter of ethics,[53] even though such statements may not reduce the rate of careless responding.[54]
A paper published by a team at CloudResearch[14] shows that only about 7% of people on MTurk view completing HITs as something akin to a full-time job. Most people report that MTurk is a way to earn money during their leisure time or as a side gig. In 2019, the typical worker spent five to eight hours per week and earned around $7 per hour. The sampled workers did not report rampant mistreatment at the hands of requesters; they reported trusting requesters more than employers outside of MTurk. Similar findings were presented in a review of MTurk by the Fair Crowd Work organization, a collective of crowd workers and unions.[55]
Monetary compensation
The minimum payment that Amazon allows for a task is one cent. Because tasks are typically simple and repetitive, the majority of tasks pay only a few cents,[56] but there are also well-paying tasks on the site.
Many criticisms of MTurk stem from the fact that a majority of tasks offer low wages. In addition, workers are considered independent contractors rather than employees. In the United States, independent contractors are not protected by the Fair Labor Standards Act or other legislation that protects workers’ rights. Workers on MTurk must compete with others for good HIT opportunities, and they spend uncompensated time searching for tasks and on other activities.
The low payment offered for many tasks has fueled criticism that Mechanical Turk exploits workers and does not compensate them for the true value of the tasks they complete.[57] One study of 3.8 million tasks completed by 2,767 workers showed that "workers earned a median hourly wage of about $2 an hour" with 4% of workers earning more than $7.25 per hour.[58]
The Pew Research Center and the International Labour Office published data indicating people made around $5.00 per hour in 2015.[12][59] A study focused on workers in the U.S. indicated average wages of at least $5.70 an hour,[60] and data from the CloudResearch study found average wages of about $6.61 per hour.[14] Some evidence suggests that very active and experienced people can earn $20 per hour or more.[61]
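Hourly figures like those above depend on how time is counted and which summary statistic is reported. The following minimal sketch uses an entirely made-up task log (the numbers are hypothetical and come from none of the cited studies) to show how including unpaid search time, or reporting a median rather than a mean, changes the result. The function name `hourly_rate` is likewise illustrative.

```python
from statistics import mean, median


def hourly_rate(reward_usd, paid_minutes, unpaid_minutes=0.0):
    """Effective hourly rate for one task, optionally counting unpaid time."""
    total_hours = (paid_minutes + unpaid_minutes) / 60.0
    return reward_usd / total_hours


# Hypothetical task log: (reward in USD, minutes working, minutes searching).
tasks = [
    (0.05, 2.0, 1.0),
    (0.10, 3.0, 1.0),
    (0.50, 5.0, 1.0),
    (2.00, 10.0, 2.0),
]

# Rates when only time spent on the task itself is counted.
rates_paid_only = [hourly_rate(r, work) for r, work, _ in tasks]
# Rates when unpaid time spent searching for the task is counted too.
rates_all_time = [hourly_rate(r, work, search) for r, work, search in tasks]

for label, rates in (("paid time only", rates_paid_only),
                     ("including search time", rates_all_time)):
    print(f"{label}: median ${median(rates):.2f}/h, mean ${mean(rates):.2f}/h")
```

In this toy log a single well-paying task pulls the mean above the median, and counting search time lowers both, which illustrates why studies using different definitions can report different hourly figures.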
Fraud
The Nation magazine reported in 2014 that some Requesters had taken advantage of Workers by having them do the tasks, then rejecting their submissions in order to avoid paying them.[62] Available data indicates that rejections are fairly rare. Workers report having a small minority of their HITs rejected, perhaps as low as 1%.[14]
In the Facebook–Cambridge Analytica data scandal, Mechanical Turk was one of the means of covertly gathering private information for a massive database.[63] The system paid people a dollar or two to install a Facebook-connected app and answer personal questions. The survey task, as a work for hire, was not used for a demographic or psychological research project as it might have seemed. The purpose was instead to bait the worker to reveal personal information about the worker's identity that was not already collected by Facebook or Mechanical Turk.
Labor relations
Critics have also noted that the marketplace does not allow workers to negotiate with employers. In response to criticisms of payment evasion and lack of representation, a group developed a third-party platform called Turkopticon which allows workers to give feedback on their employers. This allows workers to avoid potentially unscrupulous jobs and to recommend superior employers.[64][65] Another platform called Dynamo allows workers to gather anonymously and organize campaigns to better their work environment, such as the Guidelines for Academic Requesters and the Dear Jeff Bezos Campaign.[66][67][68][69] Amazon made it harder for workers to enroll in Dynamo by closing the request account that provided workers with a required code for Dynamo membership. Workers created third-party plugins to identify higher paying tasks, but Amazon updated its website to prevent these plugins from working.[70] Workers have complained that Amazon's payment system will on occasion stop working.[70]
Mechanical Turk is comparable in some respects to the now discontinued Google Answers service. However, Mechanical Turk is a more general marketplace that can potentially help distribute any kind of work task all over the world. The Collaborative Human Interpreter (CHI) by Philipp Lenssen also suggested using distributed human intelligence to help computer programs perform tasks that computers cannot do well. MTurk could be used as the execution engine for the CHI.
In 2014, the Russian search giant Yandex launched a similar system called Toloka.[71]
See also
CAPTCHA, which challenges and verifies human work at a simple online task
Schmidt, Florian Alexander (2013). "The Good, the Bad and the Ugly: Why Crowdsourcing Needs Ethics". 2013 International Conference on Cloud and Green Computing. pp. 531–535. doi:10.1109/CGC.2013.89. ISBN 978-0-7695-5114-2. S2CID 18798641.
Berg, J. (2015). "Income security in the on-demand economy: Findings and policy lessons from a survey of crowdworkers". Comparative Labor Law and Policy Journal. 37: 543.