Table of Contents
- 1 Introduction — A Different Kind of Contribution
- 2 What Is the AI Identity Labor Market and Why Is It Growing?
- 3 How People Are Getting Paid for Their Likeness
- 4 What Happens to Your Data After You Submit It
- 5 Why the Pay Doesn’t Always Match the Value Being Created
- 6 Is This Really Different from Other Gig Work?
- 7 What Could Change About How Contributors Are Compensated
- 8 FAQs
Introduction — A Different Kind of Contribution
A growing number of people are getting paid to contribute their faces, voices, movement, and daily routines to train AI systems. On the surface, the exchange seems straightforward: record a few clips, upload some photos, maybe let an app listen through your phone while you go about your day. Money arrives quickly, no experience needed. What tends to get less attention is what people are actually agreeing to when they do.
The appeal is clearest in places where dollar-denominated payments stretch furthest. In South Africa, a single task on platforms like Kled AI can pay roughly ten times the country’s minimum hourly wage. Contributors in markets like India and the United States are participating too, drawn by the accessibility and speed of the platforms. For many, it is a practical way to earn on their own schedule with no prior experience required.
This article looks at how that market works, what people are actually earning, and what the terms of participation mean in practice.
What Is the AI Identity Labor Market and Why Is It Growing?
The AI identity labor market refers to the ecosystem of platforms that pay individuals to contribute personal data, specifically voice, face, movement, and behavior, for use in training AI systems. It has grown quickly because AI models are running low on usable data.
For years, AI data work was understood primarily as labeling: annotators would draw bounding boxes around objects, flag unhelpful chatbot responses, or transcribe audio clips. That work continues at scale. But a different category has grown alongside it, one where people are not just evaluating data. They are the data.
Platforms in this space recruit contributors to submit navigation footage, record speech in varied emotional tones, allow apps to capture ambient audio through their phones, or sell personal conversations. Contributors receive a flat payment per task or per minute. That data feeds into datasets used across speech recognition, computer vision, emotion detection, and conversational AI.
Researchers have estimated that AI companies could exhaust fresh, high-quality text data to train on as early as this year. The most widely used public training datasets are now restricting access to AI developers. Synthetic data, generated by AI and recycled back into training pipelines, tends to degrade model quality over time. Human data, with its natural variation and contextual depth, remains what models need to generalize well across real-world conditions. That scarcity is what has made personal identity data so commercially valuable.
How People Are Getting Paid for Their Likeness
Understanding what people actually earn requires looking at how each platform structures its payments. On most platforms, payment is tied to a specific task; once the work is submitted, the transaction is closed. That structure takes a few different forms depending on the platform and the type of data being collected.
1. Task-Based Contributions
Contributors accept a specific job, such as recording a navigation walk, submitting facial expressions across different lighting conditions, or completing speech variations in multiple emotional tones. Payment is tied to the task itself, not to how the output is used afterward. Rates vary by platform and task type. For instance, Kled AI pays $14 for a single urban navigation recording, while Luel AI, backed by Y Combinator, pays around $0.15 per minute for multilingual conversations.
2. Ambient and Behavioral Data
Some platforms pay contributors to passively allow data collection rather than actively perform a task. Silencio accesses users’ microphones to capture ambient environmental sounds, such as restaurant interiors or traffic at busy intersections, that AI systems use to learn how to interpret real-world audio. Contributors earn based on the uniqueness and volume of what their phone captures.
3. Conversational and Personal Data
Neon Mobile represents a more personal category. The app, which reached the top five of the Apple App Store shortly after launch in September 2025, pays users to record their phone calls for conversational AI training at $0.15 per minute when only one party is a Neon user, doubling when both are. Within days of its rise, TechCrunch discovered a security vulnerability that allowed anyone to access the phone numbers, call recordings, and transcripts of other users. The founder took the app offline but did not disclose the breach to users in his initial notification.
What Happens to Your Data After You Submit It
The payment side of this market is transparent enough. What contributors agree to in the process is often not.
When contributors submit data to platforms in this market, they are typically granting a worldwide, exclusive, irrevocable, transferable, and royalty-free license. That license allows the company to sell, use, display, store, and create derivative works from the contributor’s digital likeness, without time limits or scope restrictions. Reviewing the publicly available terms of service across platforms in this category confirms this language is the standard, not the exception.
In practical terms, a voice recording submitted today could power a customer service system for years. A set of facial expressions captured for one purpose could be incorporated into a model used for an entirely different one. Without a clear understanding of what those terms allow, contributors have limited recourse if their data surfaces somewhere they never expected.
Once identity data moves through a dataset sale to a third party, the original platform’s stated policies no longer govern how it is used. Data marketplaces typically claim to remove identifying details before datasets are sold, but legal scholars note that biometric patterns, including voice prints and facial geometry, are difficult to meaningfully anonymize. Stripping a name does not remove the underlying identity signals encoded in how a person sounds or looks.
Why the Pay Doesn’t Always Match the Value Being Created
The earnings contributors receive are real, and in some contexts genuinely meaningful. But the payment structure raises a straightforward question: why is a contributor paid once, for a single moment of input, when the data continues generating value across model versions, product deployments, and licensing deals for years afterward?
The scale of what is being built on top of that data is significant. Leading AI companies were estimated in 2025 to be spending hundreds of millions of dollars annually on human-collected training data. Yet the people supplying it are paid per task, with no stake in what their contribution becomes. Scale AI’s Remotasks platform, one of the largest crowdsourced data operations globally, paid some annotators roughly one U.S. cent per task. When Meta invested over $14 billion into Scale AI in June 2025, the platform’s valuation was built in large part on the output of that contractor workforce. None of that investment translated into better pay or conditions for the people doing the work.
This is the structural tension at the center of the market. Identity data is treated as a raw input with a fixed price at the point of collection, but its value compounds over time as it gets incorporated into models, licensed to other companies, and deployed across products. The contributor’s relationship to that value ends at submission.
Is This Really Different from Other Gig Work?
On the surface, contributing identity data to AI looks like other flexible digital work. You sign up, complete tasks on your own time, and get paid. But the nature of what you are producing sets it apart from most gig work in a meaningful way.
A rideshare driver’s labor ends when the trip does. A freelance writer transfers a piece of content with negotiated terms. Stock photographers license images and retain rights they can exercise again. In each case, the output has a defined relationship to the person who created it.
With identity data, that relationship dissolves at submission. The contributor’s face, voice, or movement pattern becomes a component inside a model that neither they nor the platform can fully trace. It is not that the pay is low relative to the work involved, though that is often true. It is that the work has an open-ended afterlife the contributor never agreed to and cannot see.
What Could Change About How Contributors Are Compensated
The one-time-payment model is starting to face pushback, and there are early signs of movement toward something more structured, though it is happening unevenly across industries.
In music, some flat-fee AI training deals have shifted toward usage-based licensing, where artists receive compensation tied to how their audio is used rather than a single upfront payment. In September 2025, Reddit, Yahoo, and Medium launched the Really Simple Licensing standard to charge AI companies per crawl or per response. No major AI company has agreed to comply. On the regulatory side, California’s AB 2602, effective January 2026, voids contract clauses granting broad digital replica rights without detailed terms and legal oversight. The proposed federal NO FAKES Act would create a property right in one’s voice and likeness enforceable nationwide, part of a broader push to address how AI training intersects with consent and copyright.
But most of these developments apply to entertainment and media, not to the larger and less visible population of everyday contributors submitting data through crowdsourcing apps. For that group, the infrastructure is still missing: structured participation models that give contributors clarity on how their data will be used, greater transparency from platforms about downstream licensing, and real mechanisms for contributors to set terms rather than simply accept them.
The shift from identity as a one-time input to identity as something with ongoing value and enforceable rights is already taking shape in parts of the industry. Extending that logic to the broader contributor market is the next conversation that needs to happen.
FAQs
Can you really get paid to contribute your voice, face, or other personal data to AI?
Yes. Some platforms pay users to submit voice recordings, videos, and other personal data for AI training. Pay varies, and most agreements allow broad use of the data.
What happens to the data after it is submitted?
It is added to datasets used to train AI models. The data may be reused across systems and shared with other companies over time.
Is submitted data anonymous?
Not reliably. Biometric data like voice and facial features can remain identifiable, even without names attached. Contributors may not know how their data is used or where it appears.