Friday, October 24, 2025

Addressing Human Rights Risks: The Impact of Unequal Inputs in Generative AI

The Increasingly Unequal Landscape of AI and the Internet

A quiet yet profound transformation is unfolding across the internet, and it’s one that has significant ramifications for marginalized voices. Generative artificial intelligence (AI) is evolving beyond a mere innovation tool; it is increasingly being recognized as a mechanism that amplifies existing inequalities. For communities communicating in underrepresented languages, the consequences extend far beyond the biases that may arise from these technologies. There’s a real threat of increasing exclusion from digital spaces, distortion of their realities, and exposure to new forms of harm—all occurring within a framework that is largely unregulated and lacking in accountability.

The Blurred Lines Between AI Developers and Social Platforms

The distinction between AI developers and social media platforms has become increasingly blurred. Major companies like Meta and xAI now repurpose vast user datasets, comprising posts, photos, and behavioral patterns, to train their AI models. Alarmingly, this data is often used without user consent or robust safeguards, and vague justifications such as "legitimate interest" only fuel legal and ethical questions. Meta, for instance, uses nearly all public posts from its platforms, including Facebook and Instagram, to train its models, raising significant concerns about user privacy. Users in the European Union have the option to opt out, while those elsewhere are left without such agency.

Privacy, Freedom of Expression, and Community Impact

The increasing integration of social media and generative AI systems is fundamentally reshaping perceptions of privacy and freedom of expression, and it places vulnerable communities at disproportionate risk. These AI models not only disregard the languages and experiences of such groups; they can also algorithmically distort them, amplifying existing misrepresentations and biases in the content they generate and further marginalizing these communities.

Direct AI Training from User Data

The practice of using online content to train AI is not new, yet the unprecedented scale and directness of this data extraction raise significant ethical concerns. Major platforms like Meta, X, and TikTok draw on immense amounts of public user data, from posts and images to behavioral patterns, raising pressing questions about consent and surveillance.

In early 2025, LinkedIn was sued for allegedly using private user messages to train its AI models. Around the same time, Meta acknowledged training its AI systems on public content from Facebook and Instagram dating back to 2007. This convergence of platforms and AI development marks a shift in which companies no longer merely host user-generated content but actively repurpose it for AI training.

Learning Hate, Producing Bias

Among the most severe risks posed by these AI systems is the amplification of hate and discrimination inherent in their training data. Importantly, AI has no ability to distinguish between harmful and harmless content; it learns indiscriminately from what it is fed. If models are exposed to racist, misogynistic, or violent content, their generated outputs will reflect these same biases, albeit in a seemingly neutral manner.
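
To make that mechanism concrete, here is a minimal, purely illustrative sketch: a toy next-word model built from invented sentences (every string and name below is hypothetical) that reproduces whichever association dominates its training data, with no notion of whether that association is harmful.

```python
from collections import Counter, defaultdict

# Toy "training corpus": the model will learn whatever associations appear here,
# harmful or not, because nothing in this pipeline distinguishes the two.
corpus = [
    "group_a people are brilliant",
    "group_b people are dangerous",   # a harmful association present in the data
    "group_b people are dangerous",   # repetition makes it the dominant pattern
    "group_b people are friendly",
]

# Count which word follows each one- or two-word context (a tiny n-gram model).
next_word = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for i in range(len(tokens) - 1):
        context = " ".join(tokens[max(0, i - 1):i + 1])
        next_word[context][tokens[i + 1]] += 1

def complete(context: str) -> str:
    """Return the most frequent continuation seen for this context."""
    counts = next_word.get(context)
    return counts.most_common(1)[0][0] if counts else "<unknown>"

print(complete("people are"))  # -> "dangerous": the majority association wins
```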

Following Elon Musk’s takeover of Twitter and its transformation into X, content moderation practices became significantly laxer. The consequences were starkly illustrated in early 2025, when xAI’s Grok 3 model generated racist imagery and misogynistic content, showing how unfiltered training data can lead directly to harmful outputs. Moreover, the awareness that online behavior may feed AI training can push marginalized communities toward self-censorship, further complicating the dynamics of online expression.

Privacy Violations at Scale

The use of behavioral data to train AI blurs the lines between public and private online existence. Under the EU General Data Protection Regulation (GDPR), processing personal data requires a legal basis. While companies often cite "legitimate interest" for using publicly available data, this rationale falls short when users never expected their data to be used for AI training.

A 2013 study revealed that Facebook “likes” could predict personal characteristics like race and religion with surprising accuracy. Today, advancements in technology allow for even deeper inferences, stripping users of control over their data and privacy.
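
As a rough sketch of the kind of inference that study points to (not a reproduction of it), the snippet below fits a plain logistic regression on a synthetic matrix of page "likes" to predict an invented sensitive attribute; every number and name here is an assumption made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic data: 1,000 users x 50 pages; an entry is 1 if the user "liked" that page.
n_users, n_pages = 1000, 50
likes = rng.integers(0, 2, size=(n_users, n_pages))

# A made-up sensitive attribute that happens to correlate with a handful of pages,
# mimicking how real behavioral traces can leak personal characteristics.
signal_pages = [3, 7, 21]
attribute = (likes[:, signal_pages].sum(axis=1)
             + rng.normal(0, 0.5, n_users) > 1.5).astype(int)

X_train, X_test, y_train, y_test = train_test_split(likes, attribute, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"accuracy on held-out users: {model.score(X_test, y_test):.2f}")
```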

Risks for Non-Mainstream Language Users

Individuals who communicate in underrepresented languages, such as Persian, encounter unique risks in the generative AI landscape. Most AI systems are trained predominantly on content in English and a handful of other major languages, often neglecting the nuances of less widely spoken ones. This oversight results in misunderstanding and misrepresentation, while poorly resourced content moderation in these languages lets misinformation and hate speech spread unchecked.
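
One way to see how that underrepresentation arises is to audit the language mix of a training corpus. The sketch below assumes each record already carries a language tag (the field names and the tiny corpus are hypothetical) and simply reports each language's share.

```python
from collections import Counter

# Hypothetical records; a real corpus would hold millions of documents tagged
# by a separate language-identification step.
corpus = [
    {"lang": "en", "text": "..."},
    {"lang": "en", "text": "..."},
    {"lang": "en", "text": "..."},
    {"lang": "fa", "text": "..."},  # Persian: a small slice of the data
]

counts = Counter(record["lang"] for record in corpus)
total = sum(counts.values())

for lang, n in counts.most_common():
    print(f"{lang}: {n}/{total} documents ({n / total:.0%})")
# A skew like this means the model sees too little Persian to learn its nuances.
```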

In nations subject to sanctions or political isolation, users often have little to no recourse when their rights are infringed upon online. Censorship and restricted connectivity further hamper access to education and digital literacy. This exclusion extends to transparency initiatives where these users lack access to safety updates or AI disclosures in their languages—rendering accountability all but impossible.

The Path Forward: Mitigating Harm

Despite these escalating risks, a rights-based approach offers a path forward. At a technical level, cleaning and diversifying AI training datasets is vital to keep harmful biases from being perpetuated. Technical solutions alone will not suffice, however; systems also need independent oversight, including public audits that evaluate training processes and model outputs.
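
In outline, that dataset cleaning and diversification could look something like the sketch below: a keyword blocklist stands in for whatever harm classifier a real pipeline would use, and each language is re-sampled toward a common size; all names, fields, and thresholds are assumptions made for illustration.

```python
import random
from collections import defaultdict

# Placeholder harm filter; a real pipeline would use a trained classifier here.
BLOCKLIST = {"slur_1", "slur_2"}

def is_harmful(text: str) -> bool:
    return any(term in text.lower() for term in BLOCKLIST)

def clean_and_balance(records, target_per_lang=1000, seed=0):
    """Drop flagged records, then re-sample each language toward a common size."""
    random.seed(seed)
    by_lang = defaultdict(list)
    for rec in records:
        if not is_harmful(rec["text"]):
            by_lang[rec["lang"]].append(rec)

    balanced = []
    for lang, recs in by_lang.items():
        if len(recs) >= target_per_lang:
            balanced.extend(random.sample(recs, target_per_lang))     # down-sample
        else:
            balanced.extend(random.choices(recs, k=target_per_lang))  # up-sample
    return balanced
```

Re-sampling is only a stopgap, though: genuinely diversifying a corpus still means collecting more and better data in underrepresented languages.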

Civil society stakeholders play an instrumental role in holding tech companies accountable through reporting mechanisms and policy analysis. Digital literacy initiatives can further empower users to recognize algorithmic bias and engage critically with AI-generated content. Stronger legal protections, particularly for users in under-resourced regions, are increasingly urgent to ensure that AI does not simply reinforce existing societal inequalities but is steered toward empowerment and equity.
