Timnit Gebru
The AI ethics researcher whose firing from Google sparked a global debate about bias, accountability, and who gets to shape the future of AI.
The Necessary Critic
Born: 1983, Addis Ababa, Ethiopia
Timnit Gebru did not set out to become a symbol. She set out to do research. The research she was doing — on bias in facial recognition systems, on the environmental and social costs of large language models, on the ways in which AI systems can embed and amplify the inequalities of the societies they are trained on — made her, in the eyes of the organisation she worked for, inconvenient. When Google fired her in December 2020, in circumstances that remain disputed but that involved a research paper the company had asked her to retract, the dismissal produced a global reaction that demonstrated how charged the questions she had been asking had become.
The reaction was not merely about Gebru as an individual, though the individual aspects — a Black woman researcher in a field dominated by white men, fired in ways that struck many observers as retaliatory — were inextricable from the substantive ones. It was also about the questions her research raised: who gets to decide what counts as harm, when the people building powerful systems have commercial interests in not finding harm; who gets to do the research, when the most important infrastructure for AI research is controlled by companies whose accountability structures are opaque; and whether the frameworks for AI safety that dominate industry and academic discourse — alignment, value learning, corrigibility — adequately address the harms that AI systems are already causing, right now, to people at the margins of technological power.
Addis Ababa, Massachusetts, and Stanford
Gebru was born in Addis Ababa, Ethiopia, in 1983, the youngest of three children of Eritrean parents. Her father, an electrical engineer, died when she was five; she was raised by her mother, an economist. At sixteen, after fleeing the Eritrean–Ethiopian War and spending a brief period in Ireland, she received political asylum in the United States, arriving in Massachusetts with the kind of rootlessness and determination that characterise people who have had to rebuild their lives more than once.
She studied electrical engineering at Stanford, earning her bachelor’s and master’s degrees there, and spent several years at Apple as an audio engineer before returning to Stanford for her doctorate in Fei-Fei Li’s computer vision group. Her doctoral research focused on computer vision applied to urban environments — using street-level imagery to infer demographic and socioeconomic characteristics of neighbourhoods, work that combined technical sophistication with an early interest in the social implications of AI systems.
At Stanford she was already thinking about questions that the mainstream computer vision community was not much asking: what happens when the demographic composition of a training dataset does not reflect the demographic composition of the population a system is deployed on? What are the consequences of deploying systems trained on data collected in wealthy, predominantly white, English-speaking environments in contexts that are none of those things? The technical framing of these questions — as problems of distribution shift, of domain adaptation, of statistical bias — was available, but the social and political framing she brought to them was not standard in computer vision.
After her doctorate she did a postdoctoral fellowship at Microsoft Research, in the group studying fairness, accountability, transparency, and ethics in AI, and then, in 2018, accepted a position as a research scientist at Google Brain, where she became co-lead of the Ethical AI team alongside Margaret Mitchell.
Gender Shades and the Bias Question
The paper that brought Gebru to broad public attention was not published at Google. It was her work with Joy Buolamwini of the MIT Media Lab, published in 2018 under the title “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification.” The paper evaluated the gender classification features of three commercial facial analysis systems — from Microsoft, IBM, and Face++ — and found significant disparities in accuracy across demographic groups.
The systems performed best on light-skinned male faces and worst on dark-skinned female faces. In the worst case, the error rate for dark-skinned women reached 34.7 per cent, against less than one per cent for light-skinned men, a gap of more than thirty percentage points. The disparity was not a matter of the systems being generally poor — they were generally quite good, by the benchmarks of the time — but of systematic failure concentrated in specific demographic groups.
The paper was rigorous, clearly written, and immediately consequential. It produced responses from Microsoft and IBM within months, committing to improvements. It generated extensive media coverage. It established Gebru and Buolamwini as among the most important voices on the question of algorithmic bias. And it demonstrated, more clearly than any previous work, that the bias problem in AI was not a theoretical concern about future systems but a practical problem with systems already deployed commercially.
The paper was also a demonstration of a methodological approach that Gebru would continue to develop: take deployed systems seriously as technical objects, evaluate them rigorously, measure what they actually do rather than what their designers claim they do, and pay particular attention to disparities across demographic groups that are underrepresented in the development teams and training datasets of the systems being evaluated.
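This approach can be made concrete in a few lines of code. The sketch below is a minimal illustration of disaggregated evaluation, not a reconstruction of the paper’s actual pipeline: the records are synthetic, and a real audit would use a dataset deliberately balanced across skin type and gender, such as the Pilot Parliaments Benchmark the paper introduced.

```python
# Minimal sketch of an intersectional (disaggregated) accuracy audit in the
# spirit of Gender Shades. All records here are synthetic placeholders.
from collections import defaultdict

# Each record: (skin_type, gender, true_label, predicted_label).
records = [
    ("lighter", "male",   "male",   "male"),
    ("lighter", "male",   "male",   "male"),
    ("lighter", "male",   "male",   "male"),
    ("lighter", "female", "female", "female"),
    ("lighter", "female", "female", "female"),
    ("darker",  "male",   "male",   "male"),
    ("darker",  "male",   "male",   "male"),
    ("darker",  "female", "female", "female"),
    ("darker",  "female", "female", "male"),   # misclassified
    ("darker",  "female", "female", "male"),   # misclassified
]

# Count errors per intersectional subgroup rather than overall: a single
# pooled accuracy figure can hide failure concentrated in one group.
totals, errors = defaultdict(int), defaultdict(int)
for skin, gender, truth, pred in records:
    group = (skin, gender)
    totals[group] += 1
    errors[group] += int(truth != pred)

for group in sorted(totals):
    rate = errors[group] / totals[group]
    print(f"{group[0]} {group[1]}: error rate {rate:.1%} "
          f"({errors[group]}/{totals[group]})")
```

Reporting the subgroup table rather than a single number is exactly the paper’s point: the synthetic classifier above is 80 per cent accurate overall, while two thirds of the dark-skinned female faces are misclassified.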
Stochastic Parrots and the Firing
In 2020, Gebru co-authored a paper with Emily Bender, Angelina McMillan-Major, and Margaret Mitchell titled “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” The paper raised a series of concerns about large language models — the GPT-class systems that were then becoming the dominant paradigm in NLP research.
The concerns were multiple and specific. Large language models require enormous quantities of energy to train, with environmental consequences that were rarely discussed in the literature celebrating their capabilities. The training data — vast quantities of text scraped from the internet — reflects the demographics, languages, and perspectives of internet users, who are not representative of humanity as a whole. The systems learn to produce fluent, confident text without any understanding of what that text means, creating systems that can mislead users about the nature of the intelligence they are interacting with. And the research agenda around making these systems larger and larger concentrates resources and attention in ways that may not be the most socially beneficial.
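On the energy point specifically, the critique draws on estimates of the kind published by Strubell et al. (2019), which multiply accelerator count, training time, average power draw, and datacentre overhead. The sketch below shows that arithmetic; every number in it is an illustrative assumption, not a measurement of any actual model.

```python
# Back-of-envelope estimate of training energy and emissions, in the style
# of Strubell et al. (2019). All values are illustrative assumptions.
gpu_count = 512          # accelerators used for the training run (assumed)
train_days = 30          # wall-clock training time in days (assumed)
gpu_power_kw = 0.3       # average draw per accelerator, kW (assumed)
pue = 1.5                # datacentre power usage effectiveness (assumed)
co2_kg_per_kwh = 0.4     # grid carbon intensity; varies widely by region

hours = train_days * 24
energy_kwh = gpu_count * hours * gpu_power_kw * pue
co2_tonnes = energy_kwh * co2_kg_per_kwh / 1000

print(f"Energy: {energy_kwh:,.0f} kWh, roughly {co2_tonnes:,.1f} tonnes of CO2")
```

Under these assumptions the run consumes about 166,000 kWh and emits on the order of sixty-six tonnes of CO2; the paper’s argument is less about any particular figure than about the fact that such figures were rarely reported at all.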
Google’s leadership asked Gebru to retract the paper before publication, or to remove the names of Google employees from the author list. She refused, asked for an explanation of how the decision had been reached, and sent an email to an internal mailing list describing her treatment. Google terminated her employment, initially claiming she had resigned, a characterisation that was immediately disputed by Gebru and by Google employees familiar with the exchanges.
The firing produced a reaction far larger than the departure of a single researcher would ordinarily explain. More than 2,600 Google employees signed a letter of protest. Researchers across AI and computer science published expressions of support. The incident generated months of media coverage and congressional attention, and it framed the subsequent public debate about the relationship between AI companies and AI ethics research in ways that continue to shape the field.
DAIR and the Independent Research Agenda
In 2021, Gebru founded the Distributed AI Research Institute (DAIR), a non-profit research organisation designed to conduct AI research outside the constraints of both industry employment and conventional academic structures. The founding premise was that the most important AI ethics research — research that challenges the assumptions and practices of powerful institutions — cannot reliably be conducted by people employed by those institutions.
DAIR has pursued research on a range of questions including the harms of AI-generated misinformation, the labour conditions of data workers whose annotations train AI systems, the environmental costs of large-scale AI, and the governance structures that would be needed to make AI development more democratically accountable. The work is deliberately interdisciplinary, drawing on sociology, political science, and ethnic studies alongside computer science.
Gebru has also been a prominent voice in the debate about what she and the philosopher Émile P. Torres call “TESCREAL” — a cluster of techno-utopian ideologies (Transhumanism, Extropianism, Singularitarianism, Cosmism, Rationalism, Effective Altruism, Longtermism) that she argues have disproportionate influence on AI development and safety research. Her critique is that these ideologies, by prioritising speculative long-term risks from superintelligent AI over the documented present-day harms of existing AI systems, systematically direct resources and attention away from the people who are currently being hurt and toward the people who are least likely to be hurt by AI — an essentially political choice dressed up as scientific reasoning.
This critique has itself attracted criticism from researchers who believe that long-term AI risk and present-day AI harm are complementary concerns rather than competing ones, and that framing them as a zero-sum competition for resources and attention is unhelpful. The debate is ongoing and substantive, and Gebru’s position in it is that of a researcher with both the credibility of serious technical work and the motivation of someone who has experienced, personally, the ways in which AI institutions manage dissent.
Key Works & Further Reading
Primary sources:
- “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification” — Buolamwini and Gebru (2018). The paper that established her public reputation.
- “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” — Bender, Gebru, McMillan-Major, and Mitchell (2021). The paper that precipitated her firing.
- DAIR Institute publications (2021–present). Her ongoing research agenda outside institutional constraints.
Recommended reading:
- Race After Technology — Ruha Benjamin (2019). The sociological framework that contextualises Gebru’s technical work.
- Weapons of Math Destruction — Cathy O’Neil (2016). The earlier, more accessible account of algorithmic harm that Gebru’s work extends.
- Atlas of AI — Kate Crawford (2021). The most comprehensive critical account of AI’s material and social infrastructure.
- The Alignment Problem — Brian Christian (2020). The mainstream AI safety perspective that Gebru’s critique is in productive tension with.