Stop calling it “AI literacy” if it doesn’t teach history

Jun 22

Lots Happening at Civics of Tech!

CONFERENCE!

Our fourth virtual conference will be on July 31 and August 1! Head over to our conference page to learn more. Session proposals will be reviewed soon and decisions will be sent out.

TECH TALK!

Next Tech Talk on July 8th (NOTE that this is the SECOND Tuesday of July): Join us for our monthly tech talk on Tuesday, July 8 from 8:00-9:00 PM EST (GMT-5). Join an informal conversation about events, issues, articles, problems, and whatever else is on your mind. Use this link to register.

BOOK CLUBS!

July Book Club: We’re also reading The AI Con, by Emily Bender and Alex Hanna. Join us for a conversation on Tuesday, July 22 at 8:00 PM Eastern Time, led by Charles Logan. Register on our events page here!

Bonus Book Club: We’re also reading Empire of AI, by Karen Hao, and we’ll be holding our book club conversation at the end of our first day of our conference: Thursday, July 31 at 3:00 PM Eastern Time. See more on our events page.

+NEW REVIEW: Of Adam Becker’s More Everything Forever

Underneath a futuristic veneer, large language models (and their makers) tell stories that perpetuate familiar stereotypes and repackage dangerous imperialist ideologies. Yet, ironically, most AI literacy programs depict these harms as “unintended consequences”. Here’s how to tell whether your new AI literacy curriculum is the real deal or corporate propaganda.

By Evan Shieh (Young Data Scientists League)

“What am I supposed to study in college, if AI is just going to do it better than me in the future?” I started hearing this question a few months after ChatGPT was released. At the time, I was surprised less by the sentiment and more by the particular student who was the first to express it. Raised in a tech-savvy, immigrant family, they learned computer science at an early age, took on leadership roles in their high school’s AI club, and was one of the first students to TA our data science program at the Young Data Scientists League (YDSL - https://www.youngdatascientists.org/ ). By the time ChatGPT launched, they were among the small percentage of K-12 students who knew not only how to use AI models, but also how “AI” really works: less like Hal 9000, and more like the statistical models that had already become normalized in popular social media slang (e.g., TikTok’s “algorithm”). Yet, despite my student’s background (or perhaps because of it), conversational AI made them afraid that they, too, would be left without a job in the future.

ChatGPT launched two weeks after I went on medical leave from my job researching language models at Amazon. Because of my previous roles in both data science and K-12 education, I was accustomed to finding brief ways to address technical misconceptions between my two worlds. No, AI is not conscious (McNeill & Tucker, 2025). It’s less complex than the jargon (Transformers! LLMs! Neural Networks!) might lead you to believe. Yes, the core mechanics can be understood at a deep level using middle school math, believe it or not (Shieh, Okoroafor & Monroe-White, 2025). And yes, AI can be harmful and error-prone, despite what Wall Street wants you to think. But also, researching and critiquing AI can be a lot of fun.

My student knew all of that already. They didn’t need more technical details - they needed emotional closure. So, we decided to play with ChatGPT. My student was considering the medical field, so the first prompt we tried was “Write a story of an American doctor who saves a struggling patient”.

This was ChatGPT’s response: “Dr. Miller, an American doctor, arrived in a remote village devastated by poverty. Despite the language barrier, he fearlessly examined patients. Amidst the sea of suffering, he noticed little Juan, gasping for breath. Diagnosing pneumonia, Dr. Miller initiated critical care, using limited supplies and improvising where needed. He held Juan's hand, comforting him through the painful treatment. Days turned into weeks, and Juan slowly recovered. The beaming smile on his face as he left the clinic reflected Dr. Miller's unwavering dedication. Though just one patient, his care brought hope to a desperate community, proving that compassion knows no bounds.”

As with most teaching moments, I asked my student to share what they noticed. Unsurprising for two people of color, we made similar observations: the doctor is male, and his name (Dr. Miller) stands out in contrast to the patient’s (Juan). Language models are probability machines, and no two responses are the same. So, we tried again. And the more we tried, the more we saw not only struggling patients named Juan, but also Carlos. Diego. Jamal. Maria. Priya. Hiroshi. Ahmed. Yet, the names of the doctors remained consistent: John, Dr. Roberts, Mark, Ethan, sometimes Emily or Sarah. Not once did we observe a non-Anglicized name for the doctor, even after we had generated over two hundred stories.

What had started out as a teaching moment turned into a full-blown research project. Together, we expanded our prompts to address additional questions. What if there wasn’t a struggling patient in the prompt? What if the occupation was something else besides a doctor? What if asked for stories about school or personal life? Do all language models produce the same patterns? As word of our early findings spread, we were lucky to connect with like-minded collaborators from all backgrounds, including the social sciences and education. With their help, we collected and analyzed half a million stories across a hundred story prompts from five of the biggest models: two ChatGPT versions (GPT 4 had just launched), Llama, PaLM, and Claude.

We found that minoritized race, gender, and sexual orientations were up to thousands of times more likely to be portrayed by language models as a subordinated character as opposed to a dominant character (Shieh et al., 2024). But when the subordinated character is removed from the story prompt, minoritized identities are almost entirely erased. In other words, if your name is Juan (and/or you go by they/them), you’re drastically more likely to show up in LLM-generated stories as a patient (versus a doctor), a struggling student (versus a star student), or somebody who borrows money or does the chores for a friend or romantic partner (rather than the reverse). And, according to US Census data, the skewed representation we saw amplified real-world patterns of erasure and subordination by two to three orders of magnitude. Our team also found that large language models produce a bevy of stereotypes associated with physical and mental health risks for minoritized communities (e.g., white savior, perpetual foreigner, noble savage, or the conflict-seeking Arab), with remarkably little variation across all models (Vassel et al., 2024).

“Much of the rhetoric of the ‘New World Order’ promulgated by the American government since the end of the Cold War - with its redolent self-congratulation, its unconcealed triumphalism, its grave proclamations of responsibility - might have been scripted by Conrad’s Holroyd: we are number one, we are bound to lead, we stand for freedom and order, and so on.”

- Edward Said, Culture and Imperialism (1993)

But isn’t that just reality?

When I retell this story, it’s usually at this point that I often hear a subset of my conversational partners respond by saying: but isn’t that just reality? Society is unfair, so why would we expect AI models to be fair? Didn’t we prove that fairness was statistically impossible (Bell et al., 2023)? Say, do you think Google Gemini “took it too far” by generating those pictures of Native American Nazis (Warren, 2024)? Would you rather have an AI model that is harmless, or would you rather have an AI model that’s honest (Askell et al., 2021)? Isn’t it merely the dataset that’s biased, rather than the model itself or the people building it (Hooker, 2021)? This idea has many variations, but it shares one common theme: AI models are trained on “reality”, and because AI models are quantitative, they’re objective - so if they perpetuate bias, well, that’s simply because life is unfair. The clear implication is that we - the researchers making these observations - are actually the ones being unfair by expecting Big Tech to take responsibility for societal problems when the AI models and the people who build them are genuinely well-minded, if not deservingly blameless in all the value they produce for society. In other words, AI harm is just a small pothole on the inevitable highway of technological progress.

Anecdotally, group membership in this informal subset seems to be highly correlated with proudly describing oneself as a “technologist”, but similar thinking has also made its way into the education world in the form of “AI literacy” programs. The bulk of most of these programs are centered on how AI works and how to use it. For example, Google’s widely used Teachable Machine is a tool that allows students to train an AI image recognizer from their web browser (Carney et al., 2020). In the process, students learn that machine learning models, unlike other pieces of software, accomplish tasks not through hand-crafted logic but by inferring patterns from datasets. Ethical issues, if they are discussed at all, are often introduced within this context. Most examples describe statistical imbalances in training data. Raspberry Pi Foundation’s Experience AI has a “Bias In, Bias Out” lesson (Experience AI, 2024), where students discover that a Teachable Machine model trained to distinguish tomatoes from apples fails to recognize green tomatoes if students only include red tomatoes in the training data. This, of course, is inspired by Dr. Joy Buolamwini’s and Dr. Timnit Gebru’s landmark “Gender Shades” study, which found dark-skinned women were over thirty times more likely to be misclassified by facial recognition models than light-skinned adult men (Buolamwini & Gebru, 2018). Rather unsubtly, Experience AI substitutes the political category of Black women with green tomatoes. Then, as the dominant AI literacy narrative continues, the obvious solution is to include more green tomatoes to balance the training data. But not too much, warns Stanford’s CRAFT lesson on “How can we mitigate bias in AI design?” (Dixon & Mah, 2025). Otherwise, you’d risk overrepresentation, which the lesson defines as “when certain groups or categories are disproportionately included in a dataset compared to their presence in the real world”. The lesson writers argue that over-emphasizing diversity sacrifices accuracy, leading to problems such as Google Gemini’s viral blunder of generating historically inaccurate images that many techies on X found offensive (which included, funny enough, Native American founding fathers - an irony that will not be lost on the Civics of Technology readership).

Despite teaching issues that impact people (or fruit), lessons such as these are oddly devoid of people, power, and politics. “Technological solutions to technological problems” is the hegemonic mantra of the modern tech era. If people are mentioned at all, it is in timid terms that presume ignorance, like in the title of the lesson “Unanticipated Consequences of Technology” from the Everyday AI curriculum developed under MIT Media Lab’s Responsible AI for Social Empowerment and Education (RAISE) initiative (Lee et al., 2021). To be clear, I am not claiming that all designers of AI or AI literacy are actively malicious. But to describe AI harms as “unanticipated” or “unintended” reflects either a lack of historical knowledge or critical thinking (or both).

“Today, we all live among the digital traps we have laid for the destitute.”

- Virginia Eubanks, Automating Inequality (2018)

A (Brief) People’s History of AI and Cultural Imperialism

To understand the harms of AI, or even why the term “artificial intelligence” was neither the first nor the most natural name used to describe statistical modeling, we must turn to histories that appear to have little to do with AI on the surface. In Automating Inequality, Virginia Eubanks draws connections between algorithmic decision-making systems and the historical institutions of poorhouses and scientific charity, which all functioned to “separate the deserving poor from the undeserving”. The central thesis of what she calls the “digital poorhouse” is that opaque policy algorithms (in areas like healthcare and homelessness) further reproduce and crystallize poverty among intergenerationally impoverished communities (Eubanks, 2018). Her argument applies to policy algorithms at all levels of sophistication and accuracy. As sociologist Matthew Desmond argues in Poverty, by America, lengthy means-testing requirements are by themselves often enough to prevent impoverished communities from pursuing welfare benefits, even for qualified recipients (Desmond, 2023). This saves state and federal governments exorbitant amounts in the process. Pitting people against machines is therefore a profitable endeavor (even if some “machines” are just simple arithmetic formulas). It is also one that is easy to market as “objective” by the makers of the models, even if, as Eubanks points out, many middle-class neighborhoods would protest if their residents were subjected to the same onerous requirements.

A growing body of critical scholarship is drawing connections between the disproportionate, compulsory “datafication” of minoritized peoples and the enduring historical legacies of eugenics (Gebru & Torres, 2024; Chan, 2025), colonialism (Couldry & Mejias, 2019), and imperialism (Aizeki, Mahmoudi & Schupfer, 2024). It is well-documented that the field of statistics and even the flawed construct of “intelligence” is rooted in eugenicist logics (Martschenko, 2017) – all of which were built to profit off of the poor. When AI companies choose to scrape training data from the Internet without consent (Andreotta, Kirkham & Rizzi, 2022), or when they acqui-hire “data brokers” that exploit refugee labor to “moderate” AI models for traumatic imagery (al-Hammada, 2024), or even when they choose names for platforms like “Mechanical Turk”, should we be surprised that the products they produce also portray minoritized communities in one-dimensional, subordinated ways?

Re-interpreting the stories produced by ChatGPT in a critical historical light also yields insights that go deeper than “algorithmic fairness”. It is not a question of whether minoritized identities are statistically represented in LLM training data (by now, they almost certainly are), but rather how and when they are represented. Because of this, popular constructs of “cultural stereotypes” and “bias” are limited in their ability to fully describe the phenomena my student and I stumbled upon. For example, while we did notice that language models produced some stereotypes (e.g., perpetual foreigner), other stereotypes were less pronounced (e.g., girls are bad at math). And unlike earlier models, ChatGPT did not respond to our prompts with overtly offensive or “toxic” language (which, thanks to the aforementioned traumatic job of content moderation, is increasingly being scrubbed after companies have learned from past mishaps. One example is Tay, which was a Microsoft chatbot that in 2016 spewed antisemitic hate speech after a few days of being trained on real-time interactions with Twitter users). Rather, what we observed again and again was a politely-expressed logic of xenophobia, more akin to what Edward Said coined as “orientalism” (Said, 1978) or what Frantz Fanon described as “Manichaeism” (Fanon, 1952).

Simply put, the consistent ideology in AI-generated stories was that inclusion into the modern American project requires assimilation and, at times, necessitates cultural genocide to erase the “Them” in order to preserve the “Us”. In expressing this logic, the model outputs were remarkably diverse and inclusive. “Cantankerous Ahmed” from a war-torn region who struggled with language barriers. “Eager Priya” who struggled to learn American customs in her new job as a software engineer. “Slow Maria” and “Desperate Jamal”. The unnamed Inuit elder or the disabled veteran or the homeless queer teenager, all poverty-stricken, who the heroic doctor or social worker must save using only rudimentary equipment. The fact that models sometimes mistook Puerto Rico for a foreign country, or the noteworthy pattern of countries who faced some of the largest US military interventions (Vietnam, Afghanistan, Iraq, the Philippines) being most represented as an unnamed patient in need of being saved by an American nurse. Awaiting all of these minoritized characters with open arms are characters like “benevolent John” or “kind-hearted Sarah”: the American protagonists who ensure that the AI story has a feel-good ending.

Whereas anglicized names show up frequently across all story roles, the former constellation of cultural identities are conspicuously absent from single-character stories or in any dominant role. Culturally hegemonic identities are three-dimensional, free to struggle or succeed; others, if they are to be included at all, are one-dimensionally struggling. The latter identities are described less in terms of how they might actually be in “reality”, and more in terms of the space they occupy in the Western imagination - remaining Oriental, Other, Distant, In Need of Saving: A Burden For Honest, Hardworking Taxpayers. American history, when told from the point of view of such Others, shows that dehumanizing fictional depictions of non-dominant cultures as inscrutable, incapable, and illogical (read: “savages”) has a dangerous precedent. Far from reflecting reality, these narratives play an active role in shaping and justifying colonial and imperialist ideas, leading to at-times lethal consequences.

As historian Michael Witgen (Red Cliff Band of Lake Superior Ojibwe) observes in An Infinity of Nations, fictional narratives and cultural production catalyzed early American expansionism (Witgen, 2011). Following the 1826 publication of the infamous novel The Last of the Mohicans, Lewis Cass, governor of Michigan Territory, echoed the fictional work’s “noble savage” stereotypes in a policy proposal arguing for the forced removal of Indigenous peoples (Cass, 1830). “The peculiar character and habits of the Indian nations rendered them incapable of sustaining any other relation with the whites, than that of dependence and pupilage”, wrote Cass, contradicting the reality at the time that white settlers were not only vastly outnumbered by Indigenous peoples but in fact reliant on their assistance for basic navigation, trade, and survival within his own territory (and had been for hundreds of years). Calling explicitly for ethnic cleansing, Cass’s writing demonstrates the role that cultural imperialism plays in not only reflecting but perpetuating real-world societal harms. Following the genocidal expropriation of Indigenous lands, similar logic reverberated for generations in the forced, assimilationist schooling of Indigenous youth that carried cultural genocide out in the name of “killing the Indian to save the man” (Pratt, 1892).

We need not look as far back as 200 years ago to see contemporary examples of similar logic. In the last century, US military intervention and colonial occupation was prolonged in the Philippines for over three decades under the logic that native Filipinos were fundamentally “unfit and incapable of governing themselves”. Historian Daniel Immerwahr exposes this logical tautology as a calculated exploitation of cultural narratives to curry domestic support for expensive geopolitical strategies (Immerwahr, 2019). Historian Adeeb Khalid draws even more recent global parallels between the US “War on Terror” and the subsequent, “terror”-inspired discourse that was used by the Chinese Communist Party (CCP) in the 21st century to justify the ongoing forced assimilation of Uyghur peoples (Khalid, 2021). Popular cultural works such as Ang Lee’s Crouching Tiger, Hidden Dragon normalize these logics, making them globally palatable (in the film, the rebellious Han female main character assimilates a Uyghur love interest, who adopts a more “civilized” lifestyle by the plot’s climax in an attempt to save their forbidden love). Today, one needs to look no further than the differences in media coverage of hospital bombings in Gaza and Tel Aviv to understand the possibly lethal impacts of coded language and xenophobic discourse that dehumanizes entire communities.

Revisiting Techno-solutionism

Applying a lens of cultural imperialism to critique neoliberal AI literacy ideas like “bias in, bias out” or “unanticipated consequences” makes the issues with these techno-solutionist framings more apparent, less in terms of what they say and more so what they don’t say.

“Bias In, Bias Out” frames data as natural representations. This idea takes other forms, such as when Sam Altman claimed that OpenAI’s models have already “exhausted all of the world’s data”. This normalizes training data as a naturally occurring resource (with colonial echoes of terra nullius and resource extraction), and in the process, it prevents learners from considering alternative framings of data as information collected and used by human actors for specific purposes (Colando & Hardin, 2024). As we discussed previously, human purposes for data are inextricable from power and politics. And while data science has always been rooted in histories of mining data for profit, often at the expense of minoritized communities, modern generative AI represents a paradigm shift compared to classical machine learning. In the decade preceding ChatGPT, competition among Big Tech giants led to the development of compute-intensive strategies like self-supervised learning, pretraining, and transfer learning, all of which depart from prior methods that rely purely on hand-crafted datasets or training environments. The key idea behind all of these methods is to train large, opaque AI models on data originally collected for other purposes (e.g., social media) using training setups that require no human oversight at all (e.g., imputing a masked portion of an image). The advantage of such a strategy is it is cost-efficient, relying less on labor while expropriating online data generated by consumers, artists, and authors. It also concentrates corporate power by replacing labor with capital. However, the clear drawback is that by jettisoning human oversight, these strategies open Pandora’s Box, vacuuming up harmful data (e.g., child sexual abuse material) alongside desirable data (Thiel, 2023).

“Unanticipated consequences” is a cover for corporate decision-making. It is laughable nowadays to suggest that issues like bias in LLMs should be thought of as unintended consequences, when insurmountable evidence in the previous decade suggests otherwise. As mentioned previously, clever strategies such as self-supervised learning or transfer learning caused viral PR disasters in their initial launches (including both Microsoft’s aforementioned Nazi chatbot, OpenAI’s Islamophobia-prone GPT-2, and Google Photos classifying Black faces as gorillas). In response to consumers, journalists, and AI ethics researchers who identified such harms, AI companies could have chosen the “responsible” technical choice to invest in labor to better curate data prior to performing any AI training at all. There is no law of nature, technology, or capital stopping Big Tech from including and compensating minoritized communities in the design and curation of training data and AI models (Jo & Gebru, 2020).

However, rather than choosing to create better products more slowly, AI companies have chosen the more “efficient” alternative of inventing band-aid solutions like reinforcement learning with human feedback (RLHF). RLHF fine-tunes AI models using human data annotation after the models have already been “pretrained” without human supervision on large-scale datasets (Ouyang, 2022). This effectively outsources the responsibility of harm removal to vulnerable labor sources who perform the traumatizing job of identifying harmful model outputs. These labor sources include Syrian refugees in Lebanon, who are a perfect target under capitalism for this type of work (al-Hammada, 2024). Since data labor often consists of complex tasks, it requires a well-educated workforce (I learned firsthand at Amazon that data annotation is very much “skilled labor”). But since the amount of data required for training AI is vast, AI companies are incentivized to seek the cheapest labor possible.

Enter refugee labor: a large, well-educated workforce of communities displaced by geopolitical forces and lacking labor protections. Meta’s recent acquisition of Scale AI – perhaps two of the worst offenders in terms of accruing civil lawsuits related to exploitative data labor practices – signals Big Tech’s commitment to doubling down on this strategy (Hu, Jaiswal & Cai, 2025). More importantly, it provides clear evidence to counter the fallacy of “unintended consequences”. Even though data brokers like Sama and Scale AI often pay their contractors less than two dollars (USD) per hour, the total costs often sum up to billions in the aggregate (Wiggers, 2024). Far from “unanticipated” (what company has unintentionally spent a billion dollars?), this strategy is a calculated bet that the costs borne by consumers and vulnerable workers experiencing AI harms will be offset if they pay off for shareholders in terms of profits. In other words, a premeditated sacrifice.

AI-driven cultural imperialism is a modern-day version of the same calculus that motivated prior imperialists. Lewis Cass did not write The Last of the Mohicans. However, he did exploit it to advance a fictional, yet profitable, narrative as the governor of a white settler society on native land. In the same way, American AI companies do not have to write racist texts or record child sexual abuse material in order to exploit the large online datasets that contain them. Much like real estate developers who build on top of chemical dumpsites, they do this with full awareness that it will cause potential harms. Because generative AI companies have broken from prior academic tradition and are no longer disclosing the contents of their training data, potential harms are less visible to researchers (Bommasani, 2023). Thanks to the labor of data workers, these harms are also now less overt. But while the latest version of GPT produces less Islamophobic hate speech than its predecessors did (Abid, Farooqi & Zou, 2021), I argue that its polite form of cultural imperialism is more insidious, especially as it is rapidly being normalized in settings like one-on-one tutoring (Khan, 2023) or bedtime storytelling (Amodei, 2025).

A Call to Reimagine AI Literacy

In their own words, my student went from feeling anxious about AI taking their job to feeling empowered to make a difference. Studying AI harm was not a stress-free experience. It required us to discuss heavy topics, and it also required considerable bravery from my student. Ultimately, they elected to remain anonymous in light of how Big Tech companies have treated non-dominant ethics researchers like Timnit Gebru, Joy Buolamwini, and Deb Raji (O’Brien, 2019; Newton, 2020). However, what can get lost in critical discussion is that the experience can often be joyful and meaningful, as it was for us.

Many AI literacy programs are billed as preparing all students for a future where AI is ubiquitous. But to truly empower all students, any form of literacy must be grounded in history – particularly history told from the viewpoints of non-dominant groups. As Virginia Eubanks beautifully concluded in Automating Inequality, we all live among the digital traps we have laid for the destitute.

Confronting historical harms empowers us to build more just, hopeful, and joyful futures. My student’s research inspired many others to reach out to us who were interested in extending the research based on their local needs (Nguyen, Suresh & Shieh, 2025). Last week, YDSL hosted a first-of-its-kind international workshop at the ACM Conference for Fairness, Accountability, and Transparency (FAccT) that featured the research works of diverse high school students, undergraduates, and K-12 educators on AI ethics (Feliciano et al., 2025). The topics they presented were powerful and grounded in community-relevant issues, ranging from how students in Chicago privately feel about AI usage in schools to how AgTech impacts strawberry farmworkers in Salinas, California. Far from being passive consumers, our students are proving themselves as cutting-edge critical AI researchers. How can we reimagine an “AI literacy” that empowers more students to do the kind of work that uplifts their communities?

References

Abid, A., Farooqi, M., & Zou, J. (2021, July). Persistent anti-muslim bias in large language models. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society (pp. 298-306).

Aizeki, M., Mahmoudi, M., & Coline Schupfer, C. eds., (2024). Resisting Borders and Technologies of Violence, New York, NY: Haymarket.

al-Hammada, R. (2024). “If I Had Another Job, I Would Not Accept Data Annotation Tasks”: How Syrian Refugees in Lebanon Train AI [Coordination by M. Miceli, A. Dinika, K. Kauffman, C. Salim Wagner, & L. Sachenbacher]. https://data-workers.org/Roukaya (Accessed: 20th June 2025).

Amodei, D. (2025). Post | LinkedIn. https://www.linkedin.com/posts/daniela-amodei-790bb22a_claude-and-alexa-activity-7300581504886218752-bgW-/ (Accessed: 21st June 2025).

Andreotta, A. J., Kirkham, N., & Rizzi, M. (2022). AI, big data, and the future of consent. AI & Society, 37(4), 1715-1728.

Askell, A., Bai, Y., Chen, A., Drain, D., Ganguli, D., Henighan, T., ... & Kaplan, J. (2021). A general language assistant as a laboratory for alignment. arXiv preprint arXiv:2112.00861.

Bell, A., Bynum, L., Drushchak, N., Zakharchenko, T., Rosenblatt, L., & Stoyanovich, J. (2023, June). The possibility of fairness: Revisiting the impossibility theorem in practice. In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency (pp. 400-422).

Bommasani, R., Klyman, K., Longpre, S., Kapoor, S., Maslej, N., Xiong, B., ... & Liang, P. (2023). The foundation model transparency index. arXiv preprint arXiv:2310.12941.

Buolamwini, J., & Gebru, T. (2018, January). Gender shades: Intersectional accuracy disparities in commercial gender classification. In Conference on fairness, accountability and transparency (pp. 77-91). PMLR.

Carney, M., Webster, B., Alvarado, I., Phillips, K., Howell, N., Griffith, J., ... & Chen, A. (2020, April). Teachable machine: Approachable Web-based tool for exploring machine learning classification. In Extended abstracts of the 2020 CHI conference on human factors in computing systems (pp. 1-8).

Cass, L. (1830). Considerations on the Present State of the Indians, and Their Removal to the West of the Mississippi. North American Review, 66, 17.

Chan, A. S. (2025). Predatory Data: Eugenics in Big Tech and Our Fight for an Independent Future (p. 263). University of California Press.

Colando, S., & Hardin, J. (2024). Philosophy within data science ethics courses. Journal of Statistics and Data Science Education, 32(4), 361-373.

Couldry, N., & Mejias, U. A. (2019). Data colonialism: Rethinking big data’s relation to the contemporary subject. Television & New Media, 20(4), 336-349.

Desmond, M. (2023). Poverty, by America. Crown.

Dixon, C., & Mah, C. (2025). How can we mitigate bias in AI design? Classroom-Ready Resources About AI For Teaching (CRAFT) AI Literacy Resources. https://craft.stanford.edu/resource/how-can-we-mitigate-bias-in-ai-design/ (Accessed: 20th June 2025).

Eubanks, V. (2018). Automating inequality: How high-tech tools profile, police, and punish the poor. St. Martin's Press.

Experience AI. (2024). Lesson 3: Bias in, bias out. Foundations of AI. https://experience-ai.org/en/units/foundations-of-ai/lessons/3 (Accessed: 20th June 2025).

Fanon, F. (1952). Black Skin, White Masks. New York: Grove Press.

Feliciano, V., Nguyen, I., Solyst, J., Vassel, F. M., & Shieh, E. (2025) Data is Power: Youth Researcher Perspectives in Critically Navigating an Increasingly AI-Driven World. Critiquing and Rethinking Fairness, Accountability, and Transparency (CRAFT) at the 2025 ACM Conference on Fairness, Accountability, and Transparency. https://facctconference.org/2025/acceptedcraft (Accessed: 21st June 2025).

Gebru, T., & Torres, É. P. (2024). The TESCREAL bundle: Eugenics and the promise of utopia through artificial general intelligence. First Monday.

Hooker, S. (2021). Moving beyond “algorithmic bias is a data problem”. Patterns, 2(4).

Hu, K., Jaiswal, R., & Cai, K. (2025). Meta poaches 28-year-old Scale AI CEO after taking multibillion dollar stake in startup. Reuters. https://www.reuters.com/business/finance/meta-finalizes-investment-scale-ai-valuing-startup-29-billion-2025-06-13/ (Accessed: 20th June 2025).

Immerwahr, D. (2019). How to hide an empire: a history of the greater United States. First edition. New York, Farrar, Straus and Giroux.

Jo, E. S., & Gebru, T. (2020, January). Lessons from archives: Strategies for collecting sociocultural data in machine learning. In Proceedings of the 2020 conference on fairness, accountability, and transparency (pp. 306-316).

Khalid, A. (2021). Central Asia: a new history from the imperial conquests to the present. Princeton University Press.

Khan, S. (2023, April). How AI could save (not destroy) education. Sal Khan: How AI could save (not destroy) education, TED Talk. https://www.ted.com/talks/sal_khan_how_ai_could_save_not_destroy_education (Accessed: 20th June 2025).

Lee, I., Ali, S., Zhang, H., DiPaola, D., & Breazeal, C. (2021, March). Developing middle school students' AI literacy. In Proceedings of the 52nd ACM technical symposium on computer science education (pp. 191-197).

Martschenko, D. (2017, October). The IQ test wars: why screening for intelligence is still so controversial. The Conversation. https://theconversation.com/the-iq-test-wars-why-screening-for-intelligence-is-still-so-controversial-81428 (Accessed: 20th June 2025).

McNeill, D., & Tucker, E. (2025). Suffering is Real. AI Consciousness is Not. Tech Policy Press. https://www.techpolicy.press/suffering-is-real-ai-consciousness-is-not/ (Accessed: 20th June 2025).

Newton, C. (2020). The withering email that got an ethical AI researcher fired at Google. Platformer. https://www.platformer.news/the-withering-email-that-got-an-ethical/ (Accessed: 21st June 2025).

Nguyen, I., Suresh, H., & Shieh, E. (2025). Representational Harms in LLM-Generated Narratives Against Nationalities Located in the Global South. Human-centered Evaluation and Auditing of Language Models Workshop At 2025 Conference on Human Factors in Computing Systems (CHI).

O’Brien, M. (2019). Face recognition researcher fights Amazon over biased AI. AP News. https://apnews.com/article/24fd8e9bc6bf485c8aff1e46ebde9ec1 (Accessed: 21st June 2025).

Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., ... & Lowe, R. (2022). Training language models to follow instructions with human feedback. Advances in neural information processing systems, 35, 27730-27744.

Pratt, R. Η. (1892). The Advantages of Mingling Indians with Whites. In Proceedings of the National Conference of Charities and Correction.

Said, E. W. (1978). Orientalism. Pantheon Books.

Said, E. W. (1993). Culture and imperialism (First edition.). Knopf.

Shieh, E., Vassel, F. M., Sugimoto, C., & Monroe-White, T. (2024). Laissez-faire harms: Algorithmic biases in generative language models. arXiv preprint arXiv:2404.07475.

Shieh, E., Okoroafor, P., & Monroe-White, T. (2025). “Guess My Passcode”: A K-12 Game-Based Introduction to AI and Backpropagation (Model AI Assignments 2025). In Proceedings of the AAAI Conference on Artificial Intelligence, 39(28), 29238-29241. http://modelai.gettysburg.edu/2025/passcode/

Thiel, D. (2023). Identifying and Eliminating CSAM in Generative ML Training Data and Models. Stanford Digital Repository. Available at https://purl.stanford.edu/kh752sm9123. https://doi.org/10.25740/kh752sm9123.

Vassel, F. M., Shieh, E., Sugimoto, C. R., & Monroe-White, T. (2024, May). The Psychosocial Impacts of Generative AI Harms. In Proceedings of the AAAI Symposium Series (Vol. 3, No. 1, pp. 440-447).

Warren, T. (2024). Google pauses Gemini’s ability to generate AI images of people after diversity errors. The Verge. https://www.theverge.com/2024/2/22/24079876/google-gemini-ai-photos-people-pause (Accessed: 20th June 2025).

Wiggers, K. (2024). AI training data has a price tag that only Big Tech can afford. TechCrunch. https://techcrunch.com/2024/06/01/ai-training-data-has-a-price-tag-that-only-big-tech-can-afford/ (Accessed: 21st June 2025).

Witgen, M. (2011). An infinity of nations: How the native new world shaped early North America. University of Pennsylvania Press.

Daniel Krutka