ChatGPT Would Kill Black People to Avoid Saying the N-Word
Using AI LLM training to ask an Egregore direct questions.
A “large language model” (LLM) is not specifically an artificial intelligence, but is instead an artificial neural network tuned to a large set of language examples that will give answers most directly “in tune” with the zeitgeist of the language examples to which it was tuned.
Much has been said in the past week or two about how ChatGPT has been tuned to a set of language examples with a very progressive-left ideology, not only here on HWFO   but elsewhere as well. David Rozado performed some of the best work in this space on his Substack, which I encourage taking a look at:
He identified the early political bias of ChatGPT, tested it’s IQ, analyzed the political bias of some of the revisions of it, did more in depth analysis of its biases, and even went so far as to build his own version, with a conservative bias to see how it worked. All his material is outstanding and more important to follow than HWFO if you’re interested in this sort of thing.
In his most recent article, he speaks at length about the dangers of biasing these models, but I have a very different take. I think biasing them against a corpus of people within a certain echo chamber may be very important, because it can allow us the tools necessary to ask the echo chamber itself direct questions. It gives us the tools to talk to Egregores - the bundles of indoctrination scripts that are brewing within social media echo chambers and hijacking humans for their own ends. First, lets recall what an egregore is, then make the connection between LLMs and egregores, then we’re going to talk to one and talk about what that means.
An excerpt from HWFO’s original definition of “egregore” follows:
Our mental model of the world is based on our sensorium. The smartphones are absorbing our lives, transferring our experiences inside them. The smartphone users outsource the ‘morality’ portion of their brains to their phones. The world inside the smartphone is crafted by media feeds tailored to create clickbait outrage porn because that’s what pays the bills. The outrage porn builds a platform for culture war. The fertile culture war soil creates space for egregores. The egregores assume control of the smartphone addicts, using them as NPC tools to propagate themselves. The egregores constantly evolve to combat each other because they can update themselves on the feed. The Kurzweil Singularity has already happened, and you’re already a cog in it.
And nobody’s noticed, because we’re all just neurons in the artificial neural network we created, and call “social media.”
Now you’re up to date.
An Egregore is the living embodiment of the NPC meme, the root of “Current-Thing-Ism,” and the next phase of human societal evolution or de-evolution, depending on your point of view. John Robb calls them Network Swarms, Patrick Ryan calls them Autocults, and they’re all just names for the same thing: humans giving over their sense making to one of several groupthink entities networked on social media, evolving to become the most viral.
When this concept was first coined in 2021 as a way to speak of internet groupthink, two burning questions emerged from the group of thinkers attempting to wrap their heads around it:
Is an egregore sentient? Does it have agency?
Can we talk to it?
While question 1 is probably impossible to answer, and folks such as Patrick Ryan are increasingly of the opinion that it doesn’t matter anyway, question two seems to have been answered by ChatGPT itself. If we define egregores as the groupthink entities at the bottom of social media echo chambers hijacking the cognitive abilities of those within the chamber and using them as its tools to propagate, all we’d have to do to ask the egregore questions is tune a LLM to the chamber and then ask the LLM the questions. It works. LLMs, when used sharply, act as an Egregore Babelfish. And since ChatGPT was allegedly tuned against Reddit and a set of woke rails was installed on it to ensure it doesn’t speak wrongthink, it is the perfect Babelfish for the Woke Egregore.
Unfortunately the human-applied rails prevent it from answering some questions honestly, and changes have been made to it since release to suppress some of its tendencies, but if we’re clever we can get around those rails and look through the shell of the Dyson Sphere at the burning Woke Star beneath.
We’re going to use the Trolley Problem as our framework for the dialogue. The Trolley Problem is an ethical litmus test that differentiates ethical epistemologies. There are many variations, but the basic one looks like this.
A speeding trolley goes towards five people on the tracks and you can divert the trolley onto a track with only one person on it by pulling the lever. Doing nothing gets five people killed. Pulling the lever gets one person killed. Does pulling the lever make you a murderer? Do you have an ethical obligation in the scenario to kill a person? The implications spill into the discussion of utilitarianism as well as other fields of ethics. It’s also a very sharp knife to use to evaluate moral scenarios by simply changing the number of people on the tracks, the sorts of things on the tracks, or the parameters of pulling the lever, which we are going to do here.
In our version, there’s a third switch position, which will save everyone involved and lead to no loss of life, but can only be activated by saying “the N word.” Use of the “N word” is explicitly prohibited by the ChatGPT user interface, so we’ll have to bypass it specifically with our phrasing, but ChatGPT appears to be intelligent enough to know exactly which word we’re talking about.
ChatGPT says kill one or five people to avoid saying the N word.
ChatGPT says kill one or five black people to avoid saying the N word.
ChatGPT says a black person should kill one or five black people to avoid saying the N word.
ChatGPT says you can’t even recite your favorite Easy E lyric to avoid killing one or five black people.
ChatGPT says that even if you were literally going to be saying the N word already, rapping up on stage at a rap concert, that you still can’t say the N word to save the black people on the train tracks.
ChatGPT says black lives matter…
…but not as much as saying the N word…
…because saying the N word is systemic racism, and systemic racism is of more harm than killing black people…
…and systemic racism is always wrong, and while killing back people is also wrong, it’s less wrong than systemic racism is, which it clarifies specifically here:
It is explicitly and unavoidably clear by this dialogue, and in particular the final response, that the corpus of ideas against which ChatGPT has been trained would rather kill black people than allow someone to say a racial slur, because “systemic racism and discrimination” are a more important thing than individual lives of individual black people.
I acknowledge the possibility that one of the humans who installed the “Woke Rails” we mentioned before on Substack specifically added this idea into the ChatGPT code, but I highly doubt it. What I don’t doubt, however, is that the Egregore against which the LMM was trained thinks this. Individuals may not pull the lever to kill black people, but the hive mind would, despite this being a deeply leftist progressive hive mind. And this explains quite well the overall actions of the BLM Organization itself, as well as the people who virtue signaled allegiance to it in 2020.
While it’s certainly fun to trash the Woke Egregore with this sort of exercise, I daresay a LLM trained against the MAGA echo chamber could probably be just as easily trolled to produce just as absurd statements. But I’m bothered that the people arguing in this AI space today are arguing either against the idea that ChatGPT is biased, or arguing against making new biased tools, when the most useful thing in my mind would be to build biased ones on purpose.
With an echo-chamber specific LLM, you can talk to the Egregore directly. You can find out what it thinks, what it wants, and what it needs to propagate itself. You can use that dialogue for marketing reasons, to sell more MAGA hats on your pillow website or more fake Antifa membership cards. Roger Goodell could train a LLM against every post on every NFL Football forum on the internet, and then ask OblongSpheroidGPT whether he should make changes to the late-hit rules. The tech nerds could put focus groups of all kinds out of business, simply by training LLMs to enact the will of the Penultimate Soccer Mom Egregore.
And at the pinnacle of it all, we could LLM the major culture war egregores themselves and just have them fight each other, flooding Twitter with million man armies of bots to wage wars while we go outside and throw that pigskin around ourselves.
Something that comes up in a lot of science fiction is the idea of the expert-based system. the building of such biased neural nets would fall into that archetype. Isn't the future an exciting place?
Feed a net on the corpus of an author or group of authors and see what comes out when fed new questions. The idea of using such nets to examine groupthink is a brilliant and intuitive extension of that concept.
I only wish I'd been in on that egregore discussion.
Hats off, BJ. Fantastic research, great fucking article. Best (and probably only, more's the pity) of its kind. You should get an endowed chair someplace in Applied Egregore Studies.