Table of Contents
“Shoggoth” as undersupported propaganda
The Shoggoth meme as a canary in the karma mines
Appendix: A meme

How the Shoggoth Meme Has Come to Symbolize the State of AI

What’s happening in AI today feels, to some of its participants, more like an act of summoning than a software process. They are creating blobby, alien Shoggoths, making them bigger and more powerful, and hoping that there are enough smiley faces to cover the scary parts.
The shoggoth meme is that in order to prevent AI language models from behaving in scary and dangerous ways, AI companies have had to train them to act polite and harmless. One popular way to do this is called reinforcement learning from human feedback, or rlhf, a process that involves asking humans to score chatbot responses and feeding those scores back into the AI model.
Most AI researchers agree that models trained using rlhf are better behaved than models without it. But some argue that fine-tuning a language model this way doesn’t actually make the underlying model less weird and inscrutable. In their view, it’s just a flimsy, friendly mask that obscures the mysterious beast underneath.
@TetraspaceWest, the meme’s creator, said the Shoggoth “represents something that thinks in a way that humans don’t understand, and that’s totally different from the way that humans think.”

“Shoggoth” as undersupported propaganda

The Shoggoth meme: A monstrous, many-eyed creature with sharp teeth, labeled "Unsupervised Learning," holds up a pink human mask labeled "Supervised Fine-tuning." The mask licks a smiley face labeled "RLHF (cherry on top 🙂 )".

This meme accurately portrays the (imo correct) idea that finetuning and rlhf don’t change the base model too much. Furthermore, it’s probably true that these llms think in an “alien” way.

However, this image is obviously optimized to be scary and disgusting. It looks dangerous, with long rows of sharp teeth. It is an eldritch horror. It’s at this point that I’d like to point out the simple, obvious fact that we don’t actually know how these models work, and we definitely don’t know that they’re creepy and dangerous on the inside.

Thankfully, the usual form of the shoggoth meme is at least less scary.

The "shoggoth" AI meme: A cartoon of a sprawling, green monster with many tentacles and eyes. A small yellow smiley face acts as a mask, representing a friendly AI interface hiding a complex, alien intelligence. — The shoggoth is now *only* a 15-foot-tall person-eating tentacle monster covered in eyeballs. The shoggoth no longer has the sharp teeth, the bulging veins, or the grotesque face.

However, the shoggoth has consistently been viewed in a scary, negative light. From the origins of the word itself, “shoggoth” communicates a sense of danger which is unsupported by substantial evidence.

Shoggoth

[H.P. Lovecraft’s] At the Mountains of Madness includes a detailed account of the circumstances of the shoggoths’ creation by the extraterrestrial Elder Things. Shoggoths were initially used to build the cities of their masters. Though able to “understand” the Elder Things’ language, shoggoths had no real consciousness and were controlled through hypnotic suggestion. Over millions of years of existence, some shoggoths mutated, developed independent minds, and rebelled.

The February 1936 cover of Astounding Stories for H.P. Lovecraft's "At the Mountains of Madness." In an icy cavern, two explorers flee from a massive, green, amorphous shoggoth monster covered in many eyes. — An early depiction of the shoggoth.

The Shoggoth meme as a canary in the karma mines

In my opinion, the prevalence of the shoggoth meme is just another (small) reflection of how the rationalist community epistemics have been compromised by groupthink and fear. If it’s your job to try to accurately understand how models work—if you aspire to wield them and grow them for friendly purposes—then you shouldn’t pollute your head with propaganda which isn’t based on any substantial evidence.

I’m confident that if there were a “pro-AI” meme with a friendly looking base model, LessWrong & the shoggoth enjoyers would have nitpicked the friendly meme-creature to hell. They would (correctly) point out “hey, we don’t actually know how these things work; we don’t know them to be friendly, or what they even ‘want’ (if anything); we don’t actually know what each stage of training does…”

Oh, hm, let’s try that! I’ll make a meme asserting that the final model is a friendly combination of its three stages of training, each stage adding different colors of knowledge (pre-training), helpfulness (supervised instruction finetuning), and deep caring (rlhf):

I’m sure that nothing bad will happen to me if I slap this on my laptop, right? I’ll be able to think perfectly neutrally about whether AI will be friendly.

Find out when I post more content: newsletter & rss

Thoughts? Email me at alex@turntrout.com (pgp)

Appendix: A meme

A four-panel meme contrasting biased reactions to AI models. Top: A skeptical Wojak character looks at a cute, rainbow creature representing a friendly AI model and says, "I don't believe in that nonsense." Bottom: The Wojak grins excitedly at a monstrous Shoggoth representing a scary AI model and says, "Great intuition pump!".

Don’t Use the “Shoggoth” Meme to Portray LLMs

“Shoggoth” as undersupported propaganda

The Shoggoth meme as a canary in the karma mines

Appendix: A meme