Inspired by Scott Alexander’s Mistakes page:
Scott Alexander: I don’t promise never to make mistakes. But if I get something significantly wrong, I’ll try to put it here as an acknowledgement and an aid for anyone trying to assess my credibility later.
This doesn’t include minor spelling/grammar mistakes, mistakes in links posts, or failed predictions. It’s times I was fundamentally wrong about a major part of a post and someone was able to convince me of it.
Note: The list is currently quite thin. I have made more mistakes than are on this list. I’ll add more as I remember them, or you could email me and politely remind me of times I changed my mind about an important claim of mine!
Reward is not the optimization target
July 25, 2022
I spent thousands of hours proving theorems about the “tendencies” of “reinforcement learning” agents which are either optimal or trained using a “good enough” learning algorithm. (I’m using scare quotes to mark undue connotations.) I later realized that even though “reward” is a pleasant word, it’s definitely not a slam dunk that RL-trained policies will seek to optimize that quantity. Reward often simply provides a per-datapoint learning rate multiplier—nothing spooky or fundamentally doomed.
While the realization may seem simple or obvious, it opened up a crack in my alignment worldview.
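To make the “learning rate multiplier” reading concrete, here is a minimal sketch in Python. It assumes a REINFORCE-style policy-gradient update on a softmax policy over three actions; the function names, logits, and numbers are illustrative assumptions, not taken from any particular codebase. Under that assumption, the reward only scales how far the update moves the logits for the sampled action.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def grad_log_prob(logits, action):
    # Gradient of log pi(action) with respect to the logits of a softmax
    # policy: one_hot(action) - pi.
    probs = softmax(logits)
    return [(1.0 if i == action else 0.0) - p for i, p in enumerate(probs)]

def reinforce_step(logits, action, reward, lr):
    # Hypothetical single-datapoint REINFORCE update:
    # logits += lr * reward * grad log pi(action).
    grad = grad_log_prob(logits, action)
    return [x + lr * reward * g for x, g in zip(logits, grad)]

logits = [0.0, 0.5, -0.5]

# Reward 3 with learning rate 0.1 ...
step_a = reinforce_step(logits, action=1, reward=3.0, lr=0.1)
# ... moves the logits the same way as reward 1 with learning rate 0.3.
step_b = reinforce_step(logits, action=1, reward=1.0, lr=0.3)
same_update = all(math.isclose(a, b) for a, b in zip(step_a, step_b))

# A reward of zero leaves the policy unchanged for this datapoint.
no_update = reinforce_step(logits, action=1, reward=0.0, lr=0.1)
```

In this sketch the reward never appears as a quantity the policy represents or pursues; it only rescales the step taken on that datapoint. Other RL algorithms use reward differently, so this is one illustration rather than a general proof.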
November 5, 2024
I read too much into the shock Selzer poll which showed Harris +3. At one point before the polls closed, my credence in a Harris win even reached 80%, driven by observations of unusually high turnout, which I believed had historically been a good sign for Democrats.
At least, I thought that high turnout ⭢ higher chance that Democrats win. But as I looked up a link to justify that claim, I found that it’s actually not true! According to National Affairs:
To do better, I should have anchored more strongly to base rates via current polling, especially around the economy. On a meta-level, I didn’t realize how much of a news bubble I was in. The media I read portrayed Trump as low-energy & barely filling out his rallies. I’ll check out a wider variety of sources in the future, perhaps via Ground News.
alex@turntrout.com