Exercise

Why does instrumental convergence happen? Would it be coherent to imagine a reality without it?

  • Here, our descriptive theory relies on our ability to have reasonable beliefs about what we’ll do, and how things in the world will affect our later decision-making process. No one knows how to formalize that kind of reasoning, so I’m leaving it a black box: we somehow have these reasonable beliefs which are apparently used to calculate AU.
  • In technical terms, AU calculated with the “could” criterion would be closer to an optimal value function, while actual AU seems to be an on-policy prediction, whatever that means in the embedded context. Felt impact corresponds to TD error.
  • Framed as a kind of EU, we plausibly use AU to make decisions.
  • I’m not claiming normatively that “embedded agentic” EU should be AU; I’m simply using “embedded agentic” as an adjective.
Black and white trout

Find out when I post more content: newsletter & RSSRSS icon

Thoughts? Email me at alex@turntrout.com