The Gears of Impact

Exercise

Why does instrumental convergence happen? Would it be coherent to imagine a reality without it?

Here, our descriptive theory relies on our ability to have reasonable beliefs about what we’ll do, and how things in the world will affect our later decision-making process. No one knows how to formalize that kind of reasoning, so I’m leaving it a black box: we somehow have these reasonable beliefs which are apparently used to calculate AU.
In technical terms, AU calculated with the “could” criterion would be closer to an optimal value function, while actual AU seems to be an on-policy prediction, whatever that means in the embedded context. Felt impact corresponds to temporal-difference (TD) error.
Framed as a kind of EU, we plausibly use AU to make decisions.
I’m not claiming normatively that “embedded agentic” EU should be AU; I’m simply using “embedded agentic” as an adjective.
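
The reinforcement learning terms in the notes above can be made concrete with a toy sketch. Everything in it is a hypothetical illustration rather than part of the original argument: the action names, the returns, the policy probabilities, the discount factor, and the helper `td_error` are all made up. It only shows the standard textbook quantities the note refers to: an optimal value (the best we *could* do), an on-policy value (a prediction of what we actually will do), and the one-step TD error `r + gamma * V(s') - V(s)`.

```python
# Toy illustration with made-up numbers; none of these values come from the post.

GAMMA = 0.9  # assumed discount factor


def td_error(reward, value_next, value_current, gamma=GAMMA):
    """One-step temporal-difference error: r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_current


# Hypothetical returns for two available actions.
action_returns = {"stay_home": 1.0, "go_hiking": 5.0}
# Hypothetical probabilities with which the agent actually picks each action.
policy = {"stay_home": 0.75, "go_hiking": 0.25}

# "Could" criterion: value if the agent took the best available action.
optimal_value = max(action_returns.values())

# On-policy prediction: value given what the agent expects to actually do.
on_policy_value = sum(policy[a] * action_returns[a] for a in action_returns)

# A surprise lowers the predicted value of the next situation from 10.0 to 4.0.
# The TD error is the size of that update, which the note likens to felt impact.
delta = td_error(reward=0.0, value_next=4.0, value_current=10.0)

print(optimal_value, on_policy_value, round(delta, 6))
```

Here the on-policy value sits below the optimal value because the hypothetical agent usually stays home, and the negative TD error (roughly -6.4) is the kind of sudden downward revision that the note associates with feeling impacted.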



Thoughts? Email me at alex@turntrout.com
