Tag: Rationality

20 items with this tag.

2/10/2024
Dreams of AI Alignment: The Danger of Suggestive Names
3/1/2023
Predictions for Shard Theory Mechanistic Interpretability Results
9/30/2022
Bruce Wayne and the Cost of Inaction
- rationality
- fiction
9/9/2022
Understanding and Avoiding Value Drift
9/4/2022
The Shard Theory of Human Values
6/30/2022
Looking Back on My Alignment PhD
4/10/2022
Emotionally Confronting Doom
11/2/2021
You Should Read “Harry Potter and the Methods of Rationality”
8/9/2021
When Most VNM-Coherent Preference Orderings Have Convergent Instrumental Incentives
1/23/2021
Lessons I’ve Learned From Self-Teaching
12/30/2020
Collider Bias as a Cognitive Blindspot?
- rationality
10/2/2020
Math That Clicks: Look for Two-Way Correspondences
- understanding the world
- rationality
7/12/2020
Formalizing “Defection” Using Game Theory
4/22/2020
Problem Relaxation as a Tactic
- rationality
- AI
1/10/2020
On Being Robust
- rationality
- personal
11/19/2019
How I Do Research
- scholarship and learning
- rationality
7/29/2018
I Want to Take Off the Coat
- rationality
- personal
4/30/2018
Internalizing Internal Double Crux
4/3/2018
Unyielding Yoda Timers: Taking the Hammertime Final Exam
- rationality
3/7/2018
How to Dissolve It
- rationality
- practical