activation engineering (12), AI (98), community (5), corrigibility (6), critique (10), deepmind (4), fiction (3), game theory (3), grinnell (2), growth stories (2), human values (6), humor (2), impact regularization (22), instrumental convergence (19), mats program (14), open source (5), personal (14), practical (11), rationality (20), reinforcement learning (7), scholarship and learning (10), shard theory (18), specification gaming (2), summaries (7), talk notes (4), understanding the world (23), website (7)