Latest
Oct 02
AI Alignment Project Ideas (Oct 2, 2024)
I quickly wrote up some rough project ideas for ARENA and LASR participants, so I figured I'd share
6 min read
Sep 24
How much I'm paying for AI productivity software (and the future of AI use)
This post is broken down into two parts:
1. Which AI productivity tools am I currently using?
2. Why does
9 min read
Aug 14
The importance of Entropy
Imagine you're building a sandcastle on the beach. As you carefully shape the sand, you're creating
2 min read
Jul 27
Accelerating AI Alignment Research (Talk)
I gave a keynote talk on how we should be thinking about accelerating AI alignment (safety) research. This is a
1 min read
Jul 09
Using data attribution for AI alignment
This is a post on a recent paper I thought was cool. I give some follow-up project ideas after.
In-Run
5 min read
May 10
Quantum Computing, Photonics, and Energy Bottlenecks for AGI
💡 Note: I wrote this post in less than a day and didn't want to spend more time on
10 min read
May 03
AI Insights #1: How Misalignment Could Lead to Takeover & Necessary Safety Properties
AI safety insights number 1: risks of misaligned AI takeover, key properties of AGI safety plans, and dangers of autonomous AI agents maximizing rewards in unintended ways as models advance.
5 min read
Jan 23
My current research and request for collaborators
I wrote this as a bio for EAG Bay Area 2024. I'm sharing this here because it gives
3 min read
Dec 29
But is it really in Rome? Limitations of the ROME model editing technique
I just published a new post on LessWrong. It's about the causal tracing and model editing paper (ROME)
3 min read
Dec 29
An incomplete list of projects I'd like to work on in 2023
Wrote up a short (incomplete) bullet-point list of the projects I'd like to work on in 2023. Here's