
Latest
Aug
08
Automating AI Safety: What we can do today
There have been multiple recent calls for the automation of AI safety and alignment research. There are likely many people
10 min read
Oct
02
AI Alignment Project Ideas
[Nov 27, 2024] I have some new alignment project ideas I quickly wrote up. These are mostly projects I'
6 min read
Sep
24
How much I'm paying for AI productivity software (and the future of AI use)
This post is broken down into two parts:
1. Which AI productivity tools am I currently using?
2. Why does
9 min read
Aug
14
The importance of Entropy
Imagine you're building a sandcastle on the beach. As you carefully shape the sand, you're creating
2 min read
Jul
27
Accelerating AI Alignment Research (Talk)
I gave a keynote talk on how we should be thinking about accelerating AI alignment (safety) research. This is a
1 min read
Jul
09
Using data attribution for AI alignment
This is a post on a recent paper I thought was cool. I give some follow-up project ideas after.
In-Run
5 min read
May
10
Quantum Computing, Photonics, and Energy Bottlenecks for AGI
💡 Note: I wrote this post in less than a day and didn't want to spend more time on
10 min read
May
03
AI Insights #1: How Misalignment Could Lead to Takeover & Necessary Safety Properties
AI safety insights number 1: risks of misaligned AI takeover, key properties of AGI safety plans, and dangers of autonomous AI agents maximizing rewards in unintended ways as models advance.
5 min read
Jan
23
My current research and request for collaborators
I wrote this as a bio for EAG Bay Area 2024. I'm sharing this here because it gives
3 min read
Dec
29
But is it really in Rome? Limitations of the ROME model editing technique
I just published a new post on LessWrong. It's about the causal tracing and model editing paper (ROME)
3 min read