Aligned to Flourish (Page 4)

09 Jan '26

Difficulties in Building an AI Safety Startup

4 min read

09 Jan '26

Gaining clarity on Automated Alignment Research

4 min read

09 Jan '26

Better model diffing is needed

2 min read

08 Aug '25

Automating AI Safety: What we can do today

11 min read

02 Oct '24

AI Alignment Project Ideas

6 min read

24 Sep '24

How much I'm paying for AI productivity software (and the future of AI use)

9 min read

14 Aug '24

The importance of Entropy

2 min read

27 Jul '24

Accelerating AI Alignment Research (Talk)

1 min read

09 Jul '24

Using data attribution for AI alignment

5 min read

10 May '24

Quantum Computing, Photonics, and Energy Bottlenecks for AGI

10 min read

03 May '24

AI Insights #1: How Misalignment Could Lead to Takeover & Necessary Safety Properties

5 min read

23 Jan '24

My current research and request for collaborators

3 min read

29 Dec '22

But is it really in Rome? Limitations of the ROME model editing technique

3 min read

29 Dec '22

An incomplete list of projects I'd like to work on in 2023

1 min read

19 Dec '22

(Linkpost) Results for a survey of tool use and workflows in alignment research

1 min read

16 Dec '22

How learning efficiently applies to alignment research

2 min read

07 Dec '22

Differential Training Process: Delaying capabilities until inner aligned

3 min read

07 Dec '22

Near-Term AI capabilities probably bring low-hanging fruits for global poverty/health

1 min read

05 Dec '22

Is the "Valley of Confused Abstractions" real?

3 min read

05 Dec '22

Foresight for AGI Safety Strategy

10 min read

28 Nov '22

Notes on Cicero

3 min read

26 Nov '22

Detail about factual knowledge in Transformers

2 min read

13 Aug '22

Current Thoughts on my Learning System

4 min read

27 Jul '22

What does "Effective" in EA mean to you?

3 min read

23 Jul '22

Helping organizations survive disasters (and potentially avoid them altogether)

8 min read

22 Jul '22

I'll be in Berkeley for SERI MATS for the next 2 months

1 min read

22 Jul '22

AI Alignment YouTube Playlists

1 min read

21 Jul '22

A descriptive, not prescriptive, overview of current AI Alignment Research

2 min read

23 Mar '22

A survey of tool use and workflows in alignment research

1 min read

24 Sep '21

Interesting Applications of GPT-3: Elicit

9 min read
