Jacques Thibodeau - Aligned to Flourish

Quantum Computing, Photonics, and Energy Bottlenecks for AGI

💡 Note: I wrote this post in less than a day and didn't want to spend more time on it, but

May 10, 2024

10 min read

AI Insights #1: How Misalignment Could Lead to Takeover & Necessary Safety Properties

AI safety insights number 1: risks of misaligned AI takeover, key properties of AGI safety plans, and dangers of autonomous AI agents maximizing rewards in unintended ways as models advance.

May 3, 2024

4 min read

My current research and request for collaborators

I wrote this as a bio for EAG Bay Area 2024. I'm sharing this here because it gives an overview

Jan 23, 2024

3 min read

But is it really in Rome? Limitations of the ROME model editing technique

I just published a new post on LessWrong. It's about the causal tracing and model editing paper (ROME). Here's the

Dec 29, 2022

2 min read

An incomplete list of projects I'd like to work on in 2023

Wrote up a short (incomplete) bullet-point list of the projects I'd like to work on in 2023. Here's the link.

Dec 29, 2022

(Linkpost) Results for a survey of tool use and workflows in alignment research

On March 22nd, 2022, we released a survey with an accompanying post for the purpose of getting more insight into

Dec 19, 2022

1 min read

How learning efficiently applies to alignment research

As we are trying to optimize for actually solving the problem, we should not fall into the trap of learning

Dec 16, 2022

2 min read

Differential Training Process: Delaying capabilities until inner aligned

I've been ruminating on an idea ever since I read the section on deception in "The Core of the Alignment

Dec 7, 2022

3 min read

Near-Term AI capabilities probably bring low-hanging fruits for global poverty/health

I'm an alignment researcher, but I still think we should be vigilant about how models like GPT-N could potentially be

Dec 7, 2022

1 min read

Is the "Valley of Confused Abstractions" real?

Epistemic Status: Quite confused. Using this short post as a signal for discussion. Here's a link to the LessWrong post

Dec 5, 2022

3 min read