Kabir Kumar

@KabirKumar

Lead at AI-plans.com

https://ai-plans.com/
$0 total balance
$0 charity balance
$0 cash balance

$0 in pending offers

About Me

Background: DevOps, Admin & Marketing

Values: Humanity, Agency, Truth
Cause prioritization: Alignment; currently focused on organizing the field and building a way to recognize bad ideas.

Projects

Comments


Kabir Kumar

10 months ago

@Austin Thank you!! I know the site design still needs a lot of work! We're working on a rebuild at the moment, which will be ready soon!

To be clear, Tetraspace was a participant.


Kabir Kumar

about 1 year ago

From what I know, AI Safety Careers isn't funding-constrained; how would the funding help with this?


Kabir Kumar

about 1 year ago

I think this could be really useful and the folks at Stampy seem to be doing a lot of good work.


Kabir Kumar

over 1 year ago

Awesome!! Thank you very much!! You might be interested to know that the event has not only produced many very well-thought-out critiques and gotten more people involved and interested in AI Safety, especially in what actually goes into making a plan robust, but also, in the first two days, produced an extremely useful document: https://docs.google.com/document/d/1GQbAnRPvONF8TdQtQuga4WOLk58iNh3tTdsVyGpA4AE/edit?usp=sharing
Multiple people have talked about how useful and easy to use this document is, often expressing confusion as to why no one has made something like it before!


Kabir Kumar

over 1 year ago

Great news! Dr Peter S. Park, an AI Safety postdoc at the Tegmark lab, has agreed to be a judge!


Kabir Kumar

over 1 year ago

Excited to say that we have 20 participants for the critique-a-thon so far!


Kabir Kumar

over 1 year ago

Step 1 (1st to 2nd): Make a list of all the ways alignment plans could go wrong.

We'll put together a master list of potential "vulnerabilities" based on existing research and our own ideas. This will give us a checklist to use when evaluating plans.

Step 2 (3rd to 4th): Match vulnerabilities to plans.

Everyone will pick a few alignment plans to look at more closely. For each plan, you'll label up to 5 vulnerabilities you think could apply and point out evidence from the plan that supports them. Include your level of confidence in each label as a percentage.

Step 3 (5th to 8th): Argue for and against the vulnerabilities.

You'll team up with another participant and take turns, with one defending and the other questioning the vulnerabilities suggested in Step 2. This debate format will help strengthen the critiques. We'll swap sides on the 6th and rotate team members on the 8th.

Step 4 (9th to 10th): Provide feedback on each other's arguments.

Review your partner's reasoning for and against the vulnerability labels. Point out any faulty logic, questionable assumptions, lack of evidence, etc. to improve the critiques.

Step 5 (one week of judging):

We'll evaluate submissions and award prizes!
The organizers and outside experts will judge all the critiques based on accuracy, evidence, insight, and communication. Cash prizes will go to the standout critiques that demonstrate top-notch critical analysis.


Kabir Kumar

over 1 year ago

But if folks want to add more, I'd be happy to increase the prize pool. Though, at some point, it might make more sense to pay the researchers who are serving as judges.


Kabir Kumar

over 1 year ago

We've already got 13+ attendees with no prize at all, and I want to maximize the chances of there being a prize.


Kabir Kumar

over 1 year ago

Thank you!!
It helps that one of the consultants on our team is a highly experienced cybersecurity professional and professor.
Also, I kinda love breaking things, and alignment plans are sooo vulnerable!


Kabir Kumar

over 1 year ago

Excited to say that within hours of the announcement, we already have 10 people who've joined the critique-a-thon!


Kabir Kumar

over 1 year ago

Researchers who have expressed interest include:

Dr Tom Everitt of DeepMind

Dr Dan Hendrycks of xAI

Dr Roman Yampolskiy


Kabir Kumar

over 1 year ago

Update: Good news!

Kristen W Carlson, an alignment researcher at the Institute of Natural Science and Technology, said they like the site! They also said they found several papers on the site, so it seems to already be proving useful!!

A few other researchers have also expressed interest in the site!


Kabir Kumar

over 1 year ago

Thank you for your comment!

I agree, getting the site used and having good networking is very important!

On that front, there's actually quite a bit of good news! I've been reaching out to researchers for less than a week, and there are already 4 alignment researchers who are very interested in the site! One has been posting his plans himself, another has asked me to post their plan for them, one has joined the team (Jonathan Ng), and another is working on a plan they're happy to have on the site when it's done!

Esben, the head of Apart Research, is also very interested in the site, and I've spoken with the creator of aisafety.careers, who wants to integrate with the site.

I also had a call with Kat Woods who said she really wanted the site to exist and seemed to think it would provide something very valuable.

It's been very promising to get a really great reception from almost every alignment researcher I've talked to about this. The two sceptics have been folks who think either that alignment is impossible, or that it is basically impossible to judge a plan at all since we can't test it. Those are very important points, which I am looking into seriously.

Transactions

For | Date | Type | Amount
Manifund Bank | 8 months ago | withdraw | 370
AI-Plans.com | 10 months ago | project donation | +370
Manifund Bank | 10 months ago | withdraw | 5000
AI-Plans.com | 10 months ago | project donation | +5000
Manifund Bank | about 1 year ago | withdraw | 500
AI-Plans.com Critique-a-Thon $500 Prize Fund Proposal | over 1 year ago | project donation | +500