[AI Safety Workshop @ EA Hotel] Autostructures

See the LessWrong post: Live Machinery: Interface Design Workshop for AI Safety @ EA Hotel — AI Alignment Forum

Project summary

This is a project for creating culture and technology around AI interfaces for conceptual sensemaking.

Specifically, we are designing for a near future in which our infrastructure is embedded with realistic levels of intelligence (i.e., only mildly creative but widely adopted) yet is full of novel, wild design paradigms anyway.

The focus is on interfaces especially for new sensemaking and research methodologies that can feed into a rich and wholesome future.

Huh?

It’s a project for AI interfaces that don’t suck, for the purposes of (conceptual AI safety) research that doesn’t suck.

Wait, so you think AI can only be mildly intelligent?

Nope.

But you only care about the short term, of “mild intelligence”?

Nope, the opposite. We expect AI to be very, very, very transformative. And therefore, we expect intervening periods to be very, very transformative. Additionally, we expect even “very, very transformative” intervening periods to be crucial, and quite weird themselves.

In preparing for this upcoming intervening period, we want to work on the newly enabled design ontologies of sensemaking that can keep pace with a world replete with AIs and their prolific outputs. Using the near-term crazy future to meet the even crazier far-off future is the only way to go.

(As you’ll see in the FAQ, we will specifically move towards adaptive sensemaking that can meet even more adaptive phenomena.)

So you don’t care about risks?

Nope, the opposite. This is all about research methodological opportunities meeting risks of infrastructural insensitivity.

More.

See the rest in the LessWrong post at the top, with lots of examples. Highly recommended if you like weird but fleshed-out approaches to alignment. Or watch a 10-minute video here for a little more background: Scaling What Doesn’t Scale: Teleattention Tech.

***

What are this project's goals? How will you achieve them?

Some goals:

Validate some of the hypotheses around live theory. In brief, “live theory” is about the new adaptive theories (powered by more short-term AI tech) that will be necessary to sensemake the crazy future (from extreme AI sophistication).
Experiment with and build some sensemaking tools that are already useful in practice, and chart out some concreteness within this strange design philosophy.
Partner with established conceptual alignment researchers and foster collaboration with (research) engineers.
Explore opportunity modeling, to be properly responsible with dreaming a safe and beautiful future.
Help upcoming AI safety researchers meet the rigorous thinking demanded in AI risk and craft plans that don’t suck.
Run experiments in wholesome/non-abusive tech, to articulate and respond to threat models where tools are not just frustrating but abusive.
Generate some very straightforward, real-world engineering+cultural output as stepping stones to a new organization based on the above philosophy.

How will this funding be used?

We've run two alpha hackathons. The main one is planned for November 2024 at CEEALAR, Blackpool, UK.

https://docs.google.com/spreadsheets/d/1SD2SNZiYS6Z03-yn1zs1QaADRhwl5Sv6QqCd9q3SSXk/edit

The spreadsheet above breaks down the costs: renting the venue, snacks and food for participants, prizes for the winners, API credits for participants, infrastructure (posters, markers, whiteboards), flights to the UK, and salaries for the organisers.

Who is on your team? What's your track record on similar projects?

We just ran two alpha versions of this hackathon series, one online, and one in-person. We are compiling a report and will post that here soon.

The main team consists of Sahil (formerly SERI MATS, MIRI, currently founding an AI safety org) and Aditya (AI safety community builder in India, IISc PhD student).

Sahil has a strong conceptual research and engineering background, with experience in AI risk research as well as in facilitation and community building. He has degrees in engineering, mathematics, economics, and computer science.

Aditya has been working with the Community Builders Grant team of CEA for more than a year now, focusing on AI safety field building. He has technical expertise in machine learning from his studies at India’s top fundamental research university, and has given talks at various IITs and IIITs on risks from general intelligence and at EAG London on community building. [previously manifunded]

What are the most likely causes and outcomes if this project fails?

Both the chances and the downside of any serious “failure” are low here, since this is just a hackathon. However, for completeness’ sake:

Failure could look like:

It has to be canceled because of logistical issues.
No one shows up, potentially because of time-crunch and marketing difficulties. (Already falsified, given the strong response we received.)
People show up, but it is too chaotic to be a focused effort, or participants are misled or their time is wasted. (Also falsified, at least for the weekend-length events.)
Someone creates a raw capabilities start-up, or tooling that feeds into capabilities and competes with safety considerations.

Causes could be:

Visa issues for people traveling to the venue (the main organizers have already obtained theirs, however).
Poor organization or preparation (we have already run two small ones, with great feedback).

How much money have you raised in the last 12 months, and from where?

CEEALAR has offered to fund the venue and food for a week-long workshop in November 2024.