Redarc Labs | Manan Wadhwa

@redarclabs

We are an AI safety lab focused on interpretability under adversarial conditions. "Red" in our name stands for misalignment, jailbreaking, and other dangerous behaviors. "Arc" represents that we don't want to judge models by their outputs alone but to look at the whole journey, the whole arc, through interpretability.

redarclabs.com

About Me

We are more interested in the mechanisms than in output behaviors alone, and we aim to build AI control and monitoring protocols on top of those mechanisms to prevent or control the undesirable behaviors mentioned above.

India is a country where AI adoption is growing rapidly and may surpass global usage, which furthers the need for initiatives such as ours to be present and work here. There are no similar major organizations here, with most based in the UK and the US.

Projects

Grant to establish an AI safety lab and fellowship from India

pending admin approval