Sohan Venkatesh

@sohvenk

Independent Researcher. Currently working on AI Safety and Interpretability

https://sohv.github.io
$0 total balance
$0 charity balance
$0 cash balance

$0 in pending offers

About Me

Research Interests: Understanding LLM capabilities, AI safety and alignment, interpretability

My current research focuses on LLM capabilities and their failure modes, such as alignment faking and model scheming. I use interpretability and representation engineering to understand the internal mechanisms behind these behaviors, with the goal of improving the safety of AI systems.

Projects

Re-evaluating Chain-of-Thought Faithfulness (pending grant agreement signature)