@thaopham
Research keywords: multi-agent risks, game theory, cooperative AI
https://thaopham.dev/
I am concerned about risks from advanced AI systems, which primarily motivates me to study safety and cooperation in multi-agent settings. My research interests include questions like:
How can we evaluate and detect multi-agent deceptive misalignment?
Can we effectively apply Bayesian Theory of Mind models in multi-LLM-agent settings to study deceptive behavior?
How can we study the emergent personas of LLMs, or steer their aspirations toward human values, in open-ended environments?