Mitigating Reward Misspecification in Reinforcement Learning | Manifund