ImpossibleBench: Measuring Reward Hacking in LLM Coding Agents

(lesswrong.com)

2 points | by gmays 2 days ago

No comments yet.