Senior SWE-Bench: open-source benchmark that assesses agents as senior engineers

(senior-swe-bench.snorkel.ai)

145 points | by matt_d 15 hours ago

96 comments