Hacker News
DeepSWE: Measuring coding agents on original, long-horizon engineering tasks (datacurve.ai)
2 points by sss111 2 days ago
DeepSWE Measuring frontier coding agents (datacurve.ai)
2 points by e2e4 3 days ago | 1 comment
DeepSWE: A contamination-free benchmark for long-horizon coding agents (datacurve.ai)
62 points by ammar_x 4 days ago | 20 comments
DeepSWE Benchmark (datacurve.ai)
5 points by xfax 4 days ago
