Hacker News
Scaling Coding-Agent RL to 32x H100s. 160% Improvement on Stanford's TBench (github.com/danau5tin)
2 points by Danau5tin 7 months ago | past | 1 comment
Show HN: Multi-Agent-Coder Is #12 on Stanford's TBench. Beats Claude Code (github.com/danau5tin)
5 points by Danau5tin 9 months ago | past | 1 comment
My weekend project accidentally beat Claude Code – #12 on Stanford's TBench (github.com/danau5tin)
2 points by Danau5tin 9 months ago | past | 2 comments
Show HN: Terminal-Bench-RL: Training long-horizon terminal agents with RL (github.com/danau5tin)
125 points by Danau5tin 10 months ago | past | 12 comments
