Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

as long as OpenAI and Anthropic keep subsidizing dirt cheap Codex or Claude Code usage, I'll just keep using them as evaluators. The trick is to have a fresh instance doing the reviewing, not the one that did the work.
 help



> The trick is to have a fresh instance doing the reviewing, not the one that did the work.

In my experience that's not neccessary (some people even claim that you must use models from different vendor), and it's expensive since a fresh instance needs to rebuild all the context that's needed in order to properly and thoroughly review. LLMs have no problem throwing "them 5 minutes ago" under the bus when asked to review something "skeptically" and "with fresh eyes".




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: