
Only if the LLM knows the inputs connected to particular outputs. Pre-digital-era or classified material might not be available, nor informal discussions with other experts.

Most importantly, negative but unused signals might not be available if the text does not mention them.



Challenge: provide a single example where the LLM can only provide the output and not the steps (in a text-only scenario)?


An LLM can always output steps, but that doesn’t mean they are true; LLMs are great at making up bullshit.

When the “how many ‘r’ in ‘strawberry’” question was all the rage, you could definitely get LLMs to explain the steps of counting, too. It was still wrong.
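
For reference, the correct answer is trivially checkable by machine; here is a minimal Python sketch of the letter-by-letter counting steps the models were being asked to explain (the word and target letter are just the example from this thread):

    # Count occurrences of 'r' in 'strawberry' by walking the word
    # one letter at a time, i.e. the "steps" the LLMs were asked to show.
    word = "strawberry"
    target = "r"

    count = 0
    for index, letter in enumerate(word):
        if letter == target:
            count += 1
            print(f"position {index}: '{letter}' -> count is now {count}")

    print(f"total '{target}' in '{word}': {count}")  # prints 3

The point being: the procedure is deterministic and yields 3, yet the models would narrate plausible-looking steps and still land on the wrong total.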


Can you provide a single example, now with GPT 5.4 Thinking, where it makes things up in its steps? Let’s try to reproduce it.


I’m pretty sure you can think of one yourself; I’m not going to play this game. Now it’s 5.4 Thinking, before that it was 5.3, before that 5.2, 5.1, 5, before that it was 4… At every stage there’s someone saying “oh, the previous model doesn’t matter, the current one is where it’s at”. And when it’s shown that the current model can’t do something, there’s always some other excuse. It’s a profoundly bad-faith argument, the very definition of moving the goalposts.

I do have a number of examples to give you, but I no longer share those online so they aren’t caught and gamed. Now I share them strictly in person.


Caught and gamed? What do you mean?


He means that if the problem becomes known, the AI companies will hack in a workaround rather than solving the problem by making the model more intelligent. Given that they have been caught cheating in that way in the past, I can't blame the GP for not sharing his tests.


Ok so no example.



