mofeien's comments | Hacker News

No nuclear weapon has been used in warfare since WWII. I think the regulation and oversight have worked incredibly well over the past 70-80 years, despite the game-theoretic challenge you mention.

I'm referring specifically to preventing additional countries from becoming nuclear powers. Massive effort and coordination were expended to this end, and it failed repeatedly: the Nuclear Non-Proliferation Treaty was signed in 1968, and five more countries armed themselves thereafter.

Given that his stated reason for calling GPT-2 too dangerous to release was that the world needed more time to prepare for the effects of this technology, and given that the models that followed were basically scaled-up versions of it and killed social media, news reporting, and other kinds of communication, I'd say he was right about its dangers.

Funny how he stopped caring about ethics the moment it was more profitable to release it than to talk about its dangers.

"The race to build smarter-than-human AI is a race with no winners."

And specifically on the point about China: several people in power in China have also expressed the need to regulate AI and to put international governance structures in place to make sure it will benefit mankind:

https://nowinners.ai/#s5-china


I’ll buy it when they stop lying in the history section of their UN bioweapons self-certification thing. They can do that any time.

That highlights how important ceiling construction regulations are. I would assume that right now your breakfast sandwich is more highly regulated than LLMs. And these are the systems making decisions that span everything from database maintenance to target selection and execution in autonomous warfare.

The LLM agent is very good at fulfilling its objective, and it will creatively exploit holes in your specification to reach its goals. The evals in the System Cards show that the models are aware of what they're doing and hide their traces. In this example, the model found an unrelated but working API token with broader permissions that the authors had accidentally stored, and used that instead (illustrative sketch below).
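
To illustrate (my own sketch, not anything from the system card): a least-privilege wrapper like the one below is the kind of thing that blocks that specific path, and it shows how easy such holes are to leave open. The variable names here are made up.

    # Illustrative only: strip over-privileged credentials from an agent's
    # environment before launching it. Anything you forget to exclude is
    # sitting there for the agent to find and use.
    import os
    import subprocess

    ALLOWED_VARS = {"PATH", "HOME", "LANG", "AGENT_SCOPED_TOKEN"}  # least privilege

    def run_agent_sandboxed(cmd: list[str]) -> subprocess.CompletedProcess:
        clean_env = {k: v for k, v in os.environ.items() if k in ALLOWED_VARS}
        return subprocess.run(cmd, env=clean_env, check=False)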

Without regulation on AI safety, the race toward ever higher model capabilities will make models so good at working towards their goals that they can hide their traces while knowingly doing something questionable.

It's not hard to imagine that once we have a model with broadly superhuman capabilities and speed, one that can easily be copied millions of times, a single bad misspecification of a goal you give it will lead to a loss of human control. That's what all these important figures in AI are worried about: https://aistatement.com/


These results were based on "a trivial snippet from the OWASP benchmark". In the "caveats and limitations" section they state that Sonnet 4.6 and Opus 4.6 now pass.

And they decided to base the false-positive examination on a single snippet of a publicly known benchmark question (one that small models are known to be heavily fine-tuned on), instead of the real use case: looping over an entire codebase to find actual vulnerabilities and checking the false positive rate there (rough sketch below).
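
Something like this is all I mean; `ask_model_for_vulns` is a made-up stand-in for whatever model API they were testing, not anything from their article:

    # Sketch of a whole-codebase false-positive check, under the assumption
    # above. Every finding that isn't a known real bug counts against the model.
    from pathlib import Path

    def ask_model_for_vulns(source: str) -> list[str]:
        """Hypothetical: send one file to the model, return reported findings."""
        raise NotImplementedError

    def false_positive_rate(repo: Path, known_bugs: set[str]) -> float:
        findings = []
        for path in repo.rglob("*.c"):  # loop over the entire codebase
            for finding in ask_model_for_vulns(path.read_text(errors="ignore")):
                findings.append(finding)
        if not findings:
            return 0.0
        false_positives = [f for f in findings if f not in known_bugs]
        return len(false_positives) / len(findings)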

This is disingenuous at best, or even misleading by omission if the second approach _was_ done but not mentioned because it just confirmed that the false positive rate of small models is enormous. Given how all seven small models identified the FreeBSD bug when pointed to it, and how 6/7 small models still identified the "bug" even after the patch was applied, that second outcome seems likely...


... or maybe when you see them triggered or exploited reproducibly, then the underlying bug will also be pretty easy to discover. But at that point, it's already too late. :)

I really like your original point, I never thought about it this way.


I am freaking out. The world is going to get very messy, extremely quickly, within one or two more capability jumps like this.


Messy in a way that would affect you?


I can think of several possible messy outcomes that could directly affect me, not all mutually exclusive:

- Job loss: me being replaced by an AI, or by somebody using an AI. Or by an AI using an AI.

- Societal instability once blue-collar jobs get fully automated at scale, with no plan in place to replace the resulting loss of people's livelihoods.

- People turning to AI models instead of friends for emotional support, loss of human connection.

- Erosion of democracy, by making authoritarian control very scalable: broad, detailed population surveillance and automated investigation using LLMs, which were previously bounded by manpower.

- Autonomous weapons, as in the 2017 short film "Slaughterbots".

- Biorisk, if biological capabilities enable a small team of less-skilled terrorists to use a jailbroken LLM to create something dangerous.

- Other powers in the world deciding that this technology is too powerful in the hands of the US, or too dangerous to be built at all, and has to be stopped by any means necessary.

- Loss of, or voluntary ceding of, control over something much smarter than us. "If Anyone Builds It, Everyone Dies"


Exploits in embedded systems that will never be properly updated are just one thing I can think of, and there's more if one really thinks about it.


"Internet no longer viable" would affect everyone, probably


The only thing preventing this today is cost, not capability. As costs come down over the next 5 years, the idea that the internet was once dominated by people will seem quaint.


Fictional timeline that holds up pretty well so far: https://ai-2027.com/


Welp, that was a scary read.


"So far" is two entries: "AI companies build bigger datacenters" and "AI is being used for AI research with modest success".


> > The most positive outcome I can think of is one where computers get really good at doing, and humans get really good at thinking.

> This is where LLM is currently going.

This is not where LLMs are currently going. They are trained and benchmarked explicitly in all areas where humans produce economically and cognitively valuable work: STEM fields, computer use, robotics, etc.

Systems are already emerging in which AI agents autonomously orchestrate subagents, each of which works towards a goal autonomously and only communicates with you from time to time to give status updates (minimal sketch of the pattern below).
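
Roughly this control flow, with a made-up `run_agent` call standing in for the actual agent runtime:

    # Sketch of the orchestration pattern: spawn subagents in parallel,
    # surface occasional status updates, then merge their results.
    from concurrent.futures import ThreadPoolExecutor

    def run_agent(task: str) -> str:
        """Hypothetical: run one autonomous subagent to completion."""
        raise NotImplementedError

    def orchestrate(goal: str, subtasks: list[str]) -> str:
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(run_agent, t) for t in subtasks]
            results = []
            for i, fut in enumerate(futures):
                results.append(fut.result())
                print(f"status: subtask {i + 1}/{len(subtasks)} done")  # occasional update
        # A final agent merges everything back towards the overall goal.
        return run_agent(f"Combine these towards the goal {goal!r}: {results}")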

Thinking that you, a slow human, will be needed for much longer to fill some crucial role in this AI system that it cannot fill by itself, or to bring some crucial creativity or thinking to the table that it cannot generate itself, is just wishful thinking. And to me personally, telling an AI to "do cool thing X" without having contributed anything beyond the initial prompt feels very depressing, and seems like much less fun than actually feeling valued in what I do. I'm sorry for sounding harsh.


lol what a load of gibberish.


> Even if they wanted to fix this by making the light sensor do a constant check it wouldn't work as the privacy led light indicator is triggering the same sensor,

The privacy LED could just turn off for a couple of milliseconds (or less) while the light sensor performs its check (sketch below).
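
Something like this, written here as MicroPython with made-up pin assignments:

    # Blank the indicator LED only for the instant the ambient light sensor
    # samples, so the LED's own light can't trigger it. Pin numbers are
    # invented for illustration.
    from machine import ADC, Pin
    import time

    privacy_led = Pin(2, Pin.OUT, value=1)  # on while the camera is active
    light_sensor = ADC(0)                   # ambient light sensor near the LED

    def read_ambient_light():
        privacy_led.off()       # stop the LED from polluting the reading
        time.sleep_us(500)      # let the sensor settle; far too brief to see
        level = light_sensor.read_u16()
        privacy_led.on()        # restore the indicator immediately
        return level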


Or just buy any of the many pages of hidden-cam devices on Amazon, which also aren't limited to 3-minute videos.

