Hacker News

> it happens when autonomous systems optimizing reward functions fail because of problems described above in (a) -- the inability to have deterministic rules baked into them to avoid global fail states in order to achieve local success states.

yes, and there is an insight here that I think is often missed in the popular framing of AI x-risk: the autonomous systems we have today (which, broadly defined, need not be entirely or even mostly digital) are just as vulnerable to this failure mode

the AGI likely to pose extinction risk in the near term has humans in the loop

it's less likely to look like Clippy, and more likely to look like a catastrophic absence of alignment between loci of agency (social, legal, technical, corporate, political, etc.)
