Exactly - Apple hardware is designed for its software, and vice versa. They get battery gains across the stack.
I remember when the M1 Macs first came out, an Apple engineer revealed they'd optimized the hardware so one specific low-level operation macOS does all the time was 5x faster than on Intel [0].
It’s not even a particularly obscure low-level operation: atomic add. Every computer in the world performs that exact instruction a huge number of times running normal, non-Apple software.
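For anyone who hasn't run into it, here's a rough sketch of what an atomic add looks like from the software side. This is only an illustration in Rust, not what macOS actually runs; `count_with_threads` is a made-up helper name, and the relaxed ordering is my assumption for a plain counter.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: spawn `threads` workers that each bump a
/// shared counter `per_thread` times, then return the final count.
fn count_with_threads(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Arc::clone is itself an atomic add on a reference count.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // One atomic add: the read-modify-write happens as a
                    // single indivisible step, so no increments are lost.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    // join() synchronizes with each worker, so this load sees every add.
    counter.load(Ordering::Relaxed)
}
```

Calling `count_with_threads(4, 1000)` gives 4000 every time, which a plain non-atomic `+= 1` from several threads would not guarantee. Reference counting does this kind of add constantly, which is why making it cheaper pays off everywhere.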
The key insight is that this kind of “vertical integration” provides the feedback loop needed to spot the opportunity in the first place.
[0]: https://daringfireball.net/2020/11/the_m1_macs