Could somebody explain how the M1 can reach 900 GFlops or more? My knowledge is probably outdated, but the M1 CPU runs at 3.2 GHz with 8 cores and each instruction needs at least 1 clock tick, so I would say it could reach at most 8 × 3.2 = 25.6 GFlops?
Similarly, the M1 GPU seems to have only 8 cores at 1.2 GHz.
Obviously something is wrong in my reasoning, where are all those GFlops coming from?
Thanks. As you say, it must be running on the GPU then, because even with 5 superscalar instructions per clock tick the M1 CPU would reach only about 128 GFlops, nowhere near 900 GFlops.
And besides, isn't the thing that makes the M1 so fast the fact that it's a RISC processor that is -not- superscalar?
The headline suggested this had something to do with the specific qualities of the M1, but apparently it has nothing to do with the M1 CPU? Any modern GPU can easily reach 1 TFlop nowadays.
Modern processors execute more than one instruction per cycle by executing several instructions in parallel on each core.
In sequential code, some instructions are often independent of each other and can in principle be executed in parallel. The amount of such parallelism is called ILP (Instruction Level Parallelism).
Modern processors exploit ILP by detecting dependencies between instructions and executing independent instructions in parallel. Such processors are called superscalar.
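To make the dependency point concrete, here is a toy model in Python. It is only a sketch: the function `cycles_needed`, the register names, and the rule that every instruction takes exactly one cycle are all made up for illustration, and real hardware is far more involved.

```python
def cycles_needed(program, width):
    """Count cycles for a toy machine that issues up to `width`
    instructions per cycle, each taking exactly one cycle.

    `program` is a list of (dest, sources) pairs. A source that no
    instruction writes is treated as an input that is always available.
    """
    dests = {dest for dest, _ in program}
    ready = set()            # results produced in earlier cycles
    pending = list(program)
    cycles = 0
    while pending:
        cycles += 1
        issued = []
        for instr in pending:
            _, srcs = instr
            # Issue only if there is a free slot and every source is available.
            if len(issued) < width and all(s in ready or s not in dests for s in srcs):
                issued.append(instr)
        for instr in issued:
            pending.remove(instr)
            ready.add(instr[0])  # result becomes visible from the next cycle
    return cycles


# Four instructions that only read the inputs x, y, z: no dependencies.
independent = [("a", ("x", "y")), ("b", ("x", "z")), ("c", ("y", "z")), ("d", ("x", "x"))]
# Four instructions where each one needs the previous result.
chain = [("a", ("x", "y")), ("b", ("a", "z")), ("c", ("b", "z")), ("d", ("c", "x"))]

print(cycles_needed(independent, width=1))
print(cycles_needed(independent, width=4))
print(cycles_needed(chain, width=4))
```

In this model the independent sequence drops from four cycles to one when the issue width goes from 1 to 4, while the dependent chain still needs four cycles no matter how wide the machine is. That is why the speedup from superscalar execution depends on how much ILP the code actually has.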
In addition, some instruction set extensions add instructions that operate on several data elements at once. They are called SIMD extensions (Single Instruction, Multiple Data).
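To see how these factors multiply, here is a back-of-the-envelope sketch in Python. The helper `peak_gflops` and its parameter names are made up for this post. The core count and clock come from the question, the figure of 5 instructions per cycle comes from the reply above, and the SIMD numbers in the last case are purely hypothetical, not measured M1 specifications.

```python
def peak_gflops(cores, clock_ghz, instr_per_cycle=1, simd_lanes=1, flops_per_lane=1):
    """Theoretical peak in GFlops: all factors simply multiply."""
    return cores * clock_ghz * instr_per_cycle * simd_lanes * flops_per_lane


# The estimate from the question: one scalar flop per core per cycle.
scalar = peak_gflops(cores=8, clock_ghz=3.2)

# Same cores, but 5 scalar floating-point instructions per cycle.
superscalar = peak_gflops(cores=8, clock_ghz=3.2, instr_per_cycle=5)

# Hypothetical core: 2 SIMD instructions per cycle, 4 fp32 lanes each
# (a 128-bit vector), each lane doing a fused multiply-add counted as 2 flops.
simd = peak_gflops(cores=8, clock_ghz=3.2, instr_per_cycle=2,
                   simd_lanes=4, flops_per_lane=2)

for name, value in [("scalar", scalar), ("superscalar", superscalar), ("simd", simd)]:
    print(f"{name}: {value:.1f} GFlops")
```

The point is only that instructions per cycle, vector width, and fused multiply-add each multiply the naive "cores × clock" estimate. The result is a theoretical ceiling, and it is only as good as the assumed parameters.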