The only connection between threads and coroutines is that some single-threaded language runtimes only have coroutines, so you might occasionally use them where threads would be a better choice.
Coroutines are a way of structuring single-threaded execution, and a useful one. The producer-consumer example in the Fine Article is a good one: attaching a stream to a parser isn't a parallel algorithm, so threads add nothing to writing it.
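To make the stream-into-parser point concrete, here is a minimal sketch using Python generators as the coroutines. The names `chunks` and `tokens` and the sample input are hypothetical, not taken from the article; the point is only that the tokenizer suspends between chunks and resumes with its state intact, all on one thread.

```python
def chunks():
    """Producer: hands over a text stream in arbitrary pieces (made-up data)."""
    yield from ["12 + 3", "4 * (5", " - 1)"]


def tokens(stream):
    """Consumes chunks and produces tokens, suspending at every yield."""
    number = ""  # digits collected so far; survives across chunk boundaries
    for chunk in stream:
        for ch in chunk:
            if ch.isdigit():
                number += ch
                continue
            if number:
                yield ("NUM", int(number))
                number = ""
            if not ch.isspace():
                yield ("OP", ch)
    if number:
        yield ("NUM", int(number))


result = list(tokens(chunks()))
```

Note that the number 34 straddles the first two chunks and still comes out as one token, with no locks, queues, or threads involved.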
Naturally, using a single-threaded paradigm for work which could be performed in parallel is inefficient. But coroutines aren't a poor man's parallelism; they're a control structure which functions on its own terms. They can be combined productively with threads: a web server can use an event loop with a dispatcher to thread (as in needle) coroutines through various blocking events, and the runtime can spin up one such thread per core to parallelize this. Per-thread coordination then reduces to checking the depth of each thread's work queue and farming the request out to the least congested one.
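A rough sketch of that combination, assuming Python's `asyncio` and `threading`. The `Worker` class, `dispatch`, and `handle` are hypothetical names, two workers stand in for "one per core", and in CPython the GIL means this shows the structure rather than real CPU parallelism.

```python
import asyncio
import threading


class Worker:
    """One thread running its own event loop (hypothetical helper)."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self.pending = 0  # queue depth, as seen by the dispatcher
        self.lock = threading.Lock()
        self.thread = threading.Thread(target=self.loop.run_forever, daemon=True)
        self.thread.start()

    def submit(self, coro):
        with self.lock:
            self.pending += 1
        future = asyncio.run_coroutine_threadsafe(coro, self.loop)
        future.add_done_callback(self._done)
        return future

    def _done(self, _future):
        with self.lock:
            self.pending -= 1


def dispatch(workers, coro):
    """The only cross-thread coordination: pick the least congested worker."""
    return min(workers, key=lambda w: w.pending).submit(coro)


async def handle(request_id):
    await asyncio.sleep(0.01)  # stand-in for a blocking event
    return f"handled {request_id}"


workers = [Worker() for _ in range(2)]  # assumption: 2 instead of one per core
futures = [dispatch(workers, handle(i)) for i in range(6)]
results = [f.result(timeout=5) for f in futures]

for w in workers:
    w.loop.call_soon_threadsafe(w.loop.stop)
```

Within each worker the coroutines interleave cooperatively; the threads only meet at `dispatch`, which is the cheap queue-depth check described above.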
Bob Nystrom makes this argument best, I think, in his two-parter on loops and iteration[1,2]. Looping over data structures is of course only one way to apply coroutines, but a well-established one. The canonical problem requiring coroutines[3] is essentially about doing that too.
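As an illustration of the looping case (my own sketch, not lifted from the linked posts), a recursive traversal can be turned into an external iterator with a generator. The `Node` type and `in_order` function are hypothetical names.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def in_order(node):
    """Walks the tree recursively, suspending at each value it yields."""
    if node is None:
        return
    yield from in_order(node.left)
    yield node.value
    yield from in_order(node.right)


tree = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5)))

# The caller drives the loop and can stop early; the traversal keeps its
# own recursion state between steps.
first_three = list(islice(in_order(tree), 3))
everything = list(in_order(tree))
```

The traversal is written as plain recursion, yet the consumer pulls one value at a time, which is the part that gets awkward without coroutines.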
Or for those who want something different there’s the elevator (and elevator-userbase) simulation from TAoCP volume 1, also an essentially concurrent problem with little to no parallelism or I/O to it.
It could be, but given the sometimes astonishing costs of the—effectively—network protocol we know as cache coherency (thousands of cycles if you’re not careful), it’d be a giant waste in many of the cases where stackless coroutines would be perfectly appropriate.