I once found some book on MC68000 programming on the Macintosh. In a chapter on graphics, the book presented an assembly routine for drawing a filled circle and bragged about how machine code gets it down to just .25 seconds. I went, "what???" and read the code: it was calling an integer square root subroutine for every scan line of the circle. Not even a good integer square root routine.
I implemented the circle/ellipse drawing code in Microsoft Windows way back. Bresenham's algorithm is the way to go. The line-drawing version is better known, but the same idea extends to circles: you calculate the coordinates incrementally, keeping track of the pixel error at each step, and you scale the error term so everything is simple integer arithmetic, with no floating point or square roots needed.
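A minimal sketch of that integer-only approach (a midpoint/Bresenham-style circle, filled scan line by scan line); the tiny framebuffer and draw_hspan() here are just stand-ins for whatever the real raster layer provides, not the actual Windows code:

```c
#include <stdio.h>
#include <string.h>

#define W 32
#define H 32
static char fb[H][W];                      /* tiny demo framebuffer */

/* Stand-in output primitive: paint one horizontal run on a scan line. */
static void draw_hspan(int x0, int x1, int y)
{
    if (y < 0 || y >= H) return;
    for (int x = (x0 < 0 ? 0 : x0); x <= x1 && x < W; x++)
        fb[y][x] = '#';
}

/* Midpoint (Bresenham-style) filled circle: the error term is a scaled
   integer, so each step is only additions and comparisons -- no square
   roots, no floating point. */
static void fill_circle(int cx, int cy, int r)
{
    int x = r, y = 0;
    int err = 1 - r;                       /* decision variable for the midpoint test */

    while (x >= y) {
        /* each (x, y) pair covers eight octants; emit them as four spans */
        draw_hspan(cx - x, cx + x, cy + y);
        draw_hspan(cx - x, cx + x, cy - y);
        draw_hspan(cx - y, cx + y, cy + x);
        draw_hspan(cx - y, cx + y, cy - x);

        y++;
        if (err <= 0)
            err += 2 * y + 1;              /* midpoint still inside: keep x */
        else {
            x--;
            err += 2 * (y - x) + 1;        /* midpoint outside: step x inward */
        }
    }
}

int main(void)
{
    memset(fb, '.', sizeof fb);
    fill_circle(15, 15, 12);
    for (int y = 0; y < H; y++)
        printf("%.*s\n", W, fb[y]);
    return 0;
}
```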
Here’s my guess: the Mac’s drawing routines all did clipping in software. That was accomplished by having the drawing of rectangles, circles, ovals, polygons, and rounded rectangles (for the various operations: erasing or filling them, drawing a frame along the outside, inverting the bits inside) all come down to two steps (roughly sketched after this list):
- compute the set of bits (called a ‘region’) to operate on
- call the appropriate function to erase/fill/… that region
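Roughly the shape of that split, as a sketch; the span-per-row Region here is a deliberate oversimplification (the real QuickDraw region was a compact run-length structure), and all names are made up for illustration:

```c
#define MAX_ROWS 64

/* A deliberately oversimplified region: one horizontal span per scan
   line, left inclusive, right exclusive.  (The real structure allowed
   several spans per line and was run-length encoded; the calling
   pattern, not the encoding, is the point here.) */
typedef struct {
    int top, height;                       /* height assumed <= MAX_ROWS */
    int left[MAX_ROWS], right[MAX_ROWS];
} Region;

typedef enum { OP_FILL, OP_ERASE, OP_INVERT } RasterOp;

/* Step 2: a single routine applies any raster op to any region, so the
   pixel-touching (and, in a fuller version, clipping) code exists once. */
static void apply_to_region(const Region *rgn, RasterOp op,
                            unsigned char bits[64][64])
{
    for (int i = 0; i < rgn->height; i++)
        for (int x = rgn->left[i]; x < rgn->right[i]; x++) {
            unsigned char *p = &bits[rgn->top + i][x];
            if (op == OP_FILL)        *p = 1;
            else if (op == OP_ERASE)  *p = 0;
            else                      *p ^= 1;
        }
}

/* Step 1: each shape only knows how to describe itself as a region.
   The rectangle builder is the trivial case; ovals, roundrects and
   polygons would each get their own builder, and every paint/erase/
   invert call becomes "build region, apply op". */
static void region_for_rect(Region *rgn, int l, int t, int r, int b)
{
    rgn->top = t;
    rgn->height = b - t;
    for (int i = 0; i < rgn->height; i++) {
        rgn->left[i] = l;
        rgn->right[i] = r;
    }
}

static void paint_rect(unsigned char bits[64][64], int l, int t, int r, int b)
{
    Region rgn;
    region_for_rect(&rgn, l, t, r, b);
    apply_to_region(&rgn, OP_FILL, bits);
}
```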
I would guess drawing rounded rectangles was done this way:
- create a region for a circle with radius equal to the corner radius.
- insert horizontal runs into each row of bits in the region to ‘stretch’ it horizontally into a rounded rectangle that has the correct width but is only as high as the circle.
- insert vertical runs into each column of bits in the region to ‘stretch’ it vertically into a rounded rectangle with the correct width and height.
The first step was identical to the code for drawing circles; the region data structure made the last two operations cheap, and they did not require any memory allocations. You had to walk the entire data structure, but for small corner radii it wasn’t that large. Also, one could probably optimize for speed by doing the stretching while creating the region for the circle (a rough sketch follows).
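To make the guess concrete, here is one way the stretching could work on a simplified span-per-row region; circle_region() and roundrect_region() are hypothetical names, and the real data structure was certainly more compact, but widening each row and repeating the middle row turns the circle region into the roundrect region without redoing any geometry:

```c
#define MAX_ROWS 128

/* Same idea of a region as per-row spans (left inclusive, right exclusive). */
typedef struct {
    int height;
    int left[MAX_ROWS], right[MAX_ROWS];
} SpanRegion;

/* Region for a filled circle of radius r (diameter 2r+1), built with the
   same integer-only midpoint walk discussed above. */
static void circle_region(SpanRegion *rgn, int r)
{
    rgn->height = 2 * r + 1;
    int x = r, y = 0, err = 1 - r;
    while (x >= y) {
        rgn->left[r - y] = r - x;  rgn->right[r - y] = r + x + 1;
        rgn->left[r + y] = r - x;  rgn->right[r + y] = r + x + 1;
        rgn->left[r - x] = r - y;  rgn->right[r - x] = r + y + 1;
        rgn->left[r + x] = r - y;  rgn->right[r + x] = r + y + 1;
        y++;
        if (err <= 0) err += 2 * y + 1;
        else { x--; err += 2 * (y - x) + 1; }
    }
}

/* "Stretch" the circle into a w-by-h roundrect (w, h >= 2r+1, h <= MAX_ROWS):
   widen every row by the extra width, then repeat the widest (middle) row
   to add the extra height.  Only span endpoints and row counts change;
   no geometry is recomputed and nothing new is allocated. */
static void roundrect_region(SpanRegion *rgn, int w, int h, int r)
{
    circle_region(rgn, r);
    int extra_w = w - (2 * r + 1);
    int extra_h = h - (2 * r + 1);

    for (int i = 0; i < rgn->height; i++)          /* horizontal stretch */
        rgn->right[i] += extra_w;

    for (int i = rgn->height - 1; i > r; i--) {    /* shift bottom half down */
        rgn->left[i + extra_h]  = rgn->left[i];
        rgn->right[i + extra_h] = rgn->right[i];
    }
    for (int i = r + 1; i <= r + extra_h; i++) {   /* fill the gap with middle rows */
        rgn->left[i]  = rgn->left[r];
        rgn->right[i] = rgn->right[r];
    }
    rgn->height += extra_h;
}
```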
My guess is they didn't even need to be fast: just render once when the window is configured and keep a cache of rounded corners, complete with masks, in the window system. They're tiny and symmetrical.
Computing once when the window is created or resized is what happened, yes, but that doesn’t work when using them for buttons (unlike MS Windows, where every control is a ‘Window’, on Mac OS controls inside a window didn’t have their own drawing context), or when using them for window content (e.g. in a drawing program).
And remember: the original Mac had about 28 kilobytes of RAM free for applications. The system was constantly unloading icons, code, and fonts that could be reloaded from disk. Few objects were ‘tiny’ at the time.
Probably the fastest way to calculate it is the Bresenham circle algorithm. But if there are a lot of roundrects with the same few small corner sizes (likely in a window system), a table for those sizes might be better.
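A sketch of what such a per-radius table could hold; the CornerTable type and the names here are hypothetical. One integer-only circle walk fills the table, and every roundrect drawn with that corner radius afterwards is just span lookups:

```c
#include <stdlib.h>

/* Hypothetical cache entry: for one corner radius, the horizontal inset
   of each of the top radius+1 scan lines of a roundrect.  By symmetry
   the same table serves all four corners and both sides, so the circle
   math runs once per radius, not once per rectangle drawn. */
typedef struct {
    int radius;
    int *inset;            /* inset[i]: pixels to pull in on row i (0 = top edge) */
} CornerTable;

static CornerTable *make_corner_table(int radius)
{
    CornerTable *t = malloc(sizeof *t);
    t->radius = radius;
    t->inset = calloc(radius + 1, sizeof *t->inset);

    /* One integer-only midpoint circle walk fills the whole table. */
    int x = radius, y = 0, err = 1 - radius;
    while (x >= y) {
        t->inset[radius - x] = radius - y;   /* rows near the top: wide inset */
        t->inset[radius - y] = radius - x;   /* rows near the middle: narrow inset */
        y++;
        if (err <= 0) err += 2 * y + 1;
        else { x--; err += 2 * (y - x) + 1; }
    }
    return t;
}

/* Row i of a roundrect of this height then spans [x + inset, x + w - inset),
   where the inset comes straight out of the table (0 for the straight part).
   Assumes height >= 2 * radius + 1. */
static int roundrect_inset(const CornerTable *t, int row, int height)
{
    if (row <= t->radius)              return t->inset[row];
    if (row >= height - 1 - t->radius) return t->inset[height - 1 - row];
    return 0;
}
```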