Apple Early Chip Design (byrdsight.com)
137 points by salgernon on Jan 2, 2019 | 55 comments


> A RISC based implementation of the Apple II 6502 Processor: In mid ’85 I performed an analysis that showed a simple RISC style implementation of a 16‐bit binary compatible superset of the 8‐bit microprocessor used in the Apple II 6502, along with some judicious use of on‐chip caching, could substantially improve performance – to the point of potentially outperforming the 68000 used in the Mac, and given the simplicity of the 6502 the implementation was “doable” by a small team.

This is fascinating, and I wonder if any of the design work has survived.


Remember that "RISC" in 1985 was barely a word. RISC was, in part, a response to the changing cost/speed tradeoffs between memory and CPUs: prior to ~'86 memory was slow and expensive, so instructions were heavily encoded to reduce instruction fetch bandwidth. RISC instructions are bigger and easier to decode, and they essentially assume caches.

A "RISC 6502" likely wasn't a "reduced instruction set 6502"; it was more likely a "non-microcoded single cycle CPU running the 6502 instruction set" (possibly with some address extension past 64k and 16-bit data paths).

Remember that even the very first 68ks available in '79 were a 32-bit CPU architecture from day one (even if they had 24-bit addressing and an external 16-bit data bus), while the 8088/8086/80186/80286 were all 16-bit (the 80386 didn't show up until 6 years later).


> A "RISC 6502" likely wasn't a "reduced instruction set 6502" it was more likely a "non-microcoded single cycle CPU running the 6502 instruction set"

The 6502 was not microcoded. It used simple wired logic (implemented in a PLA) to "decode" and "run" opcodes by driving other parts of the chip. This is why many unimplemented opcodes "work", sometimes with recognizable effects.


Using a PLA vs microcode to sequence a multiple-clock instruction execution is sort of a no-op (i.e. they're pretty much the same thing) when comparing how a CISCy chip works with a RISC, which has instructions designed to be easily decoded by random logic and executed in one clock.


The 6502 and Z80 were both 'CISCy' chips, but they were very different when it came to instruction execution. While formally a PLA may have been used to "sequence a multiple-clock instruction" in both, in a practical sense 6502 instructions took a handful of clock cycles to execute, not very dissimilar from later RISC chips. I think your point is more valid for the Z80 than for the 6502.


Yes, but RISC at the time was all about throwing that stuff away and executing a single instruction per clock (with an actual pipeline, something those early microprocessors didn't have). The result was much faster CPUs that ran essentially one clock and one instruction per L1 cache cycle, rather than one clock per main memory cycle (with each instruction taking many memory cycles), i.e. 20-50 MHz rather than 1 MHz. Remember that 6502s and their ilk ran unrolled pipelines visible on the main memory bus.


Is there a good book or resource where I can learn this history of microprocessors?


I'm not sure; I mostly picked this stuff up reading comp.arch as it happened.


Computer Chronicles did a great job documenting it all on video as it happened. https://archive.org/details/computerchronicles


That level of performance had already been achieved. According to a French article (on www.apple-iigs.info) linked from Wikipedia, the 65C816 (introduced 1983) could 'easily' run at 16 MHz. It was only limited to 2.8 MHz so it wouldn't compete with the 7.8 MHz Macintosh. This is believable, as IIgs accelerator cards above 10 MHz were popular add-ons.


The primary factor holding back the 65C816 was the required RAM speed. The 65XX processors required RAM almost twice as fast as the same clocked X86 or 68K.

An 8 MHz 65C816 requires 70ns RAM, whereas an 8 MHz 286 or 68K can use 140ns RAM without wait states.

70ns RAM was very expensive until the early 90s, at which point the 65C816 had already missed its window.


Is the 65C816 running at 16 MHz competitive with a 68000, though? The 68k has a 16-bit data bus, compared to the 8-bit bus of the 65C816, and the 68k has a bunch of registers on top of that.

Although 68000 instruction timings weren't that great.


The 65C816 is faster than the 68K. What good is a 16-bit bus when you spend at minimum 4 clocks twiddling your thumbs?
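To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes the commonly cited minimum bus cycles (one clock per memory access on the 65C816, four clocks per access on the 68000) and ignores wait states, instruction mix, and internal cycles, so it only bounds peak bus bandwidth; it says nothing definitive about real-world performance. The function names are made up for illustration.

```python
def bus_cycle_ns(clock_hz, clocks_per_access):
    """Length of one memory access in nanoseconds, with no wait states."""
    return 1e9 * clocks_per_access / clock_hz


def peak_bandwidth(clock_hz, clocks_per_access, bus_bytes):
    """Upper bound on bytes moved per second over the external data bus."""
    return clock_hz / clocks_per_access * bus_bytes


# Assumed figures: 8 MHz parts, 8-bit bus with 1-clock accesses (65C816)
# versus 16-bit bus with 4-clock accesses (68000).
w65c816 = peak_bandwidth(8_000_000, clocks_per_access=1, bus_bytes=1)
m68000 = peak_bandwidth(8_000_000, clocks_per_access=4, bus_bytes=2)

print(f"65C816: {bus_cycle_ns(8_000_000, 1):.0f} ns/access, {w65c816 / 1e6:.1f} MB/s peak")
print(f"68000:  {bus_cycle_ns(8_000_000, 4):.0f} ns/access, {m68000 / 1e6:.1f} MB/s peak")
```

Under these assumptions the narrower bus still comes out ahead at equal clock speed, and the much shorter access window is also why the 65xx parts wanted faster RAM.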


Ok, I see what you mean. Even then, at that point in history it was becoming clear that memory speed would not keep up with processor speed, so I can understand why the 16-bit bus seemed like the way of the future.


I'm not sure about that particular design but there was the 65816.

https://en.wikipedia.org/wiki/WDC_65816/65802

It wasn't exactly a RISC but it was a vast improvement over the regular 6502, and was also co-developed with Apple.


That's how ARM originated - as a simple, yet fast processor philosophically descended from the 6502. In an alternate universe, ARM could very well have stood for Apple RISC Machine.


In a sense, it sort of does-- Apple was involved with ARM fairly early in their lifecycle, collaborating with their design team to build the ARM6 which was ultimately used in the Newton.

That's around the same time as Acorn spun out the ARM design team into its own company, and dropped the Acorn out of its name in the process (first became Advanced RISC Machines, Ltd., then just ARM Ltd.).


Pete (the guy whose resume this is) may not have been as big a fan of the ARM back then ... he worked on the Newton, built the Hobbit CPU which was canned and replaced by an ARM just after he got working silicon back ....


Reminds me of an alleged quote about Michelin: "We can make indestructible tires, but why would we?"


I think it was clear even then that the 68k had an upgrade path ahead of it. Beefing up the 6502 would have been an evolutionary dead end, and left Apple with an even more fragmented market share.


There was an article here a little while ago about the 1980s at Apple (can't seem to find it right now, sorry) where someone said they didn't go with x86 because someone (Motorola?) convinced them it could only scale up to some pretty unimpressive speeds (sub-100 MHz?).

It turns out, if you try hard enough, any CPU architecture can have lots of room to grow. The Intel 8008 wasn't an "evolutionary dead end", so I don't see any reason the 6502 was, except the decision to let it be.


Ultimately, the 68k upgrade path was not THAT long. But I agree with the rest of your argument. As the triumph of x86 has shown, in microprocessors, superior shipping volume can drive investment that covers a multitude of architectural sins. And the 68K architecture was quite nice, actually.


The Motorola 68k was just as much of a dead end in practice... and it took the Amiga and Atari ST platforms down with it, plus possibly others (it was used in a LOT of places). It's only very recently that the Apollo core has been reviving it.


Horrendous mismanagement at both Commodore and Atari respectively took the Amiga and ST down; the writing was on the wall for those platforms long before the 68k architecture was ever starting to plateau.

Even Apple's transition to PPC arguably had more to do with everyone (Motorola included) hopping on the RISC hype train than with some kind of fundamental performance limitations inherent to the 68k architecture.


The 68k was also used in Sun Workstations at the time. Motorola's failure to improve on it left a lot of companies in the lurch, some of which never recovered.

Certainly it started out in no worse a place than the 8086, and Intel was able to improve on that for many years. It seemed like after the 68040 Motorola just kind of gave up on developing chips, and then Intel launched the Pentium and ate their lunch.


After the 68040 Motorola was working on PowerPC because they thought it was impossible to improve performance and maintain 68K compatibility.


Seconded. There is a pretty active Apple 2 community out there who would definitely be interested.

My one question, given what we just read, is, "What were the plans for the 1 MHz system bus?"


> But probably the most memorable pearl of wisdom is what I have come to call the valley creed. Early on at Apple they took me for a walk and said “Pete my boy, there are three basic rules that apply to a career in the valley, and if you can accept these rules, then you can thrive here. If not, then you should leave” — of course I said, “Ok, what are they?”. Walt said, “#1 — there is no justice”. “#2 — there is no mercy”, and “#3 — this is the most important — are you paying close attention Pete? — #3 is....no one cares”. There you have it – the valley creed.

This is good life advice.


This is nihilism.

You could eke out an argument that it's not nihilism until you throw in #3 -- justice and mercy may well be lacking even in an environment where people have values.

"No one cares", taken at face value, is a way of saying no one has values.


I don't think this is nihilism, I think this is just a rhetorical device to keep yourself grounded and stay rolling with the punches.


I believe this is also partly why the early Apple computers cost much more than the equivalent IBM PC/XT/AT --- the latter did not make use of custom chip designs, instead using standard off-the-shelf parts. (This also made them easier to clone... and the rest is history.)


I think Atari's 8-bit line proves this to be wrong. The Apple II was priced much higher than it needed to be. Even with the Apple II, Apple was more interested in profit margin than market share. Commodore lost its market share to the IBM PC, not to Apple, which had already lost the lead.


None of the other 6502 machines could compete with Commodore in a price war because Commodore owned MOS Technology. The C64 was all custom chips, which is why it was much cheaper than other machines with similar capability. Apple may have been able to drop the price but they were never going to get down to VIC-20 and C64 prices.

Commodore failed to invest in R&D, so by the time the AGA Amigas came out they had to outsource the fabrication and it ballooned their costs.


Commodore had all sorts of problems, particularly after Tramiel left / was ousted. Atari actually kept within range of the 8-bit pricing of Commodore and had a much better set of custom chips.

Apple might have never been able to get to C64 prices but they sure could have dropped enough to get well clear of IBM. Regardless, it was Commodore's market share to lose, not Apple's.


The Atari had:

An inferior sound chip

Bigger palettes but with fewer colors per cell

Tiny sprites

If anything, the custom chips of the C64 and the Atari 8-bit line are about equal, each making different trade-offs.


I find it hard to believe that the Commodore 8-bit line would be competing w/ the IBM PC in any real sense. The natural upgrade path from a Commodore 8-bit was an Atari ST or Amiga, not an IBM PC.


Well, market share wise it was the leader when IBM brought the PC out in '81. IBM even straddled the whole 8-bit / 16-bit line with the 8088 instead of the 8086[1]. It was Commodore's game to lose and they didn't move into the 16 or 32-bit eras with a machine that stayed in the price range they had carved out. Both Commodore and Atari went for the $800 (ST) to $1,300 (Amiga) mark for their next generation, and Apple decided $2,495 was a great idea.

Heck, Commodore's sequel to the C64 known as the C128 coupled the 8502 (better 6510 which was a better 6502) with another 8-bit chip in the Z80. They built a Frankenstein and not a natural upgrade.

[1] A curse on all those that prevented IBM from using the 68000. Sure, use the 68000 in something that's supposed to act like a 370 but not in the PC where it would have saved us all a whole lot of crap.


The point of that Z80 chip was to support the CP/M platform, which was used for a lot of business and utility software pre-IBM PC. So, I'm not sure why you think that coupling the two was a bad idea.


Because it was an 8-bit chip with no future in 1985 and CP/M was pretty much dead. Two 8-bit chips was just a bad idea.


A lot of Commodore users switched to the IBM PC because they wanted the same machine at home and at work. My father, for example, picked up an IBM XT but later also bought an Amiga as a second computer because the XT never really replaced the Commodore. It just allowed him to work at home.


In the 1980s custom ASICs were not expensive at all. Apple was able to reduce the cost of Apple ][s and Macs by replacing a bunch of off-the-shelf chips with a single custom ASIC.

If I had to guess, the Mac was more expensive than a PC simply because it did more (e.g. graphics and sound).


Also the price of RAM dropping nearly tenfold from 1977 to 1981.


The late 1980s (after the Mead&Conway revolution) seems like such a magical time for VLSI to me, as someone who wasn't around at that time.

Does this spirit of getting a few engineers together and taping out a chip in a few months still exist anywhere?


Why should it exist? You can choose any chip from a $0.05 Padauk microprocessor to a $20k high-end FPGA and everything in between off the shelf. The exception was bitcoin and litecoin mining ASICs a couple of years ago, but I am not following the current status.

Edit: typo


This is true right now, but isn't that largely because the cost of fabricating a VLSI design is astronomical, and the tools to debug any issues with a first run also cost an arm and a leg?

If you could get parts made affordably and have some insight into their operation, then you might see more interest. Sort of like how affordable PCB fabrication services have led to a proliferation of customized circuit boards for all sorts of purposes, including spurious art like conference badges.

I bet if it were affordable to get a working design with a bit of studying, you'd see a lot more special-purpose chips, especially in small 3-6 pin packages.


Anything worth making into an ASIC these days is complicated enough that it is going to take way more than a few engineers and a few months. For 28nm you are paying the fab in the ballpark of a million dollars to tape out a design.


Cryptocurrency? It sounds like son of HFT with fortunes being made and lost in months based on secret exotic hardware.


That was a rather silly time, the number of people I turned down who wanted me to make a quick mining chip was a little ridiculous.

Back to the parent question ... the main issue these days is the size of a team you need to make something serious - you essentially need a million dollars or two to pay for the people, the CAD tools, and the tape out costs ... that means it's pretty hard to just knock together something that's non-trivial


What was the "Newton team" doing in 2007? Did he mean the early iPhone group?

('when I transferred to the Newton team (called “Special Projects” on my transfer form) in early September 2007')


That must be a typo for 1987(?).


I never understood why they needed a custom RTC chip for the Mac. It didn't seem to do anything that the commercially available alternatives couldn't. It did complicate cloning, but Brazilian Unitron didn't find it hard to reverse engineer it (though they didn't get it quite right the first time).


I love when people find fun ways to use parser generators (like YACC).


I don't get it? He used YACC to parse equations. Isn't that exactly what YACC is for?


A grammar for motherboard operational equations sounds fun to me at least.
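For a flavour of what such a grammar might look like, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the equation syntax (`!`, `&`, `|`, parentheses), the signal names, and the use of a hand-written recursive-descent parser in place of a YACC-generated one. It is not the author's actual tool or notation.

```python
import re

# Grammar (hypothetical):
#   equation := NAME '=' or_expr
#   or_expr  := and_expr ('|' and_expr)*
#   and_expr := unary ('&' unary)*
#   unary    := '!' unary | '(' or_expr ')' | NAME
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[()!&|=])")


def tokenize(text):
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens, signals):
        self.tokens, self.pos, self.signals = tokens, 0, signals

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def or_expr(self):
        value = self.and_expr()
        while self.peek() == "|":
            self.take("|")
            rhs = self.and_expr()  # always parse the right side, then combine
            value = value or rhs
        return value

    def and_expr(self):
        value = self.unary()
        while self.peek() == "&":
            self.take("&")
            rhs = self.unary()
            value = value and rhs
        return value

    def unary(self):
        tok = self.take()
        if tok == "!":
            return not self.unary()
        if tok == "(":
            value = self.or_expr()
            self.take(")")
            return value
        if not tok.isidentifier():
            raise SyntaxError(f"expected a signal name, got {tok!r}")
        return bool(self.signals[tok])


def evaluate(equation, signals):
    """Parse 'OUT = expr' and return (output name, boolean value)."""
    parser = Parser(tokenize(equation), signals)
    target = parser.take()
    if not target.isidentifier():
        raise SyntaxError(f"expected an output name, got {target!r}")
    parser.take("=")
    value = parser.or_expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing token {parser.peek()!r}")
    return target, value


# Made-up signal names, purely for illustration.
print(evaluate("RAS = CLK & !REFRESH | RESET",
               {"CLK": True, "REFRESH": False, "RESET": False}))
```

A cycle-accurate simulator would re-evaluate equations like this whenever an input signal changes; this sketch only covers the parse-and-evaluate step.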


I feel like there is a connection between the functional programming (FP) that logic circuits do, and the imperative programming (IP) of languages like C, through parser generators like YACC. In other words:

(FP) Verilog/VHDL<->YACC<->C (IP)

I have seen many FP to C transpilers (where you write the spec in an FP language like Lisp and it generates C, an IP language) because it's trivial:

https://www.quora.com/How-can-I-compile-a-Common-Lisp-or-Sch...

But I've never seen a C to Lisp transpiler that breaks the code down into a series of basic functions. My guess is that mutability becomes a major problem and either can't be represented without monads, or becomes unreadable spaghetti where the mutations have to be expressed via types or generics somehow.

In other words, the difficulty of statically analyzing mutability leads to excessively complex FP code that is unreadable by humans.

This might be one of the only examples I've seen where someone went from relatively human readable C to the FP expressed in the circuit by way of YACC.

Edit: upon reading this again, it looks like only the input equations were parsed by the "YACC generated program". The C code was only used for the "event driven cycle accurate simulator". But I still think there is merit in what I said about FP<->YACC<->IP.



