I'm of the belief that hardware is not truly preserved until you have a fully-accurate emulation of it.
You are correct that software doesn't require fully accurate timings, just timings accurate enough to bypass any timing bugs and replicate the experience, especially outside of the console space. If your only goal is running all known software, then you can get away with some massive accuracy bugs.
But there is more to hardware preservation than simply running all software that might have been shipped on a platform. Some people want to do retro programming and develop new software for a platform. And if you don't have accurate emulation, then you are more likely to introduce a bug that works in the emulator but not on real hardware. The less accurate an emulator, the more often you have to check on real hardware. And since there was such a wide variety of real hardware, your collection would have to be huge to ensure extensive testing.
You could argue that people wanting to do retro-coding as a hobby should just test on real hardware, but I'd argue that raises the cost of the hobby. Also, in the distant future, the last Pentium III will die, and acquiring real hardware might not be possible. I have no idea if people will still be interested in retro-coding for the Pentium III that far in the future, but my point is the hardware is not truly preserved unless they can.
BTW, the Pentium III was used in a video game console, with identical hardware configurations, so an accurate P3 emulation might be more useful there.
Neither Xbox nor PS2 are emulated at a cycle level though. They mostly rely on recompilation and API emulation, and PCSX2 has gotten pretty far on that. That generation was really the start of cross-platform being the norm, and so the start of a dramatic drop-off in games requiring very specific hardware details. If your game had to run on Xbox/PS2/WinXP then there had to be some level of portability considered in the software.
> Neither Xbox nor PS2 are emulated at a cycle level though
Yet... Like I said, you can get pretty far with low levels of accuracy. PS2 emulation is actually quite timing sensitive.
I investigated at least one bug in Dolphin that can't be fixed correctly until we get significantly better GPU timings. There are also some speed-running strategies that rely on generating enough lag, but don't work in Dolphin because it (usually) emulates the CPU and GPU way too fast.
> your only goal is running all known software, then you can get away with some massive accuracy bugs.
> Some people want to do retro programming and develop new software for a platform. And if you don't have accurate emulation, then you are more likely to introduce a bug that works in the emulator but not on real hardware.
The two things go hand in hand. If era-developed software is unlikely to suffer from timing bugs, then _your own software_ is also unlikely to suffer from timing bugs. It's down to the same argument.
It's like claiming that because I developed for Pentium 4, my software is unlikely to work on the Pentium 3. Save for the very explicit case where I use some new extensions, how crazily out of the way would I need to go in order to even remotely hit such an issue?
In fact, it can all be summarized as: what CPU timing would you even emulate? Why would you even target the P3 _specifically_? Why not Transmeta?
Note that this does not apply to accurate emulation of accompanying hardware, but then again I would also claim that accuracy of hardware emulation is hardly relevant post-P3, since _the real hardware_ often is massively inaccurate by any definition of the word. Why accurately model a specific Radeon card, when the budget model of the same year is completely different, with the differences abstracted by hacks in the driver?
Accurate timing is most useful when you are retro-programming a video game, or something else real time.
How do you know if your game will run at 60 fps if the execution times aren't accurate?
> what CPU timing would you even emulate
Ideally your emulator would support as many CPU + hardware configurations as possible, at many different speeds, so you can test as many as you want.
But just one single accurate hardware configuration is better than none. At least then I can say "I programmed this game, and it runs on a Pentium III 550E with a Riva TNT2."
It's exactly the same when doing retro-programming on real hardware. If you only have one PC, then you can only confirm it's working on that exact same hardware configuration. But accurate emulators have advantages due to cost and ability to easily support multiple configurations.
> How do you know if your game will run at 60 fps if the execution times aren't accurate?
Again, this is not a console. If you rely on a specific Pentium 3's instruction timings to reach 60 fps, your game is not going to reach 60fps _on any other PC_, not even if someone has an identical CPU, since any single other difference in hardware, configuration, or even layout of the filesystem is going to matter much more.
You just can't get away with the same kinds of bug you can get away with in consoles, because even just trying ATI vs NVIDIA (or any two different brands of accelerator) is already going to be a completely different environment and timings, likely enough to trigger all those bugs (or at least more than different instruction timings will).
i.e. even the simplest of emulators (incl. a virtualizer) with a runtime cap is going to suffice for the use case of roughly estimating a framerate based on the CPU of some era. And there's very little value in increasing the accuracy of such an estimate, since with PC hardware varying so wildly, anything you can produce is going to be irrelevant anyway.
(How to make a similar accurate-enough estimation of GPU performance is a different story).
There is a world of difference between your inability to see the appeal/utility of a thing and that thing being actually worthless, as you are so stridently insisting here.
For a mental exercise, just to prove it to yourself, why don't _you_ try examining all the reasons such accurate emulation might be desirable?
The other poster already provided a reasonable motivation - preservation. But since this discussion started, you've only really come out swinging with disparagement. One has to wonder why you are putting so much energy into suppressing and trashing someone's hobby.
Frankly it looks to me like you have some kind of preconceived and inflexible bias, or that you may be trying to discover the appeal of this effort in a pointlessly adversarial way. Not a great look. If you really want people to think you have a smart, winning argument, maybe try to show some understanding of both sides of the coin before floating your attempt at a clever and withering denouncement.
PC games stopped being CPU cycle-bound long ago, in the mid-to-late '90s.
The most issues you would have are with high-end 486 to Pentium I/II era games (especially the multimedia ones), lots of which were speed-bound, but these games will surely be interpreted by ScummVM one day or another (via its Macromedia Director engine).
> There is a world of difference between your inability to see the appeal/utility of a thing and that thing being actually worthless, as you are so stridently insisting here.
I am asking a question, and answering "preservation" (which is not really the answer the poster made, since his goal is new development) without giving an actual concrete example of what behavior needs such accurate preservation kind of defeats the purpose of asking the question in the first place.
If the answer is "for the sake of it", that is also fine. But I'm unaware of anything post-P3 that would really require cycle-level emulation, so I ask. Most PC emulators "draw the line" around that era for a reason, even the ones that wouldn't necessarily have performance problems with newer machines (like DOSBox).
> But since this discussion started, you've only really come out swinging with disparagement. One has to wonder why you are putting so much energy into suppressing and trashing someone's hobby.
What do you think? Not only do I have the same hobby, my work is also related to this. I am most definitely not interested in trashing it.
It is also the _main thesis_ of this entire article, so why shouldn't we discuss it?
To add an example to your arguments, the C64 and PC/XT demo scenes show what can be done with cycle accurate retrocomputing. Especially 8088mph: https://youtu.be/yHXx3orN35Y
If someone was to decide to develop a game for a minimum performance requirement of a Pentium III at 500 MHz and some GPU, for some reason (which would be a completely arbitrary choice, but that's the hobby, so roll with it), then the only way they can possibly check that it meets that minimum requirement is to test on a machine with that configuration.
Either a real machine, or an accurate emulator.
It doesn't matter if there are other PC hardware configurations out there with different performance. A minimum requirement just means "I tested on this machine, and it meets the minimums." Ideally you should underspec your minimum requirement test machine so that your target audience can be reasonably expected to meet it.
You can't substitute in virtualization. That has zero chance in hell of providing a realistic estimate of performance, even if you paired it with an accurate gpu emulator.
Modern CPUs simply have very different performance characteristics: instructions that might have huge stalls on the P3 might be extremely cheap under virtualization. Caches are also wildly different sizes.
If you use a proper, but inaccurate emulator, you get different issues. Even if it was tuned to provide a decent estimate of CPU performance over average code (and they are typically tuned to overestimate CPU performance, because people playing games would rather not have the frame drops of real hardware emulated), it's just an average that doesn't take into account things like cache misses and branch mispredicts.
If you were to write code with a lot of cache misses or branch mispredicts, your inaccurate emulator would massively overestimate its performance compared to a real CPU.
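To put rough numbers on that, here is a minimal sketch comparing a flat "average cycles per instruction" timing model with one that also charges for cache misses and branch mispredicts. Every constant (clock speed, CPI, penalties, the per-frame counts) is an illustrative assumption, not a measured Pentium III figure, and the penalty model is itself still far from cycle-accurate.

```python
from dataclasses import dataclass

# All constants are illustrative assumptions, not measured P3 values.
CLOCK_HZ = 500_000_000      # assumed 500 MHz target CPU
AVERAGE_CPI = 1.0           # flat cycles-per-instruction the emulator was tuned to
CACHE_MISS_PENALTY = 100    # assumed stall cycles per cache miss
MISPREDICT_PENALTY = 10     # assumed stall cycles per branch mispredict


@dataclass
class FrameProfile:
    instructions: int        # instructions executed per frame
    cache_misses: int        # cache misses per frame
    branch_mispredicts: int  # mispredicted branches per frame


def flat_model_cycles(p: FrameProfile) -> float:
    """Timing as an 'average code' emulator sees it: instructions only."""
    return p.instructions * AVERAGE_CPI


def penalty_model_cycles(p: FrameProfile) -> float:
    """Same base cost, plus explicit stalls for misses and mispredicts."""
    return (p.instructions * AVERAGE_CPI
            + p.cache_misses * CACHE_MISS_PENALTY
            + p.branch_mispredicts * MISPREDICT_PENALTY)


def fps(cycles_per_frame: float) -> float:
    """Frames per second if every frame costs this many cycles."""
    return CLOCK_HZ / cycles_per_frame


# Hypothetical cache-unfriendly inner loop.
frame = FrameProfile(instructions=8_000_000,
                     cache_misses=150_000,
                     branch_mispredicts=200_000)

print(fps(flat_model_cycles(frame)))     # 62.5
print(fps(penalty_model_cycles(frame)))  # 20.0
```

With these made-up numbers the flat model reports a comfortable 62.5 fps while the stall-aware model lands at 20 fps, and the two models would also point a profiler at different hot spots.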
The various issues just add up and it becomes impossible to profile and optimise the game you are developing unless you have an accurate emulator. Other solutions will all point to different parts of the code being hot.
Also, remember this is within the era when you might be still developing a game with a software renderer, and if not you still have to do vertex transform and lighting on the cpu.
Personally I'm not that interested in accurate emulation of PC; the issues get a lot worse when it comes to developing games for 5th, 6th and maybe even 7th gen consoles. That's where my true interest in accurate OoO emulation lies. But I can see why someone might want accurate PC emulation too.
> If someone was to decide to develop a game for a minimum performance requirement of a Pentium III at 500 MHz and some GPU, for some reason (which would be a completely arbitrary choice, but that's the hobby, so roll with it), then the only way they can possibly check that it meets that minimum requirement is to test on a machine with that configuration.
This just doesn't happen in modern PC development, save for heavyweights who can afford multiple identical hardware configurations (e.g. HPC clusters). I know I'm repeating myself, but the variety of configurations just makes this highly implausible. Sure, you can be some demoscene type of guy who decides to target specifically this configuration, but then you're literally targeting one processor out of hundreds, and per your own words, the fact that it works on the 500 MHz doesn't mean it will work with the same performance on the next generation or even on the 550 MHz variant. I guess this is obviously fine, but really stretching it. You'll quickly end up having something that only works on your machine, with the same starting disk image, etc.
Even PCem doesn't fully simulate the x86 cache because there is no benefit to it, and that includes cores from eras which were much more sensitive to timing. Branch mispredictions? Forget about it. Most P3 software is going to run concurrently with some other software, anyway.
I'm not saying that you don't need a cycle-accurate simulator to get real timings. I'm saying that with such a large divergence in configurations and environments, virtualization (or any other inaccurate emulator) is likely to provide a performance level that is quite accurately somewhere in the interval. Especially since you will have actually calibrated it to that interval beforehand :)
Now on consoles I can see the benefit. Consoles are lots of identical hardware, operating systems that tend to get out of the way, and the people who develop for them only test (for obvious reasons) on the console hardware itself or at most a developer edition which has the same hardware (for obvious reasons again). You can have a silent bug that depends on timing of a mispredicted branch or the relative speed between the bus accesses of two cores and _never_ notice it since your testing environment is exactly 1 device (such a bug would immediately flare up on a PC on like the 2nd reboot).
Whatever it is that you develop for any one such console, it is highly likely it will work on all the million other sold consoles. Consoles are practically designed to have reproducible environments.
On the other hand, you can practically emulate the entire x86 software catalog with emulators which _still_ have large differences in behavior at the actual instruction level compared to the hardware, so the instruction timing doesn't really seem important, and creating new software now that does depend on it seems .. complicated.
As an anecdote, not long ago I was working on an x86 emulator, and to my horror I realized that the push/pop instructions were actually miscomputing the operand size in a rather common but not primary situation (long mode but with a 32-bit segment). The emulator was moving the stack pointer by double the amount it should, and pushing/popping the high dword of registers it shouldn't have clobbered. This was actually happening in some of the most critical operating system code out there (bootloaders, WoW, etc.) ... and yet the bug had been in the emulator for years and no one had been the wiser, booting 64-bit OSes just fine :)
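For anyone curious what that class of bug looks like, here is a rough sketch of the operand-size decision for a plain general-register PUSH/POP. The function names and the "buggy" variant are my own hypothetical reconstruction for illustration, not the actual emulator code; the point is only that "long mode is active" and "the code segment is 64-bit" are separate conditions.

```python
def stack_operand_size(long_mode_active: bool, cs_l: bool, cs_d: bool,
                       prefix_66: bool) -> int:
    """Operand size in bytes for a general-register PUSH/POP.

    long_mode_active: long mode is enabled and active
    cs_l / cs_d:      L and D bits of the current code segment descriptor
    prefix_66:        operand-size override prefix (0x66) is present
    """
    if long_mode_active and cs_l:
        # 64-bit mode: default is 64-bit, 0x66 selects 16-bit,
        # and a 32-bit push/pop cannot be encoded.
        return 2 if prefix_66 else 8
    # Legacy mode or compatibility mode (long mode with a 16/32-bit
    # code segment): CS.D picks the default, 0x66 toggles it.
    use_32 = cs_d != prefix_66
    return 4 if use_32 else 2


def buggy_stack_operand_size(long_mode_active: bool, cs_l: bool, cs_d: bool,
                             prefix_66: bool) -> int:
    """Hypothetical bug: keys off long mode alone and ignores CS.L."""
    if long_mode_active:
        return 2 if prefix_66 else 8
    use_32 = cs_d != prefix_66
    return 4 if use_32 else 2


# Long mode active, but running in a 32-bit code segment:
print(stack_operand_size(True, False, True, False))        # 4
print(buggy_stack_operand_size(True, False, True, False))  # 8
```

The buggy variant moves the stack pointer by 8 instead of 4 in exactly the "long mode but with a 32-bit segment" case, while behaving identically in real 64-bit code, which is one plausible way such a bug could survive for years.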
I think the variety in configurations is completely irrelevant.
If you aren't using at least one accurate configuration for your testing, there is a huge risk that you miss your performance target by a huge margin. Your 60fps game could end up running at 20fps on your target minimum hardware. Small performance inaccuracies can massively add up if you have a non-emulated cache-miss or branch-mispredict delay in your inner-most loop.
I think you are massively overestimating how accurate the timings you can get through virtualisation or semi-accurate emulation are. Yes, they are probably accurate enough for running any historic software from the era, as most code for the PC is well-behaved enough not to do the wrong thing when running too fast.
It's just for the use case of developing new software: as soon as you start optimising or profiling, you need accurate timings. And yes, we might be talking about weird demo-scene style projects along the lines of "I want to get the absolute best possible graphics out of the computer I had 25 years ago, no frames dropped, no wasted cpu cycles". I'm talking about the kind of project where someone is writing inner loops with intrinsics or in assembly.
You might argue that such a project is a massive edge case that's not worth catering to. And if you are writing an emulator, that's a 100% legitimate position to hold; emulators shouldn't have to cater for every possible use case. My point is only "If you don't have a 100% accurate emulator, and there is some niche use case it can't emulate, then the hardware isn't fully preserved" and that it would be nice if an accurate emulator existed.
> Now on consoles I can see the benefit. Consoles are lots of identical hardware, operating systems that tend to get out of the way, and the people who develop for them only test (for obvious reasons) on the console hardware itself or at most a developer edition which has the same hardware (for obvious reasons again).
Yes, I've chased after bugs in console emulation (Dolphin Emulator) that were impossible to fix correctly without significantly more accurate emulation.
Like the game which memset a staging buffer before data had finished DMAing out. The game was only saved on real hardware because after memsetting, it invalidated the cachelines and in typical situations, none of the memset cachelines had been evicted. Impossible to correctly fix without emulating the existence of an L2 cache. We eventually resorted to patching the game to fix the bug.
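A toy model of that memset case, as a sketch only: the class names, the 32-byte line size, and the way DMA is modelled (it simply reads RAM at the end) are all simplifying assumptions, and nothing here reflects Dolphin's actual code. It ignores eviction entirely, which matches the "typical situation" where no dirty line got evicted.

```python
LINE = 32  # assumed cache line size in bytes


class WriteBackCache:
    """Toy write-back cache: dirty lines stay here until flushed."""

    def __init__(self, ram: bytearray):
        self.ram = ram
        self.lines = {}  # line base address -> bytearray(LINE)

    def write(self, addr: int, value: int) -> None:
        base = addr - addr % LINE
        if base not in self.lines:
            self.lines[base] = bytearray(self.ram[base:base + LINE])
        self.lines[base][addr - base] = value

    def flush(self) -> None:
        """Write every cached line back to RAM, then drop it."""
        for base, data in self.lines.items():
            self.ram[base:base + LINE] = data
        self.lines.clear()

    def invalidate(self) -> None:
        """Drop cached lines WITHOUT writing them back."""
        self.lines.clear()


class NoCache:
    """What an emulator without a cache model does: writes hit RAM directly."""

    def __init__(self, ram: bytearray):
        self.ram = ram

    def write(self, addr: int, value: int) -> None:
        self.ram[addr] = value

    def flush(self) -> None:
        pass

    def invalidate(self) -> None:
        pass


def run_game(cache_cls) -> bytes:
    ram = bytearray(64)
    cpu = cache_cls(ram)
    for i in range(64):       # game fills the staging buffer
        cpu.write(i, 0xAB)
    cpu.flush()               # data reaches RAM; DMA out is started here
    for i in range(64):       # the bug: memset before DMA has finished
        cpu.write(i, 0x00)
    cpu.invalidate()          # dirty zeroed lines are discarded (if cached)
    return bytes(ram)         # what the DMA engine ends up reading


print(run_game(WriteBackCache) == bytes([0xAB]) * 64)  # True
print(run_game(NoCache) == bytes(64))                  # True
```

With the cache modelled, the zeroes never reach RAM and the DMA sees the intended data; without it, the DMA reads an already-cleared buffer.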
Or games where video decoding stutters, because it has a hot inner loop that pushes the out-of-order CPU and has very few cache misses. It executes faster over the whole frame on real hardware than in Dolphin's CPU timing model, which assumes a certain number of cache misses. The game must have tuned its video codec to use as much CPU time as possible.
We have speed running tricks that don't work in Dolphin, because they depend on lagging the game. And games that freak out when the GPU executes too fast, but when you adjust the timings for those, other games freak out because the GPU is executing too slow. It's impossible to calculate accurate GPU timings without running much of vertex transform and a basic depth rasterizer.
These are projects I'd love to work on at some point: accurate CPU and GPU timings for Dolphin, even if they don't run at full speed and bus contention is still ignored. I think it might be possible to get within the correct order of magnitude (so 10-50% of realtime), which is workable for some use cases like TASes and testing bugs.
> I'm of the belief that hardware is not truly preserved until you have a fully-accurate emulation of it.
This makes sense in principle, but exact emulation is computationally prohibitive even for a (probably) 386¹. The computational problems of exact emulation have been described in a famous article about emulating the SNES².
I suppose that emulating even "just" a superscalar architecture is going to be prohibitive (due to the split into micro ops), and an out-of-order one would probably require transistor-level emulation (or at least, another, lower, level of emulation).
¹=Fairly arbitrary; I'm basing this just on the complexity of emulating the SNES, and the following considerations.