Games just happen to have a lot of operations that need column-based access; but that's not true for all domains. When you go and blindly push the game best practices into other domains, you are just making everybody's life hard and most systems worse.
It's not just column-based access. Formatting your data into a struct of arrays exposes opportunities to pack your data more efficiently and greatly reduce your application's memory usage. Boolean struct fields can become bitsets. Nullable struct fields can become sparse (or dense) maps. Pointer/reference struct fields can become arrays of smaller-width integers that index into a pool. And so on. When everything runs on CPUs that frequently stall on memory accesses, the impact of these sorts of changes cannot be overstated: the latency difference between L3 cache and RAM can be on the order of 10x.
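To make that concrete, here is a minimal Rust sketch of two of those transformations (boolean field to bitset, nullable field to sparse map). The `Monsters` type and its field names are made up for illustration, and the pointer-to-pool-index trick is not shown.

```rust
use std::collections::HashMap;

/// Array-of-structs layout: every record carries every field,
/// including the flag and the usually-empty nickname.
#[allow(dead_code)]
pub struct MonsterAos {
    pub hp: u32,
    pub alive: bool,
    pub nickname: Option<String>,
}

/// Struct-of-arrays layout for the same (hypothetical) data.
#[derive(Default)]
pub struct Monsters {
    hp: Vec<u32>,                   // dense column
    alive: Vec<u64>,                // bitset: one bit per monster
    nickname: HashMap<u32, String>, // sparse map: only monsters that have one
}

impl Monsters {
    /// Appends a monster and returns its index.
    pub fn push(&mut self, hp: u32, alive: bool, nickname: Option<&str>) -> u32 {
        let id = self.hp.len() as u32;
        self.hp.push(hp);
        let (word, bit) = (id as usize / 64, id % 64);
        if word == self.alive.len() {
            self.alive.push(0); // grow the bitset by one 64-bit word
        }
        if alive {
            self.alive[word] |= 1u64 << bit;
        }
        if let Some(n) = nickname {
            self.nickname.insert(id, n.to_string());
        }
        id
    }

    pub fn is_alive(&self, id: u32) -> bool {
        let (word, bit) = (id as usize / 64, id % 64);
        (self.alive[word] >> bit) & 1 == 1
    }

    pub fn nickname(&self, id: u32) -> Option<&str> {
        self.nickname.get(&id).map(String::as_str)
    }

    /// Column scan: reads only the contiguous `hp` array.
    pub fn total_hp(&self) -> u64 {
        self.hp.iter().map(|&h| h as u64).sum()
    }

    /// Counts live monsters 64 at a time via popcount.
    pub fn alive_count(&self) -> u32 {
        self.alive.iter().map(|w| w.count_ones()).sum()
    }
}
```

Whether this actually wins depends on the access pattern: a scan like `total_hp` benefits, while code that needs every field of one record now has to touch several arrays.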
The advice of keeping the data you access frequently contiguous in memory applies to everything on modern hardware. If there is a program where performance is an issue at all, probably this will be one way to make sure performance is good.
Well, if you want to generalize, it's about keeping data with correlated accesses close to each other and aligned inside memory pages, and failing that, yes to keep it at least contiguous. It's not exactly about access frequency, except that you want to optimize the things you access more.
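One common way to act on this is a hot/cold split: fields that are read together go in one contiguous array, and rarely touched fields go elsewhere. A rough Rust sketch, where the `Hot`, `Cold` and `World` types and their fields are invented for the example:

```rust
/// Fields that are read and written together on every update.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Hot {
    pub pos: f32,
    pub vel: f32,
}

/// Rarely used fields, kept out of the hot array so they do not
/// take up space in the cache lines the update loop walks over.
#[derive(Clone, Debug, PartialEq)]
pub struct Cold {
    pub name: String,
    pub created_at: u64,
}

#[derive(Default)]
pub struct World {
    hot: Vec<Hot>,   // index i in both vectors refers to the same object
    cold: Vec<Cold>,
}

impl World {
    /// Adds an object and returns its index.
    pub fn spawn(&mut self, hot: Hot, cold: Cold) -> usize {
        self.hot.push(hot);
        self.cold.push(cold);
        self.hot.len() - 1
    }

    /// The frequent operation: touches only the `hot` array.
    pub fn step(&mut self, dt: f32) {
        for h in &mut self.hot {
            h.pos += h.vel * dt;
        }
    }

    pub fn pos(&self, i: usize) -> f32 {
        self.hot[i].pos
    }

    /// The infrequent operation: looks up cold data by the same index.
    pub fn name(&self, i: usize) -> &str {
        &self.cold[i].name
    }
}
```

Note that `pos` and `vel` stay in the same struct here because they are always accessed together; splitting them into separate arrays would not help this particular loop.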
Yes, that's generic advice for high-performance applications, and it is generic enough to apply to anything that is close to a normal computer. You will still need further details if you are talking about things like HPC (ironically) or mainframes, but it's general enough to say people should do it without qualifications.
I feel like this is increasingly the only way to write high-performance code.
With newer hardware, the only thing that's expected to scale is logic density - SRAM (and cache sizes) have stopped scaling with the latest lithographies - and RAM bandwidth hasn't really been scaling for quite a while (I'd think it's even possible that per-core bandwidth has been decreasing) - memory access has been the bottleneck for a while.
> Games just happen to have a lot of operations that need column-based access.
And that's not even true for many code areas in typical games, only where there are at least a few thousand 'things' to process (e.g. particle systems or navigation/collision systems).
DOD makes a lot of sense within specific subsystems, but not necessarily in high level gameplay code (outside specific genres at least).
You can see that in Bevy: it feels like they are slowly reinventing a relational database and query language. Every time they discover another limitation in their pure ECS architecture, they add another MacGuffin to make it work.
It's true, Bevy is quite limited with only simple parent/child relationships[0] and much of the community is looking for more structured relations.
As it stands, it's pretty common to hold a `HashMap<Index, Entity>` and manually manage the data structure through derefs or some system that keeps it consistent. Ideally you only use it for lists of entities that remain static, like a tilemap.
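For anyone who hasn't seen the pattern, a rough standalone sketch follows. It does not use Bevy's API: `Entity` here is a stand-in newtype, and `TileLookup` and `TileIndex` are hypothetical names for the hand-managed side table.

```rust
use std::collections::HashMap;

/// Stand-in for an ECS entity handle; NOT Bevy's real `Entity` type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Entity(pub u32);

/// Grid coordinate used as the lookup key.
pub type TileIndex = (i32, i32);

/// Side table mapping tile coordinates to entities, maintained by hand.
#[derive(Default)]
pub struct TileLookup {
    by_index: HashMap<TileIndex, Entity>,
}

impl TileLookup {
    /// Records which entity occupies a tile; returns the one it replaced, if any.
    pub fn insert(&mut self, at: TileIndex, e: Entity) -> Option<Entity> {
        self.by_index.insert(at, e)
    }

    pub fn get(&self, at: TileIndex) -> Option<Entity> {
        self.by_index.get(&at).copied()
    }

    /// Has to be called whenever an entity is despawned,
    /// otherwise the table keeps pointing at a dead entity.
    pub fn remove_entity(&mut self, e: Entity) {
        self.by_index.retain(|_, v| *v != e);
    }

    /// The kind of relational query the table exists for:
    /// entities on the four orthogonally adjacent tiles.
    pub fn neighbors(&self, (x, y): TileIndex) -> Vec<Entity> {
        [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
            .iter()
            .filter_map(|p| self.by_index.get(p).copied())
            .collect()
    }
}
```

The `remove_entity` bookkeeping is the part that tends to go wrong, which is why this works best for sets of entities that never change.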
Hrmmm. Not sure this line of thinking makes sense.
Game data access patterns are quite brutal. OOP for games results in extremely inefficient cache use, lots of random access, and lots of pointer chasing.
ECS isn’t a “natural” fit for games. It’s quite difficult and ECS systems are still far from a solved problem.
The two most popular game engines, Unreal and Unity, are decisively non-ECS for almost everything they do.
In any case, I think the underlying principles of DOD apply to all programs. Specific solutions vary, as always.
> The two most popular game engines, Unreal and Unity, are decisively non-ECS for almost everything they do.
Unity is promoting its Data-Oriented Technology Stack (DOTS), which includes ECS. They even made some tools to help translate between GameObject-based workflows and ECS-based workflows.
It's for high-performance computing with current CPU designs that are dependent on data locality for performance.
I agree that it's a harmful design for business data. Programmers want to push their runtime data model into the database and they have no interest in the operational, maintenance, and performance problems this causes. When someone suggests this kind of thing, I'll ask them "how do we diagnose performance problems with this technology when there are 100,000 concurrent users and millions of data elements?" The rows-and-columns people can answer this question.
> When someone suggests this kind of thing, I'll ask them "how do we diagnose performance problems with this technology when there are 100,000 concurrent users and millions of data elements?"
I don't understand; the exact same performance diagnostics work in both cases. Why is this different? There's nothing intrinsically less performant about this approach. You really think your checkerboard tables and long lists of columns with names like "VALUE12" and "VALUE13" and multiple different kinds of key/value pairs you jammed in there for different clients -- you think those are better performance!?
> 100,000 concurrent users
Do you actually have 100,000 concurrent users? Really? You don't, do you? You just kinda hope you will eventually. And again: this approach is not worse for that.
> millions of data elements
This is absolute peanuts for any modern database system. It's weird that this is your extreme example.
But it is true, much much more often than you probably realize. Just look at your tables and think about how repetitive they are. The reason you can't come up with a lot of "column-based" (as you put it, which is still narrow-minded IMO) operations is because you've never looked for them before. Of course you haven't: you've been stuck in the traditional mode where such things are basically impossible.
Do most of your tables have Name / Description type fields? Here's some "column-based operations": Allow translation of everything in your database. Generate natural-sounding text in a report, inserting these names and descriptions (from multiple different tables, of course). Free-text search of all your important database concepts. Detect similar names to the one the user is wanting to add, to prevent duplication. Clean whitespace. That's five off the top of my head.
Do most of your tables have Archived / Status / Soft-delete type fields? Allow a user to archive a record. Choose whether to include archived records in a query or not. Delete archived records after X days.
Do most of your tables have Comments fields? Allow multiple comments. Track who made a comment and when. Track responses to comments.
Do most of your tables track who last modified the record? Track all modifications. Show a list of recent modifications to any records.
The list goes on and on and on. You call these "column-based operations", which again, is short-sighted. They're more like "concern-based operations". And it turns out everything is a cross-cutting concern. You're shooting this idea down without nearly understanding it.
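As a toy illustration of one such concern, here is a sketch in Rust rather than SQL, with in-memory maps standing in for tables. The `Names` store and the `(kind, id)` record key are assumptions made up for the example, not a description of any particular system. Names from every kind of record live in one place, so whitespace cleaning and free-text search are each written once.

```rust
use std::collections::BTreeMap;

/// Hypothetical record key: (kind, id), e.g. ("customer", 7).
pub type RecordKey = (&'static str, u32);

/// One store for the "has a name" concern across all record kinds.
#[derive(Default)]
pub struct Names {
    by_record: BTreeMap<RecordKey, String>,
}

impl Names {
    pub fn set(&mut self, key: RecordKey, name: &str) {
        self.by_record.insert(key, name.to_string());
    }

    pub fn get(&self, key: RecordKey) -> Option<&str> {
        self.by_record.get(&key).map(String::as_str)
    }

    /// Trims and collapses whitespace in every name, whatever it belongs to.
    pub fn clean_whitespace(&mut self) {
        for name in self.by_record.values_mut() {
            let cleaned = name.split_whitespace().collect::<Vec<_>>().join(" ");
            *name = cleaned;
        }
    }

    /// Case-insensitive substring search across all record kinds,
    /// returned in key order.
    pub fn search(&self, needle: &str) -> Vec<RecordKey> {
        let needle = needle.to_lowercase();
        self.by_record
            .iter()
            .filter(|(_, n)| n.to_lowercase().contains(&needle))
            .map(|(k, _)| *k)
            .collect()
    }
}
```

Translation, duplicate detection and the rest would hang off the same store in the same way, without touching the customer or product code.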
I don't see that as disallowing "for those who disagree with me: what do you disagree with, exactly?" Isn't that a reasonable question?
Besides, isn't downvoting someone for nonsensical, petty reasons (and leaving no response as to why, of course) at least as harmful to the overall discussion as referring to those downvotes?
> "for those who disagree with me: what do you disagree with, exactly?" Isn't that a reasonable question?
It's a hopeless one. People who downvote generally think you're not worth actually answering. Also note that heated language often gets automatically downvoted. And with few exceptions ("eating babies is bad"), one-sided opinions tend to be less popular than anything that appears "balanced".
It's especially tough if your one-sided opinion is attacking a popular practice.
I have not down-voted any of your submissions. Instead, I replied with the above to help explain why you were experiencing what you have.
> Besides, isn't downvoting someone for nonsensical, petty reasons (and leaving no response as to why, of course) at least as harmful to the overall discussion as referring to those downvotes?
I would imagine so. However, the culture here AFAIK is to down/up vote as one sees fit without providing an explanation.