It's a good start toward solving the array of problems you hit when building a low-latency, arbitrary-depth graph DB.
Dgraph does the following:
1. It shards by predicate so a full join can be executed by one server.
2. Storing each SPO (subject-predicate-object) triple as a record (or row) would still be slow because, in graphs, a single SP can result in thousands or millions of objects. Reading them would involve a lot of iteration and multiple disk seeks. So, Dgraph stores them in a posting list format, using SP -> list of O, as a single key-value.
3. Those values can then get really large. So, Dgraph converts all nodes into 64-bit integers.
4. Intersections are really important in graphs. So, posting lists store all ints in sorted order, to allow quick intersection between multiple posting lists.
5. Add indices, then replication, HA, transactions, MVCC, correctness, linearizable reads, and ...
voila! 3 years later, you have Dgraph!
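To make points 2 to 4 concrete, here is a minimal in-memory sketch of the idea, not Dgraph's actual code. The names `PostingStore`, `uid`, `add`, `objects`, and `intersect` are all hypothetical, as is the sequential UID assignment; a real system would persist each posting list as one value in a key-value store.

```python
import bisect


class PostingStore:
    """Toy store: one key per (subject UID, predicate), value is a sorted list of object UIDs."""

    def __init__(self):
        self._uids = {}      # node name -> integer UID (assumed sequential, for illustration)
        self._postings = {}  # (subject_uid, predicate) -> sorted list of object UIDs

    def uid(self, name):
        # Assign the next integer the first time a node is seen.
        # Python ints are unbounded; a real store would use fixed 64-bit IDs.
        if name not in self._uids:
            self._uids[name] = len(self._uids) + 1
        return self._uids[name]

    def add(self, subject, predicate, obj):
        # Insert the object's UID into the SP posting list, keeping it sorted and unique.
        plist = self._postings.setdefault((self.uid(subject), predicate), [])
        o = self.uid(obj)
        i = bisect.bisect_left(plist, o)
        if i == len(plist) or plist[i] != o:
            plist.insert(i, o)

    def objects(self, subject, predicate):
        # One key lookup returns every object for this subject and predicate.
        s = self._uids.get(subject)
        return self._postings.get((s, predicate), [])


def intersect(a, b):
    """Intersect two sorted integer lists with a linear two-pointer merge."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


store = PostingStore()
for who in ("bob", "dave", "carol"):
    store.add("alice", "follows", who)
for who in ("dave", "frank", "carol"):
    store.add("erin", "follows", who)

# UIDs followed by both alice and erin (dave and carol).
common = intersect(store.objects("alice", "follows"),
                   store.objects("erin", "follows"))
```

Because both lists are already sorted, the intersection is a single pass over each list, with no hashing or re-sorting at query time.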
P.S. I intend to find some time and write a paper about the unique design of Dgraph. There's a lot of original research involved in building it.