We’re pretty data-oriented at this company. Do we have any evidence that data stored in one format gives lower throughput than data stored in another?
The main reason the data is colocated by predicate is that edge lists are a well-balanced data structure for a graph database: they offer O(1) addition of nodes and edges, O(|E|) storage, and O(|E|/S) query time, where S is the number of shards. Nodes can therefore be found very quickly, and by the same logic, when the data is colocated, the data associated with a node can be fetched quickly as well.
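To make the complexity argument concrete, here is a minimal Python sketch of an edge list sharded by predicate. It is a toy model, not Dgraph's actual storage code: the class `PredicateShardedEdgeList` and its methods are hypothetical names invented for illustration, and each in-memory list stands in for a shard. The O(|E|/S) query cost only holds under the assumption that edges are spread roughly evenly across the S predicates.

```python
class PredicateShardedEdgeList:
    """Toy edge list: all edges sharing a predicate live in the same shard.

    Hypothetical illustration only; this is not how Dgraph is implemented.
    """

    def __init__(self):
        # predicate -> list of (subject, object) edges; one list per "shard"
        self.shards = {}

    def add_edge(self, subject, predicate, obj):
        # Appending to a list is amortized O(1).
        self.shards.setdefault(predicate, []).append((subject, obj))

    def query(self, subject, predicate):
        # Only the shard for this predicate is scanned, so the cost is the
        # size of that shard: about |E|/S if the shards are balanced.
        edges = self.shards.get(predicate, [])
        return [obj for subj, obj in edges if subj == subject]

    def edge_count(self):
        # Each edge is stored exactly once, so storage is O(|E|).
        return sum(len(edges) for edges in self.shards.values())


store = PredicateShardedEdgeList()
store.add_edge("alice", "follows", "bob")
store.add_edge("alice", "follows", "carol")
store.add_edge("bob", "follows", "alice")
store.add_edge("alice", "lives_in", "sydney")

# Everything "alice" follows comes out of a single shard.
print(store.query("alice", "follows"))
```

Because every `follows` edge sits in one shard, the query never touches the `lives_in` data, which is the colocation benefit described above.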
If you have evidence that this layout would negatively impact throughput, please share it; we would love to consider alternatives. But at Dgraph, data talks.