I am using the data here. It is in CSV format and has 41.7 million nodes and 1.47 billion relationships.
I’m sharing the converted RDF file now, and my schema:
id: int @index(int) .
followers: [uid] @reverse .
type twitter_user {
  id: int
  followers: [twitter_user]
}
The results show that filtering with the type function performs poorly.
There is another problem: when I run a query on the k8s cluster, the start_ts returned is always the same.
I suppose we are fetching all the uids associated with the type before doing the intersection. We could be smarter here. Accepting this as an enhancement.
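To make the cost difference concrete, here is a toy sketch of the two strategies. It is not Dgraph's actual implementation: plain Python sets stand in for UID lists, and the function names, data sizes, and "UIDs touched" counts are assumptions made up for illustration.

```python
# Toy model only: plain Python sets stand in for Dgraph's UID lists.
# All names and sizes below are hypothetical.

def filter_by_full_intersection(candidates, load_type_uids):
    """Load every UID of the type, then intersect with the candidates."""
    type_uids = load_type_uids()  # work grows with the size of the type
    return sorted(candidates & type_uids), len(type_uids)


def filter_by_candidate_lookup(candidates, has_type):
    """Check only the candidate UIDs for membership in the type."""
    matched = sorted(uid for uid in candidates if has_type(uid))
    return matched, len(candidates)  # work grows with the candidate count


# One million typed nodes, three candidates left over from an earlier filter.
TYPE_UIDS = set(range(1, 1_000_001))
candidates = {42, 7_000, 2_000_000}

full_result, full_touched = filter_by_full_intersection(
    candidates, lambda: TYPE_UIDS
)
lookup_result, lookup_touched = filter_by_candidate_lookup(
    candidates, TYPE_UIDS.__contains__
)

print(full_result, full_touched)
print(lookup_result, lookup_touched)
```

Both functions return the same matches, but the first touches every UID of the type while the second touches only the candidates. That gap is what grows when a type covers tens of millions of nodes.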
Can you share a way to replicate this? You should get different start_ts unless you are reusing the same transaction every time.
@pawan Thank you for your reply.
I don’t know how to replicate this problem right now. It doesn’t happen with a single alpha, but three of my four clusters have this problem.
Hi pawan, I have the same problem. If too many nodes have the type, fetching all the UIDs associated with the type before the intersection is very time-consuming, and a filter that uses eq on an indexed predicate performs much better than the type filter. This causes a real problem: the query is faster if you do not use the type filter at all. I don’t think this can be treated as just an enhancement.