Hi! I want to delete nodes that are older than one month, on a daily basis.
They are not used for training my machine learning algorithm and they take up too much space. I indexed the created_at predicate by day. (I don’t need hourly indexing, and I thought it would result in too many indexes.) My data includes millions of nodes even for a single day, and I am not able to delete them. I get these errors:
“NQuad count in the request: 1148066, is more that threshold: 1000000”
“Deadline Exceeded” (timeout was 300)
I get the second error when there are fewer than 1 million nodes, but the number is still high. What is the optimal solution for this? Pagination? If so, what is the optimal number of nodes to delete per request?
You can set a higher limit for NQuads using the --mutations_nquad_limit flag:
dgraph alpha -h | grep limit
--mutations_nquad_limit uint Limit for the maximum number of nquads that can be inserted in a mutation request (default 1000000)
--normalize_node_limit uint Limit for the maximum number of nodes that can be returned in a query that uses the normalize directive. (default 10000)
--pending_proposals int Number of pending mutation proposals. Useful for rate limiting. (default 256)
--query_edge_limit uint Limit for the maximum number of edges that can be returned in a query. This applies to shortest path and recursive queries. (default 1000000)
As for pagination: that’s the best approach, and I personally recommend it.
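For the pagination route, here is a minimal sketch of what a batched delete could look like, written in Python. It builds a DQL upsert block that selects at most `batch_size` nodes with created_at before a cutoff and deletes them, then repeats until a batch comes back short. The names `build_delete_upsert`, `delete_in_batches`, `cutoff_days_ago` and the `run_upsert` callable are hypothetical, not part of any Dgraph client; `run_upsert` stands in for whatever you use to send the upsert to Alpha and read back the `matched` count. The batch size of 10000 is only a starting guess to tune: it has to keep each request under the NQuad limit and within your timeout. Note also that `uid(old) * * .` only removes predicates that belong to the node's dgraph.type, so this assumes your nodes are typed.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable


def cutoff_days_ago(days: int = 30) -> datetime:
    """Return the UTC timestamp `days` days before now."""
    return datetime.now(timezone.utc) - timedelta(days=days)


def build_delete_upsert(cutoff: datetime, batch_size: int) -> str:
    """Build a DQL upsert block deleting up to batch_size old nodes.

    `cutoff` should be timezone-aware; it is converted to UTC and
    formatted as an RFC 3339 timestamp.
    """
    cutoff_str = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"""upsert {{
  query {{
    old as var(func: lt(created_at, "{cutoff_str}"), first: {batch_size})
    matched(func: uid(old)) {{ count(uid) }}
  }}
  mutation {{
    delete {{
      uid(old) * * .
    }}
  }}
}}"""


def delete_in_batches(
    run_upsert: Callable[[str], int],
    cutoff: datetime,
    batch_size: int = 10_000,   # assumption: tune for your data and timeout
    max_batches: int = 1_000,   # safety cap so the loop always terminates
) -> int:
    """Run the upsert repeatedly until a batch matches fewer than batch_size.

    `run_upsert` is a hypothetical callable: it sends the upsert text to
    Dgraph, commits it, and returns how many nodes the query matched.
    Returns the total number of nodes matched across all batches.
    """
    total = 0
    for _ in range(max_batches):
        matched = run_upsert(build_delete_upsert(cutoff, batch_size))
        total += matched
        if matched < batch_size:
            break  # last (partial or empty) batch: nothing older remains
    return total
```

A daily job would then call something like `delete_in_batches(send_to_dgraph, cutoff_days_ago(30))`, where `send_to_dgraph` is your own function. Because every batch is a separate committed request, a failure partway through leaves the earlier batches deleted and the job can simply be rerun.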