Best way to update a predicate on all nodes in dgraph

rahst12 · February 1, 2023, 8:58pm

Hi,
We’ve ingested a fair amount of data (500 GB) and have now realized we should have made sure all the predicates on our nodes were lower-cased. Is there a best practice for performing what would amount to an “update all” query?

Thanks,
Ryan

MichelDiz · February 2, 2023, 4:14am

rahst12 · February 6, 2023, 5:00pm

Hi @MichelDiz,
I may not have phrased my question the best… I’m interested in lower-casing the value of a specific predicate type. Is the post you linked for renaming the key name of a predicate?

Thanks,
Ryan

MichelDiz · February 6, 2023, 5:08pm

No, there’s no lower-casing function for values. Or anything like that for values.

rahst12 · February 6, 2023, 5:34pm

To lower-case all the values, is there a way to pass a script/function into dgraph to apply to the nodes? (We run elasticsearch too… you can pass “painless” scripts in there)

Alternatively, we’re thinking if there’s a way to query every node, we could then lower-case them in some custom scripting and perform an update to the node.

Is there a way to one-up query each node in the database, like iterate through each one?

Thanks,
Ryan

MichelDiz · February 6, 2023, 5:48pm

You can use an upsert approach with any language to do this. It can be a Python or JS script. The approach would be:

   1. Run a query and grab the UIDs and the values.
   2. Iterate over the UIDs and values and transform each value as needed.
   3. Send a new mutation.
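
The steps above could look roughly like the Python sketch below. It is only an illustration under some assumptions: `name` stands in for whichever predicate you need to fix, the query pages through results with DQL's `first`/`after` arguments so no single query has to return everything, and `fetch_page` / `apply_updates` are hypothetical callables that you would wire to your own Dgraph client.

```python
from typing import Callable, Dict, List, Optional

Node = Dict[str, str]


def build_query(pred: str, page_size: int, after: Optional[str]) -> str:
    """Build a DQL query for one page of nodes that have `pred`.

    `after` is the last UID seen on the previous page (None for the first page).
    """
    after_clause = f", after: {after}" if after else ""
    return (
        "{ q(func: has(%s), first: %d%s) { uid %s } }"
        % (pred, page_size, after_clause, pred)
    )


def lowercase_updates(nodes: List[Node], pred: str) -> List[Node]:
    """Return set-mutation objects only for nodes whose value would change."""
    updates = []
    for node in nodes:
        value = node.get(pred)
        if isinstance(value, str) and value != value.lower():
            updates.append({"uid": node["uid"], pred: value.lower()})
    return updates


def lowercase_all(
    fetch_page: Callable[[Optional[str]], List[Node]],
    apply_updates: Callable[[List[Node]], None],
    pred: str,
) -> int:
    """Page through all nodes, lower-case `pred`, and return how many changed.

    fetch_page(after) is assumed to return nodes in UID order with UIDs
    greater than `after`; apply_updates(updates) is assumed to send them
    as a set mutation.
    """
    after: Optional[str] = None
    changed = 0
    while True:
        nodes = fetch_page(after)
        if not nodes:
            break
        updates = lowercase_updates(nodes, pred)
        if updates:
            apply_updates(updates)
            changed += len(updates)
        after = nodes[-1]["uid"]  # cursor for the next page
    return changed
```

With pydgraph, `fetch_page` could wrap a read-only `txn.query(build_query(...))` and `apply_updates` could wrap `txn.mutate(set_obj=..., commit_now=True)`; check that wiring against your client version. Keeping each page small (a few thousand nodes) keeps both the query and the mutation short.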

There’s no way to pass scripts to run in the cluster itself.

That’s something to evaluate (PRs, RFCs, and feature requests are welcome). But Elasticsearch has 13 years of development behind it; Dgraph has 7. We’re on our way.

Yes, but I don’t get the question. An upsert mutation does that, but not the way you want. If you need those functions, check whether we already have a feature request. If not, open one.

Cheers.

rahst12 · February 6, 2023, 9:49pm

This is the general process we figured we’d be left with. The challenge we have is getting an initial set of all the nodes we need to update without a query timeout occurring. We’d need to issue a “select all”… We’re trying to think of other ways to access every node in the graph without taking that approach, like using the Dgraph UID hex value. We thought it was a “one-up” type value… 0x1, 0x2, 0x3, etc.

For 1 to 10 million:
   1. Query 0x1
   2. Fix case issue
   3. Issue mutate

Is the Dgraph UID constructed in such a way we could iterate through all UIDs?

Thanks,
Ryan

MichelDiz · February 7, 2023, 12:10am

Yeah, you can do that. You can check the maximum leased UID instead of assuming 10 million right away, because if you do an upsert on a non-existing UID it may return an error or create new ghost nodes.
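
As a rough illustration of walking the UID space, here is a Python sketch that splits the range 0x1..max into batches and builds one query per batch. The assumptions: `max_uid` is the maximum leased UID, which you would look up yourself (Zero should report it in its state output, though the field name depends on the version), `name` is a placeholder predicate, and both helper functions are hypothetical. The `@filter(has(...))` is there so that UIDs with no data for the predicate are dropped from the result and never reach the mutation step.

```python
from typing import Iterator, List


def uid_batches(max_uid: int, batch_size: int = 1000) -> Iterator[List[str]]:
    """Yield lists of hex UIDs covering 0x1..max_uid, batch_size at a time."""
    for start in range(1, max_uid + 1, batch_size):
        end = min(start + batch_size - 1, max_uid)
        yield [hex(uid) for uid in range(start, end + 1)]


def build_uid_query(uids: List[str], pred: str) -> str:
    """Build a DQL query for a batch of UIDs, keeping only nodes that have `pred`."""
    return "{ q(func: uid(%s)) @filter(has(%s)) { uid %s } }" % (
        ", ".join(uids),
        pred,
        pred,
    )


if __name__ == "__main__":
    # Example: print the queries for a tiny UID range.
    for batch in uid_batches(max_uid=5, batch_size=2):
        print(build_uid_query(batch, "name"))
```

Each batch's result can then go through the same fix-and-mutate step as any other page of nodes.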
