I want to upgrade Dgraph from 1.0.15 to 1.1.0. Dgraph 1.1.0 now supports a type system, and using the expand() function requires it. However, the data exported from Dgraph 1.0.15 contains no dgraph.type predicates, and the export has 30,000,000 nodes. Adding dgraph.type to each node one by one is too much trouble. Is there any convenient solution?
You could use a bulk upsert to attach a type to each node: https://docs.dgraph.io/mutations/#bulk-delete-example. Remember to use set instead of delete.
The command I am using:
curl -H "Content-Type: application/rdf" -X POST 10.134.27.30:8281/mutate?commitNow=true -d $'
upsert {
  query {
    v as var(func: has(xid))
  }
  mutation {
    set {
      uid(v) <dgraph.type> "Thing" .
    }
  }
}
' | jq
The dgraph.ERROR log shows:
What caused this?
I was thinking about this. The mutation is too large to be performed in one single request. For now, you could use first and offset to do it in small steps (say 1,000,000 at a time): https://docs.dgraph.io/query-language/#offset. Please file an issue on GitHub so that we can either fix it or provide a more useful message.
Thank you! Issue link: Upsert Block fails when processing multiple nodes · Issue #4021 · dgraph-io/dgraph · GitHub
In addition to Aman’s tip, adding a NOT filter would also help, so you don’t overwrite a dgraph.type that is already there.
upsert {
  query {
    v as var(func: has(xid), first: M, offset: N) @filter(NOT has(dgraph.type))
  }
  mutation {
    set {
      uid(v) <dgraph.type> "Thing" .
    }
  }
}
So, if one uses @filter(NOT has(dgraph.type)), there is no need for offset at all, right? Just repeat the query until everything is converted.
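If that holds, the proposal amounts to a simple driver loop. Here is only a sketch: run_upsert and count_untyped are hypothetical callables standing in for real HTTP calls to Dgraph's /mutate and /query endpoints.

```python
def convert_until_done(run_upsert, count_untyped, max_rounds=10_000):
    """Repeat the filtered upsert until no untyped nodes remain.

    run_upsert applies one batch of the upsert that uses
    @filter(NOT has(dgraph.type)); count_untyped returns how many
    nodes still lack dgraph.type. Both are hypothetical stand-ins
    for HTTP calls. max_rounds guards against an endless loop if a
    batch stops making progress.
    """
    rounds = 0
    while count_untyped() > 0 and rounds < max_rounds:
        run_upsert()
        rounds += 1
    return rounds
```

Because each round re-runs the same filtered query, no offset bookkeeping is needed: nodes that already received a type simply stop matching.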
I thought about this, and assumed @filter would be applied after the func, but I tested it to be sure, and it actually works:
This query on my db, where not all nodes have articleModifiedOn:
{
  first(func: has(articleName), first: 2) {
    uid
    articleModifiedOn
  }
  firstfilter(func: has(articleName), first: 2) @filter(has(articleModifiedOn)) {
    uid
    articleModifiedOn
  }
}
Resulted in:
{
  "data": {
    "first": [
      {
        "uid": "0x6ddd1"
      },
      {
        "uid": "0x6ddd2",
        "articleModifiedOn": "2019-09-17T20:46:34.966691339Z"
      }
    ],
    "firstfilter": [
      {
        "uid": "0x6ddd2",
        "articleModifiedOn": "2019-09-17T20:46:34.966691339Z"
      },
      {
        "uid": "0x704e2",
        "articleModifiedOn": "2019-09-17T16:53:17.35865168Z"
      }
    ]
  }
}
Not exactly. If you have billions of nodes to set up, for now you still have to use offset - and if for any reason some nodes already have <dgraph.type>, the filter would skip those, so each offset window results in fewer writes. Also, to avoid assigning wrong types, you can use the NOT approach and/or require a set of predicates in the query. e.g.:
Query to define Person type
upsert {
  query {
    v as var(func: has(xid), first: M, offset: N) @filter(
      has(name) AND
      has(age) AND
      NOT has(species)
    )
  }
  mutation {
    set {
      uid(v) <dgraph.type> "Person" .
    }
  }
}
Query to define Animal type
upsert {
  query {
    v as var(func: has(xid), first: M, offset: N) @filter(
      has(name) AND
      has(species) AND
      NOT has(~species) # Useful if you have reverse edges, to avoid mistakes.
    )
  }
  mutation {
    set {
      uid(v) <dgraph.type> "Animal" .
    }
  }
}
I think @ppp225 is right. He doesn’t need offset if he is already filtering out the nodes that have a type.