Problems upgrading the database from 1.0.15 to 1.1.0

I want to upgrade Dgraph from 1.0.15 to 1.1.0. Dgraph 1.1.0 now supports a type system, and using the expand() function requires it. However, the data exported from Dgraph 1.0.15 contains no dgraph.type values. The export has 30,000,000 nodes, so adding dgraph.type to each node one by one is too much trouble. Is there a convenient solution?


You could use a bulk upsert to attach a type to each node: https://docs.dgraph.io/mutations/#bulk-delete-example. Remember to use set instead of delete.

The command I am using:

curl -H "Content-Type: application/rdf" -X POST 10.134.27.30:8281/mutate?commitNow=true -d  $'
upsert {
  query {
    v as var(func: has(xid))
  }

  mutation {
    set {
      uid(v) <dgraph.type> "Thing" .
    }
  }
}
' | jq

The dgraph.ERROR log shows:

What caused this?

I was thinking about this. The mutation is too large to be performed in a single request. For now, you could use first and offset to do it in smaller steps (say 1,000,000 at a time): https://docs.dgraph.io/query-language/#offset. Please file an issue on GitHub so that we can either fix it or provide a more useful error message.
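
If it helps, here is a rough shell sketch of that batching loop, reusing the endpoint and "Thing" type from the command above (the batch size and total node count are placeholders to adjust):

BATCH=1000000
TOTAL=30000000
OFFSET=0
while [ "$OFFSET" -lt "$TOTAL" ]; do
  # Each request upserts one batch; has(xid) is a stable set, so
  # plain first/offset pagination walks through it.
  curl -s -H "Content-Type: application/rdf" -X POST \
    "10.134.27.30:8281/mutate?commitNow=true" -d "
upsert {
  query {
    v as var(func: has(xid), first: $BATCH, offset: $OFFSET)
  }

  mutation {
    set {
      uid(v) <dgraph.type> \"Thing\" .
    }
  }
}
"
  OFFSET=$((OFFSET + BATCH))
done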


Thank you! Issue link: Upsert Block fails when processing multiple nodes · Issue #4021 · dgraph-io/dgraph · GitHub

In addition to Aman’s tip, adding a NOT filter would also help, so you don’t overwrite a dgraph.type that is already there.

upsert {
  query {
    v as var(func: has(xid), first: M, offset: N) @filter(NOT has(dgraph.type))
  }

  mutation {
    set {
      uid(v) <dgraph.type> "Thing" .
    }
  }
}

So, if one used @filter(NOT has(dgraph.type)), they wouldn’t even have to use offset, right? Just repeat the query until everything is converted.
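
Something like this rough sketch, reusing the endpoint from earlier in the thread (the completion check is my own assumption: I query for any remaining untyped node and count the result with jq):

while : ; do
  # Ask for a single node that still lacks a type; stop when none is left.
  REMAINING=$(curl -s -H "Content-Type: application/graphql+-" -X POST \
    "10.134.27.30:8281/query" -d '
{
  q(func: has(xid), first: 1) @filter(NOT has(dgraph.type)) { uid }
}' | jq '.data.q // [] | length')
  [ "$REMAINING" -eq 0 ] && break
  # Same upsert as before, but no offset: the filter shrinks the set itself.
  curl -s -H "Content-Type: application/rdf" -X POST \
    "10.134.27.30:8281/mutate?commitNow=true" -d '
upsert {
  query {
    v as var(func: has(xid), first: 1000000) @filter(NOT has(dgraph.type))
  }

  mutation {
    set {
      uid(v) <dgraph.type> "Thing" .
    }
  }
}'
done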

I thought about this, and thought @filter will be applied after the func, but I tested it to be sure, and it actually works:
This query on my db, where not all nodes have articleModifiedOn:

{
	first(func: has(articleName), first: 2) {
		uid
		articleModifiedOn
	}
	firstfilter(func: has(articleName), first: 2) @filter(has(articleModifiedOn)) {
		uid
		articleModifiedOn
	}
}

Resulted in:

  "data": {
    "first": [
      {
        "uid": "0x6ddd1"
      },
      {
        "uid": "0x6ddd2",
        "articleModifiedOn": "2019-09-17T20:46:34.966691339Z"
      }
    ],
    "firstfilter": [
      {
        "uid": "0x6ddd2",
        "articleModifiedOn": "2019-09-17T20:46:34.966691339Z"
      },
      {
        "uid": "0x704e2",
        "articleModifiedOn": "2019-09-17T16:53:17.35865168Z"
      }
    ]
  }

Not exactly. If you have billions of nodes to set, for now you still have to use offset. And if, for any reason, some nodes already have <dgraph.type>, you would skip those, resulting in fewer writes at each offset.

Also, to avoid assigning wrong types, you could use the NOT approach and/or require a set of predicates in the query, e.g.:

Query to define Person type

upsert {
  query {
    v as var(func: has(xid), first: M, offset: N) @filter(
      has(name) AND
      has(age) AND
      NOT has(species)
    )
  }

  mutation {
    set {
      uid(v) <dgraph.type> "Person" .
    }
  }
}

Query to define Animal type

upsert {
  query {
    v as var(func: has(xid), first: M, offset: N) @filter(
      has(name) AND
      has(species) AND
      NOT has(~species) # Useful if you have reverse edges, to avoid mistakes.
    )
  }

  mutation {
    set {
      uid(v) <dgraph.type> "Animal" .
    }
  }
}

I think @ppp225 is right. He doesn’t need offset if he is already filtering out the nodes that have a type.
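
For what it’s worth, a count query like this (assuming xid is still the anchor predicate, as in the examples above) makes it easy to check how many untyped nodes remain after each pass:

{
  remaining(func: has(xid)) @filter(NOT has(dgraph.type)) {
    count(uid)
  }
}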