Bulk/massive upserts feature

Hey @amanmangal, I saw https://github.com/dgraph-io/dgraph/issues/3817
What about the “bulk/massive upsert” idea I mentioned 2 months ago? Do you think it should be a new issue, or is it related to 3817?

“How to update millions of records in a table” is a common SQL question.

The reference query:

upsert {
  query {
    me(func: regexp(email, /^.*@dgraph\.io$/)) {
      v as uid
    }
  }
  mutation {
    set {
      # Fix the typo "Dgraph Lab" to "Dgraph Labs".
      uid(v) <WorksAt> "Dgraph Labs" .
    }
  }
}

If Dgraph finds 40 workers at “@dgraph.io”, it would create 40 N-Quad clones with different uids (or 40 mutation blocks) to add or update the information.
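To make the expansion idea concrete, here is a minimal Python sketch of the behavior being requested. It is only a model of the semantics, not Dgraph's implementation: `expand_uid_var` and the sample uids are hypothetical names invented for illustration.

```python
from typing import List


def expand_uid_var(template: str, var_name: str, uids: List[str]) -> List[str]:
    """Return one N-Quad per matched uid by substituting uid(var_name)."""
    placeholder = "uid({})".format(var_name)
    return [template.replace(placeholder, "<{}>".format(uid)) for uid in uids]


# Hypothetical uids that the query block would bind to the variable `v`.
matched = ["0x1", "0x2", "0x3"]

# One N-Quad template from the mutation block, cloned once per matched uid.
nquads = expand_uid_var('uid(v) <WorksAt> "Dgraph Labs" .', "v", matched)

for line in nquads:
    print(line)
```

With 40 matched uids the same template would yield 40 N-Quads, which is the "clone" behavior described above.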

This would be useful for game scenarios (distributing score to groups in a game), social networks, contextualized changes (for example, someone misspells the company’s name and we need to fix it at several context levels, which could mean thousands of nodes), and so on.

upsert {
  query {
    me(func: eq(team, "Manchester United RPG Club")) {
      rpgTeam {
        v as uid
      }
    }
  }
  mutation {
    set {
      uid(v) <overallTeamScore> "3000" .
    }
  }
}

This could also serve as a kind of “drop/clean”. Say we need to delete a company from our DB, and the yellow company here has 2k employees. After deleting all of yellow’s information, I want to delete all of its employees from my DB for legal reasons.

upsert {
  query {
    me(func: regexp(email, /^.*@yellow\.company$/)) {
      v as uid
    }
  }
  mutation {
    delete {
      uid(v) * * .
    }
  }
}

Another use: deleting a thread of posts from a social network.

upsert {
  query {
    me(func: regexp(username, /@MichelDiz/)) {
      posts @filter(eq(postID, "156358129154782120")) {
        POST as uid
        rootcmts as comments {
          TH as thread {
            cmts as comments {
              WHO as ~commented
            }
          }
        }
      }
    }
  }
  mutation {
    delete {
      uid(POST) * * .
      uid(rootcmts) * * .
      uid(TH) * * .
      uid(cmts) * * .
      # This deletes the reverse edge from users that commented there.
      uid(WHO) <commented> uid(cmts) .
    }
  }
}

All of this should work in master. The PR https://github.com/dgraph-io/dgraph/pull/3612 adds support for this along with conditional upsert. It would be really useful if you could try these on master and share your experience. Share the examples here; I could use them for an upsert blog post too.


Perfect! Wonderful work Aman!

forget this =>But only S * * deletion isn’t working.

Uh! I forgot that I need types to be able to delete!

Another way to do this https://github.com/dgraph-io/dgraph/blob/d6af37853169fccd5b266606c8d23beacdb66187/dgraph/cmd/alpha/upsert_test.go#L342

Actually, I think this would be better than using “first:”, because there may be several people aged 56. They should all get the title “oldest”, not just a random one of them.

It would be perfect if the aggregation could pass the UID along with the value. That way I wouldn’t need another round-trip block, and I also wouldn’t need to add an index to the age predicate.

upsert {
  query {
    has_age as var(func: has(age)) {
      a as age
    }
    var() {
      MAX as max(val(a))
    }
    u as var(func: uid(has_age)) @filter(eq(age, val(MAX)))
  }
  mutation {
    set {
      uid(u) <oldest> "true" .
    }
  }
}
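The tie-handling argument behind the query above can be checked with a small Python model. This is only a sketch of the logic, not of how Dgraph evaluates the query, and the `people` data is made up for illustration.

```python
# Hypothetical uid -> age data; two people tie for the highest age.
people = {"0x1": 56, "0x2": 41, "0x3": 56}

# Sorting by age and keeping only the first result (the "first:" approach)
# returns a single node even when several nodes share the maximum.
first_only = sorted(people, key=people.get, reverse=True)[:1]

# Computing the maximum first and then filtering on it (the max + eq approach)
# keeps every node that ties for the highest age.
max_age = max(people.values())
oldest = sorted(uid for uid, age in people.items() if age == max_age)

print(first_only)
print(oldest)
```

The second approach is the one that gives every 56-year-old the “oldest” title.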
  • I think this particular test is an older one, written to test upsert when a variable could only take one value.
  • S * * probably doesn’t work because the tests do not have type information.

Hey, can we support value variables in upsert mutation?

upsert {
  query {
    me(func: eq(team, "Manchester United RPG Club")) {
      rpgTeam {
        v as uid
        OVTS as overallTeamScore
      }
      mySum as math(3000 + val(OVTS))  # let's say 3k + 5500
    }
  }
  mutation {
    set {
      uid(v) <overallTeamScore> val(mySum) .
    }
  }
}
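One plausible reading of the request above is that each matched uid gets its own computed value substituted into the mutation. The Python sketch below models that reading only; `expand_val_var` and the sample scores are hypothetical, and the real feature may behave differently.

```python
from typing import Dict, List


def expand_val_var(predicate: str, scores: Dict[str, int], bonus: int) -> List[str]:
    """Emit one N-Quad per uid, using that uid's own score plus the bonus."""
    lines = []
    for uid, score in sorted(scores.items()):
        lines.append('<{}> <{}> "{}" .'.format(uid, predicate, score + bonus))
    return lines


# Hypothetical current overallTeamScore values, keyed by uid.
scores = {"0xa": 5500, "0xb": 1200}

# Add 3000 to each team's existing score, as in the example query.
updated = expand_val_var("overallTeamScore", scores, 3000)

for line in updated:
    print(line)
```

The point of the sketch is that the written value differs per node, which a constant string in the mutation block cannot express.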

I also think that “val()” would be a way out for dealing with facet changes, as I mentioned in my last comment in the issue below.

Thanks for pointing it out. This is already on our roadmap and on my to-do list, as I have mentioned before. There is a good chance it won’t be available in 1.1, though. But this feature doesn’t require an API change, so it should be easy enough (from an API perspective) to add in, say, 1.1.1.
