How to do an upsert block with multiple linked variable from separate query blocks

(GraphQL) Schema:

type Address {
  id: ID
  tempCountry: String!
  country: Country @hasInverse(field: usedBy)
  ...
}
type Country {
  name: String! @id
  countryCode: String
  usedBy: [Addresses] @hasInverse(field: country)
  ...
}

When I imported data I set just the <Address.tempCountry> predicate instead of going through the extra effort to map the countries and then link back to the right country. Now I am trying to resolve this to remove the Address.tempCountry predicate. What I need to do is to map each address to the existing corresponding Country. I already created the Country nodes. The Country.name = Address.tempCountry

I have figured out how to do multiple query blocks with Dgraph and have this so far:

{
  addresses(func: has(Address.tempCountry)) {
    uid
    country_name as Address.tempCountry
  }
  countries(func: eq(Country.name, val(country_name))) {
    uid
    name
  }
}

The next part is where I am stuck. I need to take the above and put it in a upsert block to set the Address.country & Country.usedBy predicates and delete the Address.tempCountry predicate.

Something like this:

upsert {
  query {
    addresses(func: has(Address.tempCountry)) {
      address_uid as uid
      country_name as Address.tempCountry
    }
    countries(func: eq(Country.name, val(country_name))) {
      country_uid as uid
      name
    }
  }
  mutation {
    set {
      uid(address_uid) <Address.country> uid(country_uid) .
      uid(country_uid) <Country.usedBy> uid(address_uid) .
    }
    delete {
      uid(address_uid) <Address.tempCountry> * .
    }
  }
}

But I believe I am missing something here because this above will set every address to every country and every country used by every address.

Can I get any direction on how to do this correctly?

Bulk Upsert is not highly recommended for everyone. In that case, you need to know how to control collateral damage when using such a procedure. My recommendation is to limit this query to one operation at a time.

For example, add a temporary predicate stating that there was a migration. e.g. “<migrated>” and add the check @filter (NOT has (<migrated>)) to your query. That way you avoid messing things up.

To limit one operation at a time, you must use the “first” paging parameter.

e.g

addresses(func: has(Address.tempCountry), first: 1) @filter(NOT has(<migrated>))

...
set {
      uid(address_uid) <Address.country> uid(country_uid) .
      uid(address_uid) <migrated>  "" .
      uid(country_uid) <Country.usedBy> uid(address_uid) .
    }

I would also separate the delete operation for later. I would add a “<to_be_deleted> "Address.tempCountry" . ” predicate and later I would delete them all via upsert block. Just to avoid deleting important data unintentionally.

I didn’t cover every part of the problem, as I didn’t fully understand it. If you provide a sample exemplifying the problem in practice, I may be able to go deeper. But in general, Bulk Upsert does not work in all cases.

Let me know if you have more questions.

Cheers.

2 Likes

I think I have resolved the limitation which leads back to no way to do foreach or a loop, so I again have to script outside of Dgraph to get this to work. The problem I ran into is that it was setting every address to be linked to every country which was not the desired effect. I understood this better after digging around the forums some more and our other discussion about loops

1 Like

Got it, indeed we need loops to overcome some limitations on Upsert Block. And also other small ones. But the team is focused on bigger issues, so we have to wait.

Understood.

Security, Performance, and then Functionality.

Just trying to find any possible work arounds for the time being.

Hi @amaster507
I also encountered the same problem, how did you finally solve it?