Upsert nested types in Go

Hi,

I am new to graph databases and dgraph.

I am trying to work out the best way of adding multiple relationships at the same time while avoiding duplicates.

I have 3 types: Searches, Results, and Sentences.

The problem I have is that Searches may hit the same Sentences via Results, so I need to use an upsert. But I need to be able to upsert lots of Searches, Results, and Sentences without duplicating any.

I am currently doing it in stages: upsert the Sentences first (so as not to duplicate them), then upsert the Results, then upsert the links one at a time, querying for the uid of each Result and Sentence. But this seems really inefficient.

Am I going about it all wrong?

Thanks

Dean

A single upsert block can run several queries, and those results can be used to generate a mutation. For example, you can use a Conditional Upsert (https://dgraph.io/docs/mutations/conditional-upsert/), combining several query blocks to drive the final mutation.
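For the Sentence case described above, a conditional upsert could be sketched like this in Go. The `sentence.id` predicate name and the dgo calls shown in the trailing comment are illustrative assumptions, not taken from this thread:

```go
package main

import "fmt"

// sentenceUpsert returns the query, condition, and N-Quads for a
// conditional upsert that creates a Sentence only when no node with
// the given sentence.id already exists.
func sentenceUpsert(id string) (query, cond, nquads string) {
	// Bind any existing Sentence with this ID to the variable "s".
	query = fmt.Sprintf(`query { s as q(func: eq(sentence.id, %q)) { uid } }`, id)
	// Only mutate when the query found nothing, so re-running the
	// upsert never creates a duplicate.
	cond = `@if(eq(len(s), 0))`
	nquads = fmt.Sprintf("uid(s) <sentence.id> %q .\nuid(s) <dgraph.type> \"Sentence\" .", id)
	return
}

func main() {
	q, c, n := sentenceUpsert("abc-1")
	fmt.Println(q)
	fmt.Println(c)
	fmt.Println(n)
	// With dgo, these pieces would go into a single request, e.g.:
	//   mu := &api.Mutation{Cond: c, SetNquads: []byte(n)}
	//   req := &api.Request{Query: q, Mutations: []*api.Mutation{mu}, CommitNow: true}
	//   _, err := txn.Do(ctx, req)
}
```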

Trouble is, I have an unknown number of things. Adding items with Go is really easy, but upserts are a bit of a pain because you have to put uid( ) as the UID of the object. If I have 20 of them, it's a pain to write the query and then update each object before you marshal the JSON.
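One way around hand-writing 20 query blocks is to generate them (and the matching uid(ctxN) placeholders) in a loop. A minimal sketch, using the `context.id` predicate from this thread; `contextQueryBlocks` is a hypothetical helper:

```go
package main

import (
	"fmt"
	"strings"
)

// contextQueryBlocks generates one query block per context ID so that
// uid(ctx0), uid(ctx1), ... can be referenced from the mutation,
// however many contexts a search turns up.
func contextQueryBlocks(ids []string) string {
	var b strings.Builder
	for i, id := range ids {
		fmt.Fprintf(&b, "  ctx%d as c%d(func: eq(context.id, %q)) { uid }\n", i, i, id)
	}
	return b.String()
}

func main() {
	blocks := contextQueryBlocks([]string{"2318563-641318", "2318563-641328"})
	fmt.Print("query {\n" + blocks + "}\n")
}
```

The same loop index can then be used to set each Context struct's Uid field to "uid(ctxN)" before marshaling.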

Can you clarify which upsert operation you are doing? In Dgraph we have two types.

I copied the one in the Go example.

query {
    doc as p(func: eq(doc.id, "%d")) {
        uid
    }
}

Then I create a request and set that as the query, set the mutation to be the JSON of the node I want to upsert, and set the Uid of the node to uid(doc).
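Spelled out, that flow looks roughly like this. The struct is trimmed down, and the dgo calls in the trailing comment are the usual request shape rather than code from this thread:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Document mirrors the node being upserted. Setting Uid to "uid(doc)"
// makes the mutation reuse whatever node the query bound to "doc",
// instead of minting a new UID.
type Document struct {
	Uid   string   `json:"uid,omitempty"`
	DocID string   `json:"doc.id,omitempty"`
	DType []string `json:"dgraph.type,omitempty"`
}

// upsertPayload builds the query block and the SetJson payload for a
// single-document upsert.
func upsertPayload(docID string) (query string, setJSON []byte, err error) {
	query = fmt.Sprintf(`query { doc as p(func: eq(doc.id, %q)) { uid } }`, docID)
	d := Document{Uid: "uid(doc)", DocID: docID, DType: []string{"Document"}}
	setJSON, err = json.Marshal(d)
	return
}

func main() {
	q, js, err := upsertPayload("2318563")
	if err != nil {
		panic(err)
	}
	fmt.Println(q)
	fmt.Println(string(js))
	// With dgo:
	//   req := &api.Request{Query: q, Mutations: []*api.Mutation{{SetJson: js}}, CommitNow: true}
	//   _, err = dg.NewTxn().Do(ctx, req)
}
```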

My structs look like this. I add a Search that might have many Results and point to many Contexts, but the Contexts might be the same for multiple Searches:

type Search struct {
	Uid     string   `json:"uid,omitempty"`
	Query   string   `json:"querystring,omitempty"`
	Results []Result `json:"result,omitempty"`
	DType   []string `json:"dgraph.type,omitempty"`
}

type Result struct {
	Uid      string    `json:"uid,omitempty"`
	Document Document  `json:"document,omitempty"`
	Index    string    `json:"result.index,omitempty"`
	Contexts []Context `json:"result.context,omitempty"`
	DType    []string  `json:"dgraph.type,omitempty"`
}

type Context struct {
	Uid     string   `json:"uid,omitempty"`
	Context string   `json:"context.id,omitempty"`
	DType   []string `json:"dgraph.type,omitempty"`
}

Can you share the link from the example?

There you go, I didn’t use the fragment in the query block.
https://godoc.org/github.com/dgraph-io/dgo#example-Txn-Mutate-UpsertJSON

That example uses the upsert procedure; you should use the Upsert Block instead. Here is an example: dgo/upsert_test.go at a9ad93fe6ebda3ef813e9677ca9974fe8e15f296 · dgraph-io/dgo · GitHub

Thank you, I will check it out

I'm getting there. Could you tell me why, if the document is not found, the upsert doesn't create a new Result? It attaches the context to the existing one. Here is the upsert:

upsert {
  query {
    srch as r(func: eq(querystring, "(1efficiency~x)")) {
      uid
      result @cascade {
        ruid as uid
        document @filter(eq(doc.id, "2318563")) {
          uid
          doc.id
        }
      }
    }
    doc as p(func: eq(doc.id, "2318562")) { uid }
    ctx0 as c0(func: eq(context.id, "2318563-641318")) { uid }
    ctx1 as c1(func: eq(context.id, "2318563-641328")) { uid }
    ctx2 as c2(func: eq(context.id, "2318563-641722")) { uid }
  }

  mutation {
    set {
      uid(srch) <dgraph.type> "Search" .
      uid(srch) <querystring> "(1efficiency~x)" .
      uid(srch) <result> uid(ruid) .
      uid(ruid) <document> uid(doc) .
      uid(ruid) <dgraph.type> "Result" .
      uid(ruid) <result.context> uid(ctx0) .
      uid(ruid) <result.context> uid(ctx1) .
      uid(ruid) <result.context> uid(ctx2) .
      uid(doc) <dgraph.type> "Document" .
      uid(doc) <doc.id> "2318563" .
    }
  }
}

Here is the JSON response when the doc doesn't exist:
{
  "data": {
    "code": "Success",
    "message": "Done",
    "queries": {
      "r": [
        {
          "uid": "0x46f557"
        }
      ],
      "p": [
        {
          "uid": "0x46f55c"
        }
      ],
      "c0": [],
      "c1": [],
      "c2": []
    },
    "uids": {
      "uid(ctx0)": "0x46f560",
      "uid(ctx1)": "0x46f561",
      "uid(ctx2)": "0x46f562"
    }
  },
  "extensions": {
    "server_latency": {
      "parsing_ns": 59323,
      "processing_ns": 19719788,
      "encoding_ns": 8199,
      "assign_timestamp_ns": 565345,
      "total_ns": 20583389
    },
    "txn": {
      "start_ts": 3511372,
      "commit_ts": 3511373,
      "preds": [
        "1-dgraph.type",
        "1-doc.id",
        "1-document",
        "1-querystring",
        "1-result",
        "1-result.context"
      ]
    }
  }
}

And here is the JSON when it does exist:

{
  "data": {
    "code": "Success",
    "message": "Done",
    "queries": {
      "r": [
        {
          "uid": "0x46f557",
          "result": [
            {
              "uid": "0x46f559",
              "document": [
                {
                  "uid": "0x46f55c",
                  "doc.id": 2318563
                }
              ]
            }
          ]
        }
      ],
      "p": [
        {
          "uid": "0x46f55c"
        },
        {
          "uid": "0x46f565"
        }
      ],
      "c0": [
        {
          "uid": "0x46f55d"
        }
      ],
      "c1": [
        {
          "uid": "0x46f55e"
        }
      ],
      "c2": [
        {
          "uid": "0x46f55f"
        }
      ]
    },
    "uids": {}
  },
  "extensions": {
    "server_latency": {
      "parsing_ns": 46657,
      "processing_ns": 19766848,
      "encoding_ns": 17991,
      "assign_timestamp_ns": 392200,
      "total_ns": 20409347
    },
    "txn": {
      "start_ts": 3511412,
      "commit_ts": 3511413,
      "preds": [
        "1-dgraph.type",
        "1-doc.id",
        "1-document",
        "1-querystring",
        "1-result",
        "1-result.context"
      ]
    }
  }
}

Hi @deanroker123

The doc.id values do not match. The value in the filter is “2318563”, while the value in the variable block is “2318562”. Could you please review?

Sorry, I am actually doing the mutation in Go and hit the problem there, so I tried to recreate it in Ratel, but had the same problem.

Here you go.

Doc exists

upsert {
  query {
    srch as r(func: eq(querystring, "(1efficiency~x)")) {
      uid
      result @cascade {
        ruid as uid
        document @filter(eq(doc.id, "2318564")) {
          uid
          doc.id
        }
      }
    }
    doc as p(func: eq(doc.id, "2318564")) { uid }
    ctx0 as c0(func: eq(context.id, "2318564-641319")) { uid }
  }

  mutation {
    set {
      uid(srch) <dgraph.type> "Search" .
      uid(srch) <querystring> "(1efficiency~x)" .
      uid(srch) <result> uid(ruid) .
      uid(ruid) <document> uid(doc) .
      uid(ruid) <dgraph.type> "Result" .
      uid(ruid) <result.context> uid(ctx0) .
      uid(ctx0) <context.id> "2318564-641319" .
      uid(ctx0) <dgraph.type> "Context" .
      uid(doc) <dgraph.type> "Document" .
      uid(doc) <doc.id> "2318564" .
    }
  }
}

{
  "data": {
    "code": "Success",
    "message": "Done",
    "queries": {
      "r": [
        {
          "uid": "0x46f580",
          "result": [
            {
              "uid": "0x46f581",
              "document": [
                {
                  "uid": "0x46f584",
                  "doc.id": 2318564
                }
              ]
            }
          ]
        }
      ],
      "p": [
        {
          "uid": "0x46f584"
        }
      ],
      "c0": []
    },
    "uids": {
      "uid(ctx0)": "0x46f585"
    }
  },
  "extensions": {
    "server_latency": {
      "parsing_ns": 105384,
      "processing_ns": 19054504,
      "encoding_ns": 10784,
      "assign_timestamp_ns": 475125,
      "total_ns": 19844632
    },
    "txn": {
      "start_ts": 3512326,
      "commit_ts": 3512327,
      "preds": [
        "1-context.id",
        "1-dgraph.type",
        "1-doc.id",
        "1-document",
        "1-querystring",
        "1-result",
        "1-result.context"
      ]
    }
  }
}

New Doc

upsert {
  query {
    srch as r(func: eq(querystring, "(1efficiency~x)")) {
      uid
      result @cascade {
        ruid as uid
        document @filter(eq(doc.id, "2318566")) {
          uid
          doc.id
        }
      }
    }
    doc as p(func: eq(doc.id, "2318566")) { uid }
    ctx0 as c0(func: eq(context.id, "2318566-641319")) { uid }
  }

  mutation {
    set {
      uid(srch) <dgraph.type> "Search" .
      uid(srch) <querystring> "(1efficiency~x)" .
      uid(srch) <result> uid(ruid) .
      uid(ruid) <document> uid(doc) .
      uid(ruid) <dgraph.type> "Result" .
      uid(ruid) <result.context> uid(ctx0) .
      uid(ctx0) <context.id> "2318566-641319" .
      uid(ctx0) <dgraph.type> "Context" .
      uid(doc) <dgraph.type> "Document" .
      uid(doc) <doc.id> "2318566" .
    }
  }
}

{
  "data": {
    "code": "Success",
    "message": "Done",
    "queries": {
      "r": [
        {
          "uid": "0x46f580"
        }
      ],
      "p": [],
      "c0": []
    },
    "uids": {
      "uid(ctx0)": "0x46f587",
      "uid(doc)": "0x46f586"
    }
  },
  "extensions": {
    "server_latency": {
      "parsing_ns": 54644,
      "processing_ns": 19938648,
      "encoding_ns": 5351,
      "assign_timestamp_ns": 394100,
      "total_ns": 20573168
    },
    "txn": {
      "start_ts": 3512330,
      "commit_ts": 3512331,
      "preds": [
        "1-context.id",
        "1-dgraph.type",
        "1-doc.id",
        "1-document",
        "1-querystring",
        "1-result",
        "1-result.context"
      ]
    }
  }
}

Hi @deanroker123 ,
Could you please add @cascade to the “srch” query as well, as above, and check the behavior? Please let us know.

Nope, that created a new Search but still used the original Result.

upsert {
  query {
    srch as r(func: eq(querystring, "(1efficiency~x)")) @cascade {
      uid
      result @cascade {
        ruid as uid
        document @filter(eq(doc.id, "2318567")) {
          uid
          doc.id
        }
      }
    }
    doc as p(func: eq(doc.id, "2318567")) { uid }
    ctx0 as c0(func: eq(context.id, "2318567-641319")) { uid }
  }

  mutation {
    set {
      uid(srch) <dgraph.type> "Search" .
      uid(srch) <querystring> "(1efficiency~x)" .
      uid(srch) <result> uid(ruid) .
      uid(ruid) <document> uid(doc) .
      uid(ruid) <dgraph.type> "Result" .
      uid(ruid) <result.context> uid(ctx0) .
      uid(ctx0) <context.id> "2318567-641319" .
      uid(ctx0) <dgraph.type> "Context" .
      uid(doc) <dgraph.type> "Document" .
      uid(doc) <doc.id> "2318567" .
    }
  }
}

{
  "data": {
    "code": "Success",
    "message": "Done",
    "queries": {
      "r": [],
      "p": [],
      "c0": []
    },
    "uids": {
      "uid(ctx0)": "0x46f589",
      "uid(doc)": "0x46f588",
      "uid(srch)": "0x46f58a"
    }
  },
  "extensions": {
    "server_latency": {
      "parsing_ns": 40156,
      "processing_ns": 19167152,
      "encoding_ns": 3319,
      "assign_timestamp_ns": 452967,
      "total_ns": 19823447
    },
    "txn": {
      "start_ts": 3512426,
      "commit_ts": 3512427,
      "preds": [
        "1-context.id",
        "1-dgraph.type",
        "1-doc.id",
        "1-document",
        "1-querystring",
        "1-result",
        "1-result.context"
      ]
    }
  }
}

Please try as above (move the ruid to the result segment).

That works, thank you.

Is the other way I tried it a bug?


It’s not a bug. We allow the cascade to operate first and then collect the UIDs. This is an area of the documentation that we need to improve.

Thanks for your help!

Glad to be of help! Have a great day.

One more question, about insert performance. I think I am going to have about 3.5 million Contexts when I'm finished loading all the data in. The more items I add, the slower the inserts get. I am guessing it's because Dgraph does a lookup for each of the Contexts to see if it exists, so the more Contexts there are, the longer it takes.

Is there a better way of doing this?