How can I build xidmap

relunctance · December 7, 2018, 4:55am

dgraph live --help

  dgraph live [flags]
  -x, --xidmap string            Directory to store xid to uid mapping

I need to write incremental data every day, so I need an xidmap to avoid creating duplicate nodes.
But how can I export the xidmap from an existing database?

relunctance · December 7, 2018, 6:52am

I have successfully used badger to export the UIDs from the Dgraph database.

Is there a tool to export the xidmap directly?

The code:

package main

import (
	"encoding/binary"
	"fmt"

	"github.com/dgraph-io/badger"
)

func main() {
	opts := badger.DefaultOptions
	opts.Dir = "/dgraph/p" // Directory to store posting lists.
	opts.ValueDir = "/dgraph/p"
	db, err := badger.Open(opts)
	if err != nil {
		panic(err)
	}
	defer db.Close()
	err = ForeachUids(db)
	if err != nil {
		panic(err)
	}
}

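// ForeachUids iterates over every key in the store and prints each key
// whose value decodes entirely as a single uvarint, treating it as a UID.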
func ForeachUids(db *badger.DB) error {
	err := db.View(func(txn *badger.Txn) error {
		opts := badger.DefaultIteratorOptions
		opts.PrefetchSize = 100
		it := txn.NewIterator(opts)
		defer it.Close()
		for it.Rewind(); it.Valid(); it.Next() {
			item := it.Item()
			k := item.Key()
			err := item.Value(func(v []byte) error {
				uid, n := binary.Uvarint(v) // decode the value as a uvarint
				if n > 0 && n == len(v) { // skip empty values and values that are not exactly one uvarint
					fmt.Printf("key=[%s] , uid: 0x%x\n", k, uid)
				}
				return nil
			})
			if err != nil {
				return err
			}
		}
		return nil
	})
	return err
}

MichelDiz · December 7, 2018, 8:43pm

Dgraph doesn’t use XIDs. It just creates them during a load. The --xidmap flag is only useful for setting up a temporary folder path.

However, you can use external IDs: https://docs.dgraph.io/mutations/#external-ids

relunctance · December 10, 2018, 4:03am

Hey guys,

External IDs can’t solve my current problem.

I use Dgraph to store file parent-process relationships, keyed by file md5.

The first live load works fine, but the next live load does not automatically recognize the existing UIDs. When the next batch of data is loaded, a new UID is created for the same subject.
That is not what I expected.

My scenario is that every day new file parent-process data needs to be written,
so I need the existing UID to be identified automatically for the same subject.

Do you have any good suggestions for me?

For example:

The first day:

<0000710bf0bdf4394113147bf904da3c> <md5> "0000710bf0bdf4394113147bf904da3c" .
<332feab1435662fc6c672e25beb37be3> <md5> "332feab1435662fc6c672e25beb37be3" .
<0000710bf0bdf4394113147bf904da3c> <pmd5> <332feab1435662fc6c672e25beb37be3> .

The next day:

<0000710bf0bdf4394113147bf904da3c> <pmd5> <c6fa526514b961b5b8a9585d1eff5f9d> .

I want “0000710bf0bdf4394113147bf904da3c” to keep its existing UID rather than get a new one, and “c6fa526514b961b5b8a9585d1eff5f9d” to get a new UID automatically.

MichelDiz · December 10, 2018, 1:40pm

I believe you should write a program in Python or Go that uses the upsert procedure. The live and bulk loaders will not do this for you.

https://docs.dgraph.io/howto/#upsert-procedure
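A minimal sketch of the upsert idea in Go, assuming nothing about the real Dgraph client: the `Store` interface, `memStore`, and `Upsert` below are hypothetical stand-ins that only make the logic runnable. Against a real cluster, `Lookup` would be a query on an indexed `md5` predicate and `Create` a mutation, both run inside the same transaction.

```go
package main

import "fmt"

// Store is a hypothetical stand-in for a Dgraph transaction.
// Lookup models "query by md5"; Create models "mutate a new node".
type Store interface {
	Lookup(md5 string) (uid uint64, found bool)
	Create(md5 string) uint64
}

// memStore is an in-memory Store used only to make this sketch runnable.
type memStore struct {
	uids map[string]uint64
	next uint64
}

func newMemStore() *memStore {
	return &memStore{uids: make(map[string]uint64), next: 1}
}

func (s *memStore) Lookup(md5 string) (uint64, bool) {
	uid, ok := s.uids[md5]
	return uid, ok
}

func (s *memStore) Create(md5 string) uint64 {
	uid := s.next
	s.next++
	s.uids[md5] = uid
	return uid
}

// Upsert returns the existing UID for md5, or creates a node if none exists.
func Upsert(s Store, md5 string) uint64 {
	if uid, ok := s.Lookup(md5); ok {
		return uid
	}
	return s.Create(md5)
}

func main() {
	s := newMemStore()
	day1 := []string{"0000710bf0bdf4394113147bf904da3c", "332feab1435662fc6c672e25beb37be3"}
	day2 := []string{"0000710bf0bdf4394113147bf904da3c", "c6fa526514b961b5b8a9585d1eff5f9d"}
	for _, batch := range [][]string{day1, day2} {
		for _, md5 := range batch {
			fmt.Printf("%s -> 0x%x\n", md5, Upsert(s, md5))
		}
	}
}
```

On the second batch, the md5 already seen on the first day keeps its UID, and only the new md5 is assigned a fresh one.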

relunctance · December 11, 2018, 3:16am

Thanks for your reply.

But now I need to maintain a third-party database to store the md5 -> uid map.

For incremental data scenarios like this, can the bulk loader support it in the future?

MichelDiz · December 14, 2018, 2:39pm

I don’t believe so. The loaders are made just for loading RDF. They have no other function.
To introduce this as a loader feature, it would be necessary to settle on something universally accepted. But that can only come from the community.

yeahvip · November 21, 2019, 8:56am

I want to use dgraph live to load new data incrementally because upsert may be slower. But how can I build the mapping between uid and xid? Can I give every triple a unique identifier to replace the uid-to-xid mapping?

MichelDiz · November 21, 2019, 2:51pm

Hey @yeahvip,

When you have a question, please open a new topic and reference other topics instead of commenting on them. Writing in an old topic can trigger emails to the people involved, and not everyone likes to receive emails about old subjects.

About your question.

XID mapping only exists for entities. Consequently, edges that belong to an entity must contain its blank node in order to be mapped.

Blank nodes (unique identifiers) are used in that case. See Quickstart - Dgraph.

In general, Dgraph does not use XIDs, only UIDs. Dgraph handles this internally, and it is not open to user manipulation.

If you have a URI, URL, UUID, GUID, BIC, UDID, SSID, NPI, shortuuid, Snowflake, MongoID, etc., you must use the approach described in Mutation - Dgraph.

Cheers.
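To illustrate the external-ID approach mentioned above, here is a small Go sketch that turns an md5 child/parent pair into N-Quads, using the md5 as the blank-node label and also storing it in an `xid` predicate. The function name `toNQuads` and the `xid` predicate name are assumptions for illustration; `pmd5` is borrowed from the example earlier in the thread.

```go
package main

import (
	"fmt"
	"strings"
)

// toNQuads renders one child -> parent edge as N-Quads.
// Both ids are assumed to be hex md5 strings, so no escaping is applied.
func toNQuads(child, parent string) string {
	var b strings.Builder
	// Store each external id as a value so a later query can find the node.
	for _, id := range []string{child, parent} {
		fmt.Fprintf(&b, "_:%s <xid> \"%s\" .\n", id, id)
	}
	// Link the two blank nodes with the pmd5 edge.
	fmt.Fprintf(&b, "_:%s <pmd5> _:%s .\n", child, parent)
	return b.String()
}

func main() {
	fmt.Print(toNQuads(
		"0000710bf0bdf4394113147bf904da3c",
		"c6fa526514b961b5b8a9585d1eff5f9d",
	))
}
```

Whether a loader reuses UIDs for these labels across separate runs is exactly the question this thread raises; the stored `xid` value is what lets a later upsert query find an existing node.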
