"func: type()" error in 20.11.03 or later

OK Michel!

Please forget about leasing UIDs. I will post another issue about that.

Let’s just talk about the type function in this post.

I just tried the blank node method to create nodes, like you did:

        rdf += f'''<_:{person_uid}> <dgraph.type> "person" .\n'''
        rdf += f'''<_:{person_uid}> <xid> "{person_xid}" .\n'''
        rdf += f'''<_:{person_uid}> <person_name> "{name}" .\n'''

The result is still the same in Dgraph version 21.12.0.



Please make sure you are using Dgraph v21.12.0

As @amaster507 said

This seems to describe what you are seeing:

Forbid Massive Fan-outs

Certain keys in the graph suffer from a massive fan-out problem. These keys are typically index keys. For example, a certain string value might be a default value set to all the nodes in the graph. A reverse index on this value could point to millions of nodes in the graph, hence creating huge posting lists. Dgraph would split such a posting list across multiple keys, so as not to exceed Badger’s value limits and also allow partial reads of this index key.
A typical key like this would have dozens of splits. We noticed, however, that some keys have thousands of splits – that’s possible when the fan-out is in billions of nodes. A query using this key would be slow at best and would crash the system at worst by causing a massive memory consumption or a massive CPU spike.
In v21.12, we have added a flag to forbid any key which has greater than 1000 splits by marking it forbidden in Badger. Once a key is forbidden, that key would continue to drop data and would always return an empty result.
Almost all backends we have seen (in Dgraph Cloud) are not affected by this change. But, in the rare case that a user is affected, rewriting the query to use another key for the root function would fix the issue. We think this small downside is worth the upside of keeping the system stable and performant.
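To make the quoted rule concrete, here is a minimal toy model in Python. It is not Dgraph's implementation: the class, the `uids_per_split` capacity, and the moment at which the check runs are all assumptions made for illustration. Only the limit of 1000 splits and the "forbidden key returns an empty result" behavior come from the notes above.

```python
from dataclasses import dataclass, field

MAX_SPLITS = 1000  # limit quoted in the v21.12 notes above


@dataclass
class ToyPostingList:
    """Toy stand-in for one index key; not Dgraph's real data structure."""

    uids_per_split: int  # assumed capacity of one split (hypothetical value)
    uids: list = field(default_factory=list)
    forbidden: bool = False

    def splits(self) -> int:
        # Number of parts needed to hold the current UIDs (ceiling division).
        return -(-len(self.uids) // self.uids_per_split)

    def add(self, uid: int) -> None:
        if self.forbidden:
            return  # a forbidden key keeps dropping data
        self.uids.append(uid)
        if self.splits() > MAX_SPLITS:
            # Too many splits: mark the key forbidden and drop its contents.
            self.forbidden = True
            self.uids.clear()

    def read(self) -> list:
        # A forbidden key always returns an empty result.
        return [] if self.forbidden else list(self.uids)
```

In this model, an index key such as the one behind type(person) keeps answering normally until the split count crosses the limit, and from then on every read is empty, which would look like a count that suddenly becomes 0.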

Yes, I’ve copied your command from your comment. I did not change anything related to this.

Check if you are starting from scratch please.

I have run this same test with your code about 6 times, 3 of them with blank nodes instead of the generated UIDs. Only with blank nodes did I see no issue.

Check the Docker volumes, the bind-mounted path, and so on. Also check whether there is a dangling Zero instance somewhere.

So you also get the type function error when you use fixed UIDs?

This one I always get 500 k nodes.

This too 500k

This one returns zero results. However, I did see it return some results with the original code.

Our application has leased UIDs and then manually assigned data to only those leased nodes for quite a long time,

and

it worked normally before 20.11.0 (our application has about 200 million nodes in total, and type queries were also fine in previous versions).

Did Dgraph change the processing logic for leased UIDs later?

Our post is a bit too long, do you think we need to open a new post to sort out the problems we encountered?

Nope, UID leasing has been about the same for years. The only change we made, long ago, was adding a feature in the API to lease UIDs manually.

If you found new issues unrelated to this, sure, a new post is good. Even if there is some correlation. If it is a new thing, a new post is welcome.

Hi Michel!

I just ran the code again!

Leasing UIDs just makes the problem show up sooner. Once enough nodes are written, the error occurs even with blank nodes (in my case, at around 5570000?).

You can increase the batch_size parameter, like this:

batch_size = 10000
for i in range(5000):
    rdf = ""
    for j in range(batch_size):
        # create data
        name = faker.name()

Here is the full code:

import pydgraph
import hashlib
import time
import requests
from faker import Faker



client_stub = pydgraph.DgraphClientStub('192.168.171.77:9080')
client = pydgraph.DgraphClient(client_stub)

# lease_uid
# /assign?what=uids&num=100 allocates a range of UIDs specified by the num argument, 
# and returns a JSON map containing the startId and endId that defines the range of UIDs (inclusive). 
# This UID range can be safely assigned externally to new nodes during data ingestion.
# See docs in https://dgraph.io/docs/deploy/dgraph-zero/


url = "http://192.168.171.77:6080/assign?what=uids&num=1000000000000000000"

payload={}
headers = {}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)

# convert xid to a Dgraph UID, intended to stay within the range we just leased (assign?what=uids&num=1000000000000000000)
def xid2uid(xid):
    uid = "0x" + hashlib.md5(xid.encode(encoding='utf-8')).hexdigest()[8:23]
    return uid

faker = Faker('en_US')


batch_size = 10000
for i in range(5000):
    rdf = ""
    for j in range(batch_size):
        # create data
        name = faker.name()

        person_xid = f"person_{i*batch_size + j}"
        person_uid = xid2uid(person_xid)
        # rdf += f'''<{person_uid}> <dgraph.type> "person" .\n'''
        # rdf += f'''<{person_uid}> <xid> "{person_xid}" .\n'''
        # rdf += f'''<{person_uid}> <person_name> "{name}" .\n'''
        # use blank nodes, as you prefer
        rdf += f'''<_:{person_uid}> <dgraph.type> "person" .\n'''
        rdf += f'''<_:{person_uid}> <xid> "{person_xid}" .\n'''
        rdf += f'''<_:{person_uid}> <person_name> "{name}" .\n'''
        
    txn = client.txn()
    # Running a Mutation
    txn.mutate(set_nquads=rdf)
    txn.commit()
    
    print(f"Finish mutation {i * batch_size + j + 1} nodes")

For sanity's sake, I recommend that you change this UID procedure to use decimal literal notation instead of hex. I’m pretty sure you won’t have new issues doing so.

PS. I’ll check the new code tomorrow.

Quick question. Is there a special reason why you have to control the creation of UIDs manually?
Have you tried using Liveloader’s Upsert XID option?

See you tomorrow,

Cheers.
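The suggestion above to use decimal UIDs instead of hex could be sketched roughly as follows. `xid2uid_decimal` and `LEASED_MAX_UID` are hypothetical names, not part of the original script, and the sketch assumes the lease covers UIDs 1 through 10^18. One side observation: 15 hex digits can encode values up to 16^15 - 1 (about 1.15 * 10^18), which is slightly above the leased 10^18, so this version also folds the value back into the leased range.

```python
import hashlib

# Assumption: the /assign call in the script leased UIDs 1..10**18.
LEASED_MAX_UID = 1_000_000_000_000_000_000


def xid2uid_decimal(xid: str, max_uid: int = LEASED_MAX_UID) -> str:
    """Map an external ID to a decimal UID string in [1, max_uid].

    Hypothetical variant of xid2uid from the script above.
    """
    digest = hashlib.md5(xid.encode("utf-8")).hexdigest()
    value = int(digest[8:23], 16)    # same 15 hex digits as the original
    return str(value % max_uid + 1)  # fold into the leased range; never 0
```

It would be used the same way as before, for example `rdf += f'<{xid2uid_decimal(person_xid)}> <dgraph.type> "person" .\n'`. Folding with a modulo slightly raises the chance of two xids colliding, so treat this as a starting point rather than a drop-in replacement.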

This is about the type function.

I’ve simplified the test script a bit so you don’t get confused:

import pydgraph
import hashlib
import time
import requests
from faker import Faker



client_stub = pydgraph.DgraphClientStub('192.168.171.77:9080')
client = pydgraph.DgraphClient(client_stub)

# lease_uid
# /assign?what=uids&num=100 allocates a range of UIDs specified by the num argument, 
# and returns a JSON map containing the startId and endId that defines the range of UIDs (inclusive). 
# This UID range can be safely assigned externally to new nodes during data ingestion.
# See docs in https://dgraph.io/docs/deploy/dgraph-zero/


# url = "http://192.168.171.77:6080/assign?what=uids&num=1000000000000000000"

# payload={}
# headers = {}

# response = requests.request("GET", url, headers=headers, data=payload)

# print(response.text)

# convert xid to a Dgraph UID, intended to stay within the range we just leased (assign?what=uids&num=1000000000000000000)
# def xid2uid(xid):
#     uid = "0x" + hashlib.md5(xid.encode(encoding='utf-8')).hexdigest()[8:23]
#     return uid

faker = Faker('en_US')


batch_size = 10000
for i in range(5000):
    rdf = ""
    for j in range(batch_size):
        # create data
        name = faker.name()

        person_xid = f"person_{i*batch_size + j}"
        # person_uid = xid2uid(person_xid)
        # rdf += f'''<{person_uid}> <dgraph.type> "person" .\n'''
        # rdf += f'''<{person_uid}> <xid> "{person_xid}" .\n'''
        # rdf += f'''<{person_uid}> <person_name> "{name}" .\n'''
        # rdf += f'''<_:{person_uid}> <dgraph.type> "person" .\n'''
        # rdf += f'''<_:{person_uid}> <xid> "{person_xid}" .\n'''
        # rdf += f'''<_:{person_uid}> <person_name> "{name}" .\n'''
        # That's fine, Michel!
        # We will not use hex in this test
        rdf += f'''<_:{person_xid}> <dgraph.type> "person" .\n'''
        rdf += f'''<_:{person_xid}> <xid> "{person_xid}" .\n'''
        rdf += f'''<_:{person_xid}> <person_name> "{name}" .\n'''
        
    txn = client.txn()
    # Running a Mutation
    txn.mutate(set_nquads=rdf)
    txn.commit()
    if i % 10 == 0:
        print(f"Finish mutation {i * batch_size + j + 1} nodes")

Everything works fine when the new test script starts running,
but once it reaches 4,310,000 nodes, the type query result becomes 0.


[image]

So we end up with this paradoxical example.

@MichelDiz
Please run this demo !

I believe this is a very serious issue and look forward to your reply.

Cheers.

I will. But note that even if I find something, you won’t see a fix in the short to mid term. Especially since your case seems to be unique and very specific.

Hi Michel! @MichelDiz

First of all thank you for your attention to this issue!

Our application may be “unique and very specific”,

but the problem I’m reporting is easy to trigger and is a serious bug (nodes of a specific type can’t be found, and counts by node type are wrong).

Please be sure to run the simplified script. I have removed all the parts that might confuse you. Even with this standard blank node insertion method, the problem is triggered once the number of nodes of a certain type reaches a certain threshold.

I hope you can recognize the seriousness of this problem in time.

Thanks again for following up on this issue!

Hi @purist180, I had the same question as you. @MichelDiz, in my application I need to count the number of entities of a certain type. But we hit the same bug: func: type(person) returns 0 in some cases. How can I count the number of entities of a certain type in a reliable way?

The only other way is to find a predicate that all nodes of that type have (which may not be applicable in your dataset) and then get all nodes that have that predicate using the has(...) function.
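As a rough sketch of that workaround with pydgraph: the helper names `build_count_query` and `count_nodes` are made up for illustration, `client` is assumed to be an already connected `pydgraph.DgraphClient`, and the predicate name is whatever all nodes of the type share in your data.

```python
import json


def build_count_query(root_func: str) -> str:
    """Return a DQL query that counts the nodes matched by root_func."""
    return "{ q(func: %s) { count(uid) } }" % root_func


def count_nodes(client, root_func: str) -> int:
    """Run the count query on a pydgraph.DgraphClient and return the count."""
    txn = client.txn(read_only=True)
    try:
        response = txn.query(build_count_query(root_func))
    finally:
        txn.discard()
    rows = json.loads(response.json).get("q", [])
    # An empty block is treated as a count of zero.
    return rows[0]["count"] if rows else 0
```

Calling `count_nodes(client, "has(person_name)")` and `count_nodes(client, "type(person)")` side by side would also show whether the two counts have diverged.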

This limitation on the type function needs to be rethought and resolved.

In practical applications, we found that using type(person){count(uid)} is much more efficient than has(person_name){count(uid)}.

Our application is still using Dgraph version 20.11.0,
but we still hope that the Dgraph team @core-devs can notice this problem as soon as possible.

After all, the type function is still a relatively important and basic function in DQL.

@MichelDiz Has there been any update on this? My team is encountering the same issue on v21.12 after a few million nodes have been inserted with the predicate “dgraph.type” set to “Person”.

A query for all nodes with type(Person), such as this:

{
  q1(func: type(Person)) {
    count(uid)
  }
}

returns 0 nodes.

But when I query the data using has(nodeSpecificPredicate), all of the data is available:

{
  q0(func: has(Person.lastname)) {
    count(uid)
  }
}

{
  q0(func: has(Person.lastname)) {
    uid
    dgraph.type
    Person.firstname
    Person.lastname
  }
}

The dgraph.type is correctly set to “Person”, so it is only an issue with querying; the data is not lost.

Have you tested the situation where one node points to many nodes (maybe 5,000,000) through the same predicate? I found that the relationship cannot be found. I tested one node pointing to many nodes, and in this case the relationship was discarded. How, then, do I store data in which one node is related to many nodes?

Here is the script that generates the data I tested:

with open("property.rdf","w",encoding="utf-8") as ff:
    p = f"<usa> <countryname> \"USA\"  . \n"
    ff.write(p)
    for i in range(7000000):
        p = f"<tom-{i}> <username> \"tom-{i}\" . \n"
        ff.write(p)
       
        p = f"<usa> <Torelated>  <tom-{i}> . \n"
        ff.write(p)

We’re able to replicate this as well. Migrating from 20.11.0 to 21.12.0 dropped all the results of the type index. We could still read dgraph.type on the objects, but the eq and type functions yield nothing.

This is exactly what caused our problem, at least. I am really surprised that nothing is visible in the logs for such an important operation.

Since we’re running a fork, we ended up commenting out the whole thing (increasing the limit to billions doesn’t work; that is probably the bug, but we don’t have time to investigate more).

The exact change that solved the problem is:

if l.forbid || len(out.parts) > MaxSplits {
	glog.Infoln(`List is too big to split. Deleting the list.`)
	/*
		var kvs []*bpb.KV
		kv := &bpb.KV{
			Key:      alloc.Copy(l.key),
			Value:    nil,
			UserMeta: []byte{BitForbidPosting},
			Version:  out.newMinTs + 1,
		}
		kvs = append(kvs, kv)

		// Send deletion for the parts.
		delKvs, err := deletionKvs(true)
		if err != nil {
			return nil, err
		}
		kvs = append(kvs, delKvs...)
		return kvs, nil
	*/
	glog.Infoln(`But we're not doing this because its still broken.`)
}

This is in func (l *List) Rollup(alloc *z.Allocator) ([]*bpb.KV, error) in the file posting/list.go, around line 918.