Schema building taking too much time

Hi, we are using Dgraph v1.0.17 with around 1.4 million vertices. The problem we are facing is that whenever we do a bulk load or change the schema, Dgraph becomes unusable for 30-45 minutes. We believe Dgraph is busy rebuilding the schema during that time. Is this expected behavior, or are we doing something wrong?

We see this issue both locally, with a single Dgraph Zero and a single Dgraph Alpha, and on GKE with 3 replicas of each.

Would really appreciate any help in this regard.

Thanks

Hi, index building takes time when you change the schema. However, we have added background indexing in the latest version, which lets you query Dgraph while the index is being rebuilt in the background. You can get the latest version from [Release Dgraph v20.03.1 · dgraph-io/dgraph · GitHub].
Also, even without background indexing it shouldn't take that much time. Can you please share your schema here so we can test it?
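For readers who want to try the background indexing mentioned above, here is a minimal sketch of how such a schema change could be submitted over HTTP. It assumes Dgraph v20.03 or later, where (as far as I understand the documentation) the Alpha `/alter` endpoint accepts a `runInBackground=true` query parameter. The address `http://localhost:8080` and the helper name `build_alter_request` are assumptions for illustration, not something stated in this thread.

```python
import urllib.parse
import urllib.request


def build_alter_request(schema, alpha_http="http://localhost:8080", background=True):
    """Build (but do not send) a POST request to Alpha's /alter endpoint.

    alpha_http is an assumed address; adjust it to your deployment.
    """
    url = f"{alpha_http}/alter"
    if background:
        # Assumption: v20.03+ reads this flag to index in the background.
        url += "?" + urllib.parse.urlencode({"runInBackground": "true"})
    return urllib.request.Request(url, data=schema.encode("utf-8"), method="POST")


if __name__ == "__main__":
    req = build_alter_request("email: string @index(exact) .")
    print(req.get_method(), req.full_url)
    # To actually apply the change against a running Alpha, uncomment:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode())
```

The request is only built, not sent, so the snippet can be inspected safely before pointing it at a real cluster.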

Hi Jatin, thanks a lot for taking the time to answer my question.

We are planning to upgrade to Dgraph v1.2.2, but it may take some time. Also, I was wondering why there was a sudden jump from v1.2 to v20.x. I looked for the reason but couldn't find anything.

Thanks

Yeah, we are using calendar versioning now, so v20.x is effectively the Dgraph v2.0 release.
Also, it would help us look into the issue in more detail if you could share your schema with us.
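To make the calendar-versioning point concrete, here is a small sketch that splits a tag such as `v20.03.1` into its parts. It assumes a `vYY.MM.patch` layout (two-digit year, two-digit month, then a patch number); the thread only says "calendar versioning", so the exact layout and the helper name `parse_calver` are assumptions.

```python
import re


def parse_calver(tag):
    """Split a tag like 'v20.03.1' into year, month and patch.

    Assumes a vYY.MM.patch layout; older semantic-version tags such as
    'v1.2.2' do not match and raise ValueError.
    """
    match = re.fullmatch(r"v(\d{2})\.(\d{2})\.(\d+)", tag)
    if match is None:
        raise ValueError(f"not a vYY.MM.patch tag: {tag!r}")
    yy, mm, patch = (int(part) for part in match.groups())
    return {"year": 2000 + yy, "month": mm, "patch": patch}


if __name__ == "__main__":
    print(parse_calver("v20.03.1"))
```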

gpc:uid . 
bvid:string @index(exact) . 
code:string . 
gsid:string @index(exact) . 
name:string @index(fulltext) @lang . 
type:string @index(exact) . 
depth:float . 
email:string @index(exact) . 
range:uid . 
width:float . 
bricks:uid @reverse . 
height:float . 
images:uid @reverse . 
source:uid . 
values:uid @reverse . 
classes:uid @reverse . 
comment:string . 
product:uid @reverse . 
rangeId:string . 
subType:string @index(exact) . 
version:default . 
dataType:string . 
families:uid @reverse . 
googleId:string @index(exact) . 
rootType:string @index(exact) . 
createdAt:int @index(int) . 
updatedAt:int @index(int) . 
attributes:uid @reverse . 
codeValues:uid . 
dgraph.xid:string @index(exact) . 
properties:uid . 
subclasses:uid . 
description:default . 
smartSearch:uid . 
dgraph.password:password . 
dgraph.group.acl:string . 
dgraph.user.group:uid @reverse . 
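Since the discussion is about index building, it can help to see which predicates in a schema like the one above carry an `@index` or `@reverse` directive. The following is a rough sketch that scans schema text line by line. It is a simple regex reader written for this illustration, not Dgraph's own schema parser, and it says nothing about which indexes Dgraph actually rebuilds for a given change.

```python
import re

# One schema line looks like:  name:string @index(fulltext) @lang .
# Angle brackets around the predicate name are optional.
LINE_RE = re.compile(r"^\s*<?([^\s:<>]+)>?\s*:\s*(\S+)\s*(.*?)\s*\.\s*$")
DIRECTIVE_RE = re.compile(r"@(index\([^)]*\)|reverse)")


def indexed_predicates(schema_text):
    """Map predicate name -> list of @index(...)/@reverse directives found."""
    result = {}
    for line in schema_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip blank or unrecognised lines
        name, _type, directives = match.groups()
        found = DIRECTIVE_RE.findall(directives)
        if found:
            result[name] = found
    return result


# A few lines copied from the schema posted in this thread.
SAMPLE = """\
gpc:uid .
bvid:string @index(exact) .
name:string @index(fulltext) @lang .
bricks:uid @reverse .
createdAt:int @index(int) .
comment:string .
"""

if __name__ == "__main__":
    for predicate, directives in indexed_predicates(SAMPLE).items():
        print(predicate, directives)
```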

Thanks, muhammadbilal. It would help us to know which schema changes you were making, or does this problem occur for every type of schema change? Let us know if you face the same issue after installing the new version.

Hi Jatin, this occurs every time we do a bulk load and also whenever we make any schema change. For example, we recently added just this:

<volumetricWeight>: float .

Hi muhammadbilal, we have a movie dataset with 1 million triples.
On v1.0.17, that dataset took 2m28.530383445s to load using the live loader, and schema changes were also quick.

You can download it using:
wget "https://github.com/dgraph-io/tutorial/blob/master/resources/1million.rdf.gz?raw=true" -O 1million.rdf.gz -q

After that, load it using the live loader:
dgraph live -r 1million.rdf.gz

Please report the time it takes to load and to apply a schema change, along with any other issues you see.
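One way to collect the timing requested above is to wrap the command in a small script that measures wall-clock time. This is only a sketch: the helper name `time_command` is made up for illustration, and the command to time is passed on the command line rather than hard-coded.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd, check=False)
    return completed.returncode, time.perf_counter() - start


if __name__ == "__main__":
    # Example: python time_load.py dgraph live -r 1million.rdf.gz
    if len(sys.argv) < 2:
        sys.exit("usage: time_load.py <command> [args...]")
    code, seconds = time_command(sys.argv[1:])
    print(f"exit code {code}, elapsed {seconds:.1f}s")
```

The same wrapper can be used around whatever command applies a schema change, so both numbers are measured the same way.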

Yes Jatin, this is really fast; we have tested it many times in the past. However, our data is bigger: our most recent backup has 10.6 million triples.
Even the bulk loader takes more than 20 minutes to load our data.
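As a rough sanity check on the numbers in this thread, the sketch below scales the reported live-loader time for 1 million triples up to 10.6 million. It assumes load time grows linearly with the number of triples, which is only an approximation, and it compares the live loader with the bulk loader, which are different tools. Treat the result as a ballpark figure, not a benchmark.

```python
import re


def duration_seconds(text):
    """Convert a Go-style duration such as '2m28.530383445s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+(?:\.\d+)?)s)?", text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


def projected_seconds(sample_triples, sample_seconds, target_triples):
    """Linear extrapolation: assumes constant triples-per-second throughput."""
    return sample_seconds * target_triples / sample_triples


if __name__ == "__main__":
    sample = duration_seconds("2m28.530383445s")  # figure reported above
    estimate = projected_seconds(1_000_000, sample, 10_600_000)
    print(f"throughput: {1_000_000 / sample:.0f} triples/s")
    print(f"projected load time: {estimate / 60:.1f} minutes")
```

Under that linear assumption the projection comes out at about 26 minutes, so a load time above 20 minutes for 10.6 million triples is in the same range as the 1 million triple result.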

Yeah, got it. Installing the new version should solve the problem. Thanks for your input.