Very slow schema mutation on empty database

I have a reasonably large schema: 185 predicates, 22 types, and a fair number of indexes. Installing it takes over 60 minutes on a fairly beefy workstation, though. In SQL this would return instantly, so I’m wondering if there is an issue.

I understand that updating many indexes would be pretty intensive, but when there is no associated data, it doesn’t seem to make as much sense.

I’d prefer not to paste my schema here since it is for an internal app, but I’m happy to send it privately to anyone who is interested in loading it.

Is that amount of time to be expected?

FYI, I’m using Dgraph version:

Dgraph version   : v20.03.1
Dgraph SHA-256   : 6a40b1e084205ae9e29336780b3458a3869db45c0b96b916190881c16d705ba8
Commit SHA-1     : c201611d6
Commit timestamp : 2020-04-24 13:53:41 -0700
Branch           : HEAD
Go version       : go1.14.1

60 minutes sounds way too slow, especially for a fresh database with no data.

You can set the Alter request to run indexing in the background: https://dgraph.io/docs/query-language/#indexes-in-background
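
For reference, a minimal sketch of what that could look like over HTTP. It assumes Alpha is listening on localhost:8080 and that the `runInBackground=true` query parameter on `/alter` behaves as described in the linked docs. `ALPHA_ADDR`, `alter_url` and `alter_in_background` are just names made up for this example.

```shell
#!/bin/sh
# Assumption: Alpha's HTTP port is the default localhost:8080 used in this thread.
ALPHA_ADDR="${ALPHA_ADDR:-localhost:8080}"

# Build the /alter URL with background indexing requested.
alter_url() {
    printf 'http://%s/alter?runInBackground=true\n' "$1"
}

# POST a schema file to Alpha. With background indexing the request
# should return before the indexes have finished building.
alter_in_background() {
    curl -sS -X POST "$(alter_url "$ALPHA_ADDR")" --data-binary "@$1"
}
```

After sourcing it, `alter_in_background ~/schema.txt` would send the schema. Nothing is sent until the function is called.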

If you can share your schema, please do. Feel free to DM me.

Just sent a DM. Thanks.

Do you know if running indexing in the background would make the new schema usable faster? I notice that while the schema is being updated, no new types or predicates are visible in Ratel. It just hangs at Refreshing Schema... until my request returns.

Queries that require the index would not be able to run until the index has been created. For instance, using the anyofterms function requires the term index to be created first. Other than that, queries and mutations can still be serviced while indexing is done in the background.

Can you also share your cluster setup? On my computer, the schema update finishes in about 20 seconds on a fresh cluster of v20.03.1.

Well I’m glad it’s hopefully just my setup.

I am just running three commands in three different shells:

dgraph zero
dgraph-ratel
dgraph alpha --lru_mb 2048 --zero localhost:5080

We had some DM conversation, but I wanted to update this thread with where we currently are:

When I add the schema using these three commands:

dgraph zero
dgraph alpha --lru_mb 2048 --zero localhost:5080
curl localhost:8080/alter -T ~/schema.txt

It takes over an hour to load the schema. Last night I tried it and I killed it after 90 minutes. Dgraph alpha consumes large amounts of CPU during this time.
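
In case anyone wants to compare timings across machines, here is a rough sketch of a timing wrapper. `elapsed_seconds` is a hypothetical helper name, and it relies on `date +%s`, which is available on Linux but is not guaranteed everywhere.

```shell
#!/bin/sh
# Run a command and print how many whole seconds it took.
# The command's own stdout is redirected to stderr so that only
# the elapsed time ends up on stdout.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >&2
    end=$(date +%s)
    echo $((end - start))
}
```

For example, `elapsed_seconds curl -sS localhost:8080/alter -T ~/schema.txt` would time the same alter request as above.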

I’m on Arch Linux with 12 cores and 64 GB of RAM. My brother has the exact same workstation and operating system, so I had him run it with the exact same dgraph executable. It finished in about 5 seconds.

While the schema is updating on my system, certain other applications will not run, while others seem to complete quickly. Python scripts run immediately. But another of my Go applications still hadn’t started after a few minutes, so I killed it. go build also takes forever. Alpha increases my system load from 4 to 30.
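
To put those load numbers in context, a quick sketch that divides the load average by the core count. A ratio well above 1 means more runnable tasks than cores. `load_per_core` and `current_load_per_core` are hypothetical helper names, and the second one assumes Linux (`/proc/loadavg` and `nproc`).

```shell
#!/bin/sh
# Divide a load average by a core count and print the ratio.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Linux-only: take the 1-minute load average from /proc/loadavg
# and the core count from nproc.
current_load_per_core() {
    load_per_core "$(cut -d ' ' -f 1 /proc/loadavg)" "$(nproc)"
}

# Figures from this thread: load 30 on 12 cores.
load_per_core 30 12   # prints 2.50
```

At the idle load of 4 the same calculation gives 0.33, so the machine goes from mostly idle to heavily oversubscribed while the schema loads.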

It seems to be an issue specific to my system, though I don’t have issues with any other applications. I’m going to update/reboot and report back.

I guess it was just something weird going on with my system. After an update and reboot, it now takes about 7 seconds.
