Dgraph Alpha Crashing when deleting large amount of data

aditya_shelar · May 9, 2024, 7:05am

We are using Dgraph v22.0.2, deployed with 3 Alpha nodes and 3 Zero nodes.

In this DB I have multiple projects, where Project is a type.
Each project has around 15 other types of data connected to it through edges.
I am using the following upsert to delete a project:

upsert {
    query {
        var(func: uid(0x12b603d)) {
            p_uid as uid
            p1: ~project@filter(type(Party)) {
                party_uid as uid
                ~party { pm_uid as uid }
                location {
                    loc_uid as uid
                    state { state_uid as uid }
                    country1 { count_uid as uid }
                }
            }
            p2: ~project@filter(NOT (type(Party))) { s_uid as uid }
        }
        var(func: uid(p_uid)) { Item as ~project@filter(type(Item)) }
        var(func: uid(Item)) @cascade { ~saleitem { sales_uid as uid } }
    }
    mutation {
        delete {
            uid(pm_uid) * * .
            uid(state_uid) * * .
            uid(count_uid) * * .
            uid(loc_uid) * * .
            uid(sales_uid) * * .
            uid(party_uid) * * .
            uid(s_uid) * * .
            uid(p_uid) * * .
        }
    }
}

When I run this query for multiple projects one after the other:

Memory consumption increases significantly.
The DB becomes unresponsive.
One of the Alpha nodes goes into a crash loop and does not recover until we delete all of the data from the infrastructure side.

Please let me know what we can do to solve this.

Damon · May 9, 2024, 3:38pm

First, analyze the performance of the query. Because DQL queries are declarative and composable, you can run each sub-element of the query separately, and Dgraph will report the time taken (processing_ns, in nanoseconds) along with a “metrics” section.
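
As a rough illustration of reading those numbers, here is a small Python sketch that pulls the timings and UID counts out of a query response. It assumes the response JSON carries `extensions.server_latency` and `extensions.metrics.num_uids`, as the HTTP `/query` endpoint returns them. The function name `summarize_extensions` and the `sample` values are made up for illustration and are not taken from the cluster in this thread.

```python
import json

def summarize_extensions(response: dict) -> dict:
    """Extract processing time (ms) and per-predicate UID counts from a response dict."""
    ext = response.get("extensions", {})
    latency = ext.get("server_latency", {})
    num_uids = ext.get("metrics", {}).get("num_uids", {})
    return {
        "processing_ms": latency.get("processing_ns", 0) / 1e6,
        "total_ms": latency.get("total_ns", 0) / 1e6,
        # Largest counts first, so the most expensive predicates stand out.
        "uids_by_predicate": sorted(
            num_uids.items(), key=lambda kv: kv[1], reverse=True
        ),
    }

# Made-up sample response, for illustration only.
sample = json.loads("""
{"data": {},
 "extensions": {
   "server_latency": {"parsing_ns": 120000,
                      "processing_ns": 850000000,
                      "total_ns": 852000000},
   "metrics": {"num_uids": {"~project": 420000,
                            "location": 35000,
                            "uid": 455001}}
 }}
""")

summary = summarize_extensions(sample)
print(summary)
```

Running each `var` block of the upsert as its own query and comparing the summaries should show which block dominates the time and the UID count.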

If one element of this query is very slow, or touches a large number of UIDs or properties, it could use a lot of memory. In particular, if the query is slow and you launch many queries in quick succession, the server will try to execute them in parallel, and each one uses memory. For example, if a query takes 1 second and you launch 100 per second, about 100 will be executing at any given instant.
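
The arithmetic behind that estimate is Little's law (average requests in flight = arrival rate × latency). A minimal sketch follows. The 50 MB per-query figure is purely a hypothetical placeholder, so measure your own before relying on the memory estimate.

```python
def estimated_in_flight(arrival_rate_per_s: float, latency_s: float) -> float:
    """Little's law: average number of requests executing at once."""
    return arrival_rate_per_s * latency_s

def estimated_memory_mb(arrival_rate_per_s: float,
                        latency_s: float,
                        mb_per_query: float) -> float:
    """Rough memory in use, assuming every in-flight query holds mb_per_query."""
    return estimated_in_flight(arrival_rate_per_s, latency_s) * mb_per_query

# The example from the text: 1 second per query, 100 queries per second.
in_flight = estimated_in_flight(100, 1.0)
# Hypothetical per-query footprint of 50 MB (an assumption, not a measurement).
memory_mb = estimated_memory_mb(100, 1.0, 50)

print(in_flight)   # 100.0
print(memory_mb)   # 5000.0
```

Either lowering the launch rate (for example, waiting for one project's delete to finish before starting the next) or making each query faster reduces the number in flight proportionally.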

Second, I would check the number of UIDs returned by each part of the query, to be sure the number of deletes done in a single transaction is manageable.
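
If the count turns out to be large, one option is to collect the UIDs with a plain query first and delete them in smaller transactions. The sketch below only shows the batching and the delete N-Quads it would produce. The helper names, the batch size, and the UIDs are hypothetical, and sending each batch to Dgraph (one transaction per batch) is left to whichever client you use.

```python
from typing import Iterator, List

def chunked(uids: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` UIDs."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(uids), size):
        yield uids[start:start + size]

def delete_nquads(uids: List[str]) -> str:
    """Build one `<uid> * * .` delete N-Quad per UID, newline separated."""
    return "\n".join(f"<{uid}> * * ." for uid in uids)

# Made-up UIDs 0x1..0x7, for illustration only.
uids = [hex(n) for n in range(1, 8)]

# A batch size of 3 keeps the example short; a real value needs tuning.
batches = [delete_nquads(batch) for batch in chunked(uids, 3)]

for i, nquads in enumerate(batches, start=1):
    print(f"-- batch {i} --")
    print(nquads)
```

Committing each batch before starting the next keeps any single transaction small, at the cost of the whole project delete no longer being atomic.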

Damon
