Dgraph Release Candidate v24.0.0-rc1 is now available

Hi Dgraph Community,

We are pleased to announce that the Release Candidate for Dgraph v24.0.0, v24.0.0-rc1, is now available for you to try and provide feedback.

Since the alpha release in March, we’ve increased test coverage (up from 64% to 67%) and improved our CI infrastructure, and we’ve added a few important enhancements and bug fixes.

Key highlights of the 24.0.0-rc1 release include:

  • Support for a native vector type at both GraphQL and DQL levels
  • Liveloader extended to work with the vector type (Bulkloader support will be available in GA)
  • Vector search specific fixes
    • #9084: Fix similar_to() error return when data is not present (community-reported error)
    • #9083: Update query_rewriter to fix dotproduct and cosine query conversion
    • #9078: Fix incr restore and normal restore for vector predicates
  • Community contributed PRs:
    • #9030: Add support for Polish Language
    • #9047: Reduce x.ParsedKey memory allocation from 72 to 56 bytes
  • Dgraph/Badger fixes:
    • #9007: Fix deadlock occurring due to time-out
    • #9085: Fix deadlock in runMutation and error handling
    • #2018: Reduce resource consumption on empty write transaction
  • Performance fixes:
    • #9068: Add cache to dgraph.type predicate
    • #9065, #9089: Type filter fixes
    • #9088: Update postinglistCountAndLength function
  • Update to Golang v1.22 - performance and monitoring improvements
  • Upgraded Golang client
  • A number of CVE fixes

You can find the complete changelog here.

For examples of using vector type in DQL and GraphQL please see the blog posts for 24.0.0-alpha and 24.0.0-alpha3.

Please note that this is a Release Candidate and not the final release.

We expect the GA release of v24 to be available in the next couple of weeks, depending on Community feedback. Please file GitHub issues with the label v24.0.0-rc1 if you find any problems or bugs in this release.

Thank you,

– The Dgraph team


In a previous post, it was said Dgraph v24 would boast: “a new caching approach that will boost performance of all applications” (Dgraph 24.0.0-alpha is now available on Github and DockerHub)

I skimmed the difference between alpha3 and rc1 (Comparing v24.0.0-alpha3...v24.0.0-rc1 · dgraph-io/dgraph · GitHub). There are a lot of changes that reference a GlobalCache, CachePL, and the UpdateCachedKeys function. Is this the new caching approach mentioned before?

It looks like the CacheDefaults have changed too:

Old - 	CacheDefaults        = `size-mb=1024; percentage=0,65,35;`
New +	CacheDefaults        = `size-mb=1024; percentage=0,80,20;`

(Comparing v24.0.0-alpha3...v24.0.0-rc1 · dgraph-io/dgraph · GitHub)
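To make the effect of that change concrete, here is a minimal Go sketch that splits size-mb according to the three percentages. It assumes the values are ordered posting-list cache, block cache, index cache; that ordering is my assumption and is not stated in this thread. The parseCacheFlag helper and the cacheSplit type are hypothetical names for illustration, not Dgraph code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// cacheSplit holds the megabytes assigned to each cache.
// The field order mirrors the ASSUMED order of the percentage list.
type cacheSplit struct {
	PostingListMB, BlockMB, IndexMB int64
}

// parseCacheFlag parses a string such as "size-mb=1024; percentage=0,80,20;"
// and returns the per-cache budget in MB (integer division, rounds down).
func parseCacheFlag(flag string) (cacheSplit, error) {
	opts := map[string]string{}
	for _, kv := range strings.Split(flag, ";") {
		kv = strings.TrimSpace(kv)
		if kv == "" {
			continue
		}
		parts := strings.SplitN(kv, "=", 2)
		if len(parts) != 2 {
			return cacheSplit{}, fmt.Errorf("malformed option %q", kv)
		}
		opts[strings.TrimSpace(parts[0])] = strings.TrimSpace(parts[1])
	}

	total, err := strconv.ParseInt(opts["size-mb"], 10, 64)
	if err != nil {
		return cacheSplit{}, fmt.Errorf("bad size-mb: %w", err)
	}

	fields := strings.Split(opts["percentage"], ",")
	if len(fields) != 3 {
		return cacheSplit{}, fmt.Errorf("percentage needs 3 values, got %d", len(fields))
	}
	var pct [3]int64
	var sum int64
	for i, f := range fields {
		p, err := strconv.ParseInt(strings.TrimSpace(f), 10, 64)
		if err != nil {
			return cacheSplit{}, fmt.Errorf("bad percentage %q: %w", f, err)
		}
		pct[i] = p
		sum += p
	}
	if sum != 100 {
		return cacheSplit{}, fmt.Errorf("percentages sum to %d, want 100", sum)
	}

	return cacheSplit{
		PostingListMB: total * pct[0] / 100,
		BlockMB:       total * pct[1] / 100,
		IndexMB:       total * pct[2] / 100,
	}, nil
}

func main() {
	for _, flag := range []string{
		"size-mb=1024; percentage=0,65,35;", // alpha3 default
		"size-mb=1024; percentage=0,80,20;", // rc1 default
	} {
		split, err := parseCacheFlag(flag)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %+v\n", flag, split)
	}
}
```

Under that assumed ordering, the new default moves 15% of the 1024 MB budget from the index cache to the block cache.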

Can the Dgraph team elaborate on these updates?

Also, to confirm: this portion was not backported to v23.1.1, and only #9065 was?

@rahst12 Yeah, these are the caching changes. It’s still a work in progress; we want to be sure the cache changes decrease overhead rather than increase it. Basically, GlobalCache would now store all the items in memory and release them as required. We are still figuring out how best to store items inside it using CachePL. Currently it only caches the dgraph.type predicate.
The cache defaults have been changed to see whether that improves performance. The index cache only gets used when encryption is set up, and by default there is no encryption.
The cache changes will not be ported to the 23.1 release.

Have you looked at my recommendations for optimizing the type system? Replacing dgraph.type under the hood with an edge-based type system instead of a predicate-based type system?

Also brought up here

Thanks a lot @amaster507 for the suggestion. I read through it, and what you describe is exactly how Dgraph stores the type data right now. From what I understand, you are asking us to store <dgraph.type.User>: [uids]. That’s exactly how we create the index. The predicate is dgraph.type, and the index value is User. The posting list is created over the combination of this predicate and index value.
The problem we have right now is that this individual list can get quite big, and every query reads the entire list again each time. We need to be smarter about this and read less data from disk. We have made a couple of changes to address this, and we are still pushing for more. The first change is to figure out when we need to read the index value versus when we can just check the type of the items we want to query. Currently, if you are querying fewer than 10 items and want to add a type filter, we only check the type of those items.
The second change is to add this cache. With it, type data will not get read multiple times from disk. We will update and modify it in memory.
In the future, we want to replace the fixed threshold of 10 items with a statistics-based number. This is roughly what @mrjn suggested too under your comment. For the cache, we are going to improve how we store this data in memory. We could go the sroar route, but it was too buggy. We are thinking of just using a simple map for now.
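As a rough illustration of the two changes described above, here is a toy Go sketch, not Dgraph's actual code. It assumes a forward lookup (uid to types), an index list (type to uids), a fixed threshold of 10, and a plain map as the in-memory cache. All names (store, filterByType, smallQueryThreshold, newDemoStore) are hypothetical.

```go
package main

import "fmt"

// smallQueryThreshold mirrors the "fewer than 10 items" rule from the post.
const smallQueryThreshold = 10

// store is a toy stand-in for the type data; not Dgraph's real layout.
type store struct {
	typesOf    map[uint64][]string        // forward data: uid -> dgraph.type values
	index      map[string][]uint64        // index: type name -> uid list ("on disk")
	cache      map[string]map[uint64]bool // simple map cache of index lists
	indexReads int                        // counts simulated full index-list reads
}

// hasType checks a single uid without touching the index list.
func (s *store) hasType(uid uint64, typ string) bool {
	for _, t := range s.typesOf[uid] {
		if t == typ {
			return true
		}
	}
	return false
}

// indexSet returns the uid set for a type, reading the full index list
// only on the first request and serving it from the cache afterwards.
func (s *store) indexSet(typ string) map[uint64]bool {
	if set, ok := s.cache[typ]; ok {
		return set
	}
	s.indexReads++
	set := make(map[uint64]bool, len(s.index[typ]))
	for _, uid := range s.index[typ] {
		set[uid] = true
	}
	s.cache[typ] = set
	return set
}

// filterByType keeps the candidates that have the given type.
// Small candidate sets are checked one by one; larger ones use the index.
func (s *store) filterByType(candidates []uint64, typ string) []uint64 {
	var out []uint64
	if len(candidates) < smallQueryThreshold {
		for _, uid := range candidates {
			if s.hasType(uid, typ) {
				out = append(out, uid)
			}
		}
		return out
	}
	set := s.indexSet(typ)
	for _, uid := range candidates {
		if set[uid] {
			out = append(out, uid)
		}
	}
	return out
}

// newDemoStore builds uids 1..100: even uids are "User", odd are "Post".
func newDemoStore() *store {
	s := &store{
		typesOf: map[uint64][]string{},
		index:   map[string][]uint64{},
		cache:   map[string]map[uint64]bool{},
	}
	for uid := uint64(1); uid <= 100; uid++ {
		typ := "Post"
		if uid%2 == 0 {
			typ = "User"
		}
		s.typesOf[uid] = []string{typ}
		s.index[typ] = append(s.index[typ], uid)
	}
	return s
}

func main() {
	s := newDemoStore()
	fmt.Println(s.filterByType([]uint64{1, 2, 3, 4}, "User"), "index reads:", s.indexReads)

	large := make([]uint64, 0, 20)
	for uid := uint64(1); uid <= 20; uid++ {
		large = append(large, uid)
	}
	fmt.Println(len(s.filterByType(large, "User")), "matches, index reads:", s.indexReads)
	fmt.Println(len(s.filterByType(large, "User")), "matches, index reads:", s.indexReads)
}
```

In this sketch the small query never reads the index list, and the two larger queries share a single read. A statistics-based threshold would replace the constant with a value derived from the size of the index list.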

@harshil_goel Since v24 uses the same Badger version as v23.1, no import/export should be necessary for upgrading. Can you confirm?

Yes, no import/export is required for v24. A drop-in binary replacement is sufficient.