We are pleased to announce that the Release Candidate for Dgraph v24.0.0, v24.0.0-rc1, is now available for you to try and provide feedback.
Since the alpha release in March, we’ve increased test coverage (from 64% to 67%), improved our CI infrastructure, and added a few important enhancements and bug fixes.
Key highlights of the 24.0.0-rc1 release include:
Support for a native vector type at both GraphQL and DQL levels
Extend Liveloader to work with the vector type (Bulkloader will be available in GA)
Vector search specific fixes
#9084: Fix similar_to() error return when data is not present in (Community reported error)
#9083: Update query_rewriter to fix dotproduct and cosine query conversion
#9078: Fix incr restore and normal restore for vector predicates
For examples of using the vector type in DQL and GraphQL, please see the blog posts for 24.0.0-alpha and 24.0.0-alpha3.
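For readers unfamiliar with the two metrics named in #9083, here is a minimal sketch of the standard dot product and cosine similarity definitions. It is plain Python for illustration only, not Dgraph's implementation, and the function names are made up for this example.

```python
import math

def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)

# Orthogonal vectors score 0 on both metrics.
print(dot_product([1.0, 0.0], [0.0, 1.0]))        # 0.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0

# Identical vectors: dot product grows with magnitude, cosine stays at 1.
print(dot_product([3.0, 4.0], [3.0, 4.0]))        # 25.0
print(cosine_similarity([3.0, 4.0], [3.0, 4.0]))  # 1.0
```

The practical difference is that the dot product is sensitive to vector length while cosine similarity only compares direction.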
Please note that this is a Release Candidate and not the final release.
We expect the GA release for v24 to be available in the next couple of weeks, depending on Community feedback. Please file GitHub Issues with the label v24.0.0-rc1 if you find any problems or bugs in this release.
I skimmed the difference between alpha3 and rc1 (Comparing v24.0.0-alpha3...v24.0.0-rc1 · dgraph-io/dgraph · GitHub). A lot of the changes reference a GlobalCache, CachePL, and the UpdateCachedKeys function. Is this the new caching approach mentioned before?
It looks like the CacheDefaults have changed too:
Old - CacheDefaults = `size-mb=1024; percentage=0,65,35;`
New + CacheDefaults = `size-mb=1024; percentage=0,80,20;`
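To make the change concrete, here is a rough sketch of how such a string splits the total budget across its three slots. The parser is illustrative and is not Dgraph's own flag handling; the slots are left unnamed because the thread does not spell out their order, and the sum-to-100 check is an assumption.

```python
def split_cache_budget(spec):
    """Parse 'size-mb=N; percentage=a,b,c;' and return the MB given to each slot."""
    fields = {}
    for part in spec.split(";"):
        part = part.strip()
        if not part:
            continue  # skip the empty piece after the trailing ';'
        key, value = part.split("=", 1)
        fields[key.strip()] = value.strip()

    total_mb = int(fields["size-mb"])
    percentages = [int(p) for p in fields["percentage"].split(",")]
    # Assumption: the percentages are expected to add up to 100.
    if sum(percentages) != 100:
        raise ValueError("percentages must sum to 100")
    return [total_mb * p / 100 for p in percentages]

old = split_cache_budget("size-mb=1024; percentage=0,65,35;")
new = split_cache_budget("size-mb=1024; percentage=0,80,20;")
print(old)  # [0.0, 665.6, 358.4]
print(new)  # [0.0, 819.2, 204.8]
```

With the same 1024 MB total, the change moves roughly 154 MB from the third slot to the second.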
@rahst12 Yeah, these are the caching changes. It’s still a work in progress; we want to be sure that the cache changes decrease overhead rather than increase it. Basically, GlobalCache would now store all the items in memory and release them as required. We are still figuring out how best to store items inside it using CachePL. Currently it only caches the dgraph.type predicate.
The cache defaults have been changed to see whether that improves performance. The index cache only gets used when encryption is set up, and by default there is no encryption.
Cache changes will not get ported to 23.1 release.
Have you looked at my recommendations for optimizing the type system, i.e. replacing dgraph.type under the hood with an edge-based type system instead of a predicate-based one?
Thanks a lot @amaster507 for the suggestion. I read through it, and what you describe is exactly how Dgraph stores the type data right now. From what I understand, you are asking us to store <dgraph.type.User>: [uids]. That’s exactly how we create the index. The predicate is dgraph.type, and the index value is User. The posting list is created over the combination of this predicate and index value.
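As a toy illustration of that layout, the sketch below keys each posting list by the (predicate, index value) pair and keeps the uids sorted. The class and method names are hypothetical, and the in-memory dict stands in for Dgraph's actual on-disk storage.

```python
import bisect
from collections import defaultdict

class TypeIndex:
    """Toy index: one sorted uid list per (predicate, index value) key."""

    def __init__(self):
        self._postings = defaultdict(list)

    def add(self, uid, type_name):
        # The key combines the predicate with the index value, e.g.
        # ("dgraph.type", "User"), mirroring <dgraph.type.User>: [uids].
        uids = self._postings[("dgraph.type", type_name)]
        pos = bisect.bisect_left(uids, uid)
        if pos == len(uids) or uids[pos] != uid:
            uids.insert(pos, uid)  # keep the list sorted and duplicate-free

    def uids_of_type(self, type_name):
        return list(self._postings.get(("dgraph.type", type_name), []))

index = TypeIndex()
for uid, type_name in [(7, "User"), (3, "User"), (5, "Post"), (3, "User")]:
    index.add(uid, type_name)

print(index.uids_of_type("User"))  # [3, 7]
print(index.uids_of_type("Post"))  # [5]
```

Every node of a given type lands in the same list, which is why a popular type produces one very large posting list.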
The problem we have right now is that this individual list can get quite big, and every query reads the entire list again each time. We need to be smarter about this and read less data from disk. We have made a couple of changes to address this, and we are still pushing for more. The first change is to figure out when we need to read the index value and when we can just check the type of the items we want to query. Currently, if you are querying fewer than 10 items and want to add a type filter, we only check those items.
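A minimal sketch of that first change, reading the cutoff as "fewer than 10 candidate items": small candidate sets are checked uid by uid, and larger ones fall back to reading the full index list. The function names, the toy storage, and the read counters are all hypothetical and only stand in for disk access.

```python
SMALL_CANDIDATE_THRESHOLD = 10  # the fixed cutoff mentioned in the thread

def filter_by_type(candidates, type_name, read_type_index, read_types_of_uid):
    """Keep only the candidate uids that have the given type."""
    if len(candidates) < SMALL_CANDIDATE_THRESHOLD:
        # Few candidates: look up each uid's types, skip the big index list.
        return [uid for uid in candidates if type_name in read_types_of_uid(uid)]
    # Many candidates: read the index list once and intersect with it.
    members = set(read_type_index(type_name))
    return [uid for uid in candidates if uid in members]

# Toy storage: even uids are Users, odd uids are Posts.
TYPES_BY_UID = {uid: {"User"} if uid % 2 == 0 else {"Post"} for uid in range(1, 1001)}
reads = {"index": 0, "per_uid": 0}

def read_type_index(type_name):
    reads["index"] += 1
    return [uid for uid, types in TYPES_BY_UID.items() if type_name in types]

def read_types_of_uid(uid):
    reads["per_uid"] += 1
    return TYPES_BY_UID.get(uid, set())

small = filter_by_type([1, 2, 3, 4], "User", read_type_index, read_types_of_uid)
print(small, reads)  # [2, 4] {'index': 0, 'per_uid': 4}

large = filter_by_type(list(range(1, 51)), "User", read_type_index, read_types_of_uid)
print(len(large), reads)  # 25 {'index': 1, 'per_uid': 4}
```

The small query never touches the 500-entry index list, which is the saving described above.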
Second change is to add this cache. With this, type data will not get read multiple times from disk. We will update and modify this in memory.
In the future, we want to replace the fixed 10 items with a statistically derived number. This is more or less what @mrjn suggested under your comment, too. For the cache, we are going to improve how we store this data in memory. We could go the sroar route, but it was too buggy. We are thinking of just using a simple map for now.
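The "simple map" idea can be sketched as a dict that fills itself on first access and is then updated in memory. This is only an illustration under assumed names; it is not the GlobalCache or CachePL code, and it leaves out eviction and writing changes back to disk.

```python
class TypeCache:
    """Toy read-through cache: uid -> set of type names, backed by a plain dict."""

    def __init__(self, read_from_disk):
        self._read_from_disk = read_from_disk  # hypothetical loader, called on a miss
        self._types = {}
        self.disk_reads = 0

    def get(self, uid):
        if uid not in self._types:
            self.disk_reads += 1
            self._types[uid] = set(self._read_from_disk(uid))
        return self._types[uid]

    def add_type(self, uid, type_name):
        # Update the cached entry in memory (loading it first if needed).
        self.get(uid).add(type_name)

DISK = {1: ["User"], 2: ["Post"]}
cache = TypeCache(lambda uid: DISK.get(uid, []))

cache.get(1)
cache.get(1)                 # served from memory, no second disk read
cache.add_type(2, "Author")  # one disk read to load uid 2, then an in-memory update

print(sorted(cache.get(2)))  # ['Author', 'Post']
print(cache.disk_reads)      # 2
```

Repeated lookups for the same uid cost one disk read in total, which is the effect the second change is after.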