The Good, The Bad, The Ugly - State of Dgraph

While I was talking with @gajanan (or @gajanansc) earlier today, he asked me to post a Discuss topic summarizing a bunch of links for feature requests, current problems, ideas, suggestions, etc. The goal here is to be factual and technical. Oh, and for those who may not have known before, @verneleem is also me :wink:

If I have to do all of the raw data filtering, manipulation, and analytics in a client outside of the database, then why have the database at all?

^ Sometimes it seems right now that we have 2 API languages and 0 DB languages with Dgraph.

If you haven't yet, please see The State of Dgraph's GraphQL API Notion Document. I will reference many of the same topics from that document again here, to try to get a complete overview all in one place, linking to many different subjects.

  • Missing @auth rule for post update state
  • Field level authorization, not the same as ACL predicate control in DQL Enterprise
  • External node/type based auth rules
  • Scalar Validation/Constraints (possible solution pre-hooks in GraphQL API)
  • Edge/Relationship Validation/Constraints in DQL
  • Separating/Combining Interface and Implementing Type @auth rules

Auth rules on an interface cascade to the implementing types and are combined with each type's own rules, so an implementing type must satisfy ALL rules of both the type and the interface. Sometimes that is not what is wanted, and the rules need to be combined with OR logic instead.
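
As a rough sketch of the difference between the two ways of combining rules (this is not Dgraph code; the `Rule` type, the rule functions, and the node fields are all hypothetical and only illustrate AND versus OR):

```python
from typing import Callable, Iterable

# Hypothetical: a rule takes a node (as a dict) and returns allow/deny.
Rule = Callable[[dict], bool]

def combine_and(rules: Iterable[Rule]) -> Rule:
    """Behavior described above: every rule must pass."""
    rules = list(rules)
    return lambda node: all(rule(node) for rule in rules)

def combine_or(rules: Iterable[Rule]) -> Rule:
    """Requested alternative: any single passing rule is enough."""
    rules = list(rules)
    return lambda node: any(rule(node) for rule in rules)

# Hypothetical rules, for illustration only.
interface_rule: Rule = lambda node: node.get("isPublic", False)
type_rule: Rule = lambda node: node.get("owner") == "alice"

node = {"isPublic": True, "owner": "bob"}
print(combine_and([interface_rule, type_rule])(node))  # False
print(combine_or([interface_rule, type_rule])(node))   # True
```

With AND, a public node owned by someone else is hidden; with OR, it is visible. The request is to let the schema author pick which one applies.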

  • Separating @auth outside of the GraphQL schema itself
  • Re-usable/Global @auth rules
  • Scoping/Cascading @auth rules
  • Hard limiting results in GraphQL API - Prevent Data Scraping
  • [Completed?] Combining @auth and @custom DQL resolvers
  • Nested Filtering for DQL and GraphQL API (without using @cascade directive)

    AKA: Dealing with more normalized forms of data

  • Paginating child nodes as a whole, irrespective of their multiple parent levels (related to nested filtering)
  • Logically combining filters together from different levels in the graph (related to nested filtering)
  • Ordering/Sorting by nested data
  • Filter by Aggregated results
  • Scalar comparisons
  • String pattern matching (not full regexp)
  • Date/time filtering and manipulation
  • Order by enums in GraphQL API
  • Sorting by Aggregation
  • Calculated Fields/Triggers
  • Full Text Search Best Match Scoring
  • Full text search across multiple fields:
  • String Functions for inter graph comparisons and manipulation
  • Simplifying/Enhancing groupby
  • Verbosity of Multiple Node Updates in GraphQL
  • Deep Mutations

http://discuss.dgraph.io/t/deep-mutations-graphql/9789/4?u=amaster507

https://dgraph.io/docs/graphql/mutations/deep/

http://discuss.dgraph.io/t/data-doesnt-update/11489

https://github.com/hasura/graphql-engine/issues/1573

  • Comparative Inputs

http://discuss.dgraph.io/t/atomic-field-operations/14128/3?u=amaster507

  • Auto Incrementing Fields

http://discuss.dgraph.io/t/feature-request-directive-that-indicates-that-id-fields-should-be-auto-populated-on-creation/14992

http://discuss.dgraph.io/t/atomic-field-operations/14128?u=amaster507

  • Need to correct generated payload list nullability in GraphQL API

Payloads right now in Dgraph are generated as nullable items in a list: "queryUser: [User]". This should be corrected to the tightest possible type, such as "queryUser: [User!]!", which means that the result will be an array. It could be an empty array, but no item in the array can or ever will be null.
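
To show why this matters on the client side, here is a small sketch of the handling each payload type implies. The `name` field and both function names are made up for the example; only the two list types come from the text above.

```python
from typing import List, Optional

def names_nullable(users: Optional[List[Optional[dict]]]) -> List[str]:
    """Client code for `queryUser: [User]`: the list and every item may be null."""
    if users is None:
        return []
    return [u["name"] for u in users if u is not None]

def names_strict(users: List[dict]) -> List[str]:
    """Client code for `queryUser: [User!]!`: no null checks are needed."""
    return [u["name"] for u in users]

print(names_nullable([{"name": "a"}, None]))  # ['a']
print(names_strict([{"name": "a"}]))          # ['a']
```

In typed clients the looser type forces null checks like the ones in `names_nullable` everywhere, even though the API never actually returns null items.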

  • No Arrays - only Sets/Lists

http://discuss.dgraph.io/t/documentation-that-arrays-work-as-a-set-not-storing-duplicates/9590?u=amaster507

But what needs to be thought out even more is what this change would bring with it in terms of API capability:

  • Add/Update values at positions in a list, choosing to replace or scoot over existing values that may exist.
  • Replace a list in its entirety. This is rather difficult to do but is such a simple use case.
  • Delete values from a specific index in a list. Right now you can only delete all items in a list or items in a list by their value.
  • Allow lists to maintain order (for lists of scalars and lists of type (aka edges))
  • Move item(s) in a list to a specific location without changing any values
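
For comparison, this is roughly what the requested list operations look like on an ordered list in plain Python, next to the set-like behavior the linked topic describes. It is only an analogy for the desired semantics, not a proposal for Dgraph's API.

```python
values = ["a", "b", "a"]

# Set-like storage: duplicates are dropped and order is not guaranteed.
as_set = set(values)

# Ordered list semantics, covering the operations requested above.
ordered = list(values)            # keeps duplicates and order: a, b, a
ordered.insert(1, "x")            # add at a position, scooting values over
ordered[0] = "z"                  # replace the value at a position
del ordered[2]                    # delete by index rather than by value
ordered.insert(0, ordered.pop())  # move the last item to the front

print(sorted(as_set))  # ['a', 'b']
print(ordered)         # ['a', 'z', 'x']
```
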
  • Unions sometimes produce unexpected results

http://discuss.dgraph.io/t/using-unions-and-interfaces-and-end-up-with-non-nullable-field-not-present/15680?u=amaster507

http://discuss.dgraph.io/t/dgraph-directive-does-not-work-as-expected-with-union-types/13021

  • @auth on union types

Completed? Needs to be documented, or implemented if not currently. I haven't tested this personally.

http://discuss.dgraph.io/t/union-types-in-graphql/9581?u=amaster507

  • Custom Directives

The idea of custom directives is centered around the developer creating directives that are available to clients. A developer may wish to allow some kind of script to be run on command, such as logging the result or adding some metadata to the response (not in the data, but in the extensions part of the response).

http://discuss.dgraph.io/t/custom-field-concatenation-without-needing-external-script/7283?u=amaster507

http://discuss.dgraph.io/t/can-i-get-a-substring-in-a-query/10522/3?u=amaster507

  • Custom DQL Mutations

This really depends on how much is being refactored in the GraphQL codebase and how that refactoring is done. If GraphQL will still be rewritten into DQL then it is equally important to support not only DQL in custom queries, but also DQL in custom mutations.

http://discuss.dgraph.io/t/why-does-custom-dql-not-support-mutation/10366?u=amaster507

  • Custom DQL Fields

Custom fields can currently be resolved with lambda, but it would be beneficial (again depending on how the refactor is done) to allow custom fields to also be resolved with DQL.

http://discuss.dgraph.io/t/can-i-use-dql-to-calculate-the-value-of-a-custom-field/15877?u=amaster507

  • Additional and Custom Scalars

Developers often find themselves needing to add custom scalars for various reasons. These can often be represented as strings but with additional constraints, such as Email, HexColor, Tuple.

http://discuss.dgraph.io/t/json-blob-as-a-scalar/11034?u=amaster507

http://discuss.dgraph.io/t/can-we-have-remote-scalars/11726?u=amaster507

  • Authentication Service

Dgraph built an authentication system for Dgraph Cloud and was discussing open sourcing it. I believe something like that should be made and integrated into the GraphQL API so that users can easily authenticate against their own data and maybe use lambdas to return the claims from the database that the developer wants to use when a client authenticates.

http://discuss.dgraph.io/t/open-sourcing-in-house-dgraph-auth/13398?u=amaster507

  • Open Source all Enterprise Features
  • Auditing
  • Namespacing
  • [Completed in 21.12?] Backups

Should these really be enterprise or are these just enterprise level to force users into the Cloud? Are Enterprise licenses even available anymore? (They were not [or ridiculously purposefully priced outside of the budget to whom it was being quoted] under the last administration)

http://discuss.dgraph.io/t/what-is-dgraph-lacking/16010/89?u=amaster507

  • Schema Migration Tools

When users migrate their schema, they expect the data to follow. This is not done in Dgraph now.

http://discuss.dgraph.io/t/what-is-dgraph-lacking/16010/34?u=amaster507

http://discuss.dgraph.io/t/migrating-renaming-predicates-etc/5239?u=amaster507

https://www.edgedb.com/docs/guides/migrations/index

  • Facets are not first class citizens

http://discuss.dgraph.io/t/improve-facets-with-upsert-block/5640

If nested filtering and linking-nodes are not able to be constrained, then we can't get rid of facets. But even then, maybe we can take the concept of _linkingnodes and make them work without needing to declare the type in the middle, like how Prisma creates pivot tables without you needing to specifically create them, and the Prisma ORM lets you link directly through the pivot table as if it were a 1:1 relationship. For Reference: EdgeDB has what it terms "link properties" and abstracts these onto types.

https://www.edgedb.com/docs/guides/link_properties

  • Mapping GraphQL's @hasInverse vs DQL's @reverse

Right now the Dgraph GraphQL API uses the @hasInverse directive to "map" inverse relationships, and then the API keeps these pairs of edges balanced on mutation. This creates additional work when adding RDF data with the live/bulk loader, because two edges have to be added for every inverse relationship.

It might be better if Dgraph would just allow the mapping of the ~ reverse edges.
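
As a sketch of that extra work: a loader script has to emit both directions for every @hasInverse pair. The helper below is hypothetical, and the `Author.posts` / `Post.author` predicates and the UIDs are example values, assuming the usual `Type.field` naming of GraphQL-generated predicates.

```python
from typing import List

def inverse_pair_nquads(subject: str, predicate: str, obj: str, inverse: str) -> List[str]:
    """Return the two N-Quad lines needed to keep an inverse pair balanced."""
    return [
        f"<{subject}> <{predicate}> <{obj}> .",
        f"<{obj}> <{inverse}> <{subject}> .",
    ]

# Example values only: one author linked to one post, in both directions.
for line in inverse_pair_nquads("0x1", "Author.posts", "0x2", "Post.author"):
    print(line)
```

If the GraphQL API could map onto a ~ reverse edge instead, only the first of the two lines would need to be loaded.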

  • Var Blocks in GraphQL

http://discuss.dgraph.io/t/directive-idea-to-support-var-blocks-in-graphql/14419?u=amaster507

  • Fuzzy Full Text Search

The ability to use TRIGRAMS on phrases and not just words. You cannot search for a fuzzy phrase, only a fuzzy word. This limits full-text search.

http://discuss.dgraph.io/t/fuzzy-full-text-search/16224?u=amaster507

  • Count Words

The ability to count the number of times a word appears in a text and sort by that value. This would make it possible to write relevant search algorithms.

http://discuss.dgraph.io/t/how-to-implement-keyword-based-relevance-sorting/16174?u=amaster507
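
Without this in the database, the counting and sorting has to happen in the client after fetching the text. A minimal sketch of that workaround, assuming each result is a dict with a hypothetical `text` field:

```python
import re
from typing import List

def count_word(text: str, word: str) -> int:
    """Count whole-word, case-insensitive occurrences of `word` in `text`."""
    pattern = rf"\b{re.escape(word)}\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

def rank_by_word(docs: List[dict], word: str) -> List[dict]:
    """Sort documents by how often `word` appears, most occurrences first."""
    return sorted(docs, key=lambda d: count_word(d["text"], word), reverse=True)

docs = [
    {"id": 1, "text": "graph db"},
    {"id": 2, "text": "Graph graph GRAPH"},
]
print([d["id"] for d in rank_by_word(docs, "graph")])  # [2, 1]
```

This only works on results that have already been pulled out of the database, which is exactly the problem: it cannot be combined with server-side pagination or ordering.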

  • DQL Loops

Very much needed to simplify algorithms without stepping in and out of queries/mutations with a client.

http://discuss.dgraph.io/t/foreach-func-in-dql-loops-in-bulk-upsert/5533/6

  • More educational material
  • Some docs that allow us to get into the Dgraph code easier, so we can contribute
  • Transparent roadmap, open issues and bugs, so we don't get surprised by missing features or minor bugs

http://discuss.dgraph.io/t/what-is-dgraph-lacking/16010/3?u=amaster507

  • Communicate with your users and get us involved! One post from [the Dgraph Labs] a week giving simple product updates and plans can go a very long way.

  • Provide a generous free tier.

  • An out of the box local (offline) development experience (aka without me having to learn / do much)

  • A way to batch mutations so I can roll back a group of changes.

  • Be able to simply replace a list in a mutation.

http://discuss.dgraph.io/t/what-is-dgraph-lacking/16010/9?u=amaster507

  • Removing Edges using null does not work

http://discuss.dgraph.io/t/removing-edges-using-null-does-not-work/15420

  • Directly integrate other tools into Dgraph like https://magic.link , Auth0, and make it configurable with a few clicks
  • Detailed and accurate GraphQL errors
  • Focus really hard on making it as easy as possible for devs to build side-projects and hobby-projects with free/cheap Dgraph instances

Dgraph has all the pieces in place to build the ultimate low-code tool. The simpler you can make it for users, the more users you're gonna get.

http://discuss.dgraph.io/t/what-is-dgraph-lacking/16010/10?u=amaster507

  • No sharding of predicates
  • No Query Planner (CTO's current vision)
  • Upsert by xid painful for ingest-heavy workloads
  • Missing a native [time]range type

http://discuss.dgraph.io/t/what-is-dgraph-lacking/16010/14?u=amaster507

  • The upgrade process requires downtime.

http://discuss.dgraph.io/t/what-is-dgraph-lacking/16010/17?u=amaster507

http://discuss.dgraph.io/t/improve-dgraph-upgrade-experience-by-supporting-in-place-rolling-upgrades/8821

  • It would be nice if we had one-click hosting on platforms like DigitalOcean for open source

http://discuss.dgraph.io/t/what-is-dgraph-lacking/16010/19?u=amaster507

  • Lack of native timestamps.

http://discuss.dgraph.io/t/what-is-dgraph-lacking/16010/21?u=amaster507

  • A multi-Raft approach for regional clusters, like CockroachDB is doing

http://discuss.dgraph.io/t/what-is-dgraph-lacking/16010/22?u=amaster507

  • Its own mobile solution with offline sync

http://discuss.dgraph.io/t/what-is-dgraph-lacking/16010/23?u=amaster507

http://discuss.dgraph.io/t/call-for-collaboration-designing-a-dgraph-offline-first-library/9293/11

  • BM25 and or custom search ranking
  • More and custom tokenizers

http://discuss.dgraph.io/t/what-is-dgraph-lacking/16010/25?u=amaster507

  • Lack of support for some algorithms
  • many problems in path lookup

http://discuss.dgraph.io/t/what-is-dgraph-lacking/16010/26?u=amaster507

  • sub-select statements in graphql
  • More in cloud editor help & messages for when certain changes will orphan data or cause negative side effects

http://discuss.dgraph.io/t/what-is-dgraph-lacking/16010/34?u=amaster507

  • I don't have time to write my own middleware, and I don't want to host anything myself to deal with servers
  • lambdas are time consuming and should only be reserved for complex tasks

http://discuss.dgraph.io/t/what-is-dgraph-lacking/16010/40?u=amaster507

  • Cannot Paste with Ctrl+V on Cloud UI

http://discuss.dgraph.io/t/cannot-paste-with-ctrl-v-on-cloud-ui/15195?u=amaster507

  • Separate GraphQL from Dgraph [as a Plugin?]

http://discuss.dgraph.io/t/what-is-dgraph-lacking/16010/70?u=amaster507

http://discuss.dgraph.io/t/what-is-dgraph-lacking/16010/71?u=amaster507

  • Typescript binding for dql client as well so responses are typed?

Reference: EdgeQL TS Client Achieved this! https://www.edgedb.com/docs/clients/js/index

  • Fix Dgraph's type system so that the type is not a string value/predicate

http://discuss.dgraph.io/t/how-does-type-system-work/17616?u=amaster507

http://discuss.dgraph.io/t/func-type-error-in-20-11-03-or-later/16237/29?u=amaster507

http://discuss.dgraph.io/t/ask-dgraph-founder-anything/16352/24?u=amaster507

http://discuss.dgraph.io/t/ask-dgraph-founder-anything/16352/28?u=amaster507

  • An official dgraph toolset for data and schema versioning that is manageable within our codebases.

http://discuss.dgraph.io/t/ask-dgraph-founder-anything/16352/31?u=amaster507

  • Math Functions

https://neo4j.com/docs/cypher-manual/current/functions/mathematical-numeric/

  • Subscriptions based on CDC

https://dgraph.io/docs/master/enterprise-features/change-data-capture/

  • Subscriptions should use graphql-ws

https://github.com/enisdenjo/graphql-ws/blob/master/PROTOCOL.md

  • [Completed with Learner Nodes?] Add Geo Replication

http://discuss.dgraph.io/t/geographically-distributed-datacenter-replication/13311

  • Cloud Requested Fixes/Improvements from @jdgamble555

    • Add the storage [for file/media uploads management possibly connected to S3]
    • Make data studio have CRUD functionality
    • Allow renaming types in the UI
    • Build an Auth System
  • The problem with Lambdas [A MUST READ!]

https://www.notion.so/Dgraph-Lambdas-d5e6e6614d394fb8afc935e2faa0aacf

  • Pagination with Cascade Catch-22

  • Continue bulk/live from where it stopped earlier

http://discuss.dgraph.io/t/santa-wish-list-for-dgraph-in-2021/12070/4?u=amaster507

  • Offset-based pagination is slow

http://discuss.dgraph.io/t/offset-based-pagination-is-slow/8774/5

  • Add distance for geo

http://discuss.dgraph.io/t/add-distance-for-geo/8406

  • Cannot Run Upsert in Dgraph Cloud DQL UI (only supports JSON not RDF)

http://discuss.dgraph.io/t/cannot-run-upsert-in-dgraph-cloud-dql-ui/14957?u=amaster507

  • Move @default into 22.0[?]

http://discuss.dgraph.io/t/automatic-value-for-creation-date-and-modification-date/11922?u=amaster507

  • Custom [Digging] Function(s)

http://discuss.dgraph.io/t/custom-digging-function/9499?u=amaster507

  • Defer field selection to subquery when using @custom DQL resolver

http://discuss.dgraph.io/t/defer-field-selection-to-subquery-when-using-custom-dql/9221/3

  • Multiple Reverse Edges

http://discuss.dgraph.io/t/why-cant-i-create-two-separate-attributes-to-the-same-node/17067?u=amaster507

http://discuss.dgraph.io/t/one-way-vs-two-way-hasinverse/7255?u=amaster507

http://discuss.dgraph.io/t/support-nested-link-and-multiple-link/10031?u=amaster507

  • Counting within pagination

http://discuss.dgraph.io/t/feature-request-count-property-on-relationships/14949?u=amaster507

http://discuss.dgraph.io/t/metadata-to-determine-if-there-are-more-results/15422?u=amaster507

  • Cursor Based Pagination

http://discuss.dgraph.io/t/cursor-based-pagination/14049?u=amaster507

  • Support for JSON-LD

http://discuss.dgraph.io/t/support-json-ld-on-dgraph/7162?u=amaster507

  • Dreaded Context Exceeded Bug

http://discuss.dgraph.io/t/mutation-failed-because-dgraph-execution-failed-because-context-deadline-exceeded/15221?u=amaster507

  • HIPAA Compliance

http://discuss.dgraph.io/t/dgraph-and-hipaa/15624?u=amaster507

  • Incomplete Items (I believe) From http://discuss.dgraph.io/t/dgraph-product-roadmap-2021/12284

    Lambda Should continue to resolve GraphQL
    Remote authorization hooks
    Auth on Union type
    Pre/post auth hooks for update mutation
    Global auth rules
    Replacing types in GraphQL schema: show left over data
    Support DQL Variables in Mutations
    String transformation functions
    TF-IDF scoring on full-text search [Dgraph 21.07]
    Integration with Kafka
    Integration with KeyLines
    Support for Gremlin
    Integration with BI Tools (e.g. Tableau)
    Import Neo4j json or CSV
    ORM for top-3 languages ( JS/TS, Py, Java )
    Load/stream data directly from SQL to Dgraph Cloud
    Load/stream data directly from MongoDB to Dgraph Cloud
    Load/stream data directly from Elastic to Dgraph Cloud


This is a start at summing up everything yet again. Others like @BenW @jdgamble555 might have more to add here too...

I probably missed someone's beloved feature request or problem needing a workaround, and I'm sorry; that was not on purpose. :heart_eyes:

Hi @amaster507,

Thanks for quickly putting this together.

Since we brought up HL7 in our discussion, you might also be interested in