What is Dgraph lacking?

I still think that every system should allow for maintenance windows, but I understand not wanting them. Facebook, Google, etc. don’t do maintenance windows, and Dgraph is pitched as a backend solution that should have been used by Google, but they passed on it.

I think the big kicker still is that the data files are part of the upgrade as well and cannot just be dumped between versions. I would think that eventually the data files would not need to change between versions, with an upgrade only changing the algorithms that query/mutate the data in them, making them more efficient and adding more tools/functions.

1 Like

I am here as a hobbyist developer.

No, I guess.

From my point of view there are a lot of possibilities for Dgraph. It can be a great database for apps, especially apps that need complex relations; Dgraph would be better in such cases,
as @BenW mentioned: https://roamresearch.com.

Another issue I find is the pricing of Dgraph Cloud, or of hosting it yourself.
It would be nice to have one-click hosting on platforms like DigitalOcean for the open-source version,
or maybe a hobby plan for $2 or $5 per month on Dgraph Cloud.
The current plan with its 1 MB limit is just not enough, and the enterprise and shared plans are just out of budget for hobbyist developers like me.
I myself am a very big fan of Dgraph due to its performance, GraphQL, its many possibilities as a general-purpose database, and the simplicity it provides over other databases,
but I have kept away from it due to its plans,
and the same goes for so many hobbyist developers like me.

Also, GraphQL with TypeScript and Apollo becomes complicated and is not as easy to get started with as Supabase or Firebase.

3 Likes

Are you using codegen? It is an absolute must if using GraphQL, Apollo, and TypeScript.

With Dgraph, Apollo, Codegen, React, and TypeScript you have ONE source of truth for your types, which are strongly typed from the top to the bottom of your app. Need to change the types anywhere? You just change them in one place, the Dgraph GraphQL schema, then deploy and rerun codegen, and you have the same schema updated at the database (DQL schema) and the frontend (TypeScript). Game-changing!!

3 Likes

The missing feature that surprised me most was lack of native timestamps – gives the impression applications aren’t the main use-case.

7 Likes
  • Is it lacking performance?

Not that I’ve found, but I echo @seanlaff’s comment above re: sharding on predicates. For us, this is a huge concern. The vast majority of our graph will only have a few predicates (xid being ubiquitous).

  • Do you really need transactions?

Yes, if this is meant to be a production application database.

  • Should Dgraph not be a database? Should it be a serving system instead?

We are looking for a graph database, not a graph layer on a relational database, if that’s what you meant.

  • Should Dgraph interface with other databases better?

If Dgraph could ingest RDBMS schemas that would be interesting, though I don’t know how you would solve that.

  • Does Dgraph not work with GraphQL properly? or React?

We are not interested in GraphQL as the main feature of Dgraph. We are looking for a massively horizontally scalable graph database that can do the high-performance graph traversal that an RDBMS isn’t tuned for. And in fact we prefer to use Dgraph Cloud rather than host it ourselves. It’s not an easy product to self-administer, and we’re thankful for the support.

In fact, the focus on GraphQL has been a little disappointing. We maintain a GraphQL schema just in case, but we use DQL exclusively.

The only wishlist item I have is a multi-RAFT approach for regional clusters like CockroachDB is doing, but it’s not at all a deal breaker.

4 Likes

You know, one thing I would pay double for is if Dgraph had its own mobile solution with offline sync. MongoDB has Realm, but unfortunately it only syncs with MongoDB. Even if Dgraph built a syncing gateway that works with Realm (presumably by building an extension to Realm that allows it to sync with Dgraph’s servers), that would be incredible.

As nice as that would be, I totally understand Dgraph’s hard stop on not going in this direction, and even limiting supported OSes to only Unix. This helps the team to super focus and build out what matters in the best way, instead of doing it one way for Unix, another for PC, another for Mobile, etc.

2 Likes

OK my selfish requests.
I want Dgraph to provide:

  • Blazing FAST query results
  • Cheaper than other platforms because it’s been designed to take advantage of SSDs instead of everything having to live in RAM
  • Clear examples of how to accomplish things
  • ES-like search and query DX, with performance at least as good if not faster, and a decent if not identical query string query API with great index intersection queries
  • BM25 and/or custom search ranking
  • More and custom tokenizers
  • The whole damn thing is postings lists it should have amazing search!
    UIs are built with search interfaces. The ES query string query is THE interface I have used for over 8 years on multiple projects to build applications. As soon as you add search to an app, the search index drives all the views because it does the filtering, the sorting, the faceting, etc. All of those queries are constructed using a simple query string query API. I would love to see this. See: Query string query | Elasticsearch Guide [7.15] | Elastic

Personally I don’t care about lambdas. Maybe some day I will, but they seem like a crutch for missing features, and I have no visibility or intuition about how they impact performance, scaling and ongoing cost of operation vs something like AWS Lambda, which I already use extensively.

Also I don’t care about GraphQL. I love that I can write a simple schema and get a full API that rocks; GraphQL was just a hoop I had to jump through. Perhaps subscriptions will change my mind on the value of GraphQL, but it also seems like it holds the platform back, because if something is not in the spec we can’t use it? Having two query languages feels kludgy, not to mention yet another graph query language. Seriously, you couldn’t just use Gremlin or SPARQL or something? I’m certain you have great reasons for YAGQL, but the pre-existing stuff already has so much documentation. /rant

3 Likes

Lack of support for some algorithms, such as community detection algorithms. There are also many problems with path lookups: they often use too much memory or take too long to return results.

3 Likes

More and custom tokenizers

There is already support for custom tokenizers, fwiw. Indexing with Custom Tokenizers - Query language

2 Likes

@iluminae thanks for the link. I have avoided DQL so far and stick exclusively to the GraphQL side of the cloud service. I will certainly take a look but since I already do pre-processing in my aws lambdas upon ingest I will probably implement my custom tokenization there. Downsides to adding DQL to my project are that it’s yet another thing to figure out, it has more power but requires more maintenance of relationships whereas GraphQL has more guard rails and does more hand holding which I appreciate.

6 Likes

I think there are some things missing from DQL that could reduce friction when developers consider adopting Dgraph. For example, the ability to include non-aggregate predicates in @groupby query blocks. I’m willing to bet that it’s a common query folks try, especially those that come from SQL backgrounds and are used to queries like SELECT COUNT(1), id, name FROM table GROUP BY id. Improvements to pagination allowing you to query pages using limit/after when applying some non-uid sort on an indexed field would also be welcome. I haven’t tried GraphQL, so I don’t know if these features are available there.
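To make the pagination request concrete, here is a minimal Python sketch of cursor-style ("keyset") pagination over a non-uid sort key, done entirely client-side. It is not Dgraph's API: `fetch_page`, the `name` field and the sample nodes are hypothetical, and the cursor is simply the `(name, uid)` pair of the last row on the previous page.

```python
# Hypothetical sample data: each node has a uid and an indexed "name" field.
NODES = [
    {"uid": 3, "name": "carol"},
    {"uid": 1, "name": "alice"},
    {"uid": 4, "name": "bob"},
    {"uid": 2, "name": "bob"},
]


def fetch_page(nodes, limit, after=None):
    """Return up to `limit` nodes ordered by (name, uid), strictly after the cursor.

    `after` is the (name, uid) pair of the last node on the previous page,
    or None for the first page. Using uid as a tie-breaker keeps the order
    stable when several nodes share the same name.
    """
    ordered = sorted(nodes, key=lambda n: (n["name"], n["uid"]))
    if after is not None:
        ordered = [n for n in ordered if (n["name"], n["uid"]) > after]
    page = ordered[:limit]
    next_cursor = (page[-1]["name"], page[-1]["uid"]) if page else None
    return page, next_cursor
```

The point of the sketch is that the "after" value has to carry the sort key as well as the uid; a uid alone is not enough to resume a page once the results are ordered by another field.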

I could probably do without transactions for most of my queries, with one notable exception: there’s currently no way to replace a list entirely in one query. The only option is to run two mutations, one delete and another set.
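For illustration, the two-step replacement described above can be sketched as a small Python helper that only builds the RDF N-Quad payloads for the delete and the set; it does not talk to a cluster. The function name, the uid `0x1` and the predicate `nicknames` are made up for the example, and the sketch assumes a list of string values.

```python
import json


def replace_list_nquads(uid, predicate, new_values):
    """Build the payloads for the two mutations: delete everything, then set.

    Returns (delete_nquads, set_nquads). The delete uses the `S P * .`
    wildcard form to drop every existing value of the predicate on that node.
    """
    delete_nquads = f"<{uid}> <{predicate}> * ."
    # json.dumps gives a double-quoted, escaped string literal for each value.
    set_nquads = "\n".join(
        f"<{uid}> <{predicate}> {json.dumps(value)} ." for value in new_values
    )
    return delete_nquads, set_nquads
```

The first string would go into a delete mutation and the second into a set mutation, which is exactly the two-request dance being complained about.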

4 Likes

TLDR; DGraph makes a high quality wife, but a terrible girlfriend.

Everyone has different backgrounds, so here is mine:

I’m a senior dev freelancing to support ~5 small to mid size companies (100k/year to 20 million). Some of those projects are the kind that could explode in growth. I’ve stood up close to a dozen websites in the last 2 years from scratch and shipped 4 different react-native applications. The name of the game for me is fast scalability. I use dgraph cloud, and mostly use the graphql API for one project. I’ve played with Neo4j before, but mostly I’ve done SQL.

For me, I obviously want it all. I want a scalable backend that requires little to no code from me for the core operations. I want the ability to customize and plug into that backend when I need it to.

RESPONSE SUMMARY
DGraph makes a high quality wife, but a terrible girlfriend. It has some great virtues but you have to invest a lot into it to get them. It’s still too needy and picky, so unless it’s exactly the solution you need, it’s not worth dealing with its idiosyncrasies yet.

I intend to only use dgraph on some projects for the foreseeable future. The projects must both need high scalability and have extensive relationships that are critical to core operation of the project. If I can get away with Hasura, I will continue to use Hasura for the time being due to its extra polish and significantly better developer experience.

I want to recommend DGraph to people. I have a consulting company that I work with that I’d love to get to use dgraph instead of Hasura because it would simplify many of their problems. But I can’t because Dgraph is not developer friendly.

To use DGraph more I would need to see:

  • Filtering across relationships (in graphql)
  • Bulk Update across relationships (in either, but preferably graphql since it’s cleaner than DQL)
  • Something like SQLs cascade on delete/update system so I can enforce data integrity at the data layer

To recommend DGraph more I would also need to see:

  • the graphql for dql tutorial be improved to help more than junior developers (sub-select queries, sub-select in mutations, group by, windows, indexing, rank)
  • Improved DQL docs for upserts OR add sub-select statements in graphql (preferred!)
  • Faster uptime to first external query
  • More in-cloud editor help & messages for when certain changes will orphan data or cause negative side effects. E.g. “You dropped the column ‘oldDataColumn’ from your schema, would you like to remove it from your DQL schema and also remove all predicates for that column?”
  • Better lambda support. Currently lambdas are slow and unwieldy. Hasura has a simple UI that would help a ton.

Also, I’d be happy to do some user meetings to discuss things more in depth, walk you through my development experience and its frustrations, brainstorm ideas, discuss training materials or help in almost any other way.

I’ve already mostly committed a pretty large project to dgraph and would love to be able to use it for more projects and recommend it more. It could easily become my go to backend for everything.

DETAILED RESPONSE

To help out I’d like to rate DGraph compared to both Postgres, and Hasura

Scoring PostgreSQL vs Dgraph (out of 5)

Basic CRUD:
Dgraph: 4 Postgres: 5
Relationships:
Dgraph: 5 Postgres: 3
Data Maintenance:
Dgraph: 2 Postgres: 4
Data Validation:
Dgraph: 2 Postgres: 4
Data Transforms:
Dgraph: 3 Postgres: 5
Scalability:
Dgraph: 5 Postgres: 3

Explanations:
Basic CRUD
Pretty close here, the -1 to dgraph is mostly due to having to maintain a giant schema file instead of being able to break things out into separate files
Relationships:
Dgraph wins here, and this + scalability is the reason I’m using DGraph still even with its limitations
Data Maintenance:
This is painful in DGraph. I shouldn’t need anything more than an editor to maintain my data. I have to write scripts way too often. Bulk updates, bulk deletes, renaming columns, moving/copying data from one column to another. Even using DQL I still find this obnoxiously painful too often, not always, but way more than it should be.
Data Validation:
Mostly minus points here because of a lack of more fine-grained data types, and of any form of constraints to ensure data continuity. Please don’t improve this too much until after making Data Maintenance better.
Data Transforms:
There are no triggers, except Lambda, and Lambda is a non-performant pile of avoid me right now. Also, doing any sort of data transform across a table requires a script. It doesn’t work well in DQL or GQL. Say what? We have a graph database that excels at relationships and none of our core query languages support maintaining or working with your data across those relationships.
Scalability:
Yeah, Graph databases win here big time when dealing with large datasets. Only applicable to larger clients. Since my project is still in-progress I’m just hoping dgraph scalability lives up to what it feels like it should be capable of.

Scoring Hasura Cloud vs Dgraph Cloud (out of 5)

Authentication/Authorization:
Dgraph: 3 Hasura: 5
Side Effects:
Dgraph: 3 Hasura: 5
UI Console vs Dgraph Cloud Console:
Dgraph: 3 Hasura: 5
First Time Setup:
Dgraph: 4 Hasura: 4
Graphql API Quality:
Dgraph: 2 Hasura: 5

Explanations:
Authentication/Authorization:
-1 to DGraph because Hasura’s permissions UI is phenomenal. Dgraph’s is good enough, but if they were to copy Hasura’s permission mechanism, that would be way better. -1 Because it wasted 4+ hours of my life trying to get my first query working externally. Being on DGraph cloud I was missing adding an SDK key. Finally found the docs in a completely different section of the website, buried. Write a guide: “Getting your first external query working”, or even better allow me to have my server in “dev mode”, and give me a better error message with a link to the docs.
Side Effects:
Mostly because DGraph’s lambda system is painful. I get one tiny little editor in the cloud, or I have to set up a new custom build and upload to an endpoint. If I’m going to do that I’ll just do it on AWS or something where there is already a lot of tooling support, especially because lambdas on dgraph so far are slow.
UI Console vs Dgraph Cloud Console:
Hasura’s is more polished. DGraph still has one giant schema file, no ability to delete a type along with all its child predicates and their data, and one giant lambda file. There is way too much padding, so I have to scroll all the time. No separated permissions or relationship definitions.
First Time Setup:
They’re both pretty equal here. In Hasura you have to setup your DB separately, but you get a UI that can help you start building your schema effectively and with separated tables.
In DGraph the Graphql Schema is a lot nicer for modeling data than SQL, no separate database, but you have one place for one giant schema file which leads to a LOT of scrolling. Plus, putting a comment in the bottom of the schema file for your auth stuff? Seriously?
Graphql API Quality: I actually think that DGraph has more features overall here, but the big reason I ding DGraph so hard is they are a Graph database that doesn’t allow you to walk relationships on a filter. Cascade fits some use cases, but not all (like mutations). Hasura only works on SQL databases, but still allows you to walk across relationships.

7 Likes

To be clear, you’re talking about Nested Filters (probably the most requested GraphQL Feature).

And this is my suggestion, the @reference directive. This could be accomplished now by using a post-hook lambda, but you would have to write it. This is definitely a huge missing feature, but do-able now.

J

1 Like

Just jumping in here. Not answering on behalf of Manish.

Well, you can break the schema into separate files. It is very simple. The Schema API is very straightforward, you can add OR delete. Which means you can add any piece of the schema at any moment that it won’t break or undo anything. It will be a problem tho if you set a different thing to the same predicate several times. This can trigger the indexing process or something.
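One way to keep the schema in separate files, sketched in Python under some assumptions: the fragments live in one directory as `*.graphql` files, and they are stitched together client-side into a single schema string before being sent to the cluster. The function name and file layout are hypothetical, and the upload step is deliberately left out.

```python
from pathlib import Path


def combine_schema_files(directory, pattern="*.graphql"):
    """Concatenate schema fragments in a stable (alphabetical) order.

    Each fragment is preceded by a `#` comment naming its source file,
    so the combined schema stays traceable back to the pieces.
    """
    parts = []
    for path in sorted(Path(directory).glob(pattern)):
        text = path.read_text(encoding="utf-8").strip()
        if text:  # skip empty fragments
            parts.append(f"# --- {path.name} ---\n{text}")
    return "\n\n".join(parts) + "\n"
```

With something like this, each type can live in its own file under version control, and only the combined output ever touches the cluster.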

Can you explain what is your issue with this in detail?

Can you tell what are the solutions in other DBs related to this? this looks obvious manual work to me. Humm, maybe you want something like Ratel to do things like renaming?

I still believe it would be possible to do live upgrades with Dgraph due to the nature of it being a Cluster. But it takes attention to develop it.

For example: let’s say you have 6 Alphas and 3 Zeros, all on version v21.03.2. You could create a second cluster of the same size on a newer version and then use moveTablet to move the predicates to this new cluster. In practice, you would double the cluster size temporarily. That is the price you would pay to have live upgrades.

The only problem with this would be if there were drastic changes in the way instances communicate within the cluster.

All of that would avoid some manual work during an upgrade. But while tablets are being moved, mutations get blocked.
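As a rough sketch of the mechanical part of that idea, the Python helper below only builds the Zero `/moveTablet` request URLs for a list of predicates; it sends nothing. The Zero address, the predicate names and the target group number are placeholders, and whether groups running different versions can coexist in one cluster is exactly the open question raised above.

```python
from urllib.parse import urlencode


def move_tablet_urls(zero_http_addr, predicates, target_group):
    """Build one Zero /moveTablet URL per predicate (tablet).

    zero_http_addr is the host:port of Zero's HTTP endpoint, and
    target_group is the group the tablets should be moved to.
    """
    return [
        f"http://{zero_http_addr}/moveTablet?"
        + urlencode({"tablet": predicate, "group": target_group})
        for predicate in predicates
    ]
```

Issuing the moves one predicate at a time would at least limit how much of the data has mutations blocked at any given moment.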

2 Likes

TLDR; I feel like Dgraph’s biggest need is Developer eXperience. DX is the only thing stopping me from using it on more projects, or pushing my clients and partner devShops to use it more.

@jdgamble555

To be clear, you’re talking about Nested Filters(probably the most requested GraphQL Feature).

And this is my suggestion, the @reference directive. This could be accomplished now by using a post-hook lambda, but you would have to write it. This is definitely a huge missing feature, but do-able now.

Yup, Nested Filters. I realize that you can do it now by writing a custom Lambda, but with many of DGraph’s modern competitors you don’t have to.

@MichelDiz

Well, you can break the schema into separate files. It is very simple. The Schema API is very straightforward, you can add OR delete. Which means you can add any piece of the schema at any moment that it won’t break or undo anything. It will be a problem tho if you set a different thing to the same predicate several times. This can trigger the indexing process or something.

Can you explain what is your issue with this in detail?

Sure! To start with let me address your points:

The Schema API is very straightforward, you can add OR delete.

Yes, but why? Building a basic file list on the side, an editor on the right, and saving in the UI is not a large task, particularly if it adds additional insights, checks, and guidance. Most competitors have one. Having a central UI representing our DB, backend, and its metadata makes it a lot easier for me to set up team coordination and training.

I feel like these kinds of small things are part of the reason DGraph may be struggling. Your response comes across as a little defensive, which makes sense. You’ve built a neat product and want people to like it and use it, but it isn’t polished and it isn’t intuitive, and those small differences add up in the long run by pushing training costs back on to me, your customer.

Is the DGraph team hungry to build something revolutionary? You’re on the cusp of having my dream backend. I don’t know you, but I don’t see that hunger here on the forums. I see responses indicating you feel like you should have already won and everyone should recognize it.

“Bulk updates, bulk deletes, renaming columns, moving/copying data from one column to another.”

Can you tell what are the solutions in other DBs related to this? this looks obvious manual work to me. Humm, maybe you want something like Ratel to do things like renaming?

Does anyone on your team have significant SQL experience?
I don’t care where it happens, even if it was just an API call, and it’s just an example of some of the basics I’ve had to fight. Here are some examples of things that are 1 minute tasks in Postgres, but take at least an hour or two, if not more, in dgraph:

Renaming a column in SQL looks like this:

alter table product rename column product_price to price;

This will change the name of the column, but keep the data type and all data on the row connected to the row still, no data loss, no orphaned predicates. I realize it’s not as easy here since you’re maintaining each field via predicates in a map instead of indexing rows by byte length.

There are two hard things in programming: Cache invalidation, naming things, off by one errors.

Jokes aside, naming things is hard. Anyone not renaming their fields occasionally either has a massive database, or is foolish.

Bulk copy
update TableB set column1 = (select Val from TableB where ....), column2 = (select Val from TableB where ...);

In some cases I might need to do a window to prepare the data

update TableA set column1 = data.a
from (select a, b, c from TableC C join TableD D on C.id = D.id group by ....) as data
where data.b = TableA.g;

Now that I understand DQL better, this mostly works until I have to get into anything nested.
A live example that came up yesterday, and that I’m still fighting. In my product we have tagging. So my schema looks like this:

type Tag { id: ID, taggedItems: [Item] @hasInverse(field: tags) }
type Item {id: ID, tags: [Tag] @hasInverse(field:taggedItems) }

Except somehow the @hasInverse isn’t working, so I only have one collection populated. I’m trying to copy from Item into Tag, and because of a bug in DQL, nested predicate variables are not flattened and it doesn’t work (this is talked about elsewhere on the forum).

And once that’s done I still have to figure out exactly which pattern of adding data will trigger the @hasInverse correctly and which one will not.

Bulk Deletes
Now that I know DQL somewhat better, this is mostly okay. Getting an error in Ratel that says “t” and nothing else, then having to dig into the network logs and the docs to figure out the status codes, is annoying but tolerable. It would be a lot nicer if it just said “Not logged in”. Non-200 responses can carry status messages that could be displayed; it isn’t complicated either.

My biggest issue with bulk deletes is even if it does work, I’m constantly hitting the 1,000,000 count meaning I not only have to figure out the right way to modify the data, I have to architect a paging solution. Yes, I need to learn more about DGraph and get better at DQL, I’m working on that, but my goal is a simple copy of data that would take me 1 minute in SQL. I don’t make any money wrestling DGraph data, I lose money.
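The paging workaround mentioned above can be sketched in a few lines of Python: split the uids into fixed-size chunks and emit one delete payload per chunk. This only builds RDF N-Quad strings; the function names, the predicate and the batch size are illustrative assumptions, and fetching the uids in the first place is out of scope.

```python
from itertools import islice


def batched(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def delete_batches(uids, predicate, batch_size):
    """Yield one delete payload (N-Quads) per batch of uids.

    Each line uses the `S P * .` wildcard form to remove every value
    of `predicate` on that node.
    """
    for chunk in batched(uids, batch_size):
        yield "\n".join(f"<{uid}> <{predicate}> * ." for uid in chunk)
```

Each yielded string would be sent as its own delete mutation, which keeps every request well under whatever result or mutation limit the cluster enforces.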

Removing a column
In SQL:
alter table TableA drop column old_column cascade;

And just like that I’ve removed a column, all of its data (no orphaned predicates), and removed any dependent objects (such as views or foreign-key constraints) that were reliant on it.

Computed Columns
In Postgres:
alter table TableA add column name text generated always as (first_name || ' ' || last_name) stored;

And just like that I have a computed column that recalculates itself every time the first_name or last_name is updated.

Summary
DGraph is 70% of the way to being my dream backend. I specialize in rapid full-stack development, but I also try to account for scalability. I would love to have a backend that lets me quickly throw a prototype together for a client, and then not have to re-work it after 2 years as they scale. AWS lambdas can deliver scale, but not the flexibility of a graph DB, and it’s not fast. SQL is fast and clean, but doesn’t scale without significant continued effort and investment, and even re-planning and restructuring your data for partitions.
DGraph could do both/all of that, you just need polish honestly. Too many small barriers that add up so that I can’t convince one of my clients to train their junior on it or to take the plunge.

My backend selection process:
Does the project need scale? If no, then use Hasura because I can build a lot faster on it.
If yes, continue:
Does the project also have a lot of relationships like a recommendation engine, a feed/timeline, or something like that? Consider DGraph
Does the product need a lot of subscriptions (live chat, etc)? Consider Hasura since they’ve proven their subscription effectiveness (See Hasura scale to 1 million active GraphQL subscriptions/live queries)

5 Likes

I wrote a Node.js script for that exact thing:

1 Like

The big thing here is that SQL has been on the market since the 1970s and Dgraph is about 6 years old. Also, graph DBs are a totally different paradigm from SQL. Several SQL concepts won’t work in a graph DB. For example, we don’t have “columns”; we have predicates, which are the smallest part of a node/entity.

In your whole life, you were introduced to the SQL world(from some course degree or dev courses) and just now you are learning a new thing. I personally have too little experience with SQL. I have just used MySQL in my whole life and nothing too deep. Dgraph was hard for me in the beginning, but after understanding the hard parts all get very easy for me.

In the end, you can’t compare the two for the reason SQL is made for columns and rows, has 51 years of existence, virtually everybody is exposed to it. And a Graph is a truly relational DB. Graphs are made to turn relations simpler. And if you buy this idea, should take the whole package right? IMHO I don’t think we should make a SQL like lang just to please SQL users.

I agree that our main issue is to educate users and explaining to them that this is a different paradigm.

Yes, I think you come from Dgraph cloud, right? The way you say this sounds like you have used the Cloud and just the GraphQL feature. Building the UI is “simple” for sure. This can be accomplished better with time. Dgraph dev cores are the best in the dev core. The UI part is second in importance. But as I said, you can do small pieces of your schema and hit the cluster any time that you won’t have any problem. The only “problem” is that you have to do it in your client end or via cURL(even Ratel you can do this. Just go to Bulk edit).

It is not defensive. I just want to be aware of it, and to share with you a possible solution for now.

I think so. But that’s the thing, they’re focused on the Graph DB paradigm and not SQL.

Here Migrating (renaming predicates, etc) - Users - Discuss Dgraph

The query is bigger, but it does the same thing.
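For readers without access to the linked post, here is a hedged Python sketch of what such a rename can look like as a DQL upsert: copy every value of the old predicate to the new one, then delete the old one. It only builds the query text; the function name and the predicate names are made up, it may not match the linked post, and it assumes a scalar predicate whose new name is already declared in the schema.

```python
def rename_predicate_upsert(old, new):
    """Build a DQL upsert block that copies `old` to `new`, then deletes `old`.

    The query part binds the matching nodes to `v` and their values to `o`;
    the mutation part writes val(o) under the new name and wildcard-deletes
    the old predicate on the same nodes.
    """
    return f"""upsert {{
  query {{
    v as var(func: has({old})) {{
      o as {old}
    }}
  }}
  mutation {{
    set {{
      uid(v) <{new}> val(o) .
    }}
    delete {{
      uid(v) <{old}> * .
    }}
  }}
}}"""
```

Calling `rename_predicate_upsert("product_price", "price")` gives the Dgraph-side counterpart of the one-line SQL rename quoted earlier, which shows how much more the user has to spell out by hand.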

You probably can do root-level copies with the same upsert block query.

Did you open an issue for that? share the link pls.

hasInverse doesn’t work out of the box in DQL. You have to learn deeper for this.

Nine times out of ten, this error is a typo.

See? you still are in the SQL paradigm. The way Dgraph works is like a puzzle. Some tasks that you do in SQL are too expensive for us to put natively to users. Deleting something in cascade is dangerous, so ideally, the user should create their own query for it. Thus avoiding unwanted results.

You can also use “Drop predicate”.
https://dgraph.io/docs/ratel/schema/#bulk-edit-schema--drop-data
https://dgraph.io/docs/cloud/cloud-api/backend/#api-command-6

I’m not covering all the other points you have mentioned, before and now, because they are either valid or something for Manish to pick up or not.

2 Likes

@MichelDiz - I think you guys are looking at this the wrong way. Why not be experts in general SQL concepts, NoSQL concepts, and how other graph databases do their job?

Why reinvent the wheel? Understanding what does and doesn’t work can save you 45 of those years.

The alternative is having orphaned phantom predicates, which should not be an option either. This will and does also create unwanted results. This is a perfect example of how SQL solves this problem keeping data consistency with FK CONSTRAINTS, an under-the-hood feature that seems to be entirely missing in DQL.

I believe these queries are mathematically exact, so why does the user need to create their own? If there are unwanted results, it is probably for another reason (a bug in the database)?

That being said, I am not pretending to understand anything about DQL, just want a GraphQL that works similar to Hasura.

J

3 Likes