Make Multi-Tenancy open source

Relaxe111 commented :

Well, this is not exactly that. According to the comments written earlier, they will consider whether it makes sense to open source multi-tenancy. So I think that, rather than speculating about this issue, it would be better to wait for an official announcement from the Dgraph team.

Willem520 commented :

@Willem520, according to this roadmap, they do not plan to support this feature in the community edition; it will be included in the enterprise edition.

I think they will reconsider this feature

grizzly-monkey commented :

We were evaluating dgraph for our SaaS application. The planned feature that we would need is Multi-Tenancy.
Most DBs provide this as part of their OSS code, and it is a basic necessity these days.
It would be great if you would consider making this part of the open source roadmap.

Thanks
Jeet

liveforeverx commented :

Hi, @manishrjain !

I would like to join all other people here (at least 5) with a question about multitenancy.

I think that almost everyone who uses Dgraph is interested in this feature, would benefit from it, or would get a better user experience from it. Even people who just run pet projects and cannot afford a full licence would be interested, if only to avoid running a separate Dgraph instance for each pet project, and usually two instances per project (because the dev and test datasets differ and the test data gets cleaned, so where you would run a single Postgres, you most likely need to run multiple Dgraph instances per project).

I know people who simply do not take a database seriously without this feature. As I remember, it was the biggest objection to adoption at my previous company, it is one of the important points at my current company, and it was one of the most frequent complaints I heard from people at meetups.

It would be great if you would consider making this feature accessible to everyone by making it open source.

Just some examples:
Neo4j offers it for free: Managing Multiple Databases in Neo4j - Developer Guides
ArangoDB offers it for free: Working with Databases | Databases | Data models & modeling | Manual | ArangoDB Documentation
OrientDB offers it for free: http://orientdb.com/docs/last/OrientDB-REST.html#post---database

I personally do not know of another database that lacks this feature, or one that offers it only in an Enterprise edition.

But even if it were a one-time payment at a price acceptable for a solo developer, I would consider buying this feature for personal use on my development machine, to get the same UX from Dgraph that I know from every other database I have used personally and professionally. There was an opinion that other databases do not offer it for free; I don’t know of any other database that supports multiple databases, has an open-source version, and doesn’t offer the feature for free.

My personal example: at the moment, I reset Dgraph on every switch between developing an adaptor for Dgraph and working on my pet project, and after each switch I refill my pet project with data. The tests in my pet project are carefully designed to clean up all the rubbish they create, to avoid this problem and to be able to use dev and test on the same database.

P.S. I have given meetup talks about Dgraph and Elixir, and I’m the maintainer of the most advanced Elixir driver for Dgraph.

sorenhoyer commented :

dgraph looks dope, but if multiple databases / multi-tenancy support is only planned for enterprise customers, I think most people doing SaaS applications, at least startups, will just look elsewhere (e.g. ArangoDB), which would be a real shame and possibly a lot of lost revenue (eventually). I’m sure people will be willing to pay once they have a decent number of customers, so if you don’t want to give it away for free, maybe set a limit of, say, 200, 500 or 1000 databases (one per customer/tenant) for the Community version. If you need more, you can pay for it. I, for one, am in that situation right now. I’d definitely have started my first SaaS on dgraph if only you had planned this as a community feature, but in the end I decided to go with Arango because of this :confused:

Hi,

Do you have a plan for when, or in which version, this will be made open source?

Thanks a lot.

Just came here to share my thoughts on this matter.

I agree with most of the above: other OSS databases come with multi-namespace/schema support out of the box, and this is something which Dgraph IMHO should support too.

Some others have also mentioned, and I would agree with this point, that ACLs are an acceptable feature to keep as enterprise-only. I would prefer that they weren’t, but I understand why you would want to make it so.

From my use-case perspective, we have data which is owned by different accounts, and when querying Dgraph we only want to return the data owned by the requesting account. Our users don’t get raw access to the Dgraph instances; they go through our application services, which apply authentication/authorisation rules and handle other non-graph data. As such we don’t require ACLs, just a way to query only the data belonging to that user. Our current approach is to add a predicate holding the account ID to every node. However, this is starting to slow down our queries now that we have 1M+ nodes, even when the account itself owns only a few hundred.
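The account-ID-per-node workaround described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the predicate name `account_id`, the helper `account_scoped_query`, and the example type `Person` are hypothetical and do not come from the post. It also assumes the predicate is indexed in the DQL schema (for example `account_id: string @index(exact) .`), since `eq` at the query root needs an index. It does not claim to fix the slowdown reported above.

```python
import re
from typing import Dict, Tuple

# Hypothetical predicate that stores the owning account's ID on every node.
ACCOUNT_PREDICATE = "account_id"

# Type names are interpolated into the query text, so restrict them to a safe pattern.
_NAME = re.compile(r"[a-zA-Z][a-zA-Z0-9_]*")


def account_scoped_query(account_id: str, type_name: str) -> Tuple[str, Dict[str, str]]:
    """Build a DQL query (plus variables) limited to one account's nodes of one type."""
    if not _NAME.fullmatch(type_name):
        raise ValueError(f"invalid type name: {type_name!r}")
    query = (
        "query nodes($acc: string) {\n"
        f"  nodes(func: eq({ACCOUNT_PREDICATE}, $acc)) @filter(type({type_name})) {{\n"
        "    uid\n"
        "  }\n"
        "}"
    )
    # The account ID travels as a query variable rather than being pasted into the text.
    return query, {"$acc": account_id}


if __name__ == "__main__":
    q, variables = account_scoped_query("acct-42", "Person")
    print(q)
    print(variables)
```

With a client such as pydgraph, the pair would typically be passed along as `txn.query(q, variables=variables)`; every query the application issues has to go through a helper like this, which is exactly the user-side burden the post is describing.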

We’re about to start a new project and would have preferred to use DGraph. We’ve been putting off the project in priority to wait for multi-tenancy to come, but now we’re most likely going to go down a different approach.

We’re still a startup, <20 staff including a few developers, so the opportunity for enterprise licenses is a possibility for the future, but not now. The only way smaller companies are ever going to invest in an enterprise license is if they can use it in anger for 12-18 months and have success using it; this issue is preventing our company from doing so.

I agree with most of the comments above, and I definitely think that multi-tenancy should be in the community edition. I also think that ACLs should be in the community edition, since that’s also a fairly fundamental part of databases.

In general, I think enterprise versions should be for things like:

  • 24x7x365 management / support
  • Improved resource utilization (e.g. more cores / parallel processes)
  • Quicker patches for bugs/updates
  • Handling multi-datacenter replication (learner nodes fall into this category)

These kinds of features often make financial sense under an enterprise contract, because doing without them on the free version would cost the company more than the enterprise version does.

However, as many have pointed out, lack of multi-tenancy and ACLs can be a barrier for even adopting the technology in the first place, thus losing out on an opportunity for a future customer when they really see the value in the product for their business (which I believe many businesses would do once they gave it a chance).

This is especially important with a product like Dgraph because in a way it’s potentially paradigm-changing for many businesses, and as mentioned above, it could just be a non-starter because of the lack of multi-tenancy.

Just to chime in here, MT was always meant to be a proprietary feature, just as the distributed aspect of Dgraph was always meant to be open source.

The rationale for Dgraph’s proprietary features has been that they can be worked around with more user-side code, so they are not absolutely necessary for the functioning of the DB. That rationale still holds, and therefore we don’t have any plans to make MT or ACLs open source.

Both of these, along with all other enterprise features, are, however, present in Dgraph Cloud. That would be the recommendation for anyone who’d like to use these features.

As a new user looking for a DBMS for a world-scale project, I can say that this statement doesn’t take into account the real-world needs of this kind of development. I was pretty much convinced by your product, but I’m definitely not willing to pay $199/month just to get this basic feature, and I’m not willing to bloat my codebase with work that is the DBMS’s job just to be able to use the community version. I’m not sure anyone else would want to do that either.

You say that it’s available in Dgraph Cloud, but according to your pricing plan, MT is only available on dedicated servers.

Not many companies are willing to start by spending so much on a new product, considering the learning curve, your competitors’ offerings, and the obvious need for these features in any serious project. Pet projects can perhaps cope without these features, but to run a real business…

Or, at least, make it available on the shared plan. I can consider investing time and effort to migrate from the good old RDBMS paradigm to graph databases, but I can’t reasonably spend so much money and so many resources just to build an advanced proof of concept.

Many developers will agree on that point.

In my case, as a result, it’s a no. I hope that you will change your mind about it; Dgraph sounds really promising.

Just to tackle this point. The shared instances start at $9.99/mo. One can spin up multiple of those as needed, each instance acting like a graph silo, to get MT. This would still be cheaper than running an Amazon RDS or equivalent.

So, since the pricing page is unclear: what would a shared instance with only MT enabled cost? What criteria is that price based on? You must have an idea. And if you do, why not list the cost of this critical option on the product page?

My guess is that you know this feature is essential, and you are trying to convince your users to become customers, or your customers to spend more and become better customers. You run a business, you need more customers, and free users don’t pay the bills.

But the cost of adopting your product is not only the monthly price. It’s time investment, employee training, and many rounds of trial and error. And before they know whether it will be worth the investment, your users must be able to produce a proof of concept to convince their company, or themselves, that your product is the way to go. Since it’s not clear what the minimal features cost, features that any serious DBMS provides natively, even in the open source world, you may discourage users from even considering your product.

If you ease the creation of PoCs for your users, you will get more customers.

We are currently considering Dgraph as an option for switching our stack from PostgreSQL / ClickHouse to a graph database / ClickHouse bundle.

MT as an open-source option would definitely favor the decision to choose Dgraph. I’ll share our thought process here.

While having MT as open-source would be great we are ok with writing our own crutches to patch the functionality we need. The main question is whether you plan to provide all the performance / clustering / core functionality as open-source (machine learning features down the road included).

The sensitive nature of the data we work with limits us to hosting it ourselves. On the other hand both for me and co-founders of fellow startups the speed of development is more important than costs associated with cloud offerings. Ease of setup is the major driver for using managed k8s for example.

I believe you wouldn’t lose many DBaaS customers if you offered all the functionality in the open source version. In my experience, the lower maintenance cost of a cloud offering is the decisive factor, with faster development speed being the second most important. For those who work with sensitive data and are forced to manage the cluster themselves, not having the functionality might be a major roadblock.

Personally, I would like to know how likely it is that core functionality will be locked behind the cloud offering, since that is the main issue for us. Rolling our own MT / ACL, while cumbersome, is still possible.

TIA

I’ve been so excited about Dgraph, started my new project with it, and then I ran into this blocker.

I don’t need ACLs. I agree that ACLs make sense as an enterprise feature. I do however need to create multiple logical databases within a single Dgraph instance.

The workarounds suggested above won’t work for me:

  • I can’t host multiple Dgraph instances because my project will have 10000+ logically separate databases.
  • I can’t go the enterprise route because my project is open source.

I think this will be a blocker for many open source projects that want to integrate with Dgraph. Supporting other open source projects seems like it goes well with Dgraph’s 1st core value of “Be Open Source”.

If it would be approved, I am willing to do the work and submit a pull request to bring namespaces to the community version of Dgraph (without ACLs).

How would you secure namespaces without ACL? Essentially you could never allow someone to have access to change the schema for only their namespace without also giving them the access to change the schema for every namespace.

I don’t need to secure the namespaces, I just need logical separation. I need to be able to upload a schema for one namespace without blowing away the schemas for the other namespaces.

I think the point of the ACL is to secure the namespaces, so when I say I don’t need the ACL, I’m saying I don’t need the security features.

I think this lines up with what others are saying about wanting separate namespaces for separate apps in local development, or separate test and dev namespaces.

Logical database separation is a critical feature for local development and testing even if you don’t use it at all in production.

Yeah, I thought about putting postfixes on my apps, so:

App1: study
App2: music

Then you would have:

type User_study {
  id: ID!
  username: String
  ...
}
type User_music {
  id: ID!
  username: String
  ...
}

I guess you would query like queryUser_study, although I am not sure what the naming restrictions would be (can you have _?)… maybe even prefixes somehow…

Not ideal, but if you use a query builder, this would not be terrible.

J

This is what I actually recommend. That way you can still query across the single database if needed; you can’t query across the enterprise namespaces at all.

As for naming restrictions, the spec limitation is that type and field names in GraphQL must match the regex [a-zA-Z][a-zA-Z0-9_]*.

By convention they don’t mix cases, but this is not required (e.g. lowercase, UPPERCASE, snake_case, camelCase, PascalCase).

But they cannot contain periods (.) or other characters outside the above set, such as Chinese characters.
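As a quick way to try candidate names against that rule, here is a minimal Python sketch. It uses the pattern exactly as quoted in this thread; the helper name `is_valid_name` and the sample names are made up for illustration. Note that the GraphQL specification’s own Name grammar also allows a leading underscore, so the pattern from the post is slightly stricter than the spec.

```python
import re

# Pattern as quoted in the thread (stricter than the GraphQL spec, which
# also allows a name to start with an underscore).
NAME_PATTERN = re.compile(r"[a-zA-Z][a-zA-Z0-9_]*")


def is_valid_name(name: str) -> bool:
    """Return True if the whole name matches the quoted pattern."""
    return NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    # Suffix and prefix styles from the thread, plus a few names that should fail.
    for candidate in ["User_study", "study_User", "study.User", "用户", "2fast"]:
        print(candidate, is_valid_name(candidate))
```

Both the suffix style (`User_study`) and the prefix style (`study_User`) pass, while a dotted name or one containing non-ASCII characters does not.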

Ok, then I would definitely do prefixes:

type study_User {
  ...
}
type music_User {
  ...
}

J

It may be interesting to note that Dgraph internally prefixes every predicate and type with its namespace, represented as a number, so prefixing your predicates/types yourself is virtually identical to what Dgraph would do at the storage layer.
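For anyone who wants to apply the prefixing convention from this thread consistently, for example inside a query builder, a small helper could look like the sketch below. The function names `namespaced_type` and `query_field` are hypothetical, and the `query<Type>` field name simply follows the naming guessed earlier in the thread (`queryUser_study`); this is user-side code, not Dgraph’s internal namespace encoding.

```python
import re

# Same name rule as quoted earlier in the thread.
_NAME = re.compile(r"[a-zA-Z][a-zA-Z0-9_]*")


def namespaced_type(app: str, type_name: str) -> str:
    """Prefix a type name with its app, e.g. ('study', 'User') -> 'study_User'."""
    for part in (app, type_name):
        if not _NAME.fullmatch(part):
            raise ValueError(f"invalid name part: {part!r}")
    return f"{app}_{type_name}"


def query_field(app: str, type_name: str) -> str:
    """Name of the list-query field, assuming the query<Type> pattern from the thread."""
    return "query" + namespaced_type(app, type_name)


if __name__ == "__main__":
    for app in ("study", "music"):
        print(namespaced_type(app, "User"), query_field(app, "User"))
```

Keeping the prefix logic in one place means application code keeps referring to plain `User`, and only the helper knows which app’s types are being addressed.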