Multi-Tenancy 2021

Following from Multi Tenancy in Dgraph

Multi-Tenancy in Dgraph is a proprietary feature. Discussions about making it open source belong in the topic Make Multi-Tenancy open source.

Multi Tenancy

Goals

  • Key-based separation of data
    • Do this in Dgraph, instead of Badger for simplicity.
    • In Dgraph, we just need to modify keys.go and context.
    • Context would carry the namespace information from the query
  • Each user belongs to one namespace
  • Guardians of the cluster
  • Restores and exports would need to specify a namespace.
  • When we create a namespace, we also need to create the corresponding internal types and predicates, so that we can store its GraphQL schema and the like.

PRs to create

  • Given a namespace, the data should be separated out using keys and context (don’t worry about internal types, don’t worry about namespace creation, just focus on reading and writing).

  • Create and Delete a namespace.

    • Only expose uint64 as the namespace.
    • We use a random number generator to pick namespace IDs.
    • As namespaces are banned and their data is dropped, we can reuse those numbers.
  • For that, we need the “ban” feature in Badger. We need a prefix in keys.go built as (DefaultPrefix + namespace pred with separator). So, essentially, we need a NamespacePrefix, the equivalent of PredicatePrefix in keys.go.

    • Get the schema back correctly, being aware of the namespace.
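
The create/delete flow above can be sketched roughly as follows, assuming an in-memory registry. Registry, Create, Ban, MarkDropped and the 9-byte NamespacePrefix layout are hypothetical names and choices for illustration only; a real implementation would need to persist this state and coordinate it across the cluster.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math/rand"
)

type nsState int

const (
	nsActive nsState = iota
	nsBanned
)

// Registry tracks which namespace numbers are in use (hypothetical type).
type Registry struct {
	rng    *rand.Rand
	states map[uint64]nsState
}

func NewRegistry(seed int64) *Registry {
	return &Registry{rng: rand.New(rand.NewSource(seed)), states: map[uint64]nsState{}}
}

// Create picks a random uint64 that is neither 0x0 nor currently tracked.
func (r *Registry) Create() uint64 {
	for {
		id := r.rng.Uint64()
		if id == 0 {
			continue // 0x0 is reserved for the default namespace
		}
		if _, taken := r.states[id]; taken {
			continue // active, or banned with data not yet dropped
		}
		r.states[id] = nsActive
		return id
	}
}

// Ban marks an active namespace as banned; its data is dropped later.
func (r *Registry) Ban(id uint64) error {
	if st, ok := r.states[id]; !ok || st != nsActive {
		return errors.New("namespace is not active")
	}
	r.states[id] = nsBanned
	return nil
}

// MarkDropped releases the number for reuse once the data is gone.
func (r *Registry) MarkDropped(id uint64) error {
	if st, ok := r.states[id]; !ok || st != nsBanned {
		return errors.New("namespace is not banned")
	}
	delete(r.states, id)
	return nil
}

// NamespacePrefix returns an assumed prefix: one prefix byte plus the
// big-endian namespace, so every key of the namespace starts with it.
func NamespacePrefix(ns uint64) []byte {
	p := make([]byte, 9)
	p[0] = 0x00 // stand-in for DefaultPrefix
	binary.BigEndian.PutUint64(p[1:], ns)
	return p
}

func main() {
	r := NewRegistry(42)
	id := r.Create()
	fmt.Printf("created %#x with prefix %x\n", id, NamespacePrefix(id))
	fmt.Println("ban:", r.Ban(id), "dropped:", r.MarkDropped(id))
}
```

A number only becomes reusable after both steps (ban, then drop), so a new tenant can never see leftover keys from the previous owner of that number.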

Next PRs

  • ACLs with namespace.
    • The guardians of the default 0x0 namespace are the guardians for the entire cluster.
    • The JWT carries the namespace and group. Together, they tell you what kind of guardian the user is.
    • Namespace guardian should be able to add/delete members of that namespace.
    • 0x0 guardians can do anything.
    • 0x0 guardians can create / delete namespaces.
    • Each member is a member of that namespace only.
      • A 0x0 guardian (0x0G) creates two namespaces, A and B.
      • A adds member M.
      • B adds member M.
      • M’s string ID (stored against the namespace) would still end up creating two Dgraph UIDs.
      • So, M is essentially two different profiles.
      • This is IMPORTANT to ensure that each namespace’s data is fully encapsulated within that namespace’s key prefix in Badger. This allows easy namespace Ban and then drop, without having to do complex “cleanups” later.
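
A rough sketch of the ACL rules listed above, assuming the JWT has already been verified and decoded into a plain struct. The Claims fields, the group name "guardians", and the Members type are assumptions for illustration, not the actual ACL code.

```go
package main

import "fmt"

const (
	rootNamespace  uint64 = 0x0
	guardiansGroup        = "guardians" // assumed group name
)

// Claims is a hypothetical view of the decoded JWT payload.
type Claims struct {
	UserID    string
	Namespace uint64
	Groups    []string
}

func (c Claims) isGuardian() bool {
	for _, g := range c.Groups {
		if g == guardiansGroup {
			return true
		}
	}
	return false
}

// IsClusterGuardian: a guardian of 0x0 is a guardian of the whole cluster.
func IsClusterGuardian(c Claims) bool {
	return c.isGuardian() && c.Namespace == rootNamespace
}

// CanManageNamespaces: only 0x0 guardians create or delete namespaces.
func CanManageNamespaces(c Claims) bool { return IsClusterGuardian(c) }

// CanManageMembers: 0x0 guardians can act anywhere; a namespace
// guardian only inside its own namespace.
func CanManageMembers(c Claims, ns uint64) bool {
	return c.isGuardian() && (c.Namespace == rootNamespace || c.Namespace == ns)
}

// memberKey scopes a string ID to a namespace.
type memberKey struct {
	ns uint64
	id string
}

// Members hands out a distinct UID per (namespace, string ID) pair.
type Members struct {
	next uint64
	uids map[memberKey]uint64
}

func NewMembers() *Members { return &Members{uids: map[memberKey]uint64{}} }

// Add returns the existing UID for (ns, id) or allocates a new one.
func (m *Members) Add(ns uint64, id string) uint64 {
	k := memberKey{ns, id}
	if uid, ok := m.uids[k]; ok {
		return uid
	}
	m.next++
	m.uids[k] = m.next
	return m.next
}

func main() {
	const nsA, nsB = 0xA, 0xB
	m := NewMembers()
	// The same string ID "M" in A and B yields two separate UIDs.
	fmt.Println(m.Add(nsA, "M"), m.Add(nsB, "M"))
	gA := Claims{UserID: "ga", Namespace: nsA, Groups: []string{guardiansGroup}}
	fmt.Println(CanManageMembers(gA, nsA), CanManageMembers(gA, nsB), CanManageNamespaces(gA))
}
```

Keying members by (namespace, string ID) is what keeps M's two profiles independent, so dropping A never has to touch anything stored under B.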

Non-Goals

  • Live Loader, Bulk Loader
  • Don’t muck with transaction timestamps for now.
  • Backups are common for now. No specific namespace backup for now.
  • Exports are common as well.
  • Don’t worry about /state for now.
  • What would be the tablet name?
  • Need a faster way to “ban a prefix” in Badger [Task in itself]
    • If we drop a namespace, we don’t stop the world.
    • Instead, we write it down into Badger manifest.
    • And every time we compact, we drop those keys, if they belong to that namespace.
    • Badger would also not write or read data from that namespace anymore.
  • Guardians of the namespace
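
The “ban a prefix” idea can be illustrated with a toy in-memory store. This is not Badger's API: Store, Ban, Compact and the key layout (one prefix byte followed by an 8-byte namespace) are hypothetical, and the banned set stands in for what would be recorded in the manifest.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

var ErrBanned = errors.New("namespace is banned")

// Store is a toy key-value store with a persistent-in-spirit ban list.
type Store struct {
	data   map[string][]byte
	banned map[uint64]struct{} // stand-in for the manifest entry
}

func NewStore() *Store {
	return &Store{data: map[string][]byte{}, banned: map[uint64]struct{}{}}
}

// NSKey builds an assumed key: [prefix byte][8-byte namespace][suffix].
func NSKey(ns uint64, suffix string) []byte {
	k := make([]byte, 9, 9+len(suffix))
	binary.BigEndian.PutUint64(k[1:], ns)
	return append(k, suffix...)
}

func (s *Store) isBanned(key []byte) bool {
	if len(key) < 9 {
		return false
	}
	_, ok := s.banned[binary.BigEndian.Uint64(key[1:9])]
	return ok
}

// Ban only records the namespace; no data is touched, so nothing stops.
func (s *Store) Ban(ns uint64) { s.banned[ns] = struct{}{} }

// Set refuses writes into a banned namespace.
func (s *Store) Set(key, val []byte) error {
	if s.isBanned(key) {
		return ErrBanned
	}
	s.data[string(key)] = val
	return nil
}

// Get refuses reads from a banned namespace, even before compaction.
func (s *Store) Get(key []byte) ([]byte, error) {
	if s.isBanned(key) {
		return nil, ErrBanned
	}
	return s.data[string(key)], nil
}

// Len reports how many keys are physically stored.
func (s *Store) Len() int { return len(s.data) }

// Compact drops keys of banned namespaces and returns how many it removed.
func (s *Store) Compact() int {
	dropped := 0
	for k := range s.data {
		if s.isBanned([]byte(k)) {
			delete(s.data, k)
			dropped++
		}
	}
	return dropped
}

func main() {
	s := NewStore()
	s.Set(NSKey(1, "name"), []byte("alice"))
	s.Set(NSKey(2, "name"), []byte("bob"))
	s.Ban(1)
	_, err := s.Get(NSKey(1, "name"))
	fmt.Println("read after ban:", err, "stored keys:", s.Len())
	fmt.Println("dropped on compaction:", s.Compact(), "stored keys:", s.Len())
}
```

The ban takes effect immediately for reads and writes, while the physical deletion is deferred to compaction, which is the property that avoids stopping the world.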

Backup Thoughts

If each namespace has its own backup, then we don’t need to change the format of the backup data. The client would just tell us which namespace to load it into.

But this won’t work for cluster-wide operations, which we need for maintenance. So, let’s just go with common backups for now.
