How to model schema if we have many to many relationship between multiple models/entities?

I am designing a schema for my new application. During this process, I am stuck at one point. Here is my use case.

I have three entities, User, Group, and Company.
They are related to each other as many to many. For example,
User can be a part of multiple companies,
A company can have multiple users.
User + Company can have multiple groups. (User + Company + Group combination should be unique)

Following is SQL schema which represents this relationship

Table: User
UserId    Name    
1             Alice
2             John

Table: Group
GroupId    Name
11             Group 1
12             Group 2

Table: UserGroupMap
Id    UserId    CompanyCode    GroupId   
1     1             1234                    11
2     1             1234                    12
3     1             5678                    11
4     2             1234                    12
 

Here is my attempt to convert this use case into DGraph schema

type User {
    name
    companyCode    
    userGroups
}

type Group {
    name
}

name: string @index(hash) .
companyCode: string @index(hash) .
userGroups: [uid]      @reverse .

But with this approach I can able to maintain uniqueness of User + Company + Groups, but now I am duplicating user information for there each company.

How can I transform this into DGraph schema?
Any thoughts?

1 Like

I don’t see how you are duplicating user information for each company. Do you have a query?

Check the UserGroupMap table in SQL schema. For UserId 1 (Alice) we have 2 companies associated (1234 and 5678). If we migrate this data into DGraph schema then we will have 2 user nodes as follows.

User Node 1
Name: Alice
Company: 1234

User Node 2
Name: Alice
Company: 5678

Ah, ok.

Are groups and company code separate concepts? Can they exist independently of one another? Or is a company tied to a groupID

Yes, they are separate concepts/entities. They can be related to each other through User only.
But we have one specific use case.
User + Company + Group combination should be unique.

So something like this would work:

type User {
    name
    groups
    companies
}

type Company {
    code
}

type Group {
     name
}

name: string @index(hash) .
code: string .
groups: [uid] @reverse .
companies: [uid] @reverse .

But then How can I get the list of Groups for User + Company combination?
e.g. Check the UserGroupMap SQL table

Table: UserGroupMap
Id    UserId    CompanyCode    GroupId   
1     1             1234                    11
2     1             1234                    12
3     1             5678                    11
4     2             1234                    12

Here UserId 1 is associated with 2 companies with different groups.

{
    q(func: type(User) {
         uid
         companies { code }
         groups { name }
    }
}

This will return all the companies and groups associated with the user. But How can I get a list of groups associated with the User + Company combination?
e.g. Refer to SQL data from UserGroupMap
If I pass User = Alice and Company = 1234 it should return Groups = [Group1, Group 2]
If I pass User = Alice and Company = 5678 it should return Groups = [Group 1]

Dgraph does not currently support the concept of unique across multiple types. There are a few different ways to accomplish the use case:

get the list of groups for User + Company combinations

Option 1, with a map type:

This would create a new type to handle this relationship as a node. An edge can only link between two nodes, so when a link is needed between more than two nodes then a type is needed.

type UserGroupMap {
  user: User
  company: Company
  group: Group
}

Then you can do a query with cascade to get the group of any specific user/company combination

query ($user: uid! $company: uid!) {
  groups(func: (type(UserGroupMap))) @cascade {
    user @filter(eq(uid,$user))
    company @filter(eq(code,$company))
    group {
      uid
    }
  }
}

Option 2, with edges between group to companies and users.

Thinking about the use case leads me to believe that the Groups are related to the Users and Companies. This would enable a form of access control of the group depending on the User and the Company.

type Group {
  users: [User]
  companies: [Company]
}

In this schema you would not need to query the map, but rather just the Group itself.

h/t to @amaster507. Very many thanks.

2 Likes

I think, Option 1 is the perfect fit for my use case, because in Option 2 we still not able to maintain any uniqueness/ relation between User + Company + Group.

Thank you so much @chewxy for your help.

@chewxy - If I use the given query to fetch records it is taking too much time.

Please see the stats.

"extensions": {
    "server_latency": {
      "parsing_ns": 130280,
      "processing_ns": 6228941594,
      "encoding_ns": 8102954,
      "assign_timestamp_ns": 613212,
      "total_ns": 6237896123
    },
    "txn": {
      "start_ts": 241857
    },
    "metrics": {
      "num_uids": {
        "_total": 1714223,
        "dgraph.type": 0,
        "email": 1,
        "entity": 416453,
        "uid": 416477,
        "upn": 48344,
        "user": 416453,
        "usergroup": 416453,
        "usergroupname": 21
      }
    }
  }

I have almost half a million nodes for UserGroupMap type and this data is going to increase in the future.
Any recommendations?

@sandeepkangude, can I ask to clarify your use case. Can you clarify this at all with a real world example and use case? Are you trying to implement a company/user/group access controls? I think there still might be a better schema design to handle your needs, but I don’t understand the use case at the moment.

But to answer your question specifically… While the use of cascade directive does result in the correct results, it is not optimized. The main reason is that it needs to read every node in the UserGroupMap, traverse all of their edges to the user and company and group. Then after it gets all of the results the cascade filters the data to only those that have all fields. Essentially the query is trying to traverse a ton of edges to conclude that it doesn’t need that data. Maybe a reversed query might help. Dgraph doesn’t do any kind of query optimization per se and it leaves that on the user to form the best queries knowing their own data. With DQL, maybe you can do a multiple block query that would get the results better:

query ($user: uid!, $company: uid!) {
  var(func: eq(uid, $user)) {
    ids1 as ~user { # reverse the edge
      uid
    }
  }
  var(func: eq(uid, $company)) {
    ids2 as ~company { #reverse the edge
      uid
    }
  }
  groups(func: uid(ids1)) @filter(uids(ids2) AND type(UserGroupMap)) {
    uid
  }
}

This is my attempt to optimize without knowing more about your use case. @MichelDiz can you check me on my DQL syntax here? I am still not confident with DQL syntax and I don’t have a dev server running to test code on.

There’s no uid type in DQL variables or ! not null. Just int , float , bool and string. Also is good to give a name to that query.

e.g

`query myQuery($user: string, $company: string)`
1 Like

Thanks for correcting me there. You’re the man with DQL.

@amaster507,

Yes, it is a valid use case in my project. Let me provide some data to support this use case.
Note: Here Company means an entity of a company.

Example: Facebook has 2 offices one in SFO and one in New York. Those two offices are entities. User Alex is working for Facebook and he is having different roles for each entity.

Alex	Facebook-SFO	Quality
Alex	Facebook-SFO	Finance
Alex	Facebook-NY	    Quality
Alex	Facebook-NY	    Marketing

We need to maintain User + Entity + Group as a unique combination.
I hope I am able to explain the problem statement correctly.