@id as the combination of two things (composite index)

Hello all!
I’ve been wondering whether there is a way to create an @id as the combination of two things.

I want to add custom metadata to a field, for example custom validation rules. Let’s say we have this:

type Animal {
  id: ID!
  name: String
  age: Int
  toys: [Toy]
}

type Toy {
  name: String
  color: String!
  owner: Animal
}

And now let’s say I want to consume some metadata from my front end, for example the type of a field or whether it is required (before making the request).

Then I create a new type, “Meta”:

enum Types {
  String
  Number
  Date
  ...
}

type Meta {
  field: String! @id
  type: Types!
  required: Boolean!
}

But the problems here are:

  1. There is one Meta for every field, so I need to link it to the combination of type and field. For example, I could have:
Meta1 = {field: "Animal.age", type: "Date", required: false}

So now in the front end I can query Meta(field: "Animal.age") and get the metadata for that particular field in order to validate it there.
Problem => I have no idea how to link the Meta node to the combination of type and field.

  2. Even if I manage to create that @id combining type and field, I don’t have a real edge between “Animal.age” and its “Meta”; the link is artificial. But if I try to create an edge, it makes no sense either: there could be N Animals, but all of them share the same Meta per field, so linking the same array of “Meta” nodes to each Animal seems like overkill.

To clarify even more, the link is
type.field <-> metadata
not
type <-> metadata (as there are M fields and M meta)
nor
node <-> metadata (as there are N nodes with M fields, but only M meta)
nor
field <-> metadata (as different types can have fields with the same name, like Toy.name and Animal.name)
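
To make the type.field <-> metadata link concrete, here is a minimal Python sketch of the idea, not Dgraph code. It assumes the Meta nodes are keyed by a string built from the type name and the field name; `meta_key` and the in-memory `registry` are hypothetical stand-ins for whatever actually stores the Meta nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meta:
    field: str      # composite key, e.g. "Animal.age"
    type: str       # one of the values of the Types enum
    required: bool


def meta_key(type_name: str, field_name: str) -> str:
    """Build the composite key stored in Meta.field (hypothetical helper)."""
    # The separator must not appear in either part, or keys could collide.
    if "." in type_name or "." in field_name:
        raise ValueError("type and field names must not contain '.'")
    return f"{type_name}.{field_name}"


# Hypothetical in-memory stand-in for the Meta nodes: one entry per
# type.field pair, regardless of how many Animal or Toy nodes exist.
registry = {}
for type_name, field_name, field_type, required in [
    ("Animal", "name", "String", False),
    ("Animal", "age", "Number", False),
    ("Toy", "name", "String", False),
    ("Toy", "color", "String", True),
]:
    key = meta_key(type_name, field_name)
    registry[key] = Meta(field=key, type=field_type, required=required)

print(registry[meta_key("Toy", "color")])
```

Animal.name and Toy.name get separate entries here, which is exactly what a key on the field name alone could not provide.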

Does anyone have a way to do this properly?

Also, another use case: sometimes we need information at the edge level, and this edge is usually represented with a node instead, because of the limitations of edges in the GraphQL standard.

For example, let’s think of candidates and job positions. If there is an edge “application”, this edge is defined by the composition of both ids: candidate.id and jobPosition.id.

It is not just one entity linked to one candidate and one job position. The whole point of its existence is that it is actually an edge that links one candidate and one job position in a 1-to-1 relationship. And it can have some fields, like the date of application and a status (e.g. application under review), and even an edge to “Interviews”, as one application process can be linked to several interviews.

So in order to run “getApplication” properly, I will need to provide the combination of the two ids: Candidate and Job Position.

As you can see, I have a completely different use case, but the problem is the same.

  • I need to have an ID as the composition of two things

Man, this keeps coming up a lot here lately. IMHO this is a modeling problem rather than a unique key constraint problem. We call it a unique key constraint because we have been stuck with RDBMS for so long.

The other thing is that GraphQL fails to recognize what Dgraph terms facets, and even Dgraph doesn’t handle facets all that well (they are not first-class citizens of the graph).

So what is the fix then? How can we have these properties on an edge and multi-id constraints? The only answer we have right now (and probably the only one we will have for a very long time) is to create what I term linking nodes, as you described above.

This makes some things more difficult, such as filtering and ordering by nested values, but it is possible.

Modeling the jobs use case becomes something like:

type Person {
  id: ID!
  name: String!
  applications: [Application]
  postedJobs: [Job]
  reviewedApps: [Application]
}
type Job {
  id: ID!
  name: String!
  applicants: [Application]
  filledBy: Person
  postedBy: [Person] @hasInverse(field: "postedJobs")
}
type Application {
  id: ID!
  appliedTo: Job! @hasInverse(field: "applicants")
  applicant: Person! @hasInverse(field: "applications")
  appliedAt: DateTime!
  reviewed: [Person] @hasInverse(field: "reviewedApps")
}

How does this model fail to meet the jobs use case?

Every Person can have many applications, and each of those applications refers to only one person and one job, so the only thing lacking would be an auth rule to prevent a second application from being created between the same Job and Person.

This last part is business logic. We are just so accustomed to creating pivot tables with uniquely constrained keys to apply this business logic rule, but the same is possible in Dgraph GraphQL with auth rules.

What you’re talking about are composite indexes like in SQL. You want a primary key on two columns.

This has been a feature request for a while.

As @amaster507 said, you can model your way out of this, but you really need to look at your security as well, since we do not have field-level auth yet. The only workaround is probably going to be custom mutations, unless you’re not worried about security.

But you do need to rethink the model in terms of how GraphQL works.

J

I’m doing something similar, but it is a bit messy to query.
How would you getApplication by composing applicantID and JobID?

Security will be important for my use case. Thanks for the issue, it clarified a lot of stuff in my mind.

By the way, the workaround is actually a mix of both of your answers, as I need both the node-edge relations and the uniqueness of the composite, so the workaround would be something like this:

type Application {
  id: ID!
  appliedTo: Job! @hasInverse(field: "applicants")
  applicant: Person! @hasInverse(field: "applications")
  appliedAt: DateTime!
  reviewed: [Person] @hasInverse(field: "reviewedApps")
  compositeIndex: String! @id @lambda
}

where I can getApplication by its compositeIndex, but I can also understand the connections.

For the native composite indexes, I hope they manage to support cases with 3 or more indexes, as there are some types whose uniqueness is defined by more than 2 indexes.

In this particular use case we could let candidates apply to the same position again after 3 months, and then applications would be identified by: ${Candidate}${Job}${try}

where candidate and job are links and try is just a number.
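
One detail worth illustrating for the three-part key: plain concatenation can be ambiguous, so a separator helps. The Python sketch below is only an illustration; the hex-style ids, the `_` separator, and both function names are assumptions, not anything Dgraph provides.

```python
def naive_key(candidate: str, job: str, attempt: int) -> str:
    """Concatenate the three parts with no separator (ambiguous)."""
    return f"{candidate}{job}{attempt}"


def composite_key(candidate: str, job: str, attempt: int, sep: str = "_") -> str:
    """Join the three parts with a separator that none of them may contain."""
    parts = (candidate, job, str(attempt))
    for part in parts:
        if sep in part:
            raise ValueError(f"part {part!r} must not contain {sep!r}")
    return sep.join(parts)


# Two different applications that collide without a separator:
# (candidate 0x1, job 0x2, try 11) and (candidate 0x1, job 0x21, try 1).
print(naive_key("0x1", "0x2", 11) == naive_key("0x1", "0x21", 1))
print(composite_key("0x1", "0x2", 11), composite_key("0x1", "0x21", 1))
```

The same function extends to more than three parts as long as the separator never occurs inside a part.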

Until we get nested filtering, you have to either use @cascade or write a custom DQL query. Here it is with @cascade:

query ($job: ID!, $person: ID!) {
  getPerson(id: $person) @cascade {
    applications {
      id
      appliedTo(filter: { id: [$job] }) {
        id
      }
    }
  }
}

You can parameterize @cascade to allow some fields to be empty.
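
For readers unsure what @cascade does to the result of the query above, here is a simplified Python model of the behaviour as I understand it: any node missing one of the requested fields is dropped, and that propagates upward. It runs on plain dicts, the ids are made up, and `filter_applied_to` and `cascade` are hypothetical helpers rather than Dgraph functions.

```python
def filter_applied_to(person, job_id):
    """Mimic appliedTo(filter: {id: [$job]}): non-matching jobs become None."""
    applications = []
    for app in person["applications"]:
        job = app["appliedTo"]
        matches = job is not None and job["id"] == job_id
        applications.append({**app, "appliedTo": job if matches else None})
    return {**person, "applications": applications}


def cascade(node, selection):
    """Keep only the selected fields; return None if any of them is missing."""
    pruned = {}
    for field, sub_selection in selection.items():
        value = node.get(field)
        if sub_selection is not None:
            if isinstance(value, list):
                children = (cascade(child, sub_selection) for child in value)
                value = [child for child in children if child is not None]
            elif isinstance(value, dict):
                value = cascade(value, sub_selection)
        if value is None or value == []:
            return None  # a missing field removes the whole node
        pruned[field] = value
    return pruned


# Shape of the query: applications { id appliedTo { id } }
selection = {"applications": {"id": None, "appliedTo": {"id": None}}}
person = {
    "id": "0x1",
    "applications": [
        {"id": "0xa1", "appliedTo": {"id": "0x21"}},
        {"id": "0xa2", "appliedTo": {"id": "0x22"}},
    ],
}

print(cascade(filter_applied_to(person, "0x21"), selection))
print(cascade(filter_applied_to(person, "0x99"), selection))
```

In this model, a person with no application for the given job comes back as nothing at all, which is why the query works as a lookup by the candidate and job pair.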

FYI, it seems it is not possible to have an @id that is calculated using @lambda.
[image: screenshot of the error]

I created a field that I want to compose from the two ids:

type Application {
  compositeId: String! @id @lambda
  JobId: String!
  CandidateId: String!
}

and got that error.

I guess I will just add those IDs manually.

Lambda fields are not saved or persisted; they are generated by the lambda at query time. Also, you cannot filter or order by lambda fields: since they are not persisted anywhere, there are no indexes to work with.

Makes sense. So here I would need the famous missing feature, the “post-hook”, so that I can inject a value into that field on creation?

https://dgraph.io/docs/graphql/lambda/webhook/

See @lambdaOnMutate. But there is this glaring gotcha:

Lambda webhooks only listen for events from the root mutation. You can create a schema that is capable of creating deeply nested objects, but only the parent level webhooks will be evoked for the mutation.

This is an issue, though. Is there no way to work around this?

nope. :frowning:

Not at this time.

The famous missing feature is pre-hooks (not post-hooks), which would pretty much solve every security problem, FWIW.

One other workaround here would be:

type Candidate {
  id: ID!
  applications: [Job] @hasInverse(field: candidates)
}
type Job {
  id: ID!
  candidates: [Candidate]
}

I think it is worth mentioning that this is not necessarily an issue. If your first lambda webhook handles the nested data as well, it does not matter. Sure, that could mean repeated code, but it is doable. Also, you can trigger the webhooks for nested data if you create it through a GraphQL mutation called from within that lambda. If this is not handled correctly, you can find yourself in a loop, which is how you know it works. It really depends on what you’re trying to accomplish.

J

Thanks for the info! Getting it clearer and clearer.

The problem with that workaround is that applications have some data of their own and even other edges. For example, each application has an “applicationDate” and can be linked to N interviews, each interview to its scores, and so on. So Application in this use case needs to be a node itself, but a special node that behaves like an edge, as it is defined as the connection of two things… I know it’s weird.

Regarding lambdas, I’m afraid I did not quite catch it. Do you mean creating webhooks for each type? Let’s say that what I want is just a default value (for example, the composite index on creation).

So for applications, I want to have a computed field that is set when an Application is added:

compositeID =  `${candidate_id}_${job_id}`

As I want this to happen every time we create an application, I have to create a webhook at the candidate level and at the job level too, so that I’m sure the field is set properly regardless of the path I choose.

I think it is already messy to have to create 3 webhooks when it is clear that I just want this to happen every time I create one Application node.

But what scares me is that it can keep going. Let’s say we update a recruiter who will be linked to this job, which will be linked to one candidate as their interviewer. Then I’m creating the edge from a 4th place, which means I will also need to create a webhook there to cover this path:

recruiter { 
   job { 
      application {
        ...
       }
   }
}

And there are infinite combinations, as I could do something like job > application > candidate > application, so it is not possible to create a webhook for every combination, is it?

So, you can’t use the webhooks to do what you want. The record would already have been created, since the webhooks are post-hooks, and what you want is to enforce a rule on the record before it exists.

I think you need to think about what your application is doing. Just because you can create an application in 3 different ways doesn’t mean you would or should. You should still keep your application consistent and have ONE place to create your data, or at least one main mutation.

For example, if I wanted to post a job on Indeed, I could probably go through a category first or pick a category later, but I arrive at the same place and submit the same information on one form (maybe Indeed does not work like that, but it is just an example).

In your case, you could arrive at the application from the recruiter page, but the application would still be the same, submitting the same information. If you create a custom mutation lambda that verifies this compositeID does not exist before the data is added, you can always ensure consistency.
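
The "one place to create your data" idea can be sketched in Python as a single function that computes the composite ID and refuses duplicates. This is only an in-memory illustration of the check such a custom mutation would perform; `ApplicationStore`, `add_application`, and `DuplicateApplication` are hypothetical names, and a real implementation would have to do the check and the write atomically in the database.

```python
class DuplicateApplication(Exception):
    """Raised when an application for the same candidate and job exists."""


class ApplicationStore:
    """Hypothetical in-memory stand-in for the Application nodes."""

    def __init__(self):
        self._by_composite_id = {}

    @staticmethod
    def composite_id(candidate_id: str, job_id: str) -> str:
        return f"{candidate_id}_{job_id}"

    def add_application(self, candidate_id: str, job_id: str, applied_at: str) -> dict:
        """The only entry point for creating applications."""
        key = self.composite_id(candidate_id, job_id)
        if key in self._by_composite_id:
            raise DuplicateApplication(key)
        application = {
            "compositeId": key,
            "candidate": candidate_id,
            "job": job_id,
            "appliedAt": applied_at,
        }
        self._by_composite_id[key] = application
        return application

    def get_application(self, candidate_id: str, job_id: str):
        """Look an application up by the pair of ids; None if absent."""
        return self._by_composite_id.get(self.composite_id(candidate_id, job_id))


store = ApplicationStore()
store.add_application("0x1", "0x21", "2021-06-01T00:00:00Z")
try:
    store.add_application("0x1", "0x21", "2021-06-02T00:00:00Z")
except DuplicateApplication as exc:
    print("rejected duplicate:", exc)
```

Whichever page the user comes from (candidate, job, or recruiter), every path ends in the same call, so the uniqueness rule lives in exactly one place.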

So use your data model:

type Candidate {
  id: ID!
  applications: [Application]
}
type Application {
  id: ID!
  job: Job! @hasInverse(field: applications)
  candidate: Candidate! @hasInverse(field: applications)
}
type Job {
  id: ID!
  applications: [Application]
}

Lock down the Application node so it is not editable through GraphQL, and always use only your custom mutation if you’re going to create or edit an application. This is the only way I know this can be done.

I think this proves why we NEED @id on multiple fields. At the very least, we need an @auth directive that won’t allow you to create a node unless the @id field == $job + __ + $candidate.
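
The check that such an auth rule would express is small. The Python sketch below only illustrates the comparison itself (id == job + "__" + candidate) on a plain dict; it is not an @auth rule, and `expected_id` and `is_valid_application` are hypothetical names.

```python
def expected_id(job_id: str, candidate_id: str) -> str:
    """The only id an application for this job and candidate may carry."""
    return f"{job_id}__{candidate_id}"


def is_valid_application(application: dict) -> bool:
    """Accept the node only if its id matches its own job and candidate."""
    return application["id"] == expected_id(
        application["job"], application["candidate"]
    )


good = {"id": "0x21__0x1", "job": "0x21", "candidate": "0x1"}
bad = {"id": "anything-else", "job": "0x21", "candidate": "0x1"}
print(is_valid_application(good), is_valid_application(bad))
```

Because the id is fully determined by the two links, uniqueness of the id gives uniqueness of the pair.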

Just my thoughts,

J

That auth idea opened a ton of possibilities in my mind. It is true that we can’t compute things using auth, but thanks to you I just understood that we can add any kind of validation there, and that is amazing!

Regarding my little use case, the problem is that I will expose the GraphQL endpoint like an API, so I’m not in control of what people will write as requests, and I have to discard the post-hooks.

I think I’m just forced to create a backend that manages this, and maybe in the future use a pre-hook that computes that composite ID before I create my node.

Yes and no. We currently can’t control what WILL be in an update, hence the need for update-after. Really we also want field validation, but that is for another topic.

So you understand, custom lambda mutations are different from lambda webhooks (we currently only have post, not pre). You can create a custom mutation to solve this problem today, and you can disable the GraphQL layer with a rule.

J

I must admit that I’m having a hard time understanding the differences between creating a custom query or mutation and @custom, @lambda, field lambdas, mutation lambdas, query lambdas, webhook lambdas…

Still not there, but I’m getting closer. I guess instead of creating my own backend and doing everything in DQL, I could just disable the GraphQL layer and create lambda mutations manually :thinking:

Will explore this!