Implement custom JS resolvers in GraphQL

Motivation

Implement custom JS resolvers in GraphQL that let users execute arbitrary business logic in addition to the auto-generated resolvers.

User Impact

Users can use these JS resolvers directly instead of writing another NodeJS server to wrap around Dgraph. This allows them to process and transform data at the server end. It can be used in a range of cases like:

  • Applying auth rules on fields. Based on the query and JWT values, the user could decide to hide some fields when returning the result (see the sketch after this list).
  • Applying some pre or post-processing logic before calling the auto-generated resolver.
    • Example of pre-processing logic would be to automatically add created_at or updated_at fields for a type.
    • A post-processing step might be used to calculate the count or avg and return the final result to the user.
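For instance, a hedged sketch of the field-hiding case, assuming the proposed (parent, args, context, info) resolver signature described later in this RFC; the context.auth.USER claim and the email field are hypothetical:

function postQueryUser(parent, args, context, info) {
    var user = parent
    // Hide email unless the JWT's USER claim matches this user's id.
    if (!context.auth || context.auth.USER !== user.id) {
        delete user.email
    }
    return user
}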

Implementation

Instead of executing a query via one of the auto-generated resolvers, we could also allow the resolver to be a JS function. This function can make HTTP calls to arbitrary endpoints or run a DQL query/mutation, transform the result, and give us back the response to work with.

type User {
    id: ID!
    firstName: String!
    lastName: String!
    updatedAt: DateTime!
    fullName: String
    followersCount: Int
    followers: [User]
}

type Query {
    getCustomUser(firstName: String!): User @custom(js: "fetchUser")
}

type Mutation {
    updateUserLastName(id: ID!, lastName: String!): User @custom(js: "updateLastName")
}

JS resolver

Custom mutation

function updateLastName(parent, args, context, info) {
   // similar to context.dgraph.graphql, we would also have context.dgraph.dql
   // which would allow you to run DQL queries and mutations on the underlying
   // Dgraph instance using dgraph-js

    var now = new Date().toISOString()

    var data = context.dgraph.graphql({
      query: `
      mutation($id: ID!, $name: String!, $now: DateTime!) {
        updateUser(filter: {
          ids: [$id],
        },
        set: {
          lastName: $name,
          updatedAt: $now
        }
        ) {
          firstName
          lastName
          updatedAt
        }
      }
      `,
      variables: {
        id: args.id,
        name: args.lastName,
        now: now,
      }
    })

    return data
}

Custom query

function fetchUser(parent, args, context, info) {
   // similar to context.dgraph.graphql, we would also have context.dgraph.dql
   // which would allow you to run DQL queries and mutations on the underlying
   // Dgraph instance using dgraph-js

    var data = context.dgraph.graphql({
      query: `
      query($firstName: String!) {
        queryUser(filter: { firstName: { eq: $firstName } }) {
          firstName
          lastName
          followers {
            id
          }
        }
      }
      `,
      variables: {
        firstName: args.firstName
      }
    })

    // assumes firstName has an appropriate @search index and that the
    // response mirrors the GraphQL shape
    var user = data.queryUser[0]
    user.fullName = user.firstName + " " + user.lastName
    user.followersCount = user.followers.length
    return user
}
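The examples above only use context.dgraph.graphql, but as noted earlier a resolver could also call out to an arbitrary HTTP endpoint. A hedged sketch, assuming a fetch-style HTTP client is available to the resolver; the endpoint URL and the REST field names are made up for illustration:

async function fetchUserFromRest(parent, args, context, info) {
    // Call a hypothetical external REST service.
    var resp = await fetch("https://api.example.com/users?firstName=" +
        encodeURIComponent(args.firstName))
    var body = await resp.json()

    // Transform the REST payload into the GraphQL User shape.
    return {
        firstName: body.first_name,
        lastName: body.last_name,
    }
}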

Arguments (similar to Apollo Server so that users have to change minimal code)

  • parent : Empty for custom queries and mutations. Would be used later to hold the parent object when we support resolving custom fields.
  • args : GraphQL arguments for the request.
  • context : Contains auth info of the user (custom claims) and also provides access for calling internal GraphQL resolvers or DQL queries/mutations.
  • info : Query AST and execution information

So the custom query can call a predefined resolver like getUser or queryUser and then transform the result before returning it to the user. Similar things are possible for mutations. This would allow us to define mutations like updateUserName, updateUserLocation etc., where validation can be done first to make sure that we only allow updating certain properties, before falling back to calling an internal resolver. A sketch of this pattern follows.
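A minimal sketch of that validate-then-delegate pattern, assuming the proposed resolver signature and that context.auth exposes the JWT's custom claims (the USER claim name is an assumption):

function updateUserName(parent, args, context, info) {
    // Only allow users to rename themselves.
    if (!context.auth || context.auth.USER !== args.id) {
        throw new Error("not authorized to update this user")
    }

    // Fall back to the auto-generated updateUser resolver for the write.
    return context.dgraph.graphql({
      query: `
      mutation($id: ID!, $name: String!) {
        updateUser(filter: { ids: [$id] }, set: { firstName: $name }) {
          firstName
          lastName
        }
      }
      `,
      variables: { id: args.id, name: args.name }
    })
}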

The JS resolvers would be stored as data inside Dgraph through an HTTP API.

Execution

Since hooks will be written in JS, we need a way to execute them.

Solution 1: Execute JS in a separate NodeJS server (preferred)

Run a NodeJS server in sandbox mode and send the JS code to it via RPC to execute it there. NodeJS already has a sandbox mode. This gives us support for running ES6 and also the ability to import and use external libraries within the JS code. The only limitation is that we have to make network calls, but those should be fast as the server will typically be running on the same machine.

Example code of how this might work: GitHub - arijitAD/Golang_Node_Executor: Executes NodeJS via gRPC from a Golang client.
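To make the sandboxing concrete, here is a minimal sketch using Node's built-in vm module (one way to interpret the sandbox mode above); the runResolver helper and its wire-up are hypothetical:

const vm = require("vm")

// Evaluates user-supplied resolver source in a sandboxed context and
// invokes the named function with the standard resolver arguments.
function runResolver(source, fnName, parent, args, context, info) {
    const sandbox = {}
    vm.createContext(sandbox)
    vm.runInContext(source, sandbox) // defines the resolver function
    return sandbox[fnName](parent, args, context, info)
}

// Example usage:
const src = "function hello(parent, args) { return 'hello, ' + args.name }"
console.log(runResolver(src, "hello", {}, { name: "Arijit" }, {}, {}))
// prints: hello, Arijit

In practice the source would arrive over the RPC described above, and the result would be serialized back to the Go side.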

Solution 2: Use a Go library to execute JS

Example code: a sample program that takes the input to the JS function, executes it, and prints the output.

Note: It is also possible to send a Golang struct as input params and retrieve it back.

package main

import (
	"fmt"

	"github.com/robertkrimen/otto"
)

func main() {
	vm := otto.New()
	if _, err := vm.Run(
		`function JSHook(name) {
			if (name === "Arijit") {
				name = "Friends"
			}
			name = 'hello, ' + name + '!'
			return name;
		}`); err != nil {
		panic(err)
	}

	output, err := vm.Call("JSHook", nil, "Arijit")
	if err != nil {
		panic(err)
	}
	fmt.Println(output)

	output, err = vm.Call("JSHook", nil, "Friends")
	if err != nil {
		panic(err)
	}
	fmt.Println(output)
}

Otto limitations

  • Doesn’t have a good solution for importing external libraries.
  • Cannot issue fetch requests, which is a non-starter.
  • Doesn’t support ES6; only supports ES5.
  • Old library that is not actively maintained.

Validating and storing resolvers

Once hooks are validated, we can store them in memory and as a key in Badger, similar to the schema.
Otto allows us to validate JS; in the case of the NodeJS server, we can expose a validation endpoint.

package main

import (
	"fmt"

	"github.com/robertkrimen/otto/parser"
)

func main() {
	filename := "" // A filename is optional
	src := `
	    (function(){
	        console.log("Hello, World.");
	        return;
	    })();
	`
	// Parse some JavaScript, yielding a *ast.Program and/or an ErrorList
	program, err := parser.ParseFile(nil, filename, src, 0)
	fmt.Println(program, err)
}

Unknowns/limitations

  • Resolving a field through a JS function. We’ll only support custom queries and mutations for now. We can of course later support resolving fields as well in batch mode. Single-mode won’t make much sense.

  • The set of libraries that the user can use within their JS code would be limited and their versions would be fixed and controlled by us.

  • How do we store the JS functions inside Dgraph as metadata that isn’t affected by DROP_ALL and DROP_DATA operations?

  • Support for other languages like Rust, Go, etc. by exposing a gRPC interface.


The Otto library looks old. Try to find some library which is maintained, or we can make some binding to call a remote library.

There were two libraries that I found: otto and v8go. Otto seems more stable and has more contributors working on it.


Cool, I can’t wait to have this sort of thing. I have a few questions…

(1)
For the general interface of the hooks, should we think about using arguments that match up with the general resolver interface? Across GraphQL implementations that’s pretty stable, and in the JS world in particular most people will know Apollo. In Apollo Server it’s this:

myHook(parent, args, context, info) { ... }

One big reason to do it that way is that it makes it so easy to take something that was working elsewhere and just drop it into Slash GraphQL and it still works. If we have a different interface, then we have to teach that interface to people and they have to change their code if they have something already.

(2)
Isn’t there a use case for wanting to implement the whole query/mutation as custom code - e.g. not just pre- and post-processing?

(3)
Same question as (2), but for fields. Isn’t there a need to implement, say, a particular field in a type as a custom bit of JS?

(4)
As an example of a use case, let’s say I want to have my own mutation to add a post. A post might be like

type Post {
  id: ID
  title: String
  text: String
  datePublished: DateTime
  author: Author
}

In my app, I don’t really want the auto-generated addPost, because I want to add the datePublished by injecting the current time, and I don’t want the mutation to add a post to have the user in it because I’m going to add that from the JWT. So really, I want to do this

type Mutation {
  newPost(title: String, text: String): Post @myJScode....
}

That mutation should just do some input validation, some auth check, add some arguments, and then call addPost. Can we allow things like that?
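A hedged sketch of what that newPost hook could look like, assuming the Apollo-style signature from (1) and that context.auth carries the JWT claims (the USER claim name is made up):

function newPost(parent, args, context, info) {
    // Input validation and auth check.
    if (!args.title || !context.auth || !context.auth.USER) {
        throw new Error("missing title or user claim")
    }

    // Inject datePublished and the author, then call the generated addPost.
    return context.dgraph.graphql({
      query: `
      mutation($post: AddPostInput!) {
        addPost(input: [$post]) {
          post { id title text datePublished }
        }
      }
      `,
      variables: {
        post: {
          title: args.title,
          text: args.text,
          datePublished: new Date().toISOString(),
          author: { id: context.auth.USER },
        }
      }
    })
}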

(5)
Dependencies … if their JS code has a dependency on some npm package, can we allow that? Do we have to set a list of accepted npm packages and restrict them to that (Auth0 does that)?

Some quick thoughts…

I agree with @michaelcompton to follow the norm with resolver arguments. That is pretty standard. The request, identity, and stash should be parameters in context; results would be parent (empty for pre-processing); arguments = args; info = info. If we can, keep info formatted the same as in a normal resolver.

I have use cases where I would need the Dgraph uid even if the schema does not use ID. Would this be possible in a post hook with the results?

I am assuming that by receiving the results I could then modify them, removing/adding fields relative to the request. For instance, taking firstName, middleName, and lastName fields and concatenating them together into a name field. This may take a pre and a post hook to accomplish: preQueryPerson { /* if name is requested, add firstName, middleName, lastName to the request */ } and postQueryPerson { /* if name was requested, concatenate the other fields to form one back to the user */ }. A sketch of the post hook is below.
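A minimal sketch of that postQueryPerson idea, assuming the proposed (parent, args, context, info) signature and that the query results arrive via parent:

function postQueryPerson(parent, args, context, info) {
    var person = parent
    // Concatenate the name parts into the single name field.
    person.name = [person.firstName, person.middleName, person.lastName]
        .filter(Boolean)
        .join(" ")
    return person
}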

Couldn’t all of that be handled with preAddPost { /* add an argument for author based upon the JWT user && add a datePublished argument set to (new Date()).toISOString() */ }?

Will the pre-process have a way to return without continuing the pipeline? Let’s say in @michaelcompton’s example above the JWT did not contain what we expected. We could catch this with an @auth rule, but it would be better if we could stop it in the pre-process script and return an error message without continuing the pipeline, never hitting any auth rules, and not hitting the db any more.


Can we also add a way to generate additional inputs in the generated queries/mutations that are not stored but only used by the pre/post scripts?

Thinking this would go in the schema somehow, such as:

directive @inputs(fields: [CustomFields]) on OBJECT | INTERFACE
input CustomFields {
  field: String!
  type: String!
}

type Person @inputs(fields: [{field:"filters" type:"[String]"}]) {
  id: ID!
  name: String @remote # generated by post script
  firstName: String
  middleName: String
  lastName: String
  ...
}

Agree with this format. I looked up Apollo and others, and this seems to be the standard. I will update the RFC to use it.

Yes, I will update the RFC with the flow and couple of example use cases.

This can be easily done with pre hooks for addPost. Since we are already passing the request and JWT to the hooks, we should be able to modify the request to achieve this.

I am exploring this part. I haven’t yet found any Go library that allows us to import a JS library and execute it. But I think we could fetch the library source code and add it to our hooks.


Currently it won’t be possible to fetch the uids, as we will be rewriting the queries after the pre hooks have been executed.

This will be possible using pre hooks. We can remove that field from the query and mutation.

The point is that the interface changes. The user doesn’t want to use addPost because the interface for that contains fields for datePublished and author etc., so they want to use a mutation with a more appropriate interface like newPost(title: String, text: String): Post, but in the end they do want to add a post inside the implementation of that.

Same thing holds for update. I could have updatePost with a pre-hook that splits into x number of cases for the things you can do … if you are updating the text, then this must be true … if you are adding a like to the post, then this must be true, etc. But that’s naff. You’d much rather just have a mutation updatePostText(id: ID, newText: String) and likePost(id: ID).

Take, for example, a real GraphQL API like GitHub’s. It doesn’t contain just one mutation updateIssue. It has closeIssue, addComment, addTag, etc. Custom JS hooks are a nice way for us to allow extending your schema in that sort of direction.

I had some questions here.

  1. Does Otto support making HTTP calls and such? A user might want to write a resolver which makes a REST call and serves the data over our GraphQL API; would that be possible here?

  2. Another thing that I noticed is that Otto only supports ES5; is that going to be an issue for users, given that there are newer versions of JS available now?

  3. Can we import external libraries like https://momentjs.com/ or https://lodash.com/ and use them from Otto?

No, Otto doesn’t allow making HTTP calls, and we can’t import libraries.

It currently supports only ES5 and doesn’t support ES6 fully.

I feel there are a lot of limitations when using this library. Instead, I was thinking that it would be better if we could run a NodeJS server in sandbox mode and send the JS code to it via RPC to execute it there. This way we will be able to make remote HTTP calls and also include any library from npm.
@pawan @michaelcompton Let me know your thoughts on this, so that I can explore more.
NodeJS already has a sandbox mode.
Also, there are a couple of Node libraries that support running NodeJS code in a sandboxed VM, such as VM2.
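For reference, a minimal sketch of what running untrusted code through VM2 looks like (the timeout value is arbitrary):

const { VM } = require("vm2")

// Run untrusted code with a time limit and an explicit sandbox.
const vm = new VM({
    timeout: 1000,     // kill runaway scripts after 1s
    sandbox: { x: 2 }, // the only globals the code can see
})

console.log(vm.run("x + 40")) // prints 42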


I think you have to be able to make HTTP calls and import (some - maybe a list we control) libraries.

The sandbox thing sounds interesting. At least worth investigating to see what the limitations are.


Had a call with @gja about this as well. Based on the call, we decided that support for ES6 and the fetch protocol is essential. He also told us that we don’t need to worry much about deps because we would expect the user to give us a JS file bundled with webpack. We concluded that running this as a separate Node server with the code executing in a VM might be the best way to go about this.

@arijit is going to try and cook up a small example of this using some external libraries and making an HTTP call to see if it works as expected.

Some other things to look at or tackle later

  1. Cloudflare Workers and whether we can use those here.
  2. Are the VM isolates recycled or do we leave them out there? That is, is all of this run in a serverless manner or not?

And while we are at it, some way to pass environment variables to the script would also be useful eventually (not needed for v1).


Sounds like a great use case for Deno?


What about Lambda functions? Sort of like how Netlify is a 3rd-party tool making AWS easy for the end user. You guys already use AWS, so there’s nothing new to go get. AWS already supports packages. AWS has a good CLI that could be used by a Slash front end.

I understand that this sort of separates it from being in core, making it an external pointer instead.

Use it like the custom directive now, but with pre and post directives.

I believe this will also help keep Dgraph from being bloated. With the high RAM consumption already, maybe adding more things that will use even more RAM is not the best solution.


Lambdas have horrible warm-up latency…

So does Slash if it has not been used recently. There are ways to keep functions warm, but that does require scheduled requests to the function, which will raise the usage. Just throwing out ideas as a solution since nothing is set in stone yet.

Another benefit of this is allowing end users to write the function in whatever language they want and include whatever packages they want. It could then be as lean or as heavy as desired. And it goes with the current model of custom directives pointing to 3rd party hosted scripts.

I am still for Dgraph-hosted JS hooks as well, though, as long as they can support packages and fetch (which can be just another package).

Wouldn’t having these pre/post hooks work similarly to the custom resolvers be a simpler and more generic solution? It would require a timeout to be configurable per hook, but it should satisfy most use cases, and it would not be tied to any particular programming language or platform.

The user can already use a Lambda function and execute the pre, GraphQL resolver, and post logic by using a custom query/mutation. Are you expecting anything more there which can’t be done right now?

That would be easier to support, yes, but I am afraid it won’t be fast enough if the pre and post hooks have to be executed as HTTP calls to remote servers. Having them executed in memory or on a Node server running locally would be more performant.
