Query server timestamps in GraphQL?

Hi! Is there any progress on this?

No progress yet, but it is one of the features we are aiming for in the 21.03 release.

Thanks

4 Likes

Here are my updated thoughts on the matter. Can we make it MongoDB-like? See Mongoose v8.0.3: Schemas

With a directive @timestamps( createdAt: Boolean, updatedAt: Boolean )
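
In SDL terms, the directive declaration might look like this (a sketch; defaulting both arguments to true is my assumption, taken from the comment in the example below):

directive @timestamps(createdAt: Boolean = true, updatedAt: Boolean = true) on OBJECT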

It could be used the following way:

type Post @timestamps { # defaults both createdAt and updatedAt to true
  id: ID
  title: String!
}

The predicates _createdAt and _updatedAt should be added to the list of reserved names.

If a user needed to manually set or update _createdAt or _updatedAt, I believe it should be allowed through DQL only, meaning that GraphQL does not allow any input for these values other than searching.
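
For example, a manual backfill would go through a DQL upsert rather than the GraphQL API (a sketch, assuming the Type-prefixed predicate names discussed further below):

upsert {
  query {
    p as var(func: uid(0x123))  # the node whose timestamp needs backfilling
  }
  mutation {
    set {
      uid(p) <Post._createdAt> "2020-01-20T00:00:00Z" .
    }
  }
}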

Cons: There is no way to timestamp edges without either using facets or a linking type. Facets are further out than 21.03, I believe, and even then are not first-class citizens. Like many other implementations of timestamps, it does not indicate which predicates/edges were updated, just that something was updated. With other databases using joining tables, it is possible to update timestamps in a join table without updating timestamps on the tables that were joined.

Pros: All timestamps are handled 100% by the rewriting methods in Go. These timestamps cannot be mutated directly by a user within GraphQL. This ensures that _createdAt is accurate and was not altered by a later GraphQL mutation. Auto-updating timestamps on nodes when edges are mutated gives a sync script the ability to refresh the data for that entire node and its referenced edges.

Implementation:

  • When an add mutation is rewritten and its type has the @timestamps directive, the _createdAt predicate is added with the equivalent of new Date()
  • When an update mutation is rewritten and its type has the @timestamps directive, the _updatedAt predicate is set to the equivalent of new Date()
  • When an edge that utilizes the @hasInverse directive is created or removed, the nodes on both sides have their _updatedAt predicate set to the equivalent of new Date() (see the sketch below)
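
To make that concrete, here is a rough sketch of the rewriting for an add mutation; the exact N-Quads are my guess at what the rewriter would produce, not actual output:

mutation {
  addPost(input: [{ title: "Hello world" }]) {
    post { id title }
  }
}

would be rewritten to roughly:

set {
  _:newPost <dgraph.type> "Post" .
  _:newPost <Post.title> "Hello world" .
  _:newPost <Post._createdAt> "2021-03-01T12:00:00Z" .  # server time at rewrite, i.e. new Date()
}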

This would then generate this GraphQL schema:

type Post {
  id: ID
  title: String!
  _createdAt: DateTime!
  _updatedAt: DateTime
}
input PostFilter {
  id: [ID!]
  _createdAt: DateTimeFilter
  _updatedAt: DateTimeFilter
  has: PostHasFilter
  and: [PostFilter]
  or: [PostFilter]
  not: PostFilter
}
enum PostHasFilter {
  "title"
  "_createdAt" # seems kinda silly having a required field, but following precedents already set.
  "_updatedAt"
}
input PostOrder {
  asc: PostOrderable
  desc: PostOrderable
  then: PostOrder
}
enum PostOrderable {
  "title"
  "_createdAt"
  "_updatedAt"
}
type PostAggregateResult {
  count: Int
  titleMin: String
  titleMax: String
  _createdAtMin: DateTime
  _createdAtMax: DateTime
  _updatedAtMin: DateTime
  _updatedAtMax: DateTime
}
input AddPostInput {
  title: String!
  # Notice: _createdAt and _updatedAt are not included here because they are handled internally
}
type AddPostPayload {
  post(
    filter: PostFilter
    order: PostOrder
    first: Int
    offset: Int
  ): [Post]
  numUids: Int
}
input UpdatePostInput {
  filter: PostFilter!
  set: PostPatch
  remove: PostPatch
}
input PostPatch {
  title: String
  # Notice: _createdAt and _updatedAt are not included here because they are handled internally
}
type UpdatePostPayload {
  post(
    filter: PostFilter
    order: PostOrder
    first: Int
    offset: Int
  ): [Post]
  numUids: Int
}
type DeletePostPayload {
  post(
    filter: PostFilter
    order: PostOrder
    first: Int
    offset: Int
  ): [Post]
  msg: String
  numUids: Int
}
input PostRef {
  id: ID
  title: String
  # Notice: _createdAt and _updatedAt are not included here because they are handled internally
}
type Query {
  getPost(id: ID!): Post
  queryPost(
    filter: PostFilter
    first: Int
    offset: Int
    order: PostOrder
  ): [Post]
  aggregatePost(filter: PostFilter): PostAggregateResult
}
type Mutation {
  addPost(input: [AddPostInput!]!): AddPostPayload
  updatePost(input: UpdatePostInput!): UpdatePostPayload
  deletePost(filter: PostFilter!): DeletePostPayload
}
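
As an example of this generated API in use, a client could then page through recently updated posts, newest first (a sketch):

query {
  queryPost(
    filter: { _updatedAt: { ge: "2021-01-01T00:00:00Z" } }
    order: { desc: _updatedAt }
    first: 10
  ) {
    id
    title
    _createdAt
    _updatedAt
  }
}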

Which would translate roughly to this DQL schema:

Post.title: string .
Post._createdAt: dateTime @index(hour) .
Post._updatedAt: dateTime @index(hour) .

Maybe the index should be configurable from the @timestamps directive, hmm…

Alternatively, the _createdAt and _updatedAt GraphQL fields could all map to the same two predicates. This would make it easier to write DQL statements that list everything that was updated, filtering on a single predicate. However, this may degrade distributed performance for larger databases, as the number of predicates that could be sharded drops, leaving a large amount of work always in a single alpha/group.
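
For illustration, a single shared predicate would let one DQL query sweep up everything updated since a cutoff (a sketch; the bare, un-prefixed _updatedAt predicate name is the assumption here):

{
  recentlyUpdated(func: ge(_updatedAt, "2021-01-01T00:00:00Z")) {
    uid
    dgraph.type
    _updatedAt
  }
}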


Afterthoughts from a PM conversation:

So would that be sortable?

Timestamps would be sortable but not directly mutable. Meaning in GraphQL you couldn’t do set: {_createdAt: "2020-01-20"}, as that is reserved only for the add mutation or for any input ref that creates a new node.

What if you needed to update a timestamp? For instance, when importing data with existing timestamps?

For importing, I think it should go 100% into DQL if you are importing something that has a preset timestamp. For anything imported using GraphQL, you could create your own DateTime field and write to that instead (sketched below). That would allow you to express:

  • This was imported at [_createdAt]
  • The imported data was originally created at [custom importedDateTime field]
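
A minimal schema sketch of that pattern (importedDateTime is just the placeholder name from above):

type Post @timestamps {
  id: ID
  title: String!
  importedDateTime: DateTime # client-controlled: when the source record was originally created
}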

Sounds good. It just seems like there needs to be some sort of way to modify that. Not super easily, of course, since it should only be used in special cases.

It would be modifiable through DQL. I think that is the preferred method anyway. Think of GraphQL as the API and DQL as the database language. It would be similar to a REST API serving MySQL: in MySQL you could update the timestamps as needed, but when using the REST API, the timestamps are all handled internally, outside of your control.

5 Likes

Thank you for this well thought out push for server-side timestamps!

I love this. It would help a lot with getting only the newest data and not receiving old data from sub-graphs over and over again.

Does this need to be so strict? Why not allow both? For instance, we would definitely need mutable timestamps for client ↔ server synchronization.

When implemented like this, you could choose to allow it or not:

input TimestampConfig {
  active: Boolean!
  mutable: Boolean!
}
@timestamps( createdAt: TimestampConfig, updatedAt: TimestampConfig )
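
Applied to a type, that could look like this (a sketch of the proposal, not implemented syntax):

type Post @timestamps(
  createdAt: { active: true, mutable: false },
  updatedAt: { active: true, mutable: true }
) {
  id: ID
  title: String!
}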
1 Like

This is just my opinion on how and why I would do it this way. I am not developing this, so the core devs would have to make the final decision in this regard, but my $.02 is:

It should be strict to ensure that the API layer never performs actions it should not be doing. With Dgraph GraphQL, this is a little harder to see at first. But as I explained in my afterthoughts above, the GraphQL implementation inside of Dgraph is just that: an API. It takes the GraphQL and rewrites it, using rules, into DQL. In an API layer, actions such as adjusting timestamps are not permitted. If they were, then any user would be able to adjust a timestamp willy-nilly. Think of how it works with other databases and APIs: if a timestamp is automated by the database, then the API relies on that automation and does not allow writing to the timestamp through the API. And looking at the implementation I wrote above, if the _updatedAt field gets set in the rewriting process into the DQL mutation, then allowing a user to also set this _updatedAt predicate could result in writing two values, and the rewriting process becomes more complex, needing to decide when not to add the automated predicate because it was supplied by the user. But I think the issue goes deeper than this…

I believe a client->server synchronization should be different from a server->client synchronization. Hear me out. Right now there are no good implementations that I have found for offline GraphQL. The only things really out there are client-side caches of GraphQL (i.e. Apollo Client). To update the client cache with a source of truth, the client should first update the source of truth (server) and then update the cache with the response. Therefore a client->server sync is not really pushing the source of truth; it is setting pieces of truth, and the server returns the source of truth for those pieces (which would contain the timestamps). For any server<->server syncs that need to set these timestamps, that should be done using DQL live imports and RDF streams, IMO.

It would be interesting to see any implementation of client->server syncs where the client collects mutations and then runs them in batches at a later point to perform the sync. This would add more complications, because the client would then be responsible for ensuring that there were no conflicts of unique IDs, and it would also require some sort of blank-node implementation in GraphQL. I don’t think the GraphQL spec is ready yet for client->server sync.

For the time being: if you have a timestamp field you want to let the database control, let it control it; if you have a timestamp field that you want to control, then do so with a regular DateTime field as I stated above. This would get a timestamp feature into production quickly, which could then be iterated upon for feature enhancements later.

Sure - the implementation would be a bit more complex, but is this really a deciding criterion?

I’m fine with this approach. But I fear that once it is implemented immutable-only, there won’t be a reiteration for a loooooong time, given the vast number of (important) feature and bug requests currently open.

We didn’t find any either; that’s why we built it from scratch. We have one Dgraph database running on the client (+ an Electron React app) and one Dgraph database on our server. The user can work offline, all data is stored in his Dgraph instance, and when he goes online, we synchronize both databases using GQL mutations. And I can tell you, it’s working just fine and isn’t much effort either when using code generation. And that’s why we would need mutable timestamps. When the user creates a post offline on date XXX, the same date should show online on that post.

I’d also happily implement the synchronization with DQL if you think that is better suited. Can you point me to some resources where I can learn how to use “DQL live imports and RDF streams” for server<->server synchronization purposes?

@amaster507 Thank you for the detailed post. What you have here is pretty good and can be implemented pretty much as it is.

Yeah, we’d probably like to have the predicates named as Type._createdAt and Type._updatedAt so that they can be sharded across Alpha nodes as the data grows.

I also agree that they should be set automatically and shouldn’t be exposed via the GraphQL API. This is also because the GraphQL API is supposed to be used by browser clients, and it’s not good practice to set timestamps from the client.

1 Like

Sorry, why do you want to limit clients to browsers? What about my @custom logic resolvers? They are doing a lot of stuff that dgraph-gql can’t do (and might never be able to do) and use GQL clients to write back to my dgraph instance. They are running on my servers, so I trust them to write correct timestamps.

Can’t we find some kind of agreement here? E.g. GQL-clients can change timestamps when they send some kind of authentication header?

The thing is that this decision is very important for us. It will decide whether we have to learn DQL, throw months of developer work away, and rewrite our complete synchronization logic. And if it is like this, we’d better start yesterday rather than next week.

Also if you decided to make it read-only, I need to be sure that server<->server sync can be achieved (with partial data) with DQL. Can you give your opinion on this @pawan ?

1 Like

Typically, wouldn’t you have more timestamps, something like createdAt for the server and initialAt for an optional offline init event?

1 Like

That will do of course.

You may find this proposal useful: GraphQL error: Non-nullable field was not present in result from Dgraph - #6 by abhimanyusinghgaur

Notice how you can configure which fields you want in the mutation input, and how @input could later be extended to support default values too. So, you will totally be able to configure mutation inputs as per your use-case. Both createdAt and updatedAt can be accomplished that way.

For the part about authentication, I think that may be taken care of when we support field-level auth as well.

What’s the status on this, given 21.03 is out? I don’t see anything related to generating timestamps on the GraphQL directives page.

As much as I want this feature (not as much as basic @auth fixes, deep mutations/deletes, and nested filters), this is one of the ONLY good things about Lambda Webhooks (which are post-, not pre-processing).

https://dgraph.io/docs/graphql/lambda/webhook/

You could easily use:

async function addAuthorWebhook({event, dql, graphql, authHeader}) {

  // stamp createdAt on the Author nodes that were just added;
  // event.add.rootUIDs holds the UIDs of the new nodes
  const createdAt = new Date().toISOString();

  const results = await graphql(`mutation ($ids: [ID!], $now: DateTime) {
    updateAuthor(input: { filter: { id: $ids }, set: { createdAt: $now } }) {
      author { id createdAt }
    }
  }`, { ids: event.add.rootUIDs, now: createdAt });
  return results.data.updateAuthor.author;
}

async function updateAuthorWebhook({event, dql, graphql, authHeader}) {

  // guard: ignore the updates this webhook (or the add webhook) issues
  // itself, otherwise we recurse forever
  const patch = (event.update && event.update.setPatch) || {};
  const keys = Object.keys(patch);
  if (keys.length && keys.every(k => k === 'createdAt' || k === 'updatedAt')) return;

  // set updatedAt on the mutated nodes; this will automatically replace
  // anything if anyone tries to manually update it
  const updatedAt = new Date().toISOString();
  await graphql(`mutation ($ids: [ID!], $now: DateTime) {
    updateAuthor(input: { filter: { id: $ids }, set: { updatedAt: $now } }) {
      author { id updatedAt }
    }
  }`, { ids: event.update.rootUIDs, now: updatedAt });
  // I am lazy, so you may need to do some more research on the exact
  // event payload shape, but hopefully you get the gist
}
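
One more note: for these webhooks to fire at all, the type has to opt in with @lambdaOnMutate in the schema (see the webhook docs linked above); a minimal sketch:

type Author @lambdaOnMutate(add: true, update: true) {
  id: ID!
  name: String!
  createdAt: DateTime
  updatedAt: DateTime
}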

While you can secure updatedAt this way, you cannot completely secure createdAt, as it can be changed by anyone at any time through a later update mutation.

Unfortunately, until the update-after-auth bug is taken seriously, this is yet another case where Dgraph is lacking basic security functionality.

J

4 Likes

This is really disappointing. I’ve been investigating dgraph for adoption for portions of our user-facing data model, but it seems the GraphQL interface is unsuitable for direct use by clients, and of severely limited use by a backend, which was one of the main draws of the database.

Whilst it looks like lambdas can be used to work around some of these issues, it’s an extra thing to deploy and maintain. Does dgraph provide test harnesses so developers can verify lambdas prior to deployment?

Lambda Mutation for Every Mutation

I should add that near-complete security is possible with Dgraph Cloud (with the exception of limiting the number of queries and RBAC in the DB).

However, you basically have to ONLY use custom lambda mutations for every single add or update mutation. You have to block out all regular mutations like in this post:

Firebase Function?

You could also technically use a Firebase Function to do all of your mutations (only if you use Firebase Authentication for your JWT). Queries need to be fast, but not necessarily your mutations (depending on your own app’s philosophy). While this seems like a bad idea, because you’re running cold-start functions on an external server, they are actually pretty quick and really easy to implement and test.

It is worth noting that anyone who takes this approach seriously would have to run a separate graphql query immediately after each mutation in order to keep the graphql client cache up-to-date. Here is the URQL version, although Apollo has something similar:

// bypass the cache and hit the network, so the client
// sees the node the external function just mutated
client.query(
  query,
  { /* variables */ },
  { requestPolicy: 'network-only' },
).toPromise();

I have also written something like this in my easy-dgraph package.

If you don’t run the mutation through graphql, your graphql client will not have an updated cache and will return stale results until a refresh. Even custom mutations allow you to return the query for a reason. You could also technically just do a manual cache update by returning the query from the Firebase Function, if Dgraph ever charges per query.

Answer

So, to answer your question: no, Lambdas do not have any test suites like Firebase Functions do. You could, however, compile the lambdas on your frontend from TypeScript and run unit tests against a separate test database (which you should have anyway). You can’t add any external modules, though, but for simple stuff this is recommended.
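
A tiny sketch of what such a unit test could look like, assuming Jest and assuming the update webhook above is exported from a lambda module (both the module path and the fake graphql helper are hypothetical):

import { updateAuthorWebhook } from './lambda';

// fake graphql helper that records the mutation instead of running it
const calls: any[] = [];
const graphql = async (query: string, variables?: any) => {
  calls.push({ query, variables });
  return { data: {} };
};

test('stamps updatedAt on mutated authors', async () => {
  await updateAuthorWebhook({
    event: { update: { rootUIDs: ['0x1'], setPatch: { name: 'Ann' } } },
    dql: null, graphql, authHeader: null,
  } as any);
  expect(calls[0].variables.ids).toEqual(['0x1']);
});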

Just having @auth for basic frontend queries is pretty awesome, even though it is very limited. It is just not sufficient for any mutations, or for some cases with queries. This is a huge step in the right direction compared to other products.

So, please stick with Dgraph. I have done my research: there is no other graph database that does everything out-of-the-box for a good price. With the alternatives, there always seems to be a third-party module required just to use them on the frontend, an extreme price, AND (not OR here) insufficient documentation with a high learning curve. If you don’t need a graph database, there ARE better options. I think every database should be a graphdb, but that is just me.

I believe the main thread of these problems will be fixed in the next year, but you can work with it now just fine. I would develop without the backend security and add the lambdas or Firebase Functions last.

J

1 Like

Could someone on the Dgraph team give an update on whether this feature is scheduled for release in a particular timeframe? A quick search of the open Dgraph PRs suggests it’s not currently in flight, and implementing it using lambda is complex and less than ideal performance-wise (i.e. for upsert we need to check whether nodes exist to figure out whether to set createdAt or not, and then I’d guess the Dgraph graphql endpoint runs that query again to handle upsert validation).

I only ask as I got a few hours into implementing it myself before realising how painful it would be :sweat_smile:

Be the change you want to see in the world :slight_smile:

4 Likes

From what @maaft has verbalized on the PR, I see value in allowing the fields to be mutable from GraphQL, if that is possible, while defaulting them to the normal behavior. Almost like SQL handles timestamps: if not provided, then default on create and on update.

But nonetheless, I am glad to see this feature in whatever form it comes. Maybe I can look at the PR and get some sense of how some other things may be possible. It is good to have something like this, with all of the changes required to make it happen in a single PR to learn from.

@dpeek Thank you so much for raising the PR. I really appreciate your effort in making Dgraph better. I have put some thoughts into your PR; do let me know if they make sense to you. You can either expand here, or maybe we can get on a call to get more clarity. I would be happy to help in any way possible to get this released :slight_smile:

1 Like

Thank you @dpeek!

We really, really appreciate all your work and effort on this! Not only is this a new timestamps feature, it is also a new default-values feature, which is awesome!!!

This was incredible, and we need more users contributing like you!

Security Problem

That being said…

For anyone who finds this after 21.12 is released: you will not be able to secure your updatedAt and createdAt fields until after update-after-auth lands, which is hopefully in the next version, 22.0X…?

As of right now, if you want to secure a field with an add rule, you can use an @auth directive like so:

type Post @auth(
  add: { rule: "query { queryPost(filter: { not: { has: [createdAt, updatedAt] } }) { id } }" }
) {
  ...
  createdAt: DateTime @default(add: { value: "$now" })
  updatedAt: DateTime @default(update: { value: "$now" })
}

However, if you do this with the update rule:

type Post @auth(
  update: { rule: "query { queryPost(filter: { not: { has: [createdAt, updatedAt] } }) { id } }" }
) {
  ...
  createdAt: DateTime @default(add: { value: "$now" })
  updatedAt: DateTime @default(update: { value: "$now" })
}

It evaluates the data before it is entered into the database, not after the mutation like the add rule does… so we need update-after-auth.

The @auth directive was really built for RBAC and ABAC using claims inside your authentication token. That being said, the @auth directive is SOOO much more powerful, and could do things like field-level @auth, not to mention it fixes the security vulnerability of being able to create a new node owned by anyone else.

Field Security for Now - field level @auth?

You can only really secure a field differently by creating a new nested field. Usually you want to give your users the ability to create a post / tweet / comment, but not allow them to edit the date the post was created or updated, as this should be automatic and secure.

You can do this now (before v21.12) by using post-mutation Lambda Webhooks and creating a nested field. This method is definitely a hack, and you have to disable @hasInverse, but it works as expected.

However, I cannot seem to fathom a way to secure this new @default directive. If we use nested nodes, and the nested node is secured from add and update, we won’t be able to add or edit it at all. The only way would be to use a lambda, which gets us back to my method.

So, if security is important to you, use my method, or wait for some type of field-level-auth security, like the update-after-auth feature and bug fix.

Maybe the actual docs will cover a security method for these fields and I am wrong. Or maybe someone has another hack we can use immediately (let us know!).

That being said, the @default directive and all the hard work from @dpeek are very useful NOW, and we greatly appreciate it!

J

1 Like