Querying for historical values, and creation and modification timestamps as dateTime

I read that Dgraph configures Badger to keep all value mutations in the log permanently:

options.NumVersionsToKeep. We set this by default to 1 in Badger, but in Dgraph, we set this to infinity

So… My question is, can we access that “infinite history” via dgraph queries?

I am experimenting with adding explicit created and modified dateTime predicates, but it feels like this could/should be provided via an implicit function of some sort.
Intuitively, I would like this kind of query:

{
  everyone(func: eq(dgraph.type, "Person")) {
    uid
    created as @created(uid)
    modified as @modified(uid)
    seconds_since_created: math(since(created))
    name@* @facets @history
  }
}

to return this kind of result:

{
  "data": {
    "everyone": [
      {
        "uid": "0x2737",
        "created": "2020-07-29T12:50:41.849Z",
        "modified": "2020-07-29T14:10:05.56Z",
        "seconds_since_created": 4799.593736,
        "name@en": "Hyong Sin",
        "name@en|isChanged" : true,
        "name@en|history": {
               "2020-07-29T14:10:05.56Z":"Hyung Sin"
        },
        "name@ko": "형신"
      },

Is there any undocumented magic (or magic that I didn’t manage to find in the docs) to do such things?

If not, could this be a feature request? Would it be reasonably easy to implement, given the way Badger works with Dgraph?


@gotjoshua Dgraph does set NumVersionsToKeep to infinite, but this doesn’t mean we keep all the versions forever. Dgraph takes periodic snapshots of the data. Every time we create a new snapshot, Dgraph sets a timestamp (think of it as a marker), and all the duplicated/deleted data below this marker is dropped by Badger even when NumVersionsToKeep is set to infinite.

I don’t think we support any such thing.

@pawan what do you think about @gotjoshua’s suggestion?


Thanks for the reply @ibrahim

Cool, are these snapshots somehow accessible?

I’ve been thinking some more about this and mocking up the way I’d like it to work in my Dgraph offline experiments.

This led me to another question:
Is it now possible to set/get modification and creation timestamps automatically via Dgraph (maybe a plugin?), or do I always need to create these from the client?

There are some interesting details to consider regarding the different “types” of history:

  1. Attributes vs Subjects (values vs objects with values)
    (eg. what does the history look like for a subject with various types of nested attributes and objects)
  2. Does the modification date of a top level subject update when a nested attribute is changed?
    (eg. The isClose boolean facet for Amit and Michael’s friendship changes - do Amit and Michael’s modification timestamps update? what if the attribute or facet is even more deeply nested?)
  3. What about Arrays and Objects and Arrays of Objects?

@gotjoshua These are internal to dgraph and a user wouldn’t (shouldn’t) need access to them.

You’ll have to do this via the client.

This is actually an interesting question. If you end up implementing the creation/modification time predicate, please do post it here so that others can benefit from it in the future.

You can already store values corresponding to a timestamp. This can be done by having an intermediate node which links to other nodes that store the value corresponding to a time. So whenever you update the value, you can add a node with the new value and the updated timestamp. Would that solve your use case?
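For anyone who wants to try the pattern described above, here is a minimal sketch of how the client could build such a mutation. It is only an illustration under assumptions: the predicate names `<value>` and `<at>`, the `_history` suffix, and the helper names are hypothetical, and for brevity the subject links directly to each revision node rather than going through a separate intermediate node.

```javascript
// Escape a value so it can be embedded in a quoted N-Quad literal.
function escapeLiteral(value) {
  return String(value)
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n');
}

// Build N-Quads that set the current value and append a revision node.
// ASSUMPTION: <value>, <at> and the "_history" suffix are made-up names,
// not predicates that Dgraph provides.
function buildRevisionNquads(uid, field, newValue, when = new Date()) {
  const at = when.toISOString();
  const value = escapeLiteral(newValue);
  return [
    `<${uid}> <${field}> "${value}" .`,       // current value
    `<${uid}> <${field}_history> _:rev .`,    // link to a new revision node
    `_:rev <value> "${value}" .`,             // value stored on the revision
    `_:rev <at> "${at}" .`,                   // when the value was set
  ].join('\n');
}
```

Each update then adds one more revision node, so the full history can be read back with an ordinary query over the `_history` edge, at the cost of storing every value twice.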

Greetings, I can easily “manage” to create a data structure that will store the history data additionally. My offline data structure looks like this in indexedDB:

What I would love is that this “revMap” could be returned by just adding the @history command to the attribute in the query and that I would not need to store additional (and often repetitive) data.

I don’t quite understand the internals of dgraph enough to know if this is a reasonable feature request or not… But I do know that the data is there, at least until it is cleaned up with a snapshot. And, I assume that the frequency of snapshots can/could be configured.

The basic implementation is visible in the skeleton project I’m working on: gridsome-dgraph
It’s rather simple really:

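// One timestamp, reused for both the facet and the node-level predicate.
const modDate = new Date();
const modified = modDate.toISOString();

// N-Quads mutation: set the value with a `modified` facet
// and update <modified> on the subject itself.
// Note: newData.value is inserted unescaped, so it must not contain double quotes.
const NquadMutationString = `
  <${uid}> <${field}> "${newData.value}" (modified=${modified}) .
  <${uid}> <modified> "${modified}" .
`;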

This creates a modification timestamp both as a facet on the attribute that was set and as a predicate on the parent uid subject.
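To read the timestamps back, something like the following should work. This is a rough sketch, not code from the project: `buildModifiedQuery` is a hypothetical helper, and it assumes the facet and predicate are both named `modified` as in the mutation above.

```javascript
// Build a DQL query that returns a field, its `modified` facet,
// and the node-level <modified> predicate for a single uid.
// ASSUMPTION: both the facet and the predicate are called "modified".
function buildModifiedQuery(uid, field) {
  return `{
  node(func: uid(${uid})) {
    uid
    modified
    ${field} @facets(modified)
  }
}`;
}
```

For example, `buildModifiedQuery('0x2737', 'name')` produces a query string that can be passed to whichever Dgraph client you use.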

But, but, but…
Is there really no way for me to write a plugin or extension that would do it automatically on the “server side”?