Type/Schema System: introducing object types in schema

Hey @core-devs

We added scalar type support to the Dgraph schema last week. I have been thinking about the best way to add an object category to it. Here are my thoughts:

Note: This is part of the implementation outlined here: Supporting type and schema in Dgraph through GraphQL

### A few constraints/assumptions/deviations from GraphQL that we want to have:

  1. Predicates should have universal types. This means that if we define an edge name to be of type string, it will always be of that type, irrespective of which entity it comes out of.
  2. We want the object type implementation to be as lean and simple as possible. This means forgoing GraphQL constructs like Interfaces, Enums, etc. for now. We can add them later as required.

### Object types will have the following properties (points explained by the example that follows):

  1. Since we are forgoing the GraphQL Interface type, we have just one object category to denote all entities in the system.
  2. The Object struct will have a list of attributes (along with their types, stored as a map) to denote the possible edges for that entity.
  3. Although objects could be denoted with just a name, storing this kind of attribute list will help in query/mutation validation.
  4. Edges coming out of an object can point either to scalar types or to other entities, so attribute types will be of two kinds:
    • scalar types, denoted as <predicate> -> <scalarType>
    • entity types, denoted as <predicate> -> [<objectType1>, <objectType2>, ...]
  5. For entity type mappings, the list can contain multiple object types, since an edge coming out of an entity can point to various entity types.
    • Example: A friend edge/predicate coming out of a personType entity (e.g. Tarzan) can point to a personType entity or an animalType entity.
  6. Every instance of an object type will have a Name field specified by the client through the schema file.
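As a rough illustration of the properties listed above, here is a minimal Go sketch of what such an object type could look like. The names `Object`, `Scalars`, `Edges`, and `AllowsEdge` are hypothetical and chosen only for this example; they are not taken from the Dgraph codebase.

```go
package main

import "fmt"

// Object is a hypothetical representation of one object type.
type Object struct {
	Name    string              // specified by the client through the schema file
	Scalars map[string]string   // <predicate> -> <scalarType>
	Edges   map[string][]string // <predicate> -> [<objectType1>, <objectType2>, ...]
}

// AllowsEdge reports whether the predicate pred, coming out of this object,
// may point to an entity of type target. This is the kind of check that
// query/mutation validation could use.
func (o *Object) AllowsEdge(pred, target string) bool {
	for _, t := range o.Edges[pred] {
		if t == target {
			return true
		}
	}
	return false
}

func main() {
	person := &Object{
		Name:    "personType",
		Scalars: map[string]string{"name": "string"},
		Edges:   map[string][]string{"friend": {"personType", "animalType"}},
	}
	fmt.Println(person.AllowsEdge("friend", "animalType")) // friend may point to an animal
	fmt.Println(person.AllowsEdge("friend", "robotType"))  // robotType is not in the list
}
```

Keeping scalars and edges in separate maps is just one possible layout; a single map with a tagged value type would work equally well.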

Example of a schema file that instantiates object types as defined above:

  • Note that the friend attribute in person points to a multi-element list; this corresponds to point 5 above.
  • Objects in the schema MUST be declared before outlining their attributes (this can be made optional through some smart code, like inferring nested JSON as object types).
  • One thought is to have the species attribute map directly to speciesType instead of through a list, to denote the presence of exactly one outward edge.
{
	"name": "string",
	"age": "int",
	// all object types must be defined beforehand
	"robot": "object",
	"person": "object",
	"speciesType": "object",
	// object types expanded with attributes
	"robot": {
		"name": "string", // for scalar types, only previously specified type definition will be considered
		"friend": ["person"], // [...] denotes a list (multiple edges)
		"creator": ["person"],
		"species": ["speciesType"]
	},
	"person": {
		"name": "string",
		"friend": ["person","robot"],
		"father": ["person"],
		"species": ["speciesType"]
	},
	"speciesType": {
		"name": "string"
	}
}

### The schema parser will work as follows:

  • Define and store scalar types as they are encountered (name and age).
  • If an object type is encountered:
    • define and instantiate it with obj.name = <key> if it is not present in the schema.
    • ignore it if it is present.
  • If nested JSON is encountered:
    • If no object is present with obj.name = <key>, throw an error.
    • If present, append the attributes to it.
    • Note: If an attribute was already defined at the root level, its type will be inferred from there; otherwise a new <predicate> -> <scalarType> pair will be added to the schema.
    • This would ensure that in the future, when we want object-specific types for scalars, we can have them without much ado.
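The parser steps above could be sketched roughly as follows. This is only an illustration under some assumptions: the schema file is assumed to have already been tokenized into an ordered list of entries (the example file repeats keys and contains comments, so it is not plain JSON), and the names `Entry`, `Schema`, `Object`, `NewSchema`, and `Apply` are hypothetical.

```go
package main

import "fmt"

// Object is a hypothetical object type: scalar attributes and entity edges.
type Object struct {
	Name    string
	Scalars map[string]string   // <predicate> -> <scalarType>
	Edges   map[string][]string // <predicate> -> [<objectType>, ...]
}

// Schema holds root-level scalar types and declared objects.
type Schema struct {
	Scalars map[string]string
	Objects map[string]*Object
}

// Entry is one top-level key/value pair, in file order. Value is either a
// string (a scalar type name or "object") or a nested attribute block.
type Entry struct {
	Key   string
	Value interface{}
}

func NewSchema() *Schema {
	return &Schema{Scalars: map[string]string{}, Objects: map[string]*Object{}}
}

// Apply processes the entries in order, following the steps in the post.
func (s *Schema) Apply(entries []Entry) error {
	for _, e := range entries {
		switch v := e.Value.(type) {
		case string:
			if v != "object" {
				s.Scalars[e.Key] = v // scalar type: define and store
				continue
			}
			if _, ok := s.Objects[e.Key]; !ok { // ignore if already present
				s.Objects[e.Key] = &Object{Name: e.Key,
					Scalars: map[string]string{}, Edges: map[string][]string{}}
			}
		case map[string]interface{}:
			obj, ok := s.Objects[e.Key]
			if !ok {
				return fmt.Errorf("object %q used before declaration", e.Key)
			}
			for attr, t := range v {
				switch tv := t.(type) {
				case string:
					// A root-level definition wins; otherwise register a new pair.
					if _, ok := s.Scalars[attr]; !ok {
						s.Scalars[attr] = tv
					}
					obj.Scalars[attr] = s.Scalars[attr]
				case []string:
					obj.Edges[attr] = append(obj.Edges[attr], tv...)
				default:
					return fmt.Errorf("attribute %q of %q has unsupported type %T", attr, e.Key, t)
				}
			}
		default:
			return fmt.Errorf("entry %q has unsupported value %T", e.Key, e.Value)
		}
	}
	return nil
}

func main() {
	s := NewSchema()
	err := s.Apply([]Entry{
		{"name", "string"},
		{"person", "object"},
		{"person", map[string]interface{}{"name": "string", "friend": []string{"person"}}},
	})
	fmt.Println(err, s.Objects["person"].Scalars["name"], s.Objects["person"].Edges["friend"])
}
```

The sketch does not check that edge targets are themselves declared objects; that validation could be added as a second pass once all entries have been read.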

I have tried to include as much explanation as possible in this post, but please do let me know of any doubts or suggestions.

Thanks!


Annotations:
FAQ:

  1. Why are objects explicitly declared in the schema file beforehand instead of being inferred directly?
  • To handle cases like this:
	"animal": {
		....
		"friend": ["person", "animal"],
		...
	},
	"person": {
		"friend": ["person","animal"],
		...
	}

The idea here is to cover all entities in our system through this mechanism (under the object category) in a simple way, and later introduce more concise and restrictive GraphQL types like Interfaces, Unions, etc., while making sure that expanding this definition in the future remains easily manageable.

That won’t be fluid. The idea behind separation of storage and schema is to keep the schema fluid. So, if a predicate starts off with string (all of them do by default), and then the user sets it to int, we should allow that. In fact, that’s what differentiates us from everybody else.

I think we can simplify this to just one object type. You can either parse the friend list as personType, or as animalType, not both. Or, you could just forgo this, and find the minimum common list of fields which both of these define and have a baseType and ask for that. I think this is a simpler, cleaner design. Also, sort of like how Go interfaces are built.
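To make the "minimum common list of fields" idea concrete, here is a small Go sketch. It assumes each object type is described by a predicate-to-type map, and the function name `CommonFields` is hypothetical; the resulting map is what a baseType could expose.

```go
package main

import "fmt"

// CommonFields returns the predicates that both object types define with the
// same type. The result could serve as the field list of a shared base type.
func CommonFields(a, b map[string]string) map[string]string {
	base := make(map[string]string)
	for pred, typ := range a {
		if other, ok := b[pred]; ok && other == typ {
			base[pred] = typ
		}
	}
	return base
}

func main() {
	personType := map[string]string{"name": "string", "age": "int", "father": "personType"}
	animalType := map[string]string{"name": "string", "age": "int", "habitat": "string"}
	// Only name and age are defined by both types.
	fmt.Println(CommonFields(personType, animalType))
}
```

As with Go interfaces, neither type has to declare that it satisfies the base type; the common fields are derived from what each type already defines.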

The schema looks good. But, not quite how GraphQL defines it. From the spec, they use:

type Person {
  name: String
  age: Int
  picture: Url
  relationship: Person
}

In particular, the schema is not JSON. It’s GraphQL. Also note that they keep the relationship edge as Person, and not as a list of objects.


We do want to implement the Interface system, but GraphQL’s current interface system is old-school. It’s strict like Java, and not fluid like Go. https://facebook.github.io/graphql/#sec-Interfaces For now, don’t worry about it. We’ll look at it once we have the object type system properly implemented.

Yes, this fluidity will always be there. What I meant was something else. By universal types, I meant this:

	"person": {
		"age": "string",
		...
	},

	"robot": {
		"age": "int",
		...
	}
	...

This shouldn’t be allowed (based on our last discussion). I suggested we have predicate types specific to the entity they are part of but you asked me to keep them global. Which means age here can have just one type across all the entities.
Hence, the structure is like:

	"age": "int",
	"person": {
		...
	},

	"robot": {
		...
	}
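A check for this "one type per predicate across all entities" rule might look like the following Go sketch. The function name `GlobalTypes` and the input shape (entity name to predicate-to-type map) are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"sort"
)

// GlobalTypes folds per-entity declarations into a single predicate -> type
// map and fails if any predicate is declared with two different types.
func GlobalTypes(entities map[string]map[string]string) (map[string]string, error) {
	names := make([]string, 0, len(entities))
	for name := range entities {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic order, so error messages are stable

	global := make(map[string]string)
	for _, name := range names {
		for pred, typ := range entities[name] {
			if prev, ok := global[pred]; ok && prev != typ {
				return nil, fmt.Errorf("predicate %q: %s declares %s, but it is already %s",
					pred, name, typ, prev)
			}
			global[pred] = typ
		}
	}
	return global, nil
}

func main() {
	// The disallowed case from the post: age is string in person, int in robot.
	_, err := GlobalTypes(map[string]map[string]string{
		"person": {"age": "string"},
		"robot":  {"age": "int"},
	})
	fmt.Println(err)
}
```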

The flexibility of changing types in the schema will always be there. I didn’t say that we can’t change predicate types, just that they can’t be changed while the server is running. For example:

  • An attribute marked as int can be changed to float; you just have to modify the schema file and restart the server.
  • But an attribute currently marked as float can’t be coerced into int (using Go’s ParseInt function), although this can easily be achieved with a bit of code tinkering.
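To illustrate the second point: `strconv.ParseInt` rejects a value such as "3.7", so a float-valued attribute cannot be read as int directly. One possible bit of "tinkering" is to fall back to `strconv.ParseFloat` and truncate. The helper name `ToInt` is hypothetical, and truncation toward zero is just one choice of coercion rule.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// ToInt converts a stored value to int64. It first tries strconv.ParseInt and,
// if that fails, falls back to parsing a float and truncating toward zero.
func ToInt(val string) (int64, error) {
	if i, err := strconv.ParseInt(val, 10, 64); err == nil {
		return i, nil
	}
	f, err := strconv.ParseFloat(val, 64)
	if err != nil {
		return 0, fmt.Errorf("cannot convert %q to int: %v", val, err)
	}
	// ParseFloat accepts "NaN" and "Inf", which have no integer equivalent.
	if math.IsNaN(f) || math.IsInf(f, 0) {
		return 0, fmt.Errorf("cannot convert %q to int: not a finite number", val)
	}
	return int64(f), nil
}

func main() {
	_, err := strconv.ParseInt("3.7", 10, 64)
	fmt.Println(err != nil) // ParseInt alone fails on a float literal

	fmt.Println(ToInt("3.7"))
	fmt.Println(ToInt("42"))
}
```

Whether silent truncation is acceptable is a design question; rejecting values with a fractional part would be a stricter alternative.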

You are spot on here. That’s exactly what I had in mind for entity mappings, using interfaces. But you will notice that I wrote in the assumptions that we are forgoing Interfaces (based on our last discussion) for this iteration, so I didn’t introduce them here.
Nevertheless, let’s continue with a single object type mapping for now, which we can evolve into interfaces later.

Not clear what you meant by this.

Um, I have to disagree with you here. The examples they give in the spec are very limited and case by case. When they mention this schema, I think they intend the relationship attribute to point to just one node (in the example mentioned, it points to “Priscilla Chan”), hence the absence of a list. In real examples, they do make lists of such object types.

Please take a look at graphQL list description here.
And a list of examples of graphql schema spec with lists here.
