RFC: Allow language Tag support in GraphQL

Summary

Dgraph supports language tags, which let us mutate or query a single field in multiple languages. Many users want this functionality in GraphQL as well. In addition, users who already have data in Dgraph on fields with the @lang directive will be able to expose it through GraphQL. We will add support for @lang in GraphQL and allow users to mutate and query the value of a single field in multiple languages.

Dgraph @lang support

If a field has @lang in Dgraph, that single field lets us set or query its value in multiple languages by appending @ followed by a language tag.
For example, we can set food reviews in multiple languages as follows:

{
  "set": [
    {
      "food_name": "Sushi",
      "review": [
        {
          "comment": "Tastes very good",
          "comment@jp": "とても美味しい",
          "comment@ru": "очень вкусно"
        }
      ],
      "origin": [
        {
          "country": "Japan"
        }
      ]
    }
  ]
}

We can then query it in different ways: to get tagged or untagged values, or to get all or only selected languages. For example, the query below returns the comments for Sushi written in any language:

{
  food_review(func: eq(food_name,"Sushi")) {
    food_name
    review {
      comment@*
    }
  }
}

Response:

{
  "data": [
    {
      "food_name": "Sushi",
      "review": [
        {
          "comment@jp": "とても美味しい",
          "comment@ru": "очень вкусно",
          "comment": "Tastes very good"
        }
      ]
    }
  ]
}

For more examples, see https://dgraph.io/docs/tutorial-4/#querying-using-language-tags
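
To make the tag syntax concrete, here is a minimal Python sketch of how an ordered tag list such as comment@fr:jp or comment@fr:. can be read: tags are tried left to right, and . stands for the untagged value. The function name resolve and the dictionary layout are assumptions made for illustration; this is not Dgraph code, and any further fallback Dgraph itself may apply is not modelled.

```python
from typing import Optional

# Hypothetical in-memory stand-in for one node's "comment" values,
# keyed by language tag; the empty key holds the untagged value.
COMMENT = {"": "Tastes very good", "jp": "とても美味しい", "ru": "очень вкусно"}


def resolve(values: dict, selector: str) -> Optional[str]:
    """Return the first value matching an ordered tag list such as "fr:jp:."."""
    for tag in selector.split(":"):
        key = "" if tag == "." else tag  # "." selects the untagged value
        if key in values:
            return values[key]
    return None  # none of the requested tags has a value


print(resolve(COMMENT, "ru"))     # the value tagged ru
print(resolve(COMMENT, "fr:jp"))  # no fr value, so the jp value is used
print(resolve(COMMENT, "fr:."))   # no fr value, so the untagged value is used
```

The order of the tags matters in this reading: "jp:ru" and "ru:jp" pick different values for the same node.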

GraphQL Design for @lang support

In GraphQL we can’t mutate or query multiple values for a single field. The schema needs an explicit field for every language tag, and each of those fields is then converted to the appropriate Dgraph predicate when a query or mutation runs.

One solution is to use the @dgraph directive on every such field in GraphQL to name the corresponding Dgraph predicate. For example, given a field name, we can expose its Hindi value through another field nameHi with @dgraph(pred:"name@hi"). Similarly, we can add further fields for other languages.

     type person {
         name: String @lang
         nameHi: String @dgraph(pred:"name@hi")
         nameEn: String @dgraph(pred:"name@en")
     }

This design can lead to a large number of language fields in a type when several of its fields use @lang.

Changes

We are proposing a design in which users only list the language codes they want to support, plus any combined codes they want to query, and we generate the fields for them.
For this we are adding a @lang directive on fields of type String in GraphQL, with the arguments below.

fieldName: String @lang(codes: [String], combine: [String])

codes: list of strings; the language codes the user wants to support for this field.
combine: list of strings; the merged language codes the user may want to query.

Example:

type Student {
      fullName: String @lang(codes: ["hi", "en", "ru"], combine: ["hi:en","en:ru:hi","en:.","en:ru:."])
      Reg_No: String
}

We will generate all the fields that correspond to the codes and combined codes:

type Student {
      fullName: String                                               # untagged value
      fullName_hi: String        @dgraph(pred:"fullName@hi")         # corresponds to language code "hi"
      fullName_en: String        @dgraph(pred:"fullName@en")         # corresponds to language code "en"
      fullName_ru: String        @dgraph(pred:"fullName@ru")         # corresponds to language code "ru"
      fullName_hi_en: String     @dgraph(pred:"fullName@hi:en")      # corresponds to combined code "hi:en"
      fullName_en_ru_hi: String  @dgraph(pred:"fullName@en:ru:hi")   # corresponds to combined code "en:ru:hi"
      fullName_en_un: String     @dgraph(pred:"fullName@en:.")       # corresponds to combined code "en:."
      fullName_en_ru_un: String  @dgraph(pred:"fullName@en:ru:.")    # corresponds to combined code "en:ru:."
}

Notes

  1. The code un is reserved for the untagged string (written as . in a combined code).
  2. We won’t be able to support name@*-style queries because they return a list. The same result can be obtained by querying all the codes.
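
The naming rule described above can be sketched in a few lines of Python. This is only an illustration of the proposal, not the Dgraph implementation: the function name generated_fields is hypothetical, and it assumes that each code or combined code is used verbatim as the tag list of the predicate, with . spelled un in the generated field name.

```python
def generated_fields(field: str, codes: list, combine: list) -> dict:
    """Map each generated GraphQL field name to its Dgraph predicate."""
    fields = {field: field}  # the untagged value keeps the plain field name
    for code in list(codes) + list(combine):
        # "hi:en" becomes "hi_en"; the "." placeholder is spelled "un"
        suffix = "_".join("un" if part == "." else part for part in code.split(":"))
        fields[f"{field}_{suffix}"] = f"{field}@{code}"
    return fields


mapping = generated_fields(
    "fullName", ["hi", "en", "ru"], ["hi:en", "en:ru:hi", "en:.", "en:ru:."]
)
for name, pred in mapping.items():
    print(f"{name:20} -> {pred}")
```

With three codes and four combined codes, the single fullName declaration expands to eight fields, which is the growth the reserved un code and the combine argument are meant to keep readable.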

User requests for language tag support in graphql:

  1. Language Tag Support

  2. Multi-language strings supported by GraphQL using @lang directive

@jdgamble555 @nossila @colinskow @dkjii @gregerolsson @amaster507 Please provide your feedback on the above RFC.

Not a big fan of the proposal. I like the @dgraph Tag being used to do various things. No need to make this more complicated.

CC: @hardik

Yeah, but then we need multiple fields with the @dgraph directive, one for every language code the user wants to support. That becomes tedious if many fields in the schema carry the @lang directive.
We just wanted a clean approach to this.

This is not very clean in my opinion. There are two parts to this equation, supporting languages in queries and supporting languages in mutations.

Supporting languages in queries (if changed from dgraph directive) should be implemented by using a lang directive in the query itself.

query {
  queryStudent {
    fullName # untagged value
    fullName_hi: fullName @lang(code: "hi")
  }
}

The main problem with this, though, is that the GraphQL spec does not support directives on input values, so the same approach does not carry over to mutations. How can we work around this? The only (limited) way is with multiple input fields, but that really complicates required fields. Here are some things that need to be thought through:

  • What happens if a user marks a field carrying the proposed lang directive as required? Are all language fields then required?
  • What happens if a user combines the @id directive with the @lang directive? Will this be supported or throw an error?
  • How can a set of languages be supported easily across the entire schema? If I have hundreds of string fields across multiple types, I would want to support the same language set across all strings without adding a complicated schema directive to every field.

A better way might be to add the directive to the query or mutation itself. With GraphQL, I don’t really see a user needing different languages at once; the client would have one preferred language. If the language directive were raised to the query/mutation level, the rewriter could add the language tag to every string field (except @id fields).

query @lang(code:"hi") {
  queryStudent {
    fullName
  }
}
mutation @lang(code: "hi") {
  addStudent(input: [{ fullName: "एंथोनी मास्टर" }]) {
    student {
      fullName
    }
  }
}
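
As a rough illustration of the rewriter idea above, the Python sketch below appends a language tag to every selected String field that is not an @id field. Everything here is hypothetical: the STUDENT_FIELDS table, the regNo and age fields, and the tag_selection function are invented for the example, and the output uses plain field names rather than real Dgraph predicate names. It describes the suggestion in this post, not anything Dgraph implements.

```python
# Hypothetical schema description: field name -> (GraphQL type, has @id)
STUDENT_FIELDS = {
    "regNo": ("String", True),
    "fullName": ("String", False),
    "age": ("Int", False),
}


def tag_selection(selection: list, schema: dict, code: str) -> list:
    """Append @code to every selected String field that is not an @id field."""
    rewritten = []
    for name in selection:
        gql_type, is_id = schema[name]
        if gql_type == "String" and not is_id:
            rewritten.append(f"{name}@{code}")
        else:
            rewritten.append(name)  # non-string and @id fields stay untouched
    return rewritten


print(tag_selection(["regNo", "fullName", "age"], STUDENT_FIELDS, "hi"))
# prints ['regNo', 'fullName@hi', 'age']
```

Because the code is applied per operation, one request can only read or write a single language, which is the limitation discussed next.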

The only thing this would not handle is setting multiple languages at once. But would a GraphQL client with a single preferred language have the information to set the multiple languages at once? I think that might be better handled with a language hook on a mutation that grabs the string fields and enhances the database with other language translations using a translation API.

I personally don’t use the language features, and don’t see myself using them in the near future. I would prefer users to use whatever language they prefer directly. Google Translate in Chrome does a fair job of translating content when needed.

Thanks @amaster for the detailed comment.

  1. If the user marks a field with the lang directive as required, we can make the untagged field required and the language tag fields optional.

  2. Similarly, we can apply the @id directive to the untagged value.

  3. We haven’t thought about this yet, but maybe we can add a list of strings in a schema header containing all the common languages the user wants to use, and then refer to it from different fields across types.

I also see some problems with this design. For example, we pass a single code to the query/mutation, but what if there are multiple fields with the @lang directive? Putting @lang on the query/mutation will also complicate things.

So as of now we are thinking of supporting it through the @dgraph directive, where the user defines all the fields corresponding to the different language tags they want to use.

GraphQL design with @dgraph directive

The user needs to define a separate field for each language tag they want to use.
Example:

type Person {
      name: String
      nameHi: String @dgraph(pred:"Person.name@hi")
      nameEn: String @dgraph(pred:"Person.name@en")
      nameHi_En: String @dgraph(pred:"Person.name@hi:en") # not generated for mutation
      nameHi_En_untag: String @dgraph(pred:"Person.name@hi:en:.") # not generated for mutation
}

We will add @lang to the Dgraph schema for the corresponding predicate automatically.
The user needs to give the Dgraph predicate name of the untagged field (name in this case) in the @dgraph argument of each language tag field.
The Dgraph predicate name for a GraphQL field is typename.fieldname.
For example, in nameHi: String @dgraph(pred:"Person.name@hi"), Person.name is the Dgraph predicate for the GraphQL field name. If the user gives some other field name in the argument, it won’t work as expected.

Note

  1. The user needs to define all language fields in one place, either in the type or in the interface.
  2. Fields with multiple language tags won’t be added to the mutation patch. For example, the field nameHi_En: String @dgraph(pred:"Person.name@hi:en") can only be queried.

Interaction with existing directives for language tagged fields:

schema directives

@id: not required now; can be added in the future.
@search: applicable on only one field (tagged or untagged value); it applies to all GraphQL fields that map to the same Dgraph predicate.
@lambda: doesn’t work with the @dgraph directive
@custom: doesn’t work with the @dgraph directive
@hasInverse: doesn’t apply to string fields

query directives

@skip: works normally, as with other fields
@include: works normally, as with other fields

This didn’t make sense.

@custom or @lambda combined with @dgraph doesn’t work, does it? So combining them with @lang doesn’t make sense.

@hasInverse is for edges, not strings, so combining it with @lang should throw an error.

@search should be limited somehow. What happens when a different index is defined on different language tags?

@skip and @include are query-side directives, not schema directives.

If support is added for languages in one way or another, I support it! I do agree that simpler is better.

GraphQL design with @dgraph directive

The user needs to define a separate field for each language tag they want to use.
Example:

type Person {
      name: String # Person.name is the corresponding Dgraph predicate for this field
      nameHi: String @dgraph(pred:"Person.name@hi")
      nameEn: String @dgraph(pred:"Person.name@en")
      nameHi_En: String @dgraph(pred:"Person.name@hi:en") # won't be added to mutation patch
      nameHi_En_untag: String @dgraph(pred:"Person.name@hi:en:.") # won't be added to mutation patch
}

We will add @lang to the Dgraph schema for the corresponding predicate automatically. The user needs to give the Dgraph predicate name of the untagged field (name in this case) in the @dgraph argument of each language tag field. The Dgraph predicate name for a GraphQL field is typename.fieldname.

For example, in nameHi: String @dgraph(pred:"Person.name@hi"), Person.name is the Dgraph predicate for the GraphQL field name. If the user gives some other field name in the argument, it won’t work as expected.

Note

  1. The user needs to define all language fields in one place, either in the type or in the interface.
  2. Fields with multiple language tags won’t be added to the mutation patch. For example, the field nameHi_En: String @dgraph(pred:"Person.name@hi:en") can only be queried.
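
The two notes above can be read as a small rule over the @dgraph argument. The Python sketch below splits a pred string into its base predicate and tag list and marks a field as query-only when it has more than one tag. The names LangPred and parse_pred are hypothetical, the split on the first @ assumes simple predicate names, and how a lone . should be treated is not stated in this thread, so the sketch does not special-case it.

```python
from typing import NamedTuple


class LangPred(NamedTuple):
    predicate: str     # base Dgraph predicate, e.g. "Person.name"
    tags: tuple        # language tags in order; "." stands for untagged
    in_mutation: bool  # whether the field would appear in the mutation patch


def parse_pred(pred: str) -> LangPred:
    """Split an @dgraph(pred: ...) argument into base predicate and tags."""
    base, sep, tag_list = pred.partition("@")
    tags = tuple(tag_list.split(":")) if sep else ()
    # Assumption taken from note 2: fields with multiple language tags
    # can only be queried, so they are left out of the mutation patch.
    return LangPred(base, tags, len(tags) <= 1)


for pred in ("Person.name", "Person.name@hi", "Person.name@hi:en", "Person.name@hi:en:."):
    print(parse_pred(pred))
```

All four fields of the Person example resolve to the same base predicate Person.name; only the first two would be writable under this reading.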

Interaction with existing directives for language tagged fields:

schema directives

@id: not required now; can be added in the future.
@search: applicable on only one field (tagged or untagged value); it applies to all GraphQL fields that map to the same Dgraph predicate.
@lambda: doesn’t work with the @dgraph directive
@custom: doesn’t work with the @dgraph directive
@hasInverse: doesn’t apply to string fields

query directives

@skip: works normally, as with other fields
@include: works normally, as with other fields

This feature is implemented and has been merged into the master branch.
It will be available in the 21.07 release.
PR: feat(GRAPHQL): Add language tag support in GraphQL, by JatinDevDG (Pull Request #7663, dgraph-io/dgraph on GitHub)

If we were to define say 50 languages for every predicate that we wanted to define a @lang on (perhaps using an automated script to generate our schema), since they’re all mapping to the same underlying predicate, is the extra memory footprint going to be negligible?

Yeah, language tags are not stored as separate predicates in Dgraph.
In the GraphQL schema we just declare the possible language tags, and they are stored only in the schema. Dgraph learns about them only once we run a query or mutation, and they start taking up memory as we use them.
So if you have 50 language tags in the GraphQL schema for a lang field, initially Dgraph only holds memory for the predicate itself; the language tag fields start taking up memory once we query or mutate them.

Thanks for your impressive tech Dgraph!

I’m on dgraph/standalone:v22.0.2

Updating the GraphQL schema at the /admin/schema endpoint doesn’t seem to work with the @dgraph directive yet. Maybe you can add that caveat to your docs, or remove the example?

From your docs here:

type Person {
  name: String # Person.name is the auto-generated DQL predicate for this GraphQL field, unless overridden using @dgraph(pred: "...")
  nameHi: String @dgraph(pred:"Person.name@hi") # this field exposes the value for the language tag @hi for the DQL predicate Person.name to GraphQL
  nameEn: String @dgraph(pred:"Person.name@en")
  nameHi_En: String @dgraph(pred:"Person.name@hi:en") # this field uses multiple language tags: @hi and @en
  nameHi_En_untag: String @dgraph(pred:"Person.name@hi:en:.") # as this uses ., it will give untagged values if there is no value for @hi or @en
}

Adding the directives gives this error: “resolving updateGQLSchema failed because line 9 column 12: Missing colon in type declaration. Got @”
As I understand it, this feature is not yet implemented in GraphQL, and only DQL allows language querying, correct?
Thanks again

@Jay_Vid That’s right, the @dgraph directive to support multilang was introduced in 21.07. The current release (v22) was taken from 21.03.

I’ll add an issue in the dgraph docs repo.

Using this feature or any of the described combinations from language-support with IRIs as a predicate name fails for me with a parsing error.

version: v23.1
running on Docker
GraphQL Endpoint: http(s)://{host}:{port}/admin/schema

Steps to reproduce:

Trying to deploy a GraphQL schema with:

type Concept implements Thing {    
    prefLabel: String @dgraph(pred: "<http://www.w3.org/2004/02/skos/core#prefLabel>@.")
    ...
}

results in:

{"errors":[{"message":"resolving updateGQLSchema failed because line xxx column xx: Missing colon in type declaration. Got @ (Locations: [{Line: 3, Column: 4}])","extensions":{"code":"Error"}}]}

Using it without the language tag succeeds but delivers no results, as in my case there are only tagged strings.

type Concept implements Thing {    
    prefLabel: String @dgraph(pred: "<http://www.w3.org/2004/02/skos/core#prefLabel>")
    ...
}

{"data":{"code":"Success","message":"Done"}}

Is there a way to overcome this issue within the GraphQL generator?

I have chosen Dgraph for my use case because of the mentioned support for IRIs as predicate names and the ability to map those into a GraphQL API through the @dgraph directive.