@lang directive, maybe even more powerful than initially thought ❔


#1

hello,
I’m very much interested in DGraph

I have a question regarding the @lang directive. Its usefulness is obvious to me (it’s a killer feature!), but after reading the “Language Support” section of the documentation, I wonder:

  • Is the mechanism that drives this feature in any way actually related to i18n?
    Because I can see other ways this could be used that are not related to i18n. For example:

    color@rgba
    // get me results in RGBA
    color@rgba:hsva:cmyk:.
    // get me results in RGBA and fallback to HSVA, CMYK or whatever
    length@mm
    // get me results in millimeters
    

    I ask, because I don’t think I would get the same mechanics, if I just name an edge color_rgba.

  • And also, this query:

    {
        q(func: allofterms(name@en, "Farhan Akhtar")) {
            name@hi
            name@en
    
            director.film {
                name@ru:hi:en
                name@en
                name@hi
                name@ru
            }
        }
    }
    

    will produce:

    {
    "data": {
        "q": [
            {
                "name@hi": "फरहान अख्तर",
                "name@en": "Farhan Akhtar",
                "director.film": [
                    {
                        "name@ru:hi:en": "दिल चाहता है",
                        "name@en": "Dil Chahta Hai",
                        "name@hi": "दिल चाहता है"
                    }
                ]
            }
        ]
    }
    }
    

    which, of course, is very handy. But it doesn’t tell me which language the values in the response actually resolved to. What I mean is: instead of getting:

        "name@ru:hi:en": "दिल चाहता है"
    

    it would be nice to get

        "name@hi": "दिल चाहता है"
        // so that I know what language I am getting back
    

    or even:

        "name@ru:hi:en>hi":  "दिल चाहता है"
        // now I know both what prompted the result and the resolution
        // (I'm not suggesting this is a nice syntax 😆)
    
        // or:
        "name@ru:hi:en": {
            "hi": "दिल चाहता है"
        }
    
        // or:
        "name@ru:hi:en": ["hi", "दिल चाहता है"]
    

Please let me know if I am missing something in the docs.

By the way, I ask, not just out of curiosity, but because my first question is directly related to an actual use case I am dealing with right now. So the answer might tilt the scales towards using DGraph in my startup.
So, looking forward to your thoughts on this matter @mrjn

best,
F


(Manish R Jain) #2

This seems useful: it does help to know which language the result was in. The main concern from users would be that they wouldn’t know in advance what their JSON key is going to be. But one could argue that they can use an alias to ensure that.
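
Until something like this exists, the resolved language can be recovered on the client, provided each language is also requested on its own, as in the query from the first post. The Go sketch below is only an illustration: `resolveLang` is a hypothetical helper, not part of any Dgraph client, and it does not handle the `.` wildcard.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// resolveLang walks a fallback list such as "ru:hi:en" and returns the first
// language tag for which the node carries a "<pred>@<tag>" key, together with
// that key's value. ok is false when none of the listed tags is present.
// Hypothetical helper: the "." wildcard of the query syntax is not handled.
func resolveLang(node map[string]interface{}, pred, fallback string) (tag, value string, ok bool) {
	for _, t := range strings.Split(fallback, ":") {
		// A missing key yields a nil interface, so the assertion simply fails.
		if v, found := node[pred+"@"+t].(string); found {
			return t, v, true
		}
	}
	return "", "", false
}

func main() {
	// Film node copied from the response in the first post; it has no "name@ru".
	raw := `{"name@ru:hi:en": "दिल चाहता है", "name@en": "Dil Chahta Hai", "name@hi": "दिल चाहता है"}`

	var film map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &film); err != nil {
		panic(err)
	}

	tag, value, ok := resolveLang(film, "name", "ru:hi:en")
	fmt.Println(tag, value, ok)
}
```

For the film node above, the first tag in `ru:hi:en` that has a value is `hi`, which is the information the fallback key alone does not expose.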

Feel free to add a GitHub issue for that. CC: @francesc

Also, we should ensure that this feature can work with any string passed to it, not just language strings. @MichelDiz can you ensure that the feature would work with color strings as well?


(Michel Conrado) #3

It works fine.

Schema

Height: string @lang .
length: string @lang .
mycolor: string @lang .
name: string @index(exact) @lang .
thickness: string @lang .

Mutation

{
	"set": [{
			"name": "colors",
			"mycolor@rgba": "255,0,0,0.3",
			"mycolor@rgb": "255,0,0",
			"mycolor@hsla": "0, 100%, 50%, 0.3",
			"mycolor@hsl": "0, 100%, 50%",
			"mycolor@hex": "#ff00004d",
			"mycolor@cmyk": "0,100,100,0"
		},
		{
			"name": "sizes",
			"thickness@mm": "3",
			"length@mm": "841",
			"Height@mm": "1189",
			"length@in": "33.1",
			"Height@in": "46.8"
		}
	]
}

Query

{
  q(func: eq(name, [colors, sizes])){
    uid
    name
    mycolor@cmyk
    mycolor@hex
    mycolor@rgba:hsva:cmyk:.
    length@mm
    length@in
  }
}

Result

{
  "data": {
    "q": [
      {
        "uid": "0xfffd8d67d832e08e",
        "name": "colors",
        "mycolor@cmyk": "0,100,100,0",
        "mycolor@hex": "#ff00004d",
        "mycolor@rgba:hsva:cmyk:.": "255,0,0,0.3"
      },
      {
        "uid": "0xfffd8d67d832e08f",
        "name": "sizes",
        "length@mm": "841",
        "length@in": "33.1"
      }
    ]
  }
}

(PPP225) #4

I find @lang a killer feature as well! One thing I would find useful is “full-text search with stemming and stop words” support for locales with country codes.

Example: if I understand correctly, text@en currently gets us those features, whereas text@en-US, text@en-CA or text@zh-yue would not.
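
One way this could be approached, sketched in Go: reduce a regional tag to its primary language before choosing a stemmer and stop-word list. This is an assumption about a possible design, not a description of how Dgraph behaves, and `baseLang` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// baseLang returns the primary language subtag of a tag such as "en-US",
// lower-cased, so that a region- or variant-specific tag could reuse the
// full-text tokenizer of its base language.
// Hypothetical helper; it only splits on the first "-" or "_".
func baseLang(tag string) string {
	if i := strings.IndexAny(tag, "-_"); i >= 0 {
		tag = tag[:i]
	}
	return strings.ToLower(tag)
}

func main() {
	for _, t := range []string{"en", "en-US", "en-CA", "zh-yue"} {
		fmt.Printf("%s -> %s\n", t, baseLang(t))
	}
}
```

Falling back to the base language is a coarse choice: it ignores regional spelling differences, but it would at least give text@en-US the same stemming as text@en.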


#5

Thank you @mrjn and @MichelDiz for your response, and @ppp225 for chipping in.

It seems both @ppp225 and I see value in the @lang directive, but for different reasons.

I am effectively interested in overloading (maybe misusing) @lang to have “multi-unit” values (because I can benefit from the fallback mechanism). Also, for example, I am interested in storing color@rgba as an int (I’m not sure I can do that with @lang, can I?). Saving it as an int would save me some space (at least 50%) and make any computation faster, right?

I am also interested in going deeper with i18n as @ppp225 described.

So, shouldn’t these two scenarios be served by two different directives:

  • @lang for i18n and,
  • say @unit for multi-value

@unit would have the same fallback mechanism as @lang, but stripped of any i18n-related features and allowed to hold values other than strings.

What do you think? :thinking:


(Michel Conrado) #6

Only strings :confused:

To some extent you could use float, but I don’t think int fits. Take the colors, for example: they are divided by commas or dots, or even contain a percentage sign. Hex has distinct characters, both numbers and letters. That said, I can’t see any advantage in int or float (other than exclusive use for sizes). Otherwise you have your own way of doing things.


#7

Sorry @MichelDiz, I don’t understand what you mean by “fit”.

An int in DGraph is an int64 (8 bytes), and a JSON number parsed as a JavaScript Number can hold integers exactly from -(2^53-1) to 2^53-1. Both have plenty of space to hold a Go uint32, which is 4 bytes (1 byte, values 0 to 255, for each of R, G, B and A). In my scenario the client unpacks it into 4 numbers if needed. The same goes for any other color model :wink:
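
To make the idea concrete, here is a minimal Go sketch of the packing and unpacking described above. The function names and the 0xRRGGBBAA byte order are my own choices for illustration, nothing Dgraph-specific.

```go
package main

import "fmt"

// packRGBA stores one byte per channel in a single uint32, laid out as 0xRRGGBBAA.
func packRGBA(r, g, b, a uint8) uint32 {
	return uint32(r)<<24 | uint32(g)<<16 | uint32(b)<<8 | uint32(a)
}

// unpackRGBA is the inverse of packRGBA: it splits the uint32 back into channels.
func unpackRGBA(c uint32) (r, g, b, a uint8) {
	return uint8(c >> 24), uint8(c >> 16), uint8(c >> 8), uint8(c)
}

func main() {
	// Red at roughly 30% opacity: alpha 0.3 * 255 rounds to 77 (0x4d).
	c := packRGBA(255, 0, 0, 77)
	fmt.Printf("0x%08x\n", c) // prints 0xff00004d

	// The packed value fits comfortably in an int64 and stays below 2^53.
	fmt.Println(int64(c))

	r, g, b, a := unpackRGBA(c)
	fmt.Println(r, g, b, a) // prints 255 0 0 77
}
```

The packed value is the same number as the mycolor@hex string from the mutation in post #3 (#ff00004d), just stored as an integer instead of text.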

Am I missing something? :slight_smile:

So, to clarify, will @lang take an int? :thinking:


(Michel Conrado) #8

I mean dots, percentage characters, hash characters, and mixes of numbers and letters. You can’t put these in an int, unless you have your own way of handling these values as ints.