GraphQL string semantics unclear and inconsistent in the presence of escape sequences

Given a schema like this:

type Post {
    id: ID!
    raw: String!
}

Here’s a way to break a mutation:

mutation MyMutation {
  addPost(input: {raw: "\xaa Raw content"}) {
    numUids
  }
}

giving rise to the following error:

{
  "errors": [
    {
      "message": "Unexpected <Invalid>",
      "locations": [
        {
          "line": 2,
          "column": 26
        }
      ]
    }
  ]
}

This works:

mutation MyMutation {
  addPost(input: {raw: "\\xaa Raw content"}) {
    numUids
  }
}

Inconsistent Semantics

The GraphQL spec defines a String as follows: “String: A UTF‐8 character sequence.” So far so good. Both "\xaa" and "\\xaa" are valid UTF-8 character sequences. They have different meanings, however. The first is basically another way of writing "ª", while the second is the literal four-character text \xaa (backslash, x, a, a).
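To make the difference concrete, here is a minimal sketch in Python, whose handling of these two literals matches the C-style reading described above. It only illustrates how a host language interprets the sequences. It says nothing about what the GraphQL parser does with them.

```python
# How Python reads the two literals from the mutations above.
single = "\xaa"    # one character: U+00AA, i.e. "ª"
double = "\\xaa"   # four characters: backslash, 'x', 'a', 'a'

print(len(single), single == "\u00aa")
print(len(double), double)
```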

Perhaps, then, there is some unstated requirement that escape sequences (\n, \t, \a, \xhh, etc.) should themselves be escaped. But that’s untrue. Our implementation is inconsistent.

To wit, this works too, and the sequence is stored correctly:

mutation MyMutation {
  addPost(input: {raw: "\n Raw content"}) {
    numUids
  }
}

It would seem that some escape sequences are more privileged than others, which should not be the case. So which ones are they? Perhaps control characters do not need to be escaped, while octal/hex literals do. The following table enumerates what is OK and what is not:

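| Character | Type | OK? |
|---|---|---|
| \0 | Control | No |
| \a | Control | No |
| \b | Control | Yes |
| \t | Control | Yes |
| \n | Control | Yes |
| \v | Control | No |
| \f | Control | Yes |
| \r | Control | Yes |
| \x1a | Control | No |
| \e or \x1b | Control | No |
| \xHH | Hex literal | No |
| \OO | Octal literal | No |
| \uHHHH | Unicode literal | No |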
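For comparison, the string grammar in the GraphQL spec lists a fixed set of escapes: \", \\, \/, \b, \f, \n, \r, \t, and \u followed by exactly four hex digits. The sketch below encodes that list as a regular expression and checks the table's entries against it. `SPEC_ESCAPE` and `is_spec_escape` are hypothetical names for illustration, and the results describe the spec's grammar only, not the observed behaviour recorded in the table. The "Yes" rows line up with the spec's single-character escapes, while the \uHHHH row is the one place where the table and the grammar differ.

```python
import re

# Escapes allowed inside a GraphQL string literal by the spec's grammar:
# \" \\ \/ \b \f \n \r \t, or \u followed by exactly four hex digits.
SPEC_ESCAPE = re.compile(r'\\(?:["\\/bfnrt]|u[0-9A-Fa-f]{4})')


def is_spec_escape(seq: str) -> bool:
    """Return True if seq is exactly one escape sequence listed in the spec."""
    return SPEC_ESCAPE.fullmatch(seq) is not None


# Raw strings keep the backslashes as typed, so each entry is the escape text itself.
candidates = [r"\0", r"\a", r"\b", r"\t", r"\n", r"\v", r"\f", r"\r",
              r"\x1a", r"\e", r"\xaa", r"\12", r"\u00aa"]

for seq in candidates:
    verdict = "in spec" if is_spec_escape(seq) else "not in spec"
    print(seq, verdict)
```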

It’s rather maddening, especially when dealing with large text corpora. For example, people on forums type/mistype \u all the time (e.g. a dyslexic user may type "me\u" instead of "me/u"). Ironically, this very post can’t go into Slash GraphQL without further processing.
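As a rough idea of what that further processing could look like, here is a minimal sketch that escapes raw text before placing it inside a mutation. `to_graphql_string_literal` is a hypothetical helper, not part of any library. It leans on the assumption that the escapes `json.dumps` emits with `ensure_ascii=False` (\", \\, \n, \r, \t, \b, \f, and \u00XX for other control characters) are all ones a GraphQL string literal accepts.

```python
import json


def to_graphql_string_literal(text: str) -> str:
    """Quote and escape raw text for use as a string literal in a query.

    Hypothetical helper: assumes JSON-style escaping is acceptable to the
    GraphQL parser. ensure_ascii=False leaves non-ASCII characters as-is
    instead of turning them into \\uXXXX escapes.
    """
    return json.dumps(text, ensure_ascii=False)


# Forum-style text containing backslashes that are not meant as escapes.
raw = r"me\u and \xaa Raw content"
literal = to_graphql_string_literal(raw)

# Mirrors the addPost mutation from the examples above.
mutation = "mutation MyMutation { addPost(input: {raw: %s}) { numUids } }" % literal
print(mutation)
```

Every backslash in the input is doubled, so the text ends up in the same form as the "\\xaa" mutation that was shown to work.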

What is the ideal way?

thisistheway

Jokes aside, strings are hard, man. I think we should follow the GraphQL spec and just accept a sequence of UTF-8 characters. That means not expecting the inputs to be escaped. We should let people type whatever they want in strings.

Hi @chewxy, nice find!
I am creating a ticket for this and will make these enhancements in the coming sprints, after discussing with the GraphQL team.

I’m putting in a PR to vektah/gqlparser, which is where I have traced the errors to.
