SQL migration tool: shenanigans and solutions

These are solutions for some issues I encountered when migrating a Shakespeare database to DGraph. The generated DGraph dataset is open source, do to check it out and leave a star :star:. A shoutout to the DGraph community for helping me out!

Issues with MySQL

  1. Without foreign keys, DGraph links are not generated. The legacy MyISAM engine does not support foreign keys, migrate to a newer engine to add them.

  2. Unsupported types: blob, mediumInt and char cause the migration tool to fail.

  3. Different naming conventions: MySQL generally uses snake case and plurals for naming columns and tables. By contrast, GraphQL types are in singular and use Pascal case. After the sql.rdf file is generated, you may wish to change type and property names. Eg. play_characters should be PlayCharacter in GraphQL.

  4. Foreign keys need to be renamed: Foreign key relations are transformed to GraphQL relations. Eg. let paragraphId be primary key of Paragraph table. It is present as a foreign key in Character table. <Character.paragraphId> must be replaced with <Character.paragraph> so that we get a properly named GraphQL schema.

    type Character {
        charId: String! @id
        paragraph: Paragraph # confusing if this field instead was paragraphId: Paragraph
    }
    

Issues with migration tool

  1. GraphQL schema must be separately designed and explicitly added in order to execute GraphQL queries. The generated schema.txt is a DQL schema, adding this does not generate a GraphQL schema. Note that a DQL schema is auto-generated when GraphQL schema is added.

  2. <dgraph.type> fields are not generated in the sql.rdf file. Without <dgraph.type> the node data can be queried from DQL but not from GraphQL.

    Regular expression solution: Let <Character.charId> be the table ID. We wish to add <dgraph.type> "Character" for every character.

    In VS Code, regex search for (.*?) <Character.charId> (.*?) . and replace with

    $1 <Character.charId> $2 .
    $1 <dgraph.type> "Character" .
    

    This gives us

    _:Character.1citizen <Character.charId> "1citizen" .
    _:Character.1citizen <dgraph.type> "Character" .
    
  3. @hasInverse do not work when data is added using RDF instead of a GraphQL mutation. The inverse connections need to be explicitly added. This can’t be done manually for large datasets.

    type Paragraph {
        paragraphId: String! @id
        character: Character
    }
    
    type Character {
        charId: String! @id
        paragraphs: [Paragraph] @hasInverse(field: character)
    }
    

    Regular expression solution: Suppose the generated dataset has Paragraph.character predicates and we want to add the inverse Character.paragraphs predicates.

    In VS Code regex search for (.*?) <Paragraph.character> (.*?) . and replace with

    $1 <Paragraph.character> $2 .
    $2 <Character.paragraphs> $1 .
    

    This gives us

    _:Paragraph.866084 <Paragraph.character> _:Character.hamlet .
    _:Character.hamlet <Character.paragraphs> _:Paragraph.866084 .
    
  4. After migration, we are using externally generated IDs (the table primary keys) in DGraph. We need to use String! @id instead of type ID for these fields.

    type Character {
        charId: String! @id # not ID type
    }