Storing JSON documents in a graph

Hi, I’m trying to build a graph that represents the relationships between instances and the other resources in my AWS account.
Instances, security groups, databases, etc. all have large XML documents that describe them (see below). I’d like to write a program that parses the XML and generates a graph by converting it to an RDF format.

Top-level properties that hold primitives such as strings and integers are easy to model in the graph, because they are just properties of the node. But the XML also contains nested information, such as the block device mapping below. Where we have this kind of nesting, does it make sense to model it as a separate node that has a relationship to the instance?

i.e. something like:

device_1 name “/dev/xvda”
instance has_a <device_1>

I figured this could be a generic approach to parsing XML responses and constructing a graph, but maybe the ‘has_a’ relationship would become heavily overloaded. Then again, with filtering I could still find the things I’m interested in.
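A minimal sketch of that generic approach in Python, assuming the XML has the shape shown below. The function name `xml_to_triples` and the `_:`-prefixed node labels are hypothetical, chosen only for illustration. Instead of a single ‘has_a’ relationship, this variant uses the nested element's own name as the predicate, which is one way to keep the relationship from becoming overloaded.

```python
import itertools
import xml.etree.ElementTree as ET


def xml_to_triples(element, subject, counter=None):
    """Yield (subject, predicate, object) triples for the children of an XML element."""
    if counter is None:
        counter = itertools.count(1)
    for child in element:
        if len(child) == 0:
            # Leaf element: a plain property of the current node.
            # Empty elements such as <productCodes/> are skipped.
            if child.text and child.text.strip():
                yield (subject, child.tag, child.text.strip())
        else:
            # Nested element: create a new node, link it from the parent,
            # then describe the new node recursively.
            node = f"_:{child.tag}_{next(counter)}"
            yield (subject, child.tag, node)
            yield from xml_to_triples(child, node, counter)


# A shortened version of the instance description, for illustration only.
SAMPLE = """
<item>
  <instanceId>i-1234567890abcdef0</instanceId>
  <blockDeviceMapping>
    <item>
      <deviceName>/dev/xvda</deviceName>
    </item>
  </blockDeviceMapping>
</item>
"""

triples = list(xml_to_triples(ET.fromstring(SAMPLE), "_:instance"))
for triple in triples:
    print(triple)
```

Each nested element becomes its own node, so the device ends up as a node with a `deviceName` property, reachable from the instance through `blockDeviceMapping` and `item`.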

Any thoughts or suggestions on how to solve this type of problem?

<item>
                    <instanceId>i-1234567890abcdef0</instanceId>
                    <imageId>ami-bff32ccc</imageId>
                    <privateDnsName>ip-192-168-1-88.eu-west-1.compute.internal</privateDnsName>
                    <dnsName>ec2-54-194-252-215.eu-west-1.compute.amazonaws.com</dnsName>
                    <keyName>my_keypair</keyName>
                    <amiLaunchIndex>0</amiLaunchIndex>
                    <productCodes/>
                    <instanceType>t2.micro</instanceType>
                    <launchTime>2018-05-08T16:46:19.000Z</launchTime>
                    <placement>
                        <availabilityZone>eu-west-1c</availabilityZone>
                        <groupName/>
                        <tenancy>default</tenancy>
                    </placement>
                    <monitoring>
                        <state>disabled</state>
                    </monitoring>
                    <subnetId>subnet-56f5f633</subnetId>
                    <vpcId>vpc-11112222</vpcId>
                    <privateIpAddress>192.168.1.88</privateIpAddress>
                    <ipAddress>54.194.252.215</ipAddress>
                    <sourceDestCheck>true</sourceDestCheck>
                    <groupSet>
                        <item>
                            <groupId>sg-e4076980</groupId>
                            <groupName>SecurityGroup1</groupName>
                        </item>
                    </groupSet>
                    <architecture>x86_64</architecture>
                    <rootDeviceType>ebs</rootDeviceType>
                    <rootDeviceName>/dev/xvda</rootDeviceName>
                    <blockDeviceMapping>
                        <item>
                            <deviceName>/dev/xvda</deviceName>
                            <ebs>
                                <volumeId>vol-1234567890abcdef0</volumeId>
                                <status>attached</status>
                                <attachTime>2015-12-22T10:44:09.000Z</attachTime>
                                <deleteOnTermination>true</deleteOnTermination>
                            </ebs>
                        </item>
                    </blockDeviceMapping>
...
...
...

Hey CodeG.

There is no way to store JSON in Dgraph. What you can do is have your application serialize that part of your doc to a string and save it as the value of an edge. Then, when needed, convert that value back to JSON at the application level.
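A rough sketch of that round trip in Python, assuming the nested part is already available as a dictionary. The helper names `pack_nested` and `unpack_nested` are hypothetical, and the actual write to Dgraph is left out.

```python
import json


def pack_nested(doc, keys):
    """Return a copy of doc in which the listed nested values are JSON strings."""
    packed = dict(doc)
    for key in keys:
        if key in packed:
            packed[key] = json.dumps(packed[key], sort_keys=True)
    return packed


def unpack_nested(doc, keys):
    """Reverse pack_nested: parse the listed string values back into objects."""
    unpacked = dict(doc)
    for key in keys:
        if key in unpacked:
            unpacked[key] = json.loads(unpacked[key])
    return unpacked


instance = {
    "instanceId": "i-1234567890abcdef0",
    "blockDeviceMapping": {
        "item": {
            "deviceName": "/dev/xvda",
            "ebs": {"volumeId": "vol-1234567890abcdef0", "status": "attached"},
        }
    },
}

# What the application would save: the nested part is now one string value.
stored = pack_nested(instance, ["blockDeviceMapping"])
# What the application would do after reading the value back.
restored = unpack_nested(stored, ["blockDeviceMapping"])
```

The trade-off is that the nested fields are opaque to the database, so they cannot be filtered on without parsing the string in the application.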

I did not understand that part. Are you creating unique predicates for each device?

You could also use facets (@facets) with that parsed doc; see “Get started with Dgraph”.

Cheers.

This is your doc in JSON mutation format.

{
  "set": [
    {
      "item": {
        "instanceId": "i-1234567890abcdef0",
        "imageId": "ami-bff32ccc",
        "privateDnsName": "ip-192-168-1-88.eu-west-1.compute.internal",
        "dnsName": "ec2-54-194-252-215.eu-west-1.compute.amazonaws.com",
        "keyName": "my_keypair",
        "amiLaunchIndex": "0",
        "productCodes": "",
        "instanceType": "t2.micro",
        "launchTime": "2018-05-08T16:46:19.000Z",
        "placement": {
          "availabilityZone": "eu-west-1c",
          "groupName": "",
          "tenancy": "default"
        },
        "monitoring": {
          "state": "disabled"
        },
        "subnetId": "subnet-56f5f633",
        "vpcId": "vpc-11112222",
        "privateIpAddress": "192.168.1.88",
        "ipAddress": "54.194.252.215",
        "sourceDestCheck": "true",
        "groupSet": {
          "item": {
            "groupId": "sg-e4076980",
            "groupName": "SecurityGroup1"
          }
        },
        "architecture": "x86_64",
        "rootDeviceType": "ebs",
        "rootDeviceName": "/dev/xvda",
        "blockDeviceMapping": {
          "item": {
            "deviceName": "/dev/xvda",
            "ebs": {
              "volumeId": "vol-1234567890abcdef0",
              "status": "attached",
              "attachTime": "2015-12-22T10:44:09.000Z",
              "deleteOnTermination": "true"
            }
          }
        }
      }
    }
  ]
}
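If you want to generate a document of this shape from the XML rather than write it by hand, something along these lines might work. It is only a sketch: `element_to_dict` is a hypothetical helper, every value is kept as a string, and repeated sibling elements (for example several `<item>` entries in a `groupSet`) are collected into a list, which is an assumption about how you would want them represented.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Convert an XML element into nested dicts; leaf elements become strings."""
    if len(element) == 0:
        # Leaf or empty element: use its text, or "" when there is none.
        return (element.text or "").strip()
    result = {}
    for child in element:
        value = element_to_dict(child)
        if child.tag in result:
            # Repeated sibling tag: collect the values into a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


# A shortened version of the instance description, for illustration only.
SAMPLE = """
<item>
  <instanceId>i-1234567890abcdef0</instanceId>
  <productCodes/>
  <placement>
    <availabilityZone>eu-west-1c</availabilityZone>
    <tenancy>default</tenancy>
  </placement>
</item>
"""

root = ET.fromstring(SAMPLE)
mutation = {"set": [{root.tag: element_to_dict(root)}]}
print(json.dumps(mutation, indent=2))
```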

In the example below, you create a single node and attach each piece of information to the item and ebs edges. You would then parse this format back into XML when you need to.

{
  "blockDeviceMapping": {
    "item|deviceName": "/dev/xvda",
    "ebs|volumeId": "vol-1234567890abcdef0",
    "ebs|status": "attached",
    "ebs|attachTime": "2015-12-22T10:44:09.000Z",
    "ebs|deleteOnTermination": "true"
  }
}
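The flattening in the example above could be produced in the application with a small helper like the one sketched here. The names `flatten_leaves` and `to_pipe_keys` are hypothetical, and the code only reproduces the `parent|key` layout shown above; how Dgraph interprets such keys should be checked against its facets documentation.

```python
def flatten_leaves(nested, parent, out):
    """Collect every scalar leaf of nested into out under the key 'parent|key'."""
    for key, value in nested.items():
        if isinstance(value, dict):
            # Descend: leaves below this level are prefixed with this key.
            flatten_leaves(value, key, out)
        else:
            out[f"{parent}|{key}"] = value
    return out


def to_pipe_keys(name, nested):
    """Wrap the flattened leaves in a single object stored under name."""
    return {name: flatten_leaves(nested, name, {})}


block_device_mapping = {
    "item": {
        "deviceName": "/dev/xvda",
        "ebs": {
            "volumeId": "vol-1234567890abcdef0",
            "status": "attached",
            "attachTime": "2015-12-22T10:44:09.000Z",
            "deleteOnTermination": "true",
        },
    }
}

flat = to_pipe_keys("blockDeviceMapping", block_device_mapping)
```

Because only the immediate parent name is kept, two leaves with the same name under identically named parents would collide, so this suits small, fixed structures like the one here.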
