Querying and visualization with Jupyter notebook

Hello, I’m looking to integrate a Dgraph database with a Jupyter Notebook so that we can query data from the db and visualize the results. On the viz side, one of the most promising options I’ve seen so far is pygraphistry (graphistry/pygraphistry on GitHub), a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer. It doesn’t provide Dgraph support out of the box, but I was thinking of transforming the Dgraph JSON output into a Pandas dataframe, which pygraphistry can then process. Is this a reasonable way to approach it, or is there a better one?

Thanks in advance.

Hi @pirxthepilot, welcome to Dgraph!
Have you tried Ratel for visualization? You can try it out in the playground, or build it locally from GitHub. I don’t think there is an easy way to connect Ratel with Jupyter notebooks yet, though. What is your use case in terms of the data size and the kind of queries you wish to visualize?

@Anurag thanks for the welcome!

I am using Ratel and it’s pretty great, but we also work with other data that’s not in Dgraph and would like to consolidate everything in one place (Jupyter). Do you think the pygraphistry method could be one way to do this?

This will be used primarily for analyzing log events in a security context. As for data size, I’m not sure yet; we are still playing around with it :)

Hey @pirxthepilot,
We haven’t seen anyone use pygraphistry with Dgraph, but you are encouraged to try. I feel you can use any visualization tool once you have the JSON data from Dgraph. We have seen people use d3 with Dgraph, so maybe you could explore that as well.
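For the d3 route, the main step is reshaping the query result into the node-link JSON that d3 force layouts usually consume. Here is a minimal sketch, assuming the edges have already been extracted as (source uid, target uid) pairs and that the d3 side resolves links with something like `forceLink().id(d => d.id)`. The function name `to_d3_node_link` and the sample uids are made up for illustration.

```python
import json


def to_d3_node_link(edges):
    """Turn (source_uid, target_uid) pairs into a d3-style node-link dict.

    Hypothetical helper: nodes carry only an "id"; extra attributes would
    have to be merged in from the Dgraph result separately.
    """
    seen = set()
    node_ids = []
    links = []
    for src, dst in edges:
        # Record each uid once, in first-seen order.
        for uid in (src, dst):
            if uid not in seen:
                seen.add(uid)
                node_ids.append(uid)
        links.append({"source": src, "target": dst})
    return {"nodes": [{"id": uid} for uid in node_ids], "links": links}


# Illustrative uids, not real data.
graph = to_d3_node_link([("0x1", "0x2"), ("0x1", "0x3"), ("0x3", "0x2")])
d3_json = json.dumps(graph)  # string that can be handed to the d3 page
```

The resulting `d3_json` string can be written to a file or embedded in a notebook cell that renders the d3 page.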

Oh nice, d3 looks very promising! Will check it out, thank you!

@pirxthepilot @Anurag Just saw this when another dgraph user asked the same thing:

With just a few lines, you should be able to do something like:

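import json

import pandas as pd
import pydgraph
import graphistry

# Graphistry needs credentials once per session before plot(), e.g.:
# graphistry.register(api=3, username='...', password='...')

# #### 1. Dgraph data wrangling ####

txn = pydgraph.DgraphClient(...).txn()
dql_query = """
my query
"""
dql_opts = {}  # optional query variables, passed through to txn.query
res = txn.query(dql_query, variables=dql_opts)
# res.json is a JSON document keyed by query block name, so pick the block
# before building the frame ('my_block' is a placeholder for your block name)
edges_df = pd.json_normalize(json.loads(res.json)['my_block'])
print(edges_df.info())
print(edges_df.sample(10))

# #### 2. Plotting 1-liners ####

g1 = graphistry.edges(edges_df, 'some_src_col', 'some_dst_col')
g1.plot()

nodes_df = ...  # placeholder: same flow as for dql -> edges_df
g2 = g1.nodes(nodes_df, 'some_node_id_col')  # new plottable that layers on node bindings
g2.plot()

url = g2.plot(render=False)  # now as a url for, say, manual iframe'ing in an app or pasting into Slack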
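The `dql -> edges_df` step depends on the shape of the query, since Dgraph returns nested JSON rather than a flat edge list. As a rough sketch of one way to flatten it, assuming every node in the result includes `uid` and that edges live under a single predicate (here the hypothetical `connects_to`), something like the following could work. The sample response, the block name `events`, and the helper `flatten` are all invented for illustration.

```python
# Hypothetical shape of json.loads(res.json) for a query block named "events"
# that requests uid, name, and a nested "connects_to" predicate.
sample_response = {
    "events": [
        {
            "uid": "0x1",
            "name": "host-a",
            "connects_to": [
                {"uid": "0x2", "name": "host-b"},
                {
                    "uid": "0x3",
                    "name": "host-c",
                    "connects_to": [{"uid": "0x2", "name": "host-b"}],
                },
            ],
        }
    ]
}


def flatten(records, edge_pred, nodes=None, edges=None):
    """Walk nested Dgraph-style records; collect node attributes and edges."""
    nodes = {} if nodes is None else nodes
    edges = [] if edges is None else edges
    for rec in records:
        children = rec.get(edge_pred, [])
        # Everything except the edge predicate is treated as a node attribute.
        attrs = {k: v for k, v in rec.items() if k != edge_pred}
        nodes.setdefault(rec["uid"], {}).update(attrs)
        for child in children:
            edges.append({"src": rec["uid"], "dst": child["uid"]})
        flatten(children, edge_pred, nodes, edges)  # recurse into nested levels
    return nodes, edges


nodes, edges = flatten(sample_response["events"], "connects_to")
node_rows = list(nodes.values())  # one dict per distinct uid
```

Passing `edges` and `node_rows` to `pd.DataFrame` should give frames that can be bound with `'src'`, `'dst'`, and `'uid'` as the column names in the plotting calls.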

It’d be great to get a sample notebook into the Dgraph + pygraphistry docs, esp. now that we have the free Graphistry Hub tier out. This should save folks from having to use the more awkward & manual graph libs, bring in the ability to use pydata flows, and enable them to see much more of their data and visually interact with it.

Also worth noting, for folks not using the Dgraph Python client: all of the above is just thin wrappers over Graphistry’s REST web API (and an optional JS client): https://hub.graphistry.com/docs

For those following this thread, see: Jupyter Notebook with graph visualization (via Graphistry) now available