Using dgraph as data warehouse?


#1

Hi there!

does anybody use dgraph as a data warehouse backend?

I’m trying to evaluate if I can use dgraph efficiently as data warehouse with bi-temporal data modeling for example in financial institutions. Financial institutions have usually very complex data model and the their regulator asks often for a information, that can be only estimated using deep joins and filters. I hope, that with dgraph a) the usually complex data model could be simplified (so that no data model specialist could work with it) and b) deep joins should be executed faster.

Any experience with such use of dgraph?

Cheers!


(Rus) #2

Warehousing as in type 2 dimension? if u find a realtime bi-temporal DB that changes tables while referencing R/T recorded tick on 64 instruments, let me kno. Far as I throw abstraction, it requires dual channel GPU and CUDA (or equiv) to achieve what I need.
yor location?


#3

Thank you for your answer.

What I would like to do is importing end of date state data in dgraph. To avoid importing each day data without any changes, I would like to use some bi-temporal abstraction like ibm db2 with four timestamps for each node (business_time_start, business_time_end, sys_time_start, sys_time_end) - take a look here for the implementation in db2. The graph implementation would be something like that:

arrows%20(3)

When I query the graph, then I could use a filter for example on business_time… to return the state of the data on some point in time.


#4

If I have predicates used in the root node and in the expanded node (see example above).
Can I use one global filter to filter all nodes while expanding the graph?

When I use the query below, the filter applies only to the root nodes:

{
  q(func: type(user)) @filter(gt(business_time_start, "2019-09-15") and eq(business_time_end, "9999-12-31")){
    uid
    name
    user_listing{
        uid
        name
    }
  }
}

Do I really need to apply the same filter to each edge when expanding?

{
  q(func: type(user)) @filter(gt(business_time_start, "2019-09-15") and eq(business_time_end, "9999-12-31")){
    uid
    name
    user_listing@filter(gt(business_time_start, "2019-09-15") and eq(business_time_end, "9999-12-31")){
        uid
        name
    }
  }
}

(Rob Streeting) #5

I am also interested if there is a feature to do this, currently we have to use the filter in every uid predicate as @graphpivot showed above.


#6

Hi @rstreeting. Thank you for your comment. Are you using bitemporal modeling or just one time dimension? Can I also ask you:

  • For which purpose are using the time dimensoins? Some kind of data analytics, for example comparing two data points for reporting etc.?
  • What kind of front-end are you using? I’m actually working on a hybrid server-client-application, where one can call data from dgraph, then import it on the server side in python-pandas dataframe and make fun stuff like filtering, grouping, pivot, adding external data, export data etc

Cheers!


(stroboshaver) #7

From the bitemporal perspective coming from an rdbms i can not see how this schema fits with bitemporal usage in dgraph, because dgraph has no complex primary keys.

I guess you had to come up with at least one extra type for each existing type and facets on the relation to cover bitemporal data.

What was your solution for bitemporal data so you could query by uid and system and business time?


(system) closed #8

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.