A getting started thing


(Dalu) #1

OK to cut to the chase.

How do I use dgraph in Go?
The tour doesn’t help much.
I feel it’s artificially complicated and not on point.
I have this concrete problem and I’d like to solve it and by solving it I’m learning how to use dgraph.
Those abstract movies and schemas or are they called mutations. I don’t know.
It’s all too much.

Ok let’s do something like this:

type Forum struct {
	id          string
	name        string
	description string
	threads     []Thread // a forum has many threads
}

type Thread struct {
	id        string
	forum_id  string
	author_id string // is an openid-connect id of a subject
	title     string
	preview   string
	posts     []Post // a thread has many posts
}

type Post struct {
	id             string
	thread_id      string
	author_id      string
	parent_post_id string
	title          string
	body           string
}

how do I …

  • create the schema?
  • create a forum in Go
  • query for a forum in Go
  • update a forum in Go
  • delete a forum in Go

Let’s start with “create the schema”.
And, how do I isolate… is 1 dgraph database supposed to be for 1 project only?
I don’t understand the terminology.
Why mutation? Why not table or collection? This isn’t biology.


(Pawan Rawal) #2

Hey @dalu

We do have some examples here but with this new release that we are doing today it should be even easier to add data to the graph (which we do through something called mutations).

With the SetObject it should be very easy to add/update data. Once I am done with the release, I can write an example to do the CRUD that you are talking about and it can be part of our docs.

No, a mutation here is like your INSERT/UPDATE statement in a SQL database. Basically, anything that modifies the data.

Unlike a SQL database, you don’t need to have a fixed schema. You can just add your data and modify the schema on the fly if you need to.

Typically yes, one Dgraph database is for one project, unless you namespace the predicates by the project name or some id.

We don’t have tables or collections. Graph databases have nodes and edges. You could read https://docs.dgraph.io/guides/#what-is-a-graph and see if this brings a better understanding.
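
To make the nodes-and-edges idea concrete, here is a minimal in-memory sketch in plain Go. It does not use the Dgraph client at all. The Triple type, the node ids (forum1, thread1, post1) and the predicate names are made up for illustration; the only point is that a forum, its threads and its posts can be written down as nodes connected by named edges instead of rows in tables.

```go
package main

import "fmt"

// Triple is one fact in the graph: subject -predicate-> object.
// This is a hypothetical helper type for illustration, not a Dgraph type.
type Triple struct {
	Subject      string
	Predicate    string
	Object       string
	ObjectIsNode bool // true: the object is another node (an edge); false: a plain value
}

// forumTriples describes one forum with one thread and one post.
func forumTriples() []Triple {
	return []Triple{
		{"forum1", "name", "General", false},
		{"forum1", "thread", "thread1", true},
		{"thread1", "title", "Hello", false},
		{"thread1", "post", "post1", true},
		{"post1", "body", "First post", false},
	}
}

// edgesFrom returns the ids of the nodes reachable from subject via predicate.
func edgesFrom(ts []Triple, subject, predicate string) []string {
	var out []string
	for _, t := range ts {
		if t.Subject == subject && t.Predicate == predicate && t.ObjectIsNode {
			out = append(out, t.Object)
		}
	}
	return out
}

func main() {
	ts := forumTriples()
	fmt.Println(edgesFrom(ts, "forum1", "thread")) // [thread1]
	fmt.Println(edgesFrom(ts, "thread1", "post"))  // [post1]
}
```

There is no Forum table here: "forum1" is just a node that happens to have a name value and a thread edge.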


(Pawan Rawal) #3

Hey @dalu

I wrote a small program for doing CRUD for the forum.

Let me know how this looks. This would work with the latest version of Dgraph (v0.8.3)
As you can see I didn’t set any schema explicitly. You can update the schema later if your queries need them. Hope this clarifies a few things.


(Dalu) #4

Hello @pawan,

Sorry for the late reply. I skimmed it yesterday but couldn’t post because I was on a different system without credentials.

First of all, thank you so much, that’s a great starting point.
I will probably have questions.

For instance:

  • do I have to use int64 for the identifier?
  • do I have to create the whole structure forum, thread, post in order to be able to manipulate a let’s say post?

I will play around with it and get back to you. I’m sure I’ll have questions.


(Dalu) #5
docker run -it -p 8080:8888 -p 9080:8889 -v ~/dgraph:/dgraph --name dgraph dgraph/dgraph dgraphzero -w zw
Setting up listener at: localhost:8888
Setting up listener at: localhost:8889
2017/10/10 11:41:14 node.go:234: Found hardstate: {Term:2 Vote:1 Commit:4 XXX_unrecognized:[]}
2017/10/10 11:41:14 node.go:246: Group 0 found 4 entries
2017/10/10 11:41:14 raft.go:292: Restarting node for dgraphzero
2017/10/10 11:41:14 raft.go:567: INFO: 1 became follower at term 2
2017/10/10 11:41:14 raft.go:315: INFO: newRaft 1 [peers: [], term: 2, commit: 4, applied: 0, lastindex: 4, lastterm: 2]
Running Dgraph zero...
2017/10/10 11:41:17 raft.go:749: INFO: 1 is starting a new election at term 2
2017/10/10 11:41:17 raft.go:580: INFO: 1 became candidate at term 3
2017/10/10 11:41:17 raft.go:664: INFO: 1 received MsgVoteResp from 1 at term 3
2017/10/10 11:41:17 raft.go:621: INFO: 1 became leader at term 3
2017/10/10 11:41:17 node.go:301: INFO: raft.node: 1 elected leader 1 at term 3
^CShutting down...
2017/10/10 11:41:57 gRpc server stopped : accept tcp 127.0.0.1:8888: use of closed network connection
2017/10/10 11:41:57 Stopped taking more http(s) requests. Err: accept tcp 127.0.0.1:8889: use of closed network connection
2017/10/10 11:41:57 All http(s) requests finished.
All done.
darko@wrk ~ $ docker start dgraph 
dgraph
darko@wrk ~ $ docker ps
CONTAINER ID        IMAGE               COMMAND              CREATED             STATUS              PORTS                                                                NAMES
f4b75c05ff4a        dgraph/dgraph       "dgraphzero -w zw"   49 seconds ago      Up 2 seconds        8080/tcp, 9090/tcp, 0.0.0.0:8080->8888/tcp, 0.0.0.0:9080->8889/tcp   dgraph
$ go run main.go
Creating a forum and associated threads and posts.

2017/10/10 13:42:23 rpc error: code = Unavailable desc = transport is closing
exit status 1

(Pawan Rawal) #6

Looks like you just ran dgraphzero and not the second command to run dgraph.

docker exec -it dgraph dgraph --bindall=true --memory_mb 2048 -peer 127.0.0.1:8888

From the documentation at https://docs.dgraph.io/get-started/#using-docker


(Pawan Rawal) #7

For now yes, because we use a uint64 for ids internally. Though we could also parse it into a string.

No, if you see the example carefully, I am manipulating the thread directly. Similarly, you could manipulate the post directly when you have its id.
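
As a rough sketch of what "manipulating the post directly" could look like on the Go side: the struct below reuses the `_uid_` JSON tag convention that appears later in this thread, and only the standard library is used. The updatePayload helper is hypothetical, and it is an assumption that the client treats fields left out of the object as unchanged.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Post carries only the fields needed for this sketch.
type Post struct {
	Id    uint64 `json:"_uid_,omitempty"`
	Title string `json:"title,omitempty"`
	Body  string `json:"body,omitempty"`
}

// updatePayload serializes a post that has only its id and a new body set.
// Thanks to omitempty, the untouched Title does not appear in the output.
func updatePayload(id uint64, body string) (string, error) {
	b, err := json.Marshal(Post{Id: id, Body: body})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := updatePayload(7, "edited body")
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // {"_uid_":7,"body":"edited body"}
}
```

Note that neither the forum nor the thread shows up in the object: the post's own id is enough to address it.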


(Dalu) #8

I see. Every “entity” or “node” has its own ID but the id is global. So every “dataset” is of a variable type but gets its own autoincrementing unique id assigned.
So translated to a mongodb it’s like one giant collection, there are not different “tables”/“collections”.
Only that the connections between the fields are connected by a “descriptor”.
[node forum] -field- [attribute/node/leaf title]
One large collection and you mentioned namespaces?

The identifier uint64 seems large but when there are different types of data all using the same global id IMHO the identifier needs something that is more like a bson.ObjectId, aka something that cannot overflow, as it has a time and counter (and a machineid) component in it. I understand that if 10 billion people would make 100 posts every day it would still take a couple of 100s of 1000s and more years to reach id overflow but what if? What if you have something like facebook that also has movie data and location data and all kinds of data and what if you have all the public/open data feeds in your graphdb? Maybe make it a time+uint64 string id. Or uint128.
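
The overflow estimate above can be checked with a few lines of Go. This is only back-of-the-envelope arithmetic, assuming one id is consumed per post and ignoring every other kind of node; the function name is made up. With those numbers the id space lasts roughly 50 thousand years, which is less than the "hundreds of thousands" guessed above but still a long time.

```go
package main

import (
	"fmt"
	"math"
)

// yearsToExhaust returns how many whole 365-day years it takes to hand out
// every uint64 value at the given number of new ids per day.
func yearsToExhaust(idsPerDay uint64) uint64 {
	days := uint64(math.MaxUint64) / idsPerDay
	return days / 365
}

func main() {
	const people = 10_000_000_000    // 10 billion people
	const postsPerPersonPerDay = 100 // each writing 100 posts a day

	fmt.Println(yearsToExhaust(people * postsPerPersonPerDay)) // 50539
}
```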

Sorry for being “lazy”, but when the 2nd command from the “Get started with docker” doc is run, it runs in the foreground. When I run it with -d so it runs in the background, the database is inaccessible. Is there a way to just have dgraph running, like with a --restart=always, so it’s available all the time without needing to run the 2nd docker line in the background?
Also I see you’re building upon ubuntu:16.04(base) and then 14.04(release), wouldn’t it be better, since Go can be statically linked to have a FROM scratch. I mean, sure, if multiple docker images use ubuntu on the same host then it’s not an issue but if not then you have 200-600some MB of space wasted on a base OS you’ll never need. And the curl https://get.dgraph.io | bash could be done in Go with a helper. Still smaller than a whole OS.

Just a few thoughts. I’ll get to analyzing the code and putting it into corresponding functions now :slight_smile:


(Dalu) #9

I did a few changes. For instance in the Post type.

type Post struct {
	Id         uint64  `json:"_uid_,omitempty"`
	AuthorId   string  `json:"author_id,omitempty"`
	ParentPost *Post   `json:"parent_post,omitempty"`
	Thread     *Thread `json:"thread,omitempty"`
	Title      string  `json:"title,omitempty"`
	Body       string  `json:"body,omitempty"`
}

I’m not sure if this Thread is needed here in Post but it’s good to have a reference to the thread the post belongs to.

And I also changed Thread because it would seem that this makes more sense, in a practical sense:

type Thread struct {
	Id        uint64  `json:"_uid_,omitempty"`
	Forum     *Forum  `json:"forum,omitempty"`
	FirstPost *Post   `json:"first_post,omitempty"`
	LastPost  *Post   `json:"last_post,omitempty"`
	Posts     []*Post `json:"posts,omitempty"`
}

So the Thread’s title is determined by a query for the first Post and there’s the last post, aka the latest reply to the thread (so you can mouseover skim the original post’s content and the last post’s content, see if it’s worth reading etc.).
Likewise there’s a Forum reference (to skim other thread titles in the forum or w/e).

So the question is … “relationships”.
a Thread belongs to a Forum (One), a Forum can have many Threads (ToMany)
a Post belongs to a Thread (One), a Thread can have many Posts (ToMany)

Forum > Thread OneToMany (aka []*Thread)
Thread > Forum ManyToOne (aka forum_id)

Thread > Post OneToMany ( aka []*Post)
Post > Thread ManyToOne ( aka thread_id)

Why not ManyToMany? Because One Post can only belong to One Thread.

But how do you do that in Go and dgraph?

https://tour.dgraph.io/intro/3/ says

  schema {
    name: string @index(exact, term) .
    age: int @index(int) .
    friend: uid @count .
  }

So friend is a uid of a node with the @count. Oh right, I forgot, I’d like to know how many Threads are in a Forum and how many Posts are in a Thread. Is anything special required on the Go side?

What is recommended?
Let’s take the Thread Post relation and structs.
Would it make more sense to have a ThreadId uint64 `json:"thread_id"` in Post or is Thread *Thread `json:"thread"` good enough?


(Dalu) #10

After playing around this question more or less answers itself :slight_smile:
It’s all in the query. Let me experiment, things will be more clear then.


(Dalu) #11

How do I write an index in a model?

I tried

type Forum struct {
    Title string `dgraph:"title,index" json:"title"`
}
type Forum struct {
    Title string `dgraph:"title,@index(exact,term)" json:"title"`
}

(Manish R Jain) #12

You need to pass a schema. You can see more details in docs.dgraph.io.


(Dalu) #13

If the answer was there I wouldn’t ask.
I can add a schema to a request with Req.AddSchema. But https://godoc.org/github.com/dgraph-io/dgraph/protos#SchemaUpdate ?? You wot?

But what irritates me is that this schema is then used for ALL nodes.
Before I tried this forum thing I did parts of the tour.
The tour had a schema with an index on name, therefore I was able to query by name.
Had I not done it, I wouldn’t be able to do so.

I’m finding it hard to believe that one schema fits all. Usually it doesn’t. Maybe I don’t understand it correctly.

Your blog has a post where the structs have dgraph:"" tags.

Anyhow don’t let me keep you, it’s 0.8.3 and 1.0.0 seems far away yet.
It looks promising but in this state I’d rather not trust my data to it.

Btw neither the docker problem nor creating/adding a schema with the Go client is explained in your documentation.


(Manish R Jain) #14

@dalu: Can you please file a Github issue regarding the documentation issues that you have encountered? We treat doc issues seriously, so we’ll get them fixed asap.

Regarding one schema fits all, I’m not sure what you mean. Are you looking for an equivalent of SQL table, where each table has their own schema?


(Dalu) #15

Sorry for the late reply, I didn’t receive any notifications by mail or overlooked them in all the spam.

Well no, I’m just looking for means to say…
this data type has these indices
and this one has those

In a translated, SQL, sense yes you could say it’s like every table has their own schema.

The way I see dgraph is
It’s 1 giant MongoDB collection (because a MongoDB collection’s content is also undefined or doesn’t have a schema apart from the _id field being bson.ObjectId by default if not overwritten)
Now I can define a schema

Actually I forgot how to write a schema, now that I’ve been absent for about 2 weeks. Let’s see:

mutation {
  schema {
    name: string @index(exact, fulltext) @count .
  }
}

Now we have some data types

type Person struct {
	Name string
	Age  uint
}

and it just happens to also have a data type of Profile, which also has a Name field

type Profile struct {
	Name string
}

maybe the name example isn’t the best example.

Basically, setting a “schema” is essentially saying “index those fields in those ways” and “treat them as being this”.

Here’s a good example

type Article struct {
	created time.Time
}
type Order struct {
	created uint64 // because it's a unix timestamp
}

Those 2 mean the same thing but the types of their fields are different.
So you can’t have 2 nodes with the same predicate be different types.

You could just not mention those fields in the schema, but how would you query?

Regarding documentation, it’s not clear how to create a schema with the Go client.
I’ll create an issue.

The 0.9 client (the 0.8 one is even more confusing)

SchemaUpdate
https://godoc.org/github.com/dgraph-io/dgraph/protos#SchemaUpdate
is used in

Dgraph.CheckSchema
https://godoc.org/github.com/dgraph-io/dgraph/client#Dgraph.CheckSchema

What does CheckSchema mean? Is that how you define a schema?
If so, what does a schema definition look like in this context?
Let’s take this schema definition from the documentation

mutation {
  schema {
    name: string @index(exact, fulltext) @count .
    age: int @index(int) .
    friend: uid @count .
    dob: dateTime .
    location: geo @index(geo) .
    occupations: [string] @index(term) .
  }
}
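
Until the client call for this is clear, one stopgap is to keep the schema as data in Go and render the schema lines as text. The sketch below is plain string building with the standard library: the Predicate type and schemaText function are hypothetical helpers, not part of the Dgraph client, and how the resulting text is sent to the server is deliberately left out.

```go
package main

import (
	"fmt"
	"strings"
)

// Predicate describes one schema line. Hypothetical helper type.
type Predicate struct {
	Name       string
	Type       string
	Directives []string
}

// schemaText renders predicates in the "name: type @directive ." form
// used by the schema block above, one predicate per line.
func schemaText(preds []Predicate) string {
	var sb strings.Builder
	for _, p := range preds {
		sb.WriteString(p.Name + ": " + p.Type)
		for _, d := range p.Directives {
			sb.WriteString(" " + d)
		}
		sb.WriteString(" .\n")
	}
	return sb.String()
}

func main() {
	fmt.Print(schemaText([]Predicate{
		{"name", "string", []string{"@index(exact, fulltext)", "@count"}},
		{"age", "int", []string{"@index(int)"}},
		{"friend", "uid", []string{"@count"}},
	}))
}
```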
