I keep getting "Attribute username not indexed" even though I set the schema

ospaarmann · May 30, 2018, 10:26pm

Hey everyone. I have an issue with using indexes. I set the schema with a mutation but still get the error “Attribute not indexed” when I try to use a filter. Maybe someone has an idea what is going on?
My schema:

"id: string @index(hash) .
username: string @index(exact, hash).
name: string @index(exact, term) .
email: string @index(exact, hash) ."

Server log when I send it over:

dgraph_server     | 2018/05/30 22:22:06 server.go:251: Got schema: [predicate:"id" value_type:STRING directive:INDEX tokenizer:"hash"  predicate:"username" value_type:STRING directive:INDEX tokenizer:"exact" tokenizer:"hash"  predicate:"name" value_type:STRING directive:INDEX tokenizer:"exact" tokenizer:"term"  predicate:"email" value_type:STRING directive:INDEX tokenizer:"exact" tokenizer:"hash" ]
dgraph_server     | 2018/05/30 22:22:06 mutation.go:191: Done schema update predicate:"id" value_type:STRING directive:INDEX tokenizer:"hash"
dgraph_server     | 2018/05/30 22:22:06 mutation.go:191: Done schema update predicate:"username" value_type:STRING directive:INDEX tokenizer:"exact" tokenizer:"hash"
dgraph_server     | 2018/05/30 22:22:06 mutation.go:191: Done schema update predicate:"name" value_type:STRING directive:INDEX tokenizer:"exact" tokenizer:"term"
dgraph_server     | 2018/05/30 22:22:06 mutation.go:191: Done schema update predicate:"email" value_type:STRING directive:INDEX tokenizer:"exact" tokenizer:"hash"

Query that causes the error:

dgraph_server     | 2018/05/30 22:18:47 server.go:394: Received query:   {
dgraph_server     |     user(func: has(user))
dgraph_server     |     @filter(eq(username, "9a3f141e-f1dd-48f8-b57e-73d670f7d670"))
dgraph_server     |     {
dgraph_server     |       id
dgraph_server     |       name
dgraph_server     |       username
dgraph_server     |       email
dgraph_server     |     }
dgraph_server     |   }

Any help is greatly appreciated!

Edit to clarify: The username is a UUID because the query comes from integration tests. The id is an internal id (also a UUID), not Dgraph’s uid.

MichelDiz · May 30, 2018, 10:58pm

I simulated this locally and everything works. Here’s how I reproduced it.

id: string @index(hash) .
username: string @index(hash) .
name: string @index(term) .
email: string @index(hash) .

It is recommended to use indexes sparingly. Your schema works the way it is, but I would recommend using hash whenever you only need exact matches. This can save time and space.

For example, a username is usually a single word (even if it is a long one), so hash is enough. Email does not need an exact index if you are already indexing by hash. Choose one or the other: the two indexes are very similar, so having both only duplicates the indexing.

The same goes for name. A query with “allofterms” will use the term index, so if you are already using term there is no need to add hash or exact. Hash and exact are for exact-match queries only.
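To make the index advice above concrete, here is a minimal Python sketch. It is not part of Dgraph or any client: the `SUPPORTED_BY` table and the `can_serve` helper are hypothetical names, and the table only covers string predicates as I understand them from the Dgraph documentation, so check it against the docs for your version.

```python
# Illustrative lookup: which string index tokenizers can serve which query
# function. The table is an assumption based on Dgraph's documentation for
# string predicates; it is not an API exposed by Dgraph.
SUPPORTED_BY = {
    "eq": {"hash", "exact", "term", "fulltext"},
    "le": {"exact"},
    "lt": {"exact"},
    "ge": {"exact"},
    "gt": {"exact"},
    "allofterms": {"term"},
    "anyofterms": {"term"},
}


def can_serve(function, tokenizers):
    """Return True if at least one of the given tokenizers supports the function."""
    return bool(SUPPORTED_BY.get(function, set()) & set(tokenizers))
```

Under these assumptions, `can_serve("eq", ["hash"])` is true, so `username` and `email` do not also need exact, while `can_serve("allofterms", ["hash"])` is false, which is why `name` keeps its term index.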

{
  set{
    _:Julian <id> "9a3f141e-f1dd-48f8-b57e-73d670f7d670" .
    _:Julian <name> "Julian Goldheaven" .
    _:Julian <username> "9a3f141e-f1dd-48f8-b57e-73d670f7d670" .
    _:Julian <email> "Julian@Goldheaven.com" .
  }
}

{
  user(func: has(username))
  @filter(eq(username, "9a3f141e-f1dd-48f8-b57e-73d670f7d670"))
  {
    id
    name
    username
    email
  }
}


{
  "data": {
    "user": [
      {
        "id": "9a3f141e-f1dd-48f8-b57e-73d670f7d670",
        "name": "Julian Goldheaven",
        "username": "9a3f141e-f1dd-48f8-b57e-73d670f7d670",
        "email": "Julian@Goldheaven.com"
      }
    ]
  },
  "extensions": {
    "server_latency": {
      "encoding_ns": 976900
    },
    "txn": {
      "start_ts": 10036,
      "lin_read": {
        "ids": {
          "1": 20
        }
      }
    }
  }
}

If it is still happening, we would need to take a closer look at what is going on.

I am available.

Cheers.

ospaarmann · May 30, 2018, 11:12pm

Hey,

thank you for the quick reply! I checked via Ratel and found out that the schema was indeed not set. I investigated further and found that I can set the schema via an alter operation. So the server is ok.

The reason must be the client I am using. I wrote an RPC client for Elixir myself (ospaarmann/exdgraph on GitHub, a gRPC-based Elixir Dgraph client that is still under development). Based on the proto file I set the schema string and send it over. This does not seem to work. But I am not sure what the correct way would be.

Part of the api.proto:

message Operation {
	string schema = 1;
	string drop_attr = 2;
	bool drop_all = 3;
}

I could post the code in question but I am not sure if it makes sense, since it is written in Elixir.

ospaarmann · May 30, 2018, 11:17pm

What I find weird, though, is that the request from my client shows up in the server logs and it looks fine. Is it possible that it is not committed?

MichelDiz · May 30, 2018, 11:22pm

Cool! I like the idea of Elixir, but I have not had enough time to really learn it, so I cannot help with your client. I also have no experience with gRPC.

Have you tried taking a look at the various clients of Dgraph?

ospaarmann · May 30, 2018, 11:24pm

Not yet. But to get back to the server logs: I tried setting the schema once via Ratel and once via my client.

Ratel:

dgraph_server     | 2018/05/30 23:20:31 server.go:251: Got schema: [predicate:"id" value_type:STRING directive:INDEX tokenizer:"hash"  predicate:"username" value_type:STRING directive:INDEX tokenizer:"exact" tokenizer:"hash"  predicate:"name" value_type:STRING directive:INDEX tokenizer:"term"  predicate:"email" value_type:STRING directive:INDEX tokenizer:"hash" ]
dgraph_server     | 2018/05/30 23:20:31 mutation.go:191: Done schema update predicate:"id" value_type:STRING directive:INDEX tokenizer:"hash"
dgraph_server     | 2018/05/30 23:20:31 mutation.go:191: Done schema update predicate:"username" value_type:STRING directive:INDEX tokenizer:"exact" tokenizer:"hash"
dgraph_server     | 2018/05/30 23:20:31 mutation.go:191: Done schema update predicate:"name" value_type:STRING directive:INDEX tokenizer:"term"
dgraph_server     | 2018/05/30 23:20:31 mutation.go:191: Done schema update predicate:"email" value_type:STRING directive:INDEX tokenizer:"hash"
dgraph_zero       | 2018/05/30 23:20:32 oracle.go:75: purging below ts:694, len(o.commits):0, len(o.aborts):0, len(o.rowCommit):0
dgraph_server     | 2018/05/30 23:20:35 draft.go:865: Writing snapshot at index: 606, applied mark: 616

My client:

dgraph_server     | 2018/05/30 23:20:43 server.go:251: Got schema: [predicate:"id" value_type:STRING directive:INDEX tokenizer:"hash"  predicate:"username" value_type:STRING directive:INDEX tokenizer:"exact" tokenizer:"hash"  predicate:"name" value_type:STRING directive:INDEX tokenizer:"term"  predicate:"email" value_type:STRING directive:INDEX tokenizer:"hash" ]
dgraph_server     | 2018/05/30 23:20:43 mutation.go:191: Done schema update predicate:"id" value_type:STRING directive:INDEX tokenizer:"hash"
dgraph_server     | 2018/05/30 23:20:43 mutation.go:191: Done schema update predicate:"username" value_type:STRING directive:INDEX tokenizer:"exact" tokenizer:"hash"
dgraph_server     | 2018/05/30 23:20:43 mutation.go:191: Done schema update predicate:"name" value_type:STRING directive:INDEX tokenizer:"term"
dgraph_server     | 2018/05/30 23:20:43 mutation.go:191: Done schema update predicate:"email" value_type:STRING directive:INDEX tokenizer:"hash"
dgraph_zero       | 2018/05/30 23:20:52 oracle.go:75: purging below ts:695, len(o.commits):0, len(o.aborts):0, len(o.rowCommit):0
dgraph_server     | 2018/05/30 23:21:05 draft.go:865: Writing snapshot at index: 607, applied mark: 617

I cannot see a difference in the log messages. So that is strange…

MichelDiz · May 30, 2018, 11:27pm

Try curl localhost:8080/admin/export with your client setup and check what ends up in the .schema file, to be sure.

This would be the final step to be absolutely sure something has been written to the server.

If everything is ok, then there is something wrong with your client, and it really could be the proto file.

https://docs.dgraph.io/deploy#export-database

ospaarmann · May 30, 2018, 11:37pm

Edit: The only difference is that Ratel gets back data from the server:

{"data":{"code":"Success","message":"Done"}}

My client receives an empty string.

About exporting the schema: I deleted all data (by deleting the folders on the server), restarted the server and did an export. The schema file was empty, as expected. I then ran my command from the client to set the schema and did an export again. The schema file now contains this:

id:string @index(hash) . 
name:string @index(term) . 
email:string @index(hash) . 
username:string @index(exact,hash) .

I still get the error when querying, though, both in Ratel and from my client.
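A check like the one above can be automated. The following is a minimal Python sketch, assuming the exported schema file uses the simple `predicate:type @index(...) .` line format shown in this post; `parse_schema` and `is_indexed` are hypothetical helper names, not part of Dgraph or any client, and other directives or types are not handled specially.

```python
import re

# Matches lines such as "username:string @index(exact,hash) ." from an export.
# The @index part is optional, so unindexed predicates are still captured.
SCHEMA_LINE = re.compile(
    r"^\s*(?P<predicate>[^\s:]+)\s*:\s*(?P<type>[\w\[\]]+)"
    r"(?:.*?@index\((?P<tokenizers>[^)]*)\))?"
)


def parse_schema(text):
    """Map each predicate to its type and its list of index tokenizers."""
    schema = {}
    for line in text.splitlines():
        match = SCHEMA_LINE.match(line)
        if not match:
            continue  # skip blank or unrecognized lines
        raw = match.group("tokenizers") or ""
        tokenizers = [t.strip() for t in raw.split(",") if t.strip()]
        schema[match.group("predicate")] = {
            "type": match.group("type"),
            "tokenizers": tokenizers,
        }
    return schema


def is_indexed(schema, predicate):
    """Return True if the predicate exists and has at least one tokenizer."""
    return bool(schema.get(predicate, {}).get("tokenizers"))
```

Running `parse_schema` on the exported file before and after the test suite would show whether the `username` index is still there when the failing query runs.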

MichelDiz · May 30, 2018, 11:40pm

What does Ratel say?

ospaarmann · May 30, 2018, 11:41pm

Ratel shows no indices when I click on schema. When I query I get the error.

MichelDiz · May 30, 2018, 11:56pm

If Ratel does not show any predicates either, it is because nothing has been written. Ratel will always be able to see predicates, no matter the situation.

If there are predicates and it returns “Your query did not return anything”, it means the predicate is indexed but the value you queried for was not found. Other clients usually get this back as an empty result.

Now, if Ratel tells you that the predicate is not indexed (as you are reporting for username), that would be some kind of bug. But I am fairly sure it is not; creating a completely new Dgraph instance should solve it. I cannot think of a situation where this would happen. If it does, it is data corruption, and it would be better to check the disks. As I said above, I was able to replicate what you want to do, so it is not a bug.

ospaarmann · May 31, 2018, 12:09am

Well, I figured it out. I set up a fresh instance of Dgraph and went through it again.

Setting the schema in my client → shows up in Ratel and in the export
Querying for data with a filter → works
Running my tests → doesn’t work

So the tests seem to do something that breaks everything. I then realized that I run a drop_all operation before the test suite. My reasoning was that this only deletes the data. But it seems to also delete the schema.

Is there any way to delete all data from Dgraph without trashing the schema? What I did was wrong as it appears now…

Thank you for your support so far! You were of great help!

MichelDiz · May 31, 2018, 12:14am

drop_all always deletes everything. There is no data-only drop. You can request it on Dgraph’s GitHub.

One way to solve this would be to write the same schema again right after drop_all.
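As a rough sketch of that workaround, a test setup helper could drop everything and then restore the schema in one step. The Python below is only an illustration: `reset_database` and the `alter` callable are hypothetical, the dictionary keys `drop_all` and `schema` mirror the Operation fields from the api.proto excerpt earlier in the thread, and the schema text is the one from the reproduction above.

```python
# Schema from the reproduction earlier in the thread.
SCHEMA = """
id: string @index(hash) .
username: string @index(hash) .
name: string @index(term) .
email: string @index(hash) .
"""


def reset_database(alter, schema=SCHEMA):
    """Drop all data and schema, then immediately write the schema back.

    `alter` is a hypothetical callable that sends one Operation to Dgraph.
    Here an Operation is represented as a plain dict whose keys mirror the
    fields of the Operation message in api.proto.
    """
    alter({"drop_all": True})   # removes data and schema alike
    alter({"schema": schema})   # restore indexes before any test queries run
```

In a real test suite, `alter` would wrap whatever alter call your client exposes, and `reset_database` would run before the tests instead of a bare drop_all.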

ospaarmann · May 31, 2018, 12:17am

This is a good idea. Thanks again.
