Sorting-by-count-2

Sorry for resurrecting a dead topic, but this one was the best match in the search results.

Something is wrong here o_O

I have

type <Password> {
	prop.password string @index(hash, trigram) @upsert .
}

type <Account> {
	password_uid [uid] @count @reverse 
}

The challenge is to find the most popular password among millions of records.

Suggested approach:

{
  passwords as var(func: eq(dgraph.type, "password")) {
    used_count as count(<password_uid>)
  }

  _(func: uid(passwords), orderdesc: val(used_count), first:1) {
    used_times:val(used_count)
    prop.password
  }  
}

Gives:

{
  "used_times": 95,
  "prop.password": "lezhi"
}

But I know the answer to that challenge, and a simple test:

_2(func: eq(prop.password, "123456")){
  count(<password_uid>)
}

Results in:

{
  "count(password_uid)": 27869
}

What am I doing wrong? :slight_smile:

Dgraph metadata

dgraph version

Dgraph version : v21.12.0
Dgraph codename : zion
Dgraph SHA-256 : 078c75df9fa1057447c8c8afc10ea57cb0a29dfb22f9e61d8c334882b4b4eb37
Commit SHA-1 : d62ed5f15
Commit timestamp : 2021-12-02 21:20:09 +0530
Branch : HEAD
Go version : go1.17.3
jemalloc enabled : true

This doesn’t look good at all, unless you are a hacker creating a password dictionary to use in attacks.

Firstly, passwords should not be exposed in the DB. If you are doing this, stop right now! NOT SAFE. You should encrypt them. And if you encrypt them, this approach won’t work, unless we add some loops to DQL.

E.g. “for each node, try this password and count”. But that would be unethical, as you would be brute-forcing your database with your users’ data…

I don’t care how it looks :slight_smile: It could just as well be “count names”, “count dog nicknames”, etc.
The fact is that Dgraph somehow counts wrong.

Cuz we don’t have loops. BTW, not sure what you mean by “how it looks”.

PS: oh, now I see where the “how it looks” comes from. You interpreted my phrase literally.

Any idea how to find the most used password in one request without loops, then?

Nope, cuz you need to iterate over the passwords. But show me an example of your mutation just to check.

There is no way to order by or filter by a count in Dgraph. You can aggregate and count, just not order/filter.

The recommended approach works (Sorting by count - #2 by MichelDiz).

I’ve found my mistake:
it seems that <dgraph.type> has no index by default, so Dgraph didn’t retrieve all the nodes with eq(<dgraph.type>, "password"), even though the node I needed has <dgraph.type> == "password" (I checked). Switching the root function to has(<prop.password>) did what I wanted:

{
  passwords as var(func: has(prop.password)){
    used_count as count(<password_uid>)
  }
    
  _(func: uid(passwords), orderdesc: val(used_count), first:1){
    used_count:val(used_count)
    password:prop.password
  }
}
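
One way to confirm this kind of mismatch, as a rough sketch: count how many nodes each root function returns. The block names by_type and by_has are made-up aliases, and the query assumes the predicate names and the "password" type string used earlier in this thread. If the two totals differ, the eq(dgraph.type, ...) lookup is missing nodes.

```dql
{
  # Nodes found through the dgraph.type lookup used in the first query.
  by_type(func: eq(dgraph.type, "password")) {
    count(uid)
  }

  # Nodes that actually carry a prop.password value.
  by_has(func: has(prop.password)) {
    count(uid)
  }
}
```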

It should have. oO

Are you sure that the other nodes are with uppercase? ( “password” ≠ “Password” )
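
A quick way to check the stored casing, sketched under the assumption that every password node carries prop.password: pull a handful of nodes and look at the dgraph.type values they return. The alias sample is arbitrary.

```dql
{
  # Return a few password nodes together with their stored type strings.
  sample(func: has(prop.password), first: 5) {
    uid
    dgraph.type
    prop.password
  }
}
```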

This could be a v21.12 issue with the type index problem.

See this for more info:

Yep, no mistake in lowercase/uppercase, “trust me, I’m C++ programmer”.

And yes, dgraph.type is not being indexed properly, but all my artificial predicates were indexed without any problem (they show the proper count), so I will rely on them.

Also, I will look into the "func: type() error in 20.11.03 or later - #29 by amaster507" thread.

Thanks folks! :slight_smile: