Sorting-by-count-2

mrXCray · August 3, 2022, 7:06pm

Sorry for resurrecting a dead topic, but this one was the best match in the search results.

Something is wrong here o_O

I have

<prop.password>: string @index(hash, trigram) @upsert .
<password_uid>: [uid] @count @reverse .

type <Password> {
	prop.password
}

type <Account> {
	password_uid
}

The challenge is to find the most popular password among millions of records.

Suggested approach:

{
  passwords as var(func: eq(dgraph.type, "password")) {
    used_count as count(<password_uid>)
  }

  _(func: uid(passwords), orderdesc: val(used_count), first:1) {
    used_times:val(used_count)
    prop.password
  }  
}

Gives:

{
  "used_times": 95,
  "prop.password": "lezhi"
}

But I know the answer for that challenge, and simple test:

_2(func: eq(prop.password, "123456")){
  count(<password_uid>)
}

Results in:

{
  "count(password_uid)": 27869
}

What am I doing wrong?

Dgraph metadata

dgraph version

Dgraph version : v21.12.0
Dgraph codename : zion
Dgraph SHA-256 : 078c75df9fa1057447c8c8afc10ea57cb0a29dfb22f9e61d8c334882b4b4eb37
Commit SHA-1 : d62ed5f15
Commit timestamp : 2021-12-02 21:20:09 +0530
Branch : HEAD
Go version : go1.17.3
jemalloc enabled : true

MichelDiz · August 3, 2022, 9:15pm

This doesn’t look good at all, unless you are a hacker creating a password dictionary to use in attacks.

Firstly, passwords should not be exposed in the DB. If you are doing this, stop right now! NOT SAFE. You should encrypt them. And if you encrypt them, this approach won’t work, unless we add some loops in DQL.

E.g. “For each node, try this password and count”. But that would be unethical, as you would be brute-forcing your database with your users’ data…

mrXCray · August 3, 2022, 10:11pm

Don’t care how it looks. It can be “count names”, “count dog nicknames”, etc.
The fact is that Dgraph somehow counts wrong.

MichelDiz · August 3, 2022, 10:33pm

Cuz we don’t have loops. BTW, not sure what you mean by “how it looks”.

PS: oh, now I see where the “how it looks” comes from. You have interpreted my phrase literally.

mrXCray · August 3, 2022, 10:44pm

Any idea how to find the most used password in one request without loops, then, please?

MichelDiz · August 3, 2022, 10:45pm

Nope, cuz you need to iterate over the passwords. But show me an example of your mutation just to check.
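
The mutation matters because the direction of the edges decides which side the count has to be taken from. A minimal sketch, assuming each Account node has an outgoing password_uid edge to a Password node (the posted schema suggests this, but the thread does not confirm it): the usage count on the password side would then go through the reverse edge. The block name most_used is made up for illustration.

```dql
{
  # Assumption: Account --password_uid--> Password, and the predicate
  # is declared with @reverse so that ~password_uid can be queried.
  passwords as var(func: has(prop.password)) {
    # Count incoming edges via the reverse edge (~), not outgoing ones.
    used_count as count(~password_uid)
  }

  # Order the collected nodes by the value variable and keep the top one.
  most_used(func: uid(passwords), orderdesc: val(used_count), first: 1) {
    used_times: val(used_count)
    prop.password
  }
}
```

If the edges actually point from the password node to the accounts, count(password_uid) without the tilde is the right form.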

amaster507 · August 3, 2022, 11:07pm

There is no way to order by or filter by a count in Dgraph. You can aggregate and count, just not order/filter.

mrXCray · August 4, 2022, 9:09am

The recommended approach works (Sorting by count - #2 by MichelDiz).

I’ve found my mistake:
it seems that <dgraph.type> has no index by default, so Dgraph didn’t retrieve all the nodes with eq(<dgraph.type>, "password"), even though the node I needed has <dgraph.type> == "password" (checked). It did what I wanted with has(<prop.password>):

{
  passwords as var(func: has(prop.password)){
    used_count as count(<password_uid>)
  }
    
  _(func: uid(passwords), orderdesc: val(used_count), first:1){
    used_count: val(used_count)
    password:prop.password
  }
}

MichelDiz · August 4, 2022, 12:35pm

It should have. oO

Are you sure that the other nodes are with uppercase? ( “password” ≠ “Password” )
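
One way to check this is to compare how many nodes each root function returns and to look at the stored type values directly. This is only a sketch, assuming the type is named Password as in the schema posted above; the block names by_type, by_has and sample are arbitrary.

```dql
{
  # Nodes matched through the type function (type names are case-sensitive).
  by_type(func: type(Password)) {
    total: count(uid)
  }

  # Nodes that carry the predicate, whatever their dgraph.type says.
  by_has(func: has(prop.password)) {
    total: count(uid)
  }

  # dgraph.type values on a few of those nodes, to check the exact casing.
  sample(func: has(prop.password), first: 5) {
    uid
    dgraph.type
  }
}
```

If the two totals differ, either some nodes are missing the type or it is stored with different casing, and the sample block should show which.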

amaster507 · August 4, 2022, 4:30pm

This could be the v21.12 issue with the type index.

See this for more info:

mrXCray · August 5, 2022, 4:36pm

Yep, no mistake in lowercase/uppercase, “trust me, I’m C++ programmer”.

And yes, dgraph.type is not being indexed properly, but all my artificial predicates were indexed without any problem (they show proper count), will rely on them.

Also, I will look into the “"func: type()" error in 20.11.03 or later - #29 by amaster507” thread.

Thanks folks!
