Dgraph v24 introduces a vector data type and similarity search to the DQL query language.
This is a companion discussion topic for the original entry at https://dgraph.io/blog/post/v24-dql
How do you apply a threshold when looking for similar things?
Query for the nearest candidates (say, the first 10), store the similarity in a variable, and then filter on that similarity value:
query categories($v: float32vector) {
  var(func: similar_to(category.embedding, 10, $v)) {
    vemb as category.embedding
    categorysimilarity as Math (($v) dot vemb)
  }
  list(func: uid(categorysimilarity)) @filter(gt(val(categorysimilarity), 0.8)) {
    category.Value
    val(categorysimilarity)
  }
}
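To make the two steps concrete, here is a minimal Python sketch of what the query above computes, run on in-memory data rather than in Dgraph. It is only an illustration: the function name `top_k_then_threshold`, the sample `categories` data, and the exact dot-product ranking used as a stand-in for `similar_to` are all assumptions (Dgraph ranks candidates with whatever metric the vector index was created with). Also note that the dot product only behaves like a cosine similarity when the embeddings are unit-normalized, which the sample vectors are.

```python
from typing import Dict, List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k_then_threshold(
    query: Sequence[float],
    embeddings: Dict[str, Sequence[float]],
    k: int = 10,
    threshold: float = 0.8,
) -> List[Tuple[str, float]]:
    # Step 1 (stand-in for similar_to(..., k, $v)): score every item
    # and keep only the k best candidates.
    scored = [(name, dot(query, vec)) for name, vec in embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    candidates = scored[:k]
    # Step 2 (stand-in for @filter(gt(val(...), threshold))): drop
    # candidates whose similarity is not above the threshold.
    return [(name, score) for name, score in candidates if score > threshold]


# Hypothetical unit-length embeddings, for illustration only.
categories = {
    "sports": [1.0, 0.0],
    "games": [0.6, 0.8],
    "cooking": [0.0, 1.0],
}

result = top_k_then_threshold([1.0, 0.0], categories, k=2, threshold=0.8)
print(result)
```

With these sample vectors, "games" makes it into the top 2 but is then removed by the threshold, which is the effect the filter has in the query: asking for 10 neighbours can return fewer than 10 results.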
Note that for types with few instances, you can skip the index lookup, query all the instances directly, and compute the similarity for each one:
query categories($v: float32vector) {
  var(func: type(category)) {
    vemb as category.embedding
    categorysimilarity as Math (($v) dot vemb)
  }
  list(func: uid(categorysimilarity)) @filter(gt(val(categorysimilarity), 0.8)) {
    category.Value
    similarity: val(categorysimilarity)
  }
}
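The difference from the top-10 approach is that a full scan returns every instance above the threshold, not just those among the first k candidates. A rough Python sketch of that behaviour on in-memory data follows; the function name `scan_all_then_threshold` and the sample `categories` vectors are hypothetical, and the embeddings are assumed to be unit-normalized so that the dot product acts as a cosine similarity.

```python
from typing import Dict, List, Sequence, Tuple


def scan_all_then_threshold(
    query: Sequence[float],
    embeddings: Dict[str, Sequence[float]],
    threshold: float = 0.8,
) -> List[Tuple[str, float]]:
    # Score every instance (stand-in for func: type(category) plus the
    # dot product), with no top-k cut-off.
    scored = [
        (name, sum(q * x for q, x in zip(query, vec)))
        for name, vec in embeddings.items()
    ]
    # Keep everything above the threshold, best match first.
    matches = [(name, score) for name, score in scored if score > threshold]
    matches.sort(key=lambda pair: pair[1], reverse=True)
    return matches


# Hypothetical unit-length embeddings, for illustration only.
categories = {
    "sports": [1.0, 0.0],
    "fitness": [0.96, 0.28],
    "games": [0.6, 0.8],
    "cooking": [0.0, 1.0],
}

matches = scan_all_then_threshold([1.0, 0.0], categories, threshold=0.8)
print(matches)
```

The scan computes one dot product per instance, so its cost grows linearly with the number of instances; that is why it only makes sense for small types.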