Reverse Seek(prefix) not working as expected

What version of Go are you using (go version)?

$ go version
go version go1.14.2 linux/amd64

What operating system are you using?

$ uname -a
Linux machine1 5.7.0-1-amd64 #1 SMP Debian 5.7.6-1 (2020-06-24) x86_64 GNU/Linux

What version of Badger are you using?

github.com/dgraph-io/badger/v2@v2.0.3

Does this issue reproduce with the latest master?

Yes.

foo@machine1:/go/src/github.com/xinau/relmon$ go get github.com/dgraph-io/badger/v2@master
go: github.com/dgraph-io/badger/v2 master => v2.0.1-rc1.0.20200718025359-1ccf3a875dd7
go: downloading github.com/dgraph-io/badger/v2 v2.0.1-rc1.0.20200718025359-1ccf3a875dd7
go: downloading github.com/dgraph-io/ristretto v0.0.3-0.20200630154024-f66de99634de

Steps to Reproduce the issue

package main

import (
	"fmt"
	"reflect"
	"testing"
	"github.com/dgraph-io/badger/v2"
)

func TestReverse(t *testing.T) {
	db, err := badger.Open(badger.DefaultOptions("").WithInMemory(true))
	if err != nil {
		t.Errorf("opening database: %s", err)
	}

	for _, test := range ([]struct{
		key []byte
		val []byte
	}{
		{[]byte("key1.0"), []byte("val1")},
		{[]byte("key1.1"), []byte("val2")},
		{[]byte("key2.2"), []byte("val1")},
		{[]byte("key1.3"), []byte("val3")},
		{[]byte("key2.4"), []byte("val2")},
	}){
		if err := db.Update(func(txn *badger.Txn) error {
			ent := badger.NewEntry(test.key, test.val)
			return txn.SetEntry(ent)
		}); err != nil {
			t.Errorf("adding value: %s", err)
		}
	}

	for _, test := range ([]struct{
		key  []byte
		want []byte
	}{
		{[]byte("key1"), []byte("val3")},
		{[]byte("key2"), []byte("val2")},
	}) {
		if err := db.View(func(txn *badger.Txn) error {
			opts := badger.DefaultIteratorOptions
			opts.Reverse = true
			it := txn.NewIterator(opts)
			defer it.Close()

			it.Seek(test.key)
			if !it.ValidForPrefix(test.key) {
				return fmt.Errorf("key not valid for prefix")
			}

			item := it.Item()
			item.Value(func(data []byte) error {
				if !reflect.DeepEqual(data, test.want) {
					t.Errorf("got %q want %q", data, test.want)
				}
				return nil
			})
			return nil
		}); err != nil {
			t.Errorf("getting value: %s", err)
		}

	}
}

What Badger options were set?

The DefaultOptions with In-Memory where set.

What did you do?

Iterating in reverse over a list of keys and trying to get the last key for prefix as shown in the test case.

What did you expect to see?

The test case above to pass.

What did you see instead?

$ go test ./internal/store --count 1
--- FAIL: TestReverse (0.00s)
    store_test.go:61: getting value: key not valid for prefix
    store_test.go:61: getting value: key not valid for prefix
FAIL
FAIL	example.com/main	0.005s
FAIL

Adiditonal notes?

When executing the same test with IterationOptions.Reverse set to false the test fails as expected with.

$ go test ./internal/store --count 1
--- FAIL: TestReverse (0.00s)
    store_test.go:55: got "val1" want "val3"
    store_test.go:55: got "val1" want "val2"
FAIL
FAIL	example.com/main	0.005s
FAIL

P.s. This issue is probably me overlooking something obvious, but I could need some help or explanation to get the test to pass.

P.s.s I enjoy the work and tools you folks are providing and truly want to say thanks.

1 Like

I managed to get the test case working, it seems it’s caused by it.Seek(key).
It could be that I misinterpreted the documentation.

Seek would seek to the provided key if present. If absent, it would seek to the next smallest key greater than the provided key if iterating in the forward direction. Behavior would be reversed if iterating backwards.

I hope someone with deeper knowledge of the source code could explain it to me.

47,55c47,49
< 			it.Seek(test.key)
< 			if !it.ValidForPrefix(test.key) {
< 				return fmt.Errorf("key not valid for prefix")
< 			}
< 
< 			item := it.Item()
< 			item.Value(func(data []byte) error {
< 				if !reflect.DeepEqual(data, test.want) {
< 					t.Errorf("got %q want %q", data, test.want)
---
> 			for it.Rewind(); it.Valid(); it.Next() {
> 				if !it.ValidForPrefix(test.key) {
> 					continue
56a51,58
> 
> 				item := it.Item()
> 				item.Value(func(data []byte) error {
> 					if !reflect.DeepEqual(data, test.want) {
> 						t.Errorf("got %q want %q", data, test.want)
> 					}
> 					return nil
> 				})
58,59c60,61
< 			})
< 			return nil
---
> 			}
> 			return fmt.Errorf("key not found")

I’m going to change the title of the issue to reflect this new knowledge of mine.

Hey @xinau, Welcome to the community over discuss. Thanks for reaching out to us for the queries.

As mentioned, the seek would seek to the next greatest key smaller than the provided key. In your case, you have the following keys [“key1.0”, “key1.1”, “key1.3”, “key2.2”, “key2.4”] and you can use an iterator to traverse over it in sorted manner.

When you do it.Seek("key1"), it seeks to key which is just smaller than “key1” (smaller than “key1.0” in this case), so it.ValidForPrefix() returns false as there exist no such key. While for it.Seek("key2"), it sets the iterator to “key1.3” for which it.ValidForPrefix(test.ket) returns false as it is not prefix of “key1.3”. See the two conditions here

When you use rewind, the iterator seeks to the largest value of the KV store (as you have set reverse iterator), you are skipping the keys if it is not valid for prefix.

Moving it to Users category as it fits there.
I hope this answers your question. Feel free for followups. Also, mark it as solved if it addresses your query. :slight_smile:

1 Like

Thanks @Naman for the quick response and clarification. For some reason my brain came to the conclusion that it.Seek("key1") would have been at "key2.2".

Can you provide some more background as to why the behaviour reverses as from my point of view this makes seek “useless” for traversing over a prefix backwards as the complete prefix is skipped ?

key := []byte("key")
for it.Seek(key); it.ValidForPrefix(key); it.Next() {
	item := it.Item()
	item.Value(func(data []byte) error {
		fmt.Printf("key %q val %q\n", item.Key(), data)
		return nil
	})
}

This iteration over the prefix "key" with the data from the test case while iterating forward yields the following output

key "key1.0" val "val1"
key "key1.1" val "val2"
key "key1.3" val "val3"
key "key2.2" val "val1"
key "key2.4" val "val2"

While backwards no output is generated due to the reversed behaviour not as I initially expected

key "key2.4" val "val2"
key "key2.2" val "val1"
key "key1.3" val "val3"
key "key1.1" val "val2"
key "key1.0" val "val1"

Another concern I have is if there are any performance drawbacks with my working test case as I’m iterating over the complete key range regardless if a given key is in the data or not and which route can I pursue to improve this at best?

Seek if not considered specifically for prefix case makes sense as when forward iterating you would jump to next key (higher than the provided key) if provided key is not found, with reverse you would want to jump to next key (which will be smaller than the provided key) as the meaning of next has changed due to change in direction.

In case, you want to reverse iterate over the prefix, there is a workaround that you append byte 0xff to the Seek key (see reverse-iteration-doesnt-give-me-the-right-results).

Yes, as you have to see the trailer of lots of movies even when you know you want to see “Sherlock”. Moreover, you have a better solution now (i.e., 0xff) :slight_smile:

1 Like

@Naman Thanks for the explanation. I’m mark the issue as resolved (once again :slight_smile:).

1 Like