Correct usage of paging aka ranges?

madman22 · November 17, 2020, 12:33am

I’ve implemented an abstraction layer on top of Badger with convenience functions for things like nodes and lists. I’m trying to get paging right and would appreciate some expert opinions on whether this is the best way to go about it.

The BadgerNode sets a prefix on the iterator to make sure that only valid keys are parsed and not more nodes. One optimization would be the ability to exclude certain keys, specifically keys that end with a given suffix. We have a key prefix, so why not a key suffix?

https://github.com/madman22/database/blob/61d2be4e742d1ec1891e0b4e5559de9151cd2ba3/database.go#L957


			currentPage++
			visited = 0
		}
	}
	if len(errs) > 0 {
		return list, errors.New(strings.Join(errs, " "))
	}
	return list, nil
}

func (dbd *BadgerNode) Range(page, count int) (List, error) {
	if dbd.db == nil {
		return nil, ErrorDatabaseNil
	}
	if page < 1 {
		page = 1
	}
	if count < 1 {
		count = 1
	}
	list := make(List)

Thanks!

edit:
Another optimization might be to store the node keys as a separate entity and retrieve them on node creation. That said, I like the simplicity of nodes writing to the database only when they actually have values stored: if something creates a database node but never writes to it, there is no database access.

chewxy · November 19, 2020, 11:44am

@Naman - is this something we might look into adding?

Naman · November 19, 2020, 12:21pm

We can’t iterate over keys by suffix because of the design of the LSM tree, which Badger uses as its underlying data structure. In each level (except L0), keys are sorted, which makes reads and writes efficient. Comparison of two byte slices, by default, proceeds from left to right, and that is what allows Badger to optimise reads over keys that share a prefix.
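To illustrate the point about ordering, here is a minimal sketch that uses only the Go standard library, not Badger itself. The key names and the helper functions (`sortKeys`, `prefixPositions`, `suffixPositions`) are made up for the example. It sorts a handful of keys byte-wise and reports where prefix matches and suffix matches end up in the sorted order.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
	"strings"
)

// sortKeys returns a copy of keys in byte-wise, left-to-right order,
// which is the order a sorted level keeps its keys in.
func sortKeys(keys []string) []string {
	out := append([]string(nil), keys...)
	sort.Slice(out, func(i, j int) bool {
		return bytes.Compare([]byte(out[i]), []byte(out[j])) < 0
	})
	return out
}

// prefixPositions returns the indexes of the sorted keys that start with prefix.
func prefixPositions(sorted []string, prefix string) []int {
	var idx []int
	for i, k := range sorted {
		if strings.HasPrefix(k, prefix) {
			idx = append(idx, i)
		}
	}
	return idx
}

// suffixPositions returns the indexes of the sorted keys that end with suffix.
func suffixPositions(sorted []string, suffix string) []int {
	var idx []int
	for i, k := range sorted {
		if strings.HasSuffix(k, suffix) {
			idx = append(idx, i)
		}
	}
	return idx
}

func main() {
	// Hypothetical keys, for illustration only.
	keys := sortKeys([]string{"b/meta", "a/2", "a/meta", "b/1", "a/1", "c/meta"})
	fmt.Println("sorted:         ", keys)
	fmt.Println("prefix a/ at:   ", prefixPositions(keys, "a/"))
	fmt.Println("suffix /meta at:", suffixPositions(keys, "/meta"))
}
```

Keys sharing a prefix sit next to each other, so an iterator can seek to the first one and stop at the first mismatch. Keys sharing a suffix are scattered through the order, so finding them means visiting every key.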

madman22 · December 9, 2020, 6:41am

My solution is to create an entity prefix so the db and nodes can iterate over each node's actual data separately from each other. I’ve created backup and restore methods with version control: the db looks for the version setting and, if it is lower than the current one, backs up to a zip.Writer, closes the database, renames the folder structure, creates a new database, then imports the backup with the new entity prefix. Seems to work pretty well.

Thanks for the help!

madman22 · December 9, 2020, 7:10am

Maybe introduce a Skip int field to the iterator so that, when doing paging, it can skip the first x items in the iterator?
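To make the idea concrete, here is a small sketch of the two paging styles I have in mind, written against an in-memory sorted slice instead of a real Badger iterator. `pageBySkip` and `pageAfter` are hypothetical helpers, not part of Badger. The first discards the first (page-1)*count keys; the second resumes from the last key of the previous page, much like seeking an iterator to a known key.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// pageBySkip returns the 1-based page of up to count keys under prefix by
// walking the prefix range and discarding the first (page-1)*count keys.
func pageBySkip(sorted []string, prefix string, page, count int) []string {
	if page < 1 {
		page = 1
	}
	if count < 1 {
		count = 1
	}
	skip := (page - 1) * count
	var out []string
	for i := sort.SearchStrings(sorted, prefix); i < len(sorted); i++ {
		if !strings.HasPrefix(sorted[i], prefix) {
			break
		}
		if skip > 0 {
			skip-- // still costs one step per skipped key
			continue
		}
		out = append(out, sorted[i])
		if len(out) == count {
			break
		}
	}
	return out
}

// pageAfter returns up to count keys under prefix that sort strictly after
// lastKey. Pass "" as lastKey to get the first page.
func pageAfter(sorted []string, prefix, lastKey string, count int) []string {
	if count < 1 {
		count = 1
	}
	start := sort.SearchStrings(sorted, prefix)
	if lastKey != "" {
		next := sort.Search(len(sorted), func(i int) bool { return sorted[i] > lastKey })
		if next > start {
			start = next
		}
	}
	var out []string
	for i := start; i < len(sorted) && len(out) < count; i++ {
		if !strings.HasPrefix(sorted[i], prefix) {
			break
		}
		out = append(out, sorted[i])
	}
	return out
}

func main() {
	keys := []string{"a/1", "a/2", "a/3", "a/4", "a/5", "b/1"}
	fmt.Println(pageBySkip(keys, "a/", 2, 2))
	fmt.Println(pageAfter(keys, "a/", "a/2", 2))
}
```

Both calls in `main` return the same page. The difference is that skipping still walks every skipped key, so later pages get slower, while resuming after the last key jumps straight to the start of the page.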
