Use Badger as Pwned-Password lookup-server

(Florian Harwöck) #1


I’m building a service with Go and gRPC that is able to query the pwned password list.

The source list comes as a single .txt file, about 30 GB in size, with one entry per line in the format:

	<40-character SHA-1 hash of the password>:<number of occurrences>
In order to make this data queryable, I have built a gRPC service with a single function CheckPassword that returns a single bool Leaked. Now comes the tricky part: I searched for an embedded key-value database to store the 30 GB in the filesystem (like SQLite, but optimized for key-value pairs) and found RocksDB. That (as most of you probably know) hasn’t worked very well due to cgo. Then I found Badger and would like to use it as the persistence layer.
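
For context, the lookup side boils down to hashing the incoming password with SHA-1 and probing Badger for the key. A minimal sketch, assuming db is an open *badger.DB (the function name leaked and its signature are my illustration, not the actual service code):

// leaked reports whether the lowercase SHA-1 hex digest of password
// exists as a key in db (imports: crypto/sha1, encoding/hex, badger).
func leaked(db *badger.DB, password string) (bool, error) {
	sum := sha1.Sum([]byte(password))
	key := hex.EncodeToString(sum[:]) // lowercase hex, matching the imported keys
	err := db.View(func(txn *badger.Txn) error {
		_, err := txn.Get([]byte(key))
		return err
	})
	if err == badger.ErrKeyNotFound {
		return false, nil // not in the list
	}
	if err != nil {
		return false, err
	}
	return true, nil
}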

After deciding on Badger, I wrote a little “Import” tool to convert the 30 GB text file into a Badger database. Once it started, I saw extremely low conversion performance: about 800 KB of data every 30 seconds, i.e. roughly 27 KB/s. At that rate the full import would take nearly two weeks (~13 days). So I think I made a terrible mistake somewhere, or I’m just not fully aware of how Badger works.

Here is the somewhat shortened code I used for my import tool:

// Assumes r is a *bufio.Reader over the text file and db an open
// *badger.DB (imports: io, log, strings, github.com/dgraph-io/badger).
for {
	buf, _, err := r.ReadLine()
	if err != nil {
		if err == io.EOF {
			break // done reading the file
		}
		log.Fatal(err)
	}

	// The first 40 characters of each line are the SHA-1 hash.
	key := strings.ToLower(string(buf)[:40])

	// One Update call means one transaction (and one commit) per entry.
	err = db.Update(func(txn *badger.Txn) error {
		return txn.Set([]byte(key), []byte{})
	})
	if err != nil {
		log.Fatal(err)
	}
}
Does anyone have an idea what I did wrong? Thanks in advance :hugs:

(Pawan Rawal) #2

The fastest way to load data into Badger is to batch your updates and also perform the updates concurrently. The program that we use to benchmark Badger’s write performance has 32 goroutines writing data in batches of 1000 entries.
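
A minimal sketch of that pattern, assuming db is an open *badger.DB and keys is a channel feeding the 40-character hash keys (numWorkers, batchSize and the channel itself are my names, not taken from the benchmark program):

// Fan out the keys to 32 workers; each worker commits one transaction
// per 1000 entries instead of one per entry (imports: sync, log, badger).
const numWorkers = 32
const batchSize = 1000

var wg sync.WaitGroup
for i := 0; i < numWorkers; i++ {
	wg.Add(1)
	go func() {
		defer wg.Done()
		batch := make([]string, 0, batchSize)
		flush := func() {
			if len(batch) == 0 {
				return
			}
			err := db.Update(func(txn *badger.Txn) error {
				for _, k := range batch {
					if err := txn.Set([]byte(k), []byte{}); err != nil {
						return err
					}
				}
				return nil
			})
			if err != nil {
				log.Fatal(err)
			}
			batch = batch[:0]
		}
		for k := range keys {
			batch = append(batch, k)
			if len(batch) == batchSize {
				flush()
			}
		}
		flush() // write any remaining partial batch
	}()
}
wg.Wait()

Note that a transaction that grows too large fails with badger.ErrTxnTooBig; batches of 1000 tiny keys like these stay well below that limit.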

(Florian Harwöck) #3

I’ve now finished the rewrite :relaxed: With the things you mentioned and a few other tips and tweaks I got from Slack, I managed to get about 60-80 MB/s.

Thanks for the help! :slight_smile: