Lifetime of startTs

What is the lifetime of a startTs? Can I query at a specific startTs forever, or will older versions of the graph be deleted after some time / some number of mutations? I saw the following log line:

Docker: I0708 09:33:53.890109      28 oracle.go:107] Purged below ts:3, len(o.commits):0, len(o.rowCommit):3

Does this mean querying with startTs < 3 will not provide the exact same data as before the purge?

Cheers,
Enrico

Not sure I understand "lifetime of startTs", but startTs is used to track the start timestamp of your transaction. If an earlier transaction deleted the data you are now querying, the deleted data should not show up as long as your current startTs is greater than that transaction's commitTs.
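To make the snapshot semantics concrete, here is a toy MVCC sketch in Go (a simplified illustration I made up, not Dgraph's actual code): a read at startTs sees the newest version of a key whose commitTs is at or below startTs, so a later delete is invisible to an older snapshot.

```go
package main

import "fmt"

// version is one committed write of a key in a toy MVCC store
// (a simplified illustration, not Dgraph's posting-list format).
type version struct {
	commitTs uint64
	value    string
	deleted  bool
}

// readAt returns the value a transaction with the given startTs sees:
// the newest version whose commitTs is at or below startTs.
func readAt(history []version, startTs uint64) (string, bool) {
	var best *version
	for i := range history {
		v := &history[i]
		if v.commitTs <= startTs && (best == nil || v.commitTs > best.commitTs) {
			best = v
		}
	}
	if best == nil || best.deleted {
		return "", false
	}
	return best.value, true
}

func main() {
	history := []version{
		{commitTs: 5, value: "alice"},
		{commitTs: 9, deleted: true}, // a later transaction deleted the key
	}

	v, ok := readAt(history, 7)
	fmt.Println(v, ok) // alice true: the delete at commitTs 9 is invisible at startTs 7

	_, ok = readAt(history, 10)
	fmt.Println(ok) // false: at startTs 10 the delete is visible
}
```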

Also the log is from here:

func (o *Oracle) purgeBelow(minTs uint64) {
	o.Lock()
	defer o.Unlock()

	// Dropping would be cheaper if abort/commits map is sharded
	for ts := range o.commits {
		if ts < minTs {
			delete(o.commits, ts)
		}
	}
	// There is no transaction running with startTs less than minTs
	// So we can delete everything from rowCommit whose commitTs < minTs
	for key, ts := range o.keyCommit {
		if ts < minTs {
			delete(o.keyCommit, key)
		}
	}
	o.tmax = minTs
	glog.Infof("Purged below ts:%d, len(o.commits):%d"+
		", len(o.rowCommit):%d\n",
		minTs, len(o.commits), len(o.keyCommit))
}

Let me rephrase my question: I am repeatedly querying (read-only) with the exact same startTs so that I get reproducible results, guaranteeing I am not affected by any concurrent writes. For how long can I read at that startTs without seeing changes? For hours?

No, it cannot be used for that long, mainly because rollups happen (the changes to a list are compacted into a single list). Once that happens, the previous versions are no longer available. Lists are continually being rolled up, so there's no clear limit on how long this takes; but since reading a list puts it in the queue of lists to be rolled up, I suspect the lifetime of a startTs is not very long.
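A toy sketch of what a rollup does to old versions (my own simplified model, not Dgraph's implementation): all versions at or below the rollup timestamp are compacted into a single version, so snapshots below that timestamp can no longer be reconstructed.

```go
package main

import "fmt"

// version is one committed write in a toy versioned list.
type version struct {
	commitTs uint64
	value    string
}

// rollup compacts every version at or below rollupTs into a single
// version stamped at rollupTs and discards the older history. This is
// a simplified sketch of the idea, not Dgraph's implementation.
func rollup(history []version, rollupTs uint64) []version {
	var latest *version // newest version at or below rollupTs
	var kept []version  // versions above rollupTs survive unchanged
	for i := range history {
		if history[i].commitTs <= rollupTs {
			if latest == nil || history[i].commitTs > latest.commitTs {
				latest = &history[i]
			}
		} else {
			kept = append(kept, history[i])
		}
	}
	if latest == nil {
		return kept
	}
	return append([]version{{commitTs: rollupTs, value: latest.value}}, kept...)
}

func main() {
	history := []version{
		{commitTs: 3, value: "v1"},
		{commitTs: 6, value: "v2"},
		{commitTs: 9, value: "v3"},
	}
	// After rolling up at ts 7, the ts-3 and ts-6 versions are gone,
	// so a reader at startTs 5 can no longer recover "v1".
	fmt.Println(rollup(history, 7)) // [{7 v2} {9 v3}]
}
```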

In general, the workflow for queries involves creating new transactions that get assigned new timestamps. I don’t think trying to query for old data is very well supported, especially if there are ongoing mutations.

Maybe you could add a creation time for the nodes that you want to query and use that to create a query that has reproducible results without depending on the startTs of the transaction.
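For example, with a hypothetical `createdAt` datetime predicate (predicate name, index tokenizer, and timestamp are all made up here for illustration), the schema and query might look roughly like this:

```
# Hypothetical schema: index createdAt so it can be used at the query root.
createdAt: dateTime @index(hour) .
```

```
{
  snapshot(func: le(createdAt, "2021-07-08T09:30:00Z")) {
    uid
    createdAt
  }
}
```

Nodes created after the cutoff never match, so repeated runs return the same set regardless of the transaction's startTs (assuming existing nodes are not mutated).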

So this means that when you iterate over your result via pagination, Dgraph's transactions cannot guarantee consistent results while mutations are happening. So even though Dgraph isolates reads from write transactions, those rollups break subsequent reads. Usually it is fine for subsequent reads to see new writes, but in the case of pagination, that is not really what you would expect.

@martinmr Is there any way to defer Dgraph's rollups for the time I am iterating over result pages? Will rollups be deferred as long as there are read queries running on an alpha for that startTs? Would sufficiently many concurrent reads block rollups? Does each alpha have its own rollup cycle, or do all alphas drop old versions synchronously?

That’s incorrect. The point that @martinmr is making is that if there are rollups, reading below the rollup Ts might give you errors. But it won’t ever give you incorrect results. Every read happens at a timestamp, hence providing you a snapshot over the entire DB. But we’re rolling up, dropping older versions and so on, so if you have a very long-running read, then you might just error out.

We could have a flag to disable rollups, but it would affect read performance. I don’t see much value in it.

So the good news is that I won’t silently see results from later transactions, so what I get is consistent. But I may fail to read the full result. While Dgraph guarantees consistent results, it cannot guarantee that I get the full result through pagination once a rollup occurs.

That is a limitation of pagination that should clearly be stated in the docs.
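To illustrate the failure mode being discussed, here is a toy sketch (an assumption-laden model of my own, not Dgraph's code) of a store that refuses reads below a purge watermark: a paginated reader holding an old startTs gets either consistent pages or an error, never silently newer data.

```go
package main

import (
	"errors"
	"fmt"
)

// pagedStore is a toy store that, like Dgraph after a rollup/purge,
// errors out on reads below a watermark instead of silently serving
// newer data. A sketch for illustration, not Dgraph's actual code.
type pagedStore struct {
	minTs uint64   // reads below this timestamp error out
	rows  []string // the full result set, in a fixed order
}

// page returns rows[offset:offset+limit] as seen at startTs, or an
// error if startTs has fallen below the purge watermark.
func (s *pagedStore) page(startTs uint64, offset, limit int) ([]string, error) {
	if startTs < s.minTs {
		return nil, errors.New("startTs is below the purged watermark")
	}
	if offset >= len(s.rows) {
		return nil, nil
	}
	end := offset + limit
	if end > len(s.rows) {
		end = len(s.rows)
	}
	return s.rows[offset:end], nil
}

func main() {
	s := &pagedStore{minTs: 1, rows: []string{"a", "b", "c", "d"}}
	const startTs = 5

	p1, err := s.page(startTs, 0, 2)
	fmt.Println(p1, err) // [a b] <nil>

	s.minTs = 10 // a rollup/purge advances the watermark mid-iteration

	p2, err := s.page(startTs, 2, 2)
	fmt.Println(p2, err) // [] startTs is below the purged watermark
}
```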

Just because a rollup happens doesn’t mean you won’t get the data below it. My understanding was that you’re talking about some very long-running read transactions. For long-running ones, the snapshot might have advanced as well, and then Badger compactions might have happened, which would cause those versions to be GCed and such.

But it all just depends on data ingestion speeds. So, if you want to try doing this crazy thing of using a 2-hour-old start timestamp, the “warranty is void” and you’re on your own.