Hi Ryan,
We are in the process of releasing a new patch and the ARM image got published prematurely.
We should have the images for both architectures for the new patch soon.
In the meantime, you can go ahead and use 24.0.2, which has both amd64 and arm64 images published.
Sorry for the confusion.
Thanks
-Megha
BenchmarkTestCache-8 1673727 751.5 ns/op (Main without ristretto cache)
BenchmarkTestCache-8 2333268 501.3 ns/op (Main with ristretto branch)
BenchmarkTestCache-8 3668409 319.3 ns/op (The PR's improvements)
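For context, these numbers come from Go's standard benchmark harness. A minimal benchmark in this shape, using the public ristretto v1 API with illustrative keys, costs, and cache sizes rather than Dgraph's actual setup, would look roughly like this and can be run with go test -bench:

```go
package cache_test

import (
	"strconv"
	"testing"

	"github.com/dgraph-io/ristretto"
)

// BenchmarkTestCache times repeated cache hits. Keys, value sizes, and the
// cache configuration below are illustrative only, not Dgraph's settings.
func BenchmarkTestCache(b *testing.B) {
	c, err := ristretto.NewCache(&ristretto.Config{
		NumCounters: 1e6,     // number of keys to track frequency for
		MaxCost:     1 << 28, // ~256 MB cost budget
		BufferItems: 64,
	})
	if err != nil {
		b.Fatal(err)
	}
	// Warm the cache with some entries before timing the reads.
	for i := 0; i < 1000; i++ {
		c.Set(strconv.Itoa(i), []byte("posting-list"), 1)
	}
	c.Wait() // ensure the buffered Sets have been applied

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		c.Get(strconv.Itoa(i % 1000))
	}
}
```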
Running 24.0.4, with the query cache set at 10% (of 256 GB RAM), we see about a 30% improvement. This was with live data ingest stopped; when it's on, the improvement is less significant. We're looking forward to the fruits of these PRs to get even faster!
Thanks @rahst12 for the enthusiasm. The old ristretto cache couldn't handle updates: at every update the cache was invalidated. The new cache that we are implementing updates the cached entries instead, so hopefully the new cache will still perform well when your live data ingest is on.
In the new PR, even the ristretto cache is almost as fast as the new cache, which points to general performance improvements around caching. The new cache is Jepsen tested, and so far we haven't seen any consistency issues with it. Hopefully we get these PRs in soon; we are still fine-tuning.
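Purely as an illustration of that difference (the types below are placeholders, not Dgraph's actual cache code), the two strategies look roughly like this:

```go
package cache

// Illustrative sketch only; these types stand in for Dgraph's internals.

type Delta struct{} // the payload of a mutation

type Value interface {
	Apply(Delta) Value // produce the value with the mutation applied
}

type Cache interface {
	Get(key string) (Value, bool)
	Set(key string, v Value)
	Del(key string)
}

func writeToDisk(key string, d Delta) { /* persist the mutation */ }

// Old behaviour: every update invalidates the entry, so the next read has
// to rebuild it from disk and the cache stays cold under live ingest.
func applyMutationInvalidate(c Cache, key string, d Delta) {
	writeToDisk(key, d)
	c.Del(key)
}

// New behaviour: the same delta is applied to the cached value, so the
// entry stays warm and subsequent reads keep hitting the cache.
func applyMutationUpdate(c Cache, key string, d Delta) {
	writeToDisk(key, d)
	if v, ok := c.Get(key); ok {
		c.Set(key, v.Apply(d))
	}
}
```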
@rahst12 That's great to hear. The beta should be out soon.
If you want to try it out right now, you could use the branch harshil-goel/shared-map. It's the latest that we have.
I saw that PR 9180 was closed in favor of PR 9237, both being improvements on the cache. Does the new PR hit the same benchmark improvements as the old one (another 30% improvement)? Also, did the new PR’s cache solution end up not invalidating the cache after upserts?
So the change that brought the performance benefit I mentioned before has already been merged. It's the one you posted before this.
Both PRs have similar performance characteristics. We were able to fix the already existing cache using some new APIs and some clever ways of using the cache. The main point of the new cache was that the ristretto cache implementation couldn't keep consistency.
On our 21 million dataset, we ran 63 queries 1 billion times and took the average of all the query times. Across all the queries combined, we saw an improvement of around 23%. But for the larger queries (average around 3ns), we saw an improvement of 200% (now around 1ns average).
In this PR, we have introduced a new parameter in the cache settings, keep-updates. If it is set to true, the cache will not be invalidated after upserts. It is not enabled by default for now, while we test it further, since it has some performance consequences during heavy mutations.
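Reusing the illustrative Cache and Delta types from the sketch above, this is roughly how a keep-updates toggle could gate the two behaviours; the setting name comes from the PR, but the wiring below is a guess, not the actual code:

```go
// Sketch only: field and function names are assumptions, not the PR's code.

type CacheOptions struct {
	KeepUpdates bool // corresponds to the keep-updates cache setting; off by default
}

func applyMutation(c Cache, opts CacheOptions, key string, d Delta) {
	writeToDisk(key, d)
	if !opts.KeepUpdates {
		c.Del(key) // default: invalidate the entry after an upsert
		return
	}
	if v, ok := c.Get(key); ok {
		c.Set(key, v.Apply(d)) // keep-updates=true: update the entry in place
	}
}
```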
It has passed most reviews; hopefully we will do a final review tomorrow and publish it.