Very poor dateTime index/filtering performance when large numbers of nodes have the same date

This really a domain problem, but my dataset was exported from a MySQL database which currently has large numbers of data nodes that have a date time of 2999-12-31T15:00:00Z. This represents data nodes that have no expiry for the foreseeable future.

However, I have noticed that executing something like @filter((le(min_sale_start_datetime, “2020-08-13T17:00:00Z”) AND ge(max_sale_end_datetime, “2020-08-13T17:00:00Z”), is dreadfully slow. Somewhere on the range of 500ms.

I realize after reading Datetime Indexes in Dgraph - Dgraph Blog, that this is probably a result of having too many uids in the same date time index bucket. However, this is a fairly common thing to have in an SQL database, so I feel like it would make sense to support this better?

Would the recommended work around in this case, either to change the data into a boolean, or find some way to distribute the dates better?

2 Likes

Have the same problem
I would be very glad if someone suggests some solution