Long string breaks dgraph

Is there a limit to string length?

I found Dgraph crashing or hanging with strings longer than 64971 characters. Error message: Error occurred while aborting transaction: rpc error: code = Canceled desc = context canceled.

Even after a restart Dgraph is not usable anymore.

Is this a known issue?

Dgraph version v22.0.1

Are you using .NET? It looks like .NET can’t deal with string literals larger than 65535 bytes. In Go the limit would be your memory, which in theory is about 2^64/2 characters.

Nope, not using .NET.

Our setup used to work just fine with the Docker image v21.12.0. Strings with a length of over 200,000 characters were no problem at all. But now we want to switch to v22.0.1, and string length seems to be an issue.

After some time there were some additional logs:
I1208 14:47:48.248839 31 draft.go:1592] Found 3 old transactions. Acting to abort them.
I1208 14:47:48.249546 31 draft.go:1553] TryAbort 3 txns with start ts. Error:
I1208 14:47:48.249568 31 draft.go:1569] TryAbort: No aborts found. Quitting.
I1208 14:47:48.249572 31 draft.go:1595] Done abortOldTransactions for 3 txns. Error:

This just keeps looping.

Can you test with 21.03?

Try to start from scratch.

Share stats about your env and Docker.

It’s the same issue there.

The container stays at around 50% load even when idling after the incident (the failed insert).

I’m running it locally: 32 GB RAM, 8 cores @ 3 GHz.

Depending on the indexing you used, it will try to index the entire string.

Is that Docker? How many resources is Docker allowed to consume?
What disk are you using?

Hmm, this can be around 200 KB…

Maybe version v21.12.0 included an improvement here. The current release had to go back to the v21.03 codebase due to several unforeseen bugs, and we’ll work through each PR to see which ones are worth merging back. So you need to wait, or maybe try to find out which PR made the improvement you need.

This may be linked to improved indexing. Maybe in types, but I don’t remember anybody working on that. You’ll have to wait; it is really hard to tell what it is.

Yup, that was a Docker container. There is no resource constraint configured. I don’t know the exact disk specs, but it should be plenty fast.

But I did some testing, and indexing really seems to be the problem here. I tested with a string of length 200,000 containing just the letter a. With the regexp index, the insert works. With the exact and term indexes, it does not. I use a combination of all three because our use case needs both extensive and exact searching.

As indexing seems to be the problem, is there anything on your roadmap regarding indexing performance?

Edit:
I did some testing with just the regexp index: strings of length 5,000,000 seem to be no problem (which should be enough, I guess). So I will switch to regexp only for now and run searches with predefined regex patterns. But it would be nice to use the other indexes as well :)

Cheers!
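
The length tests described above could be automated with a binary search over string lengths. This is only a sketch: `insert_ok` is a hypothetical callback that you would have to implement yourself (for example, insert a string of the given length and report whether it succeeded), and it assumes inserts succeed up to some threshold and fail above it.

```python
from typing import Callable


def make_value(length: int, char: str = "a") -> str:
    """Build a test string of the given length from one repeated character."""
    return char * length


def find_max_length(insert_ok: Callable[[int], bool], lo: int, hi: int) -> int:
    """Return the largest n in [lo, hi] for which insert_ok(n) is True.

    Assumes insert_ok is monotonic (True up to a threshold, False above it)
    and that insert_ok(lo) is True.
    """
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if insert_ok(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo
```

Since a failed insert left the instance unusable in this thread, each probe would probably need a fresh Dgraph instance, which makes a binary search (about 18 probes for a range of 200,000) much cheaper than stepping through lengths one by one.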

What was the indexing you were using? We can create some e2e tests for this, simulating what you did. But there are also known limitations with indexing in some cases; I think the docs mention this. You can’t use every index for every case. For example, regex only works with patterns of 3+ characters for performance reasons. But if you say exact is a problem, I’m going to ask the team to dig deeper into this.
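
To illustrate the 3+ character remark: assuming the regexp index is trigram-based, as the Dgraph docs describe it, the index tokens are the 3-character substrings of the value. The sketch below is a plain Python illustration of that idea, not Dgraph’s tokenizer.

```python
def trigrams(text: str) -> set:
    """Return the set of distinct 3-character substrings of text."""
    # For inputs shorter than 3 characters the range is empty,
    # so no trigram exists and the result is an empty set.
    return {text[i:i + 3] for i in range(len(text) - 2)}
```

A 2-character pattern yields no trigram to look up, which is consistent with the 3+ character limitation. Note also that under this model a test string made of a single repeated letter produces only one distinct trigram, however long it is, so it may not be a representative stress test for the regexp index.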

So initially the field was indexed with term, exact and regexp. After some testing I found out that a single term index or a single exact index is also troublesome (and the combination of both, of course).

The following indexes lead to hang-ups when inserting strings that are too long:

  • field: String! @search(by: [term, exact, regexp])
  • field: String! @search(by: [term, exact])
  • field: String! @search(by: [term])
  • field: String! @search(by: [exact])
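
For anyone who wants to reproduce this, here is a minimal sketch of building the request. The type name `Value` and the endpoint URL are assumptions for illustration (the URL is the usual default for a local Dgraph Alpha); adjust both to your setup. The `send` helper is defined but not called, because a failing insert can leave the instance unusable.

```python
import json
import urllib.request

# Assumed schema for this sketch; "Value" and "field" are illustrative names.
SCHEMA = """
type Value {
  field: String! @search(by: [exact])
}
"""

# Mutation against the API that Dgraph generates for a type named Value.
ADD_VALUE = """
mutation ($input: [AddValueInput!]!) {
  addValue(input: $input) {
    numUids
  }
}
"""


def build_request_body(length: int) -> bytes:
    """Build a JSON GraphQL request that inserts a string of `length` letters."""
    value = "a" * length
    body = {"query": ADD_VALUE, "variables": {"input": [{"field": value}]}}
    return json.dumps(body).encode("utf-8")


def send(body: bytes, url: str = "http://localhost:8080/graphql") -> dict:
    """POST the request to a GraphQL endpoint (URL is an assumption)."""
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Calling `send(build_request_body(100000))` against a throwaway instance with the schema above loaded should be enough to check whether a given version is affected.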

Is there any news on this, @MichelDiz?

We believe this is fixed in v23. Can you test in v23-RC1?

I just tested it with said version. This is the resulting error message with a string of length 100000:

mutation addValue failed because Dgraph execution failed because value in the mutation is too large for the index

@Damon What is the limit then for the ‘exact’ index?

UPDATE

Turns out using the hash index instead of the exact index solved the problem in my case as I just needed equals filters to be working.
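
A rough way to see why this helps, assuming an exact index stores the whole value as its index token while a hash index stores a fixed-size digest. SHA-256 is an arbitrary stand-in here, not necessarily the hash Dgraph uses, and both functions are illustrative only.

```python
import hashlib


def exact_token(value: str) -> bytes:
    """Stand-in for an exact index: the token is the full value."""
    return value.encode("utf-8")


def hash_token(value: str) -> bytes:
    """Stand-in for a hash index: the token is a fixed-size digest."""
    return hashlib.sha256(value.encode("utf-8")).digest()
```

Under this model the exact token grows with the string, while the hash token stays at 32 bytes regardless of input length. The trade-off is that a digest only supports equality checks; it does not preserve ordering, so filters that depend on ordering would still need a different index.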