Is there a limit to the size of the string data of a predicate (scalar, string type)?

I need to store images as base64 encoded data.

E.g.
_:per <name> "jack" .
_:per <profile_image> "iVBORw0KGgoAAAANSUhEUgAAAPAAA..." .
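
A complete DQL set mutation wrapping these triples would look roughly like this (just a sketch using the same example data):

{
  set {
    _:per <name> "jack" .
    _:per <profile_image> "iVBORw0KGgoAAAANSUhEUgAAAPAAA..." .
  }
}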

  1. Is there a limit on the size of the string?
  2. Even though profile_image is not searched on, would storing and retrieving large strings affect Dgraph’s performance?

Nope, none that I’m aware of.

Hello, can anyone reply to this question? I tried to upload a 2 MB file and it failed, but anything under 1 MB succeeds.
I really want to know the size limit of the String scalar. Thanks in advance.

A specific limit has never been stated, but there must be one at some point. I would also like to hear from the @core-devs, or maybe @MichelDiz might know?

The question was related to the String type. You are probably confusing this with some limitation in Dgraph Cloud. If there is a limit, it obviously isn’t 2 MB. There could be a technical, unconfirmed limit of 2 GB, but Manish once said that there’s no such thing.

Cheers.

I host Dgraph on my Windows system and use Retool with a GraphQL update mutation to test my case. I upload a JPG or PDF file (encoded to a base64 string first). When the size is more than 750 KB, the upload request succeeds, but the update mutation doesn’t change the stored string: the GraphQL query still returns the previous base64 string instead of the updated one. The response status code is 200, but I consider this a failure.
When the JPG/PDF is less than 700 KB, the upload mutation succeeds and the GraphQL query returns the correctly updated string data.
The Retool docs say their file input component supports files up to 40 MB, so I think the problem is caused by Dgraph.

The GraphQL update mutation doesn’t change the file field, but the response status is 200 and I get no error in the response.

This is the GraphQL request.

In GraphQL, all responses come back as HTTP 200. Error handling in GraphQL is done manually or via some third-party lib.
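
For example, a mutation that fails server-side can still come back as HTTP 200, with the problem reported only in the errors array of the JSON body (a generic, spec-shaped illustration; the updateUser field name is just hypothetical, not the exact message Dgraph would return):

{
  "data": { "updateUser": null },
  "errors": [
    { "message": "description of what went wrong" }
  ]
}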

Have you tried sending it via DQL? The original question was about DQL. Also check whether such a limitation would even be GraphQL-spec compliant.
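
For instance, assuming a local Dgraph Alpha listening on localhost:8080, you could send a raw RDF mutation over the DQL HTTP endpoint like this (a sketch; replace the placeholder value and adjust the predicate to your case):

curl -X POST "localhost:8080/mutate?commitNow=true" \
  -H "Content-Type: application/rdf" \
  -d '{
    set {
      _:per <profile_image> "PUT_YOUR_BASE64_STRING_HERE" .
    }
  }'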

Thanks for the reply.
There is no length limit on the String scalar in the GraphQL spec.


And someone has tested uploading a 30 MB file via GraphQL:

Size limit for GraphQL scalar String - Stack Overflow

Sorry, I’m unfamiliar with DQL; can anyone help test this with DQL? I also tested with Insomnia as the request sender and got the same result.


In fact, there is a hard-coded maximum value size of 1 MiB (in Dgraph v21.12.0, which is using BadgerDB commit 3f320f5df1bf; link to code where the size check happens); the over-sized key-value pair is ignored, and the error is not propagated to the client, though it does get logged (see here). This is not the expected behavior, right?


@MichelDiz I’m assuming this 1 MiB limit on string size still exists? I’m debugging a project with extra-long input strings used for filtering (not storing) and seeing something similar. If we break up the string, everything works fine.

We recently merged a PR that checks the size of the string value in certain cases: fix(mutation): validate mutation before applying it by mangalaman93 · Pull Request #8623 · dgraph-io/dgraph · GitHub. This is part of release v23.0.0. With this change, the error is propagated back to the client and the mutation won’t be applied.

The idea is that Badger limits the key size to 64 KB and the value size to 1 GB. This leads to the following approximate inequalities:

len(uid) + len(namespace) + len(predicate) < 64 KB
len(index token) + len(namespace) + len(predicate) < 64 KB

otherwise, len(value) < 1 GB

Index tokens for the string data type are the hash (usually SHA-256) for the hash index and the exact value for the exact index. Essentially, if you have an exact index on a string predicate, your values have to be smaller than 64 KB (even smaller than that, because the predicate name also takes some space).
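
To make the difference concrete, here is a small DQL schema sketch (the predicate names are only examples): a hash index keeps the token at a fixed size regardless of value length, while an exact index uses the whole value as the token and therefore runs into the 64 KB key limit for long strings.

# hash index: token is a fixed-size hash, so long values index fine
name: string @index(hash) .
# exact index: token is the full value, so values must stay well under 64 KB
code: string @index(exact) .
# no index: only the ~1 GB value limit applies
profile_image: string .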

I have an internal proposal for validating sizes before applying a schema or a mutation. Let me put it out for comments. We merged PR 8623 as part of the same proposal. We also plan to support larger sizes in the future and to disallow exact indexes for such data types.

Hope this helps.