How to set up a pydgraph development environment?

I’m interested in contributing to pydgraph (or creating a modified fork). For this I need to create a development environment and make sure that I can run the tests. Where can I find instructions on how to do this?

What you actually did

I’ve

  1. cloned the repo https://github.com/dgraph-io/pydgraph
  2. created and activated a Python virtual environment (python3.8 -m venv venv). I’m using python3.8 because I noticed that the tox.ini file defines a py38 environment
  3. within the virtual environment I ran python setup.py install (according to the instructions in the Development section of the README). There are some issues with this step.
  4. after resolving the issues from the previous step, I installed tox and tried to run it (according to the instructions in the Running tests section of the README). There are also some issues unless I keep the modifications in setup.py.

TL;DR: see the last screenshot (after “As a summary…”)

Why that wasn’t great, with examples

setup.py install

setup.py install fails with an error because setup.py imports the VERSION variable from the pydgraph package, which in turn triggers imports of dependencies that are not installed yet (google.protobuf).

$ python setup.py install
Traceback (most recent call last):
  File "setup.py", line 23, in <module>
    from pydgraph.meta import VERSION
  File "/home/ulianich/projects/pydgraph/pydgraph/__init__.py", line 15, in <module>
    from pydgraph.proto.api_pb2 import Operation, Payload, Request, Response, Mutation, TxnContext,\
  File "/home/ulianich/projects/pydgraph/pydgraph/proto/api_pb2.py", line 6, in <module>
    from google.protobuf import descriptor as _descriptor
ModuleNotFoundError: No module named 'google'

As a workaround I could install the dependencies either by running pip install -r requirements.txt or by temporarily replacing the problematic line in setup.py:

# from pydgraph.meta import VERSION
VERSION = '21.3.2'

But if I change setup.py back to its original state and run setup.py install, the following error appears:

Traceback (most recent call last):
  File "setup.py", line 23, in <module>
    from pydgraph.meta import VERSION
  File "/home/ulianich/projects/pydgraph/pydgraph/__init__.py", line 15, in <module>
    from pydgraph.proto.api_pb2 import Operation, Payload, Request, Response, Mutation, TxnContext,\
  File "/home/ulianich/projects/pydgraph/pydgraph/proto/api_pb2.py", line 34, in <module>
    _descriptor.EnumValueDescriptor(
  File "/home/ulianich/projects/pydgraph/venv/lib/python3.8/site-packages/protobuf-4.21.12-py3.8-linux-x86_64.egg/google/protobuf/descriptor.py", line 755, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates

According to the message, installing an older version of protobuf does fix the issue: pip install protobuf~=3.20. Is pydgraph supposed to use an older version of protobuf? Or should I set the PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION variable instead?
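To check which situation a given environment is in, something like the sketch below could be used. It assumes, based only on the error text above, that the bundled _pb2.py files work with protobuf 3.20.x or lower and fail with newer releases; protobuf_is_too_new and installed_protobuf_version are hypothetical helper names, not part of pydgraph.

```python
from importlib import metadata  # in the standard library since Python 3.8


def protobuf_is_too_new(version):
    """Return True if the version string is newer than the 3.20.x series."""
    # Only the major and minor components matter for this check.
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) > (3, 20)


def installed_protobuf_version():
    """Return the installed protobuf version, or None if it is not installed."""
    try:
        return metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_protobuf_version()
    if found is None:
        print("protobuf is not installed")
    elif protobuf_is_too_new(found):
        # Assumption taken from the error message: 3.20.x or lower is fine.
        print("protobuf %s: pin protobuf~=3.20 or regenerate the _pb2 files" % found)
    else:
        print("protobuf %s should work with the existing _pb2 files" % found)
```

Run inside the venv, this only reports the situation; it does not say which of the two workarounds the project prefers.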

BTW, I installed pydgraph from PyPI in a clean venv and the problem reproduces there as well: https://imgur.com/a/cCDGfdv (as a new user I cannot embed too many links or media)

tox (running tests)

tox also fails with “ModuleNotFoundError”, so I keep VERSION hard-coded in setup.py and comment out the from pydgraph.meta ... import line. The same “protobuf too new” issue then appears in tox as well, so I replace the protobuf requirement in tox.ini with a pinned version:

    # protobuf >= 3.6.1
    protobuf == 3.20.3

After these changes tox manages to run the tests (though my system hung at some point on the first try; I’m so happy that the draft of this post was saved). Running tests screenshot: https://imgur.com/a/qBMa2FV (cannot embed too many links or images)

As a summary, I could run the tests only after applying the modifications described above: hard-coding VERSION in setup.py and pinning protobuf == 3.20.3 in tox.ini.

So the question is: what is the “intended” way of working with (contributing to) the project? Is it possible to run tox without manually modifying setup.py?

Any external references to support your case

https://stackoverflow.com/a/2073599/7710928. Looks like setup.py does a bad thing by directly importing the pydgraph package (?)
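One common way to avoid that import, along the lines of the linked answer, is to read the version string out of the file as plain text. The following is only a sketch: it assumes pydgraph/meta.py contains a simple assignment such as VERSION = '21.3.2', and read_version is a hypothetical helper, not something that exists in the pydgraph repo.

```python
import re
from pathlib import Path


def read_version(meta_path):
    """Return the VERSION string from a file without importing the package."""
    text = Path(meta_path).read_text(encoding="utf-8")
    # Match a line such as: VERSION = '21.3.2'
    match = re.search(r"^VERSION\s*=\s*['\"]([^'\"]+)['\"]", text, re.MULTILINE)
    if match is None:
        raise RuntimeError("VERSION assignment not found in %s" % meta_path)
    return match.group(1)


# Hypothetical usage inside setup.py (path assumed from the traceback above):
# VERSION = read_version("pydgraph/meta.py")
```

Because nothing from the package is imported, google.protobuf would not need to be present when setup.py starts.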

Tests eventually failed, but that’s likely because I’ve ignored the README section:

This script assumes Dgraph and dgo (Go client) are already built on the local machine and that their code is in $GOPATH/src. It also requires that docker and docker-compose are installed in your machine.

Where can I find more detailed instructions on how to build/install dgraph and dgo for local development? E.g. is it okay to install dgraph using my system package manager (as I do with docker), or must I run the scripts/install_dgraph.sh script? How do I install dgo and ensure that its code is in $GOPATH/src?

Also, are there any helper files to run tests partially during local development (to skip time-consuming tests)? E.g. a docker-compose file that I can use to set up a testing environment and then call pytest/coverage directly instead of scripts/local-test.sh.

UPD: I’ve noticed tests/docker-compose.yml. Is it intended for (or applicable to) the local testing process I described above?
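For illustration, if a Dgraph cluster is started separately (for example from that compose file), a small helper like the one below could wait until the server port accepts connections before pytest is called directly. This is a sketch under assumptions: 9080 is Dgraph Alpha’s default gRPC port, but which ports the compose file actually exposes, and how the test suite picks up the server address, would need to be checked in the repo. wait_for_port is a hypothetical helper.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0):
    """Return True once host:port accepts TCP connections, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not reachable yet; retry shortly.
            time.sleep(0.5)
    return False


# Hypothetical usage (address is an assumption; adjust to the compose file):
# if wait_for_port("localhost", 9080, timeout=60.0):
#     ...run pytest against the cluster...
```

An open port only shows that the process is listening, not that the cluster is fully ready to serve requests, so the tests may still need a retry on their first connection.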

I’ve installed dgraph-bin from the AUR and the tests have completed successfully. I did not run the “install_dgraph.sh” script, nor did I explicitly install dgo. Does it mean I’m gonna be okay with my current environment?

Before anything else: we are not maintainers of this package. Be careful.

You don’t need Go or Dgo if you are a Py dev. Pydgraph is a Python client. No need to install anything related to Go. Dgo is the Go client.

If you are developing in Py you should use Pydgraph as your client and have a Dgraph Cluster set up. Nothing more.

In the repo itself. It should be straightforward.

Humm, I see. Well, that is something else. You may need the Dgraph repo and Go in that case, because you are only able to build the protobuf files from the main repo. But in general I think you would need such a setup if you are changing deeply linked files. Not sure if it is really necessary. In general, in other clients you just need to copy the generated protobuf files and set up the env without any issue or dependency. I never did it for Py.

Thanks for a quick reply!

Sure, I’m just asking whether I understand correctly that this package may be what I need. I don’t see explicit instructions in the repo for installing dgraph, so I assume that I could (should) install it on my host system like ordinary software (unlike Python dependencies, which should usually be installed in an isolated virtual environment).

Then README.md is misleading, because it says:

This script assumes Dgraph and dgo (Go client) are already built on the local machine and that their code is in $GOPATH/src. It also requires that docker and docker-compose are installed in your machine.

Is it a mistake? As a python developer I don’t even understand what $GOPATH exactly is (must be something similar to $PYTHONPATH :smile:)

I only found the README of the repo itself (dgraph-io/pydgraph, the official Dgraph Python client). In the original post I describe how I followed these instructions step by step, but there are issues with following them “straightforwardly” (the commands just don’t work without file modifications). BTW, I used the master branch; is that right?

Right now I’m not trying to do anything except clone the repo and run the tests without any modifications to the code, so I don’t think I need any advanced setup. The problem with setup.py is that it imports the VERSION variable from a package that also imports a dependency library (google.protobuf). But I’m running setup.py precisely in order to install those dependencies. This means I cannot install the dependencies, because they are required to run the command that installs them (a vicious cycle).

So it feels like I’m doing something wrong, despite following instructions “as they are”.

Odd, this isn’t documented (see a GitHub search for “dgo”).
I think they use Dgo just for Proto Sync. But they didn’t document how to…

I think it is similar to PYTHONPATH. I know how the GOPATH works, but I don’t know about the Python one lol! Basically $GOPATH is where all the Go context goes. In short, the workspace env.

Yes. Master there.

Last time I tried it was a long time ago btw lol

I was about to go back to that repository to create some tests for CI for each new PR. Let’s see (next week) if I fall into the same pitfalls you did.
