[bug] doesn't handle forking well

diggy · July 25, 2019, 7:22am

Moved from GitHub pydgraph/75

Posted by d4l3k:

I’ve been trying to use pydgraph to load data for training a PyTorch model, but the pydgraph client really doesn’t like multiprocessing/forking. If you make a call via pydgraph before forking, any concurrent queries made in the subprocesses throw errors.

import torch
from torch.utils.data import Dataset, DataLoader
import pydgraph

def dgraph_client():
    stub = pydgraph.DgraphClientStub('localhost:9080')
    return pydgraph.DgraphClient(stub)


class GraphDataset(Dataset):
    def __init__(self):
        super().__init__()

        self.docs = list(range(100))

    def __len__(self) -> int:
        return len(self.docs)

    def __getitem__(self, i):
        resp = dgraph_client().txn(read_only=True).query(
            """{
                user(func: has(username), first: 1) {
                   uid
                   username
                }
            }""",
        )
        print(resp)
        return torch.tensor([i])


train_dataset = GraphDataset()
train_dataset[0] # removing this line fixes this code

train_loader = DataLoader(train_dataset, batch_size=8, num_workers=8)
# running multiple dgraph requests in parallel causes the crash
print(next(iter(train_loader)))

Output

json: "{\"user\":[{\"uid\":\"0x2\",\"username\":\"blah\"}]}"
txn {
  start_ts: 1177930
}
latency {
  parsing_ns: 17013
  processing_ns: 308037700
  encoding_ns: 920495
}

Exception ignored in: <function _Rendezvous.__del__ at 0x7f182b90fb70>
Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/grpc/_channel.py", line 436, in __del__
    with self._state.condition:
AttributeError: '_Rendezvous' object has no attribute '_state'
Traceback (most recent call last):
  File "repro.py", line 38, in <module>
    print(next(iter(train_loader)))
  File "/usr/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 582, in __next__
    return self._process_next_batch(batch)
  File "/usr/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
TypeError: __init__() missing 3 required positional arguments: 'call', 'response_deserializer', and 'deadline'

Seems to be caused by some issue with grpc. Googling the error didn’t pull anything up from the grpc project so it might be specific to how pydgraph uses it.

It’d be really nice if torch’s DataLoader were well supported, since that would make graph learning much easier to do. I’m very excited to use dgraph for ML.

Versions:

Name: pydgraph
Version: 1.2.0
Summary: Official Dgraph client implementation for Python
Home-page: https://github.com/dgraph-io/pydgraph
Author: Dgraph Labs
Author-email: contact@dgraph.io
License: Apache License, Version 2.0
Location: /usr/lib/python3.7/site-packages
Requires: grpcio, protobuf
Required-by:
---
Name: grpcio
Version: 1.22.0
Summary: HTTP/2-based RPC framework
Home-page: https://grpc.io
Author: The gRPC Authors
Author-email: grpc-io@googlegroups.com
License: Apache License 2.0
Location: /usr/lib/python3.7/site-packages
Requires: six
Required-by: tensorflow, tensorflow-serving-api, tensorflow-serving-api-gpu, tensorboard, pydgraph
---
Name: protobuf
Version: 3.7.0
Summary: Protocol Buffers
Home-page: https://developers.google.com/protocol-buffers/
Author: None
Author-email: None
License: 3-Clause BSD License
Location: /usr/lib/python3.7/site-packages
Requires: six, setuptools
Required-by: tensorflow, tensorflow-serving-api, tensorflow-serving-api-gpu, tensorboardX, tensorboard, pydgraph, googleapis-common-protos, google-api-core
---
Name: torch
Version: 1.1.0
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: UNKNOWN
Author: UNKNOWN
Author-email: UNKNOWN
License: UNKNOWN
Location: /usr/lib/python3.7/site-packages
Requires: numpy
Required-by: torchvision

diggy · July 29, 2019, 5:52pm

martinmr commented :

Yes, this issue is known and is in fact related to gRPC (see grpc/grpc issue #16001, "python grpc server with multiprocessing fails"). As far as I know, the gRPC team has no plans to fix this, so unfortunately there’s not much we can do about it. The real issue is the lack of decent multithreading in Python: threads can’t run Python code in parallel, so the existing solutions get parallelism by spawning new processes instead.

diggy · July 30, 2019, 5:58am

d4l3k commented :

Might want to document the behavior of grpc somewhere. See fork_support.md in the grpc/grpc repository (master branch).

I ended up working around this by creating a multiprocessing.Pool on program launch and then running all dgraph client requests in the subprocess using apply. That way the parent never makes any RPC calls itself and doesn’t run into the weird grpc forking behavior.
