Dgraph Client Connection - Best Practices


This has been brought up before, but I wanted some further clarification to make sure I’m optimizing my Dgraph processes.

When dealing with client connections, is it best practice to have many client connections that each perform a small number of operations, or fewer client connections that each perform a large number of operations? For example, my app runs several Scrapy spider crawls concurrently, and each scraped item (a JSON object) is imported into Dgraph mid-scrape with an upsert query using PyDgraph. Given that, which of the following scenarios would be considered best practice, i.e. would give the best performance:

  1. Client stub and client created before starting crawls, each spider using that one client to perform the upserts on each item, close client stub when all crawls finish.
  2. Client stub created before starting crawls, before each upsert create a client from the client stub, perform upsert, close client stub when all crawls finish.
  3. Start crawls, before each upsert create a client stub and client, perform upsert, close client stub.
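To make the scenarios concrete, here is a minimal sketch of what scenario 1 might look like with PyDgraph: one `DgraphClientStub` and one `DgraphClient` are created up front, reused for every upsert, and the stub is closed once at the end. The helper names (`build_upsert`, `upsert_item`, `run_crawls`), the `url` and `title` predicates, and the `localhost:9080` address are assumptions for illustration only; the sketch also assumes `url` is indexed in the schema so that `eq(url, ...)` works. It is not meant as a statement of which scenario performs best.

```python
import json


def build_upsert(item):
    """Build (query, set_nquads) for an upsert keyed on a hypothetical `url` predicate."""
    # json.dumps is used here only as a simple way to quote and escape strings (assumption).
    url = json.dumps(item["url"])
    title = json.dumps(item["title"])
    query = "{ page as var(func: eq(url, %s)) }" % url
    nquads = "uid(page) <url> %s .\nuid(page) <title> %s ." % (url, title)
    return query, nquads


def upsert_item(client, item):
    """Run one upsert in its own transaction on an already-open client."""
    query, nquads = build_upsert(item)
    txn = client.txn()
    try:
        mutation = txn.create_mutation(set_nquads=nquads)
        request = txn.create_request(
            query=query, mutations=[mutation], commit_now=True
        )
        return txn.do_request(request)
    finally:
        # Discarding after a successful commit is harmless and cleans up on failure.
        txn.discard()


def run_crawls(items, address="localhost:9080"):
    """Scenario 1: create the stub and client once, reuse them, close the stub at the end."""
    import pydgraph  # imported here so the helpers above can be tested without it

    stub = pydgraph.DgraphClientStub(address)  # holds the gRPC channel
    client = pydgraph.DgraphClient(stub)       # wraps one or more stubs
    try:
        for item in items:
            upsert_item(client, item)
    finally:
        stub.close()
```

In a Scrapy project, `upsert_item` would presumably be called from an item pipeline, with the stub and client opened and closed in whatever hooks bracket the whole crawl run. Scenarios 2 and 3 differ only in where the `DgraphClient(...)` and `DgraphClientStub(...)` lines sit relative to the loop.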

I think the root of my confusion is that I don’t fundamentally understand the difference between a client stub and a client, and so I’m unsure of the best way to use and then close them. My experience with gRPC is very minimal.