This has been brought up before but I just wanted some further clarification to make sure I’m optimizing my Dgraph processes.
When dealing with client connections, is it best practice to have many client connections that each perform a small number of operations, or fewer client connections that each perform a large number of operations? For example, my app runs several Scrapy spider crawls concurrently, and each scraped item (a JSON object) is imported into Dgraph mid-scrape with an upsert query using pydgraph. Given that, which of the following scenarios would be considered best practice, in other words which gives the best performance:
- Client stub and client created before starting crawls, each spider using that one client to perform the upserts on each item, close client stub when all crawls finish.
- Client stub created before starting crawls, before each upsert create a client from the client stub, perform upsert, close client stub when all crawls finish.
- Start crawls, before each upsert create a client stub and client, perform upsert, close client stub.
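
To make the comparison concrete, here is a rough sketch of what I mean by the first scenario. It is not my actual code: the `url` and `title` predicates, the `localhost:9080` address, and the helper names `upsert_item` and `run_scenario_one` are made up for illustration, and it assumes `url` has an index so that `eq` works. The string quoting via `json.dumps` is also a simplification.

```python
import json


def upsert_item(client, item):
    """Upsert one scraped item, keyed on its URL, in its own transaction."""
    # "url" and "title" are hypothetical predicates used only for this example.
    url = json.dumps(item["url"])      # crude quoting, good enough for a sketch
    title = json.dumps(item["title"])
    query = "{ v as var(func: eq(url, %s)) }" % url
    nquads = "uid(v) <url> %s .\nuid(v) <title> %s ." % (url, title)

    txn = client.txn()
    try:
        mutation = txn.create_mutation(set_nquads=nquads)
        request = txn.create_request(
            query=query, mutations=[mutation], commit_now=True
        )
        return txn.do_request(request)
    finally:
        # Safe to call even after a successful commit.
        txn.discard()


def run_scenario_one(items, address="localhost:9080"):
    """Scenario 1: one stub and one client shared by every upsert."""
    # Imported here so upsert_item can be exercised without pydgraph installed.
    import pydgraph

    stub = pydgraph.DgraphClientStub(address)
    client = pydgraph.DgraphClient(stub)
    try:
        for item in items:
            upsert_item(client, item)
    finally:
        # Closed once, after all crawls have finished.
        stub.close()
```

The second scenario would move only the `pydgraph.DgraphClient(stub)` call inside the loop, and the third would move both the stub and client creation (plus `stub.close()`) inside the loop.
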
I think, to start, I just don’t fundamentally understand the difference between a client stub and a client, and in turn I’m unsure of the best way to use and then close them. My experience with gRPC is very minimal.