Hi @Dgraph_Admin, so much work went into the feature prioritisation here, with so much anticipation from the community (at least, I can speak for myself). We’re approaching the year’s end, so how are we looking per the planning? Which of these features/capabilities should we still be expecting?
I’m prompted to ask since I’ve noticed significantly less activity on GitHub, compounded by the move to Hypermode.
Hi @iyinoluwaayoola, jumping in here as the (relatively) new product and engineering leader at Hypermode. I’ve spent my career building and consuming open-source projects in developer infrastructure, and the strength of the Dgraph community is a key reason we chose to build upon it with Hypermode. To be clear, our commitment to the Dgraph project remains steadfast while our commercial positioning evolves.
I’ve seen open-source projects work at their best when commercial developers are first clear on priorities that align with their objectives. It is alignment within these areas that naturally drives high output, with specific features shaped by customer and user feedback. You’ll notice that earlier this year priorities shifted to embrace vector capabilities within Dgraph. We believe this is an exciting development for the project and one we intend to open-source as soon as we reach feature stability.
Objectively, we have room to grow in how we engage with you and the rest of the community as our priorities evolve. I’m committed to building this transparency as we come into 2024. As a preview, you’ll see our priorities expressed in thematic areas of investment rather than specific features.
Additionally, you’ll see a stronger commitment to the experience of committers outside of Hypermode. Embracing a community of committers (in addition to users) to drive ongoing feature development is critical to the long-term sustainability of Dgraph.
More soon, though I hope this provides some clarity on my current thinking. Feedback is always appreciated to help this new guy understand what he may be missing.
So what do you suggest for current clients that are not interested in the shift to vector but are more concerned with the formerly promised feature set, bug fixes, optimizations, and security concerns that may now never see the light of day?
We’ll share more detail soon, but I can assure everyone that our interests extend far beyond vector functionality! Security and performance will always remain paramount for any data store. We are carefully considering the commitments we can make to the community and continue to engage with our commercial customers on their priorities.
@rft, thanks, I agree with your current thinking and really appreciate you sharing. For the new guy, pardon me, it feels like entering a new chapter with the old one still open. No doubt, vector capabilities in Dgraph are exciting. However, even as the product and business evolve, I would love to see the critical pain points of Dgraph customers (I can speak for myself) and its community addressed, and within a reasonable timeframe. I hope the priorities align.
This sounds completely like launching a new product and dumping the old one, except for holding on to high-paying commercial customers. Everyone else is out.
I need you guys to understand something very important.
Dgraph is special because it can be used as a primary database, not just a secondary database.
If you focus only on Vectors and their applications, you will lose what is left of what makes Dgraph great, and its diminishing following, including the silent ones.
Dgraph is a graph database that serves as a primary database, with zero server configuration, and works out of the box. I have seen 3 teams that don’t understand what Dgraph is, because the right things were never promoted.
If you’re going to kill Dgraph’s uniqueness, be honest about it now.
Definitely see your point of view @iyinoluwaayoola. I think about a roadmap as a living, breathing thing. It’s a useful guide, but it’s important to be mindful of how the world around you is evolving. Reprioritizing doesn’t mean saying no to what was previously a yes, more of a not yet.
@jdgamble555, I couldn’t agree more on the simplicity of the developer experience with Dgraph. It’s why we heard and believed that vector search capabilities were a must-have for teams that didn’t want to spin up another backend to bring AI-powered decisions into their applications. It’s an important feature as application architectures evolve, but certainly not the only place we’re investing.
Fully appreciate that we owe more clarity on the specifics here. I’ll share more soon.
@rft - Something else I thought about that makes Dgraph unique: Shared Instances. A lot of graph databases can’t get off the ground because the costs are so high. Shared Instances offer an extremely cheap point of entry for developers, and allow the cost of an entire instance to be shared at the business level. From my understanding, this is entirely unique to Dgraph among graph databases.
Without this, you would be locking yourself out of thousands of potential users IMO.
Side Note: The free instance is also useless as you can’t even build a todo app before it tells you to upgrade (I know, I have literally tried).
I’m very hopeful that if Hypermode is relying on Dgraph to be part of their core solution, similar to the way customers of Dgraph have relied on it, then Hypermode will begin ‘eating their own dogfood’ with Dgraph. I find that when that happens, the resiliency of a product goes up, simple ‘gotchas’ that were overlooked for years are addressed, and operation of the product is made easier.
For example, the team I work with runs Dgraph on bare metal servers. When those servers take an unexpected power hit, it’s a roll of the dice whether Dgraph will be able to come back online without importing a nightly backup of the full database. Twice we’ve lost data because we didn’t have exports/imports working to recover. We have nightly backups now, but they don’t run when Badger Log Compaction is occurring, and there’s no easy way to check for this and guarantee a backup is created without additional custom tooling.
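To illustrate the kind of custom tooling meant here, one option is a retry wrapper around whatever triggers the backup. This is only a minimal sketch, not Dgraph tooling: `run_backup` is a hypothetical callable you would supply (it should return True only when a backup was actually written), and the attempt count and delay are arbitrary assumptions.

```python
import time
from typing import Callable


def backup_with_retry(
    run_backup: Callable[[], bool],
    attempts: int = 5,
    delay_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call run_backup until it succeeds; return the attempt number that worked.

    run_backup is hypothetical: wrap whatever actually triggers your backup
    and return True only when the backup was really written.
    Raises RuntimeError if every attempt fails, so a scheduler can alert.
    """
    for attempt in range(1, attempts + 1):
        if run_backup():
            return attempt
        if attempt < attempts:
            # Wait before retrying, e.g. to let a running compaction finish.
            sleep(delay_seconds)
    raise RuntimeError(f"backup did not succeed after {attempts} attempts")
```

Raising on final failure, rather than returning quietly, means a cron job or scheduler can treat a missed nightly backup as an error instead of discovering it after the next power hit.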
Another example is planning to scale the database. If you exceed 1.1TB of data, you have to set the maxLevels in Badger to 8. We took an outage to figure this out. Why isn’t this handled automatically without the database breaking?
Ratel is our view into the operations of the database, but I can’t view all the tablets within a Group in Ratel. Is there another way to view all the tablets? Sure, but it’s not as easy for an end-user as using Ratel.
While these are our experiences, I look forward to Hypermode identifying their own list of improvements to the core operation and day-to-day running of Dgraph and improving it for the community.
I recommend using TrueNAS Scale for its ZFS integration, which is the best for this type of file system. My advice is to create an iSCSI disk (or multiple iSCSI disks, one for each Dgraph instance) to leverage ZFS features like snapshots, backups, and many more. However, I’m unsure if this approach fits your scale. But as long as backup and recovery options aren’t open-source, ZFS stands as the best choice. It’s particularly effective in scenarios like power loss where you only need to roll back to a previous snapshot to recover your instances.
I should also mention that iSCSI can be used on any operating system, including connections through Docker or Kubernetes. It’s a bit more complex, but I use it regularly and have recommended it to some clients (not for Dgraph specifically, but for similar purposes like backups and snapshots).
Additionally, with ZFS, you have the flexibility to expand and migrate pools, among numerous other functions. I strongly recommend studying it for a deeper understanding of that file system.
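As a rough illustration of the snapshot-and-rollback approach described above, the sketch below only builds the `zfs snapshot` and `zfs rollback` command lines; it does not run them. The dataset name `tank/dgraph-alpha1` and the `dgraph-` snapshot prefix are hypothetical, and a snapshot taken while the database is running should be assumed crash-consistent at best.

```python
from datetime import datetime, timezone
from typing import List, Optional


def snapshot_command(dataset: str, now: Optional[datetime] = None) -> List[str]:
    """Build a `zfs snapshot` command with a UTC-timestamped snapshot name."""
    now = now or datetime.now(timezone.utc)
    name = now.strftime("dgraph-%Y%m%d-%H%M%S")  # hypothetical naming scheme
    return ["zfs", "snapshot", f"{dataset}@{name}"]


def rollback_command(dataset: str, snapshot_name: str) -> List[str]:
    """Build a `zfs rollback` command.

    Without extra flags, zfs rollback only accepts the most recent snapshot;
    rolling back further requires -r, which destroys the later snapshots.
    """
    return ["zfs", "rollback", f"{dataset}@{snapshot_name}"]


if __name__ == "__main__":
    # Hypothetical dataset backing one Dgraph instance's iSCSI disk.
    print(" ".join(snapshot_command("tank/dgraph-alpha1")))
```

The resulting lists can be passed to `subprocess.run(..., check=True)` on the storage host once the dataset name matches your pool layout.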