Fraud analysis best practices & pitfalls

Hi, general query: I’m on a team considering moving from a graph stored as edges in a SQL db to a real graph db, and I think a lot of us here are in the fraud space. Are there shared best practices for designing a graph db to combat fraud? (I.e., I’m interested in things like what others have posted: how many users are on a shared device, which users are connected to devices connected to bad user X, etc.) We can do a lot of those things on our SQL-based db, but we’re hoping to do them more efficiently and more powerfully using Dgraph…

Thanks for any thoughts. Right now we are just at the start of the design process, so overall best practices/pitfalls would be super helpful.

Hey Gvelez,

I don’t think other people would share that, perhaps for security reasons.

It would be nice to have material like that available to the community. I’ve thought about creating something myself, but I don’t have any experience in the field of fraud detection.

It would be good for someone in the market, who lives with these problems day to day, to share their experience. There are models for this out there on the internet. The same logic works if applied in Dgraph, but it needs to be “transported” to the Dgraph way of doing things. Any graph db can perform the kinds of tasks that other graph dbs can, unless it has a completely different paradigm.

In general, it is a matter of identifying patterns, having a solid data model, and making adjustments. And graph dbs are a very good way to see patterns.

Dgraph now has GraphQL, so I believe some tasks related to fraud detection could use GraphQL subscriptions to get a “live” response.

Cheers.

There are folks using Dgraph for fraud detection and identity systems.

The main idea is that you can identify whether the same users are connected (or not) to related nodes in the graph. You can identify these users or devices by some ID and start to understand the relationships via queries in Dgraph. Like you wrote, you can use queries to figure out which users connected to certain devices and, furthermore, what else those users/devices accessed.
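To make those two questions concrete (how many users share a device, and which users are linked to a known bad user through a device), here is a minimal sketch in plain Python over an in-memory edge list. It is not Dgraph code and not anyone's production approach; the `LOGINS` data, the user and device names, and the helper functions are all hypothetical and only illustrate the traversal logic a graph query would express.

```python
from collections import defaultdict

# Hypothetical (user, device) login edges; every name here is made up.
LOGINS = [
    ("alice", "dev1"),
    ("bob", "dev1"),
    ("mallory", "dev1"),
    ("mallory", "dev2"),
    ("carol", "dev2"),
    ("dave", "dev3"),
]


def build_indexes(logins):
    """Index the edges in both directions so either side can be traversed."""
    users_by_device = defaultdict(set)
    devices_by_user = defaultdict(set)
    for user, device in logins:
        users_by_device[device].add(user)
        devices_by_user[user].add(device)
    return users_by_device, devices_by_user


def shared_devices(users_by_device, min_users=2):
    """Return {device: user_count} for devices used by at least min_users users."""
    return {
        device: len(users)
        for device, users in users_by_device.items()
        if len(users) >= min_users
    }


def users_linked_to(bad_user, users_by_device, devices_by_user):
    """Return users that share at least one device with bad_user (two hops)."""
    linked = set()
    for device in devices_by_user.get(bad_user, ()):
        linked |= users_by_device[device]
    linked.discard(bad_user)  # the bad user is not "linked" to themselves
    return linked


if __name__ == "__main__":
    by_device, by_user = build_indexes(LOGINS)
    print(shared_devices(by_device))
    print(sorted(users_linked_to("mallory", by_device, by_user)))
```

In a graph database the same two-hop pattern (user to device, device back to users) would be written as a nested traversal rather than hand-built indexes, which is where the efficiency gain over edge tables in SQL is expected to come from.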

Grapl uses Dgraph for their platform for intrusion detection. While not necessarily the same as fraud detection, I imagine there are similarities that may be interesting for you.


Hey, author of Grapl here. Fraud and abuse use cases were some of the inspiration for Grapl, which is more focused on detecting malware.

In an event-based detection system you’re often looking at single events in isolation. When building signatures, this leads to focusing on very specific properties of the events. This is problematic for a number of reasons: it lacks context (higher false-positive rate) and tends to focus on more attacker-controlled behavior (easier to bypass).

Graphs give a more behavioral view of your data. You can see how your events connect together, the relationships between the entities that the events refer to, etc. This gives much more context to your detection logic, decreasing false positives, and allows you to avoid hyper-specific properties, making your logic more resilient.

I’m not sure if you have any investigation use cases, but this is another area where graphs truly shine. You can trivially pull in context by just expanding the graph outwards, which is an incredible capability.
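As a rough illustration of "expanding the graph outwards" during an investigation, the sketch below does a bounded breadth-first expansion from a starting node and records how many hops away each discovered node is. It is plain Python rather than Grapl or Dgraph code, and the `EDGES` data, node names, and function names are assumptions made up for the example.

```python
from collections import defaultdict, deque

# Hypothetical relationships between entities; every name here is made up.
EDGES = [
    ("user:mallory", "device:dev2"),
    ("device:dev2", "user:carol"),
    ("user:carol", "card:1234"),
    ("card:1234", "user:erin"),
]


def build_adjacency(edges):
    """Treat each edge as undirected so context can be pulled in both ways."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def expand(adjacency, start, max_hops):
    """Breadth-first expansion; returns {node: hop distance from start}."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


if __name__ == "__main__":
    graph = build_adjacency(EDGES)
    for node, hops in sorted(expand(graph, "user:mallory", 2).items()):
        print(hops, node)
```

Raising `max_hops` widens the context an analyst sees, at the cost of pulling in more loosely related nodes, so the hop limit is a judgment call for each investigation.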

I’m not an expert in fraud and abuse by any means, and I’m not sure if you’re referring to financial fraud or some other type, so I’ll avoid making specific detection recommendations, but there are quite a few papers on applying graph analytics to the space.


Thanks, this is helpful! We have some experience with a fairly large homegrown solution that has been effective for us, so we just want to compare it to what people are doing out there in more standard ways. Will definitely take a look at Grapl!
