Given a set of nodes, how to optimally deduplicate the set of nodes connected to them with a given edge?

Hi, I am fairly new to Dgraph and just wanted to confirm whether I am going about this in the best way.

I am trying to deduplicate seasonal fans across multiple teams. For each team in a set, I want to know how many of its fans are not fans of any other team in that set, along with the total number of distinct fans across the whole set. So far response times have been slower than I would like, and I am not sure whether the cause is how I am modeling the data or something else.
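To make the goal concrete, here is a minimal Python sketch of the set arithmetic I am after. The fan IDs and the `unique_fans` helper are made up purely for illustration and have nothing to do with how the data is stored in Dgraph.

```python
# Made-up fan IDs, purely for illustration.
fans = {
    "team-0": {"alice", "bob", "carol"},
    "team-1": {"bob", "dave"},
}


def unique_fans(fans_by_team, team):
    """Return the fans of `team` who are not fans of any other team in the set."""
    others = set().union(*(f for t, f in fans_by_team.items() if t != team))
    return fans_by_team[team] - others


# Count of fans exclusive to each team.
unique_counts = {team: len(unique_fans(fans, team)) for team in fans}

# Count of distinct fans across all teams.
total = len(set().union(*fans.values()))

print(unique_counts)  # {'team-0': 2, 'team-1': 1}
print(total)          # 4
```

Here "bob" follows both teams, so he counts toward the total once and toward neither team's unique count.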

The first data model I used is below (the A and B correspond to different data sources):

name: string @index(term) .

fan_s2020_A: [uid] @reverse .
fan_s2020_B: [uid] @reverse .
fan_s2019_A: [uid] @reverse .
fan_s2019_B: [uid] @reverse .

type Person {
    fan_s2020_A
    fan_s2020_B
    fan_s2019_A
    fan_s2019_B
}

type Team {
    name
}

Example query deduplicating fans across two teams:

{
    # Fans of first team for 2020 season
    var(func: eq(name, "team-0")) {
        ~fan_s2020_A {
            fan_0_A as uid
        }
        ~fan_s2020_B {
            fan_0_B as uid
        }
    }

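    # Merge both data sources into one set of fans for the first team
    var(func: uid(fan_0_A, fan_0_B)) {
        fans_0 as uid
    }

    # Fans of second team for 2020 season
    var(func: eq(name, "team-1")) {
        ~fan_s2020_A {
            fan_1_A as uid
        }
        ~fan_s2020_B {
            fan_1_B as uid
        }
    }

    # Merge both data sources into one set of fans for the second team
    var(func: uid(fan_1_A, fan_1_B)) {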
        fans_1 as uid
    }

    # Unique fan counts of each team
    unique_0_fan(func: uid(fans_0)) @filter(NOT uid(fans_1)) {
        count(uid)
    }
    unique_1_fan(func: uid(fans_1)) @filter(NOT uid(fans_0)) {
        count(uid)
    }

    # Total fan count
    union(func: uid(fans_1, fans_0)) {
        count(uid)
    }
}

To compare more teams, I would add a var block for each additional team and add its variable to the NOT filter of every other team's unique-count block (e.g. NOT uid(fans_1, fans_2, ...) for team 0), as well as to the uid(...) list of the total count block.
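As a rough sketch of what that looks like for N teams, this is how I would generate the query text for the first data model. `build_unique_fan_query` is a hypothetical helper of my own, not part of any Dgraph client, and it assumes predicates are named fan_<season>_<source> as in the schema above.

```python
def build_unique_fan_query(team_names, season="s2020", sources=("A", "B")):
    """Build the query text for per-team unique fan counts plus the total.

    Hypothetical helper that mirrors the hand-written two-team query.
    Assumes team names contain no double quotes.
    """
    if len(team_names) < 2:
        raise ValueError("need at least two teams to compare")

    lines = ["{"]
    for i, team in enumerate(team_names):
        # Collect this team's fans from each data source.
        lines.append(f'    var(func: eq(name, "{team}")) {{')
        for src in sources:
            lines.append(f"        ~fan_{season}_{src} {{")
            lines.append(f"            fan_{i}_{src} as uid")
            lines.append("        }")
        lines.append("    }")
        # Merge the per-source variables into one variable per team.
        merged = ", ".join(f"fan_{i}_{src}" for src in sources)
        lines.append(f"    var(func: uid({merged})) {{")
        lines.append(f"        fans_{i} as uid")
        lines.append("    }")

    all_vars = [f"fans_{i}" for i in range(len(team_names))]
    for i, own in enumerate(all_vars):
        # Fans of team i that are fans of no other team in the set.
        others = ", ".join(v for v in all_vars if v != own)
        lines.append(
            f"    unique_{i}_fan(func: uid({own})) @filter(NOT uid({others})) {{"
        )
        lines.append("        count(uid)")
        lines.append("    }")

    # Total distinct fans across all teams.
    joined = ", ".join(all_vars)
    lines.append(f"    union(func: uid({joined})) {{")
    lines.append("        count(uid)")
    lines.append("    }")
    lines.append("}")
    return "\n".join(lines)


print(build_unique_fan_query(["team-0", "team-1", "team-2"]))
```

The generated query grows by two var blocks and one count block per team, and each NOT filter lists every other team's variable, so the filter lists grow with the team count as well.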

I have also tried modelling this data similarly to what is described in this comment in another thread, where I used data_provider as the facet instead of origin. The schema and an example query for that model are below:

name: string @index(term) .

fan_s2020: [uid] @reverse .
fan_s2019: [uid] @reverse .

relates_to: [uid] @reverse .

type Person {
    fan_s2020
    fan_s2019
}

type Queue {
    relates_to
}

type Team {
    name
}

{
    # Fans of first team
    var(func: eq(name, "team-0")) {
        ~relates_to {
            ~fan_s2020 {
                fans_0 as uid
            }
        }
    }
    
    # Fans of second team
    var(func: eq(name, "team-1")) {
        ~relates_to {
            ~fan_s2020 {
                fans_1 as uid
            }
        }
    }
    
    # Unique fan counts of each team
    unique_0_fan(func: uid(fans_0)) @filter(NOT uid(fans_1)) {
        count(uid)
    }
    unique_1_fan(func: uid(fans_1)) @filter(NOT uid(fans_0)) {
        count(uid)
    }

    # Total fan count
    union(func: uid(fans_1, fans_0)) {
        count(uid)
    }
}

I found that the first data model performed slightly better than the second for a small number of teams, but as the team count grew, the two performed about the same.

Am I going about this the right way, or should one of these data models outperform the other when the team count is large? Or are my queries and data models simply not optimized for this? Any feedback would be greatly appreciated.