Many to Many relationship in Firebase
Asked Answered
J

3

46

I have a Firebase database. I have Companies and Contractors. A Contractor can work for more than one Company and a Company can have multiple Contractors. This is a straightforward many to many relationship. I want to be able to answer the questions about Companies and Contractors:

  1. Given a Company, who are the current Contractors.
  2. Given a Contractor what Companies are they working for.

What are the alternatives for structuring the data within firebase?

J answered 7/1, 2017 at 22:1 Comment(0)
E
39

The self-answer is indeed one way of modeling this. It's probably the most direct equivalent of how you'd model this in a relational database:

  • contractors
  • companies
  • companyAndContractorsAssignment (the many-to-many connector table)

An alternative would be to use 4 top-level nodes:

  • contractors
  • companies
  • companyContractors
  • contractorCompanies

The last two nodes would look like:

companyContractors
    companyKey1
        contractorKey1: true
        contractorKey3: true
    companyKey2
        contractorKey2: true
contractorCompanies
    contractorKey1
        companyKey1: true
    contractorKey2
        companyKey2: true
    contractorKey3
        companyKey1: true

This bidirectional structure allows you to both look up "contractors for a company" and "companies for a contractor", without either of these needing to be a query. This is bound to be faster, especially as you add contractors and companies.

Whether this is necessary for your app, depends on the use-cases you need, the data sizes you expect and much more.

Recommended reading NoSQL data modeling and viewing Firebase for SQL developers. This question was also featured in an episode of the #AskFirebase youtube series.

Update (2017016)

Somebody posted a follow-up question that links here about retrieving the actual items from the "contractors" and "companies" nodes. You will need to retrieve those one at a time, since Firebase doesn't have an equivalent to SELECT * FROM table WHERE id IN (1,2,3). But this operation is not as slow as you may think, because the requests are pipelined over a single connection. Read more about that here: Speed up fetching posts for my social network app by using query instead of observing a single event repeatedly.

Electroencephalograph answered 8/1, 2017 at 2:39 Comment(3)
Hey frank, what if I store the contractors keys into companies and the companies keys into the contractors? creating new tables is for good for performance or just for view?Nicodemus
Is there any way to validate these values using the .validation or .write rules....i.e., a contractor should not be able to contain a companyKey that doesn't exist.Maebashi
@AdirZoari: did you find solution to your problem??Infrastructure
J
9

After further research, I'm going to try and answer my own question. I have reviewed a number of other posts and one solution to the many-to-many problem is to store a list of ContractorKeys within a Company object and store a list of CompanyKeys within each contractor object. This is illustrated with an example below.

companies : {
  companyKey1 : {
    name : company1
    ...
    contractors : {
      contractorKey1 : true,   
      contractorKey3 : true
    }
  }
  companyKey2 : {
    name : company2
    ...
    contractors : {
      contractorKey2 : true,  
    } 
  }
}
contrators : {
  contractorKey1 : {
     name : bill
     ...
     companies : {
        companyKey1 : true
     }
   }
  contractorKey2 : {
     name : steve
     ...
     companies : {
        companyKey1 : true
     }

   }
  contractorKey3 : {
     name : jim
     ...
     companies : {
        companyKey2 : true
     }
   }
}

This organization "works" in the sense that the aforementioned questions can be answered. But a downside of this solution is that there are two lists to maintain when the Contractor/Company assignments change. It would be better if there was a way to represent this information in a single list.

I think I have come up with a better solution. The solution is to create a third list, in addition to companies and contractors called companyAndContractorAssignment. The elements of this list will represent a relationship between a single contractor and company. Its contents will be a pair of fields, the contractorKey and the companyKey. We can then eliminate the contractors list within company and the companies list within contractor. This alternative structure is represented below. Notice there is no contractor list within a company object and no companies list with a contractor object.

companies : {
  companyKey1 : {
    name : company1
    ...
  }
  companyKey2 : {
    name : company2
    ...
  }
}
contrators : {
  contractorKey1 : {
     name : bill
     ...
  }
  contractorKey2 : {
     name : steve
     ...
  }
  contractorKey3 : {
     name : jim
     ...
  }
}
companyAndContractorsAssignment : {
  key1 : {
    contractorKey1 : true,
    companyKey1: true,
  }
  key2 : {
    contractorKey3 : true,
    companyKey1: true,
  }
  key3 : {
    contractorKey2 : true,
    companyKey2: true,
  }

This alternative structure allows one to answer the questions using an orderByChild/equalTo query on companyAndContractorsAssignment to find either all companies for a contractor or all contractors for a company. And there is now only a single list to maintain. I think this is the best solution for my requirements.

J answered 7/1, 2017 at 22:15 Comment(3)
I think if you go with this you could have performance issues on indexing. Firebase will ask you to perform indexOn on your orderByChild, but since those are dynamic keys it could turn into a nightmare to maintain.Kayceekaye
Have you seen any performance issues? I like the assignment model because there is a single list to maintain. I'm afraid that it will get hard to maintain companyContractors and contractorCompanies.Tralee
The 4 node solution is superior I think. Not just for performance, but I'm sure at scale it performs better. My application doesn't have very many contractors. I have plans to migrate to the 4 node solution...J
E
0

In the solution code above why store contractorKey1/companyKey1 ": true" ? Why store the Boolean against the key? Surely you just have:-

contractorKey : "<docRef/ID for contractor>"

companyKey : "<docRef/ID for company>"

Also... What is the latest consensus best way to do model many to many relationships in a document db? Is the 4 node solution or the 'link table" (3 node) solution?

Enkindle answered 27/11, 2023 at 11:43 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.