MongoDB and "joins" [duplicate]
M

7

170

I'm sure MongoDB doesn't officially support "joins". What does this mean?

Does this mean "We cannot connect two collections (tables) together"?

If we store the _id value of a document in collection A in an other_id field of a document in collection B, can't we simply connect the two collections?

If my understanding is correct, MongoDB can connect two collections together, say, when we run a query. This is done with a "Reference", as described in http://www.mongodb.org/display/DOCS/Schema+Design.

Then what does "joins" really mean?

I'd love to know the answer because this is essential to learning MongoDB schema design. http://www.mongodb.org/display/DOCS/Schema+Design

Mythological answered 1/11, 2010 at 7:19 Comment(3)
Finally, they added joins in MongoDB 3.2: mongodb.com/blog/post/thoughts-on-new-feature-lookup Burris
Correct, $lookup was introduced in MongoDB 3.2. Details can be found at docs.mongodb.org/master/reference/operator/aggregation/lookup/… Rodin
Joins example: fourthbottle.com/2016/07/joins-in-mongodb.html Triforium
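
For readers on MongoDB 3.2 or later, here is a rough sketch of what the $lookup stage mentioned in the comments above looks like. The collection and field names (orders, customers, cust_id, customer) are assumptions chosen for illustration, and the pipeline is built as plain data so it can be inspected without a running server.

```javascript
// Hypothetical names: an "orders" collection whose documents carry a cust_id
// that matches _id in a "customers" collection.
const pipeline = [
  {
    $lookup: {
      from: "customers",      // collection to join with
      localField: "cust_id",  // field in the input (orders) documents
      foreignField: "_id",    // field in the "from" collection
      as: "customer",         // array field added to each output document
    },
  },
];

// In the mongo shell this would be run server-side as:
//   db.orders.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

Each output document would be an order with an extra customer array holding the matching customer documents, which is roughly a left outer join.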
F
106

It's not a join, because the relationship is only evaluated when needed. A join (in a SQL database), on the other hand, resolves the relationships and returns the result as if it were a single table (you "join two tables into one").

You can read more about DBRef here: http://docs.mongodb.org/manual/applications/database-references/

There are two possible solutions for resolving references. One is to do it manually, as you have almost described. Just save a document's _id in another document's other_id, then write your own function to resolve the relationship. The other solution is to use DBRefs as described on the manual page above, which will make MongoDB resolve the relationship client-side on demand. Which solution you choose does not matter so much because both methods will resolve the relationship client-side (note that a SQL database resolves joins on the server-side).
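
A minimal sketch of the manual approach, assuming plain in-memory arrays as stand-ins for the two collections (the data and the helper names findById and resolveOther are made up for illustration; in a real application each findById call would be a query sent to the server):

```javascript
// In-memory stand-ins for two collections (hypothetical data).
const collectionA = [
  { _id: 1, name: "first" },
  { _id: 2, name: "second" },
];
const collectionB = [
  { _id: 10, other_id: 1, note: "points at first" },
  { _id: 11, other_id: 2, note: "points at second" },
];

// Stand-in for a find-one-by-_id query; in a real application each call
// would be one round trip to the server.
function findById(collection, id) {
  return collection.find((doc) => doc._id === id) || null;
}

// Resolve the reference only when the caller asks for it.
function resolveOther(docB) {
  return findById(collectionA, docB.other_id);
}

const b = findById(collectionB, 10);
const a = resolveOther(b);
console.log(a.name); // prints "first"
```

The point is that the two lookups are separate steps driven by the application; nothing is combined on the server.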

Firewarden answered 1/11, 2010 at 7:25 Comment(6)
Well, this is a correct answer, but it raises the question of why MongoDB does not support this server-side. Is it just to discourage its use and encourage denormalization? Denormalizing is sometimes just too inefficient on resources. Why force the client/server turnarounds when this is the case? Brelje
@groovydotcom to understand this you have to understand the motivation of both relational databases and NoSQL. NoSQL is optimized for massive amounts of read and write operations. MASSIVE. Client machines and application servers are faster than they were years ago. The theory is to offload the expensive join operations to the application servers and client machines, so that the db servers can simply crunch the simple read and write operations as fast as possible. Lali
What this means though, in the worst-case scenario, is that the client needs both complete collections, which could be massive, right? For example, if you have a collection of all the user profiles and you wanted to store only user _ids against content... ? Imminent
What do you mean by "client side" and "server side"? Stunt
@scaryguy, "Server side" would be in the MongoDB application while "client side" is your application code. Your application connects as a client to the MongoDB server. Do not confuse this with machines; both processes might run on the same machine. Client and server are just describing two different roles in inter-process communication. Libreville
I got it. Thanks for the explanation @EmilVikström :) Stunt
P
32

The database does not do joins -- or automatic "linking" between documents. However you can do it yourself client side. If you need to do 2, that is ok, but if you had to do 2000, the number of client/server turnarounds would make the operation slow.
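
To make the turnaround cost concrete, here is a small sketch that counts round trips against an in-memory stand-in for a collection. The query function, the profiles data, and the counter are all hypothetical; the batched call is only analogous to a single find with $in.

```javascript
// Hypothetical in-memory "server": every call to query() counts as one round trip.
let roundTrips = 0;
const profiles = Array.from({ length: 2000 }, (_, i) => ({ _id: i, name: "user" + i }));

// Stand-in for a query sent to the server; the predicate plays the role of
// the query document.
function query(collection, predicate) {
  roundTrips += 1;
  return collection.filter(predicate);
}

const wantedIds = profiles.map((p) => p._id);

// One query per reference: one round trip for each id.
roundTrips = 0;
const oneByOne = wantedIds.map((id) => query(profiles, (p) => p._id === id)[0]);
const tripsOneByOne = roundTrips;

// One batched query, analogous to { _id: { $in: wantedIds } }: a single round trip.
roundTrips = 0;
const wanted = new Set(wantedIds);
const batched = query(profiles, (p) => wanted.has(p._id));
const tripsBatched = roundTrips;

console.log(tripsOneByOne, tripsBatched); // prints "2000 1"
```

Both approaches return the same documents; only the number of turnarounds differs.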

In MongoDB a common pattern is embedding. In a relational database, normalizing breaks things into parts. In MongoDB these pieces often end up in a single document, so no join is needed anyway. But when one is needed, one does it client-side.

Consider the classic ORDER, ORDER-LINEITEM example. One order and 8 line items are 9 rows in relational; in MongoDB we would typically just model this as a single BSON document which is an order with an array of embedded line items. So in that case, the join issue does not arise. However the order would have a CUSTOMER which probably is a separate collection - the client could read the cust_id from the order document, and then go fetch it as needed separately.
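
As a rough illustration of that shape, assuming in-memory arrays in place of the orders and customers collections (the data, the field names other than cust_id, and the helper functions are made up for this sketch):

```javascript
// Hypothetical data: in-memory stand-ins for the orders and customers collections.
const customers = [{ _id: "c1", name: "Ada" }];
const orders = [
  {
    _id: "o1",
    cust_id: "c1",
    lineitems: [
      { sku: "ABC001", qty: 2 },
      { sku: "XYZ999", qty: 1 },
    ],
  },
  { _id: "o2", cust_id: "c1", lineitems: [{ sku: "XYZ999", qty: 5 }] },
];

// Line items come back with the order, so no join is needed to read them.
// Finding orders by SKU is just a filter on the embedded array.
function ordersWithSku(sku) {
  return orders.filter((o) => o.lineitems.some((li) => li.sku === sku));
}

// The customer lives in a separate collection, so it takes a second lookup.
function customerFor(order) {
  return customers.find((c) => c._id === order.cust_id) || null;
}

const matches = ordersWithSku("ABC001");
console.log(matches.map((o) => o._id).join(",")); // prints "o1"
console.log(customerFor(matches[0]).name); // prints "Ada"
```

The order and its line items are read in one step; only the customer needs the extra fetch.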

There are some videos and slides for schema design talks on the mongodb.org web site, I believe.

Pastoralist answered 1/11, 2010 at 21:47 Comment(2)
But take this scenario: suppose an order has 8 line items, and each line item contains a SKU code. How do I retrieve the orders that contain a specific SKU? For example, "I want all orders which have SKU code 'ABC001'". Canada
@VijaySali you search for it... db.collection.find({ "orders.lineitems.SKU" : "ABC001"}); You'll get back all the orders with that SKU, assuming the embedded line item has a SKU field. It all depends on the schema. Lali
W
12

One way to do a join-like query in MongoDB is to query one collection for the documents that match, put their ids in a list (idlist), and then run a find on the other (or the same) collection with $in: idlist.

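// Step 1: find the matching documents in the first collection
// ("something" is a placeholder for the value you are matching on).
var u = db.friends.find({ "friends": something }).toArray();
// Step 2: collect their ids into a list.
var idlist = [];
u.forEach(function (myDoc) { idlist.push(myDoc.id); });
// Step 3: fetch the documents in the second collection whose id is in the list.
db.family.find({ "id": { $in: idlist } });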
Watercress answered 19/10, 2014 at 2:58 Comment(1)
It's better like this: u = db.friends.find({"friends": something}, {_id: 0, id: 1}).toArray().map(function (d) { return d.id; }); db.family.find({"id": {$in: u}}) Obsess
M
6

The first example you link to shows how MongoDB references behave much like lazy loading, not like a join. There isn't a query there that runs against both collections; rather, you query one and then look up items from the other collection by reference.

Midstream answered 1/11, 2010 at 7:27 Comment(0)
S
6

The fact that MongoDB is not relational has led some people to consider it useless. I think that you should know what you are doing before designing a DB. If you choose to use a NoSQL DB such as MongoDB, you had better implement a schema. This will make your collections, more or less, resemble tables in SQL databases. Also, avoid denormalization (embedding) unless it is necessary for efficiency reasons.

If you want to design your own NoSQL database, I suggest having a look at the Firebase documentation. If you understand how they organize the data for their service, you can easily design a similar pattern for yours.

As others pointed out, you will have to do the joins client-side, except with Meteor (a JavaScript framework), where you can do your joins server-side with this package (I don't know of any other framework which enables you to do so). However, I suggest you read this article before deciding to go with this choice.

Edit 28.04.17: Firebase recently published this excellent series on designing NoSQL databases. In one of the episodes they also highlight the reasons to avoid joins and how to get around such scenarios by denormalizing your database.

Sickening answered 16/6, 2015 at 20:20 Comment(0)
K
1

If you use Mongoose, you can just use the following (assuming you're using subdocuments and population):

Profile.findById profileId
  .select 'friends'
  .exec (err, profile) ->
    if err or not profile
      handleError err, profile, res
    else
      Status.find { profile: { $in: profile.friends } }, (err, statuses) ->
        if err
          handleError err, statuses, res
        else
          res.json createJSON statuses

It retrieves the Statuses that belong to any of the friends of the Profile identified by profileId. friends is an array of references to other Profiles. The Profile schema, with friends defined:

schema = new mongoose.Schema
  # ...

  friends: [
    type: mongoose.Schema.Types.ObjectId
    ref: 'Profile'
    unique: true
    index: true
  ]
Kerstin answered 2/1, 2015 at 20:42 Comment(3)
This question was not tagged "CoffeeScript". If you'd like to give an example, IMHO, the example should have been in JavaScript. CoffeeScript is not JavaScript. Stunt
It wasn't tagged JavaScript either. Kerstin
JavaScript is MongoDB's official shell query language... Stunt
M
0

I came across a lot of posts while searching for the same thing: "MongoDB joins" and their alternatives or equivalents. So my answer should help others like me. This is the answer I would have been looking for.

I am using Mongoose with the Express framework. There is a feature called Population that is used in place of joins.

As mentioned in the Mongoose docs:

There are no joins in MongoDB but sometimes we still want references to documents in other collections. This is where population comes in.

This StackOverflow answer shows a simple example on how to use it.

Misrepresent answered 11/5, 2016 at 12:48 Comment(0)
