Facebook "like" data structure
Asked Answered
A

4

28

I've been wondering how facebook manages the database design for all the different things that you can "like". If there is only one thing to like, this is simple, just a foreign key to what you like and a foreign key to who you are.

But there must be hundreds of different tables that you can "like" on facebook. How do they store the likes?

Augustineaugustinian answered 20/9, 2011 at 15:9 Comment(0)
H
26

If you want to represent this sort of structure in a relational database, then you need to use a hierarchy normally referred to as table inheritance. In table inheritance, you have a single table that defines a parent type, then child tables whose primary keys are also foreign keys back to the parent.

Using the Facebook example, you might have something like this:

User
------------
UserId (PK)

Item
-------------
ItemId (PK)
ItemType (discriminator column)
OwnerId (FK to User)

Status
------------
ItemId (PK, FK to Item)
StatusText 

RelationshipUpdate
------------------
ItemId (PK, FK to Item)
RelationshipStatus
RelationTo (FK to User)

Like
------------
OwnerId (FK to User)
ItemId (FK to Item)
Compound PK of OwnerId, ItemId

In the interest completeness, it's worth noting that Facebook doesn't use an RDBMS for this sort of thing. They have opted for a NoSQL solution for this sort of storage. However, this is one way of storing such loosely-coupled information within an RDBMS.

Hopfinger answered 20/9, 2011 at 17:36 Comment(4)
That could be a solution, I think the problem is that "everyhing" must be an "Item" because what happens if you have a table that is not an Item and someday you want a like for that too?. I think that sometimes the simpler is the better, why not make the opposite inheritance? like is the parent and you have a like_for_status table with a FK to status, and like_for_photo, etc. you can extend it easily to any table, and your queries are faster too.Loaves
@Yuck: Yes, TPT (rather than Table-Per-Hierarchy), though TPT and TPH are, as far as I know, part of the Entity Framework lexicon rather than being more generically SQL.Hopfinger
@Loaves Yes, that's certainly a consideration. I am a little dubious on the "queries are faster too" claim, though.Hopfinger
@Adam Robinson, about performance, yes you are right, is not a big difference and not a reason for choosing one method or another. Even more, depending the case the query in your solution can be faster, because you don't need a JOIN if you know the ItemId.Loaves
D
2

Facebook does not have traditional foreign keys and such, as they don't use relational databases for most of their data storage. Simply, they don't cut it for that.

However they use several NoSQL type data stores. The "Like" is most likely attributed based on a service, probably setup in an SOA style manner throughout their infrastructure. This way the "Like" can basically be attributed to anything they want it to be associated with. All this, with vast scalability and no tightly coupled relational issues to deal with. Something that Facebook, can't really afford to deal with at the volume they operate.

They could also be using an AOP (Aspect Oriented Programming) style processing mechanism to "attach" a "Like" to anything that may need one at page rendering time, but I get the notion that it is asynchronous processing via JavaScript against an SOA style web service or other delivery mechanism.

Either way, I'd love to hear how they have this setup from an architecture perspective myself. Considering their volume, even the simple "Like" button becomes a significant implementation of technology.

Driest answered 20/9, 2011 at 15:29 Comment(3)
-1. "They don't cut it for that" is a matter of opinion and much speculation. The only portion of this answer that actually addresses the question (how such things could be stored) is your second paragraph.Hopfinger
+1 @adam, simple technological fact, no opinion involved. RDBMS are designed for a different usage model.Dulcedulcea
Like @StephanEggermont states Adam they are for a different model, a different purpose, Facebook needed more. I'm not speculating and the general database community, and scientific community, agrees. That's the reason the other solutions exist. #justsayin As for your assertion above, the keys are not aligned in that way. It's a way that works for an RDBMS, but the RDBMS couldn't supply or handle the data Facebook deals with. Facebook didn't try and drop RDBMSs just because they wanted to write something else.Driest
C
0

You can have a table with Id, ForeignId and Type. Type can be anything like Photo, Status, Event, etc… ForeignId would be the id of the record in the table Type. This makes possible for both comments and likes. You only need one table for all likes, one for all comments and the one I described.

Example:

Items
Id  | Foreign Id  | Type
----+-------------+--------
  1 |         322 | Photo
  4 |         346 | Status

Likes
Id  | User Id     | Item Id
----+-------------+--------
  1 |         111 | 1

Here, user with Id 111 likes the photo with Id 322.


Note: I assume you are using an RDBMS, but see Adron's answer. Facebook does not use an RDBMS for most of their data.

Choline answered 20/9, 2011 at 15:23 Comment(4)
But then you can't use constraints in the "Foreign Id"Loaves
@Loaves Can you elaborate? There are certainly restrictions as to what can and cannot be enforced in a table inheritance pattern using RI constraints only, but it's not clear what you're talking about.Hopfinger
@Adam Robinson The "Foreign_Id" column in the "Items" table is not a real FK, because you can't point it to any table, because it points to many tables at really (depending on the "Type" column), so you can't put a FK (and hence a constraint) there. That could make your data inconsistent.Loaves
@Loaves I see that now...I misread his answer, as it doesn't actually represent a table inheritance pattern. Table inheritance would mean that your Photo or Status table would have its primary key also be a foreign key back to Item.Hopfinger
B
-4

I'm pretty sure Facebook does not store "like" information as how some other suggested it using RDBMS. With millions of users and possibly thousands of like, we're looking at thousands of rows to join here which would impact performance.

The best approach here is to append all "likes" in a single row. For example, a table with user_like_id column of text datatype. Then all id's who liked the post is appended. In this case, you only query one row and you got everything. This will be a lot faster than joining tables and getting counts.

EDIT: I haven't been here on this site lately and I just discovered this answer has been downvoted. Well, here's an example post with like count and their avatars. This is my design where I just implemented what I'm talking about.

The two components here are 1.) XREF table and 2.) JSON object.

The likes are still stored on a XREF table. But at the same time, data is appended on JSON object and stored on a text column on the post table.

Why did I store the likes info on a text column as JSON? So that there's no need to do db lookup/joins for the likes. If someone unlike the post, the JSON object is just updated.

Now I don't know why this answer is downvoted by some users here. This answer provides quick data retrieval. This is close to NoSQL approach which is how FB access data. In this case, there's no need for extra joins/lookup to get likes info.

And here's the table that holds the likes. It's just a simple XREF mapping between user and item table.

enter image description here

Brown answered 5/7, 2013 at 20:40 Comment(3)
then how to you know 'how many people liked this'? querying all rows in user table?Chenab
forgot to comma penalty alsoSmog
@Chenab I edited my answer. Basically, you still have a transaction table that stores the likes. But in addition to that, there will be a text column that will hold a JSON object on that post table. That will hold your like information for each post. This is to avoid db lookup.Brown

© 2022 - 2024 — McMap. All rights reserved.