I am working on the design of a database that will be used to store data that originates from a number of different sources. The instances I am storing are assigned unique IDs by the original sources. Each instance I store should contain information about the source it came from, along with the ID it was associated by this source.
As an example, consider the following table that illustrates the problem:
----------------------------------------------------------------
| source_id | id_on_source | data |
----------------------------------------------------------------
| 1 | 17600 | ... |
| 1 | 17601 | ... |
| 2 | 1 | ... |
| 3 | 1 | ... |
----------------------------------------------------------------
Note that while the id_on_source
is unique for each source, it is possible for the same id_on_source
to be found for different sources.
I have a decent understanding of relational databases, but am far from an expert or even an experienced user. The problem I face with this design is what I should use as primary key. The data seems to dictate the use of a composite primary key of (source_id, id_on_source)
. After a little googling I found some heated debates on the pros and cons of composite primary keys however, leaving me a little confused.
The table will have one-to-many relationship with other tables, and will thus be referred to in the foreign keys of other tables.
I am not tied to a specific RDBMS
and I am not sure if it matters for the sake of the argument, but let's say that I prefer to work with SQLite
and MySQL
.
What are the pros and cons of using a composite foreign key in this case? Which would you prefer?