Is using char as a primary/foreign key a no no?
Asked Answered
M

4

9

Consider that there is a bunch of tables which link to "countries" or "currencies" tables.

For making data easier to read I'd like make CHAR field with country code (eg US, GB, AU) and currency code (USD, AUD) a primary keys in each of those 2 tables and all other tables will use this CHAR as a foregin key.

Database is mysql with innodb engine.

Is it going to cause performance issues? Is it something i should avoid?

Miletus answered 14/9, 2009 at 0:26 Comment(0)
L
22

Performance isn't really the main issue, at least not for me. The issue is more about surrogate vs natural keys.

Country codes aren't static. They can and do change. Countries change names (eg Ethiopia to Eritrea). They come into being (eg the breakup of Yugoslavia or the Soviet Union) and they cease to exist (eg West and East Germany). When this happens the ISO standard code changes.

More in Name Changes Since 1990: Countries, Cities, and More

Surrogate keys tend to be better because when these events happen the keys don't change, only columns in the reference table do.

For that reason I'd be more inclined to create country and currency tables with an int primary key instead.

That being said, varchar key fields will use more space and have certain performance disadvantages that probably won't be an issue unless you're performing a huge number of queries.

For completeness, you may want to refer to Database Development Mistakes Made by AppDevelopers.

Lipman answered 14/9, 2009 at 0:39 Comment(7)
Ethiopia changed it's name?!?Blemish
Damn the phone, made me get up in the middle of typing this exact answer. Well said!Forspent
@SeanJA: according to that link, yes. It might've been a temporary change.Lipman
Even more importantly, things that by definition ought to be unique, aren't (Social Security numbers, Passport numbers and so on)Ventriculus
well, if country code is renamed then you simply change it in one table and all tables are updated via trigger (or worst you have to do it yourself if triggers not supported) but this is minor issueMiletus
@alexeit: If you use surrogate keys, you don't have to fire triggers that will update potentially millions of rows on a change.Forspent
Abyssinia became Ethiopia; Eritrea broke off from Ethiopia and is now a separate country.Katinka
D
1

James Skidmore's link is important to read.

If you're limiting yourself to country and currency codes (2 and 3 characters, respectively), you may very well be able to get away with declaring the columns char(2) and char(3).

I would guess that would not be a no-no. If you're using an 8-bit character encoding, you're looking at columns the size of smallint or mediumint, respectively.

Deejay answered 14/9, 2009 at 0:38 Comment(0)
N
0

My answer is that there isn't a clear-cut answer. Just pick an approach within your project and be consistent. Both have their pluses and minuses.

@cletus makes a good point about using generated keys, but when you run into a situation where the data is relatively static, like country codes, introducing a generated key for them seems overly complex. Despite real world politics, having country codes appear and disappear isn't really going to be much of an issue for most business problems (but if your data actively concerns all 190-210 countries, follow that advice).

Using surrogate keys universally is a good and popular strategy. But remember, it comes in response to modeling databases using natural keys for everything. Ack! Open up a 15 year old database book. Using natural keys everywhere definitely gets you into difficult situations, as initial understanding of the problem domains prove wrong. You do want to have consistency in your modelling practices, but using different techniques for clearly different situations is OK.

I suspect that performance for most modern databases on var(2) foreign keys will be the same (or better) than int fields. Databases have for years supported textual foreign keys.

Given that we have no other information about the project, if you preference is to use the country codes as foreign keys, and you have the option to do so, I'd say it's OK. It'll be easier to work with the data. It is a little against current practices, but-- in this case-- it's not going to back you into some corner.

Nik answered 14/9, 2009 at 1:29 Comment(3)
-1 This is actually quite wrong. As pointed to in forums.mysql.com/read.php?153,243809,243818#msg-243818 by James, there are things MySQL (the database in question) doesn't do with varchars that it does with int keys.Lipman
I was just guessing that since databases did this for years, it would be optimized. Wouldn't be the first time that assumption proved wrong! But that post is a slightly different question ("is using VARCHAR(45) a good choice for a primary key?") This problem is CHAR(2) on a 200 row table (the number of countries). Unfortunately that post doesn't discuss FK index performance in general, and whether CHAR(2) is going to be more efficient than VARCHAR(2), and I couldn't dig it up. Thanks for the link.Nik
i agree, it discusses how bad is to have 45 byte char since it is 4-5 time bigger then normal int, but with char(2) or char(3) there wont be much difference size wise.Miletus
F
0

I would say the opposite. I know this is an old question, but I came across it when deciding whether to use even more aggressive natural keys.

In your case, a natural key is superior to a surrogate key. It certainly performs better: CHAR(2) is like SMALLINT, 2 bytes; however, you avoid many unnecessary joins just to look up this content. Overall, the performance of your app will be better / at the very least no worse.

Even when there is a performance penalty, a natural key (e.g. passport number) is to me preferred. The idea that primary keys must not change over time is not one that I subscribe to, but you should avoid using natural keys that change too often. You can always use cascading updates in the rare case (like passport number) that the PK is updated.

It's also a good practice in my opinion to use synthetic PK.

Fen answered 21/2, 2024 at 3:26 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.