Best database field type for a URL
Asked Answered
A

11

426

I need to store a url in a MySQL table. What's the best practice for defining a field that will hold a URL with an undetermined length?

Arrive answered 20/10, 2008 at 19:29 Comment(2)
It depends on what you need, indexing, unicity ?Malenamalet
Just go with the TEXT type and skip reading all these answers below. In the end, that's what most of them suggest. :) Of course, if You need indexing or uniqueness, go for VARCHAR, since TEXT cannot be indexed that easily.Lewanna
E
393
  1. Lowest common denominator max URL length among popular web browsers: 2,083 (Internet Explorer)
  1. http://dev.mysql.com/doc/refman/5.0/en/char.html
    Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 255 before MySQL 5.0.3, and 0 to 65,535 in 5.0.3 and later versions. The effective maximum length of a VARCHAR in MySQL 5.0.3 and later is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used.
  1. So ...
    < MySQL 5.0.3 use TEXT
    or
    >= MySQL 5.0.3 use VARCHAR(2083)
Exorcism answered 20/10, 2008 at 19:55 Comment(12)
Good answer, but personaly I would limit the length. Depending on the project you might want to limit the accepted urls. Who uses url longet than 200?Patrilocal
They'd better come up with a uri datatype that "understands" the structure of uri so that indexing and search is done efficiently, like oracle did...wait, mysql is now oracle's... download.oracle.com/docs/cd/B10464_05/web.904/b12099/…Devereux
This answer is a little misleading. Note that "Lowest common denominator" here is meaningless, you want to use the highest number a browser or server will accept (which is not consistent and subject to change). As your link says: "...the specification of the HTTP protocol does not specify any maximum length...", so don't bother with that VARCHAR(2083), just use TEXT.Dialysis
Example, also from your link: "After 65,536 characters, the location bar no longer displays the URL in Windows Firefox 1.5.x. However, longer URLs will work. I stopped testing after 100,000 characters."Dialysis
Regarding #3: Note that if you are using MySql and wanted to set a default value, you should use VARCHAR (#3467372)Heterochromatin
@Patrilocal I'm currently storing Magnetic Links, some of those links. 483 Characters for one of my linksIdeational
I'm using MySQL 5.6.31, and I'd like to make the url field as a unique key, then the server says max key length is 767 bytes So I decided to set its length to 767 Besides I need also to set the character to ascii. The final sql look like this: CREATE TABLE pages (id INT AUTO_INCREMENT NOT NULL, url VARCHAR(767) NOT NULL, UNIQUE INDEX UNIQ_2074E575F47645AE (url), PRIMARY KEY(id)) DEFAULT CHARACTER SET ascii COLLATE ascii_general_ci ENGINE = InnoDBFascista
I'm not sure about default encoding but I think setting character set to utf-8 is better.Roee
FYI - Asp.Net MVC [Display(Name = "My URL")] [Column(TypeName = "VARCHAR")] [StringLength(2083)] public string MyUrl { get; set; }Brosy
The boutell.com resource fell off the net. Here's a reference to it in a scanned O'Reilly book: books.google.ca/…Exorcism
@WesleyMurch I know this is old thread but anyway. If you're using SQL Server 2005 or later, use varchar(MAX). The text datatype is deprecated and should not be used for new development work.Socialist
What collation to take? If you use utf8mb3_generali_ci VARCHAR cannot take more than ~1000 characters.Epochal
A
44

VARCHAR(512) (or similar) should be sufficient. However, since you don't really know the maximum length of the URLs in question, I might just go direct to TEXT. The danger with this is of course loss of efficiency due to CLOBs being far slower than a simple string datatype like VARCHAR.

Arriviste answered 20/10, 2008 at 19:31 Comment(1)
what about the collation ?Ganny
H
20

This really depends on your use case (see below), but storing as TEXT has performance issues, and a huge VARCHAR sounds like overkill for most cases.

My approach: use a generous, but not unreasonably large VARCHAR length, such as VARCHAR(500) or so, and encourage the users who need a larger URL to use a URL shortener such as safe.mn.

The Twitter approach: For a really nice UX, provide an automatic URL shortener for overly-long URL's and store the "display version" of the link as a snippet of the URL with ellipses at the end. (Example: https://mcmap.net/q/80844/-best-database-field-type-for-a-url/1235702 would be displayed as https://mcmap.net/q/82075/-how-do-i-compare-two-arrays-of-datarow-objects-in-powershell... and would link to a shortened URL http://ex.ampl/e1234)

Notes and Caveats

  • Obviously, the Twitter approach is nicer, but for my app's needs, recommending a URL shortener was sufficient.
  • URL shorteners have their drawbacks, such as security concerns. In my case, it's not a huge risk because the URL's are not public and not heavily used; however, this obviously won't work for everyone. safe.mn appears to block a lot of spam and phishing URL's, but I would still recommend caution.
  • Be sure to note that you shouldn't force your users to use a URL shortener. For most cases (at least for my app's needs), 500 characters is overly sufficient for what most users will be using it for. Only use/recommend a URL shortener for overly-long links.
Hawkweed answered 8/9, 2014 at 19:33 Comment(3)
If you are providing a built-in url shortener, won't you still need to be storing the full-length url in a database somewhere for it to work? :-)Molliemollify
Of course; but I doubt most people would write their own shortener. Since writing this, I've learned that there are many URL shortening APIs out there (71 are listed here: programmableweb.com/news/…), so you could automate the process without even writing your own. It still depends on user knowledge and consent, of course.Hawkweed
I'm unsure of what safe.mn was in 2014, but this answer is no longer relevant as the site has gone offline and lost it's domain.Adventurous
I
18

varchar(max) for SQLServer2005

varchar(65535) for MySQL 5.0.3 and later

This will allocate storage as need and shouldn't affect performance.

Impassioned answered 20/10, 2008 at 19:31 Comment(3)
In your snippet, is max a magic ANSI SQL specifier to grow the VARCHAR size as necessary, or is it just a meta-variable for the sake of example?Arriviste
In MySQL you most likely can't have a varchar that large unless it is the only column in the table.Garfish
@Daniel Spiewak: "The basic difference between TEXT and VARCHAR(MAX) is that a TEXT type will always store the data in a blob whereas the VARCHAR(MAX) type will attempt to store the data directly in the row unless it exceeds the 8k limitation and at that point it stores it in a blob." #835288 But the question was about MySQL, so this isn't really relevant here.Unarmed
U
14

You should use a VARCHAR with an ASCII character encoding. URLs are percent encoded and international domain names use punycode so ASCII is enough to store them. This will use much less space than UTF8.

VARCHAR(512) CHARACTER SET 'ascii' COLLATE 'ascii_general_ci' NOT NULL
Unintelligible answered 18/12, 2015 at 10:0 Comment(1)
doesn't UTF-8 use more space when it only has to ?Ganny
C
12

You'll want to choose between a TEXT or VARCHAR column based on how often the URL will be used and whether you actually need the length to be unbound.

Use VARCHAR with maxlength >= 2,083 as micahwittman suggested if:

  1. You'll use a lot of URLs per query (unlike TEXT columns, VARCHARs are stored inline with the row)
  2. You're pretty sure that a URL will never exceed the row-limit of 65,535 bytes.

Use TEXT if :

  1. The URL really might break the 65,535 byte row limit
  2. Your queries won't select or update a bunch of URLs at once (or very often). This is because TEXT columns just hold a pointer inline, and the random accesses involved in retrieving the referenced data can be painful.
Chad answered 22/5, 2013 at 22:45 Comment(0)
G
5

Most browsers will let you put very large amounts of data in a URL and thus lots of things end up creating very large URLs so if you are talking about anything more than the domain part of a URL you will need to use a TEXT column since the VARCHAR/CHAR are limited.

Garfish answered 20/10, 2008 at 19:33 Comment(0)
W
5

Here are some SQL data types according to AWS. enter image description here

Whetstone answered 10/11, 2022 at 14:27 Comment(1)
Could you please share the link to this article?Winfordwinfred
B
3

I don't know about other browsers, but IE7 has a 2083 character limit for HTTP GET operations. Unless any other browsers have lower limits, I don't see why you'd need any more characters than 2083.

Barriebarrientos answered 20/10, 2008 at 19:33 Comment(0)
Z
2

You better use varchar(max) which (in terms of size) means varchar (65535). This will even store your bigger web addresses and will save your space as well.

The max specifier expands the storage capabilities of the varchar, nvarchar, and varbinary data types. varchar(max), nvarchar(max), and varbinary(max) are collectively called large-value data types. You can use the large-value data types to store up to 2^31-1 bytes of data.

See this article on TechNet about using Using Large-Value Data Types

Zasuwa answered 19/12, 2012 at 10:32 Comment(1)
varchar (max) is SQLServer syntax, not suitable for MySQL (as in the original question). Furthermore it does not means varchar (65535) since 65535 is the maximum number of ASCII chars in a row in mysql, so it is dependent also on the other fields and on the character set.Latour
T
1

Most web servers have a URL length limit (which is why there is an error code for "URI too long"), meaning there is a practical upper size. Find the default length limit for the most popular web servers, and use the largest of them as the field's maximum size; it should be more than enough.

Trustee answered 20/10, 2008 at 19:50 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.