Valid characters for URI schemes?
A

3

34

I was thinking about Registering an Application to a URL Protocol and I'd like to know, what characters are allowed in a scheme?

Some examples:

  • h323 (has numbers)
    • h323:[<user>@]<host>[:<port>][;<parameters>]
  • z39.50r (has a . as well)
    • z39.50r://<host>[:<port>]/<database>?<docid>[;esn=<elementset>][;rs=<recordsyntax>]
  • paparazzi:http (has a :)
    • paparazzi:http:[//<host>[:[<port>][<transport>]]/

So, what characters can I fancy using?
Can we have...

  • @:TwitterUser
  • #:HashTag
  • $:CapitalStock
  • ?:ID-10T

...etc., as desired, or are the characters in a scheme restricted by a standard?

Abyssal answered 4/9, 2010 at 9:51 Comment(0)
M
42

According to RFC 2396, Appendix A:

  scheme        = alpha *( alpha | digit | "+" | "-" | "." )

Meaning:

The scheme must start with a letter (upper or lower case) and can contain letters (again upper or lower case), digits, "+", "-" and ".".
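If you want to check a candidate name programmatically, the grammar above translates fairly directly into a regular expression. This is only a minimal sketch: the names SCHEME_RE and is_valid_scheme are made up for illustration and do not come from the RFC or from any library.

```python
import re

# Grammar: scheme = alpha *( alpha | digit | "+" | "-" | "." )
# One letter first, then any number of letters, digits, "+", "-" or ".".
# SCHEME_RE and is_valid_scheme are hypothetical names for this sketch.
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def is_valid_scheme(name: str) -> bool:
    """Return True if the whole string matches the scheme grammar."""
    return SCHEME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["h323", "z39.50r", "a234", "4bcd", "@", "#"]:
        print(candidate, is_valid_scheme(candidate))
```

fullmatch is used rather than match so that trailing invalid characters are rejected as well.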


Note: in the case of

paparazzi:http:[//<host>[:[<port>][<transport>]]/

the scheme is only the "paparazzi" part.
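To illustrate why only "paparazzi" counts: ":" is not a legal scheme character, so the scheme ends at the first colon and everything after it belongs to the rest of the URI. A rough sketch, assuming a simple split is enough for your purposes (split_scheme is a hypothetical helper, and example.com is just a placeholder host):

```python
from typing import Tuple


def split_scheme(uri: str) -> Tuple[str, str]:
    """Split a URI at its first colon into (scheme, remainder).

    Hypothetical helper for illustration only. It does not check that
    the scheme part itself uses only allowed characters.
    """
    scheme, sep, rest = uri.partition(":")
    if not sep:
        raise ValueError("no ':' found, so there is no scheme to split off")
    return scheme, rest


if __name__ == "__main__":
    # The second colon stays in the remainder; it is not part of the scheme.
    print(split_scheme("paparazzi:http://example.com/"))
```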

Maharani answered 4/9, 2010 at 10:4 Comment(6)
I see. But there are RFCs that use numbers... Why?Abyssal
Numbers are allowed in the URI scheme, but not as first character. 'a234' is valid, while '4bcd' isn't.Maharani
Do you think the fact that it will be used only as an URL protocol on Windows has any impact on the usability of other characters?Abyssal
+1; What Vivien said re: the "paparazzi:" scheme. The http://... is passed on to the WebKit stuff. (NB: I'm the author of the app, and am also crazy and have a BNF-style document on the URL format on the site.)Incult
paparazzi is akin to mailto: it has no hierarchy hence no //Skippie
Chiming in 11 years later on Camilo's question: Windows does not enforce the starts-with-alpha limitation, but because all popular browsers do, you should follow it anyway.Guiscard
L
12

The scheme according to RFC 3986 is defined as:

scheme      = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )

So the scheme must begin with an alphabetic character (A-Z, a-z) and may be followed by any number of alphanumeric characters, "+", "-", or ".".
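Applied to the candidates in the question, that rule excludes "@", "#", "$" and "?". Here is a small sketch that reports why a name fails, assuming you want a readable message rather than a plain yes/no. The function name scheme_problems and the message wording are my own invention.

```python
import string

# Character sets taken from the ABNF rule quoted above.
FIRST_CHARS = set(string.ascii_letters)
LATER_CHARS = FIRST_CHARS | set(string.digits) | set("+-.")


def scheme_problems(name: str) -> list:
    """Return a list of human-readable problems; an empty list means valid.

    Hypothetical helper, not part of any standard library.
    """
    if not name:
        return ["scheme is empty"]
    problems = []
    if name[0] not in FIRST_CHARS:
        problems.append(f"first character {name[0]!r} is not a letter")
    # Collect each offending later character once, in a stable order.
    bad = sorted({ch for ch in name[1:] if ch not in LATER_CHARS})
    if bad:
        problems.append("disallowed characters: " + " ".join(bad))
    return problems


if __name__ == "__main__":
    for candidate in ["h323", "z39.50r", "@", "#", "$", "?"]:
        print(repr(candidate), scheme_problems(candidate) or "ok")
```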

Ludwigshafen answered 4/9, 2010 at 10:3 Comment(1)
Do you think using it as a Windows-only URL protocol has any impact on the characters used? If that changes anything I'd do some tests...Abyssal
L
6

Quoth RFC 2396:

Scheme names consist of a sequence of characters beginning with a lower case letter and followed by any combination of lower case letters, digits, plus ("+"), period ("."), or hyphen ("-").

Lightly answered 4/9, 2010 at 10:3 Comment(0)
