Can any path segments of a URI have a query component?

4

9

According to Section 3.3, "Path Component", of RFC 2396 (Uniform Resource Identifiers),

The path may consist of a sequence of path segments separated by a single slash "/" character. Within a path segment, the characters "/", ";", "=", and "?" are reserved. Each path segment may include a sequence of parameters, indicated by the semicolon ";" character. The parameters are not significant to the parsing of relative references.

However, I have never seen a URL with query parameters in any segment other than the final one, so I am not sure I am reading this correctly.

Is http://www.url.com/segment1?seg1param1=val1/page.html?pageparam1=val2 a valid URL?

Dragonhead answered 22/6, 2011 at 18:11 Comment(0)
16

What the RFC is referring to is something like this:

http://www.example.com/foo/bar;param=value/baz.html

That could be interpreted as the path /foo/bar/baz.html, with the parameter param=value attached to the bar segment. No question marks are involved.
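As a rough illustration of that interpretation, here is a minimal Python sketch that strips ";"-style parameters from each segment. The helper name strip_segment_params is made up for this example, and the "split on ; then on =" convention is only one way an application might choose to read such parameters; neither RFC mandates it.

```python
from urllib.parse import urlsplit

def strip_segment_params(url):
    """Hypothetical helper: return (path_without_params, [(segment, params), ...])."""
    path = urlsplit(url).path  # urlsplit leaves ";" inside the path untouched
    cleaned = []
    found = []
    for segment in path.split("/"):
        # Everything before the first ";" is the segment name; the rest are params.
        name, *raw_params = segment.split(";")
        cleaned.append(name)
        if raw_params:
            # "key=value" -> (key, value); a bare "key" gets an empty value.
            params = dict(p.partition("=")[::2] for p in raw_params)
            found.append((name, params))
    return "/".join(cleaned), found

path, params = strip_segment_params(
    "http://www.example.com/foo/bar;param=value/baz.html"
)
print(path)    # /foo/bar/baz.html
print(params)  # [('bar', {'param': 'value'})]
```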

Note that RFC 2396 has been obsoleted by RFC 3986, which drops the formal specification of segment-specific parameters in favor of a general note that implementations can (and do) embed them in different ways:

Aside from dot-segments in hierarchical paths, a path segment is considered opaque by the generic syntax. URI producing applications often use the reserved characters allowed in a segment to delimit scheme-specific or dereference-handler-specific subcomponents. For example, the semicolon (";") and equals ("=") reserved characters are often used to delimit parameters and parameter values applicable to that segment. The comma (",") reserved character is often used for similar purposes. For example, one URI producer might use a segment such as "name;v=1.1" to indicate a reference to version 1.1 of "name", whereas another might use a segment such as "name,1.1" to indicate the same. Parameter types may be defined by scheme-specific semantics, but in most cases the syntax of a parameter is specific to the implementation of the URI's dereferencing algorithm.

Pleasance answered 1/7, 2011 at 13:26 Comment(3)
I did not know that the spec had been obsoleted. I'm going to review the updated one now. – Dragonhead
Appendix D.2 "Modifications" was also especially helpful... Thanks so much, I never would have figured this out just by reading the obsoleted spec. – Dragonhead
If question marks are used, does it then become a query? – Symbology
1

If you look at the grammar just below that passage in RFC 2396, it reads:

  path          = [ abs_path | opaque_part ]

  path_segments = segment *( "/" segment )
  segment       = *pchar *( ";" param )
  param         = *pchar

  pchar         = unreserved | escaped |
                  ":" | "@" | "&" | "=" | "+" | "$" | ","

A segment is composed of pchar characters followed by zero or more params, and a param is itself a sequence of pchar characters. Reading on, "?" does not appear anywhere in the definition of pchar. So parameters cannot contain "?", and neither can segments.
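To make that concrete, here is a small sketch that transcribes the segment rule into a Python regular expression. The grammar above does not spell out unreserved and escaped; the character sets used here are taken from RFC 2396 sections 2.3 and 2.4.1 (alphanumerics plus - _ . ! ~ * ' ( ), and % followed by two hex digits). The function name is_valid_segment is made up for the example.

```python
import re

# pchar = unreserved | escaped | ":" | "@" | "&" | "=" | "+" | "$" | ","
PCHAR = r"(?:[A-Za-z0-9\-_.!~*'()]|%[0-9A-Fa-f]{2}|[:@&=+$,])"

# segment = *pchar *( ";" param ), where param = *pchar
SEGMENT = re.compile(PCHAR + r"*(?:;" + PCHAR + r"*)*")

def is_valid_segment(text):
    """Return True if text matches the RFC 2396 'segment' production."""
    return SEGMENT.fullmatch(text) is not None

print(is_valid_segment("bar;param=value"))           # True
print(is_valid_segment("segment1?seg1param1=val1"))  # False: "?" is not a pchar
```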

So I agree with the answer of Edward Thomson, who says that "?" only delimits the query component and cannot be used inside a path.

Bonds answered 30/6, 2011 at 15:41 Comment(0)
0

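According to my reading of RFC 2396, no. The ? is a reserved character and serves to delimit the query component, so it is not allowed in the path. Inside the query, RFC 2396 lists ? as reserved, although its grammar (query = *uric) does admit it, and RFC 3986 explicitly permits "/" and "?" in the query.

In your example, the first ? marks the beginning of the query component. The second ? is inside the query, so everything after the first ? (seg1param1=val1/page.html?pageparam1=val2) is query data, not a second path segment with its own query.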
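To see where a parser draws the line between path and query for the URL in the question, here is a quick check with Python's standard urllib.parse. This only shows what one generic parser does; individual servers and frameworks may handle the remainder differently.

```python
from urllib.parse import urlsplit

url = "http://www.url.com/segment1?seg1param1=val1/page.html?pageparam1=val2"
parts = urlsplit(url)

# The path ends at the first "?"; everything after it is the query.
print(parts.path)   # /segment1
print(parts.query)  # seg1param1=val1/page.html?pageparam1=val2
```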

Hansom answered 30/6, 2011 at 15:13 Comment(0)
0

I believe you could issue a GET request with that URL and most web servers would process it, but I don't believe you would get the results you expect. That is, pageparam1=val2 would not be evaluated as a separate parameter.
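As a sketch of what "would not evaluate" can look like in practice, this is how Python's standard parse_qs reads the query part of the URL from the question. Other stacks may differ, so treat it as one data point rather than a statement about every server.

```python
from urllib.parse import parse_qs

# The query component of the URL in the question (everything after the first "?").
query = "seg1param1=val1/page.html?pageparam1=val2"
params = parse_qs(query)

# There is no "&", so the whole string is one key/value pair;
# pageparam1 ends up inside the value of seg1param1.
print(params)  # {'seg1param1': ['val1/page.html?pageparam1=val2']}
print("pageparam1" in params)  # False
```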

If you want parameters like that, you could always use the # symbol (the fragment), as a lot of JavaScript-based GUIs do now.

Morality answered 30/6, 2011 at 15:18 Comment(0)
