<a id="github-forkme" href="https://github.com/medialize/URI.js"><img src="http://s3.amazonaws.com/github/ribbons/forkme_right_darkblue_121621.png" alt="Fork me on GitHub" /></a>
A Uniform Resource Identifier (URI) can be one of two things: a Uniform Resource Locator (URL) or a Uniform Resource Name (URN).
You likely deal with URLs most of the time. See RFC 3986 for a proper definition of the terms <a href="http://tools.ietf.org/html/rfc3986#section-1.1.3">URI, URL and URN</a>.
</p>
<p>
URNs <em>name</em> a resource.
They (are supposed to) designate a globally unique, permanent identifier for that resource.
@@ -84,7 +84,7 @@ <h2>URLs and URNs in URI.js</h2>
The most surprising result of this is that <code>mailto:</code> URLs will be considered by URI.js to be URNs rather than URLs.
That said, the functional differences will not adversely impact the handling of those URLs.
</p>
<h2 id="components">Components of a URI</h2>
<p><a href="http://tools.ietf.org/html/rfc3986#section-3">RFC 3986 Section 3</a> visualizes the structure of <abbr title="Uniform Resource Identifier">URI</abbr>s as follows:</p>
<pre class="ascii-art">
@@ -120,70 +120,70 @@ <h3 id="components-url">Components of an <abbr title="Uniform Resource Locator">
In JavaScript the <em>query</em> is often referred to as the <em>search</em>.
URI.js provides both accessors, with the subtle difference of <a href="docs.html#accessors-search">.search()</a> beginning with the <code>?</code>-character
and <a href="docs.html#accessors-search">.query()</a> not.
</p>
<p>
In JavaScript the <em>fragment</em> is often referred to as the <em>hash</em>.
URI.js provides both accessors, with the subtle difference of <a href="docs.html#accessors-hash">.hash()</a> beginning with the <code>#</code>-character
and <a href="docs.html#accessors-hash">.fragment()</a> not.
</p>
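<p>The distinction between the two spellings can be tried out without URI.js. The following is a minimal sketch using the standard <code>URL</code> class built into browsers and Node.js, not the URI.js API itself. The example URL and the variable names are made up for illustration.</p>

```javascript
// Standard URL class (browsers, Node.js), used here instead of URI.js.
const accessorUrl = new URL("http://example.org/index.html?foo=bar#baz");

// The native properties keep the leading delimiter,
// which matches how .search() and .hash() are described above.
const search = accessorUrl.search; // "?foo=bar"
const hash = accessorUrl.hash;     // "#baz"

// Stripping the delimiter gives the form described for .query() and .fragment().
const query = search.replace(/^\?/, ""); // "foo=bar"
const fragment = hash.replace(/^#/, ""); // "baz"

console.log({ search, query, hash, fragment });
```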
<h3 id="components-urn">Components of an <abbr title="Uniform Resource Name">URN</abbr> in URI.js</h3>
</span><a href="docs.html#accessors-protocol">scheme</a><a href="docs.html#accessors-pathname">path</a> & <a href="docs.html#accessors-segment">segment</a><a href="docs.html#accessors-search">query</a><a href="docs.html#accessors-hash">fragment</a>
</pre>
<p>While <a href="http://tools.ietf.org/html/rfc2141">RFC 2141</a> does not define URNs having a query or fragment component, URI.js enables these accessors for convenience.</p>
<h2 id="problems">URLs - Man Made Problems</h2>
<p>URLs (URIs, whatever) aren't easy. There are a couple of issues that make this simple text representation of a resource a real pain:</p>
<ul>
<li>Look simple but have tricky encoding issues</li>
<li>Domains aren't part of the specification</li>
<li>Query String Format isn't part of the specification</li>
<p>Because URLs look very simple, most people haven't read the formal specification. As a result, most people get URLs wrong on many different levels. The one thing most everybody screws up is proper encoding/escaping.</p>
<p><code>http://username:pass:word@example.org/</code> is such a case. Homebrew URL handling often misses escaping the less frequently used parts such as the userinfo.</p>
<p>In <code>http://example.org/@foo</code>, the "@" doesn't have to be escaped according to RFC 3986. Homebrew URL handlers often just treat everything between "://" and "@" as the userinfo.</p>
<p><code>some/path/:foo</code> is a valid relative path (as URIs don't have to contain scheme and authority). Since homebrew URL handlers usually just look for the first occurrence of ":" to delimit the scheme, they'll screw this up as well.</p>
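<p>The three cases above can be checked against a real parser. This is a rough sketch using the standard <code>URL</code> class rather than URI.js; the base URL <code>http://example.org/base/</code> and all variable names are assumptions made for the example.</p>

```javascript
// Standard URL class (browsers, Node.js), used here instead of URI.js.

// 1. Escape the ":" inside the password before building the URL.
const password = encodeURIComponent("pass:word"); // "pass%3Aword"
const withUserinfo = new URL("http://username:" + password + "@example.org/");
console.log(withUserinfo.username, withUserinfo.password, withUserinfo.hostname);

// 2. An "@" in the path does not turn the host into userinfo.
const atInPath = new URL("http://example.org/@foo");
console.log(atInPath.username, atInPath.hostname, atInPath.pathname);

// 3. A ":" in a later path segment does not delimit a scheme.
//    A relative reference needs a base URL to resolve against (assumed here).
const relative = new URL("some/path/:foo", "http://example.org/base/");
console.log(relative.protocol, relative.pathname);
```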
<p><code>+</code> is the proper escape-sequence for a space-character within the query string component, while every other component prefers <code>%20</code>. This is due to the fact that the actual format used within the query string component is not defined in RFC 3986, but in the HTML spec.</p>
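<p>To see the two space encodings side by side, here is a small sketch using only built-ins (<code>URLSearchParams</code>, <code>encodeURIComponent()</code> and <code>decodeURIComponent()</code>). The parameter name <code>q</code> and the sample value are made up.</p>

```javascript
// Query string serialization follows the HTML form rules: space becomes "+".
const asQuery = new URLSearchParams({ q: "hello world" }).toString();

// Generic component encoding: space becomes "%20".
const asComponent = encodeURIComponent("hello world");

// Decoding is asymmetric as well: only the query string parser treats "+" as a space.
const decodedByParams = new URLSearchParams("q=hello+world").get("q");
const decodedByComponent = decodeURIComponent("hello+world");

console.log(asQuery, asComponent, decodedByParams, decodedByComponent);
```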
<p>There is encoding and strict encoding - and JavaScript won't get them right: <a href="https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/encodeURIComponent#Description">encodeURIComponent()</a></p>
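<p>As an illustration of the difference: <code>encodeURIComponent()</code> leaves the characters <code>!'()*</code> untouched, so a stricter variant has to escape them itself. The helper name <code>strictEncodeURIComponent</code> below is hypothetical and is not claimed to be how URI.js does it.</p>

```javascript
// Hypothetical helper: percent-encode the characters encodeURIComponent() skips.
function strictEncodeURIComponent(value) {
  return encodeURIComponent(value).replace(/[!'()*]/g, (character) =>
    "%" + character.charCodeAt(0).toString(16).toUpperCase()
  );
}

const sample = "it's (a) test!*";
const loose = encodeURIComponent(sample);
const strict = strictEncodeURIComponent(sample);

console.log(loose);
console.log(strict);
```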
<h3 id="problems-tld">Top Level Domains</h3>
<p>The hostname component can be one of many things: an IPv4 or IPv6 address, an IDN or Punycode domain name, or a regular domain name. While the format (and meaning) of IPv4 and IPv6 addresses is defined in RFC 3986, the meaning of domain names is not.</p>
<p>DNS is the basis for translating domain names to IP addresses. DNS itself only specifies syntax, not semantics. The missing semantics are what's driving us crazy here.</p>
<p>ICANN provides a <a href="http://www.iana.org/domains/root/db/">list of registered Top Level Domains</a> (TLD). There are country code TLDs (ccTLDs, assigned to each country, like ".uk" for United Kingdom) and generic TLDs (gTLDs, like ".xxx" for you know what). Also note that a TLD may be non-ASCII, such as <code>.香港</code> (the IDN version of HK, Hong Kong).</p>
<p>IDN TLDs such as <code>.香港</code> and the fact that any possible new TLD could pop up next month have led a lot of URL/domain verification tools to fail.</p>
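<p>A sketch of the kind of check that breaks: the pattern below is a hypothetical "a TLD is two to four ASCII letters" rule of the sort found in homebrew validators, not code taken from any particular tool. The hostnames are examples only.</p>

```javascript
// Hypothetical naive rule: a hostname must end in a dot plus 2-4 ASCII letters.
const naiveTld = /\.[a-z]{2,4}$/i;

const hosts = ["example.org", "example.museum", "example.香港"];

// Pair every hostname with the verdict of the naive rule.
const verdicts = hosts.map((host) => [host, naiveTld.test(host)]);

for (const [host, accepted] of verdicts) {
  console.log(host, accepted ? "accepted" : "rejected");
}
```

<p>Only <code>example.org</code> passes; the longer gTLD and the IDN TLD are both rejected even though they are registered.</p>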
<h3 id="problems-sld">Second Level Domains</h3>
<p>To make things worse, people thought it a good idea to introduce Second Level Domains (SLD, such as ".co.uk", the commercial namespace of the United Kingdom). These SLDs are not up to ICANN to define; they're handled individually by each NIC (Network Information Center, the organisation responsible for a specific TLD).</p>
<p>Since there is no central oversight, things got really messy in this particular space. Germany doesn't do SLDs, Australia does. Australia has different SLDs than the United Kingdom (".csiro.au" but no ".csiro.uk"). The individual NICs are not required to publish their arbitrarily chosen SLDs in a defined syntax anywhere.</p>
<p>You can scour each NIC's website to find some hints at their SLDs. You can look them up on Wikipedia and hope they're right. Or you can use <a href="http://publicsuffix.org/">PublicSuffix</a>.</p>
<p>Speaking of PublicSuffix, it's worth mentioning that browser vendors actually keep a list of known Second Level Domains. They need to know those for security reasons. Remember cookies? They can be read and set on a domain level. What do you think would happen if "co.uk" was treated as the domain? <code>amazon.co.uk</code> would be able to read the cookies of <code>google.co.uk</code>. PublicSuffix also contains custom SLDs, such as <code>.dyndns.org</code>. While this makes perfect sense for browser security, it's not what we need for basic URL handling.</p>
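<p>The cookie problem can be sketched in a few lines. The suffix set below is a tiny hand-picked sample, not the real PublicSuffix list, and both function names are hypothetical.</p>

```javascript
// Tiny hand-picked sample for illustration; NOT the real PublicSuffix list.
const sampleSuffixes = new Set(["uk", "co.uk", "org", "dyndns.org", "de"]);

// Naive approach: the domain is simply the last two labels.
function naiveDomain(hostname) {
  return hostname.split(".").slice(-2).join(".");
}

// Suffix-aware approach: find the longest known suffix,
// then keep exactly one more label to its left.
function suffixAwareDomain(hostname, suffixes) {
  const labels = hostname.split(".");
  for (let i = 0; i < labels.length; i++) {
    if (suffixes.has(labels.slice(i).join("."))) {
      // The hostname itself is a suffix when i is 0, so there is no domain.
      return i === 0 ? null : labels.slice(i - 1).join(".");
    }
  }
  return null; // unknown suffix
}

for (const host of ["www.amazon.co.uk", "www.google.co.uk", "foo.dyndns.org"]) {
  console.log(host, naiveDomain(host), suffixAwareDomain(host, sampleSuffixes));
}
```

<p>With the naive approach both shops end up sharing the "domain" <code>co.uk</code>, which is exactly the cookie leak described above.</p>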
<p>PHP (<a href="http://php.net/manual/en/function.parse-str.php">parse_str()</a>) will automatically parse the query string and populate the superglobal <code>$_GET</code> for you. <code>?foo=1&foo=2</code> becomes <code>$_GET = array('foo' => "2");</code>, while <code>?foo[]=1&foo[]=2</code> becomes <code>$_GET = array('foo' => array("1", "2"));</code>.</p>
<p>Ruby's <code>CGI.parse()</code> turns <code>?a=1&a=2</code> into <code>{"a" => ["1", "2"]}</code>, while Ruby on Rails chose the PHP-way.</p>
<p>Python's <a href="http://docs.python.org/2/library/urlparse.html#urlparse.parse_qs">parse_qs()</a> doesn't care for <code>[]</code> either.</p>
<p>Most other languages don't follow the <code>[]</code>-style array-notation and deal with this mess differently.</p>
<p>TL;DR: You need to know the target environment to know how complex query string data has to be encoded.</p>
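<p>For comparison, this is how the browser's own <code>URLSearchParams</code> treats the two notations. It is a sketch of the built-in behaviour only and says nothing about how URI.js parses query strings.</p>

```javascript
// Repeated keys: every value is kept, nothing is overwritten.
const repeated = new URLSearchParams("foo=1&foo=2");
const firstValue = repeated.get("foo");    // first occurrence only
const allValues = repeated.getAll("foo");  // every occurrence

// PHP-style brackets get no special treatment: the key is literally "foo[]".
const bracketed = new URLSearchParams("foo[]=1&foo[]=2");
const hasPlainKey = bracketed.has("foo");
const bracketValues = bracketed.getAll("foo[]");

console.log(firstValue, allValues, hasPlainKey, bracketValues);
```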
<h3 id="problems-fragment">The Fragment</h3>
<p>Given the URL <code>http://example.org/index.html#foobar</code>, browsers only request <code>http://example.org/index.html</code>; the fragment <code>#foobar</code> is a client-side thing.</p>
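<p>A quick way to see which part would go over the wire, sketched with the standard <code>URL</code> class (the variable names are made up):</p>

```javascript
const page = new URL("http://example.org/index.html#foobar");

// What an HTTP request line would carry: path plus query, never the fragment.
const requestTarget = page.pathname + page.search;

// The fragment stays on the client.
const clientSide = page.hash;

// Dropping the fragment yields the URL that is actually requested.
page.hash = "";
const requested = page.href;

console.log(requestTarget, clientSide, requested);
```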
<div id="container">
@@ -57,12 +57,12 @@ <h2>Custom Built URI.js</h2>
<p class="download"> your custom built <code>URI.js</code> or copy the following code:</p>
This "build tool" does nothing but download the selected files, concatenate them and push them through <a href="http://closure-compiler.appspot.com/home">Closure Compiler</a>.
Since Closure Compiler is running on a different domain, this trick will only work on modern browsers.
I'm sorry for the ~2% of you IE users. You'll have to do this <a href="https://github.com/medialize/URI.js#minify">manually</a>.
<p>URI.js is a JavaScript library for working with URLs. It offers a "jQuery-style" API (<a href="http://en.wikipedia.org/wiki/Fluent_interface">Fluent Interface</a>, Method Chaining) to read and write all regular components and a number of convenience methods like
.<a href="docs.html#accessors-directory">directory</a>() and .<a href="docs.html#accessors-authority">authority</a>().</p>
<p>URI.js offers simple, yet powerful ways of working with query strings, has a number of URI-normalization functions and converts relative/absolute paths.</p>
<p>While URI.js provides a <a href="jquery-uri-plugin.html">jQuery plugin</a>, URI.js itself does not rely on jQuery. You <em>don't need jQuery</em> to use URI.js.</p>
@@ -52,13 +52,13 @@ <h2>Examples</h2>
<p>How do you like manipulating URLs the "jQuery-style"?</p>