URL Encoding
Some characters cannot be part of a URL (for example, the space), and others have a special meaning in a URL: for example, the character # introduces a subsection (or fragment) of a document, and the character = separates a name from a value. A query string may need to be converted to satisfy these constraints. This can be done using a scheme known as URL encoding.
In particular, encoding the query string uses the following rules:
- Letters (A-Z and a-z), digits (0-9) and the characters '.', '-', '~' and '_' are left as-is
- SPACE is encoded as '+' or '%20'
- All other characters are encoded as '%HH', where HH is the two-digit hexadecimal value of the byte; non-ASCII characters are first encoded as UTF-8 (or another specified encoding), and each resulting byte is then encoded in this way
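The rules above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation; the function name `encode_query_component` and its `space_as_plus` and `encoding` parameters are hypothetical names chosen for this example.

```python
# Characters that are left as-is under the rules listed above.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    ".-~_"
)


def encode_query_component(text, space_as_plus=True, encoding="utf-8"):
    """Encode one name or value of a query string (hypothetical helper)."""
    out = []
    for ch in text:
        if ch in UNRESERVED:
            # Rule 1: letters, digits and . - ~ _ pass through unchanged.
            out.append(ch)
        elif ch == " ":
            # Rule 2: a space becomes '+' or '%20'.
            out.append("+" if space_as_plus else "%20")
        else:
            # Rule 3: convert the character to bytes first (UTF-8 by
            # default), then write each byte as '%' plus two hex digits.
            for byte in ch.encode(encoding):
                out.append("%{:02X}".format(byte))
    return "".join(out)


print(encode_query_component("a b&c=d"))  # a+b%26c%3Dd
print(encode_query_component("é"))        # %C3%A9
```

In practice, Python's standard library provides `urllib.parse.quote_plus` and `urllib.parse.urlencode` for this purpose; the hand-written version only makes the individual rules visible.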
The octet corresponding to the tilde ("~") character is often encoded as "%7E" by older URI processing implementations; the "%7E" can be replaced by "~" without changing its interpretation.
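As a small illustration of the tilde equivalence, the following sketch rewrites "%7E" to "~" and checks that both forms decode to the same text. The helper name `normalize_tilde` is hypothetical, and the sample path is made up for the example.

```python
import re
from urllib.parse import unquote


def normalize_tilde(uri):
    """Replace an encoded tilde (%7E or %7e) with a literal '~'."""
    return re.sub(r"%7[Ee]", "~", uri)


old_style = "/%7Euser/a%20b"
new_style = normalize_tilde(old_style)

print(new_style)                                   # /~user/a%20b
print(unquote(old_style) == unquote(new_style))    # True
```

Only the tilde is rewritten here; other percent-encoded sequences, such as the "%20" in the sample, are left untouched.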
The encoding of SPACE as '+' and the selection of "as-is" characters distinguish this encoding from RFC 1738.
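The two treatments of SPACE can be compared with Python's standard `urllib.parse` module. This is only a sketch of the difference, and the sample value is arbitrary.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

value = "a b+c"

# Plain percent-encoding: the space becomes %20.
percent_form = quote(value, safe="")
# Query-string style: the space becomes '+', so a literal '+' must be escaped.
plus_form = quote_plus(value)

print(percent_form)              # a%20b%2Bc
print(plus_form)                 # a+b%2Bc

# A decoder has to know which convention was used.
print(unquote_plus(plus_form))   # a b+c
print(unquote(plus_form))        # a+b+c
```

Decoding the '+' form with a decoder that does not apply the '+' convention leaves the plus sign in place, which is why the two conventions should not be mixed.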