Revision 12880, 5.5 kB (checked in by dankogai, 5 years ago)

Version 0.02

NAME
    URI::Escape::XS - Drop-In replacement for URI::Escape

VERSION
    $Id: README,v 0.2 2008/05/30 23:53:13 dankogai Exp $

SYNOPSIS
        # use it instead of URI::Escape
        use URI::Escape::XS qw/uri_escape uri_unescape/;
        $safe = uri_escape("10% is enough\n");
        $verysafe = uri_escape("foo", "\0-\377");
        $str  = uri_unescape($safe);

        # or use encodeURIComponent and decodeURIComponent
        use URI::Escape::XS;
        $safe = encodeURIComponent("10% is enough\n");
        $str  = decodeURIComponent("10%25%20is%20enough%0A");

EXPORT
  by default
    "encodeURIComponent" and "decodeURIComponent"

  on demand
    "uri_escape" and "uri_unescape"

FUNCTIONS
  encodeURIComponent
    Does what JavaScript's encodeURIComponent does.

      $uri = encodeURIComponent("http://www.example.com/");
      # http%3A%2F%2Fwww.example.com%2F

    Note that you cannot customize the characters to escape. If you need
    to do so, use "uri_escape".

  decodeURIComponent
    Does what JavaScript's decodeURIComponent does.

      $str = decodeURIComponent("http%3A%2F%2Fwww.example.com%2F");
      # http://www.example.com/

    It decodes not only %HH sequences but also %uHHHH sequences, with
    surrogate pairs correctly decoded.

      $str = decodeURIComponent("%uD869%uDEB2%u5F3E%u0061");
      # \x{2A6B2}\x{5F3E}a

    This function UNCONDITIONALLY returns the decoded string with the utf8
    flag off. To get a utf8-decoded string, use Encode and

      decode_utf8(decodeURIComponent($uri));

    This is the correct behavior because you cannot tell whether the
    decoded bytes are really UTF-8; they could just as well be in another
    encoding such as ISO-8859-1 or Shift_JIS.
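
    The difference between the raw bytes and the explicitly decoded
    string can be seen with core Perl alone. The sketch below does not
    call this module; the variable $bytes is a hand-written stand-in for
    what decodeURIComponent("%E5%B0%8F") is described as returning
    (three octets, utf8 flag off).

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Assumed stand-in for the module's return value: raw octets, flag off.
my $bytes = "\xE5\xB0\x8F";

# Explicitly interpret the octets as UTF-8 to obtain a character string.
my $chars = decode_utf8($bytes);

printf "bytes: length=%d utf8=%d\n",
    length($bytes), utf8::is_utf8($bytes) ? 1 : 0;
printf "chars: length=%d utf8=%d U+%04X\n",
    length($chars), utf8::is_utf8($chars) ? 1 : 0, ord($chars);
# bytes: length=3 utf8=0
# chars: length=1 utf8=1 U+5C0F
```

    If the octets had come from an ISO-8859-1 or Shift_JIS page, the
    caller would pick a different decoder, which is why the choice is
    left to the caller.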

  uri_escape
    Does exactly the same as URI::Escape::uri_escape() except when a
    utf8-flagged string is passed.

    URI::Escape::uri_escape() croaks and urges you to use
    "uri_escape_utf8()", but that is pointless because a URI itself has no
    such thing as a utf8 flag. The function in this module ALWAYS TREATS
    the string as a byte sequence. That way you can safely use this
    function without worrying about utf8 flags.
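
    As a rough illustration of what "treating the string as a byte
    sequence" means, here is a pure-Perl sketch. The helper
    escape_bytes() is hypothetical and is not this module's XS code; it
    assumes the octets seen for a utf8-flagged string are its UTF-8
    encoding, and it escapes everything outside the RFC 3986 unreserved
    set.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical helper: percent-escape every octet that is not unreserved.
sub escape_bytes {
    my ($octets) = @_;
    $octets =~ s/([^A-Za-z0-9\-._~])/sprintf("%%%02X", ord($1))/ge;
    return $octets;
}

my $flagged = "\x{5C0F}";              # one character, utf8 flag on
my $octets  = encode_utf8($flagged);   # three UTF-8 octets, flag off

print escape_bytes($octets), "\n";             # %E5%B0%8F
print escape_bytes("10% is enough\n"), "\n";   # 10%25%20is%20enough%0A
```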

    Note that this function is NOT EXPORTED by default. That way you can
    use URI::Escape and URI::Escape::XS simultaneously.

  uri_unescape
    Does exactly the same as URI::Escape::uri_unescape() except when a
    %uHHHH sequence is passed.

    URI::Escape::uri_unescape() simply ignores %uHHHH sequences while the
    function in this module decodes them into the corresponding UTF-8 byte
    sequences.

    Like uri_escape, this function is NOT EXPORTED by default.

  Note on the %uHHHH sequence
    With this module the resulting strings never have the utf8 flag on. So
    if you want to decode them to perl utf8, you have to explicitly decode
    via Encode. Remember: URIs have always been byte sequences, not UTF-8
    characters.

    If the %uHHHH sequence had become standard, you could have safely told
    whether a given URI is in Unicode. But more fortunately than
    unfortunately, the RFC proposal was rejected, so you can't tell which
    encoding is used just by looking at the URI.

    <http://en.wikipedia.org/wiki/Percent-encoding#Non-standard_implementations>

    I said fortunately because %uHHHH can be nasty for non-BMP characters.
    Since each %uHHHH can hold only one 16-bit value, you need a
    *surrogate pair* to represent a character that is U+10000 or above.
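
    The surrogate arithmetic can be worked through in a few lines of
    core Perl. This is only a sketch of the standard UTF-16 formula, not
    the module's implementation; combine_surrogates() is a hypothetical
    helper, and the pair used is the one from the decodeURIComponent
    example (%uD869%uDEB2).

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical helper: fold a high/low surrogate pair into one code point.
sub combine_surrogates {
    my ($hi, $lo) = @_;
    return 0x10000 + (($hi - 0xD800) << 10) + ($lo - 0xDC00);
}

my $cp = combine_surrogates(0xD869, 0xDEB2);
printf "U+%X\n", $cp;                          # U+2A6B2

# The UTF-8 octets for that code point, shown as %HH escapes.
(my $escaped = encode_utf8(chr($cp))) =~ s/(.)/sprintf("%%%02X", ord($1))/ges;
print "$escaped\n";                            # %F0%AA%9A%B2
```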

    In spite of that, there are a significant number of URIs with %uHHHH
    escapes. Therefore this module supports decoding them, but not
    encoding.

SPEED
    Since this module uses XS, it is really fast except for
    uri_escape("noop"), that is, for strings that need no escaping.

    The regexp used in URI::Escape is really fast when nothing matches but
    slows down significantly when it has to replace characters.
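
    The regexp behavior described above can be observed with the core
    Benchmark module. The sketch below is an assumption-laden stand-in:
    regexp_escape() is a hypothetical substitution-based escaper written
    for this example, not URI::Escape's actual source, and absolute
    rates will vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical regexp-based escaper in the style described above.
sub regexp_escape {
    my ($s) = @_;
    $s =~ s/([^A-Za-z0-9\-._~])/sprintf("%%%02X", ord($1))/ge;
    return $s;
}

my $noop  = "www.example.com";       # nothing matches, nothing replaced
my $heavy = "10% is enough\n" x 4;   # several replacements per call

# Run each case for at least one CPU second and print a comparison table.
cmpthese(-1, {
    noop  => sub { regexp_escape($noop) },
    heavy => sub { regexp_escape($heavy) },
});
```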

  BENCHMARK
    On a MacBook Pro 2GHz with Perl 5.8.8.

     http://www.google.co.jp/search?q=%E5%B0%8F%E9%A3%BC%E5%BC%BE
     ============================================================
     Unescape it
     -----------
                  Rate     U::E U::E::XS
     U::E      58526/s       --     -88%
     U::E::XS 486968/s     732%       --
     --------------
     Escape it back
     --------------
                  Rate     U::E U::E::XS
     U::E      30046/s       --     -78%
     U::E::XS 136992/s     356%       --

     www.example.com
     ===============
     Unescape it
     -----------
                  Rate     U::E U::E::XS
     U::E     821972/s       --      -4%
     U::E::XS 854732/s       4%       --
     --------------
     Escape it back
     --------------
                  Rate U::E::XS     U::E
     U::E::XS 522969/s       --      -7%
     U::E     565112/s       8%       --

AUTHOR
    Dan Kogai, "<dankogai at dan.co.jp>"

BUGS
    Please report any bugs or feature requests to "bug-uri-escape-xs at
    rt.cpan.org", or through the web interface at
    <http://rt.cpan.org/NoAuth/ReportBug.html?Queue=URI-Escape-XS>. I will
    be notified, and then you'll automatically be notified of progress on
    your bug as I make changes.

SUPPORT
    You can find documentation for this module with the perldoc command.

        perldoc URI::Escape::XS

    You can also look for information at:

    * AnnoCPAN: Annotated CPAN documentation
        <http://annocpan.org/dist/URI-Escape-XS>

    * CPAN Ratings
        <http://cpanratings.perl.org/d/URI-Escape-XS>

    * RT: CPAN's request tracker
        <http://rt.cpan.org/NoAuth/Bugs.html?Dist=URI-Escape-XS>

    * Search CPAN
        <http://search.cpan.org/dist/URI-Escape-XS>

ACKNOWLEDGEMENTS
    Gisle Aas for URI::Escape

    Koichi Taniguchi for URI::Escape::JavaScript

COPYRIGHT & LICENSE
    Copyright 2007 Dan Kogai, all rights reserved.

    This program is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.