Add support for query encoding to WTFURL
author: benjamin@webkit.org <benjamin@webkit.org@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Mon, 24 Sep 2012 21:32:45 +0000 (21:32 +0000)
committer: benjamin@webkit.org <benjamin@webkit.org@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Mon, 24 Sep 2012 21:32:45 +0000 (21:32 +0000)
https://bugs.webkit.org/show_bug.cgi?id=97422

Reviewed by Adam Barth.

Source/WebCore:

Add the charset conversion on the WebCore side.

* platform/KURLWTFURL.cpp:
(WebCore::KURL::KURL):
(CharsetConverter):
(WebCore::CharsetConverter::CharsetConverter):
* platform/mac/KURLMac.mm:
(WebCore::KURL::KURL):

Source/WTF:

Expose character conversion through the new abstract class URLQueryCharsetConverter.
URLQueryCharsetConverter is implemented by WebCore to expose the TextEncoding classes.

Unfortunately, that forces us to bring URLBuffer into the public API. We may be able
to mitigate that later when moving WTFURL to more templates.

The change fixes 2 of the URL layout tests.
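
As an illustration of how an embedder might implement the new interface, here is a minimal sketch. It assumes simplified stand-ins for UChar and URLBuffer so that it compiles on its own, and Latin1QueryConverter is a hypothetical name that is not part of this patch. The escaping of unrepresentable characters follows the rule documented on the CharsetConverter class that this change removes from URLCanon.h.

```cpp
#include <cstdio>
#include <string>

// Stand-ins for the WTF types so the sketch compiles on its own (assumptions).
using UChar = char16_t;

template<typename CharacterType>
class URLBuffer {
public:
    void append(CharacterType c) { m_data.push_back(c); }
    const std::basic_string<CharacterType>& data() const { return m_data; }

private:
    std::basic_string<CharacterType> m_data;
};

// Same shape as the interface added in URLQueryCharsetConverter.h.
class URLQueryCharsetConverter {
public:
    URLQueryCharsetConverter() { }
    virtual ~URLQueryCharsetConverter() { }
    virtual void convertFromUTF16(const UChar* input, unsigned inputLength, URLBuffer<char>& output) = 0;
};

// Hypothetical converter targeting ISO-8859-1; not part of the patch.
class Latin1QueryConverter : public URLQueryCharsetConverter {
public:
    void convertFromUTF16(const UChar* input, unsigned inputLength, URLBuffer<char>& output) override
    {
        for (unsigned i = 0; i < inputLength; ++i) {
            unsigned codeUnit = input[i];
            if (codeUnit <= 0xFF) {
                // Representable in Latin-1: copy the value as a single byte.
                output.append(static_cast<char>(codeUnit));
                continue;
            }
            // Not representable: emit the decimal HTML entity "&#N;" with
            // '&', '#' and ';' percent-escaped. Simplification: surrogate
            // pairs are not combined, each code unit is handled separately.
            char entity[32];
            int length = std::snprintf(entity, sizeof(entity), "%%26%%23%u%%3B", codeUnit);
            for (int j = 0; j < length; ++j)
                output.append(entity[j]);
        }
    }
};
```

With this sketch, the three UTF-16 code units 'q', '=', U+4F60 produce the bytes of "q=%26%2320320%3B", which matches the example given in the removed comment.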

* WTF.xcodeproj/project.pbxproj:
* wtf/url/api/ParsedURL.cpp:
(WTF::ParsedURL::ParsedURL):
* wtf/url/api/ParsedURL.h:
(ParsedURL):
ParsedURL was using the same constructor for ParsedURLString and for a URL without a base.
That was a mistake on my part; the two now have separate constructors.

* wtf/url/api/URLBuffer.h: Renamed from Source/WTF/wtf/url/src/URLBuffer.h.
(URLBuffer):
(WTF::URLBuffer::URLBuffer):
(WTF::URLBuffer::~URLBuffer):
(WTF::URLBuffer::at):
(WTF::URLBuffer::set):
(WTF::URLBuffer::capacity):
(WTF::URLBuffer::length):
(WTF::URLBuffer::data):
(WTF::URLBuffer::setLength):
(WTF::URLBuffer::append):
(WTF::URLBuffer::grow):
* wtf/url/api/URLQueryCharsetConverter.h: Added.
(URLQueryCharsetConverter):
(WTF::URLQueryCharsetConverter::URLQueryCharsetConverter):
(WTF::URLQueryCharsetConverter::~URLQueryCharsetConverter):
* wtf/url/src/URLCanon.h:
(URLCanonicalizer):
* wtf/url/src/URLCanonFilesystemurl.cpp:
(WTF::URLCanonicalizer::canonicalizeFileSystemURL):
(WTF::URLCanonicalizer::ReplaceFileSystemURL):
* wtf/url/src/URLCanonFileurl.cpp:
(WTF::URLCanonicalizer::CanonicalizeFileURL):
(WTF::URLCanonicalizer::ReplaceFileURL):
* wtf/url/src/URLCanonInternal.h:
(URLCanonicalizer):
* wtf/url/src/URLCanonQuery.cpp:
(WTF::URLCanonicalizer::CanonicalizeQuery):
(WTF::URLCanonicalizer::ConvertUTF16ToQueryEncoding):
* wtf/url/src/URLCanonRelative.cpp:
(WTF::URLCanonicalizer::resolveRelativeURL):
* wtf/url/src/URLCanonStdURL.cpp:
(WTF::URLCanonicalizer::CanonicalizeStandardURL):
(WTF::URLCanonicalizer::ReplaceStandardURL):
* wtf/url/src/URLUtil.cpp:
(URLUtilities):
(WTF::URLUtilities::canonicalize):
(WTF::URLUtilities::resolveRelative):
(WTF::URLUtilities::ReplaceComponents):
* wtf/url/src/URLUtil.h:
(URLUtilities):
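
The ParsedURL.h note above concerns two constructors that both take a single string. The following is a small sketch of how a tag enum can keep such overloads apart. MiniParsedURL, its Constructor enum, and constructedWith() are hypothetical names used only for illustration; the real class does more work in each constructor.

```cpp
#include <string>

class URLQueryCharsetConverter; // Only used through a pointer here.

// Hypothetical miniature of the overload set; not the real WTF::ParsedURL.
class MiniParsedURL {
public:
    enum ParsedURLStringTag { ParsedURLString };
    enum class Constructor { StringTag, QueryConverter };

    // Selected when the caller passes the ParsedURLString tag.
    MiniParsedURL(const std::string& spec, ParsedURLStringTag)
        : m_spec(spec)
        , m_constructor(Constructor::StringTag)
    {
    }

    // Selected when the caller passes a converter pointer, which may be null.
    MiniParsedURL(const std::string& spec, URLQueryCharsetConverter*)
        : m_spec(spec)
        , m_constructor(Constructor::QueryConverter)
    {
    }

    Constructor constructedWith() const { return m_constructor; }
    const std::string& spec() const { return m_spec; }

private:
    std::string m_spec;
    Constructor m_constructor;
};
```

Because an enumerator never converts implicitly to a pointer and a null pointer never converts to an enum, a call site always states which of the two meanings it wants.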

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@129413 268f45cc-cd09-0410-ab3c-d52691b4dbfc

18 files changed:
Source/WTF/ChangeLog
Source/WTF/WTF.xcodeproj/project.pbxproj
Source/WTF/wtf/url/api/ParsedURL.cpp
Source/WTF/wtf/url/api/ParsedURL.h
Source/WTF/wtf/url/api/URLBuffer.h [moved from Source/WTF/wtf/url/src/URLBuffer.h with 100% similarity]
Source/WTF/wtf/url/api/URLQueryCharsetConverter.h [new file with mode: 0644]
Source/WTF/wtf/url/src/URLCanon.h
Source/WTF/wtf/url/src/URLCanonFilesystemurl.cpp
Source/WTF/wtf/url/src/URLCanonFileurl.cpp
Source/WTF/wtf/url/src/URLCanonInternal.h
Source/WTF/wtf/url/src/URLCanonQuery.cpp
Source/WTF/wtf/url/src/URLCanonRelative.cpp
Source/WTF/wtf/url/src/URLCanonStdURL.cpp
Source/WTF/wtf/url/src/URLUtil.cpp
Source/WTF/wtf/url/src/URLUtil.h
Source/WebCore/ChangeLog
Source/WebCore/platform/KURLWTFURL.cpp
Source/WebCore/platform/mac/KURLMac.mm

diff --git a/Source/WTF/ChangeLog b/Source/WTF/ChangeLog
index fdafbff..1295266 100644
@@ -1,5 +1,70 @@
 2012-09-24  Benjamin Poulain  <benjamin@webkit.org>
 
+        Add support for query encoding to WTFURL
+        https://bugs.webkit.org/show_bug.cgi?id=97422
+
+        Reviewed by Adam Barth.
+
+        Expose character conversion through the new abstract class URLQueryCharsetConverter.
+        URLQueryCharsetConverter is implemented by WebCore to expose the TextEncoding classes.
+
+        Unfortunatelly that forces us to bring over URLBuffer in the public API. We may be able
+        to mitigate that later when moving WTFURL to more templates.
+
+        The change fixes 2 of the URL layout tests.
+
+        * WTF.xcodeproj/project.pbxproj:
+        * wtf/url/api/ParsedURL.cpp:
+        (WTF::ParsedURL::ParsedURL):
+        * wtf/url/api/ParsedURL.h:
+        (ParsedURL):
+        ParsedURL was using the same constructor for ParsedURLString, and URL without a base.
+        That was a mistake on my part, I did not intend that, fixed it now :)
+
+        * wtf/url/api/URLBuffer.h: Renamed from Source/WTF/wtf/url/src/URLBuffer.h.
+        (URLBuffer):
+        (WTF::URLBuffer::URLBuffer):
+        (WTF::URLBuffer::~URLBuffer):
+        (WTF::URLBuffer::at):
+        (WTF::URLBuffer::set):
+        (WTF::URLBuffer::capacity):
+        (WTF::URLBuffer::length):
+        (WTF::URLBuffer::data):
+        (WTF::URLBuffer::setLength):
+        (WTF::URLBuffer::append):
+        (WTF::URLBuffer::grow):
+        * wtf/url/api/URLQueryCharsetConverter.h: Added.
+        (URLQueryCharsetConverter):
+        (WTF::URLQueryCharsetConverter::URLQueryCharsetConverter):
+        (WTF::URLQueryCharsetConverter::~URLQueryCharsetConverter):
+        * wtf/url/src/URLCanon.h:
+        (URLCanonicalizer):
+        * wtf/url/src/URLCanonFilesystemurl.cpp:
+        (WTF::URLCanonicalizer::canonicalizeFileSystemURL):
+        (WTF::URLCanonicalizer::ReplaceFileSystemURL):
+        * wtf/url/src/URLCanonFileurl.cpp:
+        (WTF::URLCanonicalizer::CanonicalizeFileURL):
+        (WTF::URLCanonicalizer::ReplaceFileURL):
+        * wtf/url/src/URLCanonInternal.h:
+        (URLCanonicalizer):
+        * wtf/url/src/URLCanonQuery.cpp:
+        (WTF::URLCanonicalizer::CanonicalizeQuery):
+        (WTF::URLCanonicalizer::ConvertUTF16ToQueryEncoding):
+        * wtf/url/src/URLCanonRelative.cpp:
+        (WTF::URLCanonicalizer::resolveRelativeURL):
+        * wtf/url/src/URLCanonStdURL.cpp:
+        (WTF::URLCanonicalizer::CanonicalizeStandardURL):
+        (WTF::URLCanonicalizer::ReplaceStandardURL):
+        * wtf/url/src/URLUtil.cpp:
+        (URLUtilities):
+        (WTF::URLUtilities::canonicalize):
+        (WTF::URLUtilities::resolveRelative):
+        (WTF::URLUtilities::ReplaceComponents):
+        * wtf/url/src/URLUtil.h:
+        (URLUtilities):
+
+2012-09-24  Benjamin Poulain  <benjamin@webkit.org>
+
         Integrate most of GoogleURL in WTFURL
         https://bugs.webkit.org/show_bug.cgi?id=97405
 
diff --git a/Source/WTF/WTF.xcodeproj/project.pbxproj b/Source/WTF/WTF.xcodeproj/project.pbxproj
index 82f721c..2d27b74 100644
@@ -12,6 +12,7 @@
                143F61201565F0F900DB514A /* RAMSize.h in Headers */ = {isa = PBXBuildFile; fileRef = 143F611E1565F0F900DB514A /* RAMSize.h */; settings = {ATTRIBUTES = (Private, ); }; };
                14F3B0F715E45E4600210069 /* SaturatedArithmetic.h in Headers */ = {isa = PBXBuildFile; fileRef = 14F3B0F615E45E4600210069 /* SaturatedArithmetic.h */; settings = {ATTRIBUTES = (Private, ); }; };
                26147B0A15DDCCDC00DDB907 /* IntegerToStringConversion.h in Headers */ = {isa = PBXBuildFile; fileRef = 26147B0815DDCCDC00DDB907 /* IntegerToStringConversion.h */; };
+               2661122E160FEAD40013F5C3 /* URLQueryCharsetConverter.h in Headers */ = {isa = PBXBuildFile; fileRef = 2661122D160FEAD40013F5C3 /* URLQueryCharsetConverter.h */; };
                26E6C1EE1609037300CA6AF4 /* URLCanonEtc.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 26E6C1CE1609037300CA6AF4 /* URLCanonEtc.cpp */; };
                26E6C1EF1609037300CA6AF4 /* URLCanonFilesystemurl.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 26E6C1CF1609037300CA6AF4 /* URLCanonFilesystemurl.cpp */; };
                26E6C1F01609037300CA6AF4 /* URLCanonFileurl.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 26E6C1D01609037300CA6AF4 /* URLCanonFileurl.cpp */; };
                143F611E1565F0F900DB514A /* RAMSize.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = RAMSize.h; sourceTree = "<group>"; };
                14F3B0F615E45E4600210069 /* SaturatedArithmetic.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = SaturatedArithmetic.h; sourceTree = "<group>"; };
                26147B0815DDCCDC00DDB907 /* IntegerToStringConversion.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = IntegerToStringConversion.h; sourceTree = "<group>"; };
+               2661122D160FEAD40013F5C3 /* URLQueryCharsetConverter.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = URLQueryCharsetConverter.h; sourceTree = "<group>"; };
                26E6C1CE1609037300CA6AF4 /* URLCanonEtc.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = URLCanonEtc.cpp; sourceTree = "<group>"; };
                26E6C1CF1609037300CA6AF4 /* URLCanonFilesystemurl.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = URLCanonFilesystemurl.cpp; sourceTree = "<group>"; };
                26E6C1D01609037300CA6AF4 /* URLCanonFileurl.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = URLCanonFileurl.cpp; sourceTree = "<group>"; };
                        children = (
                                A8A47360151A825B004123FF /* ParsedURL.cpp */,
                                A8A47361151A825B004123FF /* ParsedURL.h */,
+                               A8A47365151A825B004123FF /* URLBuffer.h */,
+                               2661122D160FEAD40013F5C3 /* URLQueryCharsetConverter.h */,
                                4330F38E15745B0500AAFA8F /* URLString.cpp */,
                                A8A47362151A825B004123FF /* URLString.h */,
                        );
                        isa = PBXGroup;
                        children = (
                                A8A47364151A825B004123FF /* RawURLBuffer.h */,
-                               A8A47365151A825B004123FF /* URLBuffer.h */,
                                26E6C1E11609037300CA6AF4 /* URLCanon.h */,
                                26E6C1CE1609037300CA6AF4 /* URLCanonEtc.cpp */,
                                26E6C1CF1609037300CA6AF4 /* URLCanonFilesystemurl.cpp */,
                                26E6C2031609037300CA6AF4 /* URLFile.h in Headers */,
                                26E6C2051609037300CA6AF4 /* URLParseInternal.h in Headers */,
                                26E6C2081609037300CA6AF4 /* URLParse.h in Headers */,
+                               2661122E160FEAD40013F5C3 /* URLQueryCharsetConverter.h in Headers */,
                                26E6C20A1609037300CA6AF4 /* URLUtilInternal.h in Headers */,
                                26E6C20D1609037300CA6AF4 /* URLUtil.h in Headers */,
                        );
diff --git a/Source/WTF/wtf/url/api/ParsedURL.cpp b/Source/WTF/wtf/url/api/ParsedURL.cpp
index 5046319..c700dd5 100644
@@ -38,7 +38,7 @@
 
 namespace WTF {
 
-ParsedURL::ParsedURL(const String& urlString)
+ParsedURL::ParsedURL(const String& urlString, ParsedURLStringTag)
 {
     unsigned urlStringLength = urlString.length();
     if (!urlStringLength)
@@ -67,7 +67,35 @@ ParsedURL::ParsedURL(const String& urlString)
         m_spec = URLString(String(outputBuffer.data(), outputBuffer.length()));
 }
 
-ParsedURL::ParsedURL(const ParsedURL& base, const String& relative)
+ParsedURL::ParsedURL(const String& urlString, URLQueryCharsetConverter* queryCharsetConverter)
+{
+    unsigned urlStringLength = urlString.length();
+    if (!urlStringLength)
+        return;
+
+    RawURLBuffer<char> outputBuffer;
+    String base;
+    const CString& baseStr = base.utf8();
+    bool isValid = false;
+    URLSegments baseSegments;
+
+    // FIXME: we should take shortcuts here! We do not have to resolve the relative part.
+    if (urlString.is8Bit())
+        isValid = URLUtilities::resolveRelative(baseStr.data(), baseSegments,
+                                                reinterpret_cast<const char*>(urlString.characters8()), urlStringLength,
+                                                queryCharsetConverter,
+                                                outputBuffer, &m_segments);
+    else
+        isValid = URLUtilities::resolveRelative(baseStr.data(), baseSegments,
+                                                urlString.characters16(), urlStringLength,
+                                                queryCharsetConverter,
+                                                outputBuffer, &m_segments);
+
+    if (isValid)
+        m_spec = URLString(String(outputBuffer.data(), outputBuffer.length()));
+}
+
+ParsedURL::ParsedURL(const ParsedURL& base, const String& relative, URLQueryCharsetConverter* queryCharsetConverter)
 {
     if (!base.isValid())
         return;
@@ -85,12 +113,12 @@ ParsedURL::ParsedURL(const ParsedURL& base, const String& relative)
     if (relative.is8Bit())
         isValid = URLUtilities::resolveRelative(baseStr.data(), base.m_segments,
                                                 reinterpret_cast<const char*>(relative.characters8()), relativeLength,
-                                                /* charsetConverter */ 0,
+                                                queryCharsetConverter,
                                                 outputBuffer, &m_segments);
     else
         isValid = URLUtilities::resolveRelative(baseStr.data(), base.m_segments,
                                                 relative.characters16(), relativeLength,
-                                                /* charsetConverter */ 0,
+                                                queryCharsetConverter,
                                                 outputBuffer, &m_segments);
 
     if (isValid)
diff --git a/Source/WTF/wtf/url/api/ParsedURL.h b/Source/WTF/wtf/url/api/ParsedURL.h
index 014e293..79a72dd 100644
 namespace WTF {
 
 class URLComponent;
+class URLQueryCharsetConverter;
 
 // ParsedURL represents a valid URL decomposed by components.
 class ParsedURL {
 public:
+    enum ParsedURLStringTag { ParsedURLString };
+
     ParsedURL() { };
-    WTF_EXPORT_PRIVATE explicit ParsedURL(const String&);
-    WTF_EXPORT_PRIVATE explicit ParsedURL(const ParsedURL& base, const String& relative);
+    WTF_EXPORT_PRIVATE explicit ParsedURL(const String&, ParsedURLStringTag);
+
+    WTF_EXPORT_PRIVATE explicit ParsedURL(const String&, URLQueryCharsetConverter*);
+    WTF_EXPORT_PRIVATE explicit ParsedURL(const ParsedURL& base, const String& relative, URLQueryCharsetConverter*);
 
     WTF_EXPORT_PRIVATE ParsedURL isolatedCopy() const;
 
diff --git a/Source/WTF/wtf/url/api/URLQueryCharsetConverter.h b/Source/WTF/wtf/url/api/URLQueryCharsetConverter.h
new file mode 100644
index 0000000..5771a42
--- /dev/null
+++ b/Source/WTF/wtf/url/api/URLQueryCharsetConverter.h
@@ -0,0 +1,54 @@
+/*
+ * Copyright 2007 Google Inc. All rights reserved.
+ * Copyright 2012 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following disclaimer
+ * in the documentation and/or other materials provided with the
+ * distribution.
+ *     * Neither the name of Google Inc. nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef URLQueryCharsetConverter_h
+#define URLQueryCharsetConverter_h
+
+#include <wtf/unicode/Unicode.h>
+
+#if USE(WTFURL)
+
+namespace WTF {
+
+template<typename CharacterType> class URLBuffer;
+
+class URLQueryCharsetConverter {
+public:
+    URLQueryCharsetConverter() { }
+    virtual ~URLQueryCharsetConverter() { }
+    virtual void convertFromUTF16(const UChar* input, unsigned inputLength, URLBuffer<char>& output) = 0;
+};
+
+} // namespace WTF
+
+#endif // USE(WTFURL)
+
+#endif // URLQueryCharsetConverter_h
diff --git a/Source/WTF/wtf/url/src/URLCanon.h b/Source/WTF/wtf/url/src/URLCanon.h
index b66649c..d1e2b1c 100644
 
 namespace WTF {
 
-namespace URLCanonicalizer {
-
-// Character set converter ----------------------------------------------------
-//
-// Converts query strings into a custom encoding. The embedder can supply an
-// implementation of this class to interface with their own character set
-// conversion libraries.
-//
-// Embedders will want to see the unit test for the ICU version.
+class URLQueryCharsetConverter;
 
-class CharsetConverter {
-public:
-    CharsetConverter() { }
-    virtual ~CharsetConverter() { }
-
-    // Converts the given input string from UTF-16 to whatever output format the
-    // converter supports. This is used only for the query encoding conversion,
-    // which does not fail. Instead, the converter should insert "invalid
-    // character" characters in the output for invalid sequences, and do the
-    // best it can.
-    //
-    // If the input contains a character not representable in the output
-    // character set, the converter should append the HTML entity sequence in
-    // decimal, (such as "&#20320;") with escaping of the ampersand, number
-    // sign, and semicolon (in the previous example it would be
-    // "%26%2320320%3B"). This rule is based on what IE does in this situation.
-    virtual void ConvertFromUTF16(const UChar* input, int inputLength, URLBuffer<char>& output) = 0;
-};
+namespace URLCanonicalizer {
 
 // Whitespace -----------------------------------------------------------------
 
@@ -264,8 +239,8 @@ bool FileCanonicalizePath(const UChar* spec, const URLComponent& path, URLBuffer
 // if necessary, for ASCII input, no conversions are necessary.
 //
 // The converter can be null. In this case, the output encoding will be UTF-8.
-void CanonicalizeQuery(const char* spec, const URLComponent& query, CharsetConverter*, URLBuffer<char>&, URLComponent* outputQuery);
-void CanonicalizeQuery(const UChar* spec, const URLComponent& query, CharsetConverter*, URLBuffer<char>&, URLComponent* outputQuery);
+void CanonicalizeQuery(const char* spec, const URLComponent& query, URLQueryCharsetConverter*, URLBuffer<char>&, URLComponent* outputQuery);
+void CanonicalizeQuery(const UChar* spec, const URLComponent& query, URLQueryCharsetConverter*, URLBuffer<char>&, URLComponent* outputQuery);
 
 // Ref: Prepends the # if needed. The output will be UTF-8 (this is the only
 // canonicalizer that does not produce ASCII output). The output is
@@ -287,21 +262,21 @@ void canonicalizeFragment(const UChar* spec, const URLComponent& path, URLBuffer
 // The 8-bit versions require UTF-8 encoding.
 
 // Use for standard URLs with authorities and paths.
-bool CanonicalizeStandardURL(const char* spec, int specLength, const URLSegments& parsed, CharsetConverter* queryConverter,
+bool CanonicalizeStandardURL(const char* spec, int specLength, const URLSegments& parsed, URLQueryCharsetConverter* queryConverter,
                              URLBuffer<char>&, URLSegments* outputParsed);
-bool CanonicalizeStandardURL(const UChar* spec, int specLength, const URLSegments& parsed, CharsetConverter* queryConverter,
+bool CanonicalizeStandardURL(const UChar* spec, int specLength, const URLSegments& parsed, URLQueryCharsetConverter* queryConverter,
                              URLBuffer<char>&, URLSegments* outputParsed);
 
 // Use for file URLs.
-bool CanonicalizeFileURL(const char* spec, int specLength, const URLSegments& parsed, CharsetConverter* queryConverter,
+bool CanonicalizeFileURL(const char* spec, int specLength, const URLSegments& parsed, URLQueryCharsetConverter* queryConverter,
                          URLBuffer<char>&, URLSegments* outputParsed);
-bool CanonicalizeFileURL(const UChar* spec, int specLength, const URLSegments& parsed, CharsetConverter* queryConverter,
+bool CanonicalizeFileURL(const UChar* spec, int specLength, const URLSegments& parsed, URLQueryCharsetConverter* queryConverter,
                          URLBuffer<char>&, URLSegments* outputParsed);
 
 // Use for filesystem URLs.
-bool canonicalizeFileSystemURL(const char* spec, const URLSegments& parsed, CharsetConverter* queryConverter,
+bool canonicalizeFileSystemURL(const char* spec, const URLSegments& parsed, URLQueryCharsetConverter* queryConverter,
                                URLBuffer<char>&, URLSegments& outputParsed);
-bool canonicalizeFileSystemURL(const UChar* spec, const URLSegments& parsed, CharsetConverter* queryConverter,
+bool canonicalizeFileSystemURL(const UChar* spec, const URLSegments& parsed, URLQueryCharsetConverter* queryConverter,
                                URLBuffer<char>&, URLSegments& outputParsed);
 
 // Use for path URLs such as javascript. This does not modify the path in any
@@ -522,13 +497,13 @@ private:
 bool ReplaceStandardURL(const char* base,
                         const URLSegments& baseParsed,
                         const Replacements<char>&,
-                        CharsetConverter* queryConverter,
+                        URLQueryCharsetConverter* queryConverter,
                         URLBuffer<char>&,
                         URLSegments* outputParsed);
 bool ReplaceStandardURL(const char* base,
                         const URLSegments& baseParsed,
                         const Replacements<UChar>&,
-                        CharsetConverter* queryConverter,
+                        URLQueryCharsetConverter* queryConverter,
                         URLBuffer<char>&,
                         URLSegments* outputParsed);
 
@@ -537,13 +512,13 @@ bool ReplaceStandardURL(const char* base,
 bool ReplaceFileSystemURL(const char* base,
                           const URLSegments& baseParsed,
                           const Replacements<char>&,
-                          CharsetConverter* queryConverter,
+                          URLQueryCharsetConverter* queryConverter,
                           URLBuffer<char>&,
                           URLSegments* outputParsed);
 bool ReplaceFileSystemURL(const char* base,
                           const URLSegments& baseParsed,
                           const Replacements<UChar>&,
-                          CharsetConverter* queryConverter,
+                          URLQueryCharsetConverter* queryConverter,
                           URLBuffer<char>&,
                           URLSegments* outputParsed);
 
@@ -552,13 +527,13 @@ bool ReplaceFileSystemURL(const char* base,
 bool ReplaceFileURL(const char* base,
                     const URLSegments& baseParsed,
                     const Replacements<char>&,
-                    CharsetConverter* queryConverter,
+                    URLQueryCharsetConverter* queryConverter,
                     URLBuffer<char>&,
                     URLSegments* outputParsed);
 bool ReplaceFileURL(const char* base,
                     const URLSegments& baseParsed,
                     const Replacements<UChar>&,
-                    CharsetConverter* queryConverter,
+                    URLQueryCharsetConverter* queryConverter,
                     URLBuffer<char>&,
                     URLSegments* outputParsed);
 
@@ -626,11 +601,11 @@ bool isRelativeURL(const char* base, const URLSegments& baseParsed,
 // was intended by the web page author or caller.
 bool resolveRelativeURL(const char* baseURL, const URLSegments& baseParsed, bool baseIsFile,
                         const char* relativeURL, const URLComponent& relativeComponent,
-                        CharsetConverter* queryConverter,
+                        URLQueryCharsetConverter* queryConverter,
                         URLBuffer<char>&, URLSegments* outputParsed);
 bool resolveRelativeURL(const char* baseURL, const URLSegments& baseParsed, bool baseIsFile,
                         const UChar* relativeURL, const URLComponent& relativeComponent,
-                        CharsetConverter* queryConverter,
+                        URLQueryCharsetConverter* queryConverter,
                         URLBuffer<char>&, URLSegments* outputParsed);
 
 } // namespace URLCanonicalizer
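
The CanonicalizeQuery comment in this header says the converter can be null, in which case the output encoding is UTF-8. Below is a minimal sketch of that selection under simplifying assumptions: QueryConverter and convertQuery are hypothetical names, std::string stands in for URLBuffer<char>, and the percent-escaping that the real canonicalizer applies afterwards is left out.

```cpp
#include <string>

using UChar = char16_t;

// Hypothetical stand-in for the converter interface; std::string replaces URLBuffer<char>.
struct QueryConverter {
    virtual ~QueryConverter() { }
    virtual void convertFromUTF16(const UChar* input, unsigned inputLength, std::string& output) = 0;
};

// Appends one Unicode code point to the output as UTF-8.
static void appendUTF8(unsigned codePoint, std::string& output)
{
    if (codePoint < 0x80)
        output += static_cast<char>(codePoint);
    else if (codePoint < 0x800) {
        output += static_cast<char>(0xC0 | (codePoint >> 6));
        output += static_cast<char>(0x80 | (codePoint & 0x3F));
    } else if (codePoint < 0x10000) {
        output += static_cast<char>(0xE0 | (codePoint >> 12));
        output += static_cast<char>(0x80 | ((codePoint >> 6) & 0x3F));
        output += static_cast<char>(0x80 | (codePoint & 0x3F));
    } else {
        output += static_cast<char>(0xF0 | (codePoint >> 18));
        output += static_cast<char>(0x80 | ((codePoint >> 12) & 0x3F));
        output += static_cast<char>(0x80 | ((codePoint >> 6) & 0x3F));
        output += static_cast<char>(0x80 | (codePoint & 0x3F));
    }
}

// Uses the converter when one is supplied, otherwise falls back to UTF-8.
void convertQuery(const UChar* input, unsigned length, QueryConverter* converter, std::string& output)
{
    if (converter) {
        converter->convertFromUTF16(input, length, output);
        return;
    }
    for (unsigned i = 0; i < length; ++i) {
        unsigned codePoint = input[i];
        bool isLead = codePoint >= 0xD800 && codePoint <= 0xDBFF;
        if (isLead && i + 1 < length && input[i + 1] >= 0xDC00 && input[i + 1] <= 0xDFFF) {
            // Combine a valid surrogate pair into a single code point.
            codePoint = 0x10000 + ((codePoint - 0xD800) << 10) + (input[i + 1] - 0xDC00);
            ++i;
        } else if (codePoint >= 0xD800 && codePoint <= 0xDFFF)
            codePoint = 0xFFFD; // Lone surrogate: substitute the replacement character.
        appendUTF8(codePoint, output);
    }
}
```

Passing a null converter with the code units 'a', U+00E9, U+4F60 yields the UTF-8 bytes 61 C3 A9 E4 BD A0.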
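diff --git a/Source/WTF/wtf/url/src/URLCanonFilesystemurl.cpp b/Source/WTF/wtf/url/src/URLCanonFilesystemurl.cpp
index e7018f7..361bbbe 100644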
@@ -55,7 +55,7 @@ template<typename CharacterType, typename UCHAR>
 bool doCanonicalizeFileSystemURL(const CharacterType* spec,
                                  const URLComponentSource<CharacterType>& source,
                                  const URLSegments& parsed,
-                                 CharsetConverter* charsetConverter,
+                                 URLQueryCharsetConverter* charsetConverter,
                                  URLBuffer<char>& output,
                                  URLSegments& outputParsed)
 {
@@ -114,7 +114,7 @@ bool doCanonicalizeFileSystemURL(const CharacterType* spec,
 
 bool canonicalizeFileSystemURL(const char* spec,
                                const URLSegments& parsed,
-                               CharsetConverter* charsetConverter,
+                               URLQueryCharsetConverter* charsetConverter,
                                URLBuffer<char>& output,
                                URLSegments& outputParsed)
 {
@@ -123,7 +123,7 @@ bool canonicalizeFileSystemURL(const char* spec,
 
 bool canonicalizeFileSystemURL(const UChar* spec,
                                const URLSegments& parsed,
-                               CharsetConverter* charsetConverter,
+                               URLQueryCharsetConverter* charsetConverter,
                                URLBuffer<char>& output,
                                URLSegments& outputParsed)
 {
@@ -133,7 +133,7 @@ bool canonicalizeFileSystemURL(const UChar* spec,
 bool ReplaceFileSystemURL(const char* base,
                           const URLSegments& baseParsed,
                           const Replacements<char>& replacements,
-                          CharsetConverter* charsetConverter,
+                          URLQueryCharsetConverter* charsetConverter,
                           URLBuffer<char>& output,
                           URLSegments* outputParsed)
 {
@@ -146,7 +146,7 @@ bool ReplaceFileSystemURL(const char* base,
 bool ReplaceFileSystemURL(const char* base,
                           const URLSegments& baseParsed,
                           const Replacements<UChar>& replacements,
-                          CharsetConverter* charsetConverter,
+                          URLQueryCharsetConverter* charsetConverter,
                           URLBuffer<char>& output,
                           URLSegments* outputParsed)
 {
diff --git a/Source/WTF/wtf/url/src/URLCanonFileurl.cpp b/Source/WTF/wtf/url/src/URLCanonFileurl.cpp
index c4f8412..73a9f6d 100644
@@ -122,7 +122,7 @@ bool doFileCanonicalizePath(const CharacterType* spec,
 template<typename CharacterType, typename UCHAR>
 bool doCanonicalizeFileURL(const URLComponentSource<CharacterType>& source,
                            const URLSegments& parsed,
-                           CharsetConverter* queryConverter,
+                           URLQueryCharsetConverter* queryConverter,
                            URLBuffer<char>& output,
                            URLSegments& outputParsed)
 {
@@ -157,7 +157,7 @@ bool doCanonicalizeFileURL(const URLComponentSource<CharacterType>& source,
 bool CanonicalizeFileURL(const char* spec,
                          int /* specLength */,
                          const URLSegments& parsed,
-                         CharsetConverter* queryConverter,
+                         URLQueryCharsetConverter* queryConverter,
                          URLBuffer<char>& output,
                          URLSegments* outputParsed)
 {
@@ -167,7 +167,7 @@ bool CanonicalizeFileURL(const char* spec,
 bool CanonicalizeFileURL(const UChar* spec,
                          int /* specLength */,
                          const URLSegments& parsed,
-                         CharsetConverter* queryConverter,
+                         URLQueryCharsetConverter* queryConverter,
                          URLBuffer<char>& output,
                          URLSegments* outputParsed)
 {
@@ -193,7 +193,7 @@ bool FileCanonicalizePath(const UChar* spec,
 bool ReplaceFileURL(const char* base,
                     const URLSegments& baseParsed,
                     const Replacements<char>& replacements,
-                    CharsetConverter* queryConverter,
+                    URLQueryCharsetConverter* queryConverter,
                     URLBuffer<char>& output,
                     URLSegments* outputParsed)
 {
@@ -206,7 +206,7 @@ bool ReplaceFileURL(const char* base,
 bool ReplaceFileURL(const char* base,
                     const URLSegments& baseParsed,
                     const Replacements<UChar>& replacements,
-                    CharsetConverter* queryConverter,
+                    URLQueryCharsetConverter* queryConverter,
                     URLBuffer<char>& output,
                     URLSegments* outputParsed)
 {
diff --git a/Source/WTF/wtf/url/src/URLCanonInternal.h b/Source/WTF/wtf/url/src/URLCanonInternal.h
index 4c10c3e..9b29706 100644
@@ -311,7 +311,7 @@ bool ConvertUTF8ToUTF16(const char* input, int inputLength, URLBuffer<UChar>& ou
 
 // Converts from UTF-16 to 8-bit using the character set converter. If the
 // converter is null, this will use UTF-8.
-void ConvertUTF16ToQueryEncoding(const UChar* input, const URLComponent& query, CharsetConverter*, URLBuffer<char>& output);
+void ConvertUTF16ToQueryEncoding(const UChar* input, const URLComponent& query, URLQueryCharsetConverter*, URLBuffer<char>& output);
 
 // Applies the replacements to the given component source. The component source
 // should be pre-initialized to the "old" base. That is, all pointers will
diff --git a/Source/WTF/wtf/url/src/URLCanonQuery.cpp b/Source/WTF/wtf/url/src/URLCanonQuery.cpp
index 42673fb..0b829ba 100644
@@ -35,6 +35,7 @@
 #include "RawURLBuffer.h"
 #include "URLCanonInternal.h"
 #include "URLCharacterTypes.h"
+#include "URLQueryCharsetConverter.h"
 #include <wtf/text/ASCIIFastPath.h>
 
 // Query canonicalization in IE
@@ -107,25 +108,25 @@ void appendRaw8BitQueryString(const CharacterType* source, int length, URLBuffer
 
 // Runs the converter on the given UTF-8 input. Since the converter expects
 // UTF-16, we have to convert first. The converter must be non-null.
-void runConverter(const char* spec, const URLComponent& query, CharsetConverter* converter, URLBuffer<char>& output)
+void runConverter(const char* spec, const URLComponent& query, URLQueryCharsetConverter* converter, URLBuffer<char>& output)
 {
     // This function will replace any misencoded values with the invalid
     // character. This is what we want so we don't have to check for error.
     RawURLBuffer<UChar> utf16;
     ConvertUTF8ToUTF16(&spec[query.begin()], query.length(), utf16);
-    converter->ConvertFromUTF16(utf16.data(), utf16.length(), output);
+    converter->convertFromUTF16(utf16.data(), utf16.length(), output);
 }
 
 // Runs the converter with the given UTF-16 input. We don't have to do
 // anything, but this overriddden function allows us to use the same code
 // for both UTF-8 and UTF-16 input.
-void runConverter(const UChar* spec, const URLComponent& query, CharsetConverter* converter, URLBuffer<char>& output)
+void runConverter(const UChar* spec, const URLComponent& query, URLQueryCharsetConverter* converter, URLBuffer<char>& output)
 {
-    converter->ConvertFromUTF16(&spec[query.begin()], query.length(), output);
+    converter->convertFromUTF16(&spec[query.begin()], query.length(), output);
 }
 
 template<typename CharacterType>
-void doConvertToQueryEncoding(const CharacterType* spec, const URLComponent& query, CharsetConverter* converter, URLBuffer<char>& output)
+void doConvertToQueryEncoding(const CharacterType* spec, const URLComponent& query, URLQueryCharsetConverter* converter, URLBuffer<char>& output)
 {
     if (isAllASCII(spec, query)) {
         // Easy: the input can just be appended with no character set conversions.
@@ -146,7 +147,7 @@ void doConvertToQueryEncoding(const CharacterType* spec, const URLComponent& que
 }
 
 template<typename CharacterType>
-void doCanonicalizeQuery(const CharacterType* spec, const URLComponent& query, CharsetConverter* converter,
+void doCanonicalizeQuery(const CharacterType* spec, const URLComponent& query, URLQueryCharsetConverter* converter,
                          URLBuffer<char>& output, URLComponent& outputQueryComponent)
 {
     if (query.length() < 0) {
@@ -164,19 +165,19 @@ void doCanonicalizeQuery(const CharacterType* spec, const URLComponent& query, C
 
 } // namespace
 
-void CanonicalizeQuery(const char* spec, const URLComponent& query, CharsetConverter* converter,
+void CanonicalizeQuery(const char* spec, const URLComponent& query, URLQueryCharsetConverter* converter,
                        URLBuffer<char>& output, URLComponent* outputQueryComponent)
 {
     doCanonicalizeQuery(spec, query, converter, output, *outputQueryComponent);
 }
 
-void CanonicalizeQuery(const UChar* spec, const URLComponent& query, CharsetConverter* converter,
+void CanonicalizeQuery(const UChar* spec, const URLComponent& query, URLQueryCharsetConverter* converter,
                        URLBuffer<char>& output, URLComponent* outputQueryComponent)
 {
     doCanonicalizeQuery(spec, query, converter, output, *outputQueryComponent);
 }
 
-void ConvertUTF16ToQueryEncoding(const UChar* input, const URLComponent& query, CharsetConverter* converter, URLBuffer<char>& output)
+void ConvertUTF16ToQueryEncoding(const UChar* input, const URLComponent& query, URLQueryCharsetConverter* converter, URLBuffer<char>& output)
 {
     doConvertToQueryEncoding<UChar>(input, query, converter, output);
 }
index 3564ea3..7ef0500 100644 (file)
@@ -283,7 +283,7 @@ bool doResolveRelativePath(const char* baseURL,
                            bool /* baseIsFile */,
                            const CHAR* relativeURL,
                            const URLComponent& relativeComponent,
-                           CharsetConverter* queryConverter,
+                           URLQueryCharsetConverter* queryConverter,
                            URLBuffer<char>& output,
                            URLSegments* outputParsed)
 {
@@ -390,7 +390,7 @@ bool doResolveRelativeHost(const char* baseURL,
                            const URLSegments& baseParsed,
                            const CHAR* relativeURL,
                            const URLComponent& relativeComponent,
-                           CharsetConverter* queryConverter,
+                           URLQueryCharsetConverter* queryConverter,
                            URLBuffer<char>& output,
                            URLSegments* outputParsed)
 {
@@ -421,7 +421,7 @@ bool doResolveRelativeHost(const char* baseURL,
 template<typename CharacterType>
 bool doResolveAbsoluteFile(const CharacterType* relativeURL,
                            const URLComponent& relativeComponent,
-                           CharsetConverter* queryConverter,
+                           URLQueryCharsetConverter* queryConverter,
                            URLBuffer<char>& output,
                            URLSegments& outputParsed)
 {
@@ -444,7 +444,7 @@ bool doResolveRelativeURL(const char* baseURL,
                           bool baseIsFile,
                           const CHAR* relativeURL,
                           const URLComponent& relativeComponent,
-                          CharsetConverter* queryConverter,
+                          URLQueryCharsetConverter* queryConverter,
                           URLBuffer<char>& output,
                           URLSegments* outputParsed)
 {
@@ -544,7 +544,7 @@ bool resolveRelativeURL(const char* baseURL,
                         bool baseIsFile,
                         const char* relativeURL,
                         const URLComponent& relativeComponent,
-                        CharsetConverter* queryConverter,
+                        URLQueryCharsetConverter* queryConverter,
                         URLBuffer<char>& output,
                         URLSegments* outputParsed)
 {
@@ -557,7 +557,7 @@ bool resolveRelativeURL(const char* baseURL,
                         bool baseIsFile,
                         const UChar* relativeURL,
                         const URLComponent& relativeComponent,
-                        CharsetConverter* queryConverter,
+                        URLQueryCharsetConverter* queryConverter,
                         URLBuffer<char>& output,
                         URLSegments* outputParsed)
 {
index 12d9e9e..5e2dd1c 100644 (file)
@@ -49,7 +49,7 @@ namespace {
 template<typename CHAR, typename UCHAR>
 bool doCanonicalizeStandardURL(const URLComponentSource<CHAR>& source,
                                const URLSegments& parsed,
-                               CharsetConverter* queryConverter,
+                               URLQueryCharsetConverter* queryConverter,
                                URLBuffer<char>& output,
                                URLSegments& outputParsed)
 {
@@ -146,7 +146,7 @@ int defaultPortForScheme(const char* scheme, int schemeLength)
 bool CanonicalizeStandardURL(const char* spec,
                              int /* specLength */,
                              const URLSegments& parsed,
-                             CharsetConverter* queryConverter,
+                             URLQueryCharsetConverter* queryConverter,
                              URLBuffer<char>& output,
                              URLSegments* outputParsed)
 {
@@ -157,7 +157,7 @@ bool CanonicalizeStandardURL(const char* spec,
 bool CanonicalizeStandardURL(const UChar* spec,
                              int /* specLength */,
                              const URLSegments& parsed,
-                             CharsetConverter* queryConverter,
+                             URLQueryCharsetConverter* queryConverter,
                              URLBuffer<char>& output,
                              URLSegments* outputParsed)
 {
@@ -177,7 +177,7 @@ bool CanonicalizeStandardURL(const UChar* spec,
 bool ReplaceStandardURL(const char* base,
                         const URLSegments& baseParsed,
                         const Replacements<char>& replacements,
-                        CharsetConverter* queryConverter,
+                        URLQueryCharsetConverter* queryConverter,
                         URLBuffer<char>& output,
                         URLSegments* outputParsed)
 {
@@ -192,7 +192,7 @@ bool ReplaceStandardURL(const char* base,
 bool ReplaceStandardURL(const char* base,
                         const URLSegments& baseParsed,
                         const Replacements<UChar>& replacements,
-                        CharsetConverter* queryConverter,
+                        URLQueryCharsetConverter* queryConverter,
                         URLBuffer<char>& output,
                         URLSegments* outputParsed)
 {
index 9dcfabc..d053513 100644 (file)
@@ -121,7 +121,7 @@ bool doFindAndCompareScheme(const CharacterType* str, int strLength, const char*
 
 template<typename CharacterType>
 bool doCanonicalize(const CharacterType* inSpec, int inSpecLength,
-                    URLCanonicalizer::CharsetConverter* charsetConverter,
+                    URLQueryCharsetConverter* charsetConverter,
                     URLBuffer<char>& output, URLSegments& ouputParsed)
 {
     // Remove any whitespace from the middle of the relative URL, possibly
@@ -194,7 +194,7 @@ bool doCanonicalize(const CharacterType* inSpec, int inSpecLength,
 template<typename CharacterType>
 bool doResolveRelative(const char* baseSpec, const URLSegments& baseParsed,
                        const CharacterType* inRelative, int inRelativeLength,
-                       URLCanonicalizer::CharsetConverter* charsetConverter,
+                       URLQueryCharsetConverter* charsetConverter,
                        URLBuffer<char>& output, URLSegments* ouputParsed)
 {
     // Remove any whitespace from the middle of the relative URL, possibly
@@ -232,7 +232,7 @@ bool doReplaceComponents(const char* spec,
                          int specLength,
                          const URLSegments& parsed,
                          const URLCanonicalizer::Replacements<CharacterType>& replacements,
-                         URLCanonicalizer::CharsetConverter* charsetConverter,
+                         URLQueryCharsetConverter* charsetConverter,
                          URLBuffer<char>& output,
                          URLSegments& outputParsed)
 {
@@ -340,14 +340,14 @@ bool FindAndCompareScheme(const UChar* str, int strLength, const char* compare,
 }
 
 bool canonicalize(const char* spec, int specLength,
-                  URLCanonicalizer::CharsetConverter* charsetConverter,
+                  URLQueryCharsetConverter* charsetConverter,
                   URLBuffer<char>& output, URLSegments& ouputParsed)
 {
     return doCanonicalize(spec, specLength, charsetConverter, output, ouputParsed);
 }
 
 bool canonicalize(const UChar* spec, int specLength,
-                  URLCanonicalizer::CharsetConverter* charsetConverter,
+                  URLQueryCharsetConverter* charsetConverter,
                   URLBuffer<char>& output, URLSegments& ouputParsed)
 {
     return doCanonicalize(spec, specLength, charsetConverter, output, ouputParsed);
@@ -355,7 +355,7 @@ bool canonicalize(const UChar* spec, int specLength,
 
 bool resolveRelative(const char* baseSpec, const URLSegments& baseParsed,
                      const char* relative, int relativeLength,
-                     URLCanonicalizer::CharsetConverter* charsetConverter,
+                     URLQueryCharsetConverter* charsetConverter,
                      URLBuffer<char>& output, URLSegments* ouputParsed)
 {
     return doResolveRelative(baseSpec, baseParsed,
@@ -365,7 +365,7 @@ bool resolveRelative(const char* baseSpec, const URLSegments& baseParsed,
 
 bool resolveRelative(const char* baseSpec, const URLSegments& baseParsed,
                      const UChar* relative, int relativeLength,
-                     URLCanonicalizer::CharsetConverter* charsetConverter,
+                     URLQueryCharsetConverter* charsetConverter,
                      URLBuffer<char>& output, URLSegments* ouputParsed)
 {
     return doResolveRelative(baseSpec, baseParsed,
@@ -377,7 +377,7 @@ bool ReplaceComponents(const char* spec,
                        int specLength,
                        const URLSegments& parsed,
                        const URLCanonicalizer::Replacements<char>& replacements,
-                       URLCanonicalizer::CharsetConverter* charsetConverter,
+                       URLQueryCharsetConverter* charsetConverter,
                        URLBuffer<char>& output,
                        URLSegments* outputParsed)
 {
@@ -389,7 +389,7 @@ bool ReplaceComponents(const char* spec,
                        int specLength,
                        const URLSegments& parsed,
                        const URLCanonicalizer::Replacements<UChar>& replacements,
-                       URLCanonicalizer::CharsetConverter* charsetConverter,
+                       URLQueryCharsetConverter* charsetConverter,
                        URLBuffer<char>& output,
                        URLSegments* outputParsed)
 {
index 0f9ca93..a739241 100644 (file)
@@ -41,6 +41,8 @@
 
 namespace WTF {
 
+class URLQueryCharsetConverter;
+
 namespace URLUtilities {
 
 // Locates the scheme in the given string and places it into |foundScheme|,
@@ -69,9 +71,9 @@ bool isStandard(const UChar* spec, const URLComponent& scheme);
 // Returns true if a valid URL was produced, false if not. On failure, the
 // output and parsed structures will still be filled and will be consistent,
 // but they will not represent a loadable URL.
-bool canonicalize(const char* spec, int specLength, URLCanonicalizer::CharsetConverter*,
+bool canonicalize(const char* spec, int specLength, URLQueryCharsetConverter*,
                   URLBuffer<char>&, URLSegments& ouputParsed);
-bool canonicalize(const UChar* spec, int specLength, URLCanonicalizer::CharsetConverter*,
+bool canonicalize(const UChar* spec, int specLength, URLQueryCharsetConverter*,
                   URLBuffer<char>&, URLSegments& ouputParsed);
 
 // Resolves a potentially relative URL relative to the given parsed base URL.
@@ -86,11 +88,11 @@ bool canonicalize(const UChar* spec, int specLength, URLCanonicalizer::CharsetCo
 // a valid URL.
 bool resolveRelative(const char* baseSpec, const URLSegments& baseParsed,
                      const char* relative, int relativeLength,
-                     URLCanonicalizer::CharsetConverter*,
+                     URLQueryCharsetConverter*,
                      URLBuffer<char>&, URLSegments* ouputParsed);
 bool resolveRelative(const char* baseSpec, const URLSegments& baseParsed,
                      const UChar* relative, int relativeLength,
-                     URLCanonicalizer::CharsetConverter*,
+                     URLQueryCharsetConverter*,
                      URLBuffer<char>&, URLSegments* ouputParsed);
 
 // Replaces components in the given VALID input url. The new canonical URL info
@@ -99,11 +101,11 @@ bool resolveRelative(const char* baseSpec, const URLSegments& baseParsed,
 // Returns true if the resulting URL is valid.
 bool ReplaceComponents(const char* spec, int specLength, const URLSegments& parsed,
                        const URLCanonicalizer::Replacements<char>&,
-                       URLCanonicalizer::CharsetConverter*,
+                       URLQueryCharsetConverter*,
                        URLBuffer<char>&, URLSegments* outputParsed);
 bool ReplaceComponents(const char* spec, int specLength, const URLSegments& parsed,
                        const URLCanonicalizer::Replacements<UChar>&,
-                       URLCanonicalizer::CharsetConverter*,
+                       URLQueryCharsetConverter*,
                        URLBuffer<char>&, URLSegments* outputParsed);
 
 // String helper functions ----------------------------------------------------
index 1619a63..8f7ab64 100644 (file)
@@ -1,5 +1,21 @@
 2012-09-24  Benjamin Poulain  <benjamin@webkit.org>
 
+        Add support for query encoding to WTFURL
+        https://bugs.webkit.org/show_bug.cgi?id=97422
+
+        Reviewed by Adam Barth.
+
+        Add the Charset conversion on WebCore side.
+
+        * platform/KURLWTFURL.cpp:
+        (WebCore::KURL::KURL):
+        (CharsetConverter):
+        (WebCore::CharsetConverter::CharsetConverter):
+        * platform/mac/KURLMac.mm:
+        (WebCore::KURL::KURL):
+
+2012-09-24  Benjamin Poulain  <benjamin@webkit.org>
+
         Integrate most of GoogleURL in WTFURL
         https://bugs.webkit.org/show_bug.cgi?id=97405
 
index 7fb4d4a..9a3484d 100644 (file)
 #include "config.h"
 #include "KURL.h"
 
+#include <TextEncoding.h>
 #include <wtf/DataLog.h>
+#include <wtf/text/CString.h>
+#include <wtf/url/api/URLBuffer.h>
+#include <wtf/url/api/URLQueryCharsetConverter.h>
 
 #if USE(WTFURL)
 
@@ -51,34 +55,52 @@ static inline void detach(RefPtr<KURLWTFURLImpl>& urlImpl)
 KURL::KURL(ParsedURLStringTag, const String& urlString)
     : m_urlImpl(adoptRef(new KURLWTFURLImpl()))
 {
-    m_urlImpl->m_parsedURL = ParsedURL(urlString);
+    m_urlImpl->m_parsedURL = ParsedURL(urlString, ParsedURL::ParsedURLString);
 
     // FIXME: Frame::init() actually creates an empty URL; investigate why it does not just use a null URL.
     // ASSERT(m_urlImpl->m_parsedURL.isValid());
 }
 
+class CharsetConverter : public URLQueryCharsetConverter {
+public:
+    CharsetConverter(const TextEncoding& encoding)
+        : m_encoding(encoding)
+    {
+    }
+
+    virtual void convertFromUTF16(const UChar* input, unsigned inputLength, URLBuffer<char>& output) OVERRIDE
+    {
+        CString encoded = m_encoding.encode(input, inputLength, URLEncodedEntitiesForUnencodables);
+        output.append(encoded.data(), static_cast<int>(encoded.length()));
+    }
+
+private:
+    const TextEncoding& m_encoding;
+};
+
 KURL::KURL(const KURL& baseURL, const String& relative)
     : m_urlImpl(adoptRef(new KURLWTFURLImpl()))
 {
     // FIXME: the case with a null baseURL is common. We should have a separate constructor in KURL.
     // FIXME: the case of an empty Base is useless, we should get rid of empty URLs.
+    CharsetConverter charsetConverter(UTF8Encoding());
     if (baseURL.isEmpty())
-        m_urlImpl->m_parsedURL = ParsedURL(relative);
+        m_urlImpl->m_parsedURL = ParsedURL(relative, &charsetConverter);
     else
-        m_urlImpl->m_parsedURL = ParsedURL(baseURL.m_urlImpl->m_parsedURL, relative);
+        m_urlImpl->m_parsedURL = ParsedURL(baseURL.m_urlImpl->m_parsedURL, relative, &charsetConverter);
 
     if (!m_urlImpl->m_parsedURL.isValid())
         m_urlImpl->m_invalidUrlString = relative;
 }
 
-KURL::KURL(const KURL& baseURL, const String& relative, const TextEncoding&)
+KURL::KURL(const KURL& baseURL, const String& relative, const TextEncoding& encoding)
     : m_urlImpl(adoptRef(new KURLWTFURLImpl()))
 {
-    // FIXME: handle the encoding.
+    CharsetConverter charsetConverter(encoding.encodingForFormSubmission());
     if (baseURL.isEmpty())
-        m_urlImpl->m_parsedURL = ParsedURL(relative);
+        m_urlImpl->m_parsedURL = ParsedURL(relative, &charsetConverter);
     else
-        m_urlImpl->m_parsedURL = ParsedURL(baseURL.m_urlImpl->m_parsedURL, relative);
+        m_urlImpl->m_parsedURL = ParsedURL(baseURL.m_urlImpl->m_parsedURL, relative, &charsetConverter);
 
     if (!m_urlImpl->m_parsedURL.isValid())
         m_urlImpl->m_invalidUrlString = relative;
index 9efbee0..d5c25e1 100644 (file)
@@ -53,7 +53,7 @@ KURL::KURL(NSURL *url)
 #else
     m_urlImpl = adoptRef(new KURLWTFURLImpl());
     String urlString(bytes, bytesLength);
-    m_urlImpl->m_parsedURL = ParsedURL(urlString);
+    m_urlImpl->m_parsedURL = ParsedURL(urlString, 0);
     if (!m_urlImpl->m_parsedURL.isValid())
         m_urlImpl->m_invalidUrlString = urlString;
 #endif // USE(WTFURL)
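
The converter plumbing in this patch can be hard to follow from the hunks alone, so here is a minimal standalone sketch of the same shape: an abstract converter that the URL code calls through a pointer, one concrete encoder, and the ASCII fast path before any conversion. Everything in it is an assumption made for illustration and not the WTFURL code: QueryCharsetConverter stands in for URLQueryCharsetConverter, std::string stands in for URLBuffer<char>, Latin1Converter is a toy encoder in place of TextEncoding, and the UTF-8 fallback for a null converter is a guess at what a caller passing 0 would get. Surrogate pairs are not handled and no percent-escaping of the resulting bytes is done.

```cpp
#include <string>

// Hypothetical stand-in for WTF::URLQueryCharsetConverter. std::string is
// used where the real interface takes a URLBuffer<char>&.
class QueryCharsetConverter {
public:
    virtual ~QueryCharsetConverter() = default;
    virtual void convertFromUTF16(const char16_t* input, unsigned inputLength, std::string& output) = 0;
};

// Toy Latin-1 encoder (an assumption, not TextEncoding). Code units that do
// not fit in one byte are written as a URL-escaped numeric entity, i.e.
// "&#NNNN;" with '&', '#' and ';' escaped.
class Latin1Converter final : public QueryCharsetConverter {
public:
    void convertFromUTF16(const char16_t* input, unsigned inputLength, std::string& output) override
    {
        for (unsigned i = 0; i < inputLength; ++i) {
            unsigned codeUnit = input[i];
            if (codeUnit <= 0xFF)
                output.push_back(static_cast<char>(codeUnit));
            else
                output += "%26%23" + std::to_string(codeUnit) + "%3B";
        }
    }
};

inline bool isAllASCII(const std::u16string& text)
{
    for (char16_t c : text) {
        if (c > 0x7F)
            return false;
    }
    return true;
}

// Encodes a single UTF-16 code unit as UTF-8. Surrogate pairs are not
// combined, so this is only correct for characters in the BMP.
inline void appendUTF8(char16_t c, std::string& output)
{
    if (c < 0x80)
        output.push_back(static_cast<char>(c));
    else if (c < 0x800) {
        output.push_back(static_cast<char>(0xC0 | (c >> 6)));
        output.push_back(static_cast<char>(0x80 | (c & 0x3F)));
    } else {
        output.push_back(static_cast<char>(0xE0 | (c >> 12)));
        output.push_back(static_cast<char>(0x80 | ((c >> 6) & 0x3F)));
        output.push_back(static_cast<char>(0x80 | (c & 0x3F)));
    }
}

// ASCII input is copied as-is. Otherwise the converter is used when one is
// given; a null converter falls back to UTF-8 (assumed behaviour).
inline std::string convertQuery(const std::u16string& query, QueryCharsetConverter* converter)
{
    std::string output;
    if (isAllASCII(query)) {
        for (char16_t c : query)
            output.push_back(static_cast<char>(c));
    } else if (converter)
        converter->convertFromUTF16(query.data(), static_cast<unsigned>(query.length()), output);
    else {
        for (char16_t c : query)
            appendUTF8(c, output);
    }
    return output;
}
```

With a Latin1Converter instance, convertQuery(u"q=\u00e9", &converter) yields the single byte 0xE9 after "q=", while passing a null converter yields the two UTF-8 bytes 0xC3 0xA9. Keeping the interface abstract is what lets the lower layer stay unaware of the concrete encoder, which is the split the patch makes between WTF and WebCore.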