<rdar://problem/5954398> REGRESSION: 1.1% PLT regression from 33577 and 33578 (encoding fixes)
WebCore:
Changed single argument KURL constructors back to always expect an already encoded string,
eliminating extra conversions.
This is a rather unstable situation, as it is often unclear whether a given string is safe
to convert to KURL without resolving. I think that going forward, the solution is to try to
keep encoded URLs as KURL instances, and not as strings.
* platform/KURL.h: Updated comments.
* platform/KURL.cpp:
(WebCore::KURL::KURL): In debug builds, verify that the passed string is ASCII-only. The
intention is to verify that it is already parsed and encoded by KURL or equivalent code, but
since encoding is scheme-dependent, such a verification would be quite complicated.
Don't encode the string as UTF-8, as it supposed to be ASCII-only.
Removed a hack that made strings beginning with "/" turn into "file:" URLs. I didn't find
any reason for it to exist, but I saw several cases where this code path was taken
inadvertently (see examples in LayoutTests/ChangeLog).
(WebCore::KURL::setProtocol): Using a user-provided string without validation or encoding
is clearly wrong here (e.g., the "protocol" can be set to a full URL, effectively replacing
the old one), and an already encoded string is expected by parse().
In debug builds, non-ASCII input will make an assertion in parse() fail. Added a FIXME.
(WebCore::KURL::setHost): Ditto.
(WebCore::KURL::setPort): Ditto.
(WebCore::KURL::setHostAndPort): Ditto.
(WebCore::KURL::setUser): Ditto.
(WebCore::KURL::setPass): Ditto.
(WebCore::KURL::setRef): Ditto.
(WebCore::KURL::setQuery): Ditto.
(WebCore::KURL::setPath): Ditto. Note that problems described in bug 23500 mask some of
the problems in release builds currently, as the incorrectly parsed URL is ignored.
(WebCore::KURL::parse): Verify that the passed string is already encoded.
* html/HTMLLinkElement.cpp:
(WebCore::HTMLLinkElement::parseMappedAttribute):
(WebCore::HTMLLinkElement::process):
* html/HTMLLinkElement.h:
Changed to avoid using invalid URLs (this was causing problems on DNS prefetch tests, see
LayoutTests/ChangeLog).
* loader/FrameLoader.cpp:
(WebCore::FrameLoader::init): Create an empty KURL without indirection for a small speedup.
(WebCore::FrameLoader::requestFrame): Resolve and encode javascript URLs properly, now that
String to KURL conversion requires the string to be already encoded.
* page/DOMWindow.cpp: (WebCore::DOMWindow::postMessage): Resolve and encode the origin.
HTML5 seems a little unclear on how this should work (it talks about "either parsing it as
a URL, or resolving it", and then somehow compares unaltered targetOrigin string to a
security origin object), so I just made the code as close to what we already had as possible.
LayoutTests:
* http/tests/misc/dns-prefetch-control-expected.txt:
* http/tests/misc/dns-prefetch-control.html:
Google documentation for DNS Prefetch makes use of net-path relative URLs (//server-name),
explaining that scheme is not necessary. This is of course true, but this test uses data:
subframes, and data: is a non-hierachical scheme, so resolving such URLs fails, resulting
in a KURL object that is not valid. WebKit used to ignore this, and tried to create a URL
from this string again, now with a single argument KURL constructor, which resulted in a
valid file: URL, which was successfully used! Both issues have been corrected in WebCore,
so I had to change the test to no longer use relative net-path URLs.
* http/tests/security/postMessage/invalid-origin-throws-exception-expected.txt:
* http/tests/security/postMessage/invalid-origin-throws-exception.html:
URLs that start with "/" are no longer converted to "file:" ones, so the results now
match Firefox.
* http/tests/uri/resolve-encoding-relative-expected.txt: Added.
* http/tests/uri/resolve-encoding-relative.html: Added.
Added a test to cover some cases of relative URL resolving that were not covered before.
Expected results are taken from Firefox 3, and WebKit doesn't match in how fragments are
encoded (we use document encoding, while Firefox uses UTF-8). Since fragments are not
sent in HTTP requests, this is not too dangerous, but the Firefox behavior looks more
consistent.
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@40221
268f45cc-cd09-0410-ab3c-
d52691b4dbfc
+2009-01-24 Alexey Proskuryakov <ap@webkit.org>
+
+ Reviewed by Darin Adler.
+
+ <rdar://problem/5954398> REGRESSION: 1.1% PLT regression from 33577 and 33578 (encoding fixes)
+
+ * http/tests/misc/dns-prefetch-control-expected.txt:
+ * http/tests/misc/dns-prefetch-control.html:
+ Google documentation for DNS Prefetch makes use of net-path relative URLs (//server-name),
+ explaining that scheme is not necessary. This is of course true, but this test uses data:
+ subframes, and data: is a non-hierachical scheme, so resolving such URLs fails, resulting
+ in a KURL object that is not valid. WebKit used to ignore this, and tried to create a URL
+ from this string again, now with a single argument KURL constructor, which resulted in a
+ valid file: URL, which was successfully used! Both issues have been corrected in WebCore,
+ so I had to change the test to no longer use relative net-path URLs.
+
+ * http/tests/security/postMessage/invalid-origin-throws-exception-expected.txt:
+ * http/tests/security/postMessage/invalid-origin-throws-exception.html:
+ URLs that start with "/" are no longer converted to "file:" ones, so the results now
+ match Firefox.
+
+ * http/tests/uri/resolve-encoding-relative-expected.txt: Added.
+ * http/tests/uri/resolve-encoding-relative.html: Added.
+ Added a test to cover some cases of relative URL resolving that were not covered before.
+ Expected results are taken from Firefox 3, and WebKit doesn't match in how fragments are
+ encoded (we use document encoding, while Firefox uses UTF-8). Since fragments are not
+ sent in HTTP requests, this is not too dangerous, but the Firefox behavior looks more
+ consistent.
+
2009-01-24 Eric Carlson <eric.carlson@apple.com>
<video> controls visibility test needs to dump render tree as text because
The following frames contain links that are expected to trigger a DNS prefetch.
-
+
The following frames contain links that are not expected to cause a DNS prefetch.
var contents = "";
if (dnsPrefetchControl)
contents += "<meta http-equiv='x-dns-prefetch-control' content='" + dnsPrefetchControl + "'>";
- contents += "<link rel='dns-prefetch' href='//" + host + "'>" + host;
+ contents += "<link rel='dns-prefetch' href='http://" + host + "'>" + host;
emitFrameWithContents(contents);
}
</script>
<script>emitFrameForScheme("http:")</script>
<script>emitFrameForScheme("https:")</script>
<script>emitFrameForScheme("ftp:")</script>
- <script>emitFrameForScheme("")</script>
<iframe src="resources/dns-prefetch-control.php"></iframe>
<iframe src="resources/dns-prefetch-control.php?value=on"></iframe>
<iframe src="https://127.0.0.1:8443/misc/resources/dns-prefetch-control.php?value=on"></iframe>
CONSOLE MESSAGE: line 0: Unable to post message to http://. Recipient has origin http://localhost:8000.
-CONSOLE MESSAGE: line 0: Unable to post message to localhost:8000.
-
-CONSOLE MESSAGE: line 0: Unable to post message to localhost:8000.
-
window.location.href = http://127.0.0.1:8000/security/postMessage/invalid-origin-throws-exception.html
waiting...
Encountered exception Error: SYNTAX_ERR: DOM Exception 12 while posting message to ''.
Encountered exception Error: SYNTAX_ERR: DOM Exception 12 while posting message to 'asdf'.
+Encountered exception Error: SYNTAX_ERR: DOM Exception 12 while posting message to '/tmp/foo'.
+Encountered exception Error: SYNTAX_ERR: DOM Exception 12 while posting message to '//localhost'.
Posted message to 'asdf:' without any exceptions.
Posted message to 'http:' without any exceptions.
-Posted message to '/tmp/foo' without any exceptions.
-Posted message to '//localhost' without any exceptions.
Received message: data="Received message: data="done" origin="http://127.0.0.1:8000"" origin="http://localhost:8000"
// Non-URLs should fail with a syntax error.
tryPostMessage("");
tryPostMessage("asdf");
+ tryPostMessage("/tmp/foo");
+ tryPostMessage("//localhost");
// URLs without an origin should fail without generating any errors.
tryPostMessage("asdf:");
tryPostMessage("http:");
-
- // WebKit converts URLs that start with / to file:// URLs. They don't match the target frame, so they fail silently.
- tryPostMessage("/tmp/foo");
- tryPostMessage("//localhost");
-
win.postMessage('done', '*');
}
--- /dev/null
+Test how non-ASCII characters are encoded in relative URLs.
+
+1 2 3 4 5
+1. PASS
+2. FAIL: http://127.0.0.1:8000/uri/resolve-encoding-relative.html#%F4%F0%E0%E3%EC%E5%ED%F2
+3. PASS
+4. PASS
+5. FAIL: http://127.0.0.1:8000/%D0%BF%D1%83%D1%82%D1%8C?%E7%E0%EF%F0%EE%F1#%F4%F0%E0%E3%EC%E5%ED%F2
+
--- /dev/null
+<meta charset=windows-1251>
+<body>
+<p>Test how non-ASCII characters are encoded in relative URLs.</p>
+<a href="?çàïðîñ">1</a>
+<a href="#ôðàãìåíò">2</a>
+<a href="ïóòü">3</a>
+<a href="/ïóòü">4</a>
+<a href="/ïóòü?çàïðîñ#ôðàãìåíò">5</a>
+<pre id=result></pre>
+<script>
+if (window.layoutTestController)
+ layoutTestController.dumpAsText();
+
+function log(message)
+{
+ document.getElementById("result").innerHTML += message + "\n";
+}
+
+// Everything except for query (search) part is encoded as UTF-8 in Firefox.
+log("1. " + (document.getElementsByTagName("a")[0].href.match(/%E7%E0%EF%F0%EE%F1$/) ? "PASS" : "FAIL: " + document.getElementsByTagName("a")[0].href));
+log("2. " + (document.getElementsByTagName("a")[1].href.match(/#%D1%84%D1%80%D0%B0%D0%B3%D0%BC%D0%B5%D0%BD%D1%82$/) ? "PASS" : "FAIL: " + document.getElementsByTagName("a")[1].href));
+log("3. " + (document.getElementsByTagName("a")[2].href.match(/%D0%BF%D1%83%D1%82%D1%8C$/) ? "PASS" : "FAIL: " + document.getElementsByTagName("a")[2].href));
+log("4. " + (document.getElementsByTagName("a")[3].href.match(/%D0%BF%D1%83%D1%82%D1%8C$/) ? "PASS" : "FAIL: " + document.getElementsByTagName("a")[3].href));
+log("5. " + (document.getElementsByTagName("a")[4].href.match(/%D0%BF%D1%83%D1%82%D1%8C\?%E7%E0%EF%F0%EE%F1#%D1%84%D1%80%D0%B0%D0%B3%D0%BC%D0%B5%D0%BD%D1%82$/) ? "PASS" : "FAIL: " + document.getElementsByTagName("a")[4].href));
+</script>
+</body>
+2009-01-24 Alexey Proskuryakov <ap@webkit.org>
+
+ Reviewed by Darin Adler.
+
+ <rdar://problem/5954398> REGRESSION: 1.1% PLT regression from 33577 and 33578 (encoding fixes)
+
+ Changed single argument KURL constructors back to always expect an already encoded string,
+ eliminating extra conversions.
+
+ This is a rather unstable situation, as it is often unclear whether a given string is safe
+ to convert to KURL without resolving. I think that going forward, the solution is to try to
+ keep encoded URLs as KURL instances, and not as strings.
+
+ * platform/KURL.h: Updated comments.
+
+ * platform/KURL.cpp:
+ (WebCore::KURL::KURL): In debug builds, verify that the passed string is ASCII-only. The
+ intention is to verify that it is already parsed and encoded by KURL or equivalent code, but
+ since encoding is scheme-dependent, such a verification would be quite complicated.
+ Don't encode the string as UTF-8, as it supposed to be ASCII-only.
+ Removed a hack that made strings beginning with "/" turn into "file:" URLs. I didn't find
+ any reason for it to exist, but I saw several cases where this code path was taken
+ inadvertently (see examples in LayoutTests/ChangeLog).
+ (WebCore::KURL::setProtocol): Using a user-provided string without validation or encoding
+ is clearly wrong here (e.g., the "protocol" can be set to a full URL, effectively replacing
+ the old one), and an already encoded string is expected by parse().
+ In debug builds, non-ASCII input will make an assertion in parse() fail. Added a FIXME.
+ (WebCore::KURL::setHost): Ditto.
+ (WebCore::KURL::setPort): Ditto.
+ (WebCore::KURL::setHostAndPort): Ditto.
+ (WebCore::KURL::setUser): Ditto.
+ (WebCore::KURL::setPass): Ditto.
+ (WebCore::KURL::setRef): Ditto.
+ (WebCore::KURL::setQuery): Ditto.
+ (WebCore::KURL::setPath): Ditto. Note that problems described in bug 23500 mask some of
+ the problems in release builds currently, as the incorrectly parsed URL is ignored.
+ (WebCore::KURL::parse): Verify that the passed string is already encoded.
+
+ * html/HTMLLinkElement.cpp:
+ (WebCore::HTMLLinkElement::parseMappedAttribute):
+ (WebCore::HTMLLinkElement::process):
+ * html/HTMLLinkElement.h:
+ Changed to avoid using invalid URLs (this was causing problems on DNS prefetch tests, see
+ LayoutTests/ChangeLog).
+
+ * loader/FrameLoader.cpp:
+ (WebCore::FrameLoader::init): Create an empty KURL without indirection for a small speedup.
+ (WebCore::FrameLoader::requestFrame): Resolve and encode javascript URLs properly, now that
+ String to KURL conversion requires the string to be already encoded.
+
+ * page/DOMWindow.cpp: (WebCore::DOMWindow::postMessage): Resolve and encode the origin.
+ HTML5 seems a little unclear on how this should work (it talks about "either parsing it as
+ a URL, or resolving it", and then somehow compares unaltered targetOrigin string to a
+ security origin object), so I just made the code as close to what we already had as possible.
+
2009-01-24 Darin Adler <darin@apple.com>
Try to fix Qt build.
tokenizeRelAttribute(attr->value(), m_isStyleSheet, m_alternate, m_isIcon, m_isDNSPrefetch);
process();
} else if (attr->name() == hrefAttr) {
- m_url = document()->completeURL(parseURL(attr->value())).string();
+ m_url = document()->completeURL(parseURL(attr->value()));
process();
} else if (attr->name() == typeAttr) {
m_type = attr->value();
// IE extension: location of small icon for locationbar / bookmarks
// We'll record this URL per document, even if we later only use it in top level frames
- if (m_isIcon && !m_url.isEmpty())
- document()->setIconURL(m_url, type);
+ if (m_isIcon && m_url.isValid() && !m_url.isEmpty())
+ document()->setIconURL(m_url.string(), type);
- if (m_isDNSPrefetch && !m_url.isEmpty())
- prefetchDNS(KURL(m_url).host());
+ if (m_isDNSPrefetch && m_url.isValid() && !m_url.isEmpty())
+ prefetchDNS(m_url.host());
// Stylesheet
// This was buggy and would incorrectly match <link rel="alternate">, which has a different specified meaning. -dwh
- if (m_disabledState != 2 && m_isStyleSheet && document()->frame()) {
+ if (m_disabledState != 2 && m_isStyleSheet && document()->frame() && m_url.isValid()) {
// no need to load style sheets which aren't for the screen output
// ### there may be in some situations e.g. for an editor or script to manipulate
// also, don't load style sheets for standalone documents
m_cachedSheet->removeClient(this);
}
m_loading = true;
- m_cachedSheet = document()->docLoader()->requestCSSStyleSheet(m_url, chset);
+ m_cachedSheet = document()->docLoader()->requestCSSStyleSheet(m_url.string(), chset);
if (m_cachedSheet)
m_cachedSheet->addClient(this);
else if (!isAlternate()) { // request may have been denied if stylesheet is local and document is remote.
protected:
CachedResourceHandle<CachedCSSStyleSheet> m_cachedSheet;
RefPtr<CSSStyleSheet> m_sheet;
- String m_url;
+ KURL m_url;
String m_type;
String m_media;
int m_disabledState; // 0=unset(default), 1=enabled via script, 2=disabled
// this somewhat odd set of steps is needed to give the frame an initial empty document
m_isDisplayingInitialEmptyDocument = false;
m_creatingInitialEmptyDocument = true;
- setPolicyDocumentLoader(m_client->createDocumentLoader(ResourceRequest(String("")), SubstituteData()).get());
+ setPolicyDocumentLoader(m_client->createDocumentLoader(ResourceRequest(KURL("")), SubstituteData()).get());
setProvisionalDocumentLoader(m_policyDocumentLoader.get());
setState(FrameStateProvisional);
m_provisionalDocumentLoader->setResponse(ResourceResponse(KURL(), "text/html", 0, String(), String()));
KURL scriptURL;
KURL url;
if (protocolIs(urlString, "javascript")) {
- scriptURL = KURL(urlString);
+ scriptURL = completeURL(urlString); // completeURL() encodes the URL.
url = blankURL();
} else
url = completeURL(urlString);
// to generate the SYNTAX_ERR exception correctly.
RefPtr<SecurityOrigin> target;
if (targetOrigin != "*") {
- target = SecurityOrigin::create(KURL(targetOrigin));
+ target = SecurityOrigin::create(KURL(KURL(), targetOrigin, UTF8Encoding()));
if (target->isEmpty()) {
ec = SYNTAX_ERR;
return;
return -1;
}
+#ifndef NDEBUG
+static void checkEncodedString(const String& url)
+{
+ for (unsigned i = 0; i < url.length(); ++i)
+ ASSERT(!(url[i] & ~0x7F));
+
+ // FIXME: The first character should be checked with isSchemeFirstChar(), but some layout tests currently trigger this assertion.
+ ASSERT(!url.length() || url[0] != '/');
+}
+#else
+static inline void checkEncodedString(const String&)
+{
+}
+#endif
+
inline bool KURL::protocolIs(const String& string, const char* protocol)
{
return WebCore::protocolIs(string, protocol);
KURL::KURL(const char* url)
{
- if (!url || url[0] != '/') {
- parse(url, 0);
- return;
- }
-
- size_t urlLength = strlen(url) + 1;
- CharBuffer buffer(urlLength + 5); // 5 for "file:".
- buffer[0] = 'f';
- buffer[1] = 'i';
- buffer[2] = 'l';
- buffer[3] = 'e';
- buffer[4] = ':';
- memcpy(&buffer[5], url, urlLength);
- parse(buffer.data(), 0);
+ parse(url, 0);
}
KURL::KURL(const String& url)
{
- if (url[0] != '/') {
- parse(url.utf8().data(), &url);
- return;
- }
+ checkEncodedString(url);
- CharBuffer buffer(url.length() + 6); // 5 for "file:", 1 for terminator.
- buffer[0] = 'f';
- buffer[1] = 'i';
- buffer[2] = 'l';
- buffer[3] = 'e';
- buffer[4] = ':';
- copyASCII(url.characters(), url.length(), &buffer[5]);
- buffer[url.length() + 5] = '\0'; // Need null terminator.
-
- parse(buffer.data(), 0);
+ parse(url);
}
KURL::KURL(const KURL& base, const String& relative)
void KURL::setProtocol(const String& s)
{
+ // FIXME: Non-ASCII characters must be encoded and escaped to match parse() expectations,
+ // and to avoid changing more than just the protocol.
+
if (!m_isValid) {
parse(s + ":" + m_string);
return;
if (!m_isValid)
return;
+ // FIXME: Non-ASCII characters must be encoded and escaped to match parse() expectations,
+ // and to avoid changing more than just the host.
+
bool slashSlashNeeded = m_userStart == m_schemeEnd + 1;
parse(m_string.left(hostStart()) + (slashSlashNeeded ? "//" : "") + s + m_string.substring(m_hostEnd));
if (!m_isValid)
return;
+ // FIXME: Non-ASCII characters must be encoded and escaped to match parse() expectations,
+ // and to avoid changing more than just the port.
+
bool colonNeeded = m_portEnd == m_hostEnd;
int portStart = (colonNeeded ? m_hostEnd : m_hostEnd + 1);
if (!m_isValid)
return;
+ // FIXME: Non-ASCII characters must be encoded and escaped to match parse() expectations,
+ // and to avoid changing more than just host and port.
+
bool slashSlashNeeded = m_userStart == m_schemeEnd + 1;
parse(m_string.left(hostStart()) + (slashSlashNeeded ? "//" : "") + hostAndPort + m_string.substring(m_portEnd));
if (!m_isValid)
return;
+ // FIXME: Non-ASCII characters must be encoded and escaped to match parse() expectations,
+ // and to avoid changing more than just the user login.
String u;
int end = m_userEnd;
if (!user.isEmpty()) {
if (!m_isValid)
return;
+ // FIXME: Non-ASCII characters must be encoded and escaped to match parse() expectations,
+ // and to avoid changing more than just the user password.
String p;
int end = m_passwordEnd;
if (!password.isEmpty()) {
{
if (!m_isValid)
return;
+
+ // FIXME: Non-ASCII characters must be encoded and escaped to match parse() expectations.
parse(m_string.left(m_queryEnd) + (s.isNull() ? "" : "#" + s));
}
if (!m_isValid)
return;
+ // FIXME: '#' and non-ASCII characters must be encoded and escaped.
+ // Usually, the query is encoded using document encoding, not UTF-8, but we don't have
+ // access to the document in this function.
if ((query.isEmpty() || query[0] != '?') && !query.isNull())
parse(m_string.left(m_pathEnd) + "?" + query + m_string.substring(m_queryEnd));
else
if (!m_isValid)
return;
+ // FIXME: encodeWithURLEscapeSequences does not correctly escape '#' and '?', so fragment and query parts
+ // may be inadvertently affected.
parse(m_string.left(m_portEnd) + encodeWithURLEscapeSequences(s) + m_string.substring(m_pathEnd));
}
void KURL::parse(const String& string)
{
- // FIXME: What should this do for non-ASCII URLs?
- // Currently it throws away the high bytes of the characters in the string in that case, matching createCFURL().
+ checkEncodedString(string);
+
CharBuffer buffer(string.length() + 1);
copyASCII(string.characters(), string.length(), buffer.data());
buffer[string.length()] = '\0';
KURL() { invalidate(); }
// The argument is an absolute URL string. The string is assumed to be
- // already encoded.
- // FIXME: This constructor has a special case for strings that start with
- // "/", prepending "file://" to such strings; it would be good to get
- // rid of that special case.
+ // an already encoded (ASCII-only) valid absolute URL.
explicit KURL(const char*);
-
- // The argument is an absolute URL string. The string is assumed to be
- // already encoded.
- // FIXME: This constructor has a special case for strings that start with
- // "/", prepending "file://" to such strings; it would be good to get
- // rid of that special case.
explicit KURL(const String&);
// Resolves the relative URL with the given base URL. If provided, the