<rdar://problem/13666412> Clean up some edge cases of URL parsing.
author ap@apple.com <ap@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Sat, 11 May 2013 05:51:04 +0000 (05:51 +0000)
committer ap@apple.com <ap@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Sat, 11 May 2013 05:51:04 +0000 (05:51 +0000)
commit c3dee101adbf2b74b67b1261679a6923011bd465
tree 1ce82cead399612f1c3a03194fd4b793750766c4
parent f631486cb197d27a251b8eae0427490dcd36d516
    <rdar://problem/13666412> Clean up some edge cases of URL parsing.
        https://bugs.webkit.org/show_bug.cgi?id=104919

        Reviewed by Darin Adler.

WebCore:
        * page/SecurityOrigin.cpp:
        (WebCore::schemeRequiresHost):
        (WebCore::shouldTreatAsUniqueOrigin):
        Updated function name and comments (host is not the same as authority). We still
        need this check - KURL can still produce http URLs with an empty host (even as this
        patch reduces the number of such cases). So can Gecko and the current draft of
        the URL Standard.
        It would be good to have a guarantee that such useless URLs cannot come out of
        the URL parser, as relying on downstream code re-parsing the URL correctly would
        be fragile.
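
        The check described above can be sketched roughly as follows. This is a
        simplified illustration, not the code in SecurityOrigin.cpp: ParsedUrl is a
        hypothetical stand-in for KURL, emptyHostMakesOriginUnique is a hypothetical
        helper name, and the list of schemes (http, https, ftp) is an assumption.

```cpp
#include <string>

// Hypothetical stand-in for a parsed URL; this is not WebKit's KURL.
struct ParsedUrl {
    std::string scheme;
    std::string host;
};

// Assumption: http, https and ftp are the schemes that are expected to
// carry a non-empty host.
bool schemeRequiresHost(const ParsedUrl& url)
{
    return url.scheme == "http" || url.scheme == "https" || url.scheme == "ftp";
}

// Hypothetical helper: true when the scheme needs a host but the parser
// still produced an empty one, so the origin should be treated as unique.
bool emptyHostMakesOriginUnique(const ParsedUrl& url)
{
    return schemeRequiresHost(url) && url.host.empty();
}
```

        In this sketch an http URL with an empty host is flagged, while a file URL
        with an empty host is not, since only the listed schemes require a host.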

        * platform/KURL.cpp:
        (WebCore::hostPortIsEmptyButCredentialsArePresent): Updated an argument name for
        correctness.
        (WebCore::KURL::parse):
        1. Reverted behavior changes from <http://trac.webkit.org/changeset/82181> - I could
        find no reason to allow "@" in hostnames, and having a URL like this re-parsed by
        a different parser would likely produce different results. It's better to just treat
        these edge case URLs as invalid.
        2. When hostname component is a lone colon, preserve it in parsed URL string,
        as otherwise path would get pushed in its place when re-parsing.
        3. When authority component is a lone colon, don't forget to preserve "//" after scheme, too.
        4. Added some assertions about contents of authority component, to catch potential
        mis-parsing earlier.
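
        As a rough illustration of items 1 and 2, the sketch below splits an authority
        string into userinfo and host-port and rejects a host that still contains "@".
        It is not the code in KURL::parse: AuthorityParts and splitAuthority are
        hypothetical names, and splitting at the first "@" is an assumption made for
        the example.

```cpp
#include <string>

// Hypothetical result type for this example only.
struct AuthorityParts {
    bool valid;
    std::string userinfo;
    std::string hostPort;
};

// Splits "userinfo@host:port" at the first "@" (an assumption of this sketch).
// The input is the text between "//" and the start of the path.
AuthorityParts splitAuthority(const std::string& authority)
{
    AuthorityParts parts{true, "", authority};
    const std::string::size_type at = authority.find('@');
    if (at != std::string::npos) {
        parts.userinfo = authority.substr(0, at);
        parts.hostPort = authority.substr(at + 1);
    }
    // Any remaining "@" would end up inside the hostname; treat the URL as
    // invalid instead of guessing which "@" was meant as the separator.
    if (parts.hostPort.find('@') != std::string::npos)
        parts.valid = false;
    return parts;
}
```

        With this sketch, "user:pw@example.com:80" yields userinfo "user:pw" and
        host-port "example.com:80", "a@b@c" is reported as invalid, and a lone ":" is
        passed through unchanged as the host-port rather than dropped.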

LayoutTests:
        * fast/dom/HTMLAnchorElement/script-tests/set-href-attribute-pathname.js:
        * fast/dom/HTMLAnchorElement/set-href-attribute-pathname-expected.txt:
        Updated expectations of one sub-test. We previously tried to keep the test passing
        as is (see bug 57291), but I couldn't find any reason to prefer the old behavior.

        * fast/url/host-expected.txt:
        * fast/url/host.html:
        Updated one subtest to new results, which match at least Gecko (the original
        version of the test actually claims that all browsers, including Safari, already
        do what we'll do now).

        * fast/url/segments-userinfo-vs-host-expected.txt: Added.
        * fast/url/segments-userinfo-vs-host.html: Added.
        Added a number of tests, with detailed explanations of the differences with Firefox,
        and with rationales.

        * http/tests/uri/username-with-no-hostname-expected.txt: Removed.
        * http/tests/uri/username-with-no-hostname.html-disabled: Removed.
        * platform/win/http/tests/uri/username-with-no-hostname-expected.txt: Removed.
        This test has been disabled for a long time, and being an end-to-end test for
        invalid URL handling, it would be difficult to make work again. We have multiple
        parsing tests for URLs like this.

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@149925 268f45cc-cd09-0410-ab3c-d52691b4dbfc
13 files changed:
LayoutTests/ChangeLog
LayoutTests/fast/dom/HTMLAnchorElement/script-tests/set-href-attribute-pathname.js
LayoutTests/fast/dom/HTMLAnchorElement/set-href-attribute-pathname-expected.txt
LayoutTests/fast/url/host-expected.txt
LayoutTests/fast/url/host.html
LayoutTests/fast/url/segments-userinfo-vs-host-expected.txt [new file with mode: 0644]
LayoutTests/fast/url/segments-userinfo-vs-host.html [new file with mode: 0644]
LayoutTests/http/tests/uri/username-with-no-hostname-expected.txt [deleted file]
LayoutTests/http/tests/uri/username-with-no-hostname.html-disabled [deleted file]
LayoutTests/platform/win/http/tests/uri/username-with-no-hostname-expected.txt [deleted file]
Source/WebCore/ChangeLog
Source/WebCore/page/SecurityOrigin.cpp
Source/WebCore/platform/KURL.cpp