Elements whose contents start with an astral Unicode symbol disappear when CSS `...
author mmaxfield@apple.com <mmaxfield@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Wed, 13 Aug 2014 03:53:47 +0000 (03:53 +0000)
committer mmaxfield@apple.com <mmaxfield@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Wed, 13 Aug 2014 03:53:47 +0000 (03:53 +0000)
https://bugs.webkit.org/show_bug.cgi?id=135756

Reviewed by Darin Adler.

Source/WebCore:

The previous code assumed that every "character" is exactly one 16-bit code unit wide. Instead, use numCharactersInGraphemeClusters().

This patch also changes the signature of numCharactersInGraphemeClusters() to take a StringView instead
of a String, which avoids a copy.
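
To illustrate why a fixed `length++` breaks here: an astral symbol such as U+1D306 is stored in UTF-16 as a surrogate pair, so advancing by one code unit splits it. The following is a minimal sketch, not WebKit code. `firstCodePointLength` is a hypothetical helper written against `std::u16string_view`, and it only handles surrogate pairs; it does not handle grapheme clusters that span several code points (for example a base letter followed by combining marks), which is the more general case numCharactersInGraphemeClusters() is meant to cover.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of WebKit): returns how many UTF-16 code
// units the first code point of `text` occupies. The result is 0 for empty
// input, 2 for a well-formed surrogate pair, and 1 otherwise (including a
// lone, unpaired surrogate).
constexpr std::size_t firstCodePointLength(std::u16string_view text)
{
    if (text.empty())
        return 0;

    // Lead (high) surrogates occupy U+D800..U+DBFF,
    // trail (low) surrogates occupy U+DC00..U+DFFF.
    const bool isLead = text[0] >= 0xD800 && text[0] <= 0xDBFF;
    const bool hasTrail = text.size() > 1 && text[1] >= 0xDC00 && text[1] <= 0xDFFF;

    return (isLead && hasTrail) ? 2 : 1;
}
```

With this helper, `firstCodePointLength(u"\U0001D306Lorem")` yields 2, whereas the old logic would have advanced by 1 and cut the pair in half.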

Test: css1/pseudo/firstletter-surrogate.html

* platform/text/TextBreakIterator.cpp:
(WebCore::numCharactersInGraphemeClusters): Update numCharactersInGraphemeClusters() to take a StringView.
* platform/text/TextBreakIterator.h: Ditto.
* rendering/RenderBlock.cpp:
(WebCore::RenderBlock::createFirstLetterRenderer): Use numCharactersInGraphemeClusters() to determine the length
of the first letter, rather than assuming it has a length of 1 code unit.
(WebCore::RenderBlock::updateFirstLetter): Add a FIXME comment.

Source/WTF:

Add a contains() method to StringView that is implemented in terms of find().

* wtf/text/StringView.h:
(WTF::StringView::contains):
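
The "contains in terms of find" pattern can be sketched outside WTF as follows. This is an illustration only: it uses `std::u16string_view` as a stand-in for WTF::StringView, and the free function `contains` is a hypothetical name chosen for this example, with `npos` playing the role of WTF's `notFound`.

```cpp
#include <string_view>

// Hypothetical stand-in for the StringView::contains() added by this patch:
// a character is contained if searching for it does not report "not found".
constexpr bool contains(std::u16string_view text, char16_t c)
{
    return text.find(c) != std::u16string_view::npos;
}
```

Delegating to find() keeps the search logic in one place, so contains() cannot drift out of sync with it.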

LayoutTests:

Make sure the ::first-letter pseudo-element renders the same as manually wrapping a <span> around the character.

* css1/pseudo/firstletter-surrogate-expected.html: Added.
* css1/pseudo/firstletter-surrogate.html: Added.

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@172513 268f45cc-cd09-0410-ab3c-d52691b4dbfc

LayoutTests/ChangeLog
LayoutTests/css1/pseudo/firstletter-surrogate-expected.html [new file with mode: 0644]
LayoutTests/css1/pseudo/firstletter-surrogate.html [new file with mode: 0644]
Source/WTF/ChangeLog
Source/WTF/wtf/text/StringView.h
Source/WebCore/ChangeLog
Source/WebCore/platform/text/TextBreakIterator.cpp
Source/WebCore/platform/text/TextBreakIterator.h
Source/WebCore/rendering/RenderBlock.cpp

diff --git a/LayoutTests/ChangeLog b/LayoutTests/ChangeLog
index 42e3cdb..4e400b9 100644
@@ -1,3 +1,15 @@
+2014-08-11  Myles C. Maxfield  <mmaxfield@apple.com>
+
+        Elements whose contents start with an astral Unicode symbol disappear when CSS `::first-letter` is applied to them
+        https://bugs.webkit.org/show_bug.cgi?id=135756
+
+        Reviewed by Darin Adler.
+
+        Make sure the pseudoclass matches manually wrapping a <span> around the character.
+
+        * css1/pseudo/firstletter-surrogate-expected.html: Added.
+        * css1/pseudo/firstletter-surrogate.html: Added.
+
 2014-08-12  Commit Queue  <commit-queue@webkit.org>
 
         Unreviewed, rolling out r172494.
diff --git a/LayoutTests/css1/pseudo/firstletter-surrogate-expected.html b/LayoutTests/css1/pseudo/firstletter-surrogate-expected.html
new file mode 100644
index 0000000..22a9154
--- /dev/null
@@ -0,0 +1,3 @@
+This test makes sure that the ::first-letter pseudoclass works with first
+letters that require surrogates when written in UTF-16.
+<p><span style="background: red;">&#x1D306;</span>Lorem ipsum (1)</p>
diff --git a/LayoutTests/css1/pseudo/firstletter-surrogate.html b/LayoutTests/css1/pseudo/firstletter-surrogate.html
new file mode 100644
index 0000000..8b34233
--- /dev/null
@@ -0,0 +1,8 @@
+<style>
+p::first-letter {
+    background: red;
+}
+</style>
+This test makes sure that the ::first-letter pseudoclass works with first
+letters that require surrogates when written in UTF-16.
+<p>&#x1D306;Lorem ipsum (1)</p>
diff --git a/Source/WTF/ChangeLog b/Source/WTF/ChangeLog
index 79a7192..2f85372 100644
@@ -1,3 +1,15 @@
+2014-08-12  Myles C. Maxfield  <mmaxfield@apple.com>
+
+        Elements whose contents start with an astral Unicode symbol disappear when CSS `::first-letter` is applied to them
+        https://bugs.webkit.org/show_bug.cgi?id=135756
+
+        Reviewed by Darin Adler.
+
+        Add a method to StringView which passes through contains() to find().
+
+        * wtf/text/StringView.h:
+        (WTF::StringView::contains):
+
 2014-08-12  Pratik Solanki  <psolanki@apple.com>
 
         Enable didReceiveDataArray callback on Mac
diff --git a/Source/WTF/wtf/text/StringView.h b/Source/WTF/wtf/text/StringView.h
index 34f9822..9c98d08 100644
@@ -179,6 +179,8 @@ public:
         return WTF::find(characters16(), length(), character, start);
     }
 
+    bool contains(UChar c) const { return find(c) != notFound; }
+
 #if USE(CF)
     // This function converts null strings to empty strings.
     WTF_EXPORT_STRING_API RetainPtr<CFStringRef> createCFStringWithoutCopying() const;
diff --git a/Source/WebCore/ChangeLog b/Source/WebCore/ChangeLog
index 371b784..79f2235 100644
@@ -1,3 +1,25 @@
+2014-08-11  Myles C. Maxfield  <mmaxfield@apple.com>
+
+        Elements whose contents start with an astral Unicode symbol disappear when CSS `::first-letter` is applied to them
+        https://bugs.webkit.org/show_bug.cgi?id=135756
+
+        Reviewed by Darin Adler.
+
+        The previous code assumed that all "characters" are exactly 1 16-bit code unit wide. Instead, use numCharactersInGraphemeClusters().
+
+        This patch also modifies the signature of numCharactersInGraphemeClusters() to take a StringView instead
+        of a string, which will avoid a copy.
+
+        Test: css1/pseudo/firstletter-surrogate.html
+
+        * platform/text/TextBreakIterator.cpp:
+        (WebCore::numCharactersInGraphemeClusters): Update numCharactersInGraphemeClusters() to take a StringView.
+        * platform/text/TextBreakIterator.h: Ditto.
+        * rendering/RenderBlock.cpp:
+        (WebCore::RenderBlock::createFirstLetterRenderer): Use numCharactersInGraphemeClusters() to determine the length
+        of the first letter, rather than assuming it has length of 1 code unit
+        (WebCore::RenderBlock::updateFirstLetter): Add a FIXME comment.
+
 2014-08-12  Jer Noble  <jer.noble@apple.com>
 
         [MSE][Mac] Seeking to the very beginning of a buffered range stalls video playback
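diff --git a/Source/WebCore/platform/text/TextBreakIterator.cpp b/Source/WebCore/platform/text/TextBreakIterator.cpp
index 4786083..37af0d8 100644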
@@ -377,7 +377,7 @@ unsigned numGraphemeClusters(const String& s)
     return num;
 }
 
-unsigned numCharactersInGraphemeClusters(const String& s, unsigned numGraphemeClusters)
+unsigned numCharactersInGraphemeClusters(const StringView& s, unsigned numGraphemeClusters)
 {
     unsigned stringLength = s.length();
 
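diff --git a/Source/WebCore/platform/text/TextBreakIterator.h b/Source/WebCore/platform/text/TextBreakIterator.h
index 306f6e7..1179372 100644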
@@ -183,7 +183,7 @@ private:
 unsigned numGraphemeClusters(const String&);
 // Returns the number of characters which will be less than or equal to
 // the specified grapheme cluster length.
-unsigned numCharactersInGraphemeClusters(const String&, unsigned);
+unsigned numCharactersInGraphemeClusters(const StringView&, unsigned);
 
 }
 
diff --git a/Source/WebCore/rendering/RenderBlock.cpp b/Source/WebCore/rendering/RenderBlock.cpp
index 738def8..26a478d 100644
@@ -63,6 +63,7 @@
 #include "SVGTextRunRenderingContext.h"
 #include "Settings.h"
 #include "ShadowRoot.h"
+#include "TextBreakIterator.h"
 #include "TransformState.h"
 #include <wtf/NeverDestroyed.h>
 #include <wtf/StackStats.h>
@@ -3585,8 +3586,8 @@ void RenderBlock::createFirstLetterRenderer(RenderObject* firstLetterBlock, Rend
         while (length < oldText.length() && shouldSkipForFirstLetter(oldText[length]))
             length++;
 
-        // Account for first letter.
-        length++;
+        // Account for first grapheme cluster.
+        length += numCharactersInGraphemeClusters(StringView(oldText).substring(length), 1);
         
         // Keep looking for whitespace and allowed punctuation, but avoid
         // accumulating just whitespace into the :first-letter.
@@ -3686,6 +3687,8 @@ void RenderBlock::updateFirstLetter()
 {
     RenderObject* firstLetterObj;
     RenderElement* firstLetterContainer;
+    // FIXME: The first letter might be composed of a variety of code units, and therefore might
+    // be contained within multiple RenderElements.
     getFirstLetter(firstLetterObj, firstLetterContainer);
 
     if (!firstLetterObj)