Deduplicate shortish Text node strings during tree construction.
author akling@apple.com <akling@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Mon, 25 Nov 2013 21:53:32 +0000 (21:53 +0000)
committer akling@apple.com <akling@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Mon, 25 Nov 2013 21:53:32 +0000 (21:53 +0000)
commit e6b5b673bfc787b82d6ac0bd44b6287d6e252ba7
tree 02582263444c330e5c131e23fc9d92067f98bcd1
parent 76ac472b2dfa824451dbb36bffe60e5d4f1d80e0
Deduplicate shortish Text node strings during tree construction.
<https://webkit.org/b/124855>

Let HTMLConstructionSite keep a hash set of already-seen strings over
its lifetime. Use this to deduplicate the strings inside Text nodes
for any string up to 64 characters long.
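
As a rough illustration of the idea (not the actual WebKit code), the
sketch below uses only the standard library. The class name
TextStringCache, the method intern() and the use of std::shared_ptr are
assumptions made for this example; only the 64-character limit comes
from the description above.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a cache owned by the tree builder for its
// whole lifetime. Names and types are illustrative, not WebKit's API.
class TextStringCache {
public:
    // Strings longer than this are never cached (limit taken from the
    // commit message).
    static constexpr std::size_t maxLength = 64;

    // Returns a shared string with the same contents as `text`. Short
    // strings that were seen before share one allocation.
    std::shared_ptr<const std::string> intern(const std::string& text)
    {
        // Long strings bypass the cache and always get a fresh copy.
        if (text.size() > maxLength)
            return std::make_shared<const std::string>(text);

        auto it = m_seen.find(text);
        if (it != m_seen.end())
            return it->second; // Reuse the copy stored earlier.

        auto shared = std::make_shared<const std::string>(text);
        m_seen.emplace(text, shared);
        return shared;
    }

    // Number of distinct short strings seen so far.
    std::size_t size() const { return m_seen.size(); }

private:
    // A map keyed by content keeps the sketch short. It stores each
    // short string twice; a real hash set of shared strings would not.
    std::unordered_map<std::string, std::shared_ptr<const std::string>> m_seen;
};
```

In this sketch, two Text nodes created from the same short string would
end up holding the same underlying buffer, while anything longer than
64 characters is left alone so the cache does not grow with large,
mostly unique text runs.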

This optimization already sort-of existed for whitespace-only Texts,
but those are laundered through the AtomicString table, which we
definitely don't want to pollute with every single Text. It might be a
good idea to stop using the AtomicString table for all-whitespace Text too.

3.82 MB progression on HTML5-8266 locally.

Reviewed by Anders Carlsson.

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@159764 268f45cc-cd09-0410-ab3c-d52691b4dbfc
Source/WebCore/ChangeLog
Source/WebCore/html/parser/HTMLConstructionSite.cpp
Source/WebCore/html/parser/HTMLConstructionSite.h