Add support for RegExp "dotAll" flag
author msaboff@apple.com <msaboff@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Thu, 24 Aug 2017 21:14:43 +0000 (21:14 +0000)
committer msaboff@apple.com <msaboff@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Thu, 24 Aug 2017 21:14:43 +0000 (21:14 +0000)
commit 5c50eb3c5725f67ba416d6fbbcf130d3e58ff9a5
tree aade5d8be58f1a5b715b4ca8e58ce380cd101ba8
parent bb0d52a375c1a548dc90cd20602bcf8f6761fddb
Add support for RegExp "dotAll" flag
https://bugs.webkit.org/show_bug.cgi?id=175924

Reviewed by Keith Miller.

JSTests:

Updated tests for new dotAll ('s' flag) changes.

* es6/Proxy_internal_get_calls_RegExp.prototype.flags.js:
* stress/static-getter-in-names.js:

Source/JavaScriptCore:

The dotAll RegExp flag, 's', changes . to match any character including line terminators.
Added the "dotAll" identifier as well as the RegExp.prototype.dotAll getter.
Added a new any character CharacterClass that is used to match . terms in a dotAll flagged
RegExp.  In the YARR pattern and parsing code, changed the NewlineClassID, which was only
used for '.' processing, to DotClassID.  The selection of which builtin character class
that DotClassID resolves to when generating the pattern is conditional on the dotAll flag.
This NewlineClassID to DotClassID refactoring includes the atomBuiltInCharacterClass() in
the WebCore content extensions code in the PatternParser class.
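
For illustration, a minimal sketch of the script-visible behavior described above, assuming
an engine that implements the 's' flag.  The variable names are made up for this example and
do not come from the tests in this change.

```javascript
// '.' normally stops at line terminators; with the 's' (dotAll) flag it does not.
const input = "line one\nline two";

// Without 's', the '.' between "one" and "line" cannot match "\n".
const withoutDotAll = /one.line/.test(input);

// With 's', the same pattern matches across the newline.
const withDotAll = /one.line/s.test(input);

// The new getter and the flags string both reflect the flag.
const re = /one.line/s;
const getterValue = re.dotAll;
const flagsValue = re.flags;

// Other line terminators (e.g. U+2028 LINE SEPARATOR) behave the same way.
const lineSeparatorPlain = /^.$/.test("\u2028");
const lineSeparatorDotAll = /^.$/s.test("\u2028");

console.log(withoutDotAll, withDotAll, getterValue, flagsValue);
console.log(lineSeparatorPlain, lineSeparatorDotAll);
```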

As an optimization, the Yarr JIT doesn't actually perform match checks against the builtin
any character CharacterClass; it merely reads the character.  There is another optimization
in our DotStar enclosure processing, where a non-capturing regular expression of the form
.*<expression>.*, with an optional leading ^ and/or trailing $, matches the contained
expression and then looks for the extents of the surrounding .*'s.  When used with the
dotAll flag, that processing always results in the beginning of the string and the end
of the string.  Therefore we short circuit finding the beginning and end of the line
or string for dotAll patterns.
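
The short circuit can be sketched in script as follows.  This is only an illustration of
the idea, not the YARR code: dotStarExtents and isLineTerminator are hypothetical helpers,
and the sketch ignores the ^ and $ anchors and the multiline flag that the real processing
also handles.

```javascript
// Hypothetical helper: the line terminators that a plain '.' refuses to match.
function isLineTerminator(ch) {
  return ch === "\n" || ch === "\r" || ch === "\u2028" || ch === "\u2029";
}

// Hypothetical helper: given where the inner expression matched, compute the
// extents covered by the surrounding .* terms.
function dotStarExtents(input, innerStart, innerEnd, dotAll) {
  if (dotAll) {
    // With dotAll, .* runs through line terminators, so the extents are
    // always the whole string and no scanning is needed.
    return { start: 0, end: input.length };
  }
  // Otherwise scan outward to the enclosing line terminators.
  let start = innerStart;
  while (start > 0 && !isLineTerminator(input[start - 1])) start--;
  let end = innerEnd;
  while (end < input.length && !isLineTerminator(input[end])) end++;
  return { start, end };
}

const subject = "ab\nxfooy\ncd";
const inner = subject.indexOf("foo");
const plainExtents = dotStarExtents(subject, inner, inner + 3, false);
const dotAllExtents = dotStarExtents(subject, inner, inner + 3, true);

// Cross-check against the engine's own matching of .*foo.* with and without 's'.
const plainMatch = /.*foo.*/.exec(subject);
const dotAllMatch = /.*foo.*/s.exec(subject);
console.log(plainExtents, plainMatch.index, plainMatch[0].length);
console.log(dotAllExtents, dotAllMatch.index, dotAllMatch[0].length);
```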

* bytecode/BytecodeDumper.cpp:
(JSC::regexpToSourceString):
* runtime/CommonIdentifiers.h:
* runtime/RegExp.cpp:
(JSC::regExpFlags):
(JSC::RegExpFunctionalTestCollector::outputOneTest):
* runtime/RegExp.h:
* runtime/RegExpKey.h:
* runtime/RegExpPrototype.cpp:
(JSC::RegExpPrototype::finishCreation):
(JSC::flagsString):
(JSC::regExpProtoGetterDotAll):
* yarr/YarrInterpreter.cpp:
(JSC::Yarr::Interpreter::matchDotStarEnclosure):
* yarr/YarrInterpreter.h:
(JSC::Yarr::BytecodePattern::dotAll const):
* yarr/YarrJIT.cpp:
(JSC::Yarr::YarrGenerator::optimizeAlternative):
(JSC::Yarr::YarrGenerator::generateCharacterClassOnce):
(JSC::Yarr::YarrGenerator::generateCharacterClassFixed):
(JSC::Yarr::YarrGenerator::generateCharacterClassGreedy):
(JSC::Yarr::YarrGenerator::backtrackCharacterClassNonGreedy):
(JSC::Yarr::YarrGenerator::generateDotStarEnclosure):
* yarr/YarrParser.h:
(JSC::Yarr::Parser::parseTokens):
* yarr/YarrPattern.cpp:
(JSC::Yarr::YarrPatternConstructor::atomBuiltInCharacterClass):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassBuiltIn):
(JSC::Yarr::YarrPatternConstructor::optimizeDotStarWrappedExpressions):
(JSC::Yarr::YarrPattern::YarrPattern):
(JSC::Yarr::PatternTerm::dump):
(JSC::Yarr::anycharCreate):
* yarr/YarrPattern.h:
(JSC::Yarr::YarrPattern::reset):
(JSC::Yarr::YarrPattern::anyCharacterClass):
(JSC::Yarr::YarrPattern::dotAll const):

Source/WebCore:

Changed due to refactoring NewlineClassID to DotClassID.

No new tests. No change in behavior.

* contentextensions/URLFilterParser.cpp:
(WebCore::ContentExtensions::PatternParser::atomBuiltInCharacterClass):

LayoutTests:

* js/regexp-dotall-expected.txt: Added.
* js/regexp-dotall.html: Added.
* js/script-tests/Object-getOwnPropertyNames.js:
* js/script-tests/regexp-dotall.js: Added.
New tests.

* js/Object-getOwnPropertyNames-expected.txt:
Updated tests for new dotAll ('s' flag) changes.

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@221160 268f45cc-cd09-0410-ab3c-d52691b4dbfc
24 files changed:
JSTests/ChangeLog
JSTests/es6/Proxy_internal_get_calls_RegExp.prototype.flags.js
JSTests/stress/static-getter-in-names.js
LayoutTests/ChangeLog
LayoutTests/js/Object-getOwnPropertyNames-expected.txt
LayoutTests/js/regexp-dotall-expected.txt [new file with mode: 0644]
LayoutTests/js/regexp-dotall.html [new file with mode: 0644]
LayoutTests/js/script-tests/Object-getOwnPropertyNames.js
LayoutTests/js/script-tests/regexp-dotall.js [new file with mode: 0644]
Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/bytecode/BytecodeDumper.cpp
Source/JavaScriptCore/runtime/CommonIdentifiers.h
Source/JavaScriptCore/runtime/RegExp.cpp
Source/JavaScriptCore/runtime/RegExp.h
Source/JavaScriptCore/runtime/RegExpKey.h
Source/JavaScriptCore/runtime/RegExpPrototype.cpp
Source/JavaScriptCore/yarr/YarrInterpreter.cpp
Source/JavaScriptCore/yarr/YarrInterpreter.h
Source/JavaScriptCore/yarr/YarrJIT.cpp
Source/JavaScriptCore/yarr/YarrParser.h
Source/JavaScriptCore/yarr/YarrPattern.cpp
Source/JavaScriptCore/yarr/YarrPattern.h
Source/WebCore/ChangeLog
Source/WebCore/contentextensions/URLFilterParser.cpp