FTL B3 should be able to flag the tag constants as being super important so that...
authorfpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Tue, 8 Dec 2015 02:46:22 +0000 (02:46 +0000)
committerfpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Tue, 8 Dec 2015 02:46:22 +0000 (02:46 +0000)
https://bugs.webkit.org/show_bug.cgi?id=151955

Reviewed by Geoffrey Garen.

Source/JavaScriptCore:

Taught B3 about the concept of "fast constants". A client of B3 can now tell B3 which
constants are super important. B3 will not spill the constant in that case and will ensure
that the constant is materialized only once: statically once, and dynamically once per
procedure execution. The hoistFastConstants() algorithm in B3MoveConstants.cpp achieves this
by first picking the lowest common dominator of all uses of each fast constant, and then
picking the materialization point by finding the lowest dominator of that dominator that is
tied for lowest block frequency. In practice, the second step ensures that this is the lowest
point in the program that is not in a loop (i.e. executes no more than once dynamically per
procedure invocation).

Taught Air about the concept of "fast tmps". B3 tells Air that a tmp is fast if it is used to
hold the materialization of a fast constant. IRC will use the lowest possible spill score for
fast tmps. In practice, this ensures that fast constants are never spilled.

Added a small snippet of code to FTL::LowerDFGToLLVM that makes both of the tag constants
into fast constants.

My hope is that this very brute-force heuristic is good enough that we don't have to think
about constants for a while. Based on my experience with how LLVM's constant hoisting works
out, the heuristic in this patch is going to be tough to beat. LLVM's constant hoisting does
good things when it hoists the tags, and usually causes nothing but problems when it hoists
anything else. This is because there is no way a low-level compiler to really understand how
a constant materialization impacts some operation's contribution to the overall execution
time of a procedure. But, in the FTL we know that constant materializations for type checks
are a bummer because we are super comfortable placing type checks on the hottest of paths. So
those are the last paths where extra instructions should be added by the compiler. On the
other hand, all other large constant uses are on relatively cold paths, or paths that are
already expensive for other reasons. For example, global variable accesses have to
materialize a pointer to the global. But that's not really a big deal, since a load from a
global involves first the load itself and then type checks on the result - so probably the
constant materialization is just not interesting. A store to a global often involves a store
barrier, so the constant materialization is really not interesting. This patch codifies this
heuristic in a pact between Air, B3, and the FTL: FTL demands that B3 pin the two tags in
registers, and B3 relays the demand to Air.

* JavaScriptCore.xcodeproj/project.pbxproj:
* b3/B3CFG.h: Added.
(JSC::B3::CFG::CFG):
(JSC::B3::CFG::root):
(JSC::B3::CFG::newMap):
(JSC::B3::CFG::successors):
(JSC::B3::CFG::predecessors):
(JSC::B3::CFG::index):
(JSC::B3::CFG::node):
(JSC::B3::CFG::numNodes):
(JSC::B3::CFG::dump):
* b3/B3Dominators.h: Added.
(JSC::B3::Dominators::Dominators):
* b3/B3IndexMap.h:
(JSC::B3::IndexMap::resize):
(JSC::B3::IndexMap::size):
(JSC::B3::IndexMap::operator[]):
* b3/B3LowerMacros.cpp:
* b3/B3LowerToAir.cpp:
(JSC::B3::Air::LowerToAir::tmp):
* b3/B3MoveConstants.cpp:
* b3/B3Opcode.h:
(JSC::B3::constPtrOpcode):
(JSC::B3::isConstant):
* b3/B3Procedure.cpp:
(JSC::B3::Procedure::Procedure):
(JSC::B3::Procedure::resetReachability):
(JSC::B3::Procedure::invalidateCFG):
(JSC::B3::Procedure::dump):
(JSC::B3::Procedure::deleteValue):
(JSC::B3::Procedure::dominators):
(JSC::B3::Procedure::addFastConstant):
(JSC::B3::Procedure::isFastConstant):
(JSC::B3::Procedure::addDataSection):
* b3/B3Procedure.h:
(JSC::B3::Procedure::size):
(JSC::B3::Procedure::cfg):
(JSC::B3::Procedure::setLastPhaseName):
* b3/B3ReduceStrength.cpp:
* b3/B3ValueInlines.h:
(JSC::B3::Value::isConstant):
(JSC::B3::Value::isInteger):
* b3/B3ValueKey.h:
(JSC::B3::ValueKey::canMaterialize):
(JSC::B3::ValueKey::isConstant):
* b3/air/AirCode.cpp:
(JSC::B3::Air::Code::findNextBlock):
(JSC::B3::Air::Code::addFastTmp):
* b3/air/AirCode.h:
(JSC::B3::Air::Code::specials):
(JSC::B3::Air::Code::isFastTmp):
(JSC::B3::Air::Code::setLastPhaseName):
* b3/air/AirIteratedRegisterCoalescing.cpp:
* dfg/DFGDominators.h:
* dfg/DFGSSACalculator.cpp:
* ftl/FTLLowerDFGToLLVM.cpp:
(JSC::FTL::DFG::LowerDFGToLLVM::lower):

Source/WTF:

Remove some remaining DFG-specific snippets from Dominators. This used to be a non-template
DFG class, and some time ago I hoisted it into WTF and made it generic. But since the only
user of the class was the DFG, this class still had a handful of DFG-specific snippets that
didn't compile when I started using it from B3.

Also renamed immediateDominatorOf() to idom(). This is the sort of abbreviation that we
wouldn't ordinarily want to have in WebKit. But WebKit does allow for abbreviations that are
"more canonical". The term "idom" is definitely more canonical than "immediateDominatorOf".

* wtf/Dominators.h:
(WTF::Dominators::dominates):
(WTF::Dominators::idom):
(WTF::Dominators::forAllStrictDominatorsOf):
(WTF::Dominators::NaiveDominators::dominates):
(WTF::Dominators::NaiveDominators::dump):
(WTF::Dominators::ValidationContext::handleErrors):

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@193682 268f45cc-cd09-0410-ab3c-d52691b4dbfc

23 files changed:
Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj
Source/JavaScriptCore/b3/B3CFG.h [new file with mode: 0644]
Source/JavaScriptCore/b3/B3Dominators.h [new file with mode: 0644]
Source/JavaScriptCore/b3/B3IndexMap.h
Source/JavaScriptCore/b3/B3LowerMacros.cpp
Source/JavaScriptCore/b3/B3LowerToAir.cpp
Source/JavaScriptCore/b3/B3MoveConstants.cpp
Source/JavaScriptCore/b3/B3Opcode.h
Source/JavaScriptCore/b3/B3Procedure.cpp
Source/JavaScriptCore/b3/B3Procedure.h
Source/JavaScriptCore/b3/B3ReduceStrength.cpp
Source/JavaScriptCore/b3/B3ValueInlines.h
Source/JavaScriptCore/b3/B3ValueKey.h
Source/JavaScriptCore/b3/air/AirCode.cpp
Source/JavaScriptCore/b3/air/AirCode.h
Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp
Source/JavaScriptCore/dfg/DFGCFG.h
Source/JavaScriptCore/dfg/DFGDominators.h
Source/JavaScriptCore/dfg/DFGSSACalculator.cpp
Source/JavaScriptCore/ftl/FTLLowerDFGToLLVM.cpp
Source/WTF/ChangeLog
Source/WTF/wtf/Dominators.h

index b7218c4..3769aa4 100644 (file)
@@ -1,3 +1,103 @@
+2015-12-07  Filip Pizlo  <fpizlo@apple.com>
+
+        FTL B3 should be able to flag the tag constants as being super important so that B3 can hoist them and Air can force them into registers
+        https://bugs.webkit.org/show_bug.cgi?id=151955
+
+        Reviewed by Geoffrey Garen.
+
+        Taught B3 about the concept of "fast constants". A client of B3 can now tell B3 which
+        constants are super important. B3 will not spill the constant in that case and will ensure
+        that the constant is materialized only once: statically once, and dynamically once per
+        procedure execution. The hoistFastConstants() algorithm in B3MoveConstants.cpp achieves this
+        by first picking the lowest common dominator of all uses of each fast constant, and then
+        picking the materialization point by finding the lowest dominator of that dominator that is
+        tied for lowest block frequency. In practice, the second step ensures that this is the lowest
+        point in the program that is not in a loop (i.e. executes no more than once dynamically per
+        procedure invocation).
+
+        Taught Air about the concept of "fast tmps". B3 tells Air that a tmp is fast if it is used to
+        hold the materialization of a fast constant. IRC will use the lowest possible spill score for
+        fast tmps. In practice, this ensures that fast constants are never spilled.
+
+        Added a small snippet of code to FTL::LowerDFGToLLVM that makes both of the tag constants
+        into fast constants.
+
+        My hope is that this very brute-force heuristic is good enough that we don't have to think
+        about constants for a while. Based on my experience with how LLVM's constant hoisting works
+        out, the heuristic in this patch is going to be tough to beat. LLVM's constant hoisting does
+        good things when it hoists the tags, and usually causes nothing but problems when it hoists
+        anything else. This is because there is no way a low-level compiler to really understand how
+        a constant materialization impacts some operation's contribution to the overall execution
+        time of a procedure. But, in the FTL we know that constant materializations for type checks
+        are a bummer because we are super comfortable placing type checks on the hottest of paths. So
+        those are the last paths where extra instructions should be added by the compiler. On the
+        other hand, all other large constant uses are on relatively cold paths, or paths that are
+        already expensive for other reasons. For example, global variable accesses have to
+        materialize a pointer to the global. But that's not really a big deal, since a load from a
+        global involves first the load itself and then type checks on the result - so probably the
+        constant materialization is just not interesting. A store to a global often involves a store
+        barrier, so the constant materialization is really not interesting. This patch codifies this
+        heuristic in a pact between Air, B3, and the FTL: FTL demands that B3 pin the two tags in
+        registers, and B3 relays the demand to Air.
+
+        * JavaScriptCore.xcodeproj/project.pbxproj:
+        * b3/B3CFG.h: Added.
+        (JSC::B3::CFG::CFG):
+        (JSC::B3::CFG::root):
+        (JSC::B3::CFG::newMap):
+        (JSC::B3::CFG::successors):
+        (JSC::B3::CFG::predecessors):
+        (JSC::B3::CFG::index):
+        (JSC::B3::CFG::node):
+        (JSC::B3::CFG::numNodes):
+        (JSC::B3::CFG::dump):
+        * b3/B3Dominators.h: Added.
+        (JSC::B3::Dominators::Dominators):
+        * b3/B3IndexMap.h:
+        (JSC::B3::IndexMap::resize):
+        (JSC::B3::IndexMap::size):
+        (JSC::B3::IndexMap::operator[]):
+        * b3/B3LowerMacros.cpp:
+        * b3/B3LowerToAir.cpp:
+        (JSC::B3::Air::LowerToAir::tmp):
+        * b3/B3MoveConstants.cpp:
+        * b3/B3Opcode.h:
+        (JSC::B3::constPtrOpcode):
+        (JSC::B3::isConstant):
+        * b3/B3Procedure.cpp:
+        (JSC::B3::Procedure::Procedure):
+        (JSC::B3::Procedure::resetReachability):
+        (JSC::B3::Procedure::invalidateCFG):
+        (JSC::B3::Procedure::dump):
+        (JSC::B3::Procedure::deleteValue):
+        (JSC::B3::Procedure::dominators):
+        (JSC::B3::Procedure::addFastConstant):
+        (JSC::B3::Procedure::isFastConstant):
+        (JSC::B3::Procedure::addDataSection):
+        * b3/B3Procedure.h:
+        (JSC::B3::Procedure::size):
+        (JSC::B3::Procedure::cfg):
+        (JSC::B3::Procedure::setLastPhaseName):
+        * b3/B3ReduceStrength.cpp:
+        * b3/B3ValueInlines.h:
+        (JSC::B3::Value::isConstant):
+        (JSC::B3::Value::isInteger):
+        * b3/B3ValueKey.h:
+        (JSC::B3::ValueKey::canMaterialize):
+        (JSC::B3::ValueKey::isConstant):
+        * b3/air/AirCode.cpp:
+        (JSC::B3::Air::Code::findNextBlock):
+        (JSC::B3::Air::Code::addFastTmp):
+        * b3/air/AirCode.h:
+        (JSC::B3::Air::Code::specials):
+        (JSC::B3::Air::Code::isFastTmp):
+        (JSC::B3::Air::Code::setLastPhaseName):
+        * b3/air/AirIteratedRegisterCoalescing.cpp:
+        * dfg/DFGDominators.h:
+        * dfg/DFGSSACalculator.cpp:
+        * ftl/FTLLowerDFGToLLVM.cpp:
+        (JSC::FTL::DFG::LowerDFGToLLVM::lower):
+
 2015-12-07  Andy VanWagoner  <thetalecrafter@gmail.com>
 
         [INTL] Implement String.prototype.toLocaleUpperCase in ECMA-402
index a7ee808..f353bb0 100644 (file)
                0F338E1E1BF286EA0013C88F /* B3LowerMacros.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F338E1A1BF286EA0013C88F /* B3LowerMacros.h */; };
                0F33FCF71C136E2500323F67 /* B3StackmapGenerationParams.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F33FCF51C136E2500323F67 /* B3StackmapGenerationParams.cpp */; };
                0F33FCF81C136E2500323F67 /* B3StackmapGenerationParams.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F33FCF61C136E2500323F67 /* B3StackmapGenerationParams.h */; };
+               0F33FCFB1C1625BE00323F67 /* B3CFG.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F33FCF91C1625BE00323F67 /* B3CFG.h */; };
+               0F33FCFC1C1625BE00323F67 /* B3Dominators.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F33FCFA1C1625BE00323F67 /* B3Dominators.h */; };
                0F34B14916D42010001CDA5A /* DFGUseKind.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F34B14716D4200E001CDA5A /* DFGUseKind.cpp */; };
                0F34B14A16D42013001CDA5A /* DFGUseKind.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F34B14816D4200E001CDA5A /* DFGUseKind.h */; };
                0F37308C1C0BD29100052BFA /* B3PhiChildren.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F37308A1C0BD29100052BFA /* B3PhiChildren.cpp */; };
                0F338E1A1BF286EA0013C88F /* B3LowerMacros.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = B3LowerMacros.h; path = b3/B3LowerMacros.h; sourceTree = "<group>"; };
                0F33FCF51C136E2500323F67 /* B3StackmapGenerationParams.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = B3StackmapGenerationParams.cpp; path = b3/B3StackmapGenerationParams.cpp; sourceTree = "<group>"; };
                0F33FCF61C136E2500323F67 /* B3StackmapGenerationParams.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = B3StackmapGenerationParams.h; path = b3/B3StackmapGenerationParams.h; sourceTree = "<group>"; };
+               0F33FCF91C1625BE00323F67 /* B3CFG.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = B3CFG.h; path = b3/B3CFG.h; sourceTree = "<group>"; };
+               0F33FCFA1C1625BE00323F67 /* B3Dominators.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = B3Dominators.h; path = b3/B3Dominators.h; sourceTree = "<group>"; };
                0F34B14716D4200E001CDA5A /* DFGUseKind.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = DFGUseKind.cpp; path = dfg/DFGUseKind.cpp; sourceTree = "<group>"; };
                0F34B14816D4200E001CDA5A /* DFGUseKind.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = DFGUseKind.h; path = dfg/DFGUseKind.h; sourceTree = "<group>"; };
                0F37308A1C0BD29100052BFA /* B3PhiChildren.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = B3PhiChildren.cpp; path = b3/B3PhiChildren.cpp; sourceTree = "<group>"; };
                                0FEC84BA1BDACDAC0080FF74 /* B3BlockWorklist.h */,
                                0F338DF71BE96AA80013C88F /* B3CCallValue.cpp */,
                                0F338DF81BE96AA80013C88F /* B3CCallValue.h */,
+                               0F33FCF91C1625BE00323F67 /* B3CFG.h */,
                                0FEC84BB1BDACDAC0080FF74 /* B3CheckSpecial.cpp */,
                                0FEC84BC1BDACDAC0080FF74 /* B3CheckSpecial.h */,
                                0FEC84BD1BDACDAC0080FF74 /* B3CheckValue.cpp */,
                                0FEC84CA1BDACDAC0080FF74 /* B3ControlValue.h */,
                                0F338E011BF0276C0013C88F /* B3DataSection.cpp */,
                                0F338E021BF0276C0013C88F /* B3DataSection.h */,
+                               0F33FCFA1C1625BE00323F67 /* B3Dominators.h */,
                                0FEC85C41BE16F5A0080FF74 /* B3Effects.cpp */,
                                0FEC85BE1BE167A00080FF74 /* B3Effects.h */,
                                0FEC84CB1BDACDAC0080FF74 /* B3FrequencyClass.cpp */,
                                9928FF3C18AC4AEC00B8CF12 /* JSReplayInputs.h in Headers */,
                                BC18C4260E16F5CD00B34460 /* JSRetainPtr.h in Headers */,
                                14874AE615EBDE4A002E3587 /* JSScope.h in Headers */,
+                               0F33FCFB1C1625BE00323F67 /* B3CFG.h in Headers */,
                                A7C0C4AC168103020017011D /* JSScriptRefPrivate.h in Headers */,
                                FE1220271BE7F58C0039E6F2 /* JITAddGenerator.h in Headers */,
                                0F919D11157F332C004A4E7D /* JSSegmentedVariableObject.h in Headers */,
                                705B41AE1A6E501E00716757 /* SymbolConstructor.h in Headers */,
                                996B73271BDA08EF00331B84 /* SymbolConstructor.lut.h in Headers */,
                                705B41B01A6E501E00716757 /* SymbolObject.h in Headers */,
+                               0F33FCFC1C1625BE00323F67 /* B3Dominators.h in Headers */,
                                705B41B21A6E501E00716757 /* SymbolPrototype.h in Headers */,
                                996B73281BDA08EF00331B84 /* SymbolPrototype.lut.h in Headers */,
                                BC18C46B0E16F5CD00B34460 /* SymbolTable.h in Headers */,
diff --git a/Source/JavaScriptCore/b3/B3CFG.h b/Source/JavaScriptCore/b3/B3CFG.h
new file mode 100644 (file)
index 0000000..dd9dc09
--- /dev/null
@@ -0,0 +1,80 @@
+/*
+ * Copyright (C) 2015 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+#ifndef B3CFG_h
+#define B3CFG_h
+
+#if ENABLE(B3_JIT)
+
+#include "B3BasicBlock.h"
+#include "B3IndexMap.h"
+#include "B3IndexSet.h"
+#include "B3Procedure.h"
+
+namespace JSC { namespace B3 {
+
+class CFG {
+    WTF_MAKE_NONCOPYABLE(CFG);
+    WTF_MAKE_FAST_ALLOCATED;
+public:
+    typedef BasicBlock* Node;
+    typedef IndexSet<BasicBlock> Set;
+    template<typename T> using Map = IndexMap<BasicBlock, T>;
+    typedef Vector<BasicBlock*, 4> List;
+
+    CFG(Procedure& proc)
+        : m_proc(proc)
+    {
+    }
+
+    Node root() { return m_proc[0]; }
+
+    template<typename T>
+    Map<T> newMap() { return IndexMap<BasicBlock, T>(m_proc.size()); }
+
+    SuccessorCollection<BasicBlock, BasicBlock::SuccessorList> successors(Node node) { return node->successorBlocks(); }
+    BasicBlock::PredecessorList& predecessors(Node node) { return node->predecessors(); }
+
+    unsigned index(Node node) const { return node->index(); }
+    Node node(unsigned index) const { return m_proc[index]; }
+    unsigned numNodes() const { return m_proc.size(); }
+
+    PointerDump<BasicBlock> dump(Node node) const { return pointerDump(node); }
+
+    void dump(PrintStream& out) const
+    {
+        m_proc.dump(out);
+    }
+
+private:
+    Procedure& m_proc;
+};
+
+} } // namespace JSC::B3
+
+#endif // ENABLE(B3_JIT)
+
+#endif // B3CFG_h
+
diff --git a/Source/JavaScriptCore/b3/B3Dominators.h b/Source/JavaScriptCore/b3/B3Dominators.h
new file mode 100644 (file)
index 0000000..c93a26e
--- /dev/null
@@ -0,0 +1,55 @@
+/*
+ * Copyright (C) 2015 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+#ifndef B3Dominators_h
+#define B3Dominators_h
+
+#if ENABLE(B3_JIT)
+
+#include "B3CFG.h"
+#include "B3Common.h"
+#include "B3Procedure.h"
+#include <wtf/Dominators.h>
+#include <wtf/FastMalloc.h>
+#include <wtf/Noncopyable.h>
+
+namespace JSC { namespace B3 {
+
+class Dominators : public WTF::Dominators<CFG> {
+    WTF_MAKE_NONCOPYABLE(Dominators);
+    WTF_MAKE_FAST_ALLOCATED;
+public:
+    Dominators(Procedure& proc)
+        : WTF::Dominators<CFG>(proc.cfg(), shouldValidateIR())
+    {
+    }
+};
+
+} } // namespace JSC::B3
+
+#endif // ENABLE(B3_JIT)
+
+#endif // B3Dominators_h
+
index 22fc858..01b49f8 100644 (file)
@@ -49,6 +49,18 @@ public:
         m_vector.fill(Value(), size);
     }
 
+    size_t size() const { return m_vector.size(); }
+
+    Value& operator[](size_t index)
+    {
+        return m_vector[index];
+    }
+
+    const Value& operator[](size_t index) const
+    {
+        return m_vector[index];
+    }
+    
     Value& operator[](Key* key)
     {
         return m_vector[key->index()];
index c907bba..b3dc61b 100644 (file)
@@ -58,8 +58,10 @@ public:
             processCurrentBlock();
         }
         m_changed |= m_blockInsertionSet.execute();
-        if (m_changed)
+        if (m_changed) {
             m_proc.resetReachability();
+            m_proc.invalidateCFG();
+        }
         return m_changed;
     }
     
index 0fabe49..895fbd3 100644 (file)
@@ -302,8 +302,11 @@ private:
             while (shouldCopyPropagate(value))
                 value = value->child(0);
             Tmp& realTmp = m_valueToTmp[value];
-            if (!realTmp)
+            if (!realTmp) {
                 realTmp = m_code.newTmp(Arg::typeForB3Type(value->type()));
+                if (m_procedure.isFastConstant(value->key()))
+                    m_code.addFastTmp(realTmp);
+            }
             tmp = realTmp;
         }
         return tmp;
index 133853c..81f70e7 100644 (file)
 #if ENABLE(B3_JIT)
 
 #include "B3BasicBlockInlines.h"
+#include "B3Dominators.h"
 #include "B3InsertionSetInlines.h"
 #include "B3MemoryValue.h"
 #include "B3PhaseScope.h"
 #include "B3ProcedureInlines.h"
 #include "B3ValueInlines.h"
 #include "B3ValueKeyInlines.h"
+#include <wtf/HashMap.h>
+#include <wtf/Vector.h>
 
 namespace JSC { namespace B3 {
 
@@ -50,27 +53,137 @@ public:
 
     void run()
     {
-        // Eventually this phase will do smart things. For now, it uses a super simple heuristic: it
-        // places constants in the block that uses them, and makes sure that each block has only one
-        // materialization for each constant. Note that this mostly only matters for large constants, since
-        // small constants get fused into the instructions that use them. But it might matter for small
-        // constants if they are used in instructions that don't do immediates, like conditional moves.
-
-        // FIXME: Implement a better story for constants. At a minimum this should allow the B3
-        // client to specify important constants that always get hoisted. Also, the table used to
-        // hold double constants should have a pointer to it that is hoisted. If we wanted to be more
-        // aggressive, we could make constant materialization be a feature of Air: we could label
-        // some Tmps as being unmaterialized constants and have a late Air phase - post register
-        // allocation - that creates materializations of those constant Tmps by scavenging leftover
-        // registers.
+        // This phase uses super simple heuristics to ensure that constants are in profitable places
+        // and also lowers constant materialization code. Constants marked "fast" by the client are
+        // hoisted to the lowest common dominator. A table is created for constants that need to be
+        // loaded to be materialized, and all of their Values are turned into loads from that table.
+        // Non-fast constants get materialized in the block that uses them. Constants that are
+        // materialized by loading get special treatment when they get used in some kind of Any in a
+        // StackmapValue. In that case, the Constants are sunk to the point of use, since that allows
+        // the instruction selector to sink the constants into an Arg::imm64.
 
+        // FIXME: Implement a better story for constants. For example, the table used to hold double
+        // constants should have a pointer to it that is hoisted. If we wanted to be more aggressive,
+        // we could make constant materialization be a feature of Air: we could label some Tmps as
+        // being unmaterialized constants and have a late Air phase - post register allocation - that
+        // creates materializations of those constant Tmps by scavenging leftover registers.
+
+        hoistFastConstants();
+        sinkSlowConstants();
+    }
+
+private:
+    void hoistFastConstants()
+    {
+        Dominators& dominators = m_proc.dominators();
+        HashMap<ValueKey, Value*> valueForConstant;
+        IndexMap<BasicBlock, Vector<Value*>> materializations(m_proc.size());
+
+        // We determine where things get materialized based on where they are used.
+        for (BasicBlock* block : m_proc) {
+            for (Value* value : *block) {
+                for (Value*& child : value->children()) {
+                    ValueKey key = child->key();
+                    if (!m_proc.isFastConstant(key))
+                        continue;
+
+                    auto result = valueForConstant.add(key, child);
+                    if (result.isNewEntry) {
+                        // Assume that this block is where we want to materialize the value.
+                        child->owner = block;
+                        continue;
+                    }
+
+                    // Make 'value' use the canonical constant rather than the one it was using.
+                    child = result.iterator->value;
+
+                    // Determine the least common dominator. That's the lowest place in the CFG where
+                    // we could materialize the constant while still having only one materialization
+                    // in the resulting code.
+                    while (!dominators.dominates(child->owner, block))
+                        child->owner = dominators.idom(child->owner);
+                }
+            }
+        }
+
+        // Make sure that each basic block knows what to materialize. This also refines the
+        // materialization block based on execution frequency. It finds the minimum block frequency
+        // of all of its dominators, and selects the closest block amongst those that are tied for
+        // lowest frequency.
+        for (auto& entry : valueForConstant) {
+            Value* value = entry.value;
+            for (BasicBlock* block = value->owner; block; block = dominators.idom(block)) {
+                if (block->frequency() < value->owner->frequency())
+                    value->owner = block;
+            }
+            materializations[entry.value->owner].append(entry.value);
+        }
+
+        // Get rid of Value's that are fast constants but aren't canonical. Also remove the canonical
+        // ones from the CFG, since we're going to reinsert them elsewhere.
+        for (BasicBlock* block : m_proc) {
+            for (Value*& value : *block) {
+                ValueKey key = value->key();
+                if (!m_proc.isFastConstant(key))
+                    continue;
+
+                if (valueForConstant.get(key) == value)
+                    value = m_proc.add<Value>(Nop, value->origin());
+                else
+                    value->replaceWithNop();
+            }
+        }
+
+        // Now make sure that we move constants to where they are supposed to go. Again, we do this
+        // based on uses.
+        for (BasicBlock* block : m_proc) {
+            for (unsigned valueIndex = 0; valueIndex < block->size(); ++valueIndex) {
+                Value* value = block->at(valueIndex);
+                for (Value* child : value->children()) {
+                    ValueKey key = child->key();
+                    if (!m_proc.isFastConstant(key))
+                        continue;
+
+                    // If we encounter a fast constant, then it must be canonical, since we already
+                    // got rid of the non-canonical ones.
+                    ASSERT(valueForConstant.get(key) == child);
+
+                    if (child->owner != block) {
+                        // This constant isn't our problem. It's going to be materialized in another
+                        // block.
+                        continue;
+                    }
+                    
+                    // We're supposed to materialize this constant in this block, and we haven't
+                    // done it yet.
+                    m_insertionSet.insertValue(valueIndex, child);
+                    child->owner = nullptr;
+                }
+            }
+
+            // We may have some constants that need to be materialized right at the end of this
+            // block.
+            for (Value* value : materializations[block]) {
+                if (!value->owner) {
+                    // It's already materialized in this block.
+                    continue;
+                }
+
+                m_insertionSet.insertValue(block->size() - 1, value);
+            }
+            m_insertionSet.execute(block);
+        }
+    }
+    
+    void sinkSlowConstants()
+    {
         // First we need to figure out which constants go into the data section. These are non-zero
         // double constants.
         for (Value* value : m_proc.values()) {
-            if (!needsMotion(value))
+            ValueKey key = value->key();
+            if (!needsMotion(key))
                 continue;
             m_toRemove.append(value);
-            ValueKey key = value->key();
             if (goesInTable(key))
                 m_constTable.add(key, m_constTable.size());
         }
@@ -87,10 +200,10 @@ public:
                 StackmapValue* stackmap = value->as<StackmapValue>();
                 for (unsigned childIndex = 0; childIndex < value->numChildren(); ++childIndex) {
                     Value*& child = value->child(childIndex);
-                    if (!needsMotion(child))
+                    ValueKey key = child->key();
+                    if (!needsMotion(key))
                         continue;
 
-                    ValueKey key = child->key();
                     if (stackmap
                         && goesInTable(key)
                         && stackmap->constrainedChild(childIndex).rep().isAny()) {
@@ -115,8 +228,7 @@ public:
         for (Value* toRemove : m_toRemove)
             toRemove->replaceWithNop();
     }
-
-private:
+    
     Value* materialize(unsigned valueIndex, const ValueKey& key, const Origin& origin)
     {
         if (Value* result = m_constants.get(key))
@@ -147,9 +259,9 @@ private:
         return key.opcode() == ConstDouble && key != doubleZero();
     }
 
-    bool needsMotion(const Value* value)
+    bool needsMotion(const ValueKey& key)
     {
-        return value->isConstant();
+        return key.isConstant() && !m_proc.isFastConstant(key);
     }
 
     static ValueKey doubleZero()
index f0f3ab8..fe37db3 100644 (file)
@@ -231,6 +231,18 @@ inline Opcode constPtrOpcode()
     return Const32;
 }
 
+inline bool isConstant(Opcode opcode)
+{
+    switch (opcode) {
+    case Const32:
+    case Const64:
+    case ConstDouble:
+        return true;
+    default:
+        return false;
+    }
+}
+
 } } // namespace JSC::B3
 
 namespace WTF {
index 0f8ebb4..9c7ad5f 100644 (file)
 #include "B3BasicBlockInlines.h"
 #include "B3BasicBlockUtils.h"
 #include "B3BlockWorklist.h"
+#include "B3CFG.h"
 #include "B3DataSection.h"
+#include "B3Dominators.h"
 #include "B3OpaqueByproducts.h"
 #include "B3ValueInlines.h"
 
 namespace JSC { namespace B3 {
 
 Procedure::Procedure()
-    : m_lastPhaseName("initial")
+    : m_cfg(new CFG(*this))
+    , m_lastPhaseName("initial")
     , m_byproducts(std::make_unique<OpaqueByproducts>())
     , m_code(new Air::Code(*this))
 {
@@ -113,6 +116,11 @@ void Procedure::resetReachability()
         });
 }
 
+void Procedure::invalidateCFG()
+{
+    m_dominators = nullptr;
+}
+
 void Procedure::dump(PrintStream& out) const
 {
     for (BasicBlock* block : *this)
@@ -138,6 +146,26 @@ void Procedure::deleteValue(Value* value)
     m_values[value->index()] = nullptr;
 }
 
+Dominators& Procedure::dominators()
+{
+    if (!m_dominators)
+        m_dominators = std::make_unique<Dominators>(*this);
+    return *m_dominators;
+}
+
+void Procedure::addFastConstant(const ValueKey& constant)
+{
+    RELEASE_ASSERT(constant.isConstant());
+    m_fastConstants.add(constant);
+}
+
+bool Procedure::isFastConstant(const ValueKey& constant)
+{
+    if (!constant)
+        return false;
+    return m_fastConstants.contains(constant);
+}
+
 void* Procedure::addDataSection(size_t size)
 {
     if (!size)
index 44f075a..4f3ff84 100644 (file)
 #include "B3OpaqueByproducts.h"
 #include "B3Origin.h"
 #include "B3Type.h"
+#include "B3ValueKey.h"
 #include "PureNaN.h"
 #include "RegisterAtOffsetList.h"
 #include <wtf/Bag.h>
 #include <wtf/FastMalloc.h>
+#include <wtf/HashSet.h>
 #include <wtf/Noncopyable.h>
 #include <wtf/PrintStream.h>
 #include <wtf/TriState.h>
@@ -44,6 +46,8 @@ namespace JSC { namespace B3 {
 
 class BasicBlock;
 class BlockInsertionSet;
+class CFG;
+class Dominators;
 class Value;
 
 namespace Air { class Code; }
@@ -70,6 +74,10 @@ public:
     void resetValueOwners();
     void resetReachability();
 
+    // This destroys CFG analyses. If we ask for them again, we will recompute them. Usually you
+    // should call this anytime you call resetReachability().
+    void invalidateCFG();
+
     JS_EXPORT_PRIVATE void dump(PrintStream&) const;
 
     unsigned size() const { return m_blocks.size(); }
@@ -200,6 +208,13 @@ public:
 
     void deleteValue(Value*);
 
+    CFG& cfg() const { return *m_cfg; }
+
+    Dominators& dominators();
+
+    void addFastConstant(const ValueKey&);
+    bool isFastConstant(const ValueKey&);
+
     // The name has to be a string literal, since we don't do any memory management for the string.
     void setLastPhaseName(const char* name)
     {
@@ -240,6 +255,9 @@ private:
     Vector<std::unique_ptr<BasicBlock>> m_blocks;
     Vector<std::unique_ptr<Value>> m_values;
     Vector<size_t> m_valueIndexFreeList;
+    std::unique_ptr<CFG> m_cfg;
+    std::unique_ptr<Dominators> m_dominators;
+    HashSet<ValueKey> m_fastConstants;
     const char* m_lastPhaseName;
     std::unique_ptr<OpaqueByproducts> m_byproducts;
     std::unique_ptr<Air::Code> m_code;
index 8c2764f..2fc410f 100644 (file)
@@ -117,6 +117,7 @@ public:
 
             if (m_changedCFG) {
                 m_proc.resetReachability();
+                m_proc.invalidateCFG();
                 m_changed = true;
             }
 
index 34414f2..b49e28c 100644 (file)
@@ -54,14 +54,7 @@ const T* Value::as() const
 
 inline bool Value::isConstant() const
 {
-    switch (opcode()) {
-    case Const32:
-    case Const64:
-    case ConstDouble:
-        return true;
-    default:
-        return false;
-    }
+    return B3::isConstant(opcode());
 }
 
 inline bool Value::isInteger() const
index 0aa1606..a3a4f01 100644 (file)
@@ -120,6 +120,11 @@ public:
         }
     }
 
+    bool isConstant() const
+    {
+        return B3::isConstant(opcode());
+    }
+
     // Attempts to materialize the Value for this ValueKey. May return nullptr if the value cannot
     // be materialized. This happens for CheckAdd and friends. You can use canMaterialize() to check
     // if your key is materializable.
index 3dc3930..3dc1ebd 100644 (file)
@@ -134,6 +134,11 @@ BasicBlock* Code::findNextBlock(BasicBlock* block) const
     return nullptr;
 }
 
+void Code::addFastTmp(Tmp tmp)
+{
+    m_fastTmps.add(tmp);
+}
+
 } } } // namespace JSC::B3::Air
 
 #endif // ENABLE(B3_JIT)
index f626125..4a90181 100644 (file)
@@ -31,6 +31,7 @@
 #include "AirBasicBlock.h"
 #include "AirSpecial.h"
 #include "AirStackSlot.h"
+#include "AirTmp.h"
 #include "RegisterAtOffsetList.h"
 #include "StackAlignment.h"
 
@@ -289,6 +290,9 @@ public:
     };
 
     SpecialsCollection specials() const { return SpecialsCollection(*this); }
+
+    void addFastTmp(Tmp);
+    bool isFastTmp(Tmp tmp) const { return m_fastTmps.contains(tmp); }
     
     // The name has to be a string literal, since we don't do any memory management for the string.
     void setLastPhaseName(const char* name)
@@ -307,6 +311,7 @@ private:
     Vector<std::unique_ptr<StackSlot>> m_stackSlots;
     Vector<std::unique_ptr<BasicBlock>> m_blocks;
     Vector<std::unique_ptr<Special>> m_specials;
+    HashSet<Tmp> m_fastTmps;
     CCallSpecial* m_cCallSpecial { nullptr };
     unsigned m_numGPTmps { 0 };
     unsigned m_numFPTmps { 0 };
index e51a563..502c89d 100644 (file)
@@ -563,6 +563,11 @@ private:
         // Higher score means more desirable to spill. Lower scores maximize the likelihood that a tmp
         // gets a register.
         auto score = [&] (Tmp tmp) -> double {
+            // Air exposes the concept of "fast tmps", and we interpret that to mean that the tmp
+            // should always be in a register.
+            if (m_code.isFastTmp(tmp))
+                return 0;
+            
             // All else being equal, the score should be directly related to the degree.
             double degree = static_cast<double>(m_degrees[AbsoluteTmpMapper<type>::absoluteIndex(tmp)]);
 
index 83706c4..1805b29 100644 (file)
@@ -36,6 +36,8 @@
 namespace JSC { namespace DFG {
 
 class CFG {
+    WTF_MAKE_NONCOPYABLE(CFG);
+    WTF_MAKE_FAST_ALLOCATED;
 public:
     typedef BasicBlock* Node;
     typedef BlockSet Set;
index dd842cd..475588d 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2011, 2014 Apple Inc. All rights reserved.
+ * Copyright (C) 2011, 2014, 2015 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
index cff246f..899fa15 100644 (file)
@@ -95,12 +95,12 @@ SSACalculator::Def* SSACalculator::newDef(Variable* variable, BasicBlock* block,
 
 SSACalculator::Def* SSACalculator::nonLocalReachingDef(BasicBlock* block, Variable* variable)
 {
-    return reachingDefAtTail(m_graph.m_dominators->immediateDominatorOf(block), variable);
+    return reachingDefAtTail(m_graph.m_dominators->idom(block), variable);
 }
 
 SSACalculator::Def* SSACalculator::reachingDefAtTail(BasicBlock* block, Variable* variable)
 {
-    for (; block; block = m_graph.m_dominators->immediateDominatorOf(block)) {
+    for (; block; block = m_graph.m_dominators->idom(block)) {
         if (Def* def = m_data[block].m_defs.get(variable))
             return def;
     }
index 1a438b0..fdfc743 100644 (file)
@@ -303,6 +303,13 @@ public:
 #endif
         m_tagTypeNumber = m_out.constInt64(TagTypeNumber);
         m_tagMask = m_out.constInt64(TagMask);
+
+#if FTL_USES_B3
+        // Make sure that B3 knows that we really care about the mask registers. This forces the
+        // constants to be materialized in registers.
+        m_proc.addFastConstant(m_tagTypeNumber->key());
+        m_proc.addFastConstant(m_tagMask->key());
+#endif // FTL_USES_B3
         
         m_out.storePtr(m_out.constIntPtr(codeBlock()), addressFor(JSStack::CodeBlock));
         
index baad0f2..fdade64 100644 (file)
@@ -1,3 +1,27 @@
+2015-12-07  Filip Pizlo  <fpizlo@apple.com>
+
+        FTL B3 should be able to flag the tag constants as being super important so that B3 can hoist them and Air can force them into registers
+        https://bugs.webkit.org/show_bug.cgi?id=151955
+
+        Reviewed by Geoffrey Garen.
+
+        Remove some remaining DFG-specific snippets from Dominators. This used to be a non-template
+        DFG class, and some time ago I hoisted it into WTF and made it generic. But since the only
+        user of the class was the DFG, this class still had a handful of DFG-specific snippets that
+        didn't compile when I started using it from B3.
+
+        Also renamed immediateDominatorOf() to idom(). This is the sort of abbreviation that we
+        wouldn't ordinarily want to have in WebKit. But WebKit does allow for abbreviations that are
+        "more canonical". The term "idom" is definitely more canonical than "immediateDominatorOf".
+
+        * wtf/Dominators.h:
+        (WTF::Dominators::dominates):
+        (WTF::Dominators::idom):
+        (WTF::Dominators::forAllStrictDominatorsOf):
+        (WTF::Dominators::NaiveDominators::dominates):
+        (WTF::Dominators::NaiveDominators::dump):
+        (WTF::Dominators::ValidationContext::handleErrors):
+
 2015-12-03  Anders Carlsson  <andersca@apple.com>
 
         Remove Objective-C GC support
index 82db530..e613d40 100644 (file)
@@ -26,6 +26,7 @@
 #ifndef WTFDominators_h
 #define WTFDominators_h
 
+#include <wtf/FastBitVector.h>
 #include <wtf/GraphNodeWorklist.h>
 
 namespace WTF {
@@ -120,8 +121,9 @@ public:
     {
         return from == to || strictlyDominates(from, to);
     }
-    
-    typename Graph::Node immediateDominatorOf(typename Graph::Node block) const
+
+    // Returns the immediate dominator of this block. Returns null for the root block.
+    typename Graph::Node idom(typename Graph::Node block) const
     {
         return m_data[block].idomParent;
     }
@@ -555,7 +557,7 @@ private:
     
         bool dominates(typename Graph::Node from, typename Graph::Node to) const
         {
-            return dominates(from->index, to->index);
+            return dominates(m_graph.index(from), m_graph.index(to));
         }
     
         void dump(PrintStream& out) const
@@ -593,7 +595,7 @@ private:
             return m_results[idx].setAndCheck(m_scratch);
         }
     
-        Graph m_graph;
+        Graph& m_graph;
         Vector<FastBitVector> m_results; // For each block, the bitvector of blocks that dominate it.
         FastBitVector m_scratch; // A temporary bitvector with bit for each block. We recycle this to save new/deletes.
     };
@@ -636,12 +638,12 @@ private:
                     continue;
                 dataLog("    Block ", graph.dump(graph.node(blockIndex)), ": successors = [");
                 CommaPrinter comma;
-                for (unsigned i = 0; i < block->numSuccessors(); ++i)
-                    dataLog(comma, graph.dump(block->successor(i)));
+                for (auto successor : graph.successors(block))
+                    dataLog(comma, graph.dump(successor));
                 dataLog("], predecessors = [");
                 comma = CommaPrinter();
-                for (unsigned i = 0; i < block->predecessors.size(); ++i)
-                    dataLog(comma, graph.dump(block->predecessors[i]));
+                for (auto predecessor : graph.predecessors(block))
+                    dataLog(comma, graph.dump(predecessor));
                 dataLog("]\n");
             }
             dataLog("\n");