B3 should be be clever about choosing which child to reuse for result in two-operand...
authorfpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Mon, 30 Nov 2015 21:05:25 +0000 (21:05 +0000)
committerfpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Mon, 30 Nov 2015 21:05:25 +0000 (21:05 +0000)
https://bugs.webkit.org/show_bug.cgi?id=151321

Reviewed by Geoffrey Garen.

When lowering a commutative operation to a two-operand instruction, you have a choice of which
child value to move into the result tmp. For example we might have:

    @x = Add(@y, @z)

Assuming no three-operand add is available, we could either lower it to this:

    Move %y, %x
    Add %z, %x

or to this:

    Move %z, %x
    Add %y, %x

Which is better depends on the likelihood of coalescing with %x. If it's more likely that %y will
coalesce with %x, then we want to use the first form. Otherwise, we should use the second form.

This implements two heuristics for selecting the right form, and makes those heuristics reusable
within the B3->Air lowering by abstracting it as preferRightForResult(). For non-commutative
operations we must use the first form, so the first form is the default. The heuristics are:

- If the right child has only one user, then use the second form instead. This is profitable because
  that means that @z dies at the Add, so using the second form means that the Move will be coalesced
  away.

- If one of the children is a Phi that this operation (the Add in this case) flows into via some
  Upsilon - possibly transitively through other Phis - then use the form that cases a Move on that
  child. This overrides everything else, and is meant to optimize variables that accumulate in a
  loop.

This required adding a reusable PhiChildren analysis, so I wrote one. It has an API that is mostly
based on iterators, and a higher-level API for looking at transitive children that is based on
functors.

I was originally implementing this for completeness, but when looking at how it interacted with
imaging-gaussian-blur, I realized the need for some heuristic for the loop-accumulator case. This
helps a lot on that benchmark. This widens the overall lead that B3 has on imaging-gaussian-blur, but
steady-state runs that exclude compile latency still show a slight deficit. That will most likely get
fixed by https://bugs.webkit.org/show_bug.cgi?id=151174.

No new tests because the commutativity appears to be covered by existing tests, and anyway, there are
no correctness implications to commuting a commutative operation.

* CMakeLists.txt:
* JavaScriptCore.xcodeproj/project.pbxproj:
* b3/B3LowerToAir.cpp:
(JSC::B3::Air::LowerToAir::LowerToAir):
(JSC::B3::Air::LowerToAir::canBeInternal):
(JSC::B3::Air::LowerToAir::appendUnOp):
(JSC::B3::Air::LowerToAir::preferRightForResult):
(JSC::B3::Air::LowerToAir::appendBinOp):
(JSC::B3::Air::LowerToAir::lower):
* b3/B3PhiChildren.cpp: Added.
(JSC::B3::PhiChildren::PhiChildren):
(JSC::B3::PhiChildren::~PhiChildren):
* b3/B3PhiChildren.h: Added.
(JSC::B3::PhiChildren::ValueCollection::ValueCollection):
(JSC::B3::PhiChildren::ValueCollection::size):
(JSC::B3::PhiChildren::ValueCollection::at):
(JSC::B3::PhiChildren::ValueCollection::operator[]):
(JSC::B3::PhiChildren::ValueCollection::contains):
(JSC::B3::PhiChildren::ValueCollection::iterator::iterator):
(JSC::B3::PhiChildren::ValueCollection::iterator::operator*):
(JSC::B3::PhiChildren::ValueCollection::iterator::operator++):
(JSC::B3::PhiChildren::ValueCollection::iterator::operator==):
(JSC::B3::PhiChildren::ValueCollection::iterator::operator!=):
(JSC::B3::PhiChildren::ValueCollection::begin):
(JSC::B3::PhiChildren::ValueCollection::end):
(JSC::B3::PhiChildren::UpsilonCollection::UpsilonCollection):
(JSC::B3::PhiChildren::UpsilonCollection::size):
(JSC::B3::PhiChildren::UpsilonCollection::at):
(JSC::B3::PhiChildren::UpsilonCollection::operator[]):
(JSC::B3::PhiChildren::UpsilonCollection::contains):
(JSC::B3::PhiChildren::UpsilonCollection::begin):
(JSC::B3::PhiChildren::UpsilonCollection::end):
(JSC::B3::PhiChildren::UpsilonCollection::values):
(JSC::B3::PhiChildren::UpsilonCollection::forAllTransitiveIncomingValues):
(JSC::B3::PhiChildren::UpsilonCollection::transitivelyUses):
(JSC::B3::PhiChildren::at):
(JSC::B3::PhiChildren::operator[]):
* b3/B3Procedure.cpp:
(JSC::B3::Procedure::Procedure):
* b3/B3Procedure.h:
* b3/B3UseCounts.cpp:
(JSC::B3::UseCounts::UseCounts):
* b3/B3UseCounts.h:
(JSC::B3::UseCounts::numUses):
(JSC::B3::UseCounts::numUsingInstructions):
(JSC::B3::UseCounts::operator[]): Deleted.

git-svn-id: http://svn.webkit.org/repository/webkit/trunk@192816 268f45cc-cd09-0410-ab3c-d52691b4dbfc

Source/JavaScriptCore/CMakeLists.txt
Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj
Source/JavaScriptCore/b3/B3LowerToAir.cpp
Source/JavaScriptCore/b3/B3PhiChildren.cpp [new file with mode: 0644]
Source/JavaScriptCore/b3/B3PhiChildren.h [new file with mode: 0644]
Source/JavaScriptCore/b3/B3UseCounts.cpp
Source/JavaScriptCore/b3/B3UseCounts.h

index 2383e9f..74c6fcc 100644 (file)
@@ -121,6 +121,7 @@ set(JavaScriptCore_SOURCES
     b3/B3PatchpointSpecial.cpp
     b3/B3PatchpointValue.cpp
     b3/B3PhaseScope.cpp
+    b3/B3PhiChildren.cpp
     b3/B3Procedure.cpp
     b3/B3ReduceStrength.cpp
     b3/B3StackmapSpecial.cpp
index 2f9fba2..92bcb45 100644 (file)
@@ -1,5 +1,103 @@
 2015-11-30  Filip Pizlo  <fpizlo@apple.com>
 
+        B3 should be be clever about choosing which child to reuse for result in two-operand commutative operations
+        https://bugs.webkit.org/show_bug.cgi?id=151321
+
+        Reviewed by Geoffrey Garen.
+
+        When lowering a commutative operation to a two-operand instruction, you have a choice of which
+        child value to move into the result tmp. For example we might have:
+
+            @x = Add(@y, @z)
+
+        Assuming no three-operand add is available, we could either lower it to this:
+
+            Move %y, %x
+            Add %z, %x
+
+        or to this:
+
+            Move %z, %x
+            Add %y, %x
+
+        Which is better depends on the likelihood of coalescing with %x. If it's more likely that %y will
+        coalesce with %x, then we want to use the first form. Otherwise, we should use the second form.
+
+        This implements two heuristics for selecting the right form, and makes those heuristics reusable
+        within the B3->Air lowering by abstracting it as preferRightForResult(). For non-commutative
+        operations we must use the first form, so the first form is the default. The heuristics are:
+
+        - If the right child has only one user, then use the second form instead. This is profitable because
+          that means that @z dies at the Add, so using the second form means that the Move will be coalesced
+          away.
+
+        - If one of the children is a Phi that this operation (the Add in this case) flows into via some
+          Upsilon - possibly transitively through other Phis - then use the form that cases a Move on that
+          child. This overrides everything else, and is meant to optimize variables that accumulate in a
+          loop.
+
+        This required adding a reusable PhiChildren analysis, so I wrote one. It has an API that is mostly
+        based on iterators, and a higher-level API for looking at transitive children that is based on
+        functors.
+
+        I was originally implementing this for completeness, but when looking at how it interacted with
+        imaging-gaussian-blur, I realized the need for some heuristic for the loop-accumulator case. This
+        helps a lot on that benchmark. This widens the overall lead that B3 has on imaging-gaussian-blur, but
+        steady-state runs that exclude compile latency still show a slight deficit. That will most likely get
+        fixed by https://bugs.webkit.org/show_bug.cgi?id=151174.
+
+        No new tests because the commutativity appears to be covered by existing tests, and anyway, there are
+        no correctness implications to commuting a commutative operation.
+
+        * CMakeLists.txt:
+        * JavaScriptCore.xcodeproj/project.pbxproj:
+        * b3/B3LowerToAir.cpp:
+        (JSC::B3::Air::LowerToAir::LowerToAir):
+        (JSC::B3::Air::LowerToAir::canBeInternal):
+        (JSC::B3::Air::LowerToAir::appendUnOp):
+        (JSC::B3::Air::LowerToAir::preferRightForResult):
+        (JSC::B3::Air::LowerToAir::appendBinOp):
+        (JSC::B3::Air::LowerToAir::lower):
+        * b3/B3PhiChildren.cpp: Added.
+        (JSC::B3::PhiChildren::PhiChildren):
+        (JSC::B3::PhiChildren::~PhiChildren):
+        * b3/B3PhiChildren.h: Added.
+        (JSC::B3::PhiChildren::ValueCollection::ValueCollection):
+        (JSC::B3::PhiChildren::ValueCollection::size):
+        (JSC::B3::PhiChildren::ValueCollection::at):
+        (JSC::B3::PhiChildren::ValueCollection::operator[]):
+        (JSC::B3::PhiChildren::ValueCollection::contains):
+        (JSC::B3::PhiChildren::ValueCollection::iterator::iterator):
+        (JSC::B3::PhiChildren::ValueCollection::iterator::operator*):
+        (JSC::B3::PhiChildren::ValueCollection::iterator::operator++):
+        (JSC::B3::PhiChildren::ValueCollection::iterator::operator==):
+        (JSC::B3::PhiChildren::ValueCollection::iterator::operator!=):
+        (JSC::B3::PhiChildren::ValueCollection::begin):
+        (JSC::B3::PhiChildren::ValueCollection::end):
+        (JSC::B3::PhiChildren::UpsilonCollection::UpsilonCollection):
+        (JSC::B3::PhiChildren::UpsilonCollection::size):
+        (JSC::B3::PhiChildren::UpsilonCollection::at):
+        (JSC::B3::PhiChildren::UpsilonCollection::operator[]):
+        (JSC::B3::PhiChildren::UpsilonCollection::contains):
+        (JSC::B3::PhiChildren::UpsilonCollection::begin):
+        (JSC::B3::PhiChildren::UpsilonCollection::end):
+        (JSC::B3::PhiChildren::UpsilonCollection::values):
+        (JSC::B3::PhiChildren::UpsilonCollection::forAllTransitiveIncomingValues):
+        (JSC::B3::PhiChildren::UpsilonCollection::transitivelyUses):
+        (JSC::B3::PhiChildren::at):
+        (JSC::B3::PhiChildren::operator[]):
+        * b3/B3Procedure.cpp:
+        (JSC::B3::Procedure::Procedure):
+        * b3/B3Procedure.h:
+        * b3/B3UseCounts.cpp:
+        (JSC::B3::UseCounts::UseCounts):
+        * b3/B3UseCounts.h:
+        (JSC::B3::UseCounts::numUses):
+        (JSC::B3::UseCounts::numUsingInstructions):
+        (JSC::B3::UseCounts::operator[]): Deleted.
+
+2015-11-30  Filip Pizlo  <fpizlo@apple.com>
+
         REGRESSION(r192812): This change seems to have broken the iOS builds (Requested by ryanhaddad on #webkit).
         https://bugs.webkit.org/show_bug.cgi?id=151669
 
index 7aba3af..9356846 100644 (file)
                0F338E1E1BF286EA0013C88F /* B3LowerMacros.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F338E1A1BF286EA0013C88F /* B3LowerMacros.h */; };
                0F34B14916D42010001CDA5A /* DFGUseKind.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F34B14716D4200E001CDA5A /* DFGUseKind.cpp */; };
                0F34B14A16D42013001CDA5A /* DFGUseKind.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F34B14816D4200E001CDA5A /* DFGUseKind.h */; };
+               0F37308C1C0BD29100052BFA /* B3PhiChildren.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F37308A1C0BD29100052BFA /* B3PhiChildren.cpp */; };
+               0F37308D1C0BD29100052BFA /* B3PhiChildren.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F37308B1C0BD29100052BFA /* B3PhiChildren.h */; };
                0F37308F1C0CD68500052BFA /* DisallowMacroScratchRegisterUsage.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F37308E1C0CD68500052BFA /* DisallowMacroScratchRegisterUsage.h */; };
                0F3730911C0CD70C00052BFA /* AllowMacroScratchRegisterUsage.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F3730901C0CD70C00052BFA /* AllowMacroScratchRegisterUsage.h */; };
                0F38B01117CF078000B144D3 /* LLIntEntrypoint.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F38B00F17CF077F00B144D3 /* LLIntEntrypoint.cpp */; };
                0F338E1A1BF286EA0013C88F /* B3LowerMacros.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = B3LowerMacros.h; path = b3/B3LowerMacros.h; sourceTree = "<group>"; };
                0F34B14716D4200E001CDA5A /* DFGUseKind.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = DFGUseKind.cpp; path = dfg/DFGUseKind.cpp; sourceTree = "<group>"; };
                0F34B14816D4200E001CDA5A /* DFGUseKind.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = DFGUseKind.h; path = dfg/DFGUseKind.h; sourceTree = "<group>"; };
+               0F37308A1C0BD29100052BFA /* B3PhiChildren.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = B3PhiChildren.cpp; path = b3/B3PhiChildren.cpp; sourceTree = "<group>"; };
+               0F37308B1C0BD29100052BFA /* B3PhiChildren.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = B3PhiChildren.h; path = b3/B3PhiChildren.h; sourceTree = "<group>"; };
                0F37308E1C0CD68500052BFA /* DisallowMacroScratchRegisterUsage.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = DisallowMacroScratchRegisterUsage.h; sourceTree = "<group>"; };
                0F3730901C0CD70C00052BFA /* AllowMacroScratchRegisterUsage.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = AllowMacroScratchRegisterUsage.h; sourceTree = "<group>"; };
                0F38B00F17CF077F00B144D3 /* LLIntEntrypoint.cpp */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.cpp.cpp; name = LLIntEntrypoint.cpp; path = llint/LLIntEntrypoint.cpp; sourceTree = "<group>"; };
                                0FEC84DE1BDACDAC0080FF74 /* B3PatchpointValue.h */,
                                0FEC84DF1BDACDAC0080FF74 /* B3PhaseScope.cpp */,
                                0FEC84E01BDACDAC0080FF74 /* B3PhaseScope.h */,
+                               0F37308A1C0BD29100052BFA /* B3PhiChildren.cpp */,
+                               0F37308B1C0BD29100052BFA /* B3PhiChildren.h */,
                                0FEC84E11BDACDAC0080FF74 /* B3Procedure.cpp */,
                                0FEC84E21BDACDAC0080FF74 /* B3Procedure.h */,
                                0FEC84E31BDACDAC0080FF74 /* B3ProcedureInlines.h */,
                                0F3B3A281544C997003ED0FF /* DFGCFGSimplificationPhase.h in Headers */,
                                0F9D36951AE9CC33000D4DFB /* DFGCleanUpPhase.h in Headers */,
                                A77A424017A0BBFD00A8DB81 /* DFGClobberize.h in Headers */,
+                               0F37308D1C0BD29100052BFA /* B3PhiChildren.h in Headers */,
                                A77A424217A0BBFD00A8DB81 /* DFGClobberSet.h in Headers */,
                                0F3C1F1B1B868E7900ABB08B /* DFGClobbersExitState.h in Headers */,
                                0F04396E1B03DC0B009598B7 /* DFGCombinedLiveness.h in Headers */,
                                14D2F3DA139F4BE200491031 /* MarkedSpace.cpp in Sources */,
                                142D6F1113539A4100B02E86 /* MarkStack.cpp in Sources */,
                                4340A4841A9051AF00D73CCA /* MathCommon.cpp in Sources */,
+                               0F37308C1C0BD29100052BFA /* B3PhiChildren.cpp in Sources */,
                                14469DDF107EC7E700650446 /* MathObject.cpp in Sources */,
                                90213E3D123A40C200D422F3 /* MemoryStatistics.cpp in Sources */,
                                0FB5467D14F5CFD6002C2989 /* MethodOfGettingAValueProfile.cpp in Sources */,
index f06f309..d8959e4 100644 (file)
@@ -44,6 +44,7 @@
 #include "B3PatchpointSpecial.h"
 #include "B3PatchpointValue.h"
 #include "B3PhaseScope.h"
+#include "B3PhiChildren.h"
 #include "B3Procedure.h"
 #include "B3StackSlotValue.h"
 #include "B3UpsilonValue.h"
@@ -65,6 +66,7 @@ public:
         : m_valueToTmp(procedure.values().size())
         , m_blockToBlock(procedure.size())
         , m_useCounts(procedure)
+        , m_phiChildren(procedure)
         , m_procedure(procedure)
         , m_code(procedure.code())
     {
@@ -317,8 +319,11 @@ private:
         if (m_valueToTmp[value])
             return false;
         
-        // We require internals to have only one use - us.
-        if (m_useCounts[value] != 1)
+        // We require internals to have only one use - us. It's not clear if this should be numUses() or
+        // numUsingInstructions(). Ideally, it would be numUsingInstructions(), except that it's not clear
+        // if we'd actually do the right thing when matching over such a DAG pattern. For now, it simply
+        // doesn't matter because we don't implement patterns that would trigger this.
+        if (m_useCounts.numUses(value) != 1)
             return false;
 
         return true;
@@ -525,6 +530,36 @@ private:
         append(opcode, result);
     }
 
+    // Call this method when doing two-operand lowering of a commutative operation. You have a choice of
+    // which incoming Value is moved into the result. This will select which one is likely to be most
+    // profitable to use as the result. Doing the right thing can have big performance consequences in tight
+    // kernels.
+    bool preferRightForResult(Value* left, Value* right)
+    {
+        // The default is to move left into result, because that's required for non-commutative instructions.
+        // The value that we want to move into result position is the one that dies here. So, if we're
+        // compiling a commutative operation and we know that actually right is the one that dies right here,
+        // then we can flip things around to help coalescing, which then kills the move instruction.
+        //
+        // But it's more complicated:
+        // - Used-once is a bad estimate of whether the variable dies here.
+        // - A child might be a candidate for coalescing with this value.
+        //
+        // Currently, we have machinery in place to recognize super obvious forms of the latter issue.
+
+        bool result = m_useCounts.numUsingInstructions(right) == 1;
+        
+        // We recognize when a child is a Phi that has this value as one of its children. We're very
+        // conservative about this; for example we don't even consider transitive Phi children.
+        bool leftIsPhiWithThis = m_phiChildren[left].transitivelyUses(m_value);
+        bool rightIsPhiWithThis = m_phiChildren[right].transitivelyUses(m_value);
+        
+        if (leftIsPhiWithThis != rightIsPhiWithThis)
+            result = rightIsPhiWithThis;
+
+        return result;
+    }
+
     template<
         Air::Opcode opcode32, Air::Opcode opcode64, Air::Opcode opcodeDouble,
         Commutativity commutativity = NotCommutative>
@@ -598,9 +633,11 @@ private:
             return;
         }
 
-        // FIXME: If we're going to use a two-operand instruction, and the operand is commutative, we
-        // should coalesce the result with the operand that is killed.
-        // https://bugs.webkit.org/show_bug.cgi?id=151321
+        if (commutativity == Commutative && preferRightForResult(left, right)) {
+            append(relaxedMoveForType(m_value->type()), tmp(right), result);
+            append(opcode, tmp(left), result);
+            return;
+        }
         
         append(relaxedMoveForType(m_value->type()), tmp(left), result);
         append(opcode, tmp(right), result);
@@ -1658,7 +1695,8 @@ private:
         }
 
         case CheckAdd:
-        case CheckSub: {
+        case CheckSub:
+        case CheckMul: {
             CheckValue* checkValue = m_value->as<CheckValue>();
 
             Value* left = checkValue->child(0);
@@ -1684,19 +1722,21 @@ private:
                 return;
             }
 
-            // FIXME: Use commutativity of CheckAdd to increase the likelihood of coalescing.
-            // https://bugs.webkit.org/show_bug.cgi?id=151321
-
-            append(Move, tmp(left), result);
-            
             Air::Opcode opcode = Air::Oops;
+            Commutativity commutativity = NotCommutative;
+            Arg::Role stackmapRole = Arg::Use;
             switch (m_value->opcode()) {
             case CheckAdd:
                 opcode = opcodeForType(BranchAdd32, BranchAdd64, Air::Oops, m_value->type());
+                commutativity = Commutative;
                 break;
             case CheckSub:
                 opcode = opcodeForType(BranchSub32, BranchSub64, Air::Oops, m_value->type());
                 break;
+            case CheckMul:
+                opcode = opcodeForType(BranchMul32, BranchMul64, Air::Oops, checkValue->type());
+                stackmapRole = Arg::LateUse;
+                break;
             default:
                 RELEASE_ASSERT_NOT_REACHED();
                 break;
@@ -1707,11 +1747,20 @@ private:
             // this rule here.
             // https://bugs.webkit.org/show_bug.cgi?id=151228
 
-            Arg source;
-            if (imm(right) && isValidForm(opcode, Arg::ResCond, Arg::Imm, Arg::Tmp))
-                source = imm(right);
-            else
-                source = tmp(right);
+            Vector<Arg, 2> sources;
+            if (imm(right) && isValidForm(opcode, Arg::ResCond, Arg::Tmp, Arg::Imm, Arg::Tmp)) {
+                sources.append(tmp(left));
+                sources.append(imm(right));
+            } else if (imm(right) && isValidForm(opcode, Arg::ResCond, Arg::Imm, Arg::Tmp)) {
+                sources.append(imm(right));
+                append(Move, tmp(left), result);
+            } else if (commutativity == Commutative && preferRightForResult(left, right)) {
+                sources.append(tmp(left));
+                append(Move, tmp(right), result);
+            } else {
+                sources.append(tmp(right));
+                append(Move, tmp(left), result);
+            }
 
             // There is a really hilarious case that arises when we do BranchAdd32(%x, %x). We won't emit
             // such code, but the coalescing in our register allocator also does copy propagation, so
@@ -1736,13 +1785,13 @@ private:
             // optimizations remove other Move's or identity-like operations. That's why we don't use
             // LateUse here to take care of add-to-self.
             
-            CheckSpecial* special = ensureCheckSpecial(opcode, 3);
+            CheckSpecial* special = ensureCheckSpecial(opcode, 2 + sources.size(), stackmapRole);
             
             Inst inst(Patch, checkValue, Arg::special(special));
 
             inst.args.append(Arg::resCond(MacroAssembler::Overflow));
 
-            inst.args.append(source);
+            inst.args.appendVector(sources);
             inst.args.append(result);
 
             fillStackmap(inst, checkValue, 2);
@@ -1751,34 +1800,6 @@ private:
             return;
         }
 
-        case CheckMul: {
-            CheckValue* checkValue = m_value->as<CheckValue>();
-
-            Value* left = checkValue->child(0);
-            Value* right = checkValue->child(1);
-
-            Tmp result = tmp(m_value);
-
-            Air::Opcode opcode =
-                opcodeForType(BranchMul32, BranchMul64, Air::Oops, checkValue->type());
-            CheckSpecial* special = ensureCheckSpecial(opcode, 3, Arg::LateUse);
-
-            // FIXME: Handle immediates.
-            // https://bugs.webkit.org/show_bug.cgi?id=151230
-
-            append(Move, tmp(left), result);
-
-            Inst inst(Patch, checkValue, Arg::special(special));
-            inst.args.append(Arg::resCond(MacroAssembler::Overflow));
-            inst.args.append(tmp(right));
-            inst.args.append(result);
-
-            fillStackmap(inst, checkValue, 2);
-            
-            m_insts.last().append(WTF::move(inst));
-            return;
-        }
-
         case Check: {
             Inst branch = createBranch(m_value->child(0));
 
@@ -1858,6 +1879,7 @@ private:
     HashMap<StackSlotValue*, Air::StackSlot*> m_stackToStack;
 
     UseCounts m_useCounts;
+    PhiChildren m_phiChildren;
 
     Vector<Vector<Inst, 4>> m_insts;
     Vector<Inst> m_prologue;
diff --git a/Source/JavaScriptCore/b3/B3PhiChildren.cpp b/Source/JavaScriptCore/b3/B3PhiChildren.cpp
new file mode 100644 (file)
index 0000000..7e75596
--- /dev/null
@@ -0,0 +1,51 @@
+/*
+ * Copyright (C) 2015 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+#include "config.h"
+#include "B3PhiChildren.h"
+
+#if ENABLE(B3_JIT)
+
+#include "B3ValueInlines.h"
+
+namespace JSC { namespace B3 {
+
+PhiChildren::PhiChildren(Procedure& proc)
+    : m_upsilons(proc.values().size())
+{
+    for (Value* value : proc.values()) {
+        if (UpsilonValue* upsilon = value->as<UpsilonValue>())
+            m_upsilons[upsilon->phi()].append(upsilon);
+    }
+}
+
+PhiChildren::~PhiChildren()
+{
+}
+
+} } // namespace JSC::B3
+
+#endif // ENABLE(B3_JIT)
+
diff --git a/Source/JavaScriptCore/b3/B3PhiChildren.h b/Source/JavaScriptCore/b3/B3PhiChildren.h
new file mode 100644 (file)
index 0000000..7789b44
--- /dev/null
@@ -0,0 +1,178 @@
+/*
+ * Copyright (C) 2015 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+#ifndef B3PhiChildren_h
+#define B3PhiChildren_h
+
+#if ENABLE(B3_JIT)
+
+#include "B3IndexMap.h"
+#include "B3Procedure.h"
+#include "B3UpsilonValue.h"
+#include <wtf/GraphNodeWorklist.h>
+
+namespace JSC { namespace B3 {
+
+class PhiChildren {
+public:
+    PhiChildren(Procedure&);
+    ~PhiChildren();
+
+    class ValueCollection {
+    public:
+        ValueCollection(Vector<UpsilonValue*>* values = nullptr)
+            : m_values(values)
+        {
+        }
+
+        unsigned size() const { return m_values->size(); }
+        Value* at(unsigned index) const { return m_values->at(index)->child(0); }
+        Value* operator[](unsigned index) const { return at(index); }
+
+        bool contains(Value* value) const
+        {
+            for (unsigned i = size(); i--;) {
+                if (at(i) == value)
+                    return true;
+            }
+            return false;
+        }
+
+        class iterator {
+        public:
+            iterator(Vector<UpsilonValue*>* values = nullptr, unsigned index = 0)
+                : m_values(values)
+                , m_index(index)
+            {
+            }
+
+            Value* operator*() const
+            {
+                return m_values->at(m_index)->child(0);
+            }
+
+            iterator& operator++()
+            {
+                m_index++;
+                return *this;
+            }
+
+            bool operator==(const iterator& other) const
+            {
+                ASSERT(m_values == other.m_values);
+                return m_index == other.m_index;
+            }
+
+            bool operator!=(const iterator& other) const
+            {
+                return !(*this == other);
+            }
+
+        private:
+            Vector<UpsilonValue*>* m_values;
+            unsigned m_index;
+        };
+
+        iterator begin() const { return iterator(m_values); }
+        iterator end() const { return iterator(m_values, m_values->size()); }
+
+    private:
+        Vector<UpsilonValue*>* m_values;
+    };
+    
+    class UpsilonCollection {
+    public:
+        UpsilonCollection()
+        {
+        }
+        
+        UpsilonCollection(PhiChildren* phiChildren, Value* value, Vector<UpsilonValue*>* values)
+            : m_phiChildren(phiChildren)
+            , m_value(value)
+            , m_values(values)
+        {
+        }
+
+        unsigned size() const { return m_values->size(); }
+        Value* at(unsigned index) const { return m_values->at(index); }
+        Value* operator[](unsigned index) const { return at(index); }
+
+        bool contains(Value* value) const { return m_values->contains(value); }
+
+        typedef Vector<UpsilonValue*>::const_iterator iterator;
+        Vector<UpsilonValue*>::const_iterator begin() const { return m_values->begin(); }
+        Vector<UpsilonValue*>::const_iterator end() const { return m_values->end(); }
+
+        ValueCollection values() { return ValueCollection(m_values); }
+        
+        template<typename Functor>
+        void forAllTransitiveIncomingValues(const Functor& functor)
+        {
+            if (m_value->opcode() != Phi) {
+                functor(m_value);
+                return;
+            }
+            
+            GraphNodeWorklist<Value*> worklist;
+            worklist.push(m_value);
+            while (Value* phi = worklist.pop()) {
+                for (Value* child : m_phiChildren->at(phi).values()) {
+                    if (child->opcode() == Phi)
+                        worklist.push(child);
+                    else
+                        functor(child);
+                }
+            }
+        }
+
+        bool transitivelyUses(Value* candidate)
+        {
+            bool result = false;
+            forAllTransitiveIncomingValues(
+                [&] (Value* child) {
+                    result |= child == candidate;
+                });
+            return result;
+        }
+
+    private:
+        PhiChildren* m_phiChildren { nullptr };
+        Value* m_value { nullptr };
+        Vector<UpsilonValue*>* m_values { nullptr };
+    };
+
+    UpsilonCollection at(Value* value) { return UpsilonCollection(this, value, &m_upsilons[value]); }
+    UpsilonCollection operator[](Value* value) { return at(value); }
+
+private:
+    IndexMap<Value, Vector<UpsilonValue*>> m_upsilons;
+};
+
+} } // namespace JSC::B3
+
+#endif // ENABLE(B3_JIT)
+
+#endif // B3PhiChildren_h
+
index b96cbd6..5fe18d4 100644 (file)
@@ -35,11 +35,22 @@ namespace JSC { namespace B3 {
 UseCounts::UseCounts(Procedure& procedure)
     : m_counts(procedure.values().size())
 {
-    for (Value* value : procedure.values())
-        ASSERT_UNUSED(value, !m_counts[value]);
+    Vector<Value*, 64> children;
     for (Value* value : procedure.values()) {
-        for (Value* child : value->children())
-            m_counts[child]++;
+        children.resize(0);
+        for (Value* child : value->children()) {
+            m_counts[child].numUses++;
+            children.append(child);
+        }
+        std::sort(children.begin(), children.end());
+        Value* last = nullptr;
+        for (Value* child : children) {
+            if (child == last)
+                continue;
+
+            m_counts[child].numUsingInstructions++;
+            last = child;
+        }
     }
 }
 
index d2d17d8..a59d58a 100644 (file)
@@ -40,12 +40,16 @@ public:
     UseCounts(Procedure&);
     ~UseCounts();
 
-    unsigned operator[](Value* value) const
-    {
-        return m_counts[value];
-    }
+    unsigned numUses(Value* value) const { return m_counts[value].numUses; }
+    unsigned numUsingInstructions(Value* value) const { return m_counts[value].numUsingInstructions; }
+    
 private:
-    IndexMap<Value, unsigned> m_counts;
+    struct Counts {
+        unsigned numUses { 0 };
+        unsigned numUsingInstructions { 0 };
+    };
+    
+    IndexMap<Value, Counts> m_counts;
 };
 
 } } // namespace JSC::B3