FTL B3 should be able to run crypto-sha1 in eager mode
author fpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Thu, 24 Dec 2015 00:26:04 +0000 (00:26 +0000)
committer fpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Thu, 24 Dec 2015 00:26:04 +0000 (00:26 +0000)
https://bugs.webkit.org/show_bug.cgi?id=152539

Reviewed by Saam Barati.

This patch contains one real bug fix and some other fixes that are primarily there for sanity
because I don't believe they are symptomatic.

The real fix is the instruction selector's handling of Phi. It was assuming that the correct
lowering of Phi is to do nothing and the correct lowering of Upsilon is to store into the tmp
that the Phi uses. But this fails for code patterns like:

    @a = Phi()
    Upsilon(@x, ^a)
    use(@a) // this should see the value that @a had at the point that "@a = Phi()" executed.

This arises when we have a lot of Upsilons in a row and they are trying to perform a
shuffling. Prior to this change, "use(@a)" would see the new value of @a, i.e. @x. That's
wrong. So, this changes the lowering to make each Phi have a special shadow Tmp, and Upsilon
stores to it while Phi loads from it. Most of these assignments get copy-propagated by IRC,
so it doesn't really hurt us. I couldn't find any benchmarks that slowed down because of
this. In fact, I believe that the only time that this would lead to extra interference or
extra assignments is when it's actually needed to be correct.
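
A toy model of the shuffling case may make the difference concrete. This is a sketch, not
JSC code: the names Pair, swapWithOldLowering and swapWithShadowTmps are made up for
illustration, and plain ints stand in for Air Tmps. It models two loop-carried Phis @a and @b
whose Upsilons swap them.

```cpp
#include <cassert>

// Toy model, not JSC code: two Phi values that a pair of Upsilons tries to swap.
struct Pair {
    int a;
    int b;
};

// Old lowering: Phi emits nothing, and Upsilon writes straight into the Phi's Tmp.
inline Pair swapWithOldLowering(Pair in)
{
    int tmpA = in.a; // tmpA is the Tmp of @a = Phi()
    int tmpB = in.b; // tmpB is the Tmp of @b = Phi()
    tmpA = tmpB;     // Upsilon(@b, ^a)
    tmpB = tmpA;     // Upsilon(@a, ^b): reads the already-clobbered tmpA
    return { tmpA, tmpB };
}

// New lowering: Upsilon writes a per-Phi shadow Tmp, and Phi copies from it.
inline Pair swapWithShadowTmps(Pair in)
{
    int tmpA = in.a;
    int tmpB = in.b;
    int shadowA = tmpB; // Upsilon(@b, ^a)
    int shadowB = tmpA; // Upsilon(@a, ^b): tmpA still holds the old @a
    tmpA = shadowA;     // @a = Phi() when the Phi's block runs again
    tmpB = shadowB;     // @b = Phi()
    return { tmpA, tmpB };
}
```

In this model the old lowering turns {1, 2} into {2, 2}, while the shadow-Tmp lowering
produces the intended {2, 1}.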

This also contains other fixes, which are probably not for real bugs, but they make me feel
all warm and fuzzy:

- spillEverything() works again.  Previously, it didn't have all of IRC's smarts for handling
  a spill of a ZDef.  I fixed this by creating a helper phase that finds all subwidth ZDefs
  to spill slots and amends them with zero-fills of the top bits.

- IRC no longer requires precise TmpWidth analysis.  Previously, if TmpWidth gave pessimistic
  results, the subwidth ZDef bug would return.  That probably means that it was never fixed
  to begin with, since it's totally cool for just a single def or use of a tmp to cause it
  to become pessimistic. But there may still have been some subwidth ZDefs.  The way that I
  fixed this bug is to have IRC also run the ZDef fixup code that spillEverything() uses.
  This is abstracted behind the beautifully named Air::fixSpillSlotZDef().

- B3::validate() does dominance checks!  So, if you shoot yourself in the foot by using
  something before defining it, validate() will tell you.

- Air::TmpWidth is now easy to "turn off" - i.e. to make it go fully conservative. It's not
  an Option; you have to hack code. But that's better than nothing, and it's consistent with
  what we do for other super-internal compiler options that we use rarely.

- You can now run spillEverything() without hacking code.  Just use
  Options::airSpillsEverything().
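
The zero-fill that the ZDef fixup from the first two items performs can be modeled roughly as
follows. This is a sketch, not JSC code: SpillSlot, store32, load64, spillWithoutFixup and
spillWithFixup are hypothetical names, and a byte array stands in for an 8-byte spill slot
that receives a 32-bit ZDef.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Toy model, not JSC code: an 8-byte spill slot.
struct SpillSlot {
    unsigned char bytes[8];
};

// Write 32 bits at the given byte offset, like a Move32 to a stack address.
inline void store32(SpillSlot& slot, unsigned offset, uint32_t value)
{
    std::memcpy(slot.bytes + offset, &value, sizeof(value));
}

// Read the whole slot back, like a 64-bit fill.
inline uint64_t load64(const SpillSlot& slot)
{
    uint64_t result;
    std::memcpy(&result, slot.bytes, sizeof(result));
    return result;
}

// A 32-bit def spilled with no fixup: bytes 4..7 keep whatever was there before.
inline uint64_t spillWithoutFixup(uint64_t staleContents, uint32_t value)
{
    SpillSlot slot;
    std::memcpy(slot.bytes, &staleContents, sizeof(staleContents));
    store32(slot, 0, value);
    return load64(slot);
}

// The same def followed by the extra zero store at offset 4 that the fixup inserts.
inline uint64_t spillWithFixup(uint64_t staleContents, uint32_t value)
{
    SpillSlot slot;
    std::memcpy(slot.bytes, &staleContents, sizeof(staleContents));
    store32(slot, 0, value);
    store32(slot, 4, 0);
    return load64(slot);
}
```

On a little-endian host, spillWithFixup(stale, 42) reads back as 42 for any stale contents,
which is what a ZDef promises; spillWithoutFixup() leaks the stale top bits.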

* JavaScriptCore.xcodeproj/project.pbxproj:
* b3/B3LowerToAir.cpp:
(JSC::B3::Air::LowerToAir::LowerToAir):
(JSC::B3::Air::LowerToAir::run):
(JSC::B3::Air::LowerToAir::lower):
* b3/B3Validate.cpp:
* b3/air/AirCode.h:
(JSC::B3::Air::Code::specials):
(JSC::B3::Air::Code::forAllTmps):
(JSC::B3::Air::Code::isFastTmp):
* b3/air/AirFixSpillSlotZDef.h: Added.
(JSC::B3::Air::fixSpillSlotZDef):
* b3/air/AirGenerate.cpp:
(JSC::B3::Air::prepareForGeneration):
* b3/air/AirIteratedRegisterCoalescing.cpp:
* b3/air/AirSpillEverything.cpp:
(JSC::B3::Air::spillEverything):
* b3/air/AirTmpWidth.cpp:
(JSC::B3::Air::TmpWidth::recompute):
* jit/JITOperations.cpp:
* runtime/Options.h:

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@194402 268f45cc-cd09-0410-ab3c-d52691b4dbfc
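
The rule that the new dominance checks in B3::validate() enforce can be sketched on a toy CFG.
This is not the B3::Dominators implementation: Predecessors, computeDominators, ValuePosition
and defDominatesUse are hypothetical names, the dominator sets are computed with the simple
iterative data-flow formulation, and every block is assumed to be reachable from block 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG, not B3 code: preds[b] lists the predecessors of block b; block 0 is the entry.
using Predecessors = std::vector<std::vector<size_t>>;

// Returns dom where dom[b][d] is true when block d dominates block b.
inline std::vector<std::vector<bool>> computeDominators(const Predecessors& preds)
{
    size_t n = preds.size();
    std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
    dom[0].assign(n, false);
    dom[0][0] = true; // The entry block is dominated only by itself.
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t b = 1; b < n; ++b) {
            // Intersect the dominator sets of all predecessors, then add b itself.
            std::vector<bool> next(n, true);
            for (size_t p : preds[b]) {
                for (size_t d = 0; d < n; ++d)
                    next[d] = next[d] && dom[p][d];
            }
            next[b] = true;
            if (next != dom[b]) {
                dom[b] = next;
                changed = true;
            }
        }
    }
    return dom;
}

// Where a value lives: its owning block and its index within that block.
struct ValuePosition {
    size_t block;
    size_t index;
};

// Same block: the def must come first. Different blocks: the def's block must dominate.
inline bool defDominatesUse(
    const std::vector<std::vector<bool>>& dom, ValuePosition def, ValuePosition use)
{
    if (def.block == use.block)
        return use.index > def.index;
    return dom[use.block][def.block];
}
```

For a diamond CFG (0 branches to 1 and 2, which both jump to 3), a value defined in block 0
may be used in block 3, but a value defined in block 1 may not.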

Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj
Source/JavaScriptCore/b3/B3LowerToAir.cpp
Source/JavaScriptCore/b3/B3Validate.cpp
Source/JavaScriptCore/b3/air/AirCode.h
Source/JavaScriptCore/b3/air/AirFixSpillSlotZDef.h [new file with mode: 0644]
Source/JavaScriptCore/b3/air/AirGenerate.cpp
Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp
Source/JavaScriptCore/b3/air/AirSpillEverything.cpp
Source/JavaScriptCore/b3/air/AirTmpWidth.cpp
Source/JavaScriptCore/runtime/Options.h

diff --git a/Source/JavaScriptCore/ChangeLog b/Source/JavaScriptCore/ChangeLog
index 48a02dc..51cf36d 100644
@@ -1,5 +1,77 @@
 2015-12-23  Filip Pizlo  <fpizlo@apple.com>
 
+        FTL B3 should be able to run crypto-sha1 in eager mode
+        https://bugs.webkit.org/show_bug.cgi?id=152539
+
+        Reviewed by Saam Barati.
+
+        This patch contains one real bug fix and some other fixes that are primarily there for sanity
+        because I don't believe they are symptomatic.
+
+        The real fix is the instruction selector's handling of Phi. It was assuming that the correct
+        lowering of Phi is to do nothing and the correct lowering of Upsilon is to store into the tmp
+        that the Phi uses. But this fails for code patterns like:
+
+            @a = Phi()
+            Upsilon(@x, ^a)
+            use(@a) // this should see the value that @a had at the point that "@a = Phi()" executed.
+
+        This arises when we have a lot of Upsilons in a row and they are trying to perform a
+        shuffling. Prior to this change, "use(@a)" would see the new value of @a, i.e. @x. That's
+        wrong. So, this changes the lowering to make each Phi have a special shadow Tmp, and Upsilon
+        stores to it while Phi loads from it. Most of these assignments get copy-propagated by IRC,
+        so it doesn't really hurt us. I couldn't find any benchmarks that slowed down because of
+        this. In fact, I believe that the only time that this would lead to extra interference or
+        extra assignments is when it's actually needed to be correct.
+
+        This also contains other fixes, which are probably not for real bugs, but they make me feel
+        all warm and fuzzy:
+
+        - spillEverything() works again.  Previously, it didn't have all of IRC's smarts for handling
+          a spill of a ZDef.  I fixed this by creating a helper phase that finds all subwidth ZDefs
+          to spill slots and amends them with zero-fills of the top bits.
+
+        - IRC no longer requires precise TmpWidth analysis.  Previously, if TmpWidth gave pessimistic
+          results, the subwidth ZDef bug would return.  That probably means that it was never fixed
+          to begin with, since it's totally cool for just a single def or use of a tmp to cause it
+          to become pessimistic. But there may still have been some subwidth ZDefs.  The way that I
+          fixed this bug is to have IRC also run the ZDef fixup code that spillEverything() uses.
+          This is abstracted behind the beautifully named Air::fixSpillSlotZDef().
+
+        - B3::validate() does dominance checks!  So, if you shoot yourself in the foot by using
+          something before defining it, validate() will tell you.
+
+        - Air::TmpWidth is now easy to "turn off" - i.e. to make it go fully conservative. It's not
+          an Option; you have to hack code. But that's better than nothing, and it's consistent with
+          what we do for other super-internal compiler options that we use rarely.
+
+        - You can now run spillEverything() without hacking code.  Just use
+          Options::airSpillsEverything().
+
+        * JavaScriptCore.xcodeproj/project.pbxproj:
+        * b3/B3LowerToAir.cpp:
+        (JSC::B3::Air::LowerToAir::LowerToAir):
+        (JSC::B3::Air::LowerToAir::run):
+        (JSC::B3::Air::LowerToAir::lower):
+        * b3/B3Validate.cpp:
+        * b3/air/AirCode.h:
+        (JSC::B3::Air::Code::specials):
+        (JSC::B3::Air::Code::forAllTmps):
+        (JSC::B3::Air::Code::isFastTmp):
+        * b3/air/AirFixSpillSlotZDef.h: Added.
+        (JSC::B3::Air::fixSpillSlotZDef):
+        * b3/air/AirGenerate.cpp:
+        (JSC::B3::Air::prepareForGeneration):
+        * b3/air/AirIteratedRegisterCoalescing.cpp:
+        * b3/air/AirSpillEverything.cpp:
+        (JSC::B3::Air::spillEverything):
+        * b3/air/AirTmpWidth.cpp:
+        (JSC::B3::Air::TmpWidth::recompute):
+        * jit/JITOperations.cpp:
+        * runtime/Options.h:
+
+2015-12-23  Filip Pizlo  <fpizlo@apple.com>
+
         Need a story for platform-specific Args
         https://bugs.webkit.org/show_bug.cgi?id=152529
 
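diff --git a/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj b/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj
index ae6b9f6..6439cc1 100644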
                0F493AFA16D0CAD30084508B /* SourceProvider.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F493AF816D0CAD10084508B /* SourceProvider.cpp */; };
                0F4B94DC17B9F07500DD03A4 /* TypedArrayInlines.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F4B94DB17B9F07500DD03A4 /* TypedArrayInlines.h */; settings = {ATTRIBUTES = (Private, ); }; };
                0F4C91661C29F4F2004341A6 /* B3OriginDump.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F4C91651C29F4F2004341A6 /* B3OriginDump.h */; };
+               0F4C91681C2B3D68004341A6 /* AirFixSpillSlotZDef.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F4C91671C2B3D68004341A6 /* AirFixSpillSlotZDef.h */; };
                0F4F29DF18B6AD1C0057BC15 /* DFGStaticExecutionCountEstimationPhase.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F4F29DD18B6AD1C0057BC15 /* DFGStaticExecutionCountEstimationPhase.cpp */; };
                0F4F29E018B6AD1C0057BC15 /* DFGStaticExecutionCountEstimationPhase.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F4F29DE18B6AD1C0057BC15 /* DFGStaticExecutionCountEstimationPhase.h */; };
                0F50AF3C193E8B3900674EE8 /* DFGStructureClobberState.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F50AF3B193E8B3900674EE8 /* DFGStructureClobberState.h */; };
                0F493AF816D0CAD10084508B /* SourceProvider.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = SourceProvider.cpp; sourceTree = "<group>"; };
                0F4B94DB17B9F07500DD03A4 /* TypedArrayInlines.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = TypedArrayInlines.h; sourceTree = "<group>"; };
                0F4C91651C29F4F2004341A6 /* B3OriginDump.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = B3OriginDump.h; path = b3/B3OriginDump.h; sourceTree = "<group>"; };
+               0F4C91671C2B3D68004341A6 /* AirFixSpillSlotZDef.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirFixSpillSlotZDef.h; path = b3/air/AirFixSpillSlotZDef.h; sourceTree = "<group>"; };
                0F4F29DD18B6AD1C0057BC15 /* DFGStaticExecutionCountEstimationPhase.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = DFGStaticExecutionCountEstimationPhase.cpp; path = dfg/DFGStaticExecutionCountEstimationPhase.cpp; sourceTree = "<group>"; };
                0F4F29DE18B6AD1C0057BC15 /* DFGStaticExecutionCountEstimationPhase.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = DFGStaticExecutionCountEstimationPhase.h; path = dfg/DFGStaticExecutionCountEstimationPhase.h; sourceTree = "<group>"; };
                0F50AF3B193E8B3900674EE8 /* DFGStructureClobberState.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = DFGStructureClobberState.h; path = dfg/DFGStructureClobberState.h; sourceTree = "<group>"; };
                                0F4570371BE44C910062A629 /* AirEliminateDeadCode.h */,
                                262D85B41C0D650F006ACB61 /* AirFixPartialRegisterStalls.cpp */,
                                262D85B51C0D650F006ACB61 /* AirFixPartialRegisterStalls.h */,
+                               0F4C91671C2B3D68004341A6 /* AirFixSpillSlotZDef.h */,
                                0FEC85521BDACDC70080FF74 /* AirFrequentedBlock.h */,
                                0FEC85531BDACDC70080FF74 /* AirGenerate.cpp */,
                                0FEC85541BDACDC70080FF74 /* AirGenerate.h */,
                                DC00039319D8BE6F00023EB0 /* DFGPreciseLocalClobberize.h in Headers */,
                                0FBE0F7516C1DB0B0082C5E8 /* DFGPredictionInjectionPhase.h in Headers */,
                                0FFFC95E14EF90B700C72532 /* DFGPredictionPropagationPhase.h in Headers */,
+                               0F4C91681C2B3D68004341A6 /* AirFixSpillSlotZDef.h in Headers */,
                                0F3E01AB19D353A500F61B7F /* DFGPrePostNumbering.h in Headers */,
                                0F2B9CED19D0BA7D00B1D1B5 /* DFGPromotedHeapLocation.h in Headers */,
                                0FFC92161B94FB3E0071DD66 /* DFGPropertyTypeKey.h in Headers */,
diff --git a/Source/JavaScriptCore/b3/B3LowerToAir.cpp b/Source/JavaScriptCore/b3/B3LowerToAir.cpp
index 2c74631..a8f7129 100644
@@ -64,6 +64,7 @@ class LowerToAir {
 public:
     LowerToAir(Procedure& procedure)
         : m_valueToTmp(procedure.values().size())
+        , m_phiToTmp(procedure.values().size())
         , m_blockToBlock(procedure.size())
         , m_useCounts(procedure)
         , m_phiChildren(procedure)
@@ -76,9 +77,21 @@ public:
     {
         for (B3::BasicBlock* block : m_procedure)
             m_blockToBlock[block] = m_code.addBlock(block->frequency());
+        
         for (Value* value : m_procedure.values()) {
-            if (StackSlotValue* stackSlotValue = value->as<StackSlotValue>())
+            switch (value->opcode()) {
+            case Phi: {
+                m_phiToTmp[value] = m_code.newTmp(Arg::typeForB3Type(value->type()));
+                break;
+            }
+            case B3::StackSlot: {
+                StackSlotValue* stackSlotValue = value->as<StackSlotValue>();
                 m_stackToStack.add(stackSlotValue, m_code.addStackSlot(stackSlotValue));
+                break;
+            }
+            default:
+                break;
+            }
         }
 
         m_procedure.resetValueOwners(); // Used by crossesInterference().
@@ -2126,12 +2139,17 @@ private:
             Value* value = m_value->child(0);
             append(
                 relaxedMoveForType(value->type()), immOrTmp(value),
-                tmp(m_value->as<UpsilonValue>()->phi()));
+                m_phiToTmp[m_value->as<UpsilonValue>()->phi()]);
             return;
         }
 
         case Phi: {
-            // Our semantics are determined by Upsilons, so we have nothing to do here.
+            // Snapshot the value of the Phi. It may change under us because you could do:
+            // a = Phi()
+            // Upsilon(@x, ^a)
+            // @a => this should get the value of the Phi before the Upsilon, i.e. not @x.
+
+            append(relaxedMoveForType(m_value->type()), m_phiToTmp[m_value], tmp(m_value));
             return;
         }
 
@@ -2209,6 +2227,7 @@ private:
 
     IndexSet<Value> m_locked; // These are values that will have no Tmp in Air.
     IndexMap<Value, Tmp> m_valueToTmp; // These are values that must have a Tmp in Air. We say that a Value* with a non-null Tmp is "pinned".
+    IndexMap<Value, Tmp> m_phiToTmp; // Each Phi gets its own Tmp.
     IndexMap<B3::BasicBlock, Air::BasicBlock*> m_blockToBlock;
     HashMap<StackSlotValue*, Air::StackSlot*> m_stackToStack;
 
diff --git a/Source/JavaScriptCore/b3/B3Validate.cpp b/Source/JavaScriptCore/b3/B3Validate.cpp
index 77bfb5f..6a2c2e6 100644
@@ -30,6 +30,7 @@
 
 #include "B3ArgumentRegValue.h"
 #include "B3BasicBlockInlines.h"
+#include "B3Dominators.h"
 #include "B3MemoryValue.h"
 #include "B3Procedure.h"
 #include "B3StackSlotValue.h"
@@ -62,11 +63,17 @@ public:
         HashSet<BasicBlock*> blocks;
         HashSet<Value*> valueInProc;
         HashMap<Value*, unsigned> valueInBlock;
+        HashMap<Value*, BasicBlock*> valueOwner;
+        HashMap<Value*, unsigned> valueIndex;
 
         for (BasicBlock* block : m_procedure) {
             blocks.add(block);
-            for (Value* value : *block)
+            for (unsigned i = 0; i < block->size(); ++i) {
+                Value* value = block->at(i);
                 valueInBlock.add(value, 0).iterator->value++;
+                valueOwner.add(value, block);
+                valueIndex.add(value, i);
+            }
         }
 
         for (Value* value : m_procedure.values())
@@ -79,10 +86,17 @@ public:
             VALIDATE(entry.value == 1, ("At ", *entry.key));
         }
 
+        // Compute dominators ourselves to avoid perturbing Procedure.
+        Dominators dominators(m_procedure);
+
         for (Value* value : valueInProc) {
             for (Value* child : value->children()) {
                 VALIDATE(child, ("At ", *value));
                 VALIDATE(valueInProc.contains(child), ("At ", *value, "->", pointerDump(child)));
+                if (valueOwner.get(child) == valueOwner.get(value))
+                    VALIDATE(valueIndex.get(value) > valueIndex.get(child), ("At ", *value, "->", pointerDump(child)));
+                else
+                    VALIDATE(dominators.dominates(valueOwner.get(child), valueOwner.get(value)), ("at ", *value, "->", pointerDump(child)));
             }
         }
 
diff --git a/Source/JavaScriptCore/b3/air/AirCode.h b/Source/JavaScriptCore/b3/air/AirCode.h
index 4a90181..e651b9a 100644
@@ -57,6 +57,9 @@ public:
 
     BasicBlock* addBlock(double frequency = 1);
 
+    // Note that you can rely on stack slots always getting indices that are larger than the index
+    // of any prior stack slot. In fact, all stack slots you create in the future will have an index
+    // that is >= stackSlots().size().
     StackSlot* addStackSlot(unsigned byteSize, StackSlotKind, StackSlotValue* = nullptr);
     StackSlot* addStackSlot(StackSlotValue*);
 
@@ -291,6 +294,15 @@ public:
 
     SpecialsCollection specials() const { return SpecialsCollection(*this); }
 
+    template<typename Callback>
+    void forAllTmps(const Callback& callback) const
+    {
+        for (unsigned i = m_numGPTmps; i--;)
+            callback(Tmp::gpTmpForIndex(i));
+        for (unsigned i = m_numFPTmps; i--;)
+            callback(Tmp::fpTmpForIndex(i));
+    }
+
     void addFastTmp(Tmp);
     bool isFastTmp(Tmp tmp) const { return m_fastTmps.contains(tmp); }
     
diff --git a/Source/JavaScriptCore/b3/air/AirFixSpillSlotZDef.h b/Source/JavaScriptCore/b3/air/AirFixSpillSlotZDef.h
new file mode 100644
index 0000000..2012d89
--- /dev/null
@@ -0,0 +1,78 @@
+/*
+ * Copyright (C) 2015 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+#ifndef AirFixSpillSlotZDef_h
+#define AirFixSpillSlotZDef_h
+
+#include "AirCode.h"
+#include "AirInsertionSet.h"
+#include "AirInstInlines.h"
+
+namespace JSC { namespace B3 { namespace Air {
+
+template<typename IsSpillSlot>
+void fixSpillSlotZDef(Code& code, const IsSpillSlot& isSpillSlot)
+{
+    // We could have introduced ZDef's to StackSlots that are wider than the def. In that case, we
+    // need to emit code to zero-fill the top bits of the StackSlot.
+    InsertionSet insertionSet(code);
+    for (BasicBlock* block : code) {
+        for (unsigned instIndex = 0; instIndex < block->size(); ++instIndex) {
+            Inst& inst = block->at(instIndex);
+
+            inst.forEachArg(
+                [&] (Arg& arg, Arg::Role role, Arg::Type, Arg::Width width) {
+                    if (!Arg::isZDef(role))
+                        return;
+                    if (!arg.isStack())
+                        return;
+                    if (!isSpillSlot(arg.stackSlot()))
+                        return;
+                    if (arg.stackSlot()->byteSize() == Arg::bytes(width))
+                        return;
+
+                    // Currently we only handle this simple case because it's the only one that
+                    // arises: ZDef's are only 32-bit right now. So, when we hit these
+                    // assertions it means that we need to implement those other kinds of zero
+                    // fills.
+                    RELEASE_ASSERT(arg.stackSlot()->byteSize() == 8);
+                    RELEASE_ASSERT(width == Arg::Width32);
+
+                    // We rely on the fact that there must be some way to move zero to a memory
+                    // location without first burning a register. On ARM, we would do this using
+                    // zr.
+                    RELEASE_ASSERT(isValidForm(Move32, Arg::Imm, Arg::Stack));
+                    insertionSet.insert(
+                        instIndex + 1, Move32, inst.origin, Arg::imm(0), arg.withOffset(4));
+                });
+        }
+        insertionSet.execute(block);
+    }
+}
+
+} } } // namespace JSC::B3::Air
+
+#endif // AirFixSpillSlotZDef_h
+
diff --git a/Source/JavaScriptCore/b3/air/AirGenerate.cpp b/Source/JavaScriptCore/b3/air/AirGenerate.cpp
index b7f00a9..c6be7da 100644
@@ -74,7 +74,7 @@ void prepareForGeneration(Code& code)
     // After this phase, every Tmp has a reg.
     //
     // For debugging, you can use spillEverything() to put everything to the stack between each Inst.
-    if (false)
+    if (Options::airSpillsEverything())
         spillEverything(code);
     else
         iteratedRegisterCoalescing(code);
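diff --git a/Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp b/Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp
index 846b432..bb937ba 100644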
@@ -29,6 +29,7 @@
 #if ENABLE(B3_JIT)
 
 #include "AirCode.h"
+#include "AirFixSpillSlotZDef.h"
 #include "AirInsertionSet.h"
 #include "AirInstInlines.h"
 #include "AirLiveness.h"
@@ -1164,6 +1165,7 @@ private:
     void addSpillAndFill(const ColoringAllocator<type>& allocator, HashSet<unsigned>& unspillableTmps)
     {
         HashMap<Tmp, StackSlot*> stackSlots;
+        unsigned newStackSlotThreshold = m_code.stackSlots().size();
         for (Tmp tmp : allocator.spilledTmps()) {
             // All the spilled values become unspillable.
             unspillableTmps.add(AbsoluteTmpMapper<type>::absoluteIndex(tmp));
@@ -1260,6 +1262,12 @@ private:
                 });
             }
         }
+
+        fixSpillSlotZDef(
+            m_code,
+            [&] (StackSlot* stackSlot) -> bool {
+                return stackSlot->index() >= newStackSlotThreshold;
+            });
     }
 
     Code& m_code;
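diff --git a/Source/JavaScriptCore/b3/air/AirSpillEverything.cpp b/Source/JavaScriptCore/b3/air/AirSpillEverything.cpp
index 75115ec..8b4012e 100644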
@@ -29,6 +29,7 @@
 #if ENABLE(B3_JIT)
 
 #include "AirCode.h"
+#include "AirFixSpillSlotZDef.h"
 #include "AirInsertionSet.h"
 #include "AirInstInlines.h"
 #include "AirLiveness.h"
@@ -86,6 +87,7 @@ void spillEverything(Code& code)
 
     // Allocate a stack slot for each tmp.
     Vector<StackSlot*> allStackSlots[Arg::numTypes];
+    unsigned newStackSlotThreshold = code.stackSlots().size();
     for (unsigned typeIndex = 0; typeIndex < Arg::numTypes; ++typeIndex) {
         Vector<StackSlot*>& stackSlots = allStackSlots[typeIndex];
         Arg::Type type = static_cast<Arg::Type>(typeIndex);
@@ -109,7 +111,7 @@ void spillEverything(Code& code)
                     if (arg.isReg())
                         continue;
 
-                    if (inst.admitsStack(i)) { 
+                    if (inst.admitsStack(i)) {
                         StackSlot* stackSlot = allStackSlots[arg.type()][arg.tmpIndex()];
                         arg = Arg::stack(stackSlot);
                         continue;
@@ -184,6 +186,12 @@ void spillEverything(Code& code)
         }
         insertionSet.execute(block);
     }
+
+    fixSpillSlotZDef(
+        code,
+        [&] (StackSlot* stackSlot) -> bool {
+            return stackSlot->index() >= newStackSlotThreshold;
+        });
 }
 
 } } } // namespace JSC::B3::Air
diff --git a/Source/JavaScriptCore/b3/air/AirTmpWidth.cpp b/Source/JavaScriptCore/b3/air/AirTmpWidth.cpp
index c5a6634..501c8e3 100644
@@ -48,17 +48,31 @@ TmpWidth::~TmpWidth()
 
 void TmpWidth::recompute(Code& code)
 {
+    // Set this to true to cause this analysis to always return pessimistic results.
+    const bool beCareful = false;
+    
     m_width.clear();
     
+    auto assumeTheWorst = [&] (Tmp tmp) {
+        Widths& widths = m_width.add(tmp, Widths()).iterator->value;
+        Arg::Type type = Arg(tmp).type();
+        widths.use = Arg::conservativeWidth(type);
+        widths.def = Arg::conservativeWidth(type);
+    };
+    
     // Assume the worst for registers.
     RegisterSet::allRegisters().forEach(
         [&] (Reg reg) {
-            Widths& widths = m_width.add(Tmp(reg), Widths()).iterator->value;
-            Arg::Type type = Arg(Tmp(reg)).type();
-            widths.use = Arg::conservativeWidth(type);
-            widths.def = Arg::conservativeWidth(type);
+            assumeTheWorst(Tmp(reg));
         });
-    
+
+    if (beCareful) {
+        code.forAllTmps(assumeTheWorst);
+        
+        // We fall through because the fixpoint that follows can only make things even more
+        // conservative. This mode isn't meant to be fast, just safe.
+    }
+
     // Now really analyze everything but Move's over Tmp's, but set aside those Move's so we can find
     // them quickly during the fixpoint below. Note that we can make this analysis stronger by
     // recognizing more kinds of Move's or anything that has Move-like behavior, though it's probably not
diff --git a/Source/JavaScriptCore/runtime/Options.h b/Source/JavaScriptCore/runtime/Options.h
index ae2d3f0..c0d9089 100644
@@ -340,6 +340,7 @@ typedef const char* optionString;
     \
     v(bool, logB3PhaseTimes, false, nullptr) \
     v(double, rareBlockPenalty, 0.001, nullptr) \
+    v(bool, airSpillsEverything, false, nullptr) \
     \
     v(bool, useDollarVM, false, "installs the $vm debugging tool in global objects") \
     v(optionString, functionOverrides, nullptr, "file with debugging overrides for function bodies") \