[JSC] Pick how to OSR Enter to FTL at runtime instead of compile time
author benjamin@webkit.org <benjamin@webkit.org@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Wed, 9 Mar 2016 17:51:38 +0000 (17:51 +0000)
committer benjamin@webkit.org <benjamin@webkit.org@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Wed, 9 Mar 2016 17:51:38 +0000 (17:51 +0000)
https://bugs.webkit.org/show_bug.cgi?id=155217

Reviewed by Filip Pizlo.

This patch addresses 2 types of problems with tiering up to FTL
with OSR Entry in a loop:
-When there are nested loops, it is generally valuable to enter
 an outer loop rather than an inner loop.
-When tiering up at a point that cannot OSR Enter, we are at
 the mercy of the outer loop frequency to compile the right
 entry point.

The first case is significant in the test "gaussian-blur".
That test has 4 nested loops. When we have an OSR Entry,
the analysis phases have to be pessimistic about where we enter:
we do not really know what constraints can be proven from
the DFG code that was running.

In "gaussian-blur", integer-range analysis removes pretty
much all overflow checks in the inner loops of where we entered.
The more outside we enter, the better code we generate.

Since we spend the most iterations in the inner loop, we naturally
tend to OSR Enter into the 2 innermost loops, making the most
pessimistic assumptions.

To avoid such problems, I changed how we decide where to OSR Enter.
Previously, the last CheckTierUpAndOSREnter to cross the threshold
was where we take the entry point for FTL.

What happens now is that the entry point is not decided when
compiling the CheckTierUp variants. Instead, all the information
we need is gathered during compilation and kept on the JITCode
to be used at runtime.

When we try to tier up and decide to OSR Enter, we use the information
we have to pick a good outer loop for OSR Entry.
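
The runtime choice described above can be sketched in isolation. This is a minimal model, not the JavaScriptCore code: `TierUpInfo` and `pickEntryBytecode` are hypothetical names, and standard containers stand in for the WTF `HashMap`/`HashSet` fields that the patch adds to the DFG JITCode. It assumes, as the patch comments state, that each list of enclosing loops is ordered from innermost to outermost.

```cpp
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the data the patch keeps on the DFG JITCode.
struct TierUpInfo {
    // Loop bytecode index -> enclosing loops that can OSR Enter, innermost first.
    std::unordered_map<unsigned, std::vector<unsigned>> loopHierarchy;
    // Bytecode indices whose OSR-entry check has already fired at runtime.
    std::unordered_set<unsigned> entrySeen;
};

// Returns the bytecode index for which an FTL entry point should be compiled.
unsigned pickEntryBytecode(const TierUpInfo& info, unsigned startBytecode)
{
    unsigned chosen = startBytecode;
    auto it = info.loopHierarchy.find(startBytecode);
    if (it == info.loopHierarchy.end())
        return chosen;

    // Candidates run from innermost to outermost, so the last match is the
    // outermost enclosing loop that execution has actually reached.
    for (unsigned candidate : it->second) {
        if (info.entrySeen.count(candidate))
            chosen = candidate;
    }
    return chosen;
}
```

Restricting the choice to loops that have already been seen keeps the sketch from picking an outer loop that execution may never return to.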

Now the problem is that outer loops do not reach CheckTierUpAndOSREnter often,
wasting several milliseconds before entering the newly compiled FTL code.

To solve that, every CheckTierUpAndOSREnter has its own trigger that
bypasses the counter. When the FTL code is compiled, the trigger is set
and we enter through the right CheckTierUpAndOSREnter immediately.
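
As a rough illustration of how a per-entry trigger bypasses the counter, the check emitted at each loop can be modeled in plain C++. This is a sketch under assumptions: `LoopChecks` and `shouldCallSlowPath` are hypothetical names, and the real check is emitted as machine code by the DFG rather than written as a function. The counter is assumed to start negative and to take the slow path once an increment makes it non-negative.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical model of the state read by the emitted check.
struct LoopChecks {
    // Shared tier-up counter: negative until the threshold is crossed.
    int32_t counter = -1000;
    // One trigger byte per OSR-entry bytecode index; non-zero forces entry.
    std::unordered_map<unsigned, uint8_t> entryTriggers;
};

// Returns true when the slow path (the OSR-entry operation) should be called.
bool shouldCallSlowPath(LoopChecks& checks, unsigned bytecodeIndex, int32_t increment)
{
    auto it = checks.entryTriggers.find(bytecodeIndex);
    if (it != checks.entryTriggers.end() && it->second != 0)
        return true; // Trigger set: skip the counter entirely.

    checks.counter += increment;
    return checks.counter >= 0; // Threshold crossed.
}
```

With the trigger set, the first visit to that loop takes the slow path no matter how far the counter is from its threshold.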

---

This new mechanism also solves a problem with ai-astar.
When we tried to tier up in ai-astar, we had nothing to compile until
the outer loop was reached.

To make sure we reached the CheckTierUpAndOSREnter in a reasonable time,
we had CheckTierUpWithNestedTriggerAndOSREnter with a special trigger.

With the new mechanism, we can do much better:
-When we keep hitting CheckTierUpInLoop, we now have all the information
 we need to already start compiling the outer loop.
 Instead of waiting for the outer loop to be reached a few times, we compile
 it as soon as the inner loop is hammering CheckTierUpInLoop.
-With the new triggers, the very next time we hit the outer loop, we OSR Enter.

This allows us to compile what we need sooner and enter sooner.
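
The hand-off between the compiler and the running DFG code can be pictured as a callback that owns the address of one trigger byte. The sketch below is hypothetical: `EntryCompileCallback` and `CompileResult` are illustrative names, and the assumption that the byte is written only on a successful compile is mine, not something the patch text confirms. Only the idea of passing a trigger address to the callback comes from the patch.

```cpp
#include <cstdint>

// Hypothetical result type; the real code has its own CompilationResult.
enum class CompileResult { Succeeded, Failed };

// Hypothetical callback that remembers which trigger byte to set.
class EntryCompileCallback {
public:
    explicit EntryCompileCallback(uint8_t* trigger)
        : m_trigger(trigger)
    {
    }

    // Assumed behavior: arm the trigger only when the compile succeeded.
    void compilationDidComplete(CompileResult result)
    {
        if (result == CompileResult::Succeeded)
            *m_trigger = 1;
    }

private:
    // Must stay valid for as long as the emitted code reads it.
    uint8_t* m_trigger;
};
```

Because the emitted code reads the byte through a fixed address, the storage holding the triggers cannot move after compilation, which is why the patch notes that the trigger map is never modified once initialized.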

* dfg/DFGAbstractInterpreterInlines.h:
(JSC::DFG::AbstractInterpreter<AbstractStateType>::executeEffects): Deleted.
* dfg/DFGClobberize.h:
(JSC::DFG::clobberize): Deleted.
* dfg/DFGDoesGC.cpp:
(JSC::DFG::doesGC): Deleted.
* dfg/DFGFixupPhase.cpp:
(JSC::DFG::FixupPhase::fixupNode): Deleted.
* dfg/DFGJITCode.h:
* dfg/DFGJITCompiler.cpp:
(JSC::DFG::JITCompiler::JITCompiler):
(JSC::DFG::JITCompiler::compileEntryExecutionFlag):
* dfg/DFGNodeType.h:
* dfg/DFGOperations.cpp:
* dfg/DFGOperations.h:
* dfg/DFGPlan.h:
(JSC::DFG::Plan::canTierUpAndOSREnter):
* dfg/DFGPredictionPropagationPhase.cpp:
(JSC::DFG::PredictionPropagationPhase::propagate): Deleted.
* dfg/DFGSafeToExecute.h:
(JSC::DFG::safeToExecute): Deleted.
* dfg/DFGSpeculativeJIT32_64.cpp:
(JSC::DFG::SpeculativeJIT::compile): Deleted.
* dfg/DFGSpeculativeJIT64.cpp:
(JSC::DFG::SpeculativeJIT::compile):
* dfg/DFGTierUpCheckInjectionPhase.cpp:
(JSC::DFG::TierUpCheckInjectionPhase::run):
(JSC::DFG::TierUpCheckInjectionPhase::buildNaturalLoopToLoopHintMap):
(JSC::DFG::TierUpCheckInjectionPhase::findLoopsContainingLoopHintWithoutOSREnter): Deleted.
* dfg/DFGToFTLForOSREntryDeferredCompilationCallback.cpp:
(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::ToFTLForOSREntryDeferredCompilationCallback):
(JSC::DFG::Ref<ToFTLForOSREntryDeferredCompilationCallback>ToFTLForOSREntryDeferredCompilationCallback::create):
(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::compilationDidBecomeReadyAsynchronously):
(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::compilationDidComplete):
* dfg/DFGToFTLForOSREntryDeferredCompilationCallback.h:

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@197861 268f45cc-cd09-0410-ab3c-d52691b4dbfc

18 files changed:
Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h
Source/JavaScriptCore/dfg/DFGClobberize.h
Source/JavaScriptCore/dfg/DFGDoesGC.cpp
Source/JavaScriptCore/dfg/DFGFixupPhase.cpp
Source/JavaScriptCore/dfg/DFGJITCode.h
Source/JavaScriptCore/dfg/DFGJITCompiler.cpp
Source/JavaScriptCore/dfg/DFGNodeType.h
Source/JavaScriptCore/dfg/DFGOperations.cpp
Source/JavaScriptCore/dfg/DFGOperations.h
Source/JavaScriptCore/dfg/DFGPlan.h
Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp
Source/JavaScriptCore/dfg/DFGSafeToExecute.h
Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp
Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp
Source/JavaScriptCore/dfg/DFGTierUpCheckInjectionPhase.cpp
Source/JavaScriptCore/dfg/DFGToFTLForOSREntryDeferredCompilationCallback.cpp
Source/JavaScriptCore/dfg/DFGToFTLForOSREntryDeferredCompilationCallback.h

index 92a611b..76cfc28 100644 (file)
@@ -1,3 +1,105 @@
+2016-03-09  Benjamin Poulain  <benjamin@webkit.org>
+
+        [JSC] Pick how to OSR Enter to FTL at runtime instead of compile time
+        https://bugs.webkit.org/show_bug.cgi?id=155217
+
+        Reviewed by Filip Pizlo.
+
+        This patch addresses 2 types of problems with tiering up to FTL
+        with OSR Entry in a loop:
+        -When there are nested loops, it is generally valuable to enter
+         an outer loop rather than an inner loop.
+        -When tiering up at a point that cannot OSR Enter, we are at
+         the mercy of the outer loop frequency to compile the right
+         entry point.
+
+        The first case is significant in the test "gaussian-blur".
+        That test has 4 nested loops. When we have an OSR Entry,
+        the analysis phases have to be pessimistic about where we enter:
+        we do not really know what constraints can be proven from
+        the DFG code that was running.
+
+        In "gaussian-blur", integer-range analysis removes pretty
+        much all overflow checks in the inner loops of where we entered.
+        The more outside we enter, the better code we generate.
+
+        Since we spend the most iterations in the inner loop, we naturally
+        tend to OSR Enter into the 2 innermost loops, making the most
+        pessimistic assumptions.
+
+        To avoid such problems, I changed how we decide where to OSR Enter.
+        Previously, the last CheckTierUpAndOSREnter to cross the threshold
+        was where we take the entry point for FTL.
+
+        What happens now is that the entry point is not decided when
+        compiling the CheckTierUp variants. Instead, all the information
+        we need is gathered during compilation and kept on the JITCode
+        to be used at runtime.
+
+        When we try to tier up and decide to OSR Enter, we use the information
+        we have to pick a good outer loop for OSR Entry.
+
+        Now the problem is that outer loops do not reach CheckTierUpAndOSREnter often,
+        wasting several milliseconds before entering the newly compiled FTL code.
+
+        To solve that, every CheckTierUpAndOSREnter has its own trigger that
+        bypasses the counter. When the FTL code is compiled, the trigger is set
+        and we enter through the right CheckTierUpAndOSREnter immediately.
+
+        ---
+
+        This new mechanism also solves a problem with ai-astar.
+        When we tried to tier up in ai-astar, we had nothing to compile until
+        the outer loop was reached.
+
+        To make sure we reached the CheckTierUpAndOSREnter in a reasonable time,
+        we had CheckTierUpWithNestedTriggerAndOSREnter with a special trigger.
+
+        With the new mechanism, we can do much better:
+        -When we keep hitting CheckTierUpInLoop, we now have all the information
+         we need to already start compiling the outer loop.
+         Instead of waiting for the outer loop to be reached a few times, we compile
+         it as soon as the inner loop is hammering CheckTierUpInLoop.
+        -With the new triggers, the very next time we hit the outer loop, we OSR Enter.
+
+        This allows us to compile what we need sooner and enter sooner.
+
+        * dfg/DFGAbstractInterpreterInlines.h:
+        (JSC::DFG::AbstractInterpreter<AbstractStateType>::executeEffects): Deleted.
+        * dfg/DFGClobberize.h:
+        (JSC::DFG::clobberize): Deleted.
+        * dfg/DFGDoesGC.cpp:
+        (JSC::DFG::doesGC): Deleted.
+        * dfg/DFGFixupPhase.cpp:
+        (JSC::DFG::FixupPhase::fixupNode): Deleted.
+        * dfg/DFGJITCode.h:
+        * dfg/DFGJITCompiler.cpp:
+        (JSC::DFG::JITCompiler::JITCompiler):
+        (JSC::DFG::JITCompiler::compileEntryExecutionFlag):
+        * dfg/DFGNodeType.h:
+        * dfg/DFGOperations.cpp:
+        * dfg/DFGOperations.h:
+        * dfg/DFGPlan.h:
+        (JSC::DFG::Plan::canTierUpAndOSREnter):
+        * dfg/DFGPredictionPropagationPhase.cpp:
+        (JSC::DFG::PredictionPropagationPhase::propagate): Deleted.
+        * dfg/DFGSafeToExecute.h:
+        (JSC::DFG::safeToExecute): Deleted.
+        * dfg/DFGSpeculativeJIT32_64.cpp:
+        (JSC::DFG::SpeculativeJIT::compile): Deleted.
+        * dfg/DFGSpeculativeJIT64.cpp:
+        (JSC::DFG::SpeculativeJIT::compile):
+        * dfg/DFGTierUpCheckInjectionPhase.cpp:
+        (JSC::DFG::TierUpCheckInjectionPhase::run):
+        (JSC::DFG::TierUpCheckInjectionPhase::buildNaturalLoopToLoopHintMap):
+        (JSC::DFG::TierUpCheckInjectionPhase::findLoopsContainingLoopHintWithoutOSREnter): Deleted.
+        * dfg/DFGToFTLForOSREntryDeferredCompilationCallback.cpp:
+        (JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::ToFTLForOSREntryDeferredCompilationCallback):
+        (JSC::DFG::Ref<ToFTLForOSREntryDeferredCompilationCallback>ToFTLForOSREntryDeferredCompilationCallback::create):
+        (JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::compilationDidBecomeReadyAsynchronously):
+        (JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::compilationDidComplete):
+        * dfg/DFGToFTLForOSREntryDeferredCompilationCallback.h:
+
 2016-03-08  Filip Pizlo  <fpizlo@apple.com>
 
         DFG should be able to constant-fold strings
index 977245a..3225c57 100644 (file)
@@ -2660,7 +2660,6 @@ bool AbstractInterpreter<AbstractStateType>::executeEffects(unsigned clobberLimi
     }
 
     case CheckTierUpAndOSREnter:
-    case CheckTierUpWithNestedTriggerAndOSREnter:
     case LoopHint:
     case ZombieHint:
     case ExitOK:
index e0d67f0..400cadd 100644 (file)
@@ -343,7 +343,6 @@ void clobberize(Graph& graph, Node* node, const ReadFunctor& read, const WriteFu
     case CheckTierUpInLoop:
     case CheckTierUpAtReturn:
     case CheckTierUpAndOSREnter:
-    case CheckTierUpWithNestedTriggerAndOSREnter:
     case LoopHint:
     case Breakpoint:
     case ProfileWillCall:
index 3846692..e3bd688 100644 (file)
@@ -180,7 +180,6 @@ bool doesGC(Graph& graph, Node* node)
     case CheckTierUpInLoop:
     case CheckTierUpAtReturn:
     case CheckTierUpAndOSREnter:
-    case CheckTierUpWithNestedTriggerAndOSREnter:
     case LoopHint:
     case StoreBarrier:
     case InvalidationPoint:
index f33f877..757d9f1 100644 (file)
@@ -1270,7 +1270,6 @@ private:
         case CheckTierUpInLoop:
         case CheckTierUpAtReturn:
         case CheckTierUpAndOSREnter:
-        case CheckTierUpWithNestedTriggerAndOSREnter:
         case InvalidationPoint:
         case CheckArray:
         case CheckInBounds:
index 05879dd..6bda777 100644 (file)
@@ -138,9 +138,28 @@ public:
     DFG::VariableEventStream variableEventStream;
     DFG::MinifiedGraph minifiedDFG;
 #if ENABLE(FTL_JIT)
-    uint8_t nestedTriggerIsSet { 0 };
     uint8_t neverExecutedEntry { 1 };
+
     UpperTierExecutionCounter tierUpCounter;
+
+    // For osrEntryPoint that are in inner loop, this maps their bytecode to the bytecode
+    // of the outerloop entry points in order (from innermost to outermost).
+    //
+    // The key may not always be a target for OSR Entry but the list in the value is guaranteed
+    // to be usable for OSR Entry.
+    HashMap<unsigned, Vector<unsigned>> tierUpInLoopHierarchy;
+
+    // Map each bytecode of CheckTierUpAndOSREnter to its stream index.
+    HashMap<unsigned, unsigned, WTF::IntHash<unsigned>, WTF::UnsignedWithZeroKeyHashTraits<unsigned>> bytecodeIndexToStreamIndex;
+
+    // Map each bytecode of CheckTierUpAndOSREnter to its trigger forcing OSR Entry.
+    // This can never be modified after it has been initialized since the addresses of the triggers
+    // are used by the JIT.
+    HashMap<unsigned, uint8_t> tierUpEntryTriggers;
+
+    // Set of bytecode that were the target of a TierUp operation.
+    HashSet<unsigned, WTF::IntHash<unsigned>, WTF::UnsignedWithZeroKeyHashTraits<unsigned>> tierUpEntrySeen;
+
     WriteBarrier<CodeBlock> m_osrEntryBlock;
     unsigned osrEntryRetry;
     bool abandonOSREntry;
index 758f5cd..3a67c37 100644 (file)
@@ -56,6 +56,11 @@ JITCompiler::JITCompiler(Graph& dfg)
 {
     if (shouldDumpDisassembly() || m_graph.m_vm.m_perBytecodeProfiler)
         m_disassembler = std::make_unique<Disassembler>(dfg);
+#if ENABLE(FTL_JIT)
+    m_jitCode->tierUpInLoopHierarchy = WTFMove(m_graph.m_plan.tierUpInLoopHierarchy);
+    for (unsigned tierUpBytecode : m_graph.m_plan.tierUpAndOSREnterBytecodes)
+        m_jitCode->tierUpEntryTriggers.add(tierUpBytecode, 0);
+#endif
 }
 
 JITCompiler::~JITCompiler()
@@ -114,7 +119,7 @@ void JITCompiler::compileSetupRegistersForEntry()
 void JITCompiler::compileEntryExecutionFlag()
 {
 #if ENABLE(FTL_JIT)
-    if (m_graph.m_plan.canTierUpAndOSREnter)
+    if (m_graph.m_plan.canTierUpAndOSREnter())
         store8(TrustedImm32(0), &m_jitCode->neverExecutedEntry);
 #endif // ENABLE(FTL_JIT)
 }
index 6a50dbd..cdb9247 100644 (file)
@@ -95,7 +95,6 @@ namespace JSC { namespace DFG {
     /* Tier-up checks from the DFG to the FTL. */\
     macro(CheckTierUpInLoop, NodeMustGenerate) \
     macro(CheckTierUpAndOSREnter, NodeMustGenerate) \
-    macro(CheckTierUpWithNestedTriggerAndOSREnter, NodeMustGenerate) \
     macro(CheckTierUpAtReturn, NodeMustGenerate) \
     \
     /* Get the value of a local variable, without linking into the VariableAccessData */\
index 7c01066..c5741b3 100644 (file)
@@ -1554,7 +1554,7 @@ static void triggerFTLReplacementCompile(VM* vm, CodeBlock* codeBlock, JITCode*
         codeBlock, CompilationDeferred);
 }
 
-static void triggerTierUpNowCommon(ExecState* exec, bool inLoop)
+void JIT_OPERATION triggerTierUpNow(ExecState* exec)
 {
     VM* vm = &exec->vm();
     NativeCallFrameTracer tracer(vm, exec);
@@ -1573,45 +1573,64 @@ static void triggerTierUpNowCommon(ExecState* exec, bool inLoop)
             *codeBlock, ": Entered triggerTierUpNow with executeCounter = ",
             jitCode->tierUpCounter, "\n");
     }
-    if (inLoop)
-        jitCode->nestedTriggerIsSet = 1;
 
     if (shouldTriggerFTLCompile(codeBlock, jitCode))
         triggerFTLReplacementCompile(vm, codeBlock, jitCode);
-}
-
-void JIT_OPERATION triggerTierUpNow(ExecState* exec)
-{
-    triggerTierUpNowCommon(exec, false);
-}
 
-void JIT_OPERATION triggerTierUpNowInLoop(ExecState* exec)
-{
-    triggerTierUpNowCommon(exec, true);
+    if (codeBlock->hasOptimizedReplacement()) {
+        if (jitCode->tierUpEntryTriggers.isEmpty()) {
+            // There is nothing more we can do, the only way this will be entered
+            // is through the function entry point.
+            jitCode->dontOptimizeAnytimeSoon(codeBlock);
+            return;
+        }
+        if (jitCode->osrEntryBlock() && jitCode->tierUpEntryTriggers.size() == 1) {
+            // There is only one outer loop and its trigger must have been set
+            // when the plan completed.
+            // Exiting the inner loop is useless, we can ignore the counter and leave
+            // the trigger do its job.
+            jitCode->dontOptimizeAnytimeSoon(codeBlock);
+            return;
+        }
+    }
 }
 
-char* JIT_OPERATION triggerOSREntryNow(
-    ExecState* exec, int32_t bytecodeIndex, int32_t streamIndex)
+static char* tierUpCommon(ExecState* exec, unsigned originBytecodeIndex, unsigned osrEntryBytecodeIndex)
 {
     VM* vm = &exec->vm();
-    NativeCallFrameTracer tracer(vm, exec);
-    DeferGC deferGC(vm->heap);
     CodeBlock* codeBlock = exec->codeBlock();
-    
-    if (codeBlock->jitType() != JITCode::DFGJIT) {
-        dataLog("Unexpected code block in DFG->FTL tier-up: ", *codeBlock, "\n");
-        RELEASE_ASSERT_NOT_REACHED();
-    }
-    
+
+    // Resolve any pending plan for OSR Enter on this function.
+    Worklist::State worklistState;
+    if (Worklist* worklist = existingGlobalFTLWorklistOrNull()) {
+        worklistState = worklist->completeAllReadyPlansForVM(
+            *vm, CompilationKey(codeBlock->baselineVersion(), FTLForOSREntryMode));
+    } else
+        worklistState = Worklist::NotKnown;
+
     JITCode* jitCode = codeBlock->jitCode()->dfg();
-    jitCode->nestedTriggerIsSet = 0;
-    
-    if (Options::verboseOSR()) {
-        dataLog(
-            *codeBlock, ": Entered triggerOSREntryNow with executeCounter = ",
-            jitCode->tierUpCounter, "\n");
+    if (worklistState == Worklist::Compiling) {
+        jitCode->setOptimizationThresholdBasedOnCompilationResult(
+            codeBlock, CompilationDeferred);
+        return nullptr;
     }
-    
+
+    if (worklistState == Worklist::Compiled) {
+        // This means that compilation failed and we already set the thresholds.
+        if (Options::verboseOSR())
+            dataLog("Code block ", *codeBlock, " was compiled but it doesn't have an optimized replacement.\n");
+        return nullptr;
+    }
+
+    // If we can OSR Enter, do it right away.
+    if (originBytecodeIndex == osrEntryBytecodeIndex) {
+        unsigned streamIndex = jitCode->bytecodeIndexToStreamIndex.get(originBytecodeIndex);
+        if (CodeBlock* entryBlock = jitCode->osrEntryBlock()) {
+            if (void* address = FTL::prepareOSREntry(exec, codeBlock, entryBlock, originBytecodeIndex, streamIndex))
+                return static_cast<char*>(address);
+        }
+    }
+
     // - If we don't have an FTL code block, then try to compile one.
     // - If we do have an FTL code block, then try to enter for a while.
     // - If we couldn't enter for a while, then trigger OSR entry.
@@ -1630,29 +1649,13 @@ char* JIT_OPERATION triggerOSREntryNow(
             return nullptr;
         }
     }
-    
+
     // It's time to try to compile code for OSR entry.
-    Worklist::State worklistState;
-    if (Worklist* worklist = existingGlobalFTLWorklistOrNull()) {
-        worklistState = worklist->completeAllReadyPlansForVM(
-            *vm, CompilationKey(codeBlock->baselineVersion(), FTLForOSREntryMode));
-    } else
-        worklistState = Worklist::NotKnown;
-    
-    if (worklistState == Worklist::Compiling) {
-        jitCode->setOptimizationThresholdBasedOnCompilationResult(
-            codeBlock, CompilationDeferred);
-        return nullptr;
-    }
-    
     if (CodeBlock* entryBlock = jitCode->osrEntryBlock()) {
-        void* address = FTL::prepareOSREntry(
-            exec, codeBlock, entryBlock, bytecodeIndex, streamIndex);
-        if (address)
-            return static_cast<char*>(address);
-
         if (jitCode->osrEntryRetry < Options::ftlOSREntryRetryThreshold()) {
             jitCode->osrEntryRetry++;
+            jitCode->setOptimizationThresholdBasedOnCompilationResult(
+                codeBlock, CompilationDeferred);
             return nullptr;
         }
 
@@ -1660,33 +1663,47 @@ char* JIT_OPERATION triggerOSREntryNow(
         entryCode->countEntryFailure();
         if (entryCode->entryFailureCount() <
             Options::ftlOSREntryFailureCountForReoptimization()) {
-            jitCode->optimizeSoon(codeBlock);
+            jitCode->setOptimizationThresholdBasedOnCompilationResult(
+                codeBlock, CompilationDeferred);
             return nullptr;
         }
-        
+
         // OSR entry failed. Oh no! This implies that we need to retry. We retry
         // without exponential backoff and we only do this for the entry code block.
+        unsigned osrEntryBytecode = entryBlock->jitCode()->ftlForOSREntry()->bytecodeIndex();
         jitCode->clearOSREntryBlock();
         jitCode->osrEntryRetry = 0;
+        jitCode->tierUpEntryTriggers.set(osrEntryBytecode, 0);
+        jitCode->setOptimizationThresholdBasedOnCompilationResult(
+            codeBlock, CompilationDeferred);
         return nullptr;
     }
-    
-    if (worklistState == Worklist::Compiled) {
-        // This means that compilation failed and we already set the thresholds.
-        if (Options::verboseOSR())
-            dataLog("Code block ", *codeBlock, " was compiled but it doesn't have an optimized replacement.\n");
-        return nullptr;
+
+    unsigned streamIndex = jitCode->bytecodeIndexToStreamIndex.get(osrEntryBytecodeIndex);
+    auto tierUpHierarchyEntry = jitCode->tierUpInLoopHierarchy.find(osrEntryBytecodeIndex);
+    if (tierUpHierarchyEntry != jitCode->tierUpInLoopHierarchy.end()) {
+        for (unsigned osrEntryCandidate : tierUpHierarchyEntry->value) {
+            if (jitCode->tierUpEntrySeen.contains(osrEntryCandidate)) {
+                osrEntryBytecodeIndex = osrEntryCandidate;
+                streamIndex = jitCode->bytecodeIndexToStreamIndex.get(osrEntryBytecodeIndex);
+            }
+        }
     }
 
     // We aren't compiling and haven't compiled anything for OSR entry. So, try to compile
     // something.
+    auto triggerIterator = jitCode->tierUpEntryTriggers.find(osrEntryBytecodeIndex);
+    RELEASE_ASSERT(triggerIterator != jitCode->tierUpEntryTriggers.end());
+    uint8_t* triggerAddress = &(triggerIterator->value);
+
     Operands<JSValue> mustHandleValues;
     jitCode->reconstruct(
-        exec, codeBlock, CodeOrigin(bytecodeIndex), streamIndex, mustHandleValues);
+        exec, codeBlock, CodeOrigin(osrEntryBytecodeIndex), streamIndex, mustHandleValues);
     CodeBlock* replacementCodeBlock = codeBlock->newReplacement();
+
     CompilationResult forEntryResult = compile(
-        *vm, replacementCodeBlock, codeBlock, FTLForOSREntryMode, bytecodeIndex,
-        mustHandleValues, ToFTLForOSREntryDeferredCompilationCallback::create());
+        *vm, replacementCodeBlock, codeBlock, FTLForOSREntryMode, osrEntryBytecodeIndex,
+        mustHandleValues, ToFTLForOSREntryDeferredCompilationCallback::create(triggerAddress));
 
     if (jitCode->neverExecutedEntry)
         triggerFTLReplacementCompile(vm, codeBlock, jitCode);
@@ -1701,10 +1718,66 @@ char* JIT_OPERATION triggerOSREntryNow(
     // entry will succeed unless we ran out of stack. It's not clear what we should do.
     // We signal to try again after a while if that happens.
     void* address = FTL::prepareOSREntry(
-        exec, codeBlock, jitCode->osrEntryBlock(), bytecodeIndex, streamIndex);
+        exec, codeBlock, jitCode->osrEntryBlock(), originBytecodeIndex, streamIndex);
     return static_cast<char*>(address);
 }
 
+void JIT_OPERATION triggerTierUpNowInLoop(ExecState* exec, unsigned bytecodeIndex)
+{
+    VM* vm = &exec->vm();
+    NativeCallFrameTracer tracer(vm, exec);
+    DeferGC deferGC(vm->heap);
+    CodeBlock* codeBlock = exec->codeBlock();
+
+    if (codeBlock->jitType() != JITCode::DFGJIT) {
+        dataLog("Unexpected code block in DFG->FTL tier-up: ", *codeBlock, "\n");
+        RELEASE_ASSERT_NOT_REACHED();
+    }
+
+    JITCode* jitCode = codeBlock->jitCode()->dfg();
+
+    if (Options::verboseOSR()) {
+        dataLog(
+            *codeBlock, ": Entered triggerTierUpNowInLoop with executeCounter = ",
+            jitCode->tierUpCounter, "\n");
+    }
+
+    auto tierUpHierarchyEntry = jitCode->tierUpInLoopHierarchy.find(bytecodeIndex);
+    if (tierUpHierarchyEntry != jitCode->tierUpInLoopHierarchy.end()
+        && !tierUpHierarchyEntry->value.isEmpty()) {
+        tierUpCommon(exec, bytecodeIndex, tierUpHierarchyEntry->value.first());
+    } else if (shouldTriggerFTLCompile(codeBlock, jitCode))
+        triggerFTLReplacementCompile(vm, codeBlock, jitCode);
+
+    // Since we cannot OSR Enter here, the default "optimizeSoon()" is not useful.
+    if (codeBlock->hasOptimizedReplacement())
+        jitCode->setOptimizationThresholdBasedOnCompilationResult(codeBlock, CompilationDeferred);
+}
+
+char* JIT_OPERATION triggerOSREntryNow(ExecState* exec, unsigned bytecodeIndex)
+{
+    VM* vm = &exec->vm();
+    NativeCallFrameTracer tracer(vm, exec);
+    DeferGC deferGC(vm->heap);
+    CodeBlock* codeBlock = exec->codeBlock();
+
+    if (codeBlock->jitType() != JITCode::DFGJIT) {
+        dataLog("Unexpected code block in DFG->FTL tier-up: ", *codeBlock, "\n");
+        RELEASE_ASSERT_NOT_REACHED();
+    }
+
+    JITCode* jitCode = codeBlock->jitCode()->dfg();
+    jitCode->tierUpEntrySeen.add(bytecodeIndex);
+
+    if (Options::verboseOSR()) {
+        dataLog(
+            *codeBlock, ": Entered triggerOSREntryNow with executeCounter = ",
+            jitCode->tierUpCounter, "\n");
+    }
+
+    return tierUpCommon(exec, bytecodeIndex, bytecodeIndex);
+}
+
 #endif // ENABLE(FTL_JIT)
 
 } // extern "C"
index c818f7f..f8e9e05 100644 (file)
@@ -169,8 +169,8 @@ double JIT_OPERATION operationRandom(JSGlobalObject*);
 
 #if ENABLE(FTL_JIT)
 void JIT_OPERATION triggerTierUpNow(ExecState*) WTF_INTERNAL;
-void JIT_OPERATION triggerTierUpNowInLoop(ExecState*) WTF_INTERNAL;
-char* JIT_OPERATION triggerOSREntryNow(ExecState*, int32_t bytecodeIndex, int32_t streamIndex) WTF_INTERNAL;
+void JIT_OPERATION triggerTierUpNowInLoop(ExecState*, unsigned bytecodeIndex) WTF_INTERNAL;
+char* JIT_OPERATION triggerOSREntryNow(ExecState*, unsigned bytecodeIndex) WTF_INTERNAL;
 #endif // ENABLE(FTL_JIT)
 
 } // extern "C"
index 091e3cb..a488b1b 100644 (file)
@@ -75,6 +75,8 @@ struct Plan : public ThreadSafeRefCounted<Plan> {
     void checkLivenessAndVisitChildren(SlotVisitor&);
     bool isKnownToBeLiveDuringGC();
     void cancel();
+
+    bool canTierUpAndOSREnter() const { return !tierUpAndOSREnterBytecodes.isEmpty(); }
     
     VM& vm;
 
@@ -99,7 +101,9 @@ struct Plan : public ThreadSafeRefCounted<Plan> {
     DesiredTransitions transitions;
     
     bool willTryToTierUp { false };
-    bool canTierUpAndOSREnter { false };
+
+    HashMap<unsigned, Vector<unsigned>> tierUpInLoopHierarchy;
+    Vector<unsigned> tierUpAndOSREnterBytecodes;
 
     enum Stage { Preparing, Compiling, Compiled, Ready, Cancelled };
     Stage stage;
index 6b0c723..582cc65 100644 (file)
@@ -641,7 +641,6 @@ private:
         case CheckTierUpInLoop:
         case CheckTierUpAtReturn:
         case CheckTierUpAndOSREnter:
-        case CheckTierUpWithNestedTriggerAndOSREnter:
         case InvalidationPoint:
         case CheckInBounds:
         case ValueToInt32:
index 3031c9f..b06a9f1 100644 (file)
@@ -296,7 +296,6 @@ bool safeToExecute(AbstractStateType& state, Graph& graph, Node* node)
     case CheckTierUpInLoop:
     case CheckTierUpAtReturn:
     case CheckTierUpAndOSREnter:
-    case CheckTierUpWithNestedTriggerAndOSREnter:
     case LoopHint:
     case StoreBarrier:
     case InvalidationPoint:
index cafbec9..d24caf1 100644 (file)
@@ -4990,7 +4990,6 @@ void SpeculativeJIT::compile(Node* node)
     case CheckTierUpInLoop:
     case CheckTierUpAtReturn:
     case CheckTierUpAndOSREnter:
-    case CheckTierUpWithNestedTriggerAndOSREnter:
     case Int52Rep:
     case FiatInt52:
     case Int52Constant:
index c3efacd..a7c9d95 100644 (file)
@@ -4979,7 +4979,8 @@ void SpeculativeJIT::compile(Node* node)
             MacroAssembler::AbsoluteAddress(&m_jit.jitCode()->tierUpCounter.m_counter));
         
         silentSpillAllRegisters(InvalidGPRReg);
-        m_jit.setupArgumentsExecState();
+        m_jit.setupArgumentsWithExecState(
+            TrustedImm32(node->origin.semantic.bytecodeIndex));
         appendCall(triggerTierUpNowInLoop);
         silentFillAllRegisters(InvalidGPRReg);
         
@@ -5002,28 +5003,28 @@ void SpeculativeJIT::compile(Node* node)
         break;
     }
         
-    case CheckTierUpAndOSREnter:
-    case CheckTierUpWithNestedTriggerAndOSREnter: {
+    case CheckTierUpAndOSREnter: {
         ASSERT(!node->origin.semantic.inlineCallFrame);
         
         GPRTemporary temp(this);
         GPRReg tempGPR = temp.gpr();
 
-        MacroAssembler::Jump forceOSREntry;
-        if (op == CheckTierUpWithNestedTriggerAndOSREnter)
-            forceOSREntry = m_jit.branchTest8(MacroAssembler::NonZero, MacroAssembler::AbsoluteAddress(&m_jit.jitCode()->nestedTriggerIsSet));
+        unsigned bytecodeIndex = node->origin.semantic.bytecodeIndex;
+        auto triggerIterator = m_jit.jitCode()->tierUpEntryTriggers.find(bytecodeIndex);
+        RELEASE_ASSERT(triggerIterator != m_jit.jitCode()->tierUpEntryTriggers.end());
+        uint8_t* forceEntryTrigger = &(m_jit.jitCode()->tierUpEntryTriggers.find(bytecodeIndex)->value);
+        MacroAssembler::Jump forceOSREntry = m_jit.branchTest8(MacroAssembler::NonZero, MacroAssembler::AbsoluteAddress(forceEntryTrigger));
         
         MacroAssembler::Jump done = m_jit.branchAdd32(
             MacroAssembler::Signed,
             TrustedImm32(Options::ftlTierUpCounterIncrementForLoop()),
             MacroAssembler::AbsoluteAddress(&m_jit.jitCode()->tierUpCounter.m_counter));
 
-        if (forceOSREntry.isSet())
-            forceOSREntry.link(&m_jit);
+        forceOSREntry.link(&m_jit);
         silentSpillAllRegisters(tempGPR);
-        m_jit.setupArgumentsWithExecState(
-            TrustedImm32(node->origin.semantic.bytecodeIndex),
-            TrustedImm32(m_stream->size()));
+        unsigned streamIndex = m_stream->size();
+        m_jit.jitCode()->bytecodeIndexToStreamIndex.add(bytecodeIndex, streamIndex);
+        m_jit.setupArgumentsWithExecState(TrustedImm32(bytecodeIndex));
         appendCallSetResult(triggerOSREntryNow, tempGPR);
         MacroAssembler::Jump dontEnter = m_jit.branchTestPtr(MacroAssembler::Zero, tempGPR);
         m_jit.emitRestoreCalleeSaves();
@@ -5038,7 +5039,6 @@ void SpeculativeJIT::compile(Node* node)
     case CheckTierUpInLoop:
     case CheckTierUpAtReturn:
     case CheckTierUpAndOSREnter:
-    case CheckTierUpWithNestedTriggerAndOSREnter:
         DFG_CRASH(m_jit.graph(), node, "Unexpected tier-up node");
         break;
 #endif // ENABLE(FTL_JIT)
index 9b14a98..7c59d3e 100644 (file)
@@ -65,49 +65,73 @@ public:
         if (!Options::useOSREntryToFTL())
             level = FTL::CanCompile;
 
-        // First we find all the loops that contain a LoopHint for which we cannot OSR enter.
-        // We use that information to decide if we need CheckTierUpAndOSREnter or CheckTierUpWithNestedTriggerAndOSREnter.
         m_graph.ensureNaturalLoops();
         NaturalLoops& naturalLoops = *m_graph.m_naturalLoops;
+        HashMap<const NaturalLoop*, unsigned> naturalLoopToLoopHint = buildNaturalLoopToLoopHintMap(naturalLoops);
 
-        HashSet<const NaturalLoop*> loopsContainingLoopHintWithoutOSREnter = findLoopsContainingLoopHintWithoutOSREnter(naturalLoops, level);
+        HashMap<unsigned, LoopHintDescriptor> tierUpHierarchy;
 
-        bool canTierUpAndOSREnter = false;
-        
         InsertionSet insertionSet(m_graph);
         for (BlockIndex blockIndex = m_graph.numBlocks(); blockIndex--;) {
             BasicBlock* block = m_graph.block(blockIndex);
             if (!block)
                 continue;
-            
+
             for (unsigned nodeIndex = 0; nodeIndex < block->size(); ++nodeIndex) {
                 Node* node = block->at(nodeIndex);
                 if (node->op() != LoopHint)
                     continue;
 
                 NodeOrigin origin = node->origin;
-                if (canOSREnterAtLoopHint(level, block, nodeIndex)) {
-                    canTierUpAndOSREnter = true;
-                    const NaturalLoop* loop = naturalLoops.innerMostLoopOf(block);
-                    if (loop && loopsContainingLoopHintWithoutOSREnter.contains(loop))
-                        insertionSet.insertNode(nodeIndex + 1, SpecNone, CheckTierUpWithNestedTriggerAndOSREnter, origin);
-                    else
-                        insertionSet.insertNode(nodeIndex + 1, SpecNone, CheckTierUpAndOSREnter, origin);
-                } else
-                    insertionSet.insertNode(nodeIndex + 1, SpecNone, CheckTierUpInLoop, origin);
+                bool canOSREnter = canOSREnterAtLoopHint(level, block, nodeIndex);
+
+                NodeType tierUpType = CheckTierUpAndOSREnter;
+                if (!canOSREnter)
+                    tierUpType = CheckTierUpInLoop;
+                insertionSet.insertNode(nodeIndex + 1, SpecNone, tierUpType, origin);
+
+                unsigned bytecodeIndex = origin.semantic.bytecodeIndex;
+                if (canOSREnter)
+                    m_graph.m_plan.tierUpAndOSREnterBytecodes.append(bytecodeIndex);
+
+                if (const NaturalLoop* loop = naturalLoops.innerMostLoopOf(block)) {
+                    LoopHintDescriptor descriptor;
+                    descriptor.canOSREnter = canOSREnter;
+
+                    const NaturalLoop* outerLoop = loop;
+                    while ((outerLoop = naturalLoops.innerMostOuterLoop(*outerLoop))) {
+                        auto it = naturalLoopToLoopHint.find(outerLoop);
+                        if (it != naturalLoopToLoopHint.end())
+                            descriptor.osrEntryCandidates.append(it->value);
+                    }
+                    if (!descriptor.osrEntryCandidates.isEmpty())
+                        tierUpHierarchy.add(bytecodeIndex, WTFMove(descriptor));
+                }
                 break;
             }
-            
+
             NodeAndIndex terminal = block->findTerminal();
             if (terminal.node->isFunctionTerminal()) {
                 insertionSet.insertNode(
                     terminal.index, SpecNone, CheckTierUpAtReturn, terminal.node->origin);
             }
-            
+
             insertionSet.execute(block);
         }
 
-        m_graph.m_plan.canTierUpAndOSREnter = canTierUpAndOSREnter;
+        // Add all the candidates that can be OSR Entered.
+        for (auto entry : tierUpHierarchy) {
+            Vector<unsigned> tierUpCandidates;
+            for (unsigned bytecodeIndex : entry.value.osrEntryCandidates) {
+                auto descriptorIt = tierUpHierarchy.find(bytecodeIndex);
+                if (descriptorIt != tierUpHierarchy.end()
+                    && descriptorIt->value.canOSREnter)
+                    tierUpCandidates.append(bytecodeIndex);
+            }
+
+            if (!tierUpCandidates.isEmpty())
+                m_graph.m_plan.tierUpInLoopHierarchy.add(entry.key, WTFMove(tierUpCandidates));
+        }
         m_graph.m_plan.willTryToTierUp = true;
         return true;
 #else // ENABLE(FTL_JIT)
@@ -118,6 +142,11 @@ public:
 
 private:
 #if ENABLE(FTL_JIT)
+    struct LoopHintDescriptor {
+        Vector<unsigned> osrEntryCandidates;
+        bool canOSREnter;
+    };
+
     bool canOSREnterAtLoopHint(FTL::CapabilityLevel level, const BasicBlock* block, unsigned nodeIndex)
     {
         Node* node = block->at(nodeIndex);
@@ -137,25 +166,24 @@ private:
         return true;
     }
 
-    HashSet<const NaturalLoop*> findLoopsContainingLoopHintWithoutOSREnter(const NaturalLoops& naturalLoops, FTL::CapabilityLevel level)
+    HashMap<const NaturalLoop*, unsigned> buildNaturalLoopToLoopHintMap(const NaturalLoops& naturalLoops)
     {
-        HashSet<const NaturalLoop*> loopsContainingLoopHintWithoutOSREnter;
+        HashMap<const NaturalLoop*, unsigned> naturalLoopsToLoopHint;
+
         for (BasicBlock* block : m_graph.blocksInNaturalOrder()) {
             for (unsigned nodeIndex = 0; nodeIndex < block->size(); ++nodeIndex) {
                 Node* node = block->at(nodeIndex);
                 if (node->op() != LoopHint)
                     continue;
 
-                if (!canOSREnterAtLoopHint(level, block, nodeIndex)) {
-                    const NaturalLoop* loop = naturalLoops.innerMostLoopOf(block);
-                    while (loop) {
-                        loopsContainingLoopHintWithoutOSREnter.add(loop);
-                        loop = naturalLoops.innerMostOuterLoop(*loop);
-                    }
+                if (const NaturalLoop* loop = naturalLoops.innerMostLoopOf(block)) {
+                    unsigned bytecodeIndex = node->origin.semantic.bytecodeIndex;
+                    naturalLoopsToLoopHint.add(loop, bytecodeIndex);
                 }
+                break;
             }
         }
-        return loopsContainingLoopHintWithoutOSREnter;
+        return naturalLoopsToLoopHint;
     }
 #endif
 };
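The two-pass construction of `tierUpInLoopHierarchy` in the hunk above can be read more easily outside the DFG types. What follows is a minimal standalone sketch, not the patch's code: `Loop`, `Hierarchy` and `buildTierUpInLoopHierarchy` are hypothetical names, `std::unordered_map` and `std::vector` stand in for WTF's `HashMap` and `Vector`, and each loop is assumed to carry exactly one LoopHint. The sketch also records `canOSREnter` for every loop, outermost loops included, which is a simplification chosen here and may not match what the patch does.

```cpp
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a natural loop with a single LoopHint.
struct Loop {
    unsigned loopHintBytecodeIndex; // bytecode index of the loop's LoopHint
    bool canOSREnter;               // whether FTL could OSR enter at that hint
    const Loop* outer;              // enclosing loop, or nullptr if outermost
};

// Maps a LoopHint's bytecode index to the enclosing LoopHints that can be
// OSR entered, ordered from the nearest enclosing loop outwards.
using Hierarchy = std::unordered_map<unsigned, std::vector<unsigned>>;

Hierarchy buildTierUpInLoopHierarchy(const std::vector<Loop>& loops)
{
    // Pass 1: remember which LoopHints can be OSR entered (assumption of this
    // sketch: every loop is recorded, not only the nested ones).
    std::unordered_map<unsigned, bool> canEnter;
    for (const Loop& loop : loops)
        canEnter.emplace(loop.loopHintBytecodeIndex, loop.canOSREnter);

    // Pass 2: for each loop, walk outwards and keep only enterable hints.
    Hierarchy hierarchy;
    for (const Loop& loop : loops) {
        std::vector<unsigned> candidates;
        for (const Loop* outer = loop.outer; outer; outer = outer->outer) {
            auto it = canEnter.find(outer->loopHintBytecodeIndex);
            if (it != canEnter.end() && it->second)
                candidates.push_back(outer->loopHintBytecodeIndex);
        }
        // Loops without any enterable enclosing loop get no entry at all.
        if (!candidates.empty())
            hierarchy.emplace(loop.loopHintBytecodeIndex, std::move(candidates));
    }
    return hierarchy;
}
```

With such a map, runtime code that decides to tier up at an inner LoopHint can look up the hint's bytecode index and choose one of the listed outer hints as the FTL entry point, rather than fixing that choice when the DFG code is compiled.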
index c9e5c79..fc1996a 100644 (file)
 #include "CodeBlock.h"
 #include "DFGJITCode.h"
 #include "Executable.h"
+#include "FTLForOSREntryJITCode.h"
 #include "JSCInlines.h"
 
 namespace JSC { namespace DFG {
 
-ToFTLForOSREntryDeferredCompilationCallback::ToFTLForOSREntryDeferredCompilationCallback()
+ToFTLForOSREntryDeferredCompilationCallback::ToFTLForOSREntryDeferredCompilationCallback(uint8_t* forcedOSREntryTrigger)
+    : m_forcedOSREntryTrigger(forcedOSREntryTrigger)
 {
 }
 
@@ -43,9 +45,9 @@ ToFTLForOSREntryDeferredCompilationCallback::~ToFTLForOSREntryDeferredCompilatio
 {
 }
 
-Ref<ToFTLForOSREntryDeferredCompilationCallback>ToFTLForOSREntryDeferredCompilationCallback::create()
+Ref<ToFTLForOSREntryDeferredCompilationCallback>ToFTLForOSREntryDeferredCompilationCallback::create(uint8_t* forcedOSREntryTrigger)
 {
-    return adoptRef(*new ToFTLForOSREntryDeferredCompilationCallback());
+    return adoptRef(*new ToFTLForOSREntryDeferredCompilationCallback(forcedOSREntryTrigger));
 }
 
 void ToFTLForOSREntryDeferredCompilationCallback::compilationDidBecomeReadyAsynchronously(
@@ -56,9 +58,8 @@ void ToFTLForOSREntryDeferredCompilationCallback::compilationDidBecomeReadyAsync
             "Optimizing compilation of ", *codeBlock, " (for ", *profiledDFGCodeBlock,
             ") did become ready.\n");
     }
-    
-    profiledDFGCodeBlock->jitCode()->dfg()->forceOptimizationSlowPathConcurrently(
-        profiledDFGCodeBlock);
+
+    *m_forcedOSREntryTrigger = 1;
 }
 
 void ToFTLForOSREntryDeferredCompilationCallback::compilationDidComplete(
@@ -73,12 +74,17 @@ void ToFTLForOSREntryDeferredCompilationCallback::compilationDidComplete(
     JITCode* jitCode = profiledDFGCodeBlock->jitCode()->dfg();
         
     switch (result) {
-    case CompilationSuccessful:
+    case CompilationSuccessful: {
         jitCode->setOSREntryBlock(*codeBlock->vm(), profiledDFGCodeBlock, codeBlock);
+        unsigned osrEntryBytecode = codeBlock->jitCode()->ftlForOSREntry()->bytecodeIndex();
+        jitCode->tierUpEntryTriggers.set(osrEntryBytecode, 1);
         break;
+    }
     case CompilationFailed:
         jitCode->osrEntryRetry = 0;
         jitCode->abandonOSREntry = true;
+        profiledDFGCodeBlock->jitCode()->dfg()->setOptimizationThresholdBasedOnCompilationResult(
+            profiledDFGCodeBlock, result);
         break;
     case CompilationDeferred:
         RELEASE_ASSERT_NOT_REACHED();
index 580f7d2..f56f8ef 100644 (file)
@@ -40,15 +40,18 @@ namespace DFG {
 
 class ToFTLForOSREntryDeferredCompilationCallback : public DeferredCompilationCallback {
 protected:
-    ToFTLForOSREntryDeferredCompilationCallback();
+    ToFTLForOSREntryDeferredCompilationCallback(uint8_t* forcedOSREntryTrigger);
 
 public:
     virtual ~ToFTLForOSREntryDeferredCompilationCallback();
 
-    static Ref<ToFTLForOSREntryDeferredCompilationCallback> create();
+    static Ref<ToFTLForOSREntryDeferredCompilationCallback> create(uint8_t* forcedOSREntryTrigger);
     
     virtual void compilationDidBecomeReadyAsynchronously(CodeBlock*, CodeBlock* profiledDFGCodeBlock);
     virtual void compilationDidComplete(CodeBlock*, CodeBlock* profiledDFGCodeBlock, CompilationResult);
+
+private:
+    uint8_t* m_forcedOSREntryTrigger;
 };
 
 } } // namespace JSC::DFG
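The callback change above replaces a call into the code block with a single byte store. A rough single-threaded sketch of that idea follows, assuming one trigger byte per OSR entry bytecode index. `ReadyCallback`, `JITCodeSketch` and `shouldTakeSlowPath` are hypothetical names, `std::unordered_map` stands in for the `tierUpEntryTriggers` map, and the cross-thread visibility concerns of the real compiler thread are deliberately left out.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for the deferred compilation callback: it only keeps
// a pointer to the trigger byte it has to flip when the compilation is ready.
class ReadyCallback {
public:
    explicit ReadyCallback(uint8_t* forcedOSREntryTrigger)
        : m_forcedOSREntryTrigger(forcedOSREntryTrigger)
    {
    }

    // Mirrors the body of compilationDidBecomeReadyAsynchronously() in the diff.
    void compilationDidBecomeReady() { *m_forcedOSREntryTrigger = 1; }

private:
    uint8_t* m_forcedOSREntryTrigger;
};

// Hypothetical stand-in for the DFG JITCode: one trigger byte per LoopHint.
struct JITCodeSketch {
    std::unordered_map<unsigned, uint8_t> tierUpEntryTriggers;
};

// What a CheckTierUpAndOSREnter site would conceptually test on each iteration:
// a non-zero trigger means "call the slow path now and try to OSR enter".
bool shouldTakeSlowPath(const JITCodeSketch& jitCode, unsigned bytecodeIndex)
{
    auto it = jitCode.tierUpEntryTriggers.find(bytecodeIndex);
    return it != jitCode.tierUpEntryTriggers.end() && it->second != 0;
}
```

Keeping the trigger as a plain byte means the check at each loop head can be a single load and branch, so an outer loop that rarely crosses the counter threshold can still notice that FTL code is ready.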