REGRESSION(r180595): same-callee profiling no longer works
authorrniwa@webkit.org <rniwa@webkit.org@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Thu, 14 May 2015 04:19:18 +0000 (04:19 +0000)
committerrniwa@webkit.org <rniwa@webkit.org@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Thu, 14 May 2015 04:19:18 +0000 (04:19 +0000)
https://bugs.webkit.org/show_bug.cgi?id=144787

Reviewed by Filip Pizlo.

This patch introduces a DFG optimization to use NewObject node when the callee of op_create_this is
always the same JSFunction. This condition doesn't hold when the byte code creates multiple
JSFunction objects at runtime as in: function y() { return function () {} }; new y(); new y();

To enable this optimization, LLint and baseline JIT now store the last callee we saw in the newly
added fourth operand of op_create_this. We use this JSFunction's structure in DFG after verifying
our speculation that the callee is the same. To avoid recompiling the same code for different callee
objects in the polymorphic case, the special value of seenMultipleCalleeObjects() is set in
LLint and baseline JIT when multiple callees are observed.

Tests: stress/create-this-with-callee-variants.js

* bytecode/BytecodeList.json: Increased the number of operands to 5.
* bytecode/CodeBlock.cpp:
(JSC::CodeBlock::dumpBytecode): Dump the newly added callee cache.
(JSC::CodeBlock::finalizeUnconditionally): Clear the callee cache if the callee is no longer alive.
* bytecompiler/BytecodeGenerator.cpp:
(JSC::BytecodeGenerator::emitCreateThis): Add the instruction to propertyAccessInstructions so that
we can clear the callee cache in CodeBlock::finalizeUnconditionally. Also initialize the newly added
operand.
* dfg/DFGByteCodeParser.cpp:
(JSC::DFG::ByteCodeParser::parseBlock): Implement the optimization. Speculate the actual callee to
match the cache. Use the cached callee's structure if the speculation succeeds. Otherwise, OSR exit.
* jit/JITOpcodes.cpp:
(JSC::JIT::emit_op_create_this): Go to the slow path to update the cache unless it's already marked
as seenMultipleCalleeObjects() to indicate the polymorphic behavior and/or we've OSR exited here.
(JSC::JIT::emitSlow_op_create_this):
* jit/JITOpcodes32_64.cpp:
(JSC::JIT::emit_op_create_this): Ditto.
(JSC::JIT::emitSlow_op_create_this):
* llint/LowLevelInterpreter32_64.asm:
(_llint_op_create_this): Ditto.
* llint/LowLevelInterpreter64.asm:
(_llint_op_create_this): Ditto.
* runtime/CommonSlowPaths.cpp:
(slow_path_create_this): Set the callee cache to the actual callee if it's not set. If the cache has
been set to a JSFunction* different from the actual callee, set it to seenMultipleCalleeObjects().
* runtime/JSCell.h:
(JSC::JSCell::seenMultipleCalleeObjects): Added.
* runtime/WriteBarrier.h:
(JSC::WriteBarrierBase::unvalidatedGet): Removed the compile guard around it.
* tests/stress/create-this-with-callee-variants.js: Added.

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@184328 268f45cc-cd09-0410-ab3c-d52691b4dbfc

13 files changed:
Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/bytecode/BytecodeList.json
Source/JavaScriptCore/bytecode/CodeBlock.cpp
Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp
Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp
Source/JavaScriptCore/jit/JITOpcodes.cpp
Source/JavaScriptCore/jit/JITOpcodes32_64.cpp
Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm
Source/JavaScriptCore/llint/LowLevelInterpreter64.asm
Source/JavaScriptCore/runtime/CommonSlowPaths.cpp
Source/JavaScriptCore/runtime/JSCell.h
Source/JavaScriptCore/runtime/WriteBarrier.h
Source/JavaScriptCore/tests/stress/create-this-with-callee-variants.js [new file with mode: 0644]

index f83089c..2992f80 100644 (file)
@@ -1,3 +1,53 @@
+2015-05-13  Ryosuke Niwa  <rniwa@webkit.org>
+
+        REGRESSION(r180595): same-callee profiling no longer works
+        https://bugs.webkit.org/show_bug.cgi?id=144787
+
+        Reviewed by Filip Pizlo.
+
+        This patch introduces a DFG optimization to use NewObject node when the callee of op_create_this is
+        always the same JSFunction. This condition doesn't hold when the byte code creates multiple
+        JSFunction objects at runtime as in: function y() { return function () {} }; new y(); new y();
+
+        To enable this optimization, LLint and baseline JIT now store the last callee we saw in the newly
+        added fourth operand of op_create_this. We use this JSFunction's structure in DFG after verifying
+        our speculation that the callee is the same. To avoid recompiling the same code for different callee
+        objects in the polymorphic case, the special value of seenMultipleCalleeObjects() is set in
+        LLint and baseline JIT when multiple callees are observed.
+
+        Tests: stress/create-this-with-callee-variants.js
+
+        * bytecode/BytecodeList.json: Increased the number of operands to 5.
+        * bytecode/CodeBlock.cpp:
+        (JSC::CodeBlock::dumpBytecode): Dump the newly added callee cache.
+        (JSC::CodeBlock::finalizeUnconditionally): Clear the callee cache if the callee is no longer alive.
+        * bytecompiler/BytecodeGenerator.cpp:
+        (JSC::BytecodeGenerator::emitCreateThis): Add the instruction to propertyAccessInstructions so that
+        we can clear the callee cache in CodeBlock::finalizeUnconditionally. Also initialize the newly added
+        operand.
+        * dfg/DFGByteCodeParser.cpp:
+        (JSC::DFG::ByteCodeParser::parseBlock): Implement the optimization. Speculate the actual callee to
+        match the cache. Use the cached callee's structure if the speculation succeeds. Otherwise, OSR exit.
+        * jit/JITOpcodes.cpp:
+        (JSC::JIT::emit_op_create_this): Go to the slow path to update the cache unless it's already marked
+        as seenMultipleCalleeObjects() to indicate the polymorphic behavior and/or we've OSR exited here.
+        (JSC::JIT::emitSlow_op_create_this):
+        * jit/JITOpcodes32_64.cpp:
+        (JSC::JIT::emit_op_create_this): Ditto.
+        (JSC::JIT::emitSlow_op_create_this):
+        * llint/LowLevelInterpreter32_64.asm:
+        (_llint_op_create_this): Ditto.
+        * llint/LowLevelInterpreter64.asm:
+        (_llint_op_create_this): Ditto.
+        * runtime/CommonSlowPaths.cpp:
+        (slow_path_create_this): Set the callee cache to the actual callee if it's not set. If the cache has
+        been set to a JSFunction* different from the actual callee, set it to seenMultipleCalleeObjects().
+        * runtime/JSCell.h:
+        (JSC::JSCell::seenMultipleCalleeObjects): Added.
+        * runtime/WriteBarrier.h:
+        (JSC::WriteBarrierBase::unvalidatedGet): Removed the compile guard around it.
+        * tests/stress/create-this-with-callee-variants.js: Added.
+
 2015-05-13  Joseph Pecoraro  <pecoraro@apple.com>
 
         Clean up some possible RefPtr to PassRefPtr churn
index be7236b..8b9a81e 100644 (file)
@@ -9,7 +9,7 @@
             { "name" : "op_create_direct_arguments", "length" : 2 },
             { "name" : "op_create_scoped_arguments", "length" : 3 },
             { "name" : "op_create_out_of_band_arguments", "length" : 2 },
-            { "name" : "op_create_this", "length" : 4 },
+            { "name" : "op_create_this", "length" : 5 },
             { "name" : "op_to_this", "length" : 4 },
             { "name" : "op_check_tdz", "length" : 2 },
             { "name" : "op_new_object", "length" : 4 },
index bf6956f..edcbc75 100644 (file)
@@ -795,8 +795,9 @@ void CodeBlock::dumpBytecode(
             int r0 = (++it)->u.operand;
             int r1 = (++it)->u.operand;
             unsigned inferredInlineCapacity = (++it)->u.operand;
+            unsigned cachedFunction = (++it)->u.operand;
             printLocationAndOp(out, exec, location, it, "create_this");
-            out.printf("%s, %s, %u", registerName(r0).data(), registerName(r1).data(), inferredInlineCapacity);
+            out.printf("%s, %s, %u, %u", registerName(r0).data(), registerName(r1).data(), inferredInlineCapacity, cachedFunction);
             break;
         }
         case op_to_this: {
@@ -2569,6 +2570,18 @@ void CodeBlock::finalizeUnconditionally()
                 curInstruction[3].u.toThisStatus = merge(
                     curInstruction[3].u.toThisStatus, ToThisClearedByGC);
                 break;
+            case op_create_this: {
+                auto& cacheWriteBarrier = curInstruction[4].u.jsCell;
+                if (!cacheWriteBarrier || cacheWriteBarrier.unvalidatedGet() == JSCell::seenMultipleCalleeObjects())
+                    break;
+                JSCell* cachedFunction = cacheWriteBarrier.get();
+                if (Heap::isMarked(cachedFunction))
+                    break;
+                if (Options::verboseOSR())
+                    dataLogF("Clearing LLInt create_this with cached callee %p.\n", cachedFunction);
+                cacheWriteBarrier.clear();
+                break;
+            }
             case op_resolve_scope: {
                 // Right now this isn't strictly necessary. Any symbol tables that this will refer to
                 // are for outer functions, and we refer to those functions strongly, and they refer
index 5ff461e..5fe9e0e 100644 (file)
@@ -1675,10 +1675,12 @@ RegisterID* BytecodeGenerator::emitCreateThis(RegisterID* dst)
     size_t begin = instructions().size();
     m_staticPropertyAnalyzer.createThis(m_thisRegister.index(), begin + 3);
 
+    m_codeBlock->addPropertyAccessInstruction(instructions().size());
     emitOpcode(op_create_this); 
     instructions().append(m_thisRegister.index()); 
     instructions().append(m_thisRegister.index()); 
     instructions().append(0);
+    instructions().append(0);
     return dst;
 }
 
index e0c4635..17b36ee 100644 (file)
@@ -2679,8 +2679,25 @@ bool ByteCodeParser::parseBlock(unsigned limit)
         case op_create_this: {
             int calleeOperand = currentInstruction[2].u.operand;
             Node* callee = get(VirtualRegister(calleeOperand));
+
+            JSFunction* function = callee->dynamicCastConstant<JSFunction*>();
+            if (!function) {
+                JSCell* cachedFunction = currentInstruction[4].u.jsCell.unvalidatedGet();
+                if (cachedFunction
+                    && cachedFunction != JSCell::seenMultipleCalleeObjects()
+                    && !m_inlineStackTop->m_exitProfile.hasExitSite(m_currentIndex, BadCell)) {
+                    ASSERT(cachedFunction->inherits(JSFunction::info()));
+
+                    FrozenValue* frozen = m_graph.freeze(cachedFunction);
+                    addToGraph(CheckCell, OpInfo(frozen), callee);
+                    set(VirtualRegister(currentInstruction[1].u.operand), addToGraph(JSConstant, OpInfo(frozen)));
+
+                    function = static_cast<JSFunction*>(cachedFunction);
+                }
+            }
+
             bool alreadyEmitted = false;
-            if (JSFunction* function = callee->dynamicCastConstant<JSFunction*>()) {
+            if (function) {
                 if (FunctionRareData* rareData = function->rareData()) {
                     if (Structure* structure = rareData->allocationStructure()) {
                         m_graph.freeze(rareData);
index 5d4d0b9..730361a 100644 (file)
@@ -705,11 +705,13 @@ void JIT::emit_op_to_this(Instruction* currentInstruction)
 void JIT::emit_op_create_this(Instruction* currentInstruction)
 {
     int callee = currentInstruction[2].u.operand;
+    WriteBarrierBase<JSCell>* cachedFunction = &currentInstruction[4].u.jsCell;
     RegisterID calleeReg = regT0;
-    RegisterID rareDataReg = regT0;
+    RegisterID rareDataReg = regT4;
     RegisterID resultReg = regT0;
     RegisterID allocatorReg = regT1;
     RegisterID structureReg = regT2;
+    RegisterID cachedFunctionReg = regT4;
     RegisterID scratchReg = regT3;
 
     emitGetVirtualRegister(callee, calleeReg);
@@ -719,6 +721,11 @@ void JIT::emit_op_create_this(Instruction* currentInstruction)
     loadPtr(Address(rareDataReg, FunctionRareData::offsetOfAllocationProfile() + ObjectAllocationProfile::offsetOfStructure()), structureReg);
     addSlowCase(branchTestPtr(Zero, allocatorReg));
 
+    loadPtr(cachedFunction, cachedFunctionReg);
+    Jump hasSeenMultipleCallees = branchPtr(Equal, cachedFunctionReg, TrustedImmPtr(JSCell::seenMultipleCalleeObjects()));
+    addSlowCase(branchPtr(NotEqual, calleeReg, cachedFunctionReg));
+    hasSeenMultipleCallees.link(this);
+
     emitAllocateJSObject(allocatorReg, structureReg, resultReg, scratchReg);
     emitPutVirtualRegister(currentInstruction[1].u.operand);
 }
@@ -728,6 +735,7 @@ void JIT::emitSlow_op_create_this(Instruction* currentInstruction, Vector<SlowCa
     linkSlowCase(iter); // doesn't have rare data
     linkSlowCase(iter); // doesn't have an allocation profile
     linkSlowCase(iter); // allocation failed
+    linkSlowCase(iter); // cached function didn't match
 
     JITSlowPathCall slowPathCall(this, currentInstruction, slow_path_create_this);
     slowPathCall.call();
index acd460b..1920225 100644 (file)
@@ -936,11 +936,13 @@ void JIT::emit_op_get_scope(Instruction* currentInstruction)
 void JIT::emit_op_create_this(Instruction* currentInstruction)
 {
     int callee = currentInstruction[2].u.operand;
+    WriteBarrierBase<JSCell>* cachedFunction = &currentInstruction[4].u.jsCell;
     RegisterID calleeReg = regT0;
-    RegisterID rareDataReg = regT0;
+    RegisterID rareDataReg = regT4;
     RegisterID resultReg = regT0;
     RegisterID allocatorReg = regT1;
     RegisterID structureReg = regT2;
+    RegisterID cachedFunctionReg = regT4;
     RegisterID scratchReg = regT3;
 
     emitLoadPayload(callee, calleeReg);
@@ -950,6 +952,11 @@ void JIT::emit_op_create_this(Instruction* currentInstruction)
     loadPtr(Address(rareDataReg, FunctionRareData::offsetOfAllocationProfile() + ObjectAllocationProfile::offsetOfStructure()), structureReg);
     addSlowCase(branchTestPtr(Zero, allocatorReg));
 
+    loadPtr(cachedFunction, cachedFunctionReg);
+    Jump hasSeenMultipleCallees = branchPtr(Equal, cachedFunctionReg, TrustedImmPtr(JSCell::seenMultipleCalleeObjects()));
+    addSlowCase(branchPtr(NotEqual, calleeReg, cachedFunctionReg));
+    hasSeenMultipleCallees.link(this);
+
     emitAllocateJSObject(allocatorReg, structureReg, resultReg, scratchReg);
     emitStoreCell(currentInstruction[1].u.operand, resultReg);
 }
@@ -959,6 +966,7 @@ void JIT::emitSlow_op_create_this(Instruction* currentInstruction, Vector<SlowCa
     linkSlowCase(iter); // doesn't have rare data
     linkSlowCase(iter); // doesn't have an allocation profile
     linkSlowCase(iter); // allocation failed
+    linkSlowCase(iter); // cached function didn't match
 
     JITSlowPathCall slowPathCall(this, currentInstruction, slow_path_create_this);
     slowPathCall.call();
index 7ced21a..5ac24a8 100644 (file)
@@ -745,15 +745,19 @@ _llint_op_create_this:
     loadp FunctionRareData::m_allocationProfile + ObjectAllocationProfile::m_allocator[t4], t1
     loadp FunctionRareData::m_allocationProfile + ObjectAllocationProfile::m_structure[t4], t2
     btpz t1, .opCreateThisSlow
+    loadpFromInstruction(4, t4)
+    bpeq t4, 1, .hasSeenMultipleCallee
+    bpneq t4, t0, .opCreateThisSlow
+.hasSeenMultipleCallee:
     allocateJSObject(t1, t2, t0, t3, .opCreateThisSlow)
     loadi 4[PC], t1
     storei CellTag, TagOffset[cfr, t1, 8]
     storei t0, PayloadOffset[cfr, t1, 8]
-    dispatch(4)
+    dispatch(5)
 
 .opCreateThisSlow:
     callSlowPath(_slow_path_create_this)
-    dispatch(4)
+    dispatch(5)
 
 
 _llint_op_to_this:
index 68a9c62..0f11e67 100644 (file)
@@ -631,14 +631,18 @@ _llint_op_create_this:
     loadp FunctionRareData::m_allocationProfile + ObjectAllocationProfile::m_allocator[t4], t1
     loadp FunctionRareData::m_allocationProfile + ObjectAllocationProfile::m_structure[t4], t2
     btpz t1, .opCreateThisSlow
+    loadpFromInstruction(4, t4)
+    bpeq t4, 1, .hasSeenMultipleCallee
+    bpneq t4, t0, .opCreateThisSlow
+.hasSeenMultipleCallee:
     allocateJSObject(t1, t2, t0, t3, .opCreateThisSlow)
     loadisFromInstruction(1, t1)
     storeq t0, [cfr, t1, 8]
-    dispatch(4)
+    dispatch(5)
 
 .opCreateThisSlow:
     callSlowPath(_slow_path_create_this)
-    dispatch(4)
+    dispatch(5)
 
 
 _llint_op_to_this:
index 009b350..56d46ed 100644 (file)
@@ -235,6 +235,12 @@ SLOW_PATH_DECL(slow_path_create_this)
     ASSERT(constructor->methodTable()->getConstructData(constructor, constructData) == ConstructTypeJS);
 #endif
 
+    auto& cacheWriteBarrier = pc[4].u.jsCell;
+    if (!cacheWriteBarrier)
+        cacheWriteBarrier.set(exec->vm(), exec->codeBlock()->ownerExecutable(), constructor);
+    else if (cacheWriteBarrier.unvalidatedGet() != JSCell::seenMultipleCalleeObjects() && cacheWriteBarrier.get() != constructor)
+        cacheWriteBarrier.setWithoutWriteBarrier(JSCell::seenMultipleCalleeObjects());
+
     size_t inlineCapacity = pc[3].u.operand;
     Structure* structure = constructor->rareData(exec, inlineCapacity)->allocationProfile()->structure();
     RETURN(constructEmptyObject(exec, structure));
index 26df1e3..6d648ad 100644 (file)
@@ -74,6 +74,8 @@ public:
 
     static const bool needsDestruction = false;
 
+    static JSCell* seenMultipleCalleeObjects() { return bitwise_cast<JSCell*>(static_cast<uintptr_t>(1)); }
+
     enum CreatingEarlyCellTag { CreatingEarlyCell };
     JSCell(CreatingEarlyCellTag);
 
index b7a42c9..ac7c55b 100644 (file)
@@ -126,9 +126,7 @@ public:
         this->m_cell = reinterpret_cast<JSCell*>(value);
     }
 
-#if ENABLE(GC_VALIDATION)
     T* unvalidatedGet() const { return reinterpret_cast<T*>(static_cast<void*>(m_cell)); }
-#endif
 
 private:
     JSCell* m_cell;
diff --git a/Source/JavaScriptCore/tests/stress/create-this-with-callee-variants.js b/Source/JavaScriptCore/tests/stress/create-this-with-callee-variants.js
new file mode 100644 (file)
index 0000000..a7368ae
--- /dev/null
@@ -0,0 +1,18 @@
+function createInLoop(x, count) {
+    noInline(x)
+    for (var i = 0; i < 5000; i++) {
+        var obj = new x;
+        if (!(obj instanceof x))
+            throw "Failed to instantiate the right object";
+    }
+}
+
+function y() { return function () {} }
+
+createInLoop(y());
+
+function z() { return function () {} }
+
+createInLoop(z());
+createInLoop(z());
+createInLoop(z());