REGRESSION(r180595): same-callee profiling no longer works
authorrniwa@webkit.org <rniwa@webkit.org@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Mon, 11 May 2015 21:21:17 +0000 (21:21 +0000)
committerrniwa@webkit.org <rniwa@webkit.org@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Mon, 11 May 2015 21:21:17 +0000 (21:21 +0000)
https://bugs.webkit.org/show_bug.cgi?id=144787

Reviewed by Michael Saboff.

This patch introduces a DFG optimization to use NewObject node when the callee of op_create_this is
always the same JSFunction. This condition doesn't hold when the byte code creates multiple
JSFunction objects at runtime as in: function y() { return function () {} }; new y(); new y();

To enable this optimization, LLint and baseline JIT now store the last callee we saw in the newly
added fourth operand of op_create_this. We use this JSFunction's structure in DFG after verifying
our speculation that the callee is the same. To avoid recompiling the same code for different callee
objects in the polymorphic case, the special value of seenMultipleCalleeObjects() is set in
LLint and baseline JIT when multiple callees are observed.

Tests: stress/create-this-with-callee-variants.js

* bytecode/BytecodeList.json: Increased the number of operands to 5.
* bytecode/BytecodeUseDef.h:
(JSC::computeUsesForBytecodeOffset): op_create_this uses 2nd (constructor) and 4th (callee cache)
operands.
* bytecode/CodeBlock.cpp:
(JSC::CodeBlock::dumpBytecode): Dump the newly added callee cache.
(JSC::CodeBlock::finalizeUnconditionally): Clear the callee cache if the callee is no longer alive.
* bytecompiler/BytecodeGenerator.cpp:
(JSC::BytecodeGenerator::emitCreateThis): Add the instruction to propertyAccessInstructions so that
we can clear the callee cache in CodeBlock::finalizeUnconditionally. Also initialize the newly added
operand.
* dfg/DFGByteCodeParser.cpp:
(JSC::DFG::ByteCodeParser::parseBlock): Implement the optimization. Speculate the actual callee to
match the cache. Use the cached callee's structure if the speculation succeeds. Otherwise, OSR exit.
* jit/JITOpcodes.cpp:
(JSC::JIT::emit_op_create_this): Go to the slow path to update the cache unless it's already marked
as seenMultipleCalleeObjects() to indicate the polymorphic behavior.
(JSC::JIT::emitSlow_op_create_this):
* jit/JITOpcodes32_64.cpp:
(JSC::JIT::emit_op_create_this): Ditto.
(JSC::JIT::emitSlow_op_create_this):
* llint/LowLevelInterpreter32_64.asm:
(_llint_op_create_this): Ditto.
* llint/LowLevelInterpreter64.asm:
(_llint_op_create_this): Ditto.
* runtime/CommonSlowPaths.cpp:
(slow_path_create_this): Set the callee cache to the actual callee if it's not set. If the cache has
been set to a JSFunction* different from the actual callee, set it to seenMultipleCalleeObjects().
* runtime/JSCell.h:
(JSC::JSCell::seenMultipleCalleeObjects): Added.
* runtime/WriteBarrier.h:
(JSC::WriteBarrierBase::unvalidatedGet): Removed the compile guard around it.
* tests/stress/create-this-with-callee-variants.js: Added.

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@184123 268f45cc-cd09-0410-ab3c-d52691b4dbfc

14 files changed:
Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/bytecode/BytecodeList.json
Source/JavaScriptCore/bytecode/BytecodeUseDef.h
Source/JavaScriptCore/bytecode/CodeBlock.cpp
Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp
Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp
Source/JavaScriptCore/jit/JITOpcodes.cpp
Source/JavaScriptCore/jit/JITOpcodes32_64.cpp
Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm
Source/JavaScriptCore/llint/LowLevelInterpreter64.asm
Source/JavaScriptCore/runtime/CommonSlowPaths.cpp
Source/JavaScriptCore/runtime/JSCell.h
Source/JavaScriptCore/runtime/WriteBarrier.h
Source/JavaScriptCore/tests/stress/create-this-with-callee-variants.js [new file with mode: 0644]

index e59f2ad5dada05c5fdcec6d87afb6e6909cefd9a..23adcbaa9f825a2ca68c67072376df08fe8b136c 100644 (file)
@@ -1,3 +1,56 @@
+2015-05-11  Ryosuke Niwa  <rniwa@webkit.org>
+
+        REGRESSION(r180595): same-callee profiling no longer works
+        https://bugs.webkit.org/show_bug.cgi?id=144787
+
+        Reviewed by Michael Saboff.
+
+        This patch introduces a DFG optimization to use NewObject node when the callee of op_create_this is
+        always the same JSFunction. This condition doesn't hold when the byte code creates multiple
+        JSFunction objects at runtime as in: function y() { return function () {} }; new y(); new y();
+
+        To enable this optimization, LLint and baseline JIT now store the last callee we saw in the newly
+        added fourth operand of op_create_this. We use this JSFunction's structure in DFG after verifying
+        our speculation that the callee is the same. To avoid recompiling the same code for different callee
+        objects in the polymorphic case, the special value of seenMultipleCalleeObjects() is set in
+        LLint and baseline JIT when multiple callees are observed.
+
+        Tests: stress/create-this-with-callee-variants.js
+
+        * bytecode/BytecodeList.json: Increased the number of operands to 5.
+        * bytecode/BytecodeUseDef.h:
+        (JSC::computeUsesForBytecodeOffset): op_create_this uses 2nd (constructor) and 4th (callee cache)
+        operands.
+        * bytecode/CodeBlock.cpp:
+        (JSC::CodeBlock::dumpBytecode): Dump the newly added callee cache.
+        (JSC::CodeBlock::finalizeUnconditionally): Clear the callee cache if the callee is no longer alive.
+        * bytecompiler/BytecodeGenerator.cpp:
+        (JSC::BytecodeGenerator::emitCreateThis): Add the instruction to propertyAccessInstructions so that
+        we can clear the callee cache in CodeBlock::finalizeUnconditionally. Also initialize the newly added
+        operand.
+        * dfg/DFGByteCodeParser.cpp:
+        (JSC::DFG::ByteCodeParser::parseBlock): Implement the optimization. Speculate the actual callee to
+        match the cache. Use the cached callee's structure if the speculation succeeds. Otherwise, OSR exit.
+        * jit/JITOpcodes.cpp:
+        (JSC::JIT::emit_op_create_this): Go to the slow path to update the cache unless it's already marked
+        as seenMultipleCalleeObjects() to indicate the polymorphic behavior.
+        (JSC::JIT::emitSlow_op_create_this):
+        * jit/JITOpcodes32_64.cpp:
+        (JSC::JIT::emit_op_create_this): Ditto.
+        (JSC::JIT::emitSlow_op_create_this):
+        * llint/LowLevelInterpreter32_64.asm:
+        (_llint_op_create_this): Ditto.
+        * llint/LowLevelInterpreter64.asm:
+        (_llint_op_create_this): Ditto.
+        * runtime/CommonSlowPaths.cpp:
+        (slow_path_create_this): Set the callee cache to the actual callee if it's not set. If the cache has
+        been set to a JSFunction* different from the actual callee, set it to seenMultipleCalleeObjects().
+        * runtime/JSCell.h:
+        (JSC::JSCell::seenMultipleCalleeObjects): Added.
+        * runtime/WriteBarrier.h:
+        (JSC::WriteBarrierBase::unvalidatedGet): Removed the compile guard around it.
+        * tests/stress/create-this-with-callee-variants.js: Added.
+
 2015-05-11  Andreas Kling  <akling@apple.com>
 
         PropertyNameArray should use a Vector when there are few entries.
index e3ed1dcda4e03a5f11ef91e56119eb93051fb7f6..7e7849f0e06de1ebb5ea31c8c70a7f95f4f217b1 100644 (file)
@@ -9,7 +9,7 @@
             { "name" : "op_create_direct_arguments", "length" : 2 },
             { "name" : "op_create_scoped_arguments", "length" : 3 },
             { "name" : "op_create_out_of_band_arguments", "length" : 2 },
-            { "name" : "op_create_this", "length" : 4 },
+            { "name" : "op_create_this", "length" : 5 },
             { "name" : "op_to_this", "length" : 4 },
             { "name" : "op_check_tdz", "length" : 2 },
             { "name" : "op_new_object", "length" : 4 },
index 7667e683e454a499986edacd90b64917242037cd..fbaec9fe0757081a5e02985dd6483f2008d41bc8 100644 (file)
@@ -142,7 +142,6 @@ void computeUsesForBytecodeOffset(
     case op_not:
     case op_mov:
     case op_new_array_with_size:
-    case op_create_this:
     case op_del_by_id:
     case op_unsigned:
     case op_new_func:
@@ -183,6 +182,11 @@ void computeUsesForBytecodeOffset(
         functor(codeBlock, instruction, opcodeID, instruction[3].u.operand);
         return;
     }
+    case op_create_this: {
+        functor(codeBlock, instruction, opcodeID, instruction[2].u.operand);
+        functor(codeBlock, instruction, opcodeID, instruction[4].u.operand);
+        return;
+    }
     case op_has_structure_property:
     case op_construct_varargs:
     case op_call_varargs: {
index 90138f97cbd828f972cb22442bf40441d13f4056..918fe208346cd35ce8658e21f8eead063d6bcf2a 100644 (file)
@@ -795,8 +795,9 @@ void CodeBlock::dumpBytecode(
             int r0 = (++it)->u.operand;
             int r1 = (++it)->u.operand;
             unsigned inferredInlineCapacity = (++it)->u.operand;
+            unsigned cachedFunction = (++it)->u.operand;
             printLocationAndOp(out, exec, location, it, "create_this");
-            out.printf("%s, %s, %u", registerName(r0).data(), registerName(r1).data(), inferredInlineCapacity);
+            out.printf("%s, %s, %u, %u", registerName(r0).data(), registerName(r1).data(), inferredInlineCapacity, cachedFunction);
             break;
         }
         case op_to_this: {
@@ -2553,6 +2554,18 @@ void CodeBlock::finalizeUnconditionally()
                 curInstruction[3].u.toThisStatus = merge(
                     curInstruction[3].u.toThisStatus, ToThisClearedByGC);
                 break;
+            case op_create_this: {
+                auto& cacheWriteBarrier = curInstruction[4].u.jsCell;
+                if (!cacheWriteBarrier || cacheWriteBarrier.unvalidatedGet() == JSCell::seenMultipleCalleeObjects())
+                    break;
+                JSCell* cachedFunction = cacheWriteBarrier.get();
+                if (Heap::isMarked(cachedFunction))
+                    break;
+                if (Options::verboseOSR())
+                    dataLogF("Clearing LLInt create_this with cached callee %p.\n", cachedFunction);
+                cacheWriteBarrier.clear();
+                break;
+            }
             case op_resolve_scope: {
                 // Right now this isn't strictly necessary. Any symbol tables that this will refer to
                 // are for outer functions, and we refer to those functions strongly, and they refer
index c3ce875591a329f73ea5db2606bf56698a25f1a6..2829391a609948ab4c806c6351fb1d2ed8dae635 100644 (file)
@@ -1653,10 +1653,12 @@ RegisterID* BytecodeGenerator::emitCreateThis(RegisterID* dst)
     size_t begin = instructions().size();
     m_staticPropertyAnalyzer.createThis(m_thisRegister.index(), begin + 3);
 
+    m_codeBlock->addPropertyAccessInstruction(instructions().size());
     emitOpcode(op_create_this); 
     instructions().append(m_thisRegister.index()); 
     instructions().append(m_thisRegister.index()); 
     instructions().append(0);
+    instructions().append(0);
     return dst;
 }
 
index abc51d46a05197df413c3b36df1696d39bccaa01..d6213372082eb79b72e52c07c81f37800d4b305d 100644 (file)
@@ -2669,8 +2669,24 @@ bool ByteCodeParser::parseBlock(unsigned limit)
         case op_create_this: {
             int calleeOperand = currentInstruction[2].u.operand;
             Node* callee = get(VirtualRegister(calleeOperand));
+
+            JSFunction* function = callee->dynamicCastConstant<JSFunction*>();
+            if (!function) {
+                JSCell* cachedFunction = currentInstruction[4].u.jsCell.unvalidatedGet();
+                RELEASE_ASSERT(cachedFunction); // LLint and BaselineJIT always set it to a JSFunction* or seenMultipleCalleeObjects().
+                if (cachedFunction != JSCell::seenMultipleCalleeObjects()) {
+                    ASSERT(cachedFunction->inherits(JSFunction::info()));
+
+                    FrozenValue* frozen = m_graph.freeze(cachedFunction);
+                    addToGraph(CheckCell, OpInfo(frozen), callee);
+                    set(VirtualRegister(currentInstruction[1].u.operand), addToGraph(JSConstant, OpInfo(frozen)));
+
+                    function = static_cast<JSFunction*>(cachedFunction);
+                }
+            }
+
             bool alreadyEmitted = false;
-            if (JSFunction* function = callee->dynamicCastConstant<JSFunction*>()) {
+            if (function) {
                 if (FunctionRareData* rareData = function->rareData()) {
                     if (Structure* structure = rareData->allocationStructure()) {
                         m_graph.freeze(rareData);
index 5d4d0b9fe8768ac2fe596575ed4aa81220dad6c7..730361abdc955885153c0309ab7c6c19c3d98111 100644 (file)
@@ -705,11 +705,13 @@ void JIT::emit_op_to_this(Instruction* currentInstruction)
 void JIT::emit_op_create_this(Instruction* currentInstruction)
 {
     int callee = currentInstruction[2].u.operand;
+    WriteBarrierBase<JSCell>* cachedFunction = &currentInstruction[4].u.jsCell;
     RegisterID calleeReg = regT0;
-    RegisterID rareDataReg = regT0;
+    RegisterID rareDataReg = regT4;
     RegisterID resultReg = regT0;
     RegisterID allocatorReg = regT1;
     RegisterID structureReg = regT2;
+    RegisterID cachedFunctionReg = regT4;
     RegisterID scratchReg = regT3;
 
     emitGetVirtualRegister(callee, calleeReg);
@@ -719,6 +721,11 @@ void JIT::emit_op_create_this(Instruction* currentInstruction)
     loadPtr(Address(rareDataReg, FunctionRareData::offsetOfAllocationProfile() + ObjectAllocationProfile::offsetOfStructure()), structureReg);
     addSlowCase(branchTestPtr(Zero, allocatorReg));
 
+    loadPtr(cachedFunction, cachedFunctionReg);
+    Jump hasSeenMultipleCallees = branchPtr(Equal, cachedFunctionReg, TrustedImmPtr(JSCell::seenMultipleCalleeObjects()));
+    addSlowCase(branchPtr(NotEqual, calleeReg, cachedFunctionReg));
+    hasSeenMultipleCallees.link(this);
+
     emitAllocateJSObject(allocatorReg, structureReg, resultReg, scratchReg);
     emitPutVirtualRegister(currentInstruction[1].u.operand);
 }
@@ -728,6 +735,7 @@ void JIT::emitSlow_op_create_this(Instruction* currentInstruction, Vector<SlowCa
     linkSlowCase(iter); // doesn't have rare data
     linkSlowCase(iter); // doesn't have an allocation profile
     linkSlowCase(iter); // allocation failed
+    linkSlowCase(iter); // cached function didn't match
 
     JITSlowPathCall slowPathCall(this, currentInstruction, slow_path_create_this);
     slowPathCall.call();
index acd460b7d684c6dbaa806b03cc851b9f15b183b1..192022574302a81c494923277ff3d1e7e768b085 100644 (file)
@@ -936,11 +936,13 @@ void JIT::emit_op_get_scope(Instruction* currentInstruction)
 void JIT::emit_op_create_this(Instruction* currentInstruction)
 {
     int callee = currentInstruction[2].u.operand;
+    WriteBarrierBase<JSCell>* cachedFunction = &currentInstruction[4].u.jsCell;
     RegisterID calleeReg = regT0;
-    RegisterID rareDataReg = regT0;
+    RegisterID rareDataReg = regT4;
     RegisterID resultReg = regT0;
     RegisterID allocatorReg = regT1;
     RegisterID structureReg = regT2;
+    RegisterID cachedFunctionReg = regT4;
     RegisterID scratchReg = regT3;
 
     emitLoadPayload(callee, calleeReg);
@@ -950,6 +952,11 @@ void JIT::emit_op_create_this(Instruction* currentInstruction)
     loadPtr(Address(rareDataReg, FunctionRareData::offsetOfAllocationProfile() + ObjectAllocationProfile::offsetOfStructure()), structureReg);
     addSlowCase(branchTestPtr(Zero, allocatorReg));
 
+    loadPtr(cachedFunction, cachedFunctionReg);
+    Jump hasSeenMultipleCallees = branchPtr(Equal, cachedFunctionReg, TrustedImmPtr(JSCell::seenMultipleCalleeObjects()));
+    addSlowCase(branchPtr(NotEqual, calleeReg, cachedFunctionReg));
+    hasSeenMultipleCallees.link(this);
+
     emitAllocateJSObject(allocatorReg, structureReg, resultReg, scratchReg);
     emitStoreCell(currentInstruction[1].u.operand, resultReg);
 }
@@ -959,6 +966,7 @@ void JIT::emitSlow_op_create_this(Instruction* currentInstruction, Vector<SlowCa
     linkSlowCase(iter); // doesn't have rare data
     linkSlowCase(iter); // doesn't have an allocation profile
     linkSlowCase(iter); // allocation failed
+    linkSlowCase(iter); // cached function didn't match
 
     JITSlowPathCall slowPathCall(this, currentInstruction, slow_path_create_this);
     slowPathCall.call();
index 7ced21adf2b3eae9b6443c08dbb399299a84f362..5ac24a8334fcf10b62bd4feb1f2bef79dfa077f5 100644 (file)
@@ -745,15 +745,19 @@ _llint_op_create_this:
     loadp FunctionRareData::m_allocationProfile + ObjectAllocationProfile::m_allocator[t4], t1
     loadp FunctionRareData::m_allocationProfile + ObjectAllocationProfile::m_structure[t4], t2
     btpz t1, .opCreateThisSlow
+    loadpFromInstruction(4, t4)
+    bpeq t4, 1, .hasSeenMultipleCallee
+    bpneq t4, t0, .opCreateThisSlow
+.hasSeenMultipleCallee:
     allocateJSObject(t1, t2, t0, t3, .opCreateThisSlow)
     loadi 4[PC], t1
     storei CellTag, TagOffset[cfr, t1, 8]
     storei t0, PayloadOffset[cfr, t1, 8]
-    dispatch(4)
+    dispatch(5)
 
 .opCreateThisSlow:
     callSlowPath(_slow_path_create_this)
-    dispatch(4)
+    dispatch(5)
 
 
 _llint_op_to_this:
index 68a9c62fc6050cc25f55e91e26bb4376686e41e7..0f11e67d2821dca84b3528c77267bf281c80319c 100644 (file)
@@ -631,14 +631,18 @@ _llint_op_create_this:
     loadp FunctionRareData::m_allocationProfile + ObjectAllocationProfile::m_allocator[t4], t1
     loadp FunctionRareData::m_allocationProfile + ObjectAllocationProfile::m_structure[t4], t2
     btpz t1, .opCreateThisSlow
+    loadpFromInstruction(4, t4)
+    bpeq t4, 1, .hasSeenMultipleCallee
+    bpneq t4, t0, .opCreateThisSlow
+.hasSeenMultipleCallee:
     allocateJSObject(t1, t2, t0, t3, .opCreateThisSlow)
     loadisFromInstruction(1, t1)
     storeq t0, [cfr, t1, 8]
-    dispatch(4)
+    dispatch(5)
 
 .opCreateThisSlow:
     callSlowPath(_slow_path_create_this)
-    dispatch(4)
+    dispatch(5)
 
 
 _llint_op_to_this:
index 009b3500ba42ab0883722dd7889c628612fe9a6f..56d46ed82b6758f91d3c2b2d04e27a00f1355e5f 100644 (file)
@@ -235,6 +235,12 @@ SLOW_PATH_DECL(slow_path_create_this)
     ASSERT(constructor->methodTable()->getConstructData(constructor, constructData) == ConstructTypeJS);
 #endif
 
+    auto& cacheWriteBarrier = pc[4].u.jsCell;
+    if (!cacheWriteBarrier)
+        cacheWriteBarrier.set(exec->vm(), exec->codeBlock()->ownerExecutable(), constructor);
+    else if (cacheWriteBarrier.unvalidatedGet() != JSCell::seenMultipleCalleeObjects() && cacheWriteBarrier.get() != constructor)
+        cacheWriteBarrier.setWithoutWriteBarrier(JSCell::seenMultipleCalleeObjects());
+
     size_t inlineCapacity = pc[3].u.operand;
     Structure* structure = constructor->rareData(exec, inlineCapacity)->allocationProfile()->structure();
     RETURN(constructEmptyObject(exec, structure));
index 26df1e342f767647211e322ff51ca63991381871..6d648ad262944ed7b4670e23df79c8e63b2985b8 100644 (file)
@@ -74,6 +74,8 @@ public:
 
     static const bool needsDestruction = false;
 
+    static JSCell* seenMultipleCalleeObjects() { return bitwise_cast<JSCell*>(static_cast<uintptr_t>(1)); }
+
     enum CreatingEarlyCellTag { CreatingEarlyCell };
     JSCell(CreatingEarlyCellTag);
 
index b7a42c9d3cb8ef7b3ba4695d88f3cf92f1d4b07c..ac7c55b42133d13184eb9d18f88c0f77cf6013ca 100644 (file)
@@ -126,9 +126,7 @@ public:
         this->m_cell = reinterpret_cast<JSCell*>(value);
     }
 
-#if ENABLE(GC_VALIDATION)
     T* unvalidatedGet() const { return reinterpret_cast<T*>(static_cast<void*>(m_cell)); }
-#endif
 
 private:
     JSCell* m_cell;
diff --git a/Source/JavaScriptCore/tests/stress/create-this-with-callee-variants.js b/Source/JavaScriptCore/tests/stress/create-this-with-callee-variants.js
new file mode 100644 (file)
index 0000000..a7368ae
--- /dev/null
@@ -0,0 +1,18 @@
+function createInLoop(x, count) {
+    noInline(x)
+    for (var i = 0; i < 5000; i++) {
+        var obj = new x;
+        if (!(obj instanceof x))
+            throw "Failed to instantiate the right object";
+    }
+}
+
+function y() { return function () {} }
+
+createInLoop(y());
+
+function z() { return function () {} }
+
+createInLoop(z());
+createInLoop(z());
+createInLoop(z());