[JSC] Generator should not create JSLexicalEnvironment if it is not necessary
authorysuzuki@apple.com <ysuzuki@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Tue, 19 Mar 2019 02:21:10 +0000 (02:21 +0000)
committerysuzuki@apple.com <ysuzuki@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Tue, 19 Mar 2019 02:21:10 +0000 (02:21 +0000)
https://bugs.webkit.org/show_bug.cgi?id=195901

Reviewed by Saam Barati.

It is not rare that generators do not need to have any registers to be suspended and resumed.
Since currently we always emit op_create_lexical_environment for generator code, we sometimes
create empty JSLexicalEnvironment while it is not required. We can see that a lot of empty JSLexicalEnvironment
are allocated in RAMification's Basic test.

This patch removes this unnecessary allocation. We introduce op_create_generator_frame_environment, which is
a marker, similar to op_yield. And generatorification phase decides whether we should actually emit op_create_lexical_environment,
based on the result of the analysis in generatorification. This can remove unnecessary JSLexicalEnvironment allocations.

We run RAMification in 6 times, use average of them.
RAMification's Basic in JIT mode shows 1.4% improvement.
ToT
    Current: 55076864.00, Peak: 55080960.00
Patched
    Current: 54325930.67, Peak: 54329344.00

RAMification's Basic in non-JIT mode shows 5.0% improvement.
ToT
    Current: 12485290.67, Peak: 12485290.67
Patched
    Current: 11894101.33, Peak: 11894101.33

* bytecode/BytecodeGeneratorification.cpp:
(JSC::BytecodeGeneratorification::BytecodeGeneratorification):
(JSC::BytecodeGeneratorification::generatorFrameData const):
(JSC::BytecodeGeneratorification::run):
* bytecode/BytecodeList.rb:
* bytecode/BytecodeUseDef.h:
(JSC::computeUsesForBytecodeOffset):
(JSC::computeDefsForBytecodeOffset):
* bytecompiler/BytecodeGenerator.cpp:
(JSC::BytecodeGenerator::BytecodeGenerator):
* dfg/DFGCapabilities.cpp:
(JSC::DFG::capabilityLevel):
* llint/LowLevelInterpreter.asm:

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@243127 268f45cc-cd09-0410-ab3c-d52691b4dbfc

Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/bytecode/BytecodeGeneratorification.cpp
Source/JavaScriptCore/bytecode/BytecodeList.rb
Source/JavaScriptCore/bytecode/BytecodeUseDef.h
Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp
Source/JavaScriptCore/dfg/DFGCapabilities.cpp
Source/JavaScriptCore/llint/LowLevelInterpreter.asm

index d85c612..ee56fd1 100644 (file)
@@ -1,3 +1,46 @@
+2019-03-18  Yusuke Suzuki  <ysuzuki@apple.com>
+
+        [JSC] Generator should not create JSLexicalEnvironment if it is not necessary
+        https://bugs.webkit.org/show_bug.cgi?id=195901
+
+        Reviewed by Saam Barati.
+
+        It is not rare that generators do not need to have any registers to be suspended and resumed.
+        Since currently we always emit op_create_lexical_environment for generator code, we sometimes
+        create empty JSLexicalEnvironment while it is not required. We can see that a lot of empty JSLexicalEnvironment
+        are allocated in RAMification's Basic test.
+
+        This patch removes this unnecessary allocation. We introduce op_create_generator_frame_environment, which is
+        a marker, similar to op_yield. And generatorification phase decides whether we should actually emit op_create_lexical_environment,
+        based on the result of the analysis in generatorification. This can remove unnecessary JSLexicalEnvironment allocations.
+
+        We run RAMification in 6 times, use average of them.
+        RAMification's Basic in JIT mode shows 1.4% improvement.
+        ToT
+            Current: 55076864.00, Peak: 55080960.00
+        Patched
+            Current: 54325930.67, Peak: 54329344.00
+
+        RAMification's Basic in non-JIT mode shows 5.0% improvement.
+        ToT
+            Current: 12485290.67, Peak: 12485290.67
+        Patched
+            Current: 11894101.33, Peak: 11894101.33
+
+        * bytecode/BytecodeGeneratorification.cpp:
+        (JSC::BytecodeGeneratorification::BytecodeGeneratorification):
+        (JSC::BytecodeGeneratorification::generatorFrameData const):
+        (JSC::BytecodeGeneratorification::run):
+        * bytecode/BytecodeList.rb:
+        * bytecode/BytecodeUseDef.h:
+        (JSC::computeUsesForBytecodeOffset):
+        (JSC::computeDefsForBytecodeOffset):
+        * bytecompiler/BytecodeGenerator.cpp:
+        (JSC::BytecodeGenerator::BytecodeGenerator):
+        * dfg/DFGCapabilities.cpp:
+        (JSC::DFG::capabilityLevel):
+        * llint/LowLevelInterpreter.asm:
+
 2019-03-18  Robin Morisset  <rmorisset@apple.com>
 
         Remove the inline capacity of Operands
index 2be3c3c..eebeb4d 100644 (file)
@@ -55,6 +55,14 @@ class BytecodeGeneratorification {
 public:
     typedef Vector<YieldData> Yields;
 
+    struct GeneratorFrameData {
+        InstructionStream::Offset m_point;
+        VirtualRegister m_dst;
+        VirtualRegister m_scope;
+        VirtualRegister m_symbolTable;
+        VirtualRegister m_initialValue;
+    };
+
     BytecodeGeneratorification(BytecodeGenerator& bytecodeGenerator, UnlinkedCodeBlock* codeBlock, InstructionStreamWriter& instructions, SymbolTable* generatorFrameSymbolTable, int generatorFrameSymbolTableIndex)
         : m_bytecodeGenerator(bytecodeGenerator)
         , m_codeBlock(codeBlock)
@@ -83,6 +91,18 @@ public:
                     break;
                 }
 
+                case op_create_generator_frame_environment: {
+                    auto bytecode = instruction->as<OpCreateGeneratorFrameEnvironment>();
+                    GeneratorFrameData data;
+                    data.m_point = instruction.offset();
+                    data.m_dst = bytecode.m_dst;
+                    data.m_scope = bytecode.m_scope;
+                    data.m_symbolTable = bytecode.m_symbolTable;
+                    data.m_initialValue = bytecode.m_initialValue;
+                    m_generatorFrameData = WTFMove(data);
+                    break;
+                }
+
                 default:
                     break;
                 }
@@ -115,6 +135,11 @@ public:
         return m_instructions.at(m_enterPoint);
     }
 
+    Optional<GeneratorFrameData> generatorFrameData() const
+    {
+        return m_generatorFrameData;
+    }
+
     const InstructionStream& instructions() const
     {
         return m_instructions;
@@ -150,6 +175,7 @@ private:
 
     BytecodeGenerator& m_bytecodeGenerator;
     InstructionStream::Offset m_enterPoint;
+    Optional<GeneratorFrameData> m_generatorFrameData;
     UnlinkedCodeBlock* m_codeBlock;
     InstructionStreamWriter& m_instructions;
     BytecodeGraph m_graph;
@@ -205,7 +231,7 @@ void BytecodeGeneratorification::run()
         for (unsigned i = 0; i < m_yields.size(); ++i)
             jumpTable.add(i + 1, m_yields[i].point);
 
-        rewriter.insertFragmentBefore(nextToEnterPoint, [&](BytecodeRewriter::Fragment& fragment) {
+        rewriter.insertFragmentBefore(nextToEnterPoint, [&] (BytecodeRewriter::Fragment& fragment) {
             fragment.appendInstruction<OpSwitchImm>(switchTableIndex, BoundLabel(nextToEnterPoint.offset()), state);
         });
     }
@@ -215,7 +241,7 @@ void BytecodeGeneratorification::run()
 
         auto instruction = m_instructions.at(data.point);
         // Emit save sequence.
-        rewriter.insertFragmentBefore(instruction, [&](BytecodeRewriter::Fragment& fragment) {
+        rewriter.insertFragmentBefore(instruction, [&] (BytecodeRewriter::Fragment& fragment) {
             data.liveness.forEachSetBit([&](size_t index) {
                 VirtualRegister operand = virtualRegisterForLocal(index);
                 Storage storage = storageForGeneratorLocal(index);
@@ -235,7 +261,7 @@ void BytecodeGeneratorification::run()
         });
 
         // Emit resume sequence.
-        rewriter.insertFragmentAfter(instruction, [&](BytecodeRewriter::Fragment& fragment) {
+        rewriter.insertFragmentAfter(instruction, [&] (BytecodeRewriter::Fragment& fragment) {
             data.liveness.forEachSetBit([&](size_t index) {
                 VirtualRegister operand = virtualRegisterForLocal(index);
                 Storage storage = storageForGeneratorLocal(index);
@@ -255,6 +281,18 @@ void BytecodeGeneratorification::run()
         rewriter.removeBytecode(instruction);
     }
 
+    if (m_generatorFrameData) {
+        auto instruction = m_instructions.at(m_generatorFrameData->m_point);
+        rewriter.insertFragmentAfter(instruction, [&] (BytecodeRewriter::Fragment& fragment) {
+            if (!m_generatorFrameSymbolTable->scopeSize()) {
+                // This will cause us to put jsUndefined() into the generator frame's scope value.
+                fragment.appendInstruction<OpMov>(m_generatorFrameData->m_dst, m_generatorFrameData->m_initialValue);
+            } else
+                fragment.appendInstruction<OpCreateLexicalEnvironment>(m_generatorFrameData->m_dst, m_generatorFrameData->m_scope, m_generatorFrameData->m_symbolTable, m_generatorFrameData->m_initialValue);
+        });
+        rewriter.removeBytecode(instruction);
+    }
+
     rewriter.execute();
 }
 
index 115bd3b..8ca0da9 100644 (file)
@@ -929,6 +929,14 @@ op :create_lexical_environment,
         initialValue: VirtualRegister,
     }
 
+op :create_generator_frame_environment,
+    args: {
+        dst: VirtualRegister,
+        scope: VirtualRegister,
+        symbolTable: VirtualRegister,
+        initialValue: VirtualRegister,
+    }
+
 op :get_parent_scope,
     args: {
         dst: VirtualRegister,
index 6a9ee53..93aa2de 100644 (file)
@@ -156,7 +156,8 @@ void computeUsesForBytecodeOffset(Block* codeBlock, OpcodeID opcodeID, const Ins
     USES(OpNewGeneratorFuncExp, scope)
     USES(OpNewAsyncFuncExp, scope)
     USES(OpToIndexString, index)
-    USES(OpCreateLexicalEnvironment, scope)
+    USES(OpCreateLexicalEnvironment, scope, symbolTable, initialValue)
+    USES(OpCreateGeneratorFrameEnvironment, scope, symbolTable, initialValue)
     USES(OpResolveScope, scope)
     USES(OpResolveScopeForHoistingFuncDeclInEval, scope)
     USES(OpGetFromScope, scope)
@@ -364,6 +365,7 @@ void computeDefsForBytecodeOffset(Block* codeBlock, OpcodeID opcodeID, const Ins
     DEFS(OpGetParentScope, dst)
     DEFS(OpPushWithScope, dst)
     DEFS(OpCreateLexicalEnvironment, dst)
+    DEFS(OpCreateGeneratorFrameEnvironment, dst)
     DEFS(OpResolveScope, dst)
     DEFS(OpResolveScopeForHoistingFuncDeclInEval, dst)
     DEFS(OpStrcat, dst)
index e263555..5bf75f7 100644 (file)
@@ -422,9 +422,6 @@ BytecodeGenerator::BytecodeGenerator(VM& vm, FunctionNode* functionNode, Unlinke
     if (isGeneratorOrAsyncFunctionBodyParseMode(parseMode)) {
         // Generator and AsyncFunction never provides "arguments". "arguments" reference will be resolved in an upper generator function scope.
         needsArguments = false;
-
-        // Generator and AsyncFunction uses the var scope to save and resume its variables. So the lexical scope is always instantiated.
-        shouldCaptureSomeOfTheThings = true;
     }
 
     if (isGeneratorOrAsyncFunctionWrapperParseMode(parseMode) && needsArguments) {
@@ -490,7 +487,7 @@ BytecodeGenerator::BytecodeGenerator(VM& vm, FunctionNode* functionNode, Unlinke
     if (shouldCaptureSomeOfTheThings)
         m_lexicalEnvironmentRegister = addVar();
 
-    if (shouldCaptureSomeOfTheThings || vm.typeProfiler())
+    if (isGeneratorOrAsyncFunctionBodyParseMode(parseMode) || shouldCaptureSomeOfTheThings || vm.typeProfiler())
         symbolTableConstantIndex = addConstantValue(functionSymbolTable)->index();
 
     // We can allocate the "var" environment if we don't have default parameter expressions. If we have
@@ -839,10 +836,17 @@ BytecodeGenerator::BytecodeGenerator(VM& vm, FunctionNode* functionNode, Unlinke
     // Set up the lexical environment scope as the generator frame. We store the saved and resumed generator registers into this scope with the symbol keys.
     // Since they are symbol keyed, these variables cannot be reached from the usual code.
     if (isGeneratorOrAsyncFunctionBodyParseMode(parseMode)) {
-        ASSERT(m_lexicalEnvironmentRegister);
         m_generatorFrameSymbolTable.set(*m_vm, functionSymbolTable);
         m_generatorFrameSymbolTableIndex = symbolTableConstantIndex;
-        move(generatorFrameRegister(), m_lexicalEnvironmentRegister);
+        if (m_lexicalEnvironmentRegister)
+            move(generatorFrameRegister(), m_lexicalEnvironmentRegister);
+        else {
+            // It would be possible that generator does not need to suspend and resume any registers.
+            // In this case, we would like to avoid creating a lexical environment as much as possible.
+            // op_create_generator_frame_environment is a marker, which is similar to op_yield.
+            // Generatorification inserts lexical environment creation if necessary. Otherwise, we convert it to op_mov frame, `undefined`.
+            OpCreateGeneratorFrameEnvironment::emit(this, generatorFrameRegister(), scopeRegister(), VirtualRegister { symbolTableConstantIndex }, addConstantValue(jsUndefined()));
+        }
         emitPutById(generatorRegister(), propertyNames().builtinNames().generatorFramePrivateName(), generatorFrameRegister());
     }
 
index 7852103..f45b38e 100644 (file)
@@ -281,6 +281,7 @@ CapabilityLevel capabilityLevel(OpcodeID opcodeID, CodeBlock* codeBlock, const I
         return CanCompile;
 
     case op_yield:
+    case op_create_generator_frame_environment:
     case llint_program_prologue:
     case llint_eval_prologue:
     case llint_module_program_prologue:
index 6a668f7..793572e 100644 (file)
@@ -1800,6 +1800,11 @@ llintOp(op_yield, OpYield, macro (unused, unused, unused)
 end)
 
 
+llintOp(op_create_generator_frame_environment, OpYield, macro (unused, unused, unused)
+    notSupported()
+end)
+
+
 llintOp(op_debug, OpDebug, macro (unused, unused, dispatch)
     loadp CodeBlock[cfr], t0
     loadi CodeBlock::m_debuggerRequests[t0], t0