FTL should generate code to call slow paths lazily
author:    fpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
           Mon, 12 Oct 2015 17:56:26 +0000 (17:56 +0000)
committer: fpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
           Mon, 12 Oct 2015 17:56:26 +0000 (17:56 +0000)
https://bugs.webkit.org/show_bug.cgi?id=149936

Reviewed by Saam Barati.

Source/JavaScriptCore:

We often have complex slow paths in FTL-generated code. Those slow paths may never run. Even
if they do run, they don't need stellar performance. So, it doesn't make sense to have LLVM
worry about compiling such slow path code.

This patch enables us to use our own MacroAssembler for compiling the slow path inside FTL
code. It does this by using a crazy lambda thingy (see FTLLowerDFGToLLVM.cpp's lazySlowPath()
and its documentation). The result is quite natural to use.

Even for straight slow path calls via something like vmCall(), the lazySlowPath offers the
benefit that the call marshalling and the exception checking are not expressed using LLVM IR
and do not require LLVM to think about them. It also has the benefit that we never generate the
code if it never runs. That's great, since function calls usually involve ~10 instructions
total (move arguments to argument registers, make the call, check exception, etc.).
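The core idea can be sketched as a small conceptual model. This is not JSC's actual implementation (the real LazySlowPath runs a MacroAssembler-based generator through a thunk and patches a jump); the names and the `std::function` slot here are illustrative stand-ins for the patchpoint-and-generator mechanism:

```cpp
#include <cassert>
#include <functional>

// Conceptual model of a lazy slow path: the fast path holds a slot that is
// empty until the slow path is first taken. On first execution, a generator
// builds the slow-path code (here, just another lambda standing in for
// MacroAssembler-generated code) and fills the slot, so later executions skip
// generation. If the slow path never runs, nothing is ever generated.
struct LazySlowPath {
    // Stand-in for LazySlowPath::generate(): builds and links the slow path.
    std::function<std::function<int(int)>()> generator;
    std::function<int(int)> stub; // null until the slow path first executes
    int generationCount = 0;

    int call(int value)
    {
        if (!stub) { // first execution: generate, then "patch"
            stub = generator();
            generationCount++;
        }
        return stub(value);
    }
};
```

The key property the patch relies on is visible here: a path that never executes costs nothing beyond the empty slot, and a path that does execute pays the generation cost exactly once.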

This patch adds the lazy slow path abstraction and uses it for some slow paths in the FTL.
The code we generate with lazy slow paths is worse than the code that LLVM would have
generated. Therefore, a lazy slow path only makes sense when we have strong evidence that
the slow path will execute infrequently relative to the fast path. This completely precludes
the use of lazy slow paths for out-of-line Nodes that unconditionally call a C++ function.
It also precludes their use for the GetByVal out-of-bounds handler, since when we generate
a GetByVal with an out-of-bounds handler it means that we only know that the out-of-bounds
case executed at least once; for all we know, it may actually be the common case. So, this
patch deploys lazy slow paths only for GC slow paths and masquerades-as-undefined
slow paths. It makes sense for GC slow paths because those have a statistical guarantee of
slow path frequency - probably bounded at less than 1/10. It makes sense for masquerades-as-
undefined because we can say quite confidently that this is an uncommon scenario on the
modern Web.

Something that's always been challenging about abstractions involving the MacroAssembler is
that linking is a separate phase, and there is no way for someone who is just given access to
the MacroAssembler& to emit code that requires linking, since linking happens once we have
emitted all code and we are creating the LinkBuffer. Moreover, the FTL requires that the
final parts of linking happen on the main thread. This patch ran into this issue, and solved
it comprehensively, by introducing MacroAssembler::addLinkTask(). This takes a lambda and
runs it at the bitter end of linking - when performFinalization() is called. This ensures that
the task added by addLinkTask() runs on the main thread. This patch doesn't replace all of
the previously existing idioms for dealing with this issue; we can do that later.
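The shape of addLinkTask() can be modeled in a few lines. This sketch mirrors the structure of the patch's AbstractMacroAssembler/LinkBuffer changes, but substitutes `std::function` for `RefPtr<SharedTask<void(LinkBuffer&)>>` so it stands alone:

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Illustrative stand-ins for JSC's AbstractMacroAssembler and LinkBuffer.
// Code-emission helpers queue lambdas on the assembler via addLinkTask();
// the LinkBuffer takes ownership of the queue when it links the code, and
// runs every task only at finalization - which the FTL guarantees happens
// on the main thread.
struct LinkBuffer;

struct Assembler {
    std::vector<std::function<void(LinkBuffer&)>> linkTasks;

    template<typename Functor>
    void addLinkTask(const Functor& functor) { linkTasks.push_back(functor); }
};

struct LinkBuffer {
    std::vector<std::function<void(LinkBuffer&)>> linkTasks;

    explicit LinkBuffer(Assembler& assembler)
        : linkTasks(std::move(assembler.linkTasks)) { } // mirrors linkCode()

    void performFinalization()
    {
        // The "bitter end" of linking: every queued task runs here.
        for (auto& task : linkTasks)
            task(*this);
    }
};
```

The design choice this models is that anyone holding only a `MacroAssembler&` can still schedule link-time work, without needing to see the LinkBuffer that will eventually exist.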

This shows small speed-ups on a bunch of things. No big win on any benchmark aggregate. But
mainly this is done for https://bugs.webkit.org/show_bug.cgi?id=149852, where we found that
outlining the slow path in this way was a significant speed boost.

* CMakeLists.txt:
* JavaScriptCore.vcxproj/JavaScriptCore.vcxproj:
* JavaScriptCore.xcodeproj/project.pbxproj:
* assembler/AbstractMacroAssembler.h:
(JSC::AbstractMacroAssembler::replaceWithAddressComputation):
(JSC::AbstractMacroAssembler::addLinkTask):
(JSC::AbstractMacroAssembler::AbstractMacroAssembler):
* assembler/LinkBuffer.cpp:
(JSC::LinkBuffer::linkCode):
(JSC::LinkBuffer::allocate):
(JSC::LinkBuffer::performFinalization):
* assembler/LinkBuffer.h:
(JSC::LinkBuffer::wasAlreadyDisassembled):
(JSC::LinkBuffer::didAlreadyDisassemble):
(JSC::LinkBuffer::vm):
(JSC::LinkBuffer::executableOffsetFor):
* bytecode/CodeOrigin.h:
(JSC::CodeOrigin::CodeOrigin):
(JSC::CodeOrigin::isSet):
(JSC::CodeOrigin::operator bool):
(JSC::CodeOrigin::isHashTableDeletedValue):
(JSC::CodeOrigin::operator!): Deleted.
* ftl/FTLCompile.cpp:
(JSC::FTL::mmAllocateDataSection):
* ftl/FTLInlineCacheDescriptor.h:
(JSC::FTL::InlineCacheDescriptor::InlineCacheDescriptor):
(JSC::FTL::CheckInDescriptor::CheckInDescriptor):
(JSC::FTL::LazySlowPathDescriptor::LazySlowPathDescriptor):
* ftl/FTLJITCode.h:
* ftl/FTLJITFinalizer.cpp:
(JSC::FTL::JITFinalizer::finalizeFunction):
* ftl/FTLJITFinalizer.h:
* ftl/FTLLazySlowPath.cpp: Added.
(JSC::FTL::LazySlowPath::LazySlowPath):
(JSC::FTL::LazySlowPath::~LazySlowPath):
(JSC::FTL::LazySlowPath::generate):
* ftl/FTLLazySlowPath.h: Added.
(JSC::FTL::LazySlowPath::createGenerator):
(JSC::FTL::LazySlowPath::patchpoint):
(JSC::FTL::LazySlowPath::usedRegisters):
(JSC::FTL::LazySlowPath::callSiteIndex):
(JSC::FTL::LazySlowPath::stub):
* ftl/FTLLazySlowPathCall.h: Added.
(JSC::FTL::createLazyCallGenerator):
* ftl/FTLLowerDFGToLLVM.cpp:
(JSC::FTL::DFG::LowerDFGToLLVM::compileCreateActivation):
(JSC::FTL::DFG::LowerDFGToLLVM::compileNewFunction):
(JSC::FTL::DFG::LowerDFGToLLVM::compileCreateDirectArguments):
(JSC::FTL::DFG::LowerDFGToLLVM::compileNewArrayWithSize):
(JSC::FTL::DFG::LowerDFGToLLVM::compileMakeRope):
(JSC::FTL::DFG::LowerDFGToLLVM::compileNotifyWrite):
(JSC::FTL::DFG::LowerDFGToLLVM::compileIsObjectOrNull):
(JSC::FTL::DFG::LowerDFGToLLVM::compileIsFunction):
(JSC::FTL::DFG::LowerDFGToLLVM::compileIn):
(JSC::FTL::DFG::LowerDFGToLLVM::compileMaterializeNewObject):
(JSC::FTL::DFG::LowerDFGToLLVM::compileMaterializeCreateActivation):
(JSC::FTL::DFG::LowerDFGToLLVM::compileCheckWatchdogTimer):
(JSC::FTL::DFG::LowerDFGToLLVM::allocatePropertyStorageWithSizeImpl):
(JSC::FTL::DFG::LowerDFGToLLVM::allocateObject):
(JSC::FTL::DFG::LowerDFGToLLVM::allocateJSArray):
(JSC::FTL::DFG::LowerDFGToLLVM::buildTypeOf):
(JSC::FTL::DFG::LowerDFGToLLVM::sensibleDoubleToInt32):
(JSC::FTL::DFG::LowerDFGToLLVM::lazySlowPath):
(JSC::FTL::DFG::LowerDFGToLLVM::speculate):
(JSC::FTL::DFG::LowerDFGToLLVM::emitStoreBarrier):
* ftl/FTLOperations.cpp:
(JSC::FTL::operationMaterializeObjectInOSR):
(JSC::FTL::compileFTLLazySlowPath):
* ftl/FTLOperations.h:
* ftl/FTLSlowPathCall.cpp:
(JSC::FTL::SlowPathCallContext::SlowPathCallContext):
(JSC::FTL::SlowPathCallContext::~SlowPathCallContext):
(JSC::FTL::SlowPathCallContext::keyWithTarget):
(JSC::FTL::SlowPathCallContext::makeCall):
(JSC::FTL::callSiteIndexForCodeOrigin):
(JSC::FTL::storeCodeOrigin): Deleted.
(JSC::FTL::callOperation): Deleted.
* ftl/FTLSlowPathCall.h:
(JSC::FTL::callOperation):
* ftl/FTLState.h:
* ftl/FTLThunks.cpp:
(JSC::FTL::genericGenerationThunkGenerator):
(JSC::FTL::osrExitGenerationThunkGenerator):
(JSC::FTL::lazySlowPathGenerationThunkGenerator):
(JSC::FTL::registerClobberCheck):
* ftl/FTLThunks.h:
* interpreter/CallFrame.h:
(JSC::CallSiteIndex::CallSiteIndex):
(JSC::CallSiteIndex::operator bool):
(JSC::CallSiteIndex::bits):
* jit/CCallHelpers.h:
(JSC::CCallHelpers::setupArgument):
(JSC::CCallHelpers::setupArgumentsWithExecState):
* jit/JITOperations.cpp:

Source/WTF:

Enables SharedTask to handle any function type, not just void().

It's probably better to use SharedTask instead of std::function in performance-sensitive
code. std::function uses the system malloc and has copy semantics. SharedTask uses FastMalloc
and has aliasing semantics. So, you can just trust that it will have sensible performance
characteristics.
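The generalized SharedTask shape can be sketched as follows. This is a simplified model, not WTF's implementation: WTF hands out `RefPtr`s backed by FastMalloc, while this sketch uses `std::shared_ptr` as the closest standard-library analogue of the aliasing semantics described above:

```cpp
#include <cassert>
#include <memory>

// A task parameterized over an arbitrary function type, not just void().
template<typename FunctionType> class SharedTask;

template<typename Result, typename... Args>
class SharedTask<Result(Args...)> {
public:
    virtual ~SharedTask() { }
    virtual Result run(Args... args) = 0;
};

// Wraps a concrete functor; its captured state is allocated exactly once.
template<typename FunctionType, typename Functor> class SharedTaskFunctor;

template<typename Result, typename... Args, typename Functor>
class SharedTaskFunctor<Result(Args...), Functor> final
    : public SharedTask<Result(Args...)> {
public:
    explicit SharedTaskFunctor(const Functor& functor) : m_functor(functor) { }
    Result run(Args... args) override { return m_functor(args...); }
private:
    Functor m_functor;
};

template<typename FunctionType, typename Functor>
std::shared_ptr<SharedTask<FunctionType>> createSharedTask(const Functor& functor)
{
    return std::make_shared<SharedTaskFunctor<FunctionType, Functor>>(functor);
}
```

Because handles alias a single heap object, passing a task around never re-copies the captured state the way copying a `std::function` can.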

* wtf/ParallelHelperPool.cpp:
(WTF::ParallelHelperClient::~ParallelHelperClient):
(WTF::ParallelHelperClient::setTask):
(WTF::ParallelHelperClient::doSomeHelping):
(WTF::ParallelHelperClient::runTaskInParallel):
(WTF::ParallelHelperClient::finish):
(WTF::ParallelHelperClient::claimTask):
(WTF::ParallelHelperClient::runTask):
(WTF::ParallelHelperPool::doSomeHelping):
(WTF::ParallelHelperPool::helperThreadBody):
* wtf/ParallelHelperPool.h:
(WTF::ParallelHelperClient::setFunction):
(WTF::ParallelHelperClient::runFunctionInParallel):
(WTF::ParallelHelperClient::pool):
* wtf/SharedTask.h:
(WTF::createSharedTask):
(WTF::SharedTask::SharedTask): Deleted.
(WTF::SharedTask::~SharedTask): Deleted.
(WTF::SharedTaskFunctor::SharedTaskFunctor): Deleted.

git-svn-id: http://svn.webkit.org/repository/webkit/trunk@190860 268f45cc-cd09-0410-ab3c-d52691b4dbfc

31 files changed:
Source/JavaScriptCore/CMakeLists.txt
Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/JavaScriptCore.vcxproj/JavaScriptCore.vcxproj
Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj
Source/JavaScriptCore/assembler/AbstractMacroAssembler.h
Source/JavaScriptCore/assembler/LinkBuffer.cpp
Source/JavaScriptCore/assembler/LinkBuffer.h
Source/JavaScriptCore/bytecode/CodeOrigin.h
Source/JavaScriptCore/ftl/FTLCompile.cpp
Source/JavaScriptCore/ftl/FTLInlineCacheDescriptor.h
Source/JavaScriptCore/ftl/FTLJITCode.h
Source/JavaScriptCore/ftl/FTLJITFinalizer.cpp
Source/JavaScriptCore/ftl/FTLJITFinalizer.h
Source/JavaScriptCore/ftl/FTLLazySlowPath.cpp [new file with mode: 0644]
Source/JavaScriptCore/ftl/FTLLazySlowPath.h [new file with mode: 0644]
Source/JavaScriptCore/ftl/FTLLazySlowPathCall.h [new file with mode: 0644]
Source/JavaScriptCore/ftl/FTLLowerDFGToLLVM.cpp
Source/JavaScriptCore/ftl/FTLOperations.cpp
Source/JavaScriptCore/ftl/FTLOperations.h
Source/JavaScriptCore/ftl/FTLSlowPathCall.cpp
Source/JavaScriptCore/ftl/FTLSlowPathCall.h
Source/JavaScriptCore/ftl/FTLState.h
Source/JavaScriptCore/ftl/FTLThunks.cpp
Source/JavaScriptCore/ftl/FTLThunks.h
Source/JavaScriptCore/interpreter/CallFrame.h
Source/JavaScriptCore/jit/CCallHelpers.h
Source/JavaScriptCore/jit/JITOperations.cpp
Source/WTF/ChangeLog
Source/WTF/wtf/ParallelHelperPool.cpp
Source/WTF/wtf/ParallelHelperPool.h
Source/WTF/wtf/SharedTask.h

index d671e50..0b91f71 100644 (file)
@@ -905,6 +905,7 @@ if (ENABLE_FTL_JIT)
         ftl/FTLJSCallBase.cpp
         ftl/FTLJSCallVarargs.cpp
         ftl/FTLJSTailCall.cpp
+        ftl/FTLLazySlowPath.cpp
         ftl/FTLLink.cpp
         ftl/FTLLocation.cpp
         ftl/FTLLowerDFGToLLVM.cpp
index 7fa8908..3915865 100644 (file)
@@ -1,3 +1,147 @@
+2015-10-10  Filip Pizlo  <fpizlo@apple.com>
+
+        FTL should generate code to call slow paths lazily
+        https://bugs.webkit.org/show_bug.cgi?id=149936
+
+        Reviewed by Saam Barati.
+
+        We often have complex slow paths in FTL-generated code. Those slow paths may never run. Even
+        if they do run, they don't need stellar performance. So, it doesn't make sense to have LLVM
+        worry about compiling such slow path code.
+
+        This patch enables us to use our own MacroAssembler for compiling the slow path inside FTL
+        code. It does this by using a crazy lambda thingy (see FTLLowerDFGToLLVM.cpp's lazySlowPath()
+        and its documentation). The result is quite natural to use.
+
+        Even for straight slow path calls via something like vmCall(), the lazySlowPath offers the
+        benefit that the call marshalling and the exception checking are not expressed using LLVM IR
+        and do not require LLVM to think about them. It also has the benefit that we never generate the
+        code if it never runs. That's great, since function calls usually involve ~10 instructions
+        total (move arguments to argument registers, make the call, check exception, etc.).
+
+        This patch adds the lazy slow path abstraction and uses it for some slow paths in the FTL.
+        The code we generate with lazy slow paths is worse than the code that LLVM would have
+        generated. Therefore, a lazy slow path only makes sense when we have strong evidence that
+        the slow path will execute infrequently relative to the fast path. This completely precludes
+        the use of lazy slow paths for out-of-line Nodes that unconditionally call a C++ function.
+        It also precludes their use for the GetByVal out-of-bounds handler, since when we generate
+        a GetByVal with an out-of-bounds handler it means that we only know that the out-of-bounds
+        case executed at least once; for all we know, it may actually be the common case. So, this
+        patch deploys lazy slow paths only for GC slow paths and masquerades-as-undefined
+        slow paths. It makes sense for GC slow paths because those have a statistical guarantee of
+        slow path frequency - probably bounded at less than 1/10. It makes sense for masquerades-as-
+        undefined because we can say quite confidently that this is an uncommon scenario on the
+        modern Web.
+
+        Something that's always been challenging about abstractions involving the MacroAssembler is
+        that linking is a separate phase, and there is no way for someone who is just given access to
+        the MacroAssembler& to emit code that requires linking, since linking happens once we have
+        emitted all code and we are creating the LinkBuffer. Moreover, the FTL requires that the
+        final parts of linking happen on the main thread. This patch ran into this issue, and solved
+        it comprehensively, by introducing MacroAssembler::addLinkTask(). This takes a lambda and
+        runs it at the bitter end of linking - when performFinalization() is called. This ensures that
+        the task added by addLinkTask() runs on the main thread. This patch doesn't replace all of
+        the previously existing idioms for dealing with this issue; we can do that later.
+
+        This shows small speed-ups on a bunch of things. No big win on any benchmark aggregate. But
+        mainly this is done for https://bugs.webkit.org/show_bug.cgi?id=149852, where we found that
+        outlining the slow path in this way was a significant speed boost.
+
+        * CMakeLists.txt:
+        * JavaScriptCore.vcxproj/JavaScriptCore.vcxproj:
+        * JavaScriptCore.xcodeproj/project.pbxproj:
+        * assembler/AbstractMacroAssembler.h:
+        (JSC::AbstractMacroAssembler::replaceWithAddressComputation):
+        (JSC::AbstractMacroAssembler::addLinkTask):
+        (JSC::AbstractMacroAssembler::AbstractMacroAssembler):
+        * assembler/LinkBuffer.cpp:
+        (JSC::LinkBuffer::linkCode):
+        (JSC::LinkBuffer::allocate):
+        (JSC::LinkBuffer::performFinalization):
+        * assembler/LinkBuffer.h:
+        (JSC::LinkBuffer::wasAlreadyDisassembled):
+        (JSC::LinkBuffer::didAlreadyDisassemble):
+        (JSC::LinkBuffer::vm):
+        (JSC::LinkBuffer::executableOffsetFor):
+        * bytecode/CodeOrigin.h:
+        (JSC::CodeOrigin::CodeOrigin):
+        (JSC::CodeOrigin::isSet):
+        (JSC::CodeOrigin::operator bool):
+        (JSC::CodeOrigin::isHashTableDeletedValue):
+        (JSC::CodeOrigin::operator!): Deleted.
+        * ftl/FTLCompile.cpp:
+        (JSC::FTL::mmAllocateDataSection):
+        * ftl/FTLInlineCacheDescriptor.h:
+        (JSC::FTL::InlineCacheDescriptor::InlineCacheDescriptor):
+        (JSC::FTL::CheckInDescriptor::CheckInDescriptor):
+        (JSC::FTL::LazySlowPathDescriptor::LazySlowPathDescriptor):
+        * ftl/FTLJITCode.h:
+        * ftl/FTLJITFinalizer.cpp:
+        (JSC::FTL::JITFinalizer::finalizeFunction):
+        * ftl/FTLJITFinalizer.h:
+        * ftl/FTLLazySlowPath.cpp: Added.
+        (JSC::FTL::LazySlowPath::LazySlowPath):
+        (JSC::FTL::LazySlowPath::~LazySlowPath):
+        (JSC::FTL::LazySlowPath::generate):
+        * ftl/FTLLazySlowPath.h: Added.
+        (JSC::FTL::LazySlowPath::createGenerator):
+        (JSC::FTL::LazySlowPath::patchpoint):
+        (JSC::FTL::LazySlowPath::usedRegisters):
+        (JSC::FTL::LazySlowPath::callSiteIndex):
+        (JSC::FTL::LazySlowPath::stub):
+        * ftl/FTLLazySlowPathCall.h: Added.
+        (JSC::FTL::createLazyCallGenerator):
+        * ftl/FTLLowerDFGToLLVM.cpp:
+        (JSC::FTL::DFG::LowerDFGToLLVM::compileCreateActivation):
+        (JSC::FTL::DFG::LowerDFGToLLVM::compileNewFunction):
+        (JSC::FTL::DFG::LowerDFGToLLVM::compileCreateDirectArguments):
+        (JSC::FTL::DFG::LowerDFGToLLVM::compileNewArrayWithSize):
+        (JSC::FTL::DFG::LowerDFGToLLVM::compileMakeRope):
+        (JSC::FTL::DFG::LowerDFGToLLVM::compileNotifyWrite):
+        (JSC::FTL::DFG::LowerDFGToLLVM::compileIsObjectOrNull):
+        (JSC::FTL::DFG::LowerDFGToLLVM::compileIsFunction):
+        (JSC::FTL::DFG::LowerDFGToLLVM::compileIn):
+        (JSC::FTL::DFG::LowerDFGToLLVM::compileMaterializeNewObject):
+        (JSC::FTL::DFG::LowerDFGToLLVM::compileMaterializeCreateActivation):
+        (JSC::FTL::DFG::LowerDFGToLLVM::compileCheckWatchdogTimer):
+        (JSC::FTL::DFG::LowerDFGToLLVM::allocatePropertyStorageWithSizeImpl):
+        (JSC::FTL::DFG::LowerDFGToLLVM::allocateObject):
+        (JSC::FTL::DFG::LowerDFGToLLVM::allocateJSArray):
+        (JSC::FTL::DFG::LowerDFGToLLVM::buildTypeOf):
+        (JSC::FTL::DFG::LowerDFGToLLVM::sensibleDoubleToInt32):
+        (JSC::FTL::DFG::LowerDFGToLLVM::lazySlowPath):
+        (JSC::FTL::DFG::LowerDFGToLLVM::speculate):
+        (JSC::FTL::DFG::LowerDFGToLLVM::emitStoreBarrier):
+        * ftl/FTLOperations.cpp:
+        (JSC::FTL::operationMaterializeObjectInOSR):
+        (JSC::FTL::compileFTLLazySlowPath):
+        * ftl/FTLOperations.h:
+        * ftl/FTLSlowPathCall.cpp:
+        (JSC::FTL::SlowPathCallContext::SlowPathCallContext):
+        (JSC::FTL::SlowPathCallContext::~SlowPathCallContext):
+        (JSC::FTL::SlowPathCallContext::keyWithTarget):
+        (JSC::FTL::SlowPathCallContext::makeCall):
+        (JSC::FTL::callSiteIndexForCodeOrigin):
+        (JSC::FTL::storeCodeOrigin): Deleted.
+        (JSC::FTL::callOperation): Deleted.
+        * ftl/FTLSlowPathCall.h:
+        (JSC::FTL::callOperation):
+        * ftl/FTLState.h:
+        * ftl/FTLThunks.cpp:
+        (JSC::FTL::genericGenerationThunkGenerator):
+        (JSC::FTL::osrExitGenerationThunkGenerator):
+        (JSC::FTL::lazySlowPathGenerationThunkGenerator):
+        (JSC::FTL::registerClobberCheck):
+        * ftl/FTLThunks.h:
+        * interpreter/CallFrame.h:
+        (JSC::CallSiteIndex::CallSiteIndex):
+        (JSC::CallSiteIndex::operator bool):
+        (JSC::CallSiteIndex::bits):
+        * jit/CCallHelpers.h:
+        (JSC::CCallHelpers::setupArgument):
+        (JSC::CCallHelpers::setupArgumentsWithExecState):
+        * jit/JITOperations.cpp:
+
 2015-10-12  Philip Chimento  <philip.chimento@gmail.com>
 
         webkit-gtk-2.3.4 fails to link JavaScriptCore, missing symbols add_history and readline
index 187b08d..0ba180a 100644 (file)
     <ClCompile Include="..\ftl\FTLJSCallBase.cpp" />
     <ClCompile Include="..\ftl\FTLJSCallVarargs.cpp" />
     <ClCompile Include="..\ftl\FTLJSTailCall.cpp" />
+    <ClCompile Include="..\ftl\FTLLazySlowPath.cpp" />
     <ClCompile Include="..\ftl\FTLLink.cpp" />
     <ClCompile Include="..\ftl\FTLLocation.cpp" />
     <ClCompile Include="..\ftl\FTLLowerDFGToLLVM.cpp" />
     <ClInclude Include="..\ftl\FTLJSCallBase.h" />
     <ClInclude Include="..\ftl\FTLJSCallVarargs.h" />
     <ClInclude Include="..\ftl\FTLJSTailCall.h" />
+    <ClInclude Include="..\ftl\FTLLazySlowPath.h" />
+    <ClInclude Include="..\ftl\FTLLazySlowPathCall.h" />
     <ClInclude Include="..\ftl\FTLLink.h" />
     <ClInclude Include="..\ftl\FTLLocation.h" />
     <ClInclude Include="..\ftl\FTLLowerDFGToLLVM.h" />
index d32c23a..ee685c4 100644 (file)
                0FB17662196B8F9E0091052A /* DFGPureValue.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0FB1765E196B8F9E0091052A /* DFGPureValue.cpp */; };
                0FB17663196B8F9E0091052A /* DFGPureValue.h in Headers */ = {isa = PBXBuildFile; fileRef = 0FB1765F196B8F9E0091052A /* DFGPureValue.h */; };
                0FB438A319270B1D00E1FBC9 /* StructureSet.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0FB438A219270B1D00E1FBC9 /* StructureSet.cpp */; };
+               0FB4FB731BC843140025CA5A /* FTLLazySlowPath.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0FB4FB701BC843140025CA5A /* FTLLazySlowPath.cpp */; };
+               0FB4FB741BC843140025CA5A /* FTLLazySlowPath.h in Headers */ = {isa = PBXBuildFile; fileRef = 0FB4FB711BC843140025CA5A /* FTLLazySlowPath.h */; };
+               0FB4FB751BC843140025CA5A /* FTLLazySlowPathCall.h in Headers */ = {isa = PBXBuildFile; fileRef = 0FB4FB721BC843140025CA5A /* FTLLazySlowPathCall.h */; };
                0FB5467714F59B5C002C2989 /* LazyOperandValueProfile.h in Headers */ = {isa = PBXBuildFile; fileRef = 0FB5467614F59AD1002C2989 /* LazyOperandValueProfile.h */; settings = {ATTRIBUTES = (Private, ); }; };
                0FB5467914F5C46B002C2989 /* LazyOperandValueProfile.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0FB5467814F5C468002C2989 /* LazyOperandValueProfile.cpp */; };
                0FB5467B14F5C7E1002C2989 /* MethodOfGettingAValueProfile.h in Headers */ = {isa = PBXBuildFile; fileRef = 0FB5467A14F5C7D4002C2989 /* MethodOfGettingAValueProfile.h */; settings = {ATTRIBUTES = (Private, ); }; };
                0FB4B51F16B62772003F696B /* DFGNodeAllocator.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = DFGNodeAllocator.h; path = dfg/DFGNodeAllocator.h; sourceTree = "<group>"; };
                0FB4B52116B6278D003F696B /* FunctionExecutableDump.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = FunctionExecutableDump.cpp; sourceTree = "<group>"; };
                0FB4B52216B6278D003F696B /* FunctionExecutableDump.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = FunctionExecutableDump.h; sourceTree = "<group>"; };
+               0FB4FB701BC843140025CA5A /* FTLLazySlowPath.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = FTLLazySlowPath.cpp; path = ftl/FTLLazySlowPath.cpp; sourceTree = "<group>"; };
+               0FB4FB711BC843140025CA5A /* FTLLazySlowPath.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = FTLLazySlowPath.h; path = ftl/FTLLazySlowPath.h; sourceTree = "<group>"; };
+               0FB4FB721BC843140025CA5A /* FTLLazySlowPathCall.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = FTLLazySlowPathCall.h; path = ftl/FTLLazySlowPathCall.h; sourceTree = "<group>"; };
                0FB5467614F59AD1002C2989 /* LazyOperandValueProfile.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = LazyOperandValueProfile.h; sourceTree = "<group>"; };
                0FB5467814F5C468002C2989 /* LazyOperandValueProfile.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = LazyOperandValueProfile.cpp; sourceTree = "<group>"; };
                0FB5467A14F5C7D4002C2989 /* MethodOfGettingAValueProfile.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = MethodOfGettingAValueProfile.h; sourceTree = "<group>"; };
                                0FD120321A8C85BD000F5280 /* FTLJSCallVarargs.h */,
                                62774DA81B8D4B190006F05A /* FTLJSTailCall.cpp */,
                                62774DA91B8D4B190006F05A /* FTLJSTailCall.h */,
+                               0FB4FB701BC843140025CA5A /* FTLLazySlowPath.cpp */,
+                               0FB4FB711BC843140025CA5A /* FTLLazySlowPath.h */,
+                               0FB4FB721BC843140025CA5A /* FTLLazySlowPathCall.h */,
                                0F8F2B93172E049E007DBDA5 /* FTLLink.cpp */,
                                0F8F2B94172E049E007DBDA5 /* FTLLink.h */,
                                0FCEFADD180738C000472CE4 /* FTLLocation.cpp */,
                                0F63947815DCE34B006A597C /* DFGStructureAbstractValue.h in Headers */,
                                0F50AF3C193E8B3900674EE8 /* DFGStructureClobberState.h in Headers */,
                                0F79085619A290B200F6310C /* DFGStructureRegistrationPhase.h in Headers */,
+                               0FB4FB751BC843140025CA5A /* FTLLazySlowPathCall.h in Headers */,
                                0F2FCCFF18A60070001A27F8 /* DFGThreadData.h in Headers */,
                                0FC097A2146B28CC00CF2442 /* DFGThunks.h in Headers */,
                                0FD8A32817D51F5700CA2C40 /* DFGTierUpCheckInjectionPhase.h in Headers */,
                                7C184E1B17BEDBD3007CB63A /* JSPromise.h in Headers */,
                                7C184E2317BEE240007CB63A /* JSPromiseConstructor.h in Headers */,
                                7C008CDB187124BB00955C24 /* JSPromiseDeferred.h in Headers */,
+                               0FB4FB741BC843140025CA5A /* FTLLazySlowPath.h in Headers */,
                                7C184E1F17BEE22E007CB63A /* JSPromisePrototype.h in Headers */,
                                2A05ABD61961DF2400341750 /* JSPropertyNameEnumerator.h in Headers */,
                                E3EF88751B66DF23003F26CB /* JSPropertyNameIterator.h in Headers */,
                                14469DEB107EC7E700650446 /* StringConstructor.cpp in Sources */,
                                70EC0EC61AA0D7DA00B6AAFA /* StringIteratorPrototype.cpp in Sources */,
                                14469DEC107EC7E700650446 /* StringObject.cpp in Sources */,
+                               0FB4FB731BC843140025CA5A /* FTLLazySlowPath.cpp in Sources */,
                                14469DED107EC7E700650446 /* StringPrototype.cpp in Sources */,
                                9335F24D12E6765B002B5553 /* StringRecursionChecker.cpp in Sources */,
                                BCDE3B430E6C832D001453A7 /* Structure.cpp in Sources */,
index 25d34a1..8ad5f6d 100644 (file)
@@ -33,6 +33,7 @@
 #include "Options.h"
 #include <wtf/CryptographicallyRandomNumber.h>
 #include <wtf/Noncopyable.h>
+#include <wtf/SharedTask.h>
 #include <wtf/WeakRandom.h>
 
 #if ENABLE(ASSEMBLER)
@@ -1004,6 +1005,17 @@ public:
         AssemblerType::replaceWithAddressComputation(label.dataLocation());
     }
 
+    void addLinkTask(RefPtr<SharedTask<void(LinkBuffer&)>> task)
+    {
+        m_linkTasks.append(task);
+    }
+
+    template<typename Functor>
+    void addLinkTask(const Functor& functor)
+    {
+        m_linkTasks.append(createSharedTask<void(LinkBuffer&)>(functor));
+    }
+
 protected:
     AbstractMacroAssembler()
         : m_randomSource(cryptographicallyRandomNumber())
@@ -1099,10 +1111,9 @@ protected:
 
     unsigned m_tempRegistersValidBits;
 
-    friend class LinkBuffer;
-
-private:
+    Vector<RefPtr<SharedTask<void(LinkBuffer&)>>> m_linkTasks;
 
+    friend class LinkBuffer;
 }; // class AbstractMacroAssembler
 
 } // namespace JSC
index 6df151b..0bc71f7 100644 (file)
@@ -193,6 +193,8 @@ void LinkBuffer::linkCode(MacroAssembler& macroAssembler, void* ownerUID, JITCom
 #elif CPU(ARM64)
     copyCompactAndLinkCode<uint32_t>(macroAssembler, ownerUID, effort);
 #endif
+
+    m_linkTasks = WTF::move(macroAssembler.m_linkTasks);
 }
 
 void LinkBuffer::allocate(size_t initialSize, void* ownerUID, JITCompilationEffort effort)
@@ -224,6 +226,9 @@ void LinkBuffer::shrink(size_t newSize)
 
 void LinkBuffer::performFinalization()
 {
+    for (auto& task : m_linkTasks)
+        task->run(*this);
+
 #ifndef NDEBUG
     ASSERT(!isCompilationThread());
     ASSERT(!m_completed);
index b34a6f1..8517ab1 100644 (file)
@@ -259,6 +259,8 @@ public:
     bool wasAlreadyDisassembled() const { return m_alreadyDisassembled; }
     void didAlreadyDisassemble() { m_alreadyDisassembled = true; }
 
+    VM& vm() { return *m_vm; }
+
 private:
 #if ENABLE(BRANCH_COMPACTION)
     int executableOffsetFor(int location)
@@ -315,6 +317,7 @@ private:
     bool m_completed;
 #endif
     bool m_alreadyDisassembled { false };
+    Vector<RefPtr<SharedTask<void(LinkBuffer&)>>> m_linkTasks;
 };
 
 #define FINALIZE_CODE_IF(condition, linkBufferReference, dataLogFArgumentsForHeading)  \
index c944e81..13c2b8d 100644 (file)
@@ -74,7 +74,7 @@ struct CodeOrigin {
     }
     
     bool isSet() const { return bytecodeIndex != invalidBytecodeIndex; }
-    bool operator!() const { return !isSet(); }
+    explicit operator bool() const { return isSet(); }
     
     bool isHashTableDeletedValue() const
     {
index eee4fab..df47213 100644 (file)
@@ -333,7 +333,7 @@ static void fixFunctionBasedOnStackMaps(
 {
     Graph& graph = state.graph;
     VM& vm = graph.m_vm;
-    StackMaps stackmaps = jitCode->stackmaps;
+    StackMaps& stackmaps = jitCode->stackmaps;
     
     int localsOffset = offsetOfStackRegion(recordMap, state.capturedStackmapID) + graph.m_nextMachineLocal;
     int varargsSpillSlotsOffset = offsetOfStackRegion(recordMap, state.varargsSpillSlotsStackmapID);
@@ -439,7 +439,10 @@ static void fixFunctionBasedOnStackMaps(
         state.finalizer->exitThunksLinkBuffer = WTF::move(linkBuffer);
     }
 
-    if (!state.getByIds.isEmpty() || !state.putByIds.isEmpty() || !state.checkIns.isEmpty()) {
+    if (!state.getByIds.isEmpty()
+        || !state.putByIds.isEmpty()
+        || !state.checkIns.isEmpty()
+        || !state.lazySlowPaths.isEmpty()) {
         CCallHelpers slowPathJIT(&vm, codeBlock);
         
         CCallHelpers::JumpList exceptionTarget;
@@ -473,7 +476,8 @@ static void fixFunctionBasedOnStackMaps(
 
                 MacroAssembler::Call call = callOperation(
                     state, usedRegisters, slowPathJIT, codeOrigin, &exceptionTarget,
-                    operationGetByIdOptimize, result, gen.stubInfo(), base, getById.uid());
+                    operationGetByIdOptimize, result, CCallHelpers::TrustedImmPtr(gen.stubInfo()),
+                    base, CCallHelpers::TrustedImmPtr(getById.uid())).call();
 
                 gen.reportSlowPathCall(begin, call);
 
@@ -511,7 +515,9 @@ static void fixFunctionBasedOnStackMaps(
                 
                 MacroAssembler::Call call = callOperation(
                     state, usedRegisters, slowPathJIT, codeOrigin, &exceptionTarget,
-                    gen.slowPathFunction(), gen.stubInfo(), value, base, putById.uid());
+                    gen.slowPathFunction(), InvalidGPRReg,
+                    CCallHelpers::TrustedImmPtr(gen.stubInfo()), value, base,
+                    CCallHelpers::TrustedImmPtr(putById.uid())).call();
                 
                 gen.reportSlowPathCall(begin, call);
                 
@@ -549,13 +555,56 @@ static void fixFunctionBasedOnStackMaps(
 
                 MacroAssembler::Call slowCall = callOperation(
                     state, usedRegisters, slowPathJIT, codeOrigin, &exceptionTarget,
-                    operationInOptimize, result, stubInfo, obj, checkIn.m_uid);
+                    operationInOptimize, result, CCallHelpers::TrustedImmPtr(stubInfo), obj,
+                    CCallHelpers::TrustedImmPtr(checkIn.uid())).call();
 
                 checkIn.m_slowPathDone.append(slowPathJIT.jump());
                 
                 checkIn.m_generators.append(CheckInGenerator(stubInfo, slowCall, begin));
             }
         }
+
+        for (unsigned i = state.lazySlowPaths.size(); i--;) {
+            LazySlowPathDescriptor& descriptor = state.lazySlowPaths[i];
+
+            if (verboseCompilationEnabled())
+                dataLog("Handling lazySlowPath stackmap #", descriptor.stackmapID(), "\n");
+
+            auto iter = recordMap.find(descriptor.stackmapID());
+            if (iter == recordMap.end()) {
+                // It was optimized out.
+                continue;
+            }
+
+            for (unsigned i = 0; i < iter->value.size(); ++i) {
+                StackMaps::Record& record = iter->value[i];
+                RegisterSet usedRegisters = usedRegistersFor(record);
+                Vector<Location> locations;
+                for (auto location : record.locations)
+                    locations.append(Location::forStackmaps(&stackmaps, location));
+
+                char* startOfIC =
+                    bitwise_cast<char*>(generatedFunction) + record.instructionOffset;
+                CodeLocationLabel patchpoint((MacroAssemblerCodePtr(startOfIC)));
+                CodeLocationLabel exceptionTarget =
+                    state.finalizer->handleExceptionsLinkBuffer->entrypoint();
+
+                std::unique_ptr<LazySlowPath> lazySlowPath = std::make_unique<LazySlowPath>(
+                    patchpoint, exceptionTarget, usedRegisters, descriptor.callSiteIndex(),
+                    descriptor.m_linker->run(locations));
+
+                CCallHelpers::Label begin = slowPathJIT.label();
+
+                slowPathJIT.pushToSaveImmediateWithoutTouchingRegisters(
+                    CCallHelpers::TrustedImm32(state.jitCode->lazySlowPaths.size()));
+                CCallHelpers::Jump generatorJump = slowPathJIT.jump();
+                
+                descriptor.m_generators.append(std::make_tuple(lazySlowPath.get(), begin));
+
+                state.jitCode->lazySlowPaths.append(WTF::move(lazySlowPath));
+                state.finalizer->lazySlowPathGeneratorJumps.append(generatorJump);
+            }
+        }
         
         exceptionTarget.link(&slowPathJIT);
         MacroAssembler::Jump exceptionJump = slowPathJIT.jump();
@@ -578,12 +627,19 @@ static void fixFunctionBasedOnStackMaps(
                 state, codeBlock, generatedFunction, recordMap, state.putByIds[i],
                 sizeOfPutById());
         }
-
         for (unsigned i = state.checkIns.size(); i--;) {
             generateCheckInICFastPath(
                 state, codeBlock, generatedFunction, recordMap, state.checkIns[i],
                 sizeOfIn()); 
-        } 
+        }
+        for (unsigned i = state.lazySlowPaths.size(); i--;) {
+            LazySlowPathDescriptor& lazySlowPath = state.lazySlowPaths[i];
+            for (auto& tuple : lazySlowPath.m_generators) {
+                MacroAssembler::replaceWithJump(
+                    std::get<0>(tuple)->patchpoint(),
+                    state.finalizer->sideCodeLinkBuffer->locationOf(std::get<1>(tuple)));
+            }
+        }
     }
     
     adjustCallICsForStackmaps(state.jsCalls, recordMap);
index c6ec708..d0e965f 100644
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2013, 2014 Apple Inc. All rights reserved.
+ * Copyright (C) 2013-2015 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
 #if ENABLE(FTL_JIT)
 
 #include "CodeOrigin.h"
+#include "FTLLazySlowPath.h"
 #include "JITInlineCacheGenerator.h"
 #include "MacroAssembler.h"
 #include <wtf/text/UniquedStringImpl.h>
 
 namespace JSC { namespace FTL {
 
+class Location;
+
 class InlineCacheDescriptor {
 public:
     InlineCacheDescriptor() 
@@ -113,17 +116,46 @@ class CheckInDescriptor : public InlineCacheDescriptor {
 public:
     CheckInDescriptor() { }
     
-    CheckInDescriptor(unsigned stackmapID, CallSiteIndex callSite, const UniquedStringImpl* uid)
-        : InlineCacheDescriptor(stackmapID, callSite, nullptr)
-        , m_uid(uid)
+    CheckInDescriptor(unsigned stackmapID, CallSiteIndex callSite, UniquedStringImpl* uid)
+        : InlineCacheDescriptor(stackmapID, callSite, uid)
     {
     }
-
     
-    const UniquedStringImpl* m_uid;
     Vector<CheckInGenerator> m_generators;
 };
 
+// You can create a lazy slow path call in lowerDFGToLLVM by doing:
+// m_ftlState.lazySlowPaths.append(
+//     LazySlowPathDescriptor(
+//         stackmapID, callSiteIndex,
+//         createSharedTask<RefPtr<LazySlowPath::Generator>(const Vector<Location>&)>(
+//             [] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+//                 // This lambda should just record the registers that we will be using, and return
+//                 // a SharedTask that will actually generate the slow path.
+//                 return createLazyCallGenerator(
+//                     function, locations[0].directGPR(), locations[1].directGPR());
+//             })));
+//
+// Usually, you can use the LowerDFGToLLVM::lazySlowPath() helper, which takes care of the descriptor
+// for you and also creates the patchpoint.
+typedef RefPtr<LazySlowPath::Generator> LazySlowPathLinkerFunction(const Vector<Location>&);
+typedef SharedTask<LazySlowPathLinkerFunction> LazySlowPathLinkerTask;
+class LazySlowPathDescriptor : public InlineCacheDescriptor {
+public:
+    LazySlowPathDescriptor() { }
+
+    LazySlowPathDescriptor(
+        unsigned stackmapID, CallSiteIndex callSite,
+        RefPtr<LazySlowPathLinkerTask> linker)
+        : InlineCacheDescriptor(stackmapID, callSite, nullptr)
+        , m_linker(linker)
+    {
+    }
+
+    Vector<std::tuple<LazySlowPath*, CCallHelpers::Label>> m_generators;
+
+    RefPtr<LazySlowPathLinkerTask> m_linker;
+};
 
 } } // namespace JSC::FTL
 
index 5b50568..38f301e 100644
@@ -30,6 +30,7 @@
 
 #include "DFGCommonData.h"
 #include "FTLDataSection.h"
+#include "FTLLazySlowPath.h"
 #include "FTLOSRExit.h"
 #include "FTLStackMaps.h"
 #include "FTLUnwindInfo.h"
@@ -86,6 +87,7 @@ public:
     DFG::CommonData common;
     SegmentedVector<OSRExit, 8> osrExit;
     StackMaps stackmaps;
+    Vector<std::unique_ptr<LazySlowPath>> lazySlowPaths;
     
 private:
     Vector<RefPtr<DataSection>> m_dataSections;
index 0aba6a8..0cc31d9 100644
@@ -106,11 +106,11 @@ bool JITFinalizer::finalizeFunction()
         // Side code is for special slow paths that we generate ourselves, like for inline
         // caches.
         
-        for (unsigned i = slowPathCalls.size(); i--;) {
-            SlowPathCall& call = slowPathCalls[i];
+        for (CCallHelpers::Jump jump : lazySlowPathGeneratorJumps) {
             sideCodeLinkBuffer->link(
-                call.call(),
-                CodeLocationLabel(m_plan.vm.ftlThunks->getSlowPathCallThunk(m_plan.vm, call.key()).code()));
+                jump,
+                CodeLocationLabel(
+                    m_plan.vm.getCTIStub(lazySlowPathGenerationThunkGenerator).code()));
         }
         
         jitCode->addHandle(FINALIZE_DFG_CODE(
index a1e0d03..d01dc80 100644
@@ -65,8 +65,8 @@ public:
     std::unique_ptr<LinkBuffer> sideCodeLinkBuffer;
     std::unique_ptr<LinkBuffer> handleExceptionsLinkBuffer;
     Vector<OutOfLineCodeInfo> outOfLineCodeInfos;
-    Vector<SlowPathCall> slowPathCalls; // Calls inside the side code.
     Vector<OSRExitCompilationInfo> osrExit;
+    Vector<CCallHelpers::Jump> lazySlowPathGeneratorJumps;
     GeneratedFunction function;
     RefPtr<JITCode> jitCode;
 };
diff --git a/Source/JavaScriptCore/ftl/FTLLazySlowPath.cpp b/Source/JavaScriptCore/ftl/FTLLazySlowPath.cpp
new file mode 100644
index 0000000..06b7e66
--- /dev/null
@@ -0,0 +1,74 @@
+/*
+ * Copyright (C) 2015 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+#include "config.h"
+#include "FTLLazySlowPath.h"
+
+#include "FTLSlowPathCall.h"
+#include "LinkBuffer.h"
+
+namespace JSC { namespace FTL {
+
+LazySlowPath::LazySlowPath(
+    CodeLocationLabel patchpoint, CodeLocationLabel exceptionTarget,
+    const RegisterSet& usedRegisters, CallSiteIndex callSiteIndex, RefPtr<Generator> generator)
+    : m_patchpoint(patchpoint)
+    , m_exceptionTarget(exceptionTarget)
+    , m_usedRegisters(usedRegisters)
+    , m_callSiteIndex(callSiteIndex)
+    , m_generator(generator)
+{
+}
+
+LazySlowPath::~LazySlowPath()
+{
+}
+
+void LazySlowPath::generate(CodeBlock* codeBlock)
+{
+    RELEASE_ASSERT(!m_stub);
+
+    VM& vm = *codeBlock->vm();
+
+    CCallHelpers jit(&vm, codeBlock);
+    GenerationParams params;
+    CCallHelpers::JumpList exceptionJumps;
+    params.exceptionJumps = m_exceptionTarget ? &exceptionJumps : nullptr;
+    params.lazySlowPath = this;
+    m_generator->run(jit, params);
+
+    LinkBuffer linkBuffer(vm, jit, codeBlock, JITCompilationMustSucceed);
+    linkBuffer.link(
+        params.doneJumps, m_patchpoint.labelAtOffset(MacroAssembler::maxJumpReplacementSize()));
+    if (m_exceptionTarget)
+        linkBuffer.link(exceptionJumps, m_exceptionTarget);
+    m_stub = FINALIZE_CODE_FOR(codeBlock, linkBuffer, ("Lazy slow path call stub"));
+
+    MacroAssembler::replaceWithJump(m_patchpoint, CodeLocationLabel(m_stub.code()));
+}
+
+} } // namespace JSC::FTL
+
+
diff --git a/Source/JavaScriptCore/ftl/FTLLazySlowPath.h b/Source/JavaScriptCore/ftl/FTLLazySlowPath.h
new file mode 100644
index 0000000..e7184fb
--- /dev/null
@@ -0,0 +1,91 @@
+/*
+ * Copyright (C) 2015 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+#ifndef FTLLazySlowPath_h
+#define FTLLazySlowPath_h
+
+#include "CCallHelpers.h"
+#include "CodeBlock.h"
+#include "CodeLocation.h"
+#include "GPRInfo.h"
+#include "MacroAssemblerCodeRef.h"
+#include "RegisterSet.h"
+#include <wtf/SharedTask.h>
+
+namespace JSC { namespace FTL {
+
+// A LazySlowPath represents a gap in FTL-generated code that will be filled in lazily, the first
+// time it executes. It holds all of the information needed to generate that code, such as where to
+// link jumps to and which registers are in use. It also holds a reference to a SharedTask that does
+// the actual code generation. That SharedTask may carry additional data, like which registers hold
+// the inputs or outputs.
+class LazySlowPath {
+    WTF_MAKE_NONCOPYABLE(LazySlowPath);
+    WTF_MAKE_FAST_ALLOCATED;
+public:
+    struct GenerationParams {
+        // Extra parameters to the GeneratorFunction are made into fields of this struct, so that if
+        // we add new parameters, we don't have to change all of the users.
+        CCallHelpers::JumpList doneJumps;
+        CCallHelpers::JumpList* exceptionJumps;
+        LazySlowPath* lazySlowPath;
+    };
+
+    typedef void GeneratorFunction(CCallHelpers&, GenerationParams&);
+    typedef SharedTask<GeneratorFunction> Generator;
+
+    template<typename Functor>
+    static RefPtr<Generator> createGenerator(const Functor& functor)
+    {
+        return createSharedTask<GeneratorFunction>(functor);
+    }
+    
+    LazySlowPath(
+        CodeLocationLabel patchpoint, CodeLocationLabel exceptionTarget,
+        const RegisterSet& usedRegisters, CallSiteIndex callSiteIndex, RefPtr<Generator>);
+
+    ~LazySlowPath();
+
+    CodeLocationLabel patchpoint() const { return m_patchpoint; }
+    const RegisterSet& usedRegisters() const { return m_usedRegisters; }
+    CallSiteIndex callSiteIndex() const { return m_callSiteIndex; }
+
+    void generate(CodeBlock*);
+
+    MacroAssemblerCodeRef stub() const { return m_stub; }
+
+private:
+    CodeLocationLabel m_patchpoint;
+    CodeLocationLabel m_exceptionTarget;
+    RegisterSet m_usedRegisters;
+    CallSiteIndex m_callSiteIndex;
+    MacroAssemblerCodeRef m_stub;
+    RefPtr<Generator> m_generator;
+};
+
+} } // namespace JSC::FTL
+
+#endif // FTLLazySlowPath_h
+
diff --git a/Source/JavaScriptCore/ftl/FTLLazySlowPathCall.h b/Source/JavaScriptCore/ftl/FTLLazySlowPathCall.h
new file mode 100644
index 0000000..2b4d804
--- /dev/null
@@ -0,0 +1,56 @@
+/*
+ * Copyright (C) 2015 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+#ifndef FTLLazySlowPathCall_h
+#define FTLLazySlowPathCall_h
+
+#include "CodeBlock.h"
+#include "CodeLocation.h"
+#include "FTLLazySlowPath.h"
+#include "FTLSlowPathCall.h"
+#include "FTLThunks.h"
+#include "GPRInfo.h"
+#include "MacroAssemblerCodeRef.h"
+#include "RegisterSet.h"
+
+namespace JSC { namespace FTL {
+
+template<typename ResultType, typename... ArgumentTypes>
+RefPtr<LazySlowPath::Generator> createLazyCallGenerator(
+    FunctionPtr function, ResultType result, ArgumentTypes... arguments)
+{
+    return LazySlowPath::createGenerator(
+        [=] (CCallHelpers& jit, LazySlowPath::GenerationParams& params) {
+            callOperation(
+                params.lazySlowPath->usedRegisters(), jit, params.lazySlowPath->callSiteIndex(),
+                params.exceptionJumps, function, result, arguments...);
+            params.doneJumps.append(jit.jump());
+        });
+}
+
+} } // namespace JSC::FTL
+
+#endif // FTLLazySlowPathCall_h
+
index a7a69af..1cdd5cd 100644
@@ -39,6 +39,7 @@
 #include "FTLForOSREntryJITCode.h"
 #include "FTLFormattedValue.h"
 #include "FTLInlineCacheSize.h"
+#include "FTLLazySlowPathCall.h"
 #include "FTLLoweredNodeValue.h"
 #include "FTLOperations.h"
 #include "FTLOutput.h"
@@ -50,6 +51,7 @@
 #include "OperandsInlines.h"
 #include "ScopedArguments.h"
 #include "ScopedArgumentsTable.h"
+#include "ScratchRegisterAllocator.h"
 #include "VirtualRegister.h"
 #include "Watchdog.h"
 #include <atomic>
@@ -3159,9 +3161,15 @@ private:
         m_out.jump(continuation);
         
         m_out.appendTo(slowPath, continuation);
-        LValue callResult = vmCall(
-            m_out.operation(operationCreateActivationDirect), m_callFrame, weakPointer(structure),
-            scope, weakPointer(table), m_out.constInt64(JSValue::encode(initializationValue)));
+        LValue callResult = lazySlowPath(
+            [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+                return createLazyCallGenerator(
+                    operationCreateActivationDirect, locations[0].directGPR(),
+                    CCallHelpers::TrustedImmPtr(structure), locations[1].directGPR(),
+                    CCallHelpers::TrustedImmPtr(table),
+                    CCallHelpers::TrustedImm64(JSValue::encode(initializationValue)));
+            },
+            scope);
         ValueFromBlock slowResult = m_out.anchor(callResult);
         m_out.jump(continuation);
         
@@ -3215,11 +3223,25 @@ private:
         m_out.jump(continuation);
         
         m_out.appendTo(slowPath, continuation);
-        
-        LValue callResult = isArrowFunction
-            ? vmCall(m_out.operation(operationNewArrowFunctionWithInvalidatedReallocationWatchpoint), m_callFrame, scope, weakPointer(executable), thisValue)
-            : vmCall(m_out.operation(operationNewFunctionWithInvalidatedReallocationWatchpoint), m_callFrame, scope, weakPointer(executable));
-        
+
+        Vector<LValue> slowPathArguments;
+        slowPathArguments.append(scope);
+        if (isArrowFunction)
+            slowPathArguments.append(thisValue);
+        LValue callResult = lazySlowPath(
+            [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+                if (isArrowFunction) {
+                    return createLazyCallGenerator(
+                        operationNewArrowFunctionWithInvalidatedReallocationWatchpoint,
+                        locations[0].directGPR(), locations[1].directGPR(),
+                        CCallHelpers::TrustedImmPtr(executable), locations[2].directGPR());
+                }
+                return createLazyCallGenerator(
+                    operationNewFunctionWithInvalidatedReallocationWatchpoint,
+                    locations[0].directGPR(), locations[1].directGPR(),
+                    CCallHelpers::TrustedImmPtr(executable));
+            },
+            slowPathArguments);
         ValueFromBlock slowResult = m_out.anchor(callResult);
         m_out.jump(continuation);
         
@@ -3271,9 +3293,13 @@ private:
         m_out.jump(continuation);
         
         m_out.appendTo(slowPath, continuation);
-        LValue callResult = vmCall(
-            m_out.operation(operationCreateDirectArguments), m_callFrame, weakPointer(structure),
-            length.value, m_out.constInt32(minCapacity));
+        LValue callResult = lazySlowPath(
+            [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+                return createLazyCallGenerator(
+                    operationCreateDirectArguments, locations[0].directGPR(),
+                    CCallHelpers::TrustedImmPtr(structure), locations[1].directGPR(),
+                    CCallHelpers::TrustedImm32(minCapacity));
+            }, length.value);
         ValueFromBlock slowResult = m_out.anchor(callResult);
         m_out.jump(continuation);
         
@@ -3563,9 +3589,14 @@ private:
             m_out.appendTo(slowCase, continuation);
             LValue structureValue = m_out.phi(
                 m_out.intPtr, largeStructure, failStructure);
-            ValueFromBlock slowResult = m_out.anchor(vmCall(
-                m_out.operation(operationNewArrayWithSize),
-                m_callFrame, structureValue, publicLength));
+            LValue slowResultValue = lazySlowPath(
+                [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+                    return createLazyCallGenerator(
+                        operationNewArrayWithSize, locations[0].directGPR(),
+                        locations[1].directGPR(), locations[2].directGPR());
+                },
+                structureValue, publicLength);
+            ValueFromBlock slowResult = m_out.anchor(slowResultValue);
             m_out.jump(continuation);
             
             m_out.appendTo(continuation, lastNext);
@@ -3762,20 +3793,29 @@ private:
         m_out.jump(continuation);
         
         m_out.appendTo(slowPath, continuation);
-        ValueFromBlock slowResult;
+        LValue slowResultValue;
         switch (numKids) {
         case 2:
-            slowResult = m_out.anchor(vmCall(
-                m_out.operation(operationMakeRope2), m_callFrame, kids[0], kids[1]));
+            slowResultValue = lazySlowPath(
+                [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+                    return createLazyCallGenerator(
+                        operationMakeRope2, locations[0].directGPR(), locations[1].directGPR(),
+                        locations[2].directGPR());
+                }, kids[0], kids[1]);
             break;
         case 3:
-            slowResult = m_out.anchor(vmCall(
-                m_out.operation(operationMakeRope3), m_callFrame, kids[0], kids[1], kids[2]));
+            slowResultValue = lazySlowPath(
+                [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+                    return createLazyCallGenerator(
+                        operationMakeRope3, locations[0].directGPR(), locations[1].directGPR(),
+                        locations[2].directGPR(), locations[3].directGPR());
+                }, kids[0], kids[1], kids[2]);
             break;
         default:
             DFG_CRASH(m_graph, m_node, "Bad number of children");
             break;
         }
+        ValueFromBlock slowResult = m_out.anchor(slowResultValue);
         m_out.jump(continuation);
         
         m_out.appendTo(continuation, lastNext);
@@ -4132,7 +4172,11 @@ private:
         
         LBasicBlock lastNext = m_out.appendTo(isNotInvalidated, continuation);
 
-        vmCall(m_out.operation(operationNotifyWrite), m_callFrame, m_out.constIntPtr(set));
+        lazySlowPath(
+            [=] (const Vector<Location>&) -> RefPtr<LazySlowPath::Generator> {
+                return createLazyCallGenerator(
+                    operationNotifyWrite, InvalidGPRReg, CCallHelpers::TrustedImmPtr(set));
+            });
         m_out.jump(continuation);
         
         m_out.appendTo(continuation, lastNext);
@@ -4978,10 +5022,13 @@ private:
             rarely(slowPath), usually(continuation));
         
         m_out.appendTo(slowPath, notCellCase);
-        LValue slowResultValue = vmCall(
-            m_out.operation(operationObjectIsObject), m_callFrame, weakPointer(globalObject),
-            value);
-        ValueFromBlock slowResult = m_out.anchor(m_out.notNull(slowResultValue));
+        LValue slowResultValue = lazySlowPath(
+            [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+                return createLazyCallGenerator(
+                    operationObjectIsObject, locations[0].directGPR(),
+                    CCallHelpers::TrustedImmPtr(globalObject), locations[1].directGPR());
+            }, value);
+        ValueFromBlock slowResult = m_out.anchor(m_out.notZero64(slowResultValue));
         m_out.jump(continuation);
         
         m_out.appendTo(notCellCase, continuation);
@@ -5025,9 +5072,12 @@ private:
             rarely(slowPath), usually(continuation));
         
         m_out.appendTo(slowPath, continuation);
-        LValue slowResultValue = vmCall(
-            m_out.operation(operationObjectIsFunction), m_callFrame, weakPointer(globalObject),
-            value);
+        LValue slowResultValue = lazySlowPath(
+            [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+                return createLazyCallGenerator(
+                    operationObjectIsFunction, locations[0].directGPR(),
+                    CCallHelpers::TrustedImmPtr(globalObject), locations[1].directGPR());
+            }, value);
         ValueFromBlock slowResult = m_out.anchor(m_out.notNull(slowResultValue));
         m_out.jump(continuation);
         
@@ -5066,7 +5116,7 @@ private:
         if (JSString* string = m_node->child1()->dynamicCastConstant<JSString*>()) {
             if (string->tryGetValueImpl() && string->tryGetValueImpl()->isAtomic()) {
 
-                const auto str = static_cast<const AtomicStringImpl*>(string->tryGetValueImpl());
+                UniquedStringImpl* str = bitwise_cast<UniquedStringImpl*>(string->tryGetValueImpl());
                 unsigned stackmapID = m_stackmapIDs++;
             
                 LValue call = m_out.call(
@@ -5460,12 +5510,16 @@ private:
                 m_out.jump(continuation);
                 
                 m_out.appendTo(slowPath, continuation);
-                
-                ValueFromBlock slowObject = m_out.anchor(vmCall(
-                    m_out.operation(operationNewObjectWithButterfly),
-                    m_callFrame, m_out.constIntPtr(structure)));
+
+                LValue slowObjectValue = lazySlowPath(
+                    [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+                        return createLazyCallGenerator(
+                            operationNewObjectWithButterfly, locations[0].directGPR(),
+                            CCallHelpers::TrustedImmPtr(structure));
+                    });
+                ValueFromBlock slowObject = m_out.anchor(slowObjectValue);
                 ValueFromBlock slowButterfly = m_out.anchor(
-                    m_out.loadPtr(slowObject.value(), m_heaps.JSObject_butterfly));
+                    m_out.loadPtr(slowObjectValue, m_heaps.JSObject_butterfly));
                 
                 m_out.jump(continuation);
                 
@@ -5537,9 +5591,14 @@ private:
         // because all fields will be overwritten.
         // FIXME: It may be worth creating an operation that calls a constructor on JSLexicalEnvironment that 
         // doesn't initialize every slot because we are guaranteed to do that here.
-        LValue callResult = vmCall(
-            m_out.operation(operationCreateActivationDirect), m_callFrame, weakPointer(structure),
-            scope, weakPointer(table), m_out.constInt64(JSValue::encode(jsUndefined())));
+        LValue callResult = lazySlowPath(
+            [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+                return createLazyCallGenerator(
+                    operationCreateActivationDirect, locations[0].directGPR(),
+                    CCallHelpers::TrustedImmPtr(structure), locations[1].directGPR(),
+                    CCallHelpers::TrustedImmPtr(table),
+                    CCallHelpers::TrustedImm64(JSValue::encode(jsUndefined())));
+            }, scope);
         ValueFromBlock slowResult =  m_out.anchor(callResult);
         m_out.jump(continuation);
 
@@ -5581,7 +5640,10 @@ private:
 
         LBasicBlock lastNext = m_out.appendTo(timerDidFire, continuation);
 
-        vmCall(m_out.operation(operationHandleWatchdogTimer), m_callFrame);
+        lazySlowPath(
+            [=] (const Vector<Location>&) -> RefPtr<LazySlowPath::Generator> {
+                return createLazyCallGenerator(operationHandleWatchdogTimer, InvalidGPRReg);
+            });
         m_out.jump(continuation);
         
         m_out.appendTo(continuation, lastNext);
@@ -6015,13 +6077,19 @@ private:
         
         LValue slowButterflyValue;
         if (sizeInValues == initialOutOfLineCapacity) {
-            slowButterflyValue = vmCall(
-                m_out.operation(operationAllocatePropertyStorageWithInitialCapacity),
-                m_callFrame);
+            slowButterflyValue = lazySlowPath(
+                [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+                    return createLazyCallGenerator(
+                        operationAllocatePropertyStorageWithInitialCapacity,
+                        locations[0].directGPR());
+                });
         } else {
-            slowButterflyValue = vmCall(
-                m_out.operation(operationAllocatePropertyStorage),
-                m_callFrame, m_out.constIntPtr(sizeInValues));
+            slowButterflyValue = lazySlowPath(
+                [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+                    return createLazyCallGenerator(
+                        operationAllocatePropertyStorage, locations[0].directGPR(),
+                        CCallHelpers::TrustedImmPtr(sizeInValues));
+                });
         }
         ValueFromBlock slowButterfly = m_out.anchor(slowButterflyValue);
         
@@ -6303,9 +6371,14 @@ private:
         m_out.jump(continuation);
         
         m_out.appendTo(slowPath, continuation);
-        
-        ValueFromBlock slowResult = m_out.anchor(vmCall(
-            m_out.operation(operationNewObject), m_callFrame, m_out.constIntPtr(structure)));
+
+        LValue slowResultValue = lazySlowPath(
+            [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+                return createLazyCallGenerator(
+                    operationNewObject, locations[0].directGPR(),
+                    CCallHelpers::TrustedImmPtr(structure));
+            });
+        ValueFromBlock slowResult = m_out.anchor(slowResultValue);
         m_out.jump(continuation);
         
         m_out.appendTo(continuation, lastNext);
@@ -6377,10 +6450,14 @@ private:
         m_out.jump(continuation);
         
         m_out.appendTo(slowPath, continuation);
-        
-        ValueFromBlock slowArray = m_out.anchor(vmCall(
-            m_out.operation(operationNewArrayWithSize), m_callFrame,
-            m_out.constIntPtr(structure), m_out.constInt32(numElements)));
+
+        LValue slowArrayValue = lazySlowPath(
+            [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+                return createLazyCallGenerator(
+                    operationNewArrayWithSize, locations[0].directGPR(),
+                    CCallHelpers::TrustedImmPtr(structure), CCallHelpers::TrustedImm32(numElements));
+            });
+        ValueFromBlock slowArray = m_out.anchor(slowArrayValue);
         ValueFromBlock slowButterfly = m_out.anchor(
             m_out.loadPtr(slowArray.value(), m_heaps.JSObject_butterfly));
 
@@ -7049,14 +7126,17 @@ private:
         functor(TypeofType::Object);
         
         m_out.appendTo(slowPath, unreachable);
-        LValue result = vmCall(
-            m_out.operation(operationTypeOfObjectAsTypeofType), m_callFrame,
-            weakPointer(globalObject), value);
+        LValue result = lazySlowPath(
+            [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+                return createLazyCallGenerator(
+                    operationTypeOfObjectAsTypeofType, locations[0].directGPR(),
+                    CCallHelpers::TrustedImmPtr(globalObject), locations[1].directGPR());
+            }, value);
         Vector<SwitchCase, 3> cases;
         cases.append(SwitchCase(m_out.constInt32(static_cast<int32_t>(TypeofType::Undefined)), undefinedCase));
         cases.append(SwitchCase(m_out.constInt32(static_cast<int32_t>(TypeofType::Object)), reallyObjectCase));
         cases.append(SwitchCase(m_out.constInt32(static_cast<int32_t>(TypeofType::Function)), functionCase));
-        m_out.switchInstruction(result, cases, unreachable, Weight());
+        m_out.switchInstruction(m_out.castToInt32(result), cases, unreachable, Weight());
         
         m_out.appendTo(unreachable, notObjectCase);
         m_out.unreachable();
@@ -7163,6 +7243,115 @@ private:
         m_out.appendTo(continuation, lastNext);
         return m_out.phi(m_out.int32, fastResult, slowResult);
     }
+
+    // This is a mechanism for creating a code generator that fills in a gap in the code using our
+    // own MacroAssembler. This is useful for slow paths that involve a lot of code and we don't want
+    // to pay the price of LLVM optimizing it. A lazy slow path will only be generated if it actually
+    // executes. On the other hand, a lazy slow path always incurs the cost of two additional jumps.
+    // Also, the lazy slow path's register allocation state is constrained by whatever LLVM did, so
+    // you have to use a ScratchRegisterAllocator to find unused registers, and you may have to
+    // spill to the top of the stack if there aren't enough registers available.
+    //
+    // Lazy slow paths involve three different stages of execution. Each stage has unique
+    // capabilities and knowledge. The stages are:
+    //
+    // 1) DFG->LLVM lowering, i.e. code that runs in this phase. Lowering is the last time you will
+    //    have access to LValues. If there is an LValue that needs to be fed as input to a lazy slow
+    //    path, then you must pass it as an argument here (as one of the varargs arguments after the
+    //    functor). But, lowering doesn't know which registers will be used for those LValues. Hence
+    //    you pass a lambda to lazySlowPath() and that lambda will run during stage (2):
+    //
+    // 2) FTLCompile.cpp's fixFunctionBasedOnStackMaps. This is the only stage at which we know
+    //    the mapping from the arguments passed to this method in (1) to the registers that LLVM
+    //    selected for those arguments. You don't actually want to generate any code here, since then
+    //    the slow path wouldn't actually be lazily generated. Instead, you want to save the
+    //    registers being used for the arguments and defer code generation to stage (3) by creating
+    //    and returning a LazySlowPath::Generator:
+    //
+    // 3) LazySlowPath's generate() method. This code runs in response to the lazy slow path
+    //    executing for the first time. It will call the generator you created in stage (2).
+    //
+    // Note that each time you invoke stage (1), stage (2) may be invoked zero, one, or many times.
+    // Stage (2) will usually be invoked once for stage (1). But, LLVM may kill the code, in which
+    // case stage (2) won't run. LLVM may duplicate the code (for example via jump threading),
+    // leading to many calls to your stage (2) lambda. Stage (3) may be called zero or once for each
+    // stage (2). It will be called zero times if the slow path never runs. This is what you hope for
+    // whenever you use the lazySlowPath() mechanism.
+    //
+    // A typical use of lazySlowPath() will look like the example below, which just creates a slow
+    // path that adds some value to the input and returns it.
+    //
+    // // Stage (1) is here. This is your last chance to figure out which LValues to use as inputs.
+    // // Notice how we pass "input" as an argument to lazySlowPath().
+    // LValue input = ...;
+    // int addend = ...;
+    // LValue output = lazySlowPath(
+    //     [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+    //         // Stage (2) is here. This is your last chance to figure out which registers are used
+    //         // for which values. Location 0 is always the return value; you can ignore it if
+    //         // you don't want to return anything. Location 1 is the register for the first
+    //         // argument to lazySlowPath(), i.e. "input". Note that a Location could also hold
+    //         // an FPR, if you are passing a double.
+    //         GPRReg outputGPR = locations[0].directGPR();
+    //         GPRReg inputGPR = locations[1].directGPR();
+    //         return LazySlowPath::createGenerator(
+    //             [=] (CCallHelpers& jit, LazySlowPath::GenerationParams& params) {
+    //                 // Stage (3) is here. This is when you generate code. You have access to the
+    //                 // registers you collected in stage (2) because this lambda closes over those
+    //                 // variables (outputGPR and inputGPR). You also have access to whatever extra
+    //                 // data you collected in stage (1), such as the addend in this case.
+    //                 jit.add32(TrustedImm32(addend), inputGPR, outputGPR);
+    //                 // You have to end by jumping to done. There is nothing to fall through to.
+    //                 // You can also jump to the exception handler (see LazySlowPath.h for more
+    //                 // info). Note that currently you cannot OSR exit.
+    //                 params.doneJumps.append(jit.jump());
+    //             });
+    //     },
+    //     input);
+    //
+    // Note that if your slow path is only doing a call, you can use the createLazyCallGenerator()
+    // helper. For example:
+    //
+    // LValue input = ...;
+    // LValue output = lazySlowPath(
+    //     [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+    //         return createLazyCallGenerator(
+    //             operationDoThings, locations[0].directGPR(), locations[1].directGPR());
+    //     });
+    //
+    // Finally, note that all of the lambdas - both the stage (2) lambda and the stage (3) lambda -
+    // run after the function that created them returns. Hence, you should not use by-reference
+    // capture (i.e. [&]) in any of these lambdas.
+    template<typename Functor, typename... ArgumentTypes>
+    LValue lazySlowPath(const Functor& functor, ArgumentTypes... arguments)
+    {
+        return lazySlowPath(functor, Vector<LValue>{ arguments... });
+    }
+
+    template<typename Functor>
+    LValue lazySlowPath(const Functor& functor, const Vector<LValue>& userArguments)
+    {
+        unsigned stackmapID = m_stackmapIDs++;
+
+        Vector<LValue> arguments;
+        arguments.append(m_out.constInt64(stackmapID));
+        arguments.append(m_out.constInt32(MacroAssembler::maxJumpReplacementSize()));
+        arguments.append(constNull(m_out.ref8));
+        arguments.append(m_out.constInt32(userArguments.size()));
+        arguments.appendVector(userArguments);
+        LValue call = m_out.call(m_out.patchpointInt64Intrinsic(), arguments);
+        setInstructionCallingConvention(call, LLVMAnyRegCallConv);
+
+        CallSiteIndex callSiteIndex =
+            m_ftlState.jitCode->common.addCodeOrigin(m_node->origin.semantic);
+        
+        RefPtr<LazySlowPathLinkerTask> linker =
+            createSharedTask<LazySlowPathLinkerFunction>(functor);
+
+        m_ftlState.lazySlowPaths.append(LazySlowPathDescriptor(stackmapID, callSiteIndex, linker));
+
+        return call;
+    }
     
     void speculate(
         ExitKind kind, FormattedValue lowValue, Node* highValue, LValue failCondition)
@@ -8267,33 +8456,67 @@ private:
 
     void emitStoreBarrier(LValue base)
     {
-        LBasicBlock isMarkedAndNotRemembered = FTL_NEW_BLOCK(m_out, ("Store barrier is marked block"));
-        LBasicBlock bufferHasSpace = FTL_NEW_BLOCK(m_out, ("Store barrier buffer has space"));
-        LBasicBlock bufferIsFull = FTL_NEW_BLOCK(m_out, ("Store barrier buffer is full"));
+        LBasicBlock slowPath = FTL_NEW_BLOCK(m_out, ("Store barrier slow path"));
         LBasicBlock continuation = FTL_NEW_BLOCK(m_out, ("Store barrier continuation"));
 
-        // Check the mark byte. 
         m_out.branch(
-            m_out.notZero8(loadCellState(base)), usually(continuation), rarely(isMarkedAndNotRemembered));
+            m_out.notZero8(loadCellState(base)), usually(continuation), rarely(slowPath));
 
-        // Append to the write barrier buffer.
-        LBasicBlock lastNext = m_out.appendTo(isMarkedAndNotRemembered, bufferHasSpace);
-        LValue currentBufferIndex = m_out.load32(m_out.absolute(vm().heap.writeBarrierBuffer().currentIndexAddress()));
-        LValue bufferCapacity = m_out.constInt32(vm().heap.writeBarrierBuffer().capacity());
-        m_out.branch(
-            m_out.lessThan(currentBufferIndex, bufferCapacity),
-            usually(bufferHasSpace), rarely(bufferIsFull));
-
-        // Buffer has space, store to it.
-        m_out.appendTo(bufferHasSpace, bufferIsFull);
-        LValue writeBarrierBufferBase = m_out.constIntPtr(vm().heap.writeBarrierBuffer().buffer());
-        m_out.storePtr(base, m_out.baseIndex(m_heaps.WriteBarrierBuffer_bufferContents, writeBarrierBufferBase, m_out.zeroExtPtr(currentBufferIndex)));
-        m_out.store32(m_out.add(currentBufferIndex, m_out.constInt32(1)), m_out.absolute(vm().heap.writeBarrierBuffer().currentIndexAddress()));
-        m_out.jump(continuation);
+        LBasicBlock lastNext = m_out.appendTo(slowPath, continuation);
 
-        // Buffer is out of space, flush it.
-        m_out.appendTo(bufferIsFull, continuation);
-        vmCallNoExceptions(m_out.operation(operationFlushWriteBarrierBuffer), m_callFrame, base);
+        // We emit the store barrier slow path lazily. In a lot of cases, this will never fire. And
+        // when it does fire, it makes sense for us to generate this code using our JIT rather than
+        // wasting LLVM's time optimizing it.
+        lazySlowPath(
+            [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
+                GPRReg baseGPR = locations[1].directGPR();
+
+                return LazySlowPath::createGenerator(
+                    [=] (CCallHelpers& jit, LazySlowPath::GenerationParams& params) {
+                        RegisterSet usedRegisters = params.lazySlowPath->usedRegisters();
+                        ScratchRegisterAllocator scratchRegisterAllocator(usedRegisters);
+                        scratchRegisterAllocator.lock(baseGPR);
+
+                        GPRReg scratch1 = scratchRegisterAllocator.allocateScratchGPR();
+                        GPRReg scratch2 = scratchRegisterAllocator.allocateScratchGPR();
+
+                        unsigned bytesPushed =
+                            scratchRegisterAllocator.preserveReusedRegistersByPushing(jit);
+
+                        // We've already saved these, so when we make a slow path call, we don't have
+                        // to save them again.
+                        usedRegisters.exclude(RegisterSet(scratch1, scratch2));
+
+                        WriteBarrierBuffer& writeBarrierBuffer = jit.vm()->heap.writeBarrierBuffer();
+                        jit.load32(writeBarrierBuffer.currentIndexAddress(), scratch2);
+                        CCallHelpers::Jump needToFlush = jit.branch32(
+                            CCallHelpers::AboveOrEqual, scratch2,
+                            CCallHelpers::TrustedImm32(writeBarrierBuffer.capacity()));
+
+                        jit.add32(CCallHelpers::TrustedImm32(1), scratch2);
+                        jit.store32(scratch2, writeBarrierBuffer.currentIndexAddress());
+
+                        jit.move(CCallHelpers::TrustedImmPtr(writeBarrierBuffer.buffer()), scratch1);
+                        jit.storePtr(
+                            baseGPR,
+                            CCallHelpers::BaseIndex(
+                                scratch1, scratch2, CCallHelpers::ScalePtr,
+                                static_cast<int32_t>(-sizeof(void*))));
+
+                        scratchRegisterAllocator.restoreReusedRegistersByPopping(jit, bytesPushed);
+
+                        params.doneJumps.append(jit.jump());
+
+                        needToFlush.link(&jit);
+                        callOperation(
+                            usedRegisters, jit, params.lazySlowPath->callSiteIndex(),
+                            params.exceptionJumps, operationFlushWriteBarrierBuffer, InvalidGPRReg,
+                            baseGPR);
+                        scratchRegisterAllocator.restoreReusedRegistersByPopping(jit, bytesPushed);
+                        params.doneJumps.append(jit.jump());
+                    });
+            },
+            base);
         m_out.jump(continuation);
 
         m_out.appendTo(continuation, lastNext);
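The three-stage life cycle documented in lazySlowPath() above can be sketched outside of JSC as a tiny compile-on-first-use stub. This is an illustrative standalone analogy, not JSC code; all names here (LazyStub, makeAddStub) are invented, and an ordinary function object stands in for generated machine code:

```cpp
#include <cassert>
#include <functional>

// Hypothetical sketch of the lazy slow path life cycle: stage (1) captures
// inputs, and the generation work is deferred until the path first executes.
struct LazyStub {
    std::function<int(int)> generator;   // stage (3) work, deferred
    std::function<int(int)> compiled;    // filled in on first execution
    int generationCount = 0;

    int run(int input)
    {
        if (!compiled) {                 // first execution: generate now
            generationCount++;
            compiled = generator;
        }
        return compiled(input);
    }
};

LazyStub makeAddStub(int addend)         // stage (1): last chance to grab inputs
{
    LazyStub stub;
    stub.generator = [=] (int input) {   // by-value capture, as the comment above advises
        return input + addend;
    };
    return stub;
}
```

If the stub never runs, generation never happens, which mirrors the payoff described above: you only pay for slow path code generation if the slow path actually executes.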
index bbc4549..80eab6f 100644
@@ -30,6 +30,8 @@
 
 #include "ClonedArguments.h"
 #include "DirectArguments.h"
+#include "FTLJITCode.h"
+#include "FTLLazySlowPath.h"
 #include "InlineCallFrame.h"
 #include "JSCInlines.h"
 #include "JSLexicalEnvironment.h"
@@ -357,6 +359,22 @@ extern "C" JSCell* JIT_OPERATION operationMaterializeObjectInOSR(
     }
 }
 
+extern "C" void* JIT_OPERATION compileFTLLazySlowPath(ExecState* exec, unsigned index)
+{
+    VM& vm = exec->vm();
+
+    // We cannot GC. We've got pointers in evil places.
+    DeferGCForAWhile deferGC(vm.heap);
+
+    CodeBlock* codeBlock = exec->codeBlock();
+    JITCode* jitCode = codeBlock->jitCode()->ftl();
+
+    LazySlowPath& lazySlowPath = *jitCode->lazySlowPaths[index];
+    lazySlowPath.generate(codeBlock);
+
+    return lazySlowPath.stub().code().executableAddress();
+}
+
 } } // namespace JSC::FTL
 
 #endif // ENABLE(FTL_JIT)
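At a high level, compileFTLLazySlowPath() above is a generate-once-and-cache dispatcher keyed by the stackmap index: the shared thunk hands it an index, it generates the matching stub if needed, and it returns the code address for the caller to jump to. A hypothetical standalone sketch of that shape, with strings standing in for generated machine code (nothing here is a JSC API):

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical registry of lazy slow paths, keyed by index, generated on
// first use and cached so repeated entries hit the same "compiled" stub.
struct LazyPathRegistry {
    std::vector<std::function<std::string()>> generators;
    std::unordered_map<unsigned, std::string> stubs;   // "compiled" code cache

    const std::string& compile(unsigned index)
    {
        auto it = stubs.find(index);
        if (it == stubs.end())                          // first execution only
            it = stubs.emplace(index, generators[index]()).first;
        return it->second;
    }
};
```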
index a5568f9..4f38c24 100644
@@ -33,6 +33,8 @@
 
 namespace JSC { namespace FTL {
 
+class LazySlowPath;
+
 extern "C" {
 
 JSCell* JIT_OPERATION operationNewObjectWithButterfly(ExecState*, Structure*) WTF_INTERNAL;
@@ -43,6 +45,8 @@ JSCell* JIT_OPERATION operationMaterializeObjectInOSR(
 void JIT_OPERATION operationPopulateObjectInOSR(
     ExecState*, ExitTimeObjectMaterialization*, EncodedJSValue*, EncodedJSValue*) WTF_INTERNAL;
 
+void* JIT_OPERATION compileFTLLazySlowPath(ExecState*, unsigned) WTF_INTERNAL;
+
 } // extern "C"
 
 } } // namespace JSC::DFG
index 2a4c1c9..eeba482 100644
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2013, 2014 Apple Inc. All rights reserved.
+ * Copyright (C) 2013-2015 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
 
 #include "CCallHelpers.h"
 #include "FTLState.h"
+#include "FTLThunks.h"
 #include "GPRInfo.h"
 #include "JSCInlines.h"
 
 namespace JSC { namespace FTL {
 
-namespace {
-
 // This code relies on us being 64-bit. FTL is currently always 64-bit.
 static const size_t wordSize = 8;
 
-// This will be an RAII thingy that will set up the necessary stack sizes and offsets and such.
-class CallContext {
-public:
-    CallContext(
-        State& state, const RegisterSet& usedRegisters, CCallHelpers& jit,
-        unsigned numArgs, GPRReg returnRegister)
-        : m_state(state)
-        , m_usedRegisters(usedRegisters)
-        , m_jit(jit)
-        , m_numArgs(numArgs)
-        , m_returnRegister(returnRegister)
-    {
-        // We don't care that you're using callee-save, stack, or hardware registers.
-        m_usedRegisters.exclude(RegisterSet::stackRegisters());
-        m_usedRegisters.exclude(RegisterSet::reservedHardwareRegisters());
-        m_usedRegisters.exclude(RegisterSet::calleeSaveRegisters());
-        
-        // The return register doesn't need to be saved.
-        if (m_returnRegister != InvalidGPRReg)
-            m_usedRegisters.clear(m_returnRegister);
-        
-        size_t stackBytesNeededForReturnAddress = wordSize;
+SlowPathCallContext::SlowPathCallContext(
+    RegisterSet usedRegisters, CCallHelpers& jit, unsigned numArgs, GPRReg returnRegister)
+    : m_jit(jit)
+    , m_numArgs(numArgs)
+    , m_returnRegister(returnRegister)
+{
+    // We don't care that you're using callee-save, stack, or hardware registers.
+    usedRegisters.exclude(RegisterSet::stackRegisters());
+    usedRegisters.exclude(RegisterSet::reservedHardwareRegisters());
+    usedRegisters.exclude(RegisterSet::calleeSaveRegisters());
         
-        m_offsetToSavingArea =
-            (std::max(m_numArgs, NUMBER_OF_ARGUMENT_REGISTERS) - NUMBER_OF_ARGUMENT_REGISTERS) * wordSize;
+    // The return register doesn't need to be saved.
+    if (m_returnRegister != InvalidGPRReg)
+        usedRegisters.clear(m_returnRegister);
         
-        for (unsigned i = std::min(NUMBER_OF_ARGUMENT_REGISTERS, numArgs); i--;)
-            m_argumentRegisters.set(GPRInfo::toArgumentRegister(i));
-        m_callingConventionRegisters.merge(m_argumentRegisters);
-        if (returnRegister != InvalidGPRReg)
-            m_callingConventionRegisters.set(GPRInfo::returnValueGPR);
-        m_callingConventionRegisters.filter(m_usedRegisters);
+    size_t stackBytesNeededForReturnAddress = wordSize;
         
-        unsigned numberOfCallingConventionRegisters =
-            m_callingConventionRegisters.numberOfSetRegisters();
+    m_offsetToSavingArea =
+        (std::max(m_numArgs, NUMBER_OF_ARGUMENT_REGISTERS) - NUMBER_OF_ARGUMENT_REGISTERS) * wordSize;
         
-        size_t offsetToThunkSavingArea =
-            m_offsetToSavingArea +
-            numberOfCallingConventionRegisters * wordSize;
+    for (unsigned i = std::min(NUMBER_OF_ARGUMENT_REGISTERS, numArgs); i--;)
+        m_argumentRegisters.set(GPRInfo::toArgumentRegister(i));
+    m_callingConventionRegisters.merge(m_argumentRegisters);
+    if (returnRegister != InvalidGPRReg)
+        m_callingConventionRegisters.set(GPRInfo::returnValueGPR);
+    m_callingConventionRegisters.filter(usedRegisters);
         
-        m_stackBytesNeeded =
-            offsetToThunkSavingArea +
-            stackBytesNeededForReturnAddress +
-            (m_usedRegisters.numberOfSetRegisters() - numberOfCallingConventionRegisters) * wordSize;
+    unsigned numberOfCallingConventionRegisters =
+        m_callingConventionRegisters.numberOfSetRegisters();
         
-        m_stackBytesNeeded = (m_stackBytesNeeded + stackAlignmentBytes() - 1) & ~(stackAlignmentBytes() - 1);
+    size_t offsetToThunkSavingArea =
+        m_offsetToSavingArea +
+        numberOfCallingConventionRegisters * wordSize;
         
-        m_jit.subPtr(CCallHelpers::TrustedImm32(m_stackBytesNeeded), CCallHelpers::stackPointerRegister);
+    m_stackBytesNeeded =
+        offsetToThunkSavingArea +
+        stackBytesNeededForReturnAddress +
+        (usedRegisters.numberOfSetRegisters() - numberOfCallingConventionRegisters) * wordSize;
         
-        m_thunkSaveSet = m_usedRegisters;
+    m_stackBytesNeeded = (m_stackBytesNeeded + stackAlignmentBytes() - 1) & ~(stackAlignmentBytes() - 1);
         
-        // This relies on all calling convention registers also being temp registers.
-        unsigned stackIndex = 0;
-        for (unsigned i = GPRInfo::numberOfRegisters; i--;) {
-            GPRReg reg = GPRInfo::toRegister(i);
-            if (!m_callingConventionRegisters.get(reg))
-                continue;
-            m_jit.storePtr(reg, CCallHelpers::Address(CCallHelpers::stackPointerRegister, m_offsetToSavingArea + (stackIndex++) * wordSize));
-            m_thunkSaveSet.clear(reg);
-        }
+    m_jit.subPtr(CCallHelpers::TrustedImm32(m_stackBytesNeeded), CCallHelpers::stackPointerRegister);
+
+    m_thunkSaveSet = usedRegisters;
         
-        m_offset = offsetToThunkSavingArea;
+    // This relies on all calling convention registers also being temp registers.
+    unsigned stackIndex = 0;
+    for (unsigned i = GPRInfo::numberOfRegisters; i--;) {
+        GPRReg reg = GPRInfo::toRegister(i);
+        if (!m_callingConventionRegisters.get(reg))
+            continue;
+        m_jit.storePtr(reg, CCallHelpers::Address(CCallHelpers::stackPointerRegister, m_offsetToSavingArea + (stackIndex++) * wordSize));
+        m_thunkSaveSet.clear(reg);
     }
-    
-    ~CallContext()
-    {
-        if (m_returnRegister != InvalidGPRReg)
-            m_jit.move(GPRInfo::returnValueGPR, m_returnRegister);
         
-        unsigned stackIndex = 0;
-        for (unsigned i = GPRInfo::numberOfRegisters; i--;) {
-            GPRReg reg = GPRInfo::toRegister(i);
-            if (!m_callingConventionRegisters.get(reg))
-                continue;
-            m_jit.loadPtr(CCallHelpers::Address(CCallHelpers::stackPointerRegister, m_offsetToSavingArea + (stackIndex++) * wordSize), reg);
-        }
-        
-        m_jit.addPtr(CCallHelpers::TrustedImm32(m_stackBytesNeeded), CCallHelpers::stackPointerRegister);
-    }
-    
-    RegisterSet usedRegisters() const
-    {
-        return m_thunkSaveSet;
-    }
-    
-    ptrdiff_t offset() const
-    {
-        return m_offset;
-    }
+    m_offset = offsetToThunkSavingArea;
+}
     
-    SlowPathCallKey keyWithTarget(void* callTarget) const
-    {
-        return SlowPathCallKey(usedRegisters(), callTarget, m_argumentRegisters, offset());
-    }
+SlowPathCallContext::~SlowPathCallContext()
+{
+    if (m_returnRegister != InvalidGPRReg)
+        m_jit.move(GPRInfo::returnValueGPR, m_returnRegister);
     
-    MacroAssembler::Call makeCall(void* callTarget, MacroAssembler::JumpList* exceptionTarget)
-    {
-        MacroAssembler::Call result = m_jit.call();
-        m_state.finalizer->slowPathCalls.append(SlowPathCall(
-            result, keyWithTarget(callTarget)));
-        if (exceptionTarget)
-            exceptionTarget->append(m_jit.emitExceptionCheck());
-        return result;
+    unsigned stackIndex = 0;
+    for (unsigned i = GPRInfo::numberOfRegisters; i--;) {
+        GPRReg reg = GPRInfo::toRegister(i);
+        if (!m_callingConventionRegisters.get(reg))
+            continue;
+        m_jit.loadPtr(CCallHelpers::Address(CCallHelpers::stackPointerRegister, m_offsetToSavingArea + (stackIndex++) * wordSize), reg);
     }
     
-private:
-    State& m_state;
-    RegisterSet m_usedRegisters;
-    RegisterSet m_argumentRegisters;
-    RegisterSet m_callingConventionRegisters;
-    CCallHelpers& m_jit;
-    unsigned m_numArgs;
-    GPRReg m_returnRegister;
-    size_t m_offsetToSavingArea;
-    size_t m_stackBytesNeeded;
-    RegisterSet m_thunkSaveSet;
-    ptrdiff_t m_offset;
-};
-
-} // anonymous namespace
-
-void storeCodeOrigin(State& state, CCallHelpers& jit, CodeOrigin codeOrigin)
-{
-    if (!codeOrigin.isSet())
-        return;
-    
-    CallSiteIndex callSite = state.jitCode->common.addCodeOrigin(codeOrigin);
-    unsigned locationBits = callSite.bits();
-    jit.store32(
-        CCallHelpers::TrustedImm32(locationBits),
-        CCallHelpers::tagFor(static_cast<VirtualRegister>(JSStack::ArgumentCount)));
+    m_jit.addPtr(CCallHelpers::TrustedImm32(m_stackBytesNeeded), CCallHelpers::stackPointerRegister);
 }
 
-MacroAssembler::Call callOperation(
-    State& state, const RegisterSet& usedRegisters, CCallHelpers& jit,
-    CodeOrigin codeOrigin, MacroAssembler::JumpList* exceptionTarget,
-    J_JITOperation_ESsiCI operation, GPRReg result, StructureStubInfo* stubInfo,
-    GPRReg object, const UniquedStringImpl* uid)
+SlowPathCallKey SlowPathCallContext::keyWithTarget(void* callTarget) const
 {
-    storeCodeOrigin(state, jit, codeOrigin);
-    CallContext context(state, usedRegisters, jit, 4, result);
-    jit.setupArgumentsWithExecState(
-        CCallHelpers::TrustedImmPtr(stubInfo), object, CCallHelpers::TrustedImmPtr(uid));
-    return context.makeCall(bitwise_cast<void*>(operation), exceptionTarget);
+    return SlowPathCallKey(m_thunkSaveSet, callTarget, m_argumentRegisters, m_offset);
 }
 
-MacroAssembler::Call callOperation(
-    State& state, const RegisterSet& usedRegisters, CCallHelpers& jit,
-    CodeOrigin codeOrigin, MacroAssembler::JumpList* exceptionTarget,
-    J_JITOperation_ESsiJI operation, GPRReg result, StructureStubInfo* stubInfo,
-    GPRReg object, UniquedStringImpl* uid)
+SlowPathCall SlowPathCallContext::makeCall(void* callTarget)
 {
-    storeCodeOrigin(state, jit, codeOrigin);
-    CallContext context(state, usedRegisters, jit, 4, result);
-    jit.setupArgumentsWithExecState(
-        CCallHelpers::TrustedImmPtr(stubInfo), object,
-        CCallHelpers::TrustedImmPtr(uid));
-    return context.makeCall(bitwise_cast<void*>(operation), exceptionTarget);
+    SlowPathCall result = SlowPathCall(m_jit.call(), keyWithTarget(callTarget));
+
+    m_jit.addLinkTask(
+        [result] (LinkBuffer& linkBuffer) {
+            VM& vm = linkBuffer.vm();
+
+            MacroAssemblerCodeRef thunk =
+                vm.ftlThunks->getSlowPathCallThunk(vm, result.key());
+
+            linkBuffer.link(result.call(), CodeLocationLabel(thunk.code()));
+        });
+    
+    return result;
 }
 
-MacroAssembler::Call callOperation(
-    State& state, const RegisterSet& usedRegisters, CCallHelpers& jit, 
-    CodeOrigin codeOrigin, MacroAssembler::JumpList* exceptionTarget,
-    V_JITOperation_ESsiJJI operation, StructureStubInfo* stubInfo, GPRReg value,
-    GPRReg object, UniquedStringImpl* uid)
+CallSiteIndex callSiteIndexForCodeOrigin(State& state, CodeOrigin codeOrigin)
 {
-    storeCodeOrigin(state, jit, codeOrigin);
-    CallContext context(state, usedRegisters, jit, 5, InvalidGPRReg);
-    jit.setupArgumentsWithExecState(
-        CCallHelpers::TrustedImmPtr(stubInfo), value, object,
-        CCallHelpers::TrustedImmPtr(uid));
-    return context.makeCall(bitwise_cast<void*>(operation), exceptionTarget);
+    if (codeOrigin)
+        return state.jitCode->common.addCodeOrigin(codeOrigin);
+    return CallSiteIndex();
 }
 
 } } // namespace JSC::FTL
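The stack-sizing arithmetic that the SlowPathCallContext constructor performs can be checked in isolation. The constants below (alignment, argument register count) are illustrative placeholders rather than JSC's actual platform values; only the word size matches the comment in the source ("FTL is currently always 64-bit"):

```cpp
#include <algorithm>
#include <cstddef>

constexpr size_t wordSize = 8;            // FTL is 64-bit only
constexpr size_t stackAlignment = 16;     // typical ABI stack alignment; a placeholder here

// Arguments beyond the register-passed ones spill into a stack area below
// the register saving area, as in the constructor above.
size_t offsetToSavingArea(size_t numArgs, size_t numArgRegs)
{
    return (std::max(numArgs, numArgRegs) - numArgRegs) * wordSize;
}

// Same rounding idiom as the constructor above: (x + align - 1) & ~(align - 1).
size_t alignStackBytes(size_t bytes)
{
    return (bytes + stackAlignment - 1) & ~(stackAlignment - 1);
}
```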
index 818ef64..442ffe9 100644
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2013, 2014 Apple Inc. All rights reserved.
+ * Copyright (C) 2013-2015 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -55,20 +55,71 @@ private:
     SlowPathCallKey m_key;
 };
 
-void storeCodeOrigin(State&, CCallHelpers&, CodeOrigin);
-
-MacroAssembler::Call callOperation(
-    State&, const RegisterSet&, CCallHelpers&, CodeOrigin, CCallHelpers::JumpList*,
-    J_JITOperation_ESsiCI, GPRReg, StructureStubInfo*, GPRReg,
-    const UniquedStringImpl* uid);
-MacroAssembler::Call callOperation(
-    State&, const RegisterSet&, CCallHelpers&, CodeOrigin, CCallHelpers::JumpList*,
-    J_JITOperation_ESsiJI, GPRReg result, StructureStubInfo*, GPRReg object,
-    UniquedStringImpl* uid);
-MacroAssembler::Call callOperation(
-    State&, const RegisterSet&, CCallHelpers&, CodeOrigin, CCallHelpers::JumpList*,
-    V_JITOperation_ESsiJJI, StructureStubInfo*, GPRReg value, GPRReg object,
-    UniquedStringImpl* uid);
+// An RAII helper that sets up the stack sizes, offsets, and register save areas needed for a slow
+// path call.
+class SlowPathCallContext {
+public:
+    SlowPathCallContext(RegisterSet usedRegisters, CCallHelpers&, unsigned numArgs, GPRReg returnRegister);
+    ~SlowPathCallContext();
+
+    // NOTE: The call that this returns is already going to be linked by the JIT using addLinkTask(),
+    // so there is no need for you to link it yourself.
+    SlowPathCall makeCall(void* callTarget);
+
+private:
+    SlowPathCallKey keyWithTarget(void* callTarget) const;
+    
+    RegisterSet m_argumentRegisters;
+    RegisterSet m_callingConventionRegisters;
+    CCallHelpers& m_jit;
+    unsigned m_numArgs;
+    GPRReg m_returnRegister;
+    size_t m_offsetToSavingArea;
+    size_t m_stackBytesNeeded;
+    RegisterSet m_thunkSaveSet;
+    ptrdiff_t m_offset;
+};
+
+template<typename... ArgumentTypes>
+SlowPathCall callOperation(
+    const RegisterSet& usedRegisters, CCallHelpers& jit, CCallHelpers::JumpList* exceptionTarget,
+    FunctionPtr function, GPRReg resultGPR, ArgumentTypes... arguments)
+{
+    SlowPathCall call;
+    {
+        SlowPathCallContext context(usedRegisters, jit, sizeof...(ArgumentTypes) + 1, resultGPR);
+        jit.setupArgumentsWithExecState(arguments...);
+        call = context.makeCall(function.value());
+    }
+    if (exceptionTarget)
+        exceptionTarget->append(jit.emitExceptionCheck());
+    return call;
+}
+
+template<typename... ArgumentTypes>
+SlowPathCall callOperation(
+    const RegisterSet& usedRegisters, CCallHelpers& jit, CallSiteIndex callSiteIndex,
+    CCallHelpers::JumpList* exceptionTarget, FunctionPtr function, GPRReg resultGPR,
+    ArgumentTypes... arguments)
+{
+    if (callSiteIndex) {
+        jit.store32(
+            CCallHelpers::TrustedImm32(callSiteIndex.bits()),
+            CCallHelpers::tagFor(JSStack::ArgumentCount));
+    }
+    return callOperation(usedRegisters, jit, exceptionTarget, function, resultGPR, arguments...);
+}
+
+CallSiteIndex callSiteIndexForCodeOrigin(State&, CodeOrigin);
+
+template<typename... ArgumentTypes>
+SlowPathCall callOperation(
+    State& state, const RegisterSet& usedRegisters, CCallHelpers& jit, CodeOrigin codeOrigin,
+    CCallHelpers::JumpList* exceptionTarget, FunctionPtr function, GPRReg result, ArgumentTypes... arguments)
+{
+    return callOperation(
+        usedRegisters, jit, callSiteIndexForCodeOrigin(state, codeOrigin), exceptionTarget, function,
+        result, arguments...);
+}
 
 } } // namespace JSC::FTL
 
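The variadic callOperation() overloads above derive the argument count as sizeof...(ArgumentTypes) + 1, the extra one being the implicit ExecState that setupArgumentsWithExecState() prepends. A minimal standalone illustration of that counting (the function name is invented):

```cpp
// Hypothetical sketch: count the C arguments of a JIT operation call,
// including the implicit ExecState that the real helpers prepend.
template<typename... ArgumentTypes>
constexpr unsigned argumentCountWithExecState(ArgumentTypes...)
{
    return sizeof...(ArgumentTypes) + 1;
}
```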
index f8414ec..a4308f2 100644
@@ -78,6 +78,7 @@ public:
     SegmentedVector<GetByIdDescriptor> getByIds;
     SegmentedVector<PutByIdDescriptor> putByIds;
     SegmentedVector<CheckInDescriptor> checkIns;
+    SegmentedVector<LazySlowPathDescriptor> lazySlowPaths;
     Vector<JSCall> jsCalls;
     Vector<JSCallVarargs> jsCallVarargses;
     Vector<JSTailCall> jsTailCalls;
index 2792583..cda81b5 100644
@@ -31,6 +31,7 @@
 #include "AssemblyHelpers.h"
 #include "FPRInfo.h"
 #include "FTLOSRExitCompiler.h"
+#include "FTLOperations.h"
 #include "FTLSaveRestore.h"
 #include "GPRInfo.h"
 #include "LinkBuffer.h"
@@ -39,11 +40,12 @@ namespace JSC { namespace FTL {
 
 using namespace DFG;
 
-MacroAssemblerCodeRef osrExitGenerationThunkGenerator(VM* vm)
+static MacroAssemblerCodeRef genericGenerationThunkGenerator(
+    VM* vm, FunctionPtr generationFunction, const char* name, unsigned extraPopsToRestore)
 {
     AssemblyHelpers jit(vm, 0);
     
-    // Note that the "return address" will be the OSR exit ID.
+    // Note that the "return address" will be the ID that we pass to the generation function.
     
     ptrdiff_t stackMisalignment = MacroAssembler::pushToSaveByteOffset();
     
@@ -90,11 +92,14 @@ MacroAssemblerCodeRef osrExitGenerationThunkGenerator(VM* vm)
     while (numberOfRequiredPops--)
         jit.popToRestore(GPRInfo::regT1);
     jit.popToRestore(MacroAssembler::framePointerRegister);
-    
-    // At this point we're sitting on the return address - so if we did a jump right now, the
-    // tail-callee would be happy. Instead we'll stash the callee in the return address and then
-    // restore all registers.
-    
+
+    // When we came in here, an additional value had been pushed onto the stack. Some clients want
+    // it popped before proceeding.
+    while (extraPopsToRestore--)
+        jit.popToRestore(GPRInfo::regT1);
+
+    // Put the return address wherever the return instruction wants it. On all platforms, this
+    // ensures that the return address is out of the way of register restoration.
     jit.restoreReturnAddressBeforeReturn(GPRInfo::regT0);
 
     restoreAllRegisters(jit, buffer);
@@ -102,8 +107,22 @@ MacroAssemblerCodeRef osrExitGenerationThunkGenerator(VM* vm)
     jit.ret();
     
     LinkBuffer patchBuffer(*vm, jit, GLOBAL_THUNK_ID);
-    patchBuffer.link(functionCall, compileFTLOSRExit);
-    return FINALIZE_CODE(patchBuffer, ("FTL OSR exit generation thunk"));
+    patchBuffer.link(functionCall, generationFunction);
+    return FINALIZE_CODE(patchBuffer, ("%s", name));
+}
+
+MacroAssemblerCodeRef osrExitGenerationThunkGenerator(VM* vm)
+{
+    unsigned extraPopsToRestore = 0;
+    return genericGenerationThunkGenerator(
+        vm, compileFTLOSRExit, "FTL OSR exit generation thunk", extraPopsToRestore);
+}
+
+MacroAssemblerCodeRef lazySlowPathGenerationThunkGenerator(VM* vm)
+{
+    unsigned extraPopsToRestore = 1;
+    return genericGenerationThunkGenerator(
+        vm, compileFTLLazySlowPath, "FTL lazy slow path generation thunk", extraPopsToRestore);
 }
 
 static void registerClobberCheck(AssemblyHelpers& jit, RegisterSet dontClobber)
index 8cfb16d..68a9e8f 100644 (file)
@@ -40,6 +40,7 @@ class VM;
 namespace FTL {
 
 MacroAssemblerCodeRef osrExitGenerationThunkGenerator(VM*);
+MacroAssemblerCodeRef lazySlowPathGenerationThunkGenerator(VM*);
 MacroAssemblerCodeRef slowPathCallThunkGenerator(VM&, const SlowPathCallKey&);
 
 template<typename KeyTypeArgument>
index 9e5b55f..7251951 100644 (file)
@@ -1,7 +1,7 @@
 /*
  *  Copyright (C) 1999-2001 Harri Porten (porten@kde.org)
  *  Copyright (C) 2001 Peter Kelly (pmk@post.com)
- *  Copyright (C) 2003, 2007, 2008, 2011, 2013, 2014 Apple Inc. All rights reserved.
+ *  Copyright (C) 2003, 2007, 2008, 2011, 2013-2015 Apple Inc. All rights reserved.
  *
  *  This library is free software; you can redistribute it and/or
  *  modify it under the terms of the GNU Library General Public
@@ -39,6 +39,11 @@ namespace JSC  {
     class JSScope;
 
     struct CallSiteIndex {
+        CallSiteIndex()
+            : m_bits(UINT_MAX)
+        {
+        }
+        
         explicit CallSiteIndex(uint32_t bits)
             : m_bits(bits)
         { }
@@ -47,6 +52,9 @@ namespace JSC  {
             : m_bits(bitwise_cast<uint32_t>(instruction))
         { }
 #endif
+
+        explicit operator bool() const { return m_bits != UINT_MAX; }
+        
         inline uint32_t bits() const { return m_bits; }
 
     private:
index 40b37df..1e6ce52 100644 (file)
@@ -63,6 +63,8 @@ public:
         poke(GPRInfo::nonArgGPR0, POKE_ARGUMENT_OFFSET + argumentIndex - GPRInfo::numberOfArgumentRegisters);
     }
 
+    void setupArgumentsWithExecState() { setupArgumentsExecState(); }
+
     // These methods are used to sort arguments into the correct registers.
     // On X86 we use cdecl calling conventions, which pass all arguments on the
     // stack. On other architectures we may need to sort values into the
index 8c92e1a..2b530f5 100644 (file)
@@ -1031,7 +1031,7 @@ JSCell* JIT_OPERATION operationNewObject(ExecState* exec, Structure* structure)
 {
     VM* vm = &exec->vm();
     NativeCallFrameTracer tracer(vm, exec);
-    
+
     return constructEmptyObject(exec, structure);
 }
 
index ef61275..2e02e44 100644 (file)
@@ -1,3 +1,37 @@
+2015-10-10  Filip Pizlo  <fpizlo@apple.com>
+
+        FTL should generate code to call slow paths lazily
+        https://bugs.webkit.org/show_bug.cgi?id=149936
+
+        Reviewed by Saam Barati.
+
+        Enables SharedTask to handle any function type, not just void().
+
+        It's probably better to use SharedTask instead of std::function in performance-sensitive
+        code. std::function uses the system malloc and has copy semantics. SharedTask uses FastMalloc
+        and has aliasing semantics. So, you can just trust that it will have sensible performance
+        characteristics.
+
+        * wtf/ParallelHelperPool.cpp:
+        (WTF::ParallelHelperClient::~ParallelHelperClient):
+        (WTF::ParallelHelperClient::setTask):
+        (WTF::ParallelHelperClient::doSomeHelping):
+        (WTF::ParallelHelperClient::runTaskInParallel):
+        (WTF::ParallelHelperClient::finish):
+        (WTF::ParallelHelperClient::claimTask):
+        (WTF::ParallelHelperClient::runTask):
+        (WTF::ParallelHelperPool::doSomeHelping):
+        (WTF::ParallelHelperPool::helperThreadBody):
+        * wtf/ParallelHelperPool.h:
+        (WTF::ParallelHelperClient::setFunction):
+        (WTF::ParallelHelperClient::runFunctionInParallel):
+        (WTF::ParallelHelperClient::pool):
+        * wtf/SharedTask.h:
+        (WTF::createSharedTask):
+        (WTF::SharedTask::SharedTask): Deleted.
+        (WTF::SharedTask::~SharedTask): Deleted.
+        (WTF::SharedTaskFunctor::SharedTaskFunctor): Deleted.
+
 2015-10-10  Dan Bernstein  <mitz@apple.com>
 
         [iOS] Remove unnecessary iOS version checks
index 743ae34..eebce1d 100644 (file)
@@ -53,7 +53,7 @@ ParallelHelperClient::~ParallelHelperClient()
     }
 }
 
-void ParallelHelperClient::setTask(RefPtr<SharedTask> task)
+void ParallelHelperClient::setTask(RefPtr<SharedTask<void()>> task)
 {
     LockHolder locker(m_pool->m_lock);
     RELEASE_ASSERT(!m_task);
@@ -69,7 +69,7 @@ void ParallelHelperClient::finish()
 
 void ParallelHelperClient::doSomeHelping()
 {
-    RefPtr<SharedTask> task;
+    RefPtr<SharedTask<void()>> task;
     {
         LockHolder locker(m_pool->m_lock);
         task = claimTask(locker);
@@ -80,7 +80,7 @@ void ParallelHelperClient::doSomeHelping()
     runTask(task);
 }
 
-void ParallelHelperClient::runTaskInParallel(RefPtr<SharedTask> task)
+void ParallelHelperClient::runTaskInParallel(RefPtr<SharedTask<void()>> task)
 {
     setTask(task);
     doSomeHelping();
@@ -94,7 +94,7 @@ void ParallelHelperClient::finish(const LockHolder&)
         m_pool->m_workCompleteCondition.wait(m_pool->m_lock);
 }
 
-RefPtr<SharedTask> ParallelHelperClient::claimTask(const LockHolder&)
+RefPtr<SharedTask<void()>> ParallelHelperClient::claimTask(const LockHolder&)
 {
     if (!m_task)
         return nullptr;
@@ -103,7 +103,7 @@ RefPtr<SharedTask> ParallelHelperClient::claimTask(const LockHolder&)
     return m_task;
 }
 
-void ParallelHelperClient::runTask(RefPtr<SharedTask> task)
+void ParallelHelperClient::runTask(RefPtr<SharedTask<void()>> task)
 {
     RELEASE_ASSERT(m_numActive);
     RELEASE_ASSERT(task);
@@ -153,7 +153,7 @@ void ParallelHelperPool::ensureThreads(unsigned numThreads)
 void ParallelHelperPool::doSomeHelping()
 {
     ParallelHelperClient* client;
-    RefPtr<SharedTask> task;
+    RefPtr<SharedTask<void()>> task;
     {
         LockHolder locker(m_lock);
         client = getClientWithTask(locker);
@@ -182,7 +182,7 @@ void ParallelHelperPool::helperThreadBody()
 {
     for (;;) {
         ParallelHelperClient* client;
-        RefPtr<SharedTask> task;
+        RefPtr<SharedTask<void()>> task;
 
         {
             LockHolder locker(m_lock);
index f1d2c13..7ce7f71 100644 (file)
@@ -130,12 +130,12 @@ public:
     WTF_EXPORT_PRIVATE ParallelHelperClient(RefPtr<ParallelHelperPool>);
     WTF_EXPORT_PRIVATE ~ParallelHelperClient();
 
-    WTF_EXPORT_PRIVATE void setTask(RefPtr<SharedTask>);
+    WTF_EXPORT_PRIVATE void setTask(RefPtr<SharedTask<void()>>);
 
     template<typename Functor>
     void setFunction(const Functor& functor)
     {
-        setTask(createSharedTask(functor));
+        setTask(createSharedTask<void()>(functor));
     }
     
     WTF_EXPORT_PRIVATE void finish();
@@ -146,7 +146,7 @@ public:
     // client->setTask(task);
     // client->doSomeHelping();
     // client->finish();
-    WTF_EXPORT_PRIVATE void runTaskInParallel(RefPtr<SharedTask>);
+    WTF_EXPORT_PRIVATE void runTaskInParallel(RefPtr<SharedTask<void()>>);
 
     // Equivalent to:
     // client->setFunction(functor);
@@ -155,7 +155,7 @@ public:
     template<typename Functor>
     void runFunctionInParallel(const Functor& functor)
     {
-        runTaskInParallel(createSharedTask(functor));
+        runTaskInParallel(createSharedTask<void()>(functor));
     }
 
     ParallelHelperPool& pool() { return *m_pool; }
@@ -165,11 +165,11 @@ private:
     friend class ParallelHelperPool;
 
     void finish(const LockHolder&);
-    RefPtr<SharedTask> claimTask(const LockHolder&);
-    void runTask(RefPtr<SharedTask>);
+    RefPtr<SharedTask<void()>> claimTask(const LockHolder&);
+    void runTask(RefPtr<SharedTask<void()>>);
     
     RefPtr<ParallelHelperPool> m_pool;
-    RefPtr<SharedTask> m_task;
+    RefPtr<SharedTask<void()>> m_task;
     unsigned m_numActive { 0 };
 };
 
index ab8da95..1ff569c 100644 (file)
@@ -41,13 +41,13 @@ namespace WTF {
 //
 // Here's an example of how SharedTask can be better than std::function. If you do:
 //
-// std::function a = b;
+// std::function<int(double)> a = b;
 //
 // Then "a" will get its own copy of all captured by-value variables. The act of copying may
 // require calls to system malloc, and it may be linear time in the total size of captured
 // variables. On the other hand, if you do:
 //
-// RefPtr<SharedTask> a = b;
+// RefPtr<SharedTask<int(double)>> a = b;
 //
 // Then "a" will point to the same task as b, and the only work involved is the CAS to increase the
 // reference count.
@@ -58,18 +58,21 @@ namespace WTF {
 // createSharedTask(), below). But SharedTask also allows you to create your own subclass and put
 // state in member fields. This can be more natural if you want fine-grained control over what
 // state is shared between instances of the task.
-class SharedTask : public ThreadSafeRefCounted<SharedTask> {
+template<typename FunctionType> class SharedTask;
+template<typename ResultType, typename... ArgumentTypes>
+class SharedTask<ResultType (ArgumentTypes...)> : public ThreadSafeRefCounted<SharedTask<ResultType (ArgumentTypes...)>> {
 public:
     SharedTask() { }
     virtual ~SharedTask() { }
 
-    virtual void run() = 0;
+    virtual ResultType run(ArgumentTypes&&...) = 0;
 };
 
 // This is a utility class that allows you to create a SharedTask subclass using a lambda. Usually,
 // you don't want to use this class directly. Use createSharedTask() instead.
-template<typename Functor>
-class SharedTaskFunctor : public SharedTask {
+template<typename FunctionType, typename Functor> class SharedTaskFunctor;
+template<typename ResultType, typename... ArgumentTypes, typename Functor>
+class SharedTaskFunctor<ResultType (ArgumentTypes...), Functor> : public SharedTask<ResultType (ArgumentTypes...)> {
 public:
     SharedTaskFunctor(const Functor& functor)
         : m_functor(functor)
@@ -77,9 +80,9 @@ public:
     }
 
 private:
-    void run() override
+    ResultType run(ArgumentTypes&&... arguments) override
     {
-        m_functor();
+        return m_functor(std::forward<ArgumentTypes>(arguments)...);
     }
 
     Functor m_functor;
@@ -87,7 +90,7 @@ private:
 
 // Create a SharedTask from a functor, such as a lambda. You can use this like so:
 //
-// RefPtr<SharedTask> task = createSharedTask(
+// RefPtr<SharedTask<void()>> task = createSharedTask<void()>(
 //     [=] () {
 //         do things;
 //     });
@@ -102,10 +105,10 @@ private:
 // On the other hand, if you use something like ParallelHelperClient::runTaskInParallel() (or its
 // helper, runFunctionInParallel(), which does createSharedTask() for you), then it can be OK to
 // use [&], since the stack frame will remain live for the entire duration of the task's lifetime.
-template<typename Functor>
-Ref<SharedTask> createSharedTask(const Functor& functor)
+template<typename FunctionType, typename Functor>
+Ref<SharedTask<FunctionType>> createSharedTask(const Functor& functor)
 {
-    return adoptRef(*new SharedTaskFunctor<Functor>(functor));
+    return adoptRef(*new SharedTaskFunctor<FunctionType, Functor>(functor));
 }
 
 } // namespace WTF