The write barrier should be down with TSO
author     fpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Wed, 28 Sep 2016 21:55:53 +0000 (21:55 +0000)
committer  fpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Wed, 28 Sep 2016 21:55:53 +0000 (21:55 +0000)
https://bugs.webkit.org/show_bug.cgi?id=162316

Reviewed by Geoffrey Garen.

Source/JavaScriptCore:

This makes our write barrier behave correctly when it races with the collector. The
collector wants to do this when visiting:

    object->cellState = black
    visit(object)

The mutator wants to do this when storing:

    object->property = newValue
    if (object->cellState == black)
        remember(object)

Prior to this change, this didn't work right because the compiler would sometimes place
barriers before the store to the property and because the mutator did not have adequate
fences.

Prior to this change, the DFG and FTL would emit this:

    if (object->cellState == black)
        remember(object)
    object->property = newValue

Which is wrong, because the object could start being scanned just after the cellState
check, at which point the store would be lost. We need to confirm that the state was not
black *after* the store! This change was harder than you'd expect: placing the barrier
after the store broke B3's ability to do its super crazy ninja CSE on some store-load
redundancies. Because the B3 CSE has some moves that the DFG CSE lacks, the DFG CSE's
ability to ignore barriers didn't help. I fixed this by having the FTL convey precise
heap ranges for the patchpoint corresponding to the barrier slow path. It reads the world
(because of the store-load fence) and it writes only cellState (because the B3 heap ranges
don't have any way to represent any of the GC's other state, which means that B3 does not
have to worry about aliasing with any of that).

The collector already uses a store-load fence on x86 just after setting the cellState and
before visiting the object. The mutator needs to do the same. But we cannot put a
store-load fence of any kind before store barriers, because that causes enormous slow
downs. In the worst case, Octane/richards slowed down by 90%! That's crazy! However, the
overall slow downs were small enough (0-15% on benchmark suite aggregates) that it would be
reasonable if the slow down only happened while the GC was running. Then, the concurrent GC
would lift throughput-while-collecting from 0% of peak to 85% of peak. This changes the
barrier so that it looks like this:

    if (object->cellState <= heap.sneakyBlackThreshold)
        slowPath(object)

Where sneakyBlackThreshold is the normal blackThreshold when we're not collecting, or a
tautological threshold (that makes everything look black) when we are collecting. This
turns out to not be any more expensive than the barrier in tip of tree when the GC is not
running, or a 0-15% slow-down when it is "running". (Of course we don't run the GC
concurrently yet. I still have more work to do.) The slowPath() does some extra work to
check if we are concurrently collecting; if so, it does a fence and rechecks if the object
really did need that barrier.
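
In C++ terms, the new barrier boils down to something like this (a hedged sketch; the real code
is in heap/HeapInlines.h and heap/Heap.cpp, and names like sneakyBlackThreshold(),
barrierShouldBeFenced(), writeBarrierSlowPath(), and isBlack() come from this patch, while the
bodies here are approximate):

    ALWAYS_INLINE void Heap::writeBarrier(const JSCell* from)
    {
        // sneakyBlackThreshold() is the normal black threshold when we are not collecting,
        // or a tautological threshold that makes every cell look black while we are.
        if (from->cellState() <= sneakyBlackThreshold())
            writeBarrierSlowPath(from);
    }

    NEVER_INLINE void Heap::writeBarrierSlowPath(const JSCell* from)
    {
        if (barrierShouldBeFenced()) {
            // Pay for the store-load fence only while collecting, then re-check whether
            // the cell really is black.
            WTF::storeLoadFence();
            if (!isBlack(from->cellState()))
                return;
        }
        addToRememberedSet(from);
    }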

This also reintroduces elimination of redundant store barriers, which was lost in the last
store barrier change. We can only do it when there is no possibility of GC, exit, or
exceptions between the two store barriers. We could remove the exit/exception limitation if
we taught OSR exit how to buffer store barriers, which is an insane thing to do considering
that I've never been able to detect a win from redundant store barrier elimination. I just
want us to have it for stupidly obvious situations, like a tight sequence of stores to the
same object. This same optimization also sometimes strength-reduces the store barrier so
that it uses a constant black threshold rather than the sneaky one, thereby saving one
load.
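
Schematically, in DFG IR (operands elided; @o and @p are distinct base cells), a barrier-heavy
run with no GC, exit, or exception point in between now gets rewritten like this:

    // Before clustering:
    PutByOffset(@o, ...)
    FencedStoreBarrier(@o)
    PutByOffset(@o, ...)
    FencedStoreBarrier(@o)
    PutByOffset(@p, ...)
    FencedStoreBarrier(@p)

    // After clustering: duplicate barriers on the same cell are merged, and only the first
    // barrier in the cluster needs the sneaky threshold; the rest can use the constant
    // black threshold.
    PutByOffset(@o, ...)
    PutByOffset(@o, ...)
    PutByOffset(@p, ...)
    FencedStoreBarrier(@o)
    StoreBarrier(@p)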

Even with all of those optimizations, I still had problems with barrier cost. I found that one
of the benchmarks that was being hit particularly hard was JetStream/regexp-2010. Fortunately
that benchmark does most of its barriers in a tight C++ loop in RegExpMatchesArray.h. When we
know what we're doing, we can defer GC around a bunch of object initializations and then remove
all of the barriers between any of the objects allocated within the deferral. Unfortunately,
our GC deferral mechanism isn't really performant enough to make this be a worthwhile
optimization. The most efficient version of such an optimization that I could come up with was
to have a DeferralContext object that houses a boolean that is false by default, but the GC
writes true into it if it would have wanted to GC. You thread a pointer to the deferralContext
through all of your allocations. This kind of mechanism has the overhead of a zero
initialization on the stack on entry and a zero check on exit. This is probably even efficient
enough that we could start thinking about having the DFG use it, for example if we found a
bounded-time section of code with a lot of barriers and entry/exit sites that aren't totally
wacky. This optimization took this patch from a 0.68% JetStream regression to neutral, according
to my latest data.
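
A hedged sketch of the DeferralContext idea (the real files are heap/GCDeferralContext.h and
heap/GCDeferralContextInlines.h, added by this patch; the member names and the
collectIfNecessaryOrDefer() call here are approximations):

    class GCDeferralContext {
    public:
        GCDeferralContext(Heap& heap)
            : m_heap(heap)
        {
        }

        ~GCDeferralContext()
        {
            // The zero check on exit: collect only if the GC asked for it while we were
            // allocating.
            if (UNLIKELY(m_shouldGC))
                m_heap.collectIfNecessaryOrDefer();
        }

    private:
        friend class Heap;
        friend class MarkedAllocator;

        Heap& m_heap;
        bool m_shouldGC { false }; // the zero initialization on entry
    };

Allocation entry points take an optional GCDeferralContext*; when one is passed, the heap records
that it would have wanted to collect by setting m_shouldGC instead of collecting, so a batch of
object initializations (like the loop in RegExpMatchesArray.h) performs at most one collection
check when the context goes out of scope.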

Finally, an earlier version of this change put the store-load fence in B3 IR, so I ended up
adding FTLOutput support for it and AbstractHeapRepository magic for decorating the heaps.
I think we might as well keep that; it'll be useful.

* CMakeLists.txt:
* JavaScriptCore.xcodeproj/project.pbxproj:
* assembler/MacroAssembler.h:
(JSC::MacroAssembler::branch32):
* assembler/MacroAssemblerX86_64.h:
(JSC::MacroAssemblerX86_64::branch32):
(JSC::MacroAssemblerX86_64::branch64): Deleted.
* bytecode/PolymorphicAccess.cpp:
(JSC::AccessCase::generateImpl):
* dfg/DFGAbstractHeap.h:
* dfg/DFGAbstractInterpreterInlines.h:
(JSC::DFG::AbstractInterpreter<AbstractStateType>::executeEffects):
* dfg/DFGClobberize.h:
(JSC::DFG::clobberize):
* dfg/DFGClobbersExitState.cpp:
(JSC::DFG::clobbersExitState):
* dfg/DFGDoesGC.cpp:
(JSC::DFG::doesGC):
* dfg/DFGFixupPhase.cpp:
(JSC::DFG::FixupPhase::fixupNode):
* dfg/DFGMayExit.cpp:
* dfg/DFGNode.h:
(JSC::DFG::Node::isStoreBarrier):
* dfg/DFGNodeType.h:
* dfg/DFGPlan.cpp:
(JSC::DFG::Plan::compileInThreadImpl):
* dfg/DFGPredictionPropagationPhase.cpp:
* dfg/DFGSafeToExecute.h:
(JSC::DFG::safeToExecute):
* dfg/DFGSpeculativeJIT.cpp:
(JSC::DFG::SpeculativeJIT::compileStoreBarrier):
(JSC::DFG::SpeculativeJIT::storeToWriteBarrierBuffer): Deleted.
(JSC::DFG::SpeculativeJIT::writeBarrier): Deleted.
* dfg/DFGSpeculativeJIT.h:
* dfg/DFGSpeculativeJIT32_64.cpp:
(JSC::DFG::SpeculativeJIT::compile):
(JSC::DFG::SpeculativeJIT::compileBaseValueStoreBarrier): Deleted.
(JSC::DFG::SpeculativeJIT::writeBarrier): Deleted.
* dfg/DFGSpeculativeJIT64.cpp:
(JSC::DFG::SpeculativeJIT::compile):
(JSC::DFG::SpeculativeJIT::compileBaseValueStoreBarrier): Deleted.
(JSC::DFG::SpeculativeJIT::writeBarrier): Deleted.
* dfg/DFGStoreBarrierClusteringPhase.cpp: Added.
(JSC::DFG::performStoreBarrierClustering):
* dfg/DFGStoreBarrierClusteringPhase.h: Added.
* dfg/DFGStoreBarrierInsertionPhase.cpp:
* dfg/DFGStoreBarrierInsertionPhase.h:
* ftl/FTLAbstractHeap.h:
(JSC::FTL::AbsoluteAbstractHeap::at):
(JSC::FTL::AbsoluteAbstractHeap::operator[]):
* ftl/FTLAbstractHeapRepository.cpp:
(JSC::FTL::AbstractHeapRepository::decorateFenceRead):
(JSC::FTL::AbstractHeapRepository::decorateFenceWrite):
(JSC::FTL::AbstractHeapRepository::computeRangesAndDecorateInstructions):
* ftl/FTLAbstractHeapRepository.h:
* ftl/FTLCapabilities.cpp:
(JSC::FTL::canCompile):
* ftl/FTLLowerDFGToB3.cpp:
(JSC::FTL::DFG::LowerDFGToB3::compileNode):
(JSC::FTL::DFG::LowerDFGToB3::compileStoreBarrier):
(JSC::FTL::DFG::LowerDFGToB3::storageForTransition):
(JSC::FTL::DFG::LowerDFGToB3::lazySlowPath):
(JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier):
* ftl/FTLOutput.cpp:
(JSC::FTL::Output::fence):
(JSC::FTL::Output::absolute):
* ftl/FTLOutput.h:
* heap/CellState.h:
(JSC::isWithinThreshold):
(JSC::isBlack):
* heap/Heap.cpp:
(JSC::Heap::writeBarrierSlowPath):
* heap/Heap.h:
(JSC::Heap::barrierShouldBeFenced):
(JSC::Heap::addressOfBarrierShouldBeFenced):
(JSC::Heap::sneakyBlackThreshold):
(JSC::Heap::addressOfSneakyBlackThreshold):
* heap/HeapInlines.h:
(JSC::Heap::writeBarrier):
(JSC::Heap::writeBarrierWithoutFence):
* jit/AssemblyHelpers.h:
(JSC::AssemblyHelpers::jumpIfIsRememberedOrInEdenWithoutFence):
(JSC::AssemblyHelpers::sneakyJumpIfIsRememberedOrInEden):
(JSC::AssemblyHelpers::jumpIfIsRememberedOrInEden):
(JSC::AssemblyHelpers::storeBarrierStoreLoadFence):
(JSC::AssemblyHelpers::jumpIfStoreBarrierStoreLoadFenceNotNeeded):
* jit/JITOperations.cpp:
* jit/JITOperations.h:
* jit/JITPropertyAccess.cpp:
(JSC::JIT::emit_op_put_by_id):
(JSC::JIT::emitWriteBarrier):
(JSC::JIT::privateCompilePutByVal):
* jit/JITPropertyAccess32_64.cpp:
(JSC::JIT::emit_op_put_by_id):
* llint/LowLevelInterpreter.asm:
* offlineasm/x86.rb:
* runtime/Options.h:

Source/WTF:

Added clearRange(), which quickly clears a range of bits. This turned out to be useful for
a DFG optimization pass.
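
For example, the new store barrier clustering phase keeps one bit per node of the current block
and uses clearRange() to cheaply reset just that prefix between blocks (this mirrors its actual
usage):

    FastBitVector barrierPoints;
    barrierPoints.resize(maxBlockSize);     // sized once, for the largest block
    // ... set bits at the indices of the chosen barrier points ...
    barrierPoints.clearRange(0, blockSize); // quickly clear bits [0, blockSize)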

* wtf/FastBitVector.cpp:
(WTF::FastBitVector::clearRange):
* wtf/FastBitVector.h:

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@206555 268f45cc-cd09-0410-ab3c-d52691b4dbfc

64 files changed:
Source/JavaScriptCore/CMakeLists.txt
Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj
Source/JavaScriptCore/assembler/MacroAssembler.h
Source/JavaScriptCore/assembler/MacroAssemblerX86_64.h
Source/JavaScriptCore/bytecode/PolymorphicAccess.cpp
Source/JavaScriptCore/dfg/DFGAbstractHeap.h
Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h
Source/JavaScriptCore/dfg/DFGClobberize.h
Source/JavaScriptCore/dfg/DFGClobbersExitState.cpp
Source/JavaScriptCore/dfg/DFGDoesGC.cpp
Source/JavaScriptCore/dfg/DFGFixupPhase.cpp
Source/JavaScriptCore/dfg/DFGMayExit.cpp
Source/JavaScriptCore/dfg/DFGNode.h
Source/JavaScriptCore/dfg/DFGNodeType.h
Source/JavaScriptCore/dfg/DFGOSRExitCompilerCommon.cpp
Source/JavaScriptCore/dfg/DFGOperations.cpp
Source/JavaScriptCore/dfg/DFGPlan.cpp
Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp
Source/JavaScriptCore/dfg/DFGSafeToExecute.h
Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp
Source/JavaScriptCore/dfg/DFGSpeculativeJIT.h
Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp
Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp
Source/JavaScriptCore/dfg/DFGStoreBarrierClusteringPhase.cpp [new file with mode: 0644]
Source/JavaScriptCore/dfg/DFGStoreBarrierClusteringPhase.h [new file with mode: 0644]
Source/JavaScriptCore/dfg/DFGStoreBarrierInsertionPhase.cpp
Source/JavaScriptCore/dfg/DFGStoreBarrierInsertionPhase.h
Source/JavaScriptCore/ftl/FTLAbstractHeap.h
Source/JavaScriptCore/ftl/FTLAbstractHeapRepository.cpp
Source/JavaScriptCore/ftl/FTLAbstractHeapRepository.h
Source/JavaScriptCore/ftl/FTLCapabilities.cpp
Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp
Source/JavaScriptCore/ftl/FTLOutput.cpp
Source/JavaScriptCore/ftl/FTLOutput.h
Source/JavaScriptCore/heap/CellState.h
Source/JavaScriptCore/heap/GCDeferralContext.h [new file with mode: 0644]
Source/JavaScriptCore/heap/GCDeferralContextInlines.h [new file with mode: 0644]
Source/JavaScriptCore/heap/Heap.cpp
Source/JavaScriptCore/heap/Heap.h
Source/JavaScriptCore/heap/HeapInlines.h
Source/JavaScriptCore/heap/MarkedAllocator.cpp
Source/JavaScriptCore/heap/MarkedAllocator.h
Source/JavaScriptCore/heap/MarkedSpace.cpp
Source/JavaScriptCore/heap/MarkedSpace.h
Source/JavaScriptCore/jit/AssemblyHelpers.h
Source/JavaScriptCore/jit/JITOperations.cpp
Source/JavaScriptCore/jit/JITOperations.h
Source/JavaScriptCore/jit/JITPropertyAccess.cpp
Source/JavaScriptCore/jit/JITPropertyAccess32_64.cpp
Source/JavaScriptCore/llint/LowLevelInterpreter.asm
Source/JavaScriptCore/offlineasm/x86.rb
Source/JavaScriptCore/runtime/JSArray.cpp
Source/JavaScriptCore/runtime/JSArray.h
Source/JavaScriptCore/runtime/JSCell.h
Source/JavaScriptCore/runtime/JSCellInlines.h
Source/JavaScriptCore/runtime/JSObject.h
Source/JavaScriptCore/runtime/JSString.h
Source/JavaScriptCore/runtime/Options.h
Source/JavaScriptCore/runtime/RegExpMatchesArray.cpp
Source/JavaScriptCore/runtime/RegExpMatchesArray.h
Source/WTF/ChangeLog
Source/WTF/wtf/FastBitVector.cpp
Source/WTF/wtf/FastBitVector.h

index 9c91dec..a98d864 100644 (file)
@@ -361,6 +361,7 @@ set(JavaScriptCore_SOURCES
     dfg/DFGSpeculativeJIT64.cpp
     dfg/DFGStackLayoutPhase.cpp
     dfg/DFGStaticExecutionCountEstimationPhase.cpp
+    dfg/DFGStoreBarrierClusteringPhase.cpp
     dfg/DFGStoreBarrierInsertionPhase.cpp
     dfg/DFGStrengthReductionPhase.cpp
     dfg/DFGStructureAbstractValue.cpp
index 9232e3c..c004f73 100644 (file)
@@ -1,3 +1,191 @@
+2016-09-28  Filip Pizlo  <fpizlo@apple.com>
+
+        The write barrier should be down with TSO
+        https://bugs.webkit.org/show_bug.cgi?id=162316
+
+        Reviewed by Geoffrey Garen.
+        
+        This makes our write barrier behave correctly when it races with the collector. The
+        collector wants to do this when visiting:
+        
+            object->cellState = black
+            visit(object)
+        
+        The mutator wants to do this when storing:
+        
+            object->property = newValue
+            if (object->cellState == black)
+                remember(object)
+        
+        Prior to this change, this didn't work right because the compiler would sometimes place
+        barriers before the store to the property and because the mutator did not have adequate
+        fences.
+        
+        Prior to this change, the DFG and FTL would emit this:
+        
+            if (object->cellState == black)
+                remember(object)
+            object->property = newValue
+        
+        Which is wrong, because the object could start being scanned just after the cellState
+        check, at which point the store would be lost. We need to confirm that the state was not
+        black *after* the store! This change was harder than you'd expect: placing the barrier
+        after the store broke B3's ability to do its super crazy ninja CSE on some store-load
+        redundancies. Because the B3 CSE has some moves that the DFG CSE lacks, the DFG CSE's
+        ability to ignore barriers didn't help. I fixed this by having the FTL convey precise
+        heap ranges for the patchpoint corresponding to the barrier slow path. It reads the world
+        (because of the store-load fence) and it writes only cellState (because the B3 heap ranges
+        don't have any way to represent any of the GC's other state, which means that B3 does not
+        have to worry about aliasing with any of that).
+        
+        The collector already uses a store-load fence on x86 just after setting the cellState and
+        before visiting the object. The mutator needs to do the same. But we cannot put a
+        store-load fence of any kind before store barriers, because that causes enormous slow
+        downs. In the worst case, Octane/richards slowed down by 90%! That's crazy! However, the
+        overall slow downs were small enough (0-15% on benchmark suite aggregates) that it would be
+        reasonable if the slow down only happened while the GC was running. Then, the concurrent GC
+        would lift throughput-while-collecting from 0% of peak to 85% of peak. This changes the
+        barrier so that it looks like this:
+        
+            if (object->cellState <= heap.sneakyBlackThreshold)
+                slowPath(object)
+        
+        Where sneakyBlackThreshold is the normal blackThreshold when we're not collecting, or a
+        tautological threshold (that makes everything look black) when we are collecting. This
+        turns out to not be any more expensive than the barrier in tip of tree when the GC is not
+        running, or a 0-15% slow-down when it is "running". (Of course we don't run the GC
+        concurrently yet. I still have more work to do.) The slowPath() does some extra work to
+        check if we are concurrently collecting; if so, it does a fence and rechecks if the object
+        really did need that barrier.
+        
+        This also reintroduces elimination of redundant store barriers, which was lost in the last
+        store barrier change. We can only do it when there is no possibility of GC, exit, or
+        exceptions between the two store barriers. We could remove the exit/exception limitation if
+        we taught OSR exit how to buffer store barriers, which is an insane thing to do considering
+        that I've never been able to detect a win from redundant store barrier elimination. I just
+        want us to have it for stupidly obvious situations, like a tight sequence of stores to the
+        same object. This same optimization also sometimes strength-reduces the store barrier so
+        that it uses a constant black threshold rather than the sneaky one, thereby saving one
+        load.
+        
+        Even with all of those optimizations, I still had problems with barrier cost. I found that one
+        of the benchmarks that was being hit particularly hard was JetStream/regexp-2010. Fortunately
+        that benchmark does most of its barriers in a tight C++ loop in RegExpMatchesArray.h. When we
+        know what we're doing, we can defer GC around a bunch of object initializations and then remove
+        all of the barriers between any of the objects allocated within the deferral. Unfortunately,
+        our GC deferral mechanism isn't really performant enough to make this be a worthwhile
+        optimization. The most efficient version of such an optimization that I could come up with was
+        to have a DeferralContext object that houses a boolean that is false by default, but the GC
+        writes true into it if it would have wanted to GC. You thread a pointer to the deferralContext
+        through all of your allocations. This kind of mechanism has the overhead of a zero 
+        initialization on the stack on entry and a zero check on exit. This is probably even efficient
+        enough that we could start thinking about having the DFG use it, for example if we found a
+        bounded-time section of code with a lot of barriers and entry/exit sites that aren't totally
+        wacky. This optimization took this patch from 0.68% JetStream regressed to neutral, according
+        to my latest data.
+        
+        Finally, an earlier version of this change put the store-load fence in B3 IR, so I ended up
+        adding FTLOutput support for it and AbstractHeapRepository magic for decorating the heaps.
+        I think we might as well keep that, it'll be useful.
+
+        * CMakeLists.txt:
+        * JavaScriptCore.xcodeproj/project.pbxproj:
+        * assembler/MacroAssembler.h:
+        (JSC::MacroAssembler::branch32):
+        * assembler/MacroAssemblerX86_64.h:
+        (JSC::MacroAssemblerX86_64::branch32):
+        (JSC::MacroAssemblerX86_64::branch64): Deleted.
+        * bytecode/PolymorphicAccess.cpp:
+        (JSC::AccessCase::generateImpl):
+        * dfg/DFGAbstractHeap.h:
+        * dfg/DFGAbstractInterpreterInlines.h:
+        (JSC::DFG::AbstractInterpreter<AbstractStateType>::executeEffects):
+        * dfg/DFGClobberize.h:
+        (JSC::DFG::clobberize):
+        * dfg/DFGClobbersExitState.cpp:
+        (JSC::DFG::clobbersExitState):
+        * dfg/DFGDoesGC.cpp:
+        (JSC::DFG::doesGC):
+        * dfg/DFGFixupPhase.cpp:
+        (JSC::DFG::FixupPhase::fixupNode):
+        * dfg/DFGMayExit.cpp:
+        * dfg/DFGNode.h:
+        (JSC::DFG::Node::isStoreBarrier):
+        * dfg/DFGNodeType.h:
+        * dfg/DFGPlan.cpp:
+        (JSC::DFG::Plan::compileInThreadImpl):
+        * dfg/DFGPredictionPropagationPhase.cpp:
+        * dfg/DFGSafeToExecute.h:
+        (JSC::DFG::safeToExecute):
+        * dfg/DFGSpeculativeJIT.cpp:
+        (JSC::DFG::SpeculativeJIT::compileStoreBarrier):
+        (JSC::DFG::SpeculativeJIT::storeToWriteBarrierBuffer): Deleted.
+        (JSC::DFG::SpeculativeJIT::writeBarrier): Deleted.
+        * dfg/DFGSpeculativeJIT.h:
+        * dfg/DFGSpeculativeJIT32_64.cpp:
+        (JSC::DFG::SpeculativeJIT::compile):
+        (JSC::DFG::SpeculativeJIT::compileBaseValueStoreBarrier): Deleted.
+        (JSC::DFG::SpeculativeJIT::writeBarrier): Deleted.
+        * dfg/DFGSpeculativeJIT64.cpp:
+        (JSC::DFG::SpeculativeJIT::compile):
+        (JSC::DFG::SpeculativeJIT::compileBaseValueStoreBarrier): Deleted.
+        (JSC::DFG::SpeculativeJIT::writeBarrier): Deleted.
+        * dfg/DFGStoreBarrierClusteringPhase.cpp: Added.
+        (JSC::DFG::performStoreBarrierClustering):
+        * dfg/DFGStoreBarrierClusteringPhase.h: Added.
+        * dfg/DFGStoreBarrierInsertionPhase.cpp:
+        * dfg/DFGStoreBarrierInsertionPhase.h:
+        * ftl/FTLAbstractHeap.h:
+        (JSC::FTL::AbsoluteAbstractHeap::at):
+        (JSC::FTL::AbsoluteAbstractHeap::operator[]):
+        * ftl/FTLAbstractHeapRepository.cpp:
+        (JSC::FTL::AbstractHeapRepository::decorateFenceRead):
+        (JSC::FTL::AbstractHeapRepository::decorateFenceWrite):
+        (JSC::FTL::AbstractHeapRepository::computeRangesAndDecorateInstructions):
+        * ftl/FTLAbstractHeapRepository.h:
+        * ftl/FTLCapabilities.cpp:
+        (JSC::FTL::canCompile):
+        * ftl/FTLLowerDFGToB3.cpp:
+        (JSC::FTL::DFG::LowerDFGToB3::compileNode):
+        (JSC::FTL::DFG::LowerDFGToB3::compileStoreBarrier):
+        (JSC::FTL::DFG::LowerDFGToB3::storageForTransition):
+        (JSC::FTL::DFG::LowerDFGToB3::lazySlowPath):
+        (JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier):
+        * ftl/FTLOutput.cpp:
+        (JSC::FTL::Output::fence):
+        (JSC::FTL::Output::absolute):
+        * ftl/FTLOutput.h:
+        * heap/CellState.h:
+        (JSC::isWithinThreshold):
+        (JSC::isBlack):
+        * heap/Heap.cpp:
+        (JSC::Heap::writeBarrierSlowPath):
+        * heap/Heap.h:
+        (JSC::Heap::barrierShouldBeFenced):
+        (JSC::Heap::addressOfBarrierShouldBeFenced):
+        (JSC::Heap::sneakyBlackThreshold):
+        (JSC::Heap::addressOfSneakyBlackThreshold):
+        * heap/HeapInlines.h:
+        (JSC::Heap::writeBarrier):
+        (JSC::Heap::writeBarrierWithoutFence):
+        * jit/AssemblyHelpers.h:
+        (JSC::AssemblyHelpers::jumpIfIsRememberedOrInEdenWithoutFence):
+        (JSC::AssemblyHelpers::sneakyJumpIfIsRememberedOrInEden):
+        (JSC::AssemblyHelpers::jumpIfIsRememberedOrInEden):
+        (JSC::AssemblyHelpers::storeBarrierStoreLoadFence):
+        (JSC::AssemblyHelpers::jumpIfStoreBarrierStoreLoadFenceNotNeeded):
+        * jit/JITOperations.cpp:
+        * jit/JITOperations.h:
+        * jit/JITPropertyAccess.cpp:
+        (JSC::JIT::emit_op_put_by_id):
+        (JSC::JIT::emitWriteBarrier):
+        (JSC::JIT::privateCompilePutByVal):
+        * jit/JITPropertyAccess32_64.cpp:
+        (JSC::JIT::emit_op_put_by_id):
+        * llint/LowLevelInterpreter.asm:
+        * offlineasm/x86.rb:
+        * runtime/Options.h:
+
 2016-09-27  Joseph Pecoraro  <pecoraro@apple.com>
 
         Improve useCodeCache Option description string.
index f734f91..29fec2c 100644 (file)
                0F7C39FF1C90C55B00480151 /* DFGOpInfo.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F7C39FE1C90C55B00480151 /* DFGOpInfo.h */; };
                0F7C5FB81D888A0C0044F5E2 /* MarkedBlockInlines.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F7C5FB71D888A010044F5E2 /* MarkedBlockInlines.h */; };
                0F7C5FBA1D8895070044F5E2 /* MarkedSpaceInlines.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F7C5FB91D8895050044F5E2 /* MarkedSpaceInlines.h */; };
+               0F7F988B1D9596C500F4F12E /* DFGStoreBarrierClusteringPhase.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F7F98891D9596C300F4F12E /* DFGStoreBarrierClusteringPhase.cpp */; };
+               0F7F988C1D9596C800F4F12E /* DFGStoreBarrierClusteringPhase.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F7F988A1D9596C300F4F12E /* DFGStoreBarrierClusteringPhase.h */; };
                0F8023EA1613832B00A0BA45 /* ByValInfo.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F8023E91613832300A0BA45 /* ByValInfo.h */; settings = {ATTRIBUTES = (Private, ); }; };
                0F8335B71639C1E6001443B5 /* ArrayAllocationProfile.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F8335B41639C1E3001443B5 /* ArrayAllocationProfile.cpp */; };
                0F8335B81639C1EA001443B5 /* ArrayAllocationProfile.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F8335B51639C1E3001443B5 /* ArrayAllocationProfile.h */; settings = {ATTRIBUTES = (Private, ); }; };
                0FB387921BFD31A100E3AB1E /* FTLCompile.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0FB387911BFD31A100E3AB1E /* FTLCompile.cpp */; };
                0FB415841D78FB4C00DF8D09 /* ArrayConventions.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0FB415831D78F98200DF8D09 /* ArrayConventions.cpp */; };
                0FB438A319270B1D00E1FBC9 /* StructureSet.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0FB438A219270B1D00E1FBC9 /* StructureSet.cpp */; };
+               0FB4767E1D99AEA9008EA6CB /* GCDeferralContext.h in Headers */ = {isa = PBXBuildFile; fileRef = 0FB4767C1D99AEA7008EA6CB /* GCDeferralContext.h */; settings = {ATTRIBUTES = (Private, ); }; };
+               0FB4767F1D99AEAD008EA6CB /* GCDeferralContextInlines.h in Headers */ = {isa = PBXBuildFile; fileRef = 0FB4767D1D99AEA7008EA6CB /* GCDeferralContextInlines.h */; };
                0FB4FB731BC843140025CA5A /* FTLLazySlowPath.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0FB4FB701BC843140025CA5A /* FTLLazySlowPath.cpp */; };
                0FB4FB741BC843140025CA5A /* FTLLazySlowPath.h in Headers */ = {isa = PBXBuildFile; fileRef = 0FB4FB711BC843140025CA5A /* FTLLazySlowPath.h */; };
                0FB4FB751BC843140025CA5A /* FTLLazySlowPathCall.h in Headers */ = {isa = PBXBuildFile; fileRef = 0FB4FB721BC843140025CA5A /* FTLLazySlowPathCall.h */; };
                0F7C39FE1C90C55B00480151 /* DFGOpInfo.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = DFGOpInfo.h; path = dfg/DFGOpInfo.h; sourceTree = "<group>"; };
                0F7C5FB71D888A010044F5E2 /* MarkedBlockInlines.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = MarkedBlockInlines.h; sourceTree = "<group>"; };
                0F7C5FB91D8895050044F5E2 /* MarkedSpaceInlines.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = MarkedSpaceInlines.h; sourceTree = "<group>"; };
+               0F7F98891D9596C300F4F12E /* DFGStoreBarrierClusteringPhase.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = DFGStoreBarrierClusteringPhase.cpp; path = dfg/DFGStoreBarrierClusteringPhase.cpp; sourceTree = "<group>"; };
+               0F7F988A1D9596C300F4F12E /* DFGStoreBarrierClusteringPhase.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = DFGStoreBarrierClusteringPhase.h; path = dfg/DFGStoreBarrierClusteringPhase.h; sourceTree = "<group>"; };
                0F8023E91613832300A0BA45 /* ByValInfo.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = ByValInfo.h; sourceTree = "<group>"; };
                0F8335B41639C1E3001443B5 /* ArrayAllocationProfile.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = ArrayAllocationProfile.cpp; sourceTree = "<group>"; };
                0F8335B51639C1E3001443B5 /* ArrayAllocationProfile.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = ArrayAllocationProfile.h; sourceTree = "<group>"; };
                0FB387911BFD31A100E3AB1E /* FTLCompile.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = FTLCompile.cpp; path = ftl/FTLCompile.cpp; sourceTree = "<group>"; };
                0FB415831D78F98200DF8D09 /* ArrayConventions.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = ArrayConventions.cpp; sourceTree = "<group>"; };
                0FB438A219270B1D00E1FBC9 /* StructureSet.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = StructureSet.cpp; sourceTree = "<group>"; };
+               0FB4767C1D99AEA7008EA6CB /* GCDeferralContext.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = GCDeferralContext.h; sourceTree = "<group>"; };
+               0FB4767D1D99AEA7008EA6CB /* GCDeferralContextInlines.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = GCDeferralContextInlines.h; sourceTree = "<group>"; };
                0FB4B51016B3A964003F696B /* DFGMinifiedID.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = DFGMinifiedID.h; path = dfg/DFGMinifiedID.h; sourceTree = "<group>"; };
                0FB4B51916B62772003F696B /* DFGAllocator.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = DFGAllocator.h; path = dfg/DFGAllocator.h; sourceTree = "<group>"; };
                0FB4B51A16B62772003F696B /* DFGCommon.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = DFGCommon.cpp; path = dfg/DFGCommon.cpp; sourceTree = "<group>"; };
                                2AACE63A18CA5A0300ED0191 /* GCActivityCallback.cpp */,
                                2AACE63B18CA5A0300ED0191 /* GCActivityCallback.h */,
                                BCBE2CAD14E985AA000593AD /* GCAssertions.h */,
+                               0FB4767C1D99AEA7008EA6CB /* GCDeferralContext.h */,
+                               0FB4767D1D99AEA7008EA6CB /* GCDeferralContextInlines.h */,
                                0F2B66A817B6B53D00A7AE3F /* GCIncomingRefCounted.h */,
                                0F2B66A917B6B53D00A7AE3F /* GCIncomingRefCountedInlines.h */,
                                0F2B66AA17B6B53D00A7AE3F /* GCIncomingRefCountedSet.h */,
                                0F9FB4F317FCB91700CB67F8 /* DFGStackLayoutPhase.h */,
                                0F4F29DD18B6AD1C0057BC15 /* DFGStaticExecutionCountEstimationPhase.cpp */,
                                0F4F29DE18B6AD1C0057BC15 /* DFGStaticExecutionCountEstimationPhase.h */,
+                               0F7F98891D9596C300F4F12E /* DFGStoreBarrierClusteringPhase.cpp */,
+                               0F7F988A1D9596C300F4F12E /* DFGStoreBarrierClusteringPhase.h */,
                                0F9E32611B05AB0400801ED5 /* DFGStoreBarrierInsertionPhase.cpp */,
                                0F9E32621B05AB0400801ED5 /* DFGStoreBarrierInsertionPhase.h */,
                                0FC20CB31852E2C600C9E954 /* DFGStrengthReductionPhase.cpp */,
                                0F61832A1C45BF070072450B /* AirCCallingConvention.h in Headers */,
                                E33F50791B84225700413856 /* JSInternalPromiseConstructor.h in Headers */,
                                E328DAE91D38D005001A2529 /* BytecodeGraph.h in Headers */,
+                               0F7F988C1D9596C800F4F12E /* DFGStoreBarrierClusteringPhase.h in Headers */,
                                E33F50871B8449EF00413856 /* JSInternalPromiseConstructor.lut.h in Headers */,
                                E33F50851B8437A000413856 /* JSInternalPromiseDeferred.h in Headers */,
                                E33F50751B8421C000413856 /* JSInternalPromisePrototype.h in Headers */,
                                A7B601821639FD2A00372BA3 /* UnlinkedCodeBlock.h in Headers */,
                                14142E511B796ECE00F4BF4B /* UnlinkedFunctionExecutable.h in Headers */,
                                0F2E892C16D028AD009E4FD2 /* UnusedPointer.h in Headers */,
+                               0FB4767E1D99AEA9008EA6CB /* GCDeferralContext.h in Headers */,
                                99DA00B11BD5994E00F4575C /* UpdateContents.py in Headers */,
                                0F963B3813FC6FE90002D9B2 /* ValueProfile.h in Headers */,
                                0F426A481460CBB300131F8F /* ValueRecovery.h in Headers */,
                                451539B912DC994500EF7AC4 /* Yarr.h in Headers */,
                                86704B8512DBA33700A9FE7B /* YarrInterpreter.h in Headers */,
                                86704B8712DBA33700A9FE7B /* YarrJIT.h in Headers */,
+                               0FB4767F1D99AEAD008EA6CB /* GCDeferralContextInlines.h in Headers */,
                                86704B8812DBA33700A9FE7B /* YarrParser.h in Headers */,
                                86704B8A12DBA33700A9FE7B /* YarrPattern.h in Headers */,
                                262D85B71C0D650F006ACB61 /* AirFixPartialRegisterStalls.h in Headers */,
                                A7D89CF717A0B8CC00773AD8 /* DFGFlushFormat.cpp in Sources */,
                                0F69CC88193AC60A0045759E /* DFGFrozenValue.cpp in Sources */,
                                86EC9DC71328DF82002B2AD7 /* DFGGraph.cpp in Sources */,
+                               0F7F988B1D9596C500F4F12E /* DFGStoreBarrierClusteringPhase.cpp in Sources */,
                                0F2FCCF918A60070001A27F8 /* DFGGraphSafepoint.cpp in Sources */,
                                0FB17660196B8F9E0091052A /* DFGHeapLocation.cpp in Sources */,
                                0FC841681BA8C3210061837D /* DFGInferredTypeCheck.cpp in Sources */,
index 363506d..ee663ac 100644 (file)
@@ -373,6 +373,11 @@ public:
         branchPtr(cond, op1, imm).linkTo(target, this);
     }
 
+    Jump branch32(RelationalCondition cond, RegisterID left, AbsoluteAddress right)
+    {
+        return branch32(flip(cond), right, left);
+    }
+
     void branch32(RelationalCondition cond, RegisterID op1, RegisterID op2, Label target)
     {
         branch32(cond, op1, op2).linkTo(target, this);
index 97d4291..3797282 100644 (file)
@@ -44,6 +44,7 @@ public:
 
     using MacroAssemblerX86Common::add32;
     using MacroAssemblerX86Common::and32;
+    using MacroAssemblerX86Common::branch32;
     using MacroAssemblerX86Common::branchAdd32;
     using MacroAssemblerX86Common::or32;
     using MacroAssemblerX86Common::sub32;
@@ -827,6 +828,12 @@ public:
         m_assembler.cmpq_rm(right, address.offset, address.base, address.index, address.scale);
         return Jump(m_assembler.jCC(x86Condition(cond)));
     }
+    
+    Jump branch32(RelationalCondition cond, AbsoluteAddress left, RegisterID right)
+    {
+        load32(left.m_ptr, scratchRegister());
+        return branch32(cond, scratchRegister(), right);
+    }
 
     Jump branchPtr(RelationalCondition cond, BaseIndex left, RegisterID right)
     {
index 256ca43..2cf2667 100644 (file)
@@ -1309,29 +1309,7 @@ void AccessCase::generateImpl(AccessGenerationState& state)
                 CCallHelpers::Address(scratchGPR, offsetInButterfly(m_offset) * sizeof(JSValue)));
         }
         
-        // If we had allocated using an operation then we would have already executed the store
-        // barrier and we would have already stored the butterfly into the object.
         if (allocatingInline) {
-            CCallHelpers::Jump ownerIsRememberedOrInEden = jit.jumpIfIsRememberedOrInEden(baseGPR);
-            WriteBarrierBuffer& writeBarrierBuffer = jit.vm()->heap.writeBarrierBuffer();
-            jit.load32(writeBarrierBuffer.currentIndexAddress(), scratchGPR2);
-            slowPath.append(
-                jit.branch32(
-                    CCallHelpers::AboveOrEqual, scratchGPR2,
-                    CCallHelpers::TrustedImm32(writeBarrierBuffer.capacity())));
-            
-            jit.add32(CCallHelpers::TrustedImm32(1), scratchGPR2);
-            jit.store32(scratchGPR2, writeBarrierBuffer.currentIndexAddress());
-            
-            jit.move(CCallHelpers::TrustedImmPtr(writeBarrierBuffer.buffer()), scratchGPR3);
-            // We use an offset of -sizeof(void*) because we already added 1 to scratchGPR2.
-            jit.storePtr(
-                baseGPR,
-                CCallHelpers::BaseIndex(
-                    scratchGPR3, scratchGPR2, CCallHelpers::ScalePtr,
-                    static_cast<int32_t>(-sizeof(void*))));
-            ownerIsRememberedOrInEden.link(&jit);
-            
             // We set the new butterfly and the structure last. Doing it this way ensures that
             // whatever we had done up to this point is forgotten if we choose to branch to slow
             // path.
index cbe5a00..0abd502 100644 (file)
@@ -51,8 +51,9 @@ namespace JSC { namespace DFG {
     macro(Butterfly_vectorLength) \
     macro(GetterSetter_getter) \
     macro(GetterSetter_setter) \
-    macro(JSCell_structureID) \
+    macro(JSCell_cellState) \
     macro(JSCell_indexingType) \
+    macro(JSCell_structureID) \
     macro(JSCell_typeInfoFlags) \
     macro(JSCell_typeInfoType) \
     macro(JSObject_butterfly) \
index b8a40b1..9db13e3 100644 (file)
@@ -2843,11 +2843,12 @@ bool AbstractInterpreter<AbstractStateType>::executeEffects(unsigned clobberLimi
         break;
     }
 
-    case StoreBarrier: {
+    case StoreBarrier:
+    case FencedStoreBarrier: {
         filter(node->child1(), SpecCell);
         break;
     }
-
+        
     case CheckTierUpAndOSREnter:
     case LoopHint:
     case ZombieHint:
index f030477..87d8b52 100644 (file)
@@ -404,11 +404,20 @@ void clobberize(Graph& graph, Node* node, const ReadFunctor& read, const WriteFu
     case LoopHint:
     case ProfileType:
     case ProfileControlFlow:
-    case StoreBarrier:
     case PutHint:
         write(SideState);
         return;
         
+    case StoreBarrier:
+        read(JSCell_cellState);
+        write(JSCell_cellState);
+        return;
+        
+    case FencedStoreBarrier:
+        read(Heap);
+        write(JSCell_cellState);
+        return;
+
     case InvalidationPoint:
         write(SideState);
         def(HeapLocation(InvalidationPointLoc, Watchpoint_fire), LazyNode(node));
index 809a45d..bd339b8 100644 (file)
@@ -67,6 +67,8 @@ bool clobbersExitState(Graph& graph, Node* node)
     case CountExecution:
     case AllocatePropertyStorage:
     case ReallocatePropertyStorage:
+    case StoreBarrier:
+    case FencedStoreBarrier:
         // These do clobber memory, but nothing that is observable. It may be nice to separate the
         // heaps into those that are observable and those that aren't, but we don't do that right now.
         // FIXME: https://bugs.webkit.org/show_bug.cgi?id=148440
index 2dcf55d..0ad54fe 100644 (file)
@@ -196,6 +196,7 @@ bool doesGC(Graph& graph, Node* node)
     case CheckTierUpAndOSREnter:
     case LoopHint:
     case StoreBarrier:
+    case FencedStoreBarrier:
     case InvalidationPoint:
     case NotifyWrite:
     case CheckInBounds:
index 6ee5142..5ad6afd 100644 (file)
@@ -1402,6 +1402,7 @@ private:
         case KillStack:
         case GetStack:
         case StoreBarrier:
+        case FencedStoreBarrier:
         case GetRegExpObjectLastIndex:
         case SetRegExpObjectLastIndex:
         case RecordRegExpCachedResult:
index 68df0aa..80b3374 100644 (file)
@@ -87,6 +87,7 @@ ExitMode mayExitImpl(Graph& graph, Node* node, StateType& state)
     case NotifyWrite:
     case PutStructure:
     case StoreBarrier:
+    case FencedStoreBarrier:
     case PutByOffset:
     case PutClosureVar:
     case RecordRegExpCachedResult:
index 9b70071..4bae9ae 100644 (file)
@@ -896,7 +896,7 @@ public:
 
     bool isStoreBarrier()
     {
-        return op() == StoreBarrier;
+        return op() == StoreBarrier || op() == FencedStoreBarrier;
     }
 
     bool hasIdentifier()
index e01ce3a..8b4d9f0 100644 (file)
@@ -375,8 +375,9 @@ namespace JSC { namespace DFG {
     \
     /* Checks the watchdog timer. If the timer has fired, we call operation operationHandleWatchdogTimer*/ \
     macro(CheckWatchdogTimer, NodeMustGenerate) \
-    /* Write barriers */\
+    /* Write barriers */\
     macro(StoreBarrier, NodeMustGenerate) \
+    macro(FencedStoreBarrier, NodeMustGenerate) \
     \
     /* For-in enumeration opcodes */\
     macro(GetEnumerableLength, NodeMustGenerate | NodeResultJS) \
index 1014785..c63f53f 100644 (file)
@@ -249,7 +249,7 @@ void reifyInlinedCallFrames(CCallHelpers& jit, const OSRExitBase& exit)
 
 static void osrWriteBarrier(CCallHelpers& jit, GPRReg owner, GPRReg scratch)
 {
-    AssemblyHelpers::Jump ownerIsRememberedOrInEden = jit.jumpIfIsRememberedOrInEden(owner);
+    AssemblyHelpers::Jump ownerIsRememberedOrInEden = jit.barrierBranchWithoutFence(owner);
 
     // We need these extra slots because setupArgumentsWithExecState will use poke on x86.
 #if CPU(X86)
@@ -269,6 +269,8 @@ static void osrWriteBarrier(CCallHelpers& jit, GPRReg owner, GPRReg scratch)
 
 void adjustAndJumpToTarget(CCallHelpers& jit, const OSRExitBase& exit)
 {
+    jit.memoryFence();
+    
     jit.move(
         AssemblyHelpers::TrustedImmPtr(
             jit.codeBlock()->baselineAlternative()), GPRInfo::argumentGPR1);
index 7e8cc58..99829cb 100644 (file)
@@ -1021,7 +1021,7 @@ char* JIT_OPERATION operationNewArrayWithSize(ExecState* exec, Structure* arrayS
 
     JSArray* result;
     if (butterfly)
-        result = JSArray::createWithButterfly(vm, arrayStructure, butterfly);
+        result = JSArray::createWithButterfly(vm, nullptr, arrayStructure, butterfly);
     else
         result = JSArray::create(vm, arrayStructure, size);
     return bitwise_cast<char*>(result);
index 7416a67..1efcabf 100644 (file)
@@ -64,6 +64,7 @@
 #include "DFGSSALoweringPhase.h"
 #include "DFGStackLayoutPhase.h"
 #include "DFGStaticExecutionCountEstimationPhase.h"
+#include "DFGStoreBarrierClusteringPhase.h"
 #include "DFGStoreBarrierInsertionPhase.h"
 #include "DFGStrengthReductionPhase.h"
 #include "DFGStructureRegistrationPhase.h"
@@ -362,6 +363,7 @@ Plan::CompilationPath Plan::compileInThreadImpl(LongLivedState& longLivedState)
         performTierUpCheckInjection(dfg);
 
         performFastStoreBarrierInsertion(dfg);
+        performStoreBarrierClustering(dfg);
         performCleanUp(dfg);
         performCPSRethreading(dfg);
         performDCE(dfg);
@@ -451,6 +453,7 @@ Plan::CompilationPath Plan::compileInThreadImpl(LongLivedState& longLivedState)
         performLivenessAnalysis(dfg);
         performCFA(dfg);
         performGlobalStoreBarrierInsertion(dfg);
+        performStoreBarrierClustering(dfg);
         if (Options::useMovHintRemoval())
             performMovHintRemoval(dfg);
         performCleanUp(dfg);
index 06db40e..bafdf90 100644 (file)
@@ -995,6 +995,7 @@ private:
         case PutStack:
         case KillStack:
         case StoreBarrier:
+        case FencedStoreBarrier:
         case GetStack:
         case GetRegExpObjectLastIndex:
         case SetRegExpObjectLastIndex:
index 7d80cc6..bddd466 100644 (file)
@@ -314,7 +314,6 @@ bool safeToExecute(AbstractStateType& state, Graph& graph, Node* node)
     case CheckTierUpAtReturn:
     case CheckTierUpAndOSREnter:
     case LoopHint:
-    case StoreBarrier:
     case InvalidationPoint:
     case NotifyWrite:
     case CheckInBounds:
@@ -370,6 +369,14 @@ bool safeToExecute(AbstractStateType& state, Graph& graph, Node* node)
         // compiling this node.
         return false;
 
+    case StoreBarrier:
+    case FencedStoreBarrier:
+        // We conservatively assume that these cannot be put anywhere, which forces the compiler to
+        // keep them exactly where they were. This is sort of overkill since the clobberize effects
+        // already force these things to be ordered precisely. I'm just not confident enough in my
+        // effect based memory model to rely solely on that right now.
+        return false;
+
     case GetByVal:
     case GetIndexedPropertyStorage:
     case GetArrayLength:
index 7771a95..10efb72 100644 (file)
@@ -8305,46 +8305,51 @@ void SpeculativeJIT::linkBranches()
 
 void SpeculativeJIT::compileStoreBarrier(Node* node)
 {
-    ASSERT(node->op() == StoreBarrier);
+    ASSERT(node->op() == StoreBarrier || node->op() == FencedStoreBarrier);
+    
+    bool isFenced = node->op() == FencedStoreBarrier;
     
     SpeculateCellOperand base(this, node->child1());
     GPRTemporary scratch1(this);
     GPRTemporary scratch2(this);
     
-    writeBarrier(base.gpr(), scratch1.gpr(), scratch2.gpr());
-
-    noResult(node);
-}
+    GPRReg baseGPR = base.gpr();
+    GPRReg scratch1GPR = scratch1.gpr();
+    GPRReg scratch2GPR = scratch2.gpr();
+    
+    JITCompiler::JumpList ok;
+    
+    if (isFenced) {
+        ok.append(m_jit.barrierBranch(baseGPR, scratch1GPR));
+        
+        JITCompiler::Jump noFence = m_jit.jumpIfBarrierStoreLoadFenceNotNeeded();
+        m_jit.memoryFence();
+        ok.append(m_jit.barrierBranchWithoutFence(baseGPR));
+        noFence.link(&m_jit);
+    } else
+        ok.append(m_jit.barrierBranchWithoutFence(baseGPR));
 
-void SpeculativeJIT::storeToWriteBarrierBuffer(GPRReg cell, GPRReg scratch1, GPRReg scratch2)
-{
-    ASSERT(scratch1 != scratch2);
     WriteBarrierBuffer& writeBarrierBuffer = m_jit.vm()->heap.m_writeBarrierBuffer;
-    m_jit.load32(writeBarrierBuffer.currentIndexAddress(), scratch2);
-    JITCompiler::Jump needToFlush = m_jit.branch32(MacroAssembler::AboveOrEqual, scratch2, MacroAssembler::TrustedImm32(writeBarrierBuffer.capacity()));
+    m_jit.load32(writeBarrierBuffer.currentIndexAddress(), scratch2GPR);
+    JITCompiler::Jump needToFlush = m_jit.branch32(MacroAssembler::AboveOrEqual, scratch2GPR, MacroAssembler::TrustedImm32(writeBarrierBuffer.capacity()));
 
-    m_jit.add32(TrustedImm32(1), scratch2);
-    m_jit.store32(scratch2, writeBarrierBuffer.currentIndexAddress());
+    m_jit.add32(TrustedImm32(1), scratch2GPR);
+    m_jit.store32(scratch2GPR, writeBarrierBuffer.currentIndexAddress());
 
-    m_jit.move(TrustedImmPtr(writeBarrierBuffer.buffer()), scratch1);
+    m_jit.move(TrustedImmPtr(writeBarrierBuffer.buffer()), scratch1GPR);
     // We use an offset of -sizeof(void*) because we already added 1 to scratch2.
-    m_jit.storePtr(cell, MacroAssembler::BaseIndex(scratch1, scratch2, MacroAssembler::ScalePtr, static_cast<int32_t>(-sizeof(void*))));
+    m_jit.storePtr(baseGPR, MacroAssembler::BaseIndex(scratch1GPR, scratch2GPR, MacroAssembler::ScalePtr, static_cast<int32_t>(-sizeof(void*))));
 
-    JITCompiler::Jump done = m_jit.jump();
+    ok.append(m_jit.jump());
     needToFlush.link(&m_jit);
 
     silentSpillAllRegisters(InvalidGPRReg);
-    callOperation(operationFlushWriteBarrierBuffer, cell);
+    callOperation(operationFlushWriteBarrierBuffer, baseGPR);
     silentFillAllRegisters(InvalidGPRReg);
 
-    done.link(&m_jit);
-}
+    ok.link(&m_jit);
 
-void SpeculativeJIT::writeBarrier(GPRReg ownerGPR, GPRReg scratch1, GPRReg scratch2)
-{
-    JITCompiler::Jump ownerIsRememberedOrInEden = m_jit.jumpIfIsRememberedOrInEden(ownerGPR);
-    storeToWriteBarrierBuffer(ownerGPR, scratch1, scratch2);
-    ownerIsRememberedOrInEden.link(&m_jit);
+    noResult(node);
 }
 
 void SpeculativeJIT::compilePutAccessorById(Node* node)
index 9d3bb6f..ca978f6 100644 (file)
@@ -295,12 +295,6 @@ public:
         return masqueradesAsUndefinedWatchpointIsStillValid(m_currentNode->origin.semantic);
     }
 
-    void storeToWriteBarrierBuffer(GPRReg cell, GPRReg scratch1, GPRReg scratch2);
-
-    void writeBarrier(GPRReg owner, GPRReg scratch1, GPRReg scratch2);
-
-    void writeBarrier(GPRReg owner, GPRReg value, Edge valueUse, GPRReg scratch1, GPRReg scratch2);
-
     void compileStoreBarrier(Node*);
 
     static GPRReg selectScratchGPR(GPRReg preserve1 = InvalidGPRReg, GPRReg preserve2 = InvalidGPRReg, GPRReg preserve3 = InvalidGPRReg, GPRReg preserve4 = InvalidGPRReg)
@@ -737,8 +731,6 @@ public:
     void compileTryGetById(Node*);
     void compileIn(Node*);
     
-    void compileBaseValueStoreBarrier(Edge& baseEdge, Edge& valueEdge);
-
     void nonSpeculativeNonPeepholeCompareNullOrUndefined(Edge operand);
     void nonSpeculativePeepholeBranchNullOrUndefined(Edge operand, Node* branchNode);
     
index a22dbea..3a10a10 100644 (file)
@@ -1268,18 +1268,6 @@ GPRReg SpeculativeJIT::fillSpeculateBoolean(Edge edge)
     }
 }
 
-void SpeculativeJIT::compileBaseValueStoreBarrier(Edge& baseEdge, Edge& valueEdge)
-{
-    ASSERT(!isKnownNotCell(valueEdge.node()));
-
-    SpeculateCellOperand base(this, baseEdge);
-    JSValueOperand value(this, valueEdge);
-    GPRTemporary scratch1(this);
-    GPRTemporary scratch2(this);
-
-    writeBarrier(base.gpr(), value.tagGPR(), valueEdge, scratch1.gpr(), scratch2.gpr());
-}
-
 void SpeculativeJIT::compileObjectEquality(Node* node)
 {
     SpeculateCellOperand op1(this, node->child1());
@@ -5007,7 +4995,8 @@ void SpeculativeJIT::compile(Node* node)
         break;
     }
 
-    case StoreBarrier: {
+    case StoreBarrier:
+    case FencedStoreBarrier: {
         compileStoreBarrier(node);
         break;
     }
@@ -5474,20 +5463,6 @@ void SpeculativeJIT::compile(Node* node)
         use(node);
 }
 
-void SpeculativeJIT::writeBarrier(GPRReg ownerGPR, GPRReg valueTagGPR, Edge valueUse, GPRReg scratch1, GPRReg scratch2)
-{
-    JITCompiler::Jump isNotCell;
-    if (!isKnownCell(valueUse.node()))
-        isNotCell = m_jit.branch32(JITCompiler::NotEqual, valueTagGPR, JITCompiler::TrustedImm32(JSValue::CellTag));
-
-    JITCompiler::Jump ownerIsRememberedOrInEden = m_jit.jumpIfIsRememberedOrInEden(ownerGPR);
-    storeToWriteBarrierBuffer(ownerGPR, scratch1, scratch2);
-    ownerIsRememberedOrInEden.link(&m_jit);
-
-    if (!isKnownCell(valueUse.node()))
-        isNotCell.link(&m_jit);
-}
-
 void SpeculativeJIT::moveTrueTo(GPRReg gpr)
 {
     m_jit.move(TrustedImm32(1), gpr);
index 8cd5d71..f5ea74b 100644 (file)
@@ -1376,18 +1376,6 @@ GPRReg SpeculativeJIT::fillSpeculateBoolean(Edge edge)
     }
 }
 
-void SpeculativeJIT::compileBaseValueStoreBarrier(Edge& baseEdge, Edge& valueEdge)
-{
-    ASSERT(!isKnownNotCell(valueEdge.node()));
-
-    SpeculateCellOperand base(this, baseEdge);
-    JSValueOperand value(this, valueEdge);
-    GPRTemporary scratch1(this);
-    GPRTemporary scratch2(this);
-
-    writeBarrier(base.gpr(), value.gpr(), valueEdge, scratch1.gpr(), scratch2.gpr());
-}
-
 void SpeculativeJIT::compileObjectEquality(Node* node)
 {
     SpeculateCellOperand op1(this, node->child1());
@@ -5079,11 +5067,12 @@ void SpeculativeJIT::compile(Node* node)
         unreachable(node);
         break;
 
-    case StoreBarrier: {
+    case StoreBarrier:
+    case FencedStoreBarrier: {
         compileStoreBarrier(node);
         break;
     }
-
+        
     case GetEnumerableLength: {
         SpeculateCellOperand enumerator(this, node->child1());
         GPRFlushedCallResult result(this);
@@ -5575,20 +5564,6 @@ void SpeculativeJIT::compile(Node* node)
         use(node);
 }
 
-void SpeculativeJIT::writeBarrier(GPRReg ownerGPR, GPRReg valueGPR, Edge valueUse, GPRReg scratch1, GPRReg scratch2)
-{
-    JITCompiler::Jump isNotCell;
-    if (!isKnownCell(valueUse.node()))
-        isNotCell = m_jit.branchIfNotCell(JSValueRegs(valueGPR));
-    
-    JITCompiler::Jump ownerIsRememberedOrInEden = m_jit.jumpIfIsRememberedOrInEden(ownerGPR);
-    storeToWriteBarrierBuffer(ownerGPR, scratch1, scratch2);
-    ownerIsRememberedOrInEden.link(&m_jit);
-
-    if (!isKnownCell(valueUse.node()))
-        isNotCell.link(&m_jit);
-}
-
 void SpeculativeJIT::moveTrueTo(GPRReg gpr)
 {
     m_jit.move(TrustedImm32(ValueTrue), gpr);
diff --git a/Source/JavaScriptCore/dfg/DFGStoreBarrierClusteringPhase.cpp b/Source/JavaScriptCore/dfg/DFGStoreBarrierClusteringPhase.cpp
new file mode 100644 (file)
index 0000000..101c6b0
--- /dev/null
@@ -0,0 +1,173 @@
+/*
+ * Copyright (C) 2016 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. AND ITS CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR ITS CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "config.h"
+#include "DFGStoreBarrierClusteringPhase.h"
+
+#if ENABLE(DFG_JIT)
+
+#include "DFGDoesGC.h"
+#include "DFGGraph.h"
+#include "DFGInsertionSet.h"
+#include "DFGMayExit.h"
+#include "DFGPhase.h"
+#include "JSCInlines.h"
+#include <wtf/FastBitVector.h>
+
+namespace JSC { namespace DFG {
+
+namespace {
+
+class StoreBarrierClusteringPhase : public Phase {
+public:
+    StoreBarrierClusteringPhase(Graph& graph)
+        : Phase(graph, "store barrier fencing")
+        , m_insertionSet(graph)
+    {
+    }
+    
+    bool run()
+    {
+        size_t maxSize = 0;
+        for (BasicBlock* block : m_graph.blocksInNaturalOrder())
+            maxSize = std::max(maxSize, block->size());
+        m_barrierPoints.resize(maxSize);
+        
+        for (BasicBlock* block : m_graph.blocksInNaturalOrder()) {
+            size_t blockSize = block->size();
+            doBlock(block);
+            m_barrierPoints.clearRange(0, blockSize);
+        }
+        
+        return true;
+    }
+
+private:
+    // This summarizes everything we need to remember about a barrier.
+    struct ChildAndOrigin {
+        ChildAndOrigin() { }
+        
+        ChildAndOrigin(Node* child, CodeOrigin semanticOrigin)
+            : child(child)
+            , semanticOrigin(semanticOrigin)
+        {
+        }
+        
+        Node* child { nullptr };
+        CodeOrigin semanticOrigin;
+    };
+    
+    void doBlock(BasicBlock* block)
+    {
+        ASSERT(m_barrierPoints.isEmpty());
+        
+        // First identify the places where we would want to place all of the barriers based on a
+        // backwards analysis. We use the futureGC flag to tell us if we had seen a GC. Since this
+        // is a backwards analysis, when we get to a node, futureGC tells us if a GC will happen
+        // in the future after that node.
+        bool futureGC = true;
+        for (unsigned nodeIndex = block->size(); nodeIndex--;) {
+            Node* node = block->at(nodeIndex);
+            
+            // This is a backwards analysis, so exits require conservatism. If we exit, then there
+            // probably will be a GC in the future! If we needed to then we could lift that
+            // requirement by either (1) having a StoreBarrierHint that tells OSR exit to barrier that
+            // value or (2) automatically barriering any DFG-live Node on OSR exit. Either way, it
+            // would be weird because it would create a new root for OSR availability analysis. I
+            // don't have evidence that it would be worth it.
+            if (doesGC(m_graph, node) || mayExit(m_graph, node) != DoesNotExit) {
+                futureGC = true;
+                continue;
+            }
+            
+            if (node->isStoreBarrier() && futureGC) {
+                m_barrierPoints[nodeIndex] = true;
+                futureGC = false;
+            }
+        }
+        
+        // Now we run forward and collect the barriers. When we hit a barrier point, insert all of
+        // them with a fence.
+        for (unsigned nodeIndex = 0; nodeIndex < block->size(); ++nodeIndex) {
+            Node* node = block->at(nodeIndex);
+            if (!node->isStoreBarrier())
+                continue;
+            
+            DFG_ASSERT(m_graph, node, !node->origin.wasHoisted);
+            DFG_ASSERT(m_graph, node, node->child1().useKind() == KnownCellUse);
+            
+            NodeOrigin origin = node->origin;
+            m_neededBarriers.append(ChildAndOrigin(node->child1().node(), origin.semantic));
+            node->remove();
+            
+            if (!m_barrierPoints[nodeIndex])
+                continue;
+            
+            std::sort(
+                m_neededBarriers.begin(), m_neededBarriers.end(),
+                [&] (const ChildAndOrigin& a, const ChildAndOrigin& b) -> bool {
+                    return a.child < b.child;
+                });
+            auto end = std::unique(
+                m_neededBarriers.begin(), m_neededBarriers.end(),
+                [&] (const ChildAndOrigin& a, const ChildAndOrigin& b) -> bool {
+                    return a.child == b.child;
+                });
+            for (auto iter = m_neededBarriers.begin(); iter != end; ++iter) {
+                Node* child = iter->child;
+                CodeOrigin semanticOrigin = iter->semanticOrigin;
+                
+                NodeType type;
+                if (Options::useConcurrentBarriers() && iter == m_neededBarriers.begin())
+                    type = FencedStoreBarrier;
+                else
+                    type = StoreBarrier;
+                
+                m_insertionSet.insertNode(
+                    nodeIndex, SpecNone, type, origin.withSemantic(semanticOrigin),
+                    Edge(child, KnownCellUse));
+            }
+            m_neededBarriers.resize(0);
+        }
+        
+        m_insertionSet.execute(block);
+    }
+    
+    InsertionSet m_insertionSet;
+    FastBitVector m_barrierPoints;
+    Vector<ChildAndOrigin> m_neededBarriers;
+};
+
+} // anonymous namespace
+
+bool performStoreBarrierClustering(Graph& graph)
+{
+    return runPhase<StoreBarrierClusteringPhase>(graph);
+}
+
+} } // namespace JSC::DFG
+
+#endif // ENABLE(DFG_JIT)
+
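Concretely, the two passes in doBlock() behave as follows on a small block (the node numbers and opcodes are illustrative only, and we assume PutByOffset neither GCs nor exits):

    0: PutByOffset(@o, @o, @x)
    1: FencedStoreBarrier(@o)
    2: PutByOffset(@p, @p, @y)
    3: FencedStoreBarrier(@p)
    4: Call()

The backwards pass visits @4 first and sets futureGC = true (Call does GC), marks @3 as a barrier point and clears futureGC, and leaves @1 unmarked because futureGC is still false there: nothing between @1 and the chosen point at @3 can GC. The forward pass then removes @1 and @3, collects their children, and re-inserts the deduplicated cluster at @3's index: a FencedStoreBarrier for one of the two objects (when concurrent barriers are enabled) followed by a plain StoreBarrier for the other, both sitting just after the last store they cover.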
diff --git a/Source/JavaScriptCore/dfg/DFGStoreBarrierClusteringPhase.h b/Source/JavaScriptCore/dfg/DFGStoreBarrierClusteringPhase.h
new file mode 100644 (file)
index 0000000..c1111b1
--- /dev/null
@@ -0,0 +1,91 @@
+/*
+ * Copyright (C) 2016 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. AND ITS CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR ITS CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#pragma once
+
+#if ENABLE(DFG_JIT)
+
+namespace JSC { namespace DFG {
+
+class Graph;
+
+// Picks up groups of barriers that could be executed in any order with respect to each other and
+// places them at the earliest point in the program where the cluster would be correct. This phase
+// makes only the first barrier of the cluster a FencedStoreBarrier while the rest are normal
+// StoreBarriers. This phase also removes redundant barriers - for example, the cluster may end up
+// with two or more barriers on the same object, in which case it is totally safe for us to drop
+// one of them. The reason why this is sound hinges on the "earliest point where the cluster would
+// be correct" property. For example, take this input:
+//
+//     a: Call()
+//     b: PutByOffset(@o, @o, @x)
+//     c: FencedStoreBarrier(@o)
+//     d: PutByOffset(@o, @o, @y)
+//     e: FencedStoreBarrier(@o)
+//     f: PutByOffset(@p, @p, @z)
+//     g: FencedStoreBarrier(@p)
+//     h: GetByOffset(@q)
+//     i: Call()
+//
+// The cluster of barriers is @c, @e, and @g. All of the barriers are between two doesGC effects:
+// the calls at @a and @i. Because there are no doesGC effects between @a and @i and there is no
+// possible control flow entry into this sequence between @a and @i, we could just execute all
+// of the barriers just before @i in any order. The earliest point where the cluster would be
+// correct is just after @f, since that's the last operation that needs a barrier. We use the
+// earliest to reduce register pressure. When the barriers are clustered just after @f, we get:
+//
+//     a: Call()
+//     b: PutByOffset(@o, @o, @x)
+//     d: PutByOffset(@o, @o, @y)
+//     f: PutByOffset(@p, @p, @z)
+//     c: FencedStoreBarrier(@o)
+//     e: FencedStoreBarrier(@o)
+//     g: FencedStoreBarrier(@p)
+//     h: GetByOffset(@q)
+//     i: Call()
+//
+// This phase does more. It takes advantage of the clustering to remove fences and remove redundant
+// barriers. So this phase will output this:
+//
+//     a: Call()
+//     b: PutByOffset(@o, @o, @x)
+//     d: PutByOffset(@o, @o, @y)
+//     f: PutByOffset(@p, @p, @z)
+//     c: FencedStoreBarrier(@o)
+//     g: StoreBarrier(@p)
+//     h: GetByOffset(@q)
+//     i: Call()
+//
+// This optimization improves both overall throughput and the throughput while the concurrent GC is
+// running. In the former, we are simplifying instruction selection for all but the first barrier. In
+// the latter, we are reducing the cost of all but the first barrier. The first barrier will always
+// take the slow path when there is concurrent GC activity, since the slow path contains the fence. But
+// all of the other barriers will only take the slow path if they really need to remember the object.
+bool performStoreBarrierClustering(Graph&);
+
+} } // namespace JSC::DFG
+
+#endif // ENABLE(DFG_JIT)
+
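In terms of the checks the backends emit (see the emitStoreBarrier changes further down), the difference between the two barrier kinds is roughly the following. This is a C++-level sketch rather than actual generated code; o and p are arbitrary cells and rememberSlowPath() is a stand-in name for the real slow path:

    // FencedStoreBarrier: the threshold is loaded from the Heap, so it can be flipped to the
    // tautological value that sends every cell to the slow path; the slow path then fences and
    // rechecks the cell state before remembering the object.
    if (isWithinThreshold(o->cellState(), heap.barrierThreshold()))
        rememberSlowPath(o);

    // StoreBarrier: compares against the compile-time blackThreshold and never fences.
    if (isWithinThreshold(p->cellState(), blackThreshold))
        rememberSlowPath(p);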
index a847e37..f407b7f 100644 (file)
@@ -321,7 +321,7 @@ private:
             case AllocatePropertyStorage:
             case ReallocatePropertyStorage:
                 // These allocate but then run their own barrier.
-                insertBarrier(m_nodeIndex + 1, Edge(m_node->child1().node(), KnownCellUse));
+                insertBarrier(m_nodeIndex + 1, m_node->child1());
                 m_node->setEpoch(Epoch());
                 break;
                 
@@ -430,13 +430,14 @@ private:
         
         if (verbose)
             dataLog("            Inserting barrier.\n");
-        insertBarrier(m_nodeIndex, base);
+        insertBarrier(m_nodeIndex + 1, base);
     }
 
-    void insertBarrier(unsigned nodeIndex, Edge base, bool exitOK = true)
+    void insertBarrier(unsigned nodeIndex, Edge base)
     {
-        // Inserting a barrier means that the object may become marked by the GC, which will make
-        // the object black.
+        // This is just our way of saying that barriers are not redundant with each other according
+        // to forward analysis: if we proved one time that a barrier was necessary then it'll for
+        // sure be necessary next time.
         base->setEpoch(Epoch());
 
         // If we're in global mode, we should only insert the barriers once we have converged.
@@ -446,17 +447,23 @@ private:
         // FIXME: We could support StoreBarrier(UntypedUse:). That would be sort of cool.
         // But right now we don't need it.
 
-        // If the original edge was unchecked, we should also not have a check. We may be in a context
-        // where checks are not allowed. If we ever did have to insert a barrier at an ExitInvalid
-        // context and that barrier needed a check, then we could make that work by hoisting the check.
-        // That doesn't happen right now.
-        if (base.useKind() != KnownCellUse) {
-            DFG_ASSERT(m_graph, m_node, m_node->origin.exitOK);
-            base.setUseKind(CellUse);
-        }
+        DFG_ASSERT(m_graph, m_node, isCell(base.useKind()));
+        
+        // Barriers are always inserted after the node that they service. Therefore, we always know
+        // that the thing is a cell now.
+        base.setUseKind(KnownCellUse);
+        
+        NodeOrigin origin = m_node->origin;
+        if (clobbersExitState(m_graph, m_node))
+            origin = origin.withInvalidExit();
+        
+        NodeType type;
+        if (Options::useConcurrentBarriers())
+            type = FencedStoreBarrier;
+        else
+            type = StoreBarrier;
         
-        m_insertionSet.insertNode(
-            nodeIndex, SpecNone, StoreBarrier, m_node->origin.takeValidExit(exitOK), base);
+        m_insertionSet.insertNode(nodeIndex, SpecNone, type, origin, base);
     }
     
     bool reallyInsertBarriers()
index 0caa386..730352c 100644 (file)
@@ -32,14 +32,13 @@ namespace JSC { namespace DFG {
 class Graph;
 
 // Inserts store barriers in a block-local manner without consulting the abstract interpreter.
-// Uses a simple epoch-based analysis to avoid inserting redundant barriers. This phase requires
-// that we are not in SSA.
+// Uses a simple epoch-based analysis to avoid inserting barriers on newly allocated objects. This
+// phase requires that we are not in SSA.
 bool performFastStoreBarrierInsertion(Graph&);
 
 // Inserts store barriers using a global analysis and consults the abstract interpreter. Uses a
-// simple epoch-based analysis to avoid inserting redundant barriers, but only propagates "same
-// epoch as current" property from one block to the next. This phase requires SSA. This phase
-// also requires having valid AI and liveness.
+// simple epoch-based analysis to avoid inserting barriers on newly allocated objects. This phase
+// requires SSA. This phase also requires having valid AI and liveness.
 bool performGlobalStoreBarrierInsertion(Graph&);
 
 } } // namespace JSC::DFG
index 838e7f9..ae883a6 100644 (file)
@@ -195,12 +195,12 @@ public:
     
     const AbstractHeap& atAnyAddress() const { return m_indexedHeap.atAnyIndex(); }
     
-    const AbstractHeap& at(void* address)
+    const AbstractHeap& at(const void* address)
     {
         return m_indexedHeap.at(bitwise_cast<ptrdiff_t>(address));
     }
     
-    const AbstractHeap& operator[](void* address) { return at(address); }
+    const AbstractHeap& operator[](const void* address) { return at(address); }
 
     void dump(PrintStream&) const;
 
index d13d8a0..fb5258f 100644 (file)
@@ -29,6 +29,7 @@
 #if ENABLE(FTL_JIT)
 
 #include "B3CCallValue.h"
+#include "B3FenceValue.h"
 #include "B3MemoryValue.h"
 #include "B3PatchpointValue.h"
 #include "B3ValueInlines.h"
@@ -116,6 +117,16 @@ void AbstractHeapRepository::decoratePatchpointWrite(const AbstractHeap* heap, V
     m_heapForPatchpointWrite.append(HeapForValue(heap, value));
 }
 
+void AbstractHeapRepository::decorateFenceRead(const AbstractHeap* heap, Value* value)
+{
+    m_heapForFenceRead.append(HeapForValue(heap, value));
+}
+
+void AbstractHeapRepository::decorateFenceWrite(const AbstractHeap* heap, Value* value)
+{
+    m_heapForFenceWrite.append(HeapForValue(heap, value));
+}
+
 void AbstractHeapRepository::computeRangesAndDecorateInstructions()
 {
     root.compute();
@@ -132,9 +143,13 @@ void AbstractHeapRepository::computeRangesAndDecorateInstructions()
     for (HeapForValue entry : m_heapForCCallWrite)
         entry.value->as<CCallValue>()->effects.writes = entry.heap->range();
     for (HeapForValue entry : m_heapForPatchpointRead)
-        entry.value->as<CCallValue>()->effects.reads = entry.heap->range();
+        entry.value->as<PatchpointValue>()->effects.reads = entry.heap->range();
     for (HeapForValue entry : m_heapForPatchpointWrite)
-        entry.value->as<CCallValue>()->effects.writes = entry.heap->range();
+        entry.value->as<PatchpointValue>()->effects.writes = entry.heap->range();
+    for (HeapForValue entry : m_heapForFenceRead)
+        entry.value->as<FenceValue>()->read = entry.heap->range();
+    for (HeapForValue entry : m_heapForFenceWrite)
+        entry.value->as<FenceValue>()->write = entry.heap->range();
 }
 
 } } // namespace JSC::FTL
index cfaffd0..719f52c 100644 (file)
@@ -215,6 +215,8 @@ public:
     void decorateCCallWrite(const AbstractHeap*, B3::Value*);
     void decoratePatchpointRead(const AbstractHeap*, B3::Value*);
     void decoratePatchpointWrite(const AbstractHeap*, B3::Value*);
+    void decorateFenceRead(const AbstractHeap*, B3::Value*);
+    void decorateFenceWrite(const AbstractHeap*, B3::Value*);
 
     void computeRangesAndDecorateInstructions();
 
@@ -240,6 +242,8 @@ private:
     Vector<HeapForValue> m_heapForCCallWrite;
     Vector<HeapForValue> m_heapForPatchpointRead;
     Vector<HeapForValue> m_heapForPatchpointWrite;
+    Vector<HeapForValue> m_heapForFenceRead;
+    Vector<HeapForValue> m_heapForFenceWrite;
 };
 
 } } // namespace JSC::FTL
index cad45fd..b6eb2e5 100644 (file)
@@ -138,6 +138,7 @@ inline CapabilityLevel canCompile(Node* node)
     case GetTypedArrayByteOffset:
     case NotifyWrite:
     case StoreBarrier:
+    case FencedStoreBarrier:
     case Call:
     case TailCall:
     case TailCallInlinedCaller:
index 9f885b8..063cc59 100644 (file)
@@ -31,6 +31,7 @@
 #include "AirGenerationContext.h"
 #include "AllowMacroScratchRegisterUsage.h"
 #include "B3CheckValue.h"
+#include "B3FenceValue.h"
 #include "B3PatchpointValue.h"
 #include "B3SlotBaseValue.h"
 #include "B3StackmapGenerationParams.h"
@@ -945,6 +946,7 @@ private:
             compileCountExecution();
             break;
         case StoreBarrier:
+        case FencedStoreBarrier:
             compileStoreBarrier();
             break;
         case HasIndexedProperty:
@@ -6997,9 +6999,9 @@ private:
     
     void compileStoreBarrier()
     {
-        emitStoreBarrier(lowCell(m_node->child1()));
+        emitStoreBarrier(lowCell(m_node->child1()), m_node->op() == FencedStoreBarrier);
     }
-
+    
     void compileHasIndexedProperty()
     {
         switch (m_node->arrayMode().type()) {
@@ -8128,8 +8130,6 @@ private:
                 previousStructure, nextStructure);
         }
         
-        emitStoreBarrier(object);
-        
         return result;
     }
 
@@ -10055,13 +10055,13 @@ private:
     // run after the function that created them returns. Hence, you should not use by-reference
     // capture (i.e. [&]) in any of these lambdas.
     template<typename Functor, typename... ArgumentTypes>
-    LValue lazySlowPath(const Functor& functor, ArgumentTypes... arguments)
+    PatchpointValue* lazySlowPath(const Functor& functor, ArgumentTypes... arguments)
     {
         return lazySlowPath(functor, Vector<LValue>{ arguments... });
     }
 
     template<typename Functor>
-    LValue lazySlowPath(const Functor& functor, const Vector<LValue>& userArguments)
+    PatchpointValue* lazySlowPath(const Functor& functor, const Vector<LValue>& userArguments)
     {
         CodeOrigin origin = m_node->origin.semantic;
         
@@ -11431,26 +11431,39 @@ private:
         return m_out.load8ZeroExt32(base, m_heaps.JSCell_cellState);
     }
 
-    void emitStoreBarrier(LValue base)
+    void emitStoreBarrier(LValue base, bool isFenced)
     {
         LBasicBlock slowPath = m_out.newBlock();
         LBasicBlock continuation = m_out.newBlock();
 
+        LValue threshold;
+        if (isFenced)
+            threshold = m_out.load32(m_out.absolute(vm().heap.addressOfBarrierThreshold()));
+        else
+            threshold = m_out.constInt32(blackThreshold);
+        
         m_out.branch(
-            m_out.above(loadCellState(base), m_out.constInt32(blackThreshold)),
+            m_out.above(loadCellState(base), threshold),
             usually(continuation), rarely(slowPath));
 
         LBasicBlock lastNext = m_out.appendTo(slowPath, continuation);
-
+        
         // We emit the store barrier slow path lazily. In a lot of cases, this will never fire. And
         // when it does fire, it makes sense for us to generate this code using our JIT rather than
         // wasting B3's time optimizing it.
-        lazySlowPath(
+        PatchpointValue* patchpoint = lazySlowPath(
             [=] (const Vector<Location>& locations) -> RefPtr<LazySlowPath::Generator> {
                 GPRReg baseGPR = locations[1].directGPR();
 
                 return LazySlowPath::createGenerator(
                     [=] (CCallHelpers& jit, LazySlowPath::GenerationParams& params) {
+                        if (isFenced) {
+                            CCallHelpers::Jump noFence = jit.jumpIfBarrierStoreLoadFenceNotNeeded();
+                            jit.memoryFence();
+                            params.doneJumps.append(jit.barrierBranchWithoutFence(baseGPR));
+                            noFence.link(&jit);
+                        }
+                        
                         RegisterSet usedRegisters = params.lazySlowPath->usedRegisters();
                         ScratchRegisterAllocator scratchRegisterAllocator(usedRegisters);
                         scratchRegisterAllocator.lock(baseGPR);
@@ -11495,6 +11508,13 @@ private:
                     });
             },
             base);
+        
+        if (isFenced)
+            m_heaps.decoratePatchpointRead(&m_heaps.root, patchpoint);
+        else
+            m_heaps.decoratePatchpointRead(&m_heaps.JSCell_cellState, patchpoint);
+        m_heaps.decoratePatchpointWrite(&m_heaps.JSCell_cellState, patchpoint);
+        
         m_out.jump(continuation);
 
         m_out.appendTo(continuation, lastNext);
index c561380..db5858e 100644 (file)
@@ -33,6 +33,7 @@
 #include "B3CCallValue.h"
 #include "B3Const32Value.h"
 #include "B3ConstPtrValue.h"
+#include "B3FenceValue.h"
 #include "B3MathExtras.h"
 #include "B3MemoryValue.h"
 #include "B3SlotBaseValue.h"
@@ -445,6 +446,11 @@ void Output::store(LValue value, TypedPointer pointer)
     m_heaps->decorateMemory(pointer.heap(), store);
 }
 
+FenceValue* Output::fence()
+{
+    return m_block->appendNew<FenceValue>(m_proc, origin());
+}
+
 void Output::store32As8(LValue value, TypedPointer pointer)
 {
     LValue store = m_block->appendNew<MemoryValue>(m_proc, Store8, origin(), value, pointer.value());
@@ -795,7 +801,7 @@ void Output::store(LValue value, TypedPointer pointer, StoreType type)
     RELEASE_ASSERT_NOT_REACHED();
 }
 
-TypedPointer Output::absolute(void* address)
+TypedPointer Output::absolute(const void* address)
 {
     return TypedPointer(m_heaps->absolute[address], constIntPtr(address));
 }
index 1924097..27153c9 100644 (file)
@@ -61,6 +61,7 @@ struct Node;
 } // namespace DFG
 
 namespace B3 {
+class FenceValue;
 class SlotBaseValue;
 } // namespace B3
 
@@ -188,6 +189,7 @@ public:
 
     LValue load(TypedPointer, LType);
     void store(LValue, TypedPointer);
+    B3::FenceValue* fence();
 
     LValue load8SignExt32(TypedPointer);
     LValue load8ZeroExt32(TypedPointer);
@@ -284,7 +286,7 @@ public:
         return heap.baseIndex(*this, base, index, indexAsConstant, offset);
     }
 
-    TypedPointer absolute(void* address);
+    TypedPointer absolute(const void* address);
 
     LValue load8SignExt32(LValue base, const AbstractHeap& field) { return load8SignExt32(address(base, field)); }
     LValue load8ZeroExt32(LValue base, const AbstractHeap& field) { return load8ZeroExt32(address(base, field)); }
index 85cc481..9b15de5 100644 (file)
@@ -52,10 +52,16 @@ enum class CellState : uint8_t {
 };
 
 static const unsigned blackThreshold = 1; // x <= blackThreshold means x is black.
+static const unsigned tautologicalThreshold = 100; // x <= tautologicalThreshold is always true.
+
+inline bool isWithinThreshold(CellState cellState, unsigned threshold)
+{
+    return static_cast<unsigned>(cellState) <= threshold;
+}
 
 inline bool isBlack(CellState cellState)
 {
-    return static_cast<unsigned>(cellState) <= blackThreshold;
+    return isWithinThreshold(cellState, blackThreshold);
 }
 
 inline CellState blacken(CellState cellState)
diff --git a/Source/JavaScriptCore/heap/GCDeferralContext.h b/Source/JavaScriptCore/heap/GCDeferralContext.h
new file mode 100644 (file)
index 0000000..b4d151e
--- /dev/null
@@ -0,0 +1,46 @@
+/*
+ * Copyright (C) 2016 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+#pragma once
+
+namespace JSC {
+
+class Heap;
+class MarkedAllocator;
+
+class GCDeferralContext {
+    friend class Heap;
+    friend class MarkedAllocator;
+public:
+    inline GCDeferralContext(Heap&);
+    inline ~GCDeferralContext();
+
+private:
+    Heap& m_heap;
+    bool m_shouldGC { false };
+};
+
+} // namespace JSC
+
diff --git a/Source/JavaScriptCore/heap/GCDeferralContextInlines.h b/Source/JavaScriptCore/heap/GCDeferralContextInlines.h
new file mode 100644 (file)
index 0000000..cd8c1e7
--- /dev/null
@@ -0,0 +1,49 @@
+/*
+ * Copyright (C) 2016 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+#pragma once
+
+#include "GCDeferralContext.h"
+#include "Heap.h"
+
+namespace JSC {
+
+ALWAYS_INLINE GCDeferralContext::GCDeferralContext(Heap& heap)
+    : m_heap(heap)
+{
+}
+
+ALWAYS_INLINE GCDeferralContext::~GCDeferralContext()
+{
+    ASSERT(!DisallowGC::isGCDisallowedOnCurrentThread());
+#if ENABLE(GC_VALIDATION)
+    ASSERT(!m_heap.vm()->isInitializingObject());
+#endif
+    if (UNLIKELY(m_shouldGC))
+        m_heap.collectIfNecessaryOrDefer();
+}
+
+} // namespace JSC
+
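The intended usage pattern is roughly the following; this is a minimal sketch, and the helper name and byte sizes are illustrative rather than part of this patch. While the GCDeferralContext is live, an allocation that would have collected just sets m_shouldGC, and the deferred collection runs when the context is destroyed:

    void allocateTwoAuxiliaryBuffers(VM& vm, JSCell* owner)
    {
        GCDeferralContext deferralContext(vm.heap);
        // Neither call can trigger a collection, so no GC can observe the intermediate state
        // between the two allocations.
        void* first = vm.heap.tryAllocateAuxiliary(&deferralContext, owner, 64);
        void* second = vm.heap.tryAllocateAuxiliary(&deferralContext, owner, 128);
        if (!first || !second)
            return;
        // ... initialize and publish both buffers ...
    } // ~GCDeferralContext() runs collectIfNecessaryOrDefer() here if a collection was requested.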
index 81da6bd..128facb 100644 (file)
@@ -1549,4 +1549,17 @@ void Heap::forEachCodeBlockImpl(const ScopedLambda<bool(CodeBlock*)>& func)
     return m_codeBlocks->iterate(func);
 }
 
+void Heap::writeBarrierSlowPath(const JSCell* from)
+{
+    if (UNLIKELY(barrierShouldBeFenced())) {
+        // In this case, the barrierThreshold is the tautological threshold, so from might not
+        // actually be black. But we can't know for sure until we fire off a fence.
+        WTF::storeLoadFence();
+        if (!isBlack(from->cellState()))
+            return;
+    }
+    
+    addToRememberedSet(from);
+}
+
 } // namespace JSC
index 92c10c6..adfc1f3 100644 (file)
@@ -51,6 +51,7 @@ namespace JSC {
 class AllocationScope;
 class CodeBlock;
 class CodeBlockSet;
+class GCDeferralContext;
 class EdenGCActivityCallback;
 class ExecutableBase;
 class FullGCActivityCallback;
@@ -103,9 +104,14 @@ public:
     
     static size_t cellSize(const void*);
 
-    void writeBarrier(const JSCell*);
-    void writeBarrier(const JSCell*, JSValue);
-    void writeBarrier(const JSCell*, JSCell*);
+    void writeBarrier(const JSCell* from);
+    void writeBarrier(const JSCell* from, JSValue to);
+    void writeBarrier(const JSCell* from, JSCell* to);
+    
+    void writeBarrierWithoutFence(const JSCell* from);
+    
+    // Call this if you know that from->cellState() <= barrierThreshold.
+    JS_EXPORT_PRIVATE void writeBarrierSlowPath(const JSCell* from);
 
     WriteBarrierBuffer& writeBarrierBuffer() { return m_writeBarrierBuffer; }
     void flushWriteBarrierBuffer(JSCell*);
@@ -148,6 +154,7 @@ public:
     MarkedAllocator* allocatorForAuxiliaryData(size_t bytes) { return m_objectSpace.auxiliaryAllocatorFor(bytes); }
     void* allocateAuxiliary(JSCell* intendedOwner, size_t);
     void* tryAllocateAuxiliary(JSCell* intendedOwner, size_t);
+    void* tryAllocateAuxiliary(GCDeferralContext*, JSCell* intendedOwner, size_t);
     void* tryReallocateAuxiliary(JSCell* intendedOwner, void* oldBase, size_t oldSize, size_t newSize);
     void ascribeOwner(JSCell* intendedOwner, void*);
 
@@ -165,7 +172,7 @@ public:
 
     bool shouldCollect();
     JS_EXPORT_PRIVATE void collect(HeapOperation collectionType = AnyCollection);
-    bool collectIfNecessaryOrDefer(); // Returns true if it did collect.
+    bool collectIfNecessaryOrDefer(GCDeferralContext* = nullptr); // Returns true if it did collect.
     void collectAccordingToDeferGCProbability();
 
     void completeAllJITPlans();
@@ -253,6 +260,12 @@ public:
 
     void didAllocateBlock(size_t capacity);
     void didFreeBlock(size_t capacity);
+    
+    bool barrierShouldBeFenced() const { return m_barrierShouldBeFenced; }
+    const bool* addressOfBarrierShouldBeFenced() const { return &m_barrierShouldBeFenced; }
+    
+    unsigned barrierThreshold() const { return m_barrierThreshold; }
+    const unsigned* addressOfBarrierThreshold() const { return &m_barrierThreshold; }
 
 private:
     friend class AllocationScope;
@@ -277,12 +290,17 @@ private:
     friend class WeakSet;
     template<typename T> friend void* allocateCell(Heap&);
     template<typename T> friend void* allocateCell(Heap&, size_t);
+    template<typename T> friend void* allocateCell(Heap&, GCDeferralContext*);
+    template<typename T> friend void* allocateCell(Heap&, GCDeferralContext*, size_t);
 
     void collectWithoutAnySweep(HeapOperation collectionType = AnyCollection);
 
     void* allocateWithDestructor(size_t); // For use with objects with destructors.
     void* allocateWithoutDestructor(size_t); // For use with objects without destructors.
+    void* allocateWithDestructor(GCDeferralContext*, size_t);
+    void* allocateWithoutDestructor(GCDeferralContext*, size_t);
     template<typename ClassType> void* allocateObjectOfType(size_t); // Chooses one of the methods above based on type.
+    template<typename ClassType> void* allocateObjectOfType(GCDeferralContext*, size_t);
 
     static const size_t minExtraMemory = 256;
     
@@ -410,6 +428,8 @@ private:
     bool m_isSafeToCollect;
 
     WriteBarrierBuffer m_writeBarrierBuffer;
+    bool m_barrierShouldBeFenced { Options::forceFencedBarrier() };
+    unsigned m_barrierThreshold { Options::forceFencedBarrier() ? tautologicalThreshold : blackThreshold };
 
     VM* m_vm;
     double m_lastFullGCLength;
index 7acee5a..fdd2150 100644 (file)
@@ -25,6 +25,7 @@
 
 #pragma once
 
+#include "GCDeferralContext.h"
 #include "Heap.h"
 #include "HeapCellInlines.h"
 #include "IndexingHeader.h"
@@ -124,19 +125,31 @@ inline void Heap::writeBarrier(const JSCell* from, JSCell* to)
 #if ENABLE(WRITE_BARRIER_PROFILING)
     WriteBarrierCounters::countWriteBarrier();
 #endif
-    if (!from || !isBlack(from->cellState()))
+    if (!from)
         return;
-    if (!to || to->cellState() != CellState::NewWhite)
+    if (!isWithinThreshold(from->cellState(), barrierThreshold()))
         return;
-    addToRememberedSet(from);
+    if (LIKELY(!to || to->cellState() != CellState::NewWhite))
+        return;
+    writeBarrierSlowPath(from);
 }
 
 inline void Heap::writeBarrier(const JSCell* from)
 {
     ASSERT_GC_OBJECT_LOOKS_VALID(const_cast<JSCell*>(from));
-    if (!from || !isBlack(from->cellState()))
+    if (!from)
+        return;
+    if (UNLIKELY(isWithinThreshold(from->cellState(), barrierThreshold())))
+        writeBarrierSlowPath(from);
+}
+
+inline void Heap::writeBarrierWithoutFence(const JSCell* from)
+{
+    ASSERT_GC_OBJECT_LOOKS_VALID(const_cast<JSCell*>(from));
+    if (!from)
         return;
-    addToRememberedSet(from);
+    if (UNLIKELY(isWithinThreshold(from->cellState(), blackThreshold)))
+        addToRememberedSet(from);
 }
 
 inline void Heap::reportExtraMemoryAllocated(size_t size)
@@ -213,6 +226,18 @@ inline void* Heap::allocateWithoutDestructor(size_t bytes)
     return m_objectSpace.allocateWithoutDestructor(bytes);
 }
 
+inline void* Heap::allocateWithDestructor(GCDeferralContext* deferralContext, size_t bytes)
+{
+    ASSERT(isValidAllocation(bytes));
+    return m_objectSpace.allocateWithDestructor(deferralContext, bytes);
+}
+
+inline void* Heap::allocateWithoutDestructor(GCDeferralContext* deferralContext, size_t bytes)
+{
+    ASSERT(isValidAllocation(bytes));
+    return m_objectSpace.allocateWithoutDestructor(deferralContext, bytes);
+}
+
 template<typename ClassType>
 inline void* Heap::allocateObjectOfType(size_t bytes)
 {
@@ -225,6 +250,16 @@ inline void* Heap::allocateObjectOfType(size_t bytes)
 }
 
 template<typename ClassType>
+inline void* Heap::allocateObjectOfType(GCDeferralContext* deferralContext, size_t bytes)
+{
+    ASSERT((!ClassType::needsDestruction || (ClassType::StructureFlags & StructureIsImmortal) || std::is_convertible<ClassType, JSDestructibleObject>::value));
+
+    if (ClassType::needsDestruction)
+        return allocateWithDestructor(deferralContext, bytes);
+    return allocateWithoutDestructor(deferralContext, bytes);
+}
+
+template<typename ClassType>
 inline MarkedSpace::Subspace& Heap::subspaceForObjectOfType()
 {
     // JSCell::classInfo() expects objects allocated with normal destructor to derive from JSDestructibleObject.
@@ -273,6 +308,17 @@ inline void* Heap::tryAllocateAuxiliary(JSCell* intendedOwner, size_t bytes)
     return result;
 }
 
+inline void* Heap::tryAllocateAuxiliary(GCDeferralContext* deferralContext, JSCell* intendedOwner, size_t bytes)
+{
+    void* result = m_objectSpace.tryAllocateAuxiliary(deferralContext, bytes);
+#if ENABLE(ALLOCATION_LOGGING)
+    dataLogF("JSC GC allocating %lu bytes of auxiliary for %p: %p.\n", bytes, intendedOwner, result);
+#else
+    UNUSED_PARAM(intendedOwner);
+#endif
+    return result;
+}
+
 inline void* Heap::tryReallocateAuxiliary(JSCell* intendedOwner, void* oldBase, size_t oldSize, size_t newSize)
 {
     void* newBase = tryAllocateAuxiliary(intendedOwner, newSize);
@@ -312,12 +358,15 @@ inline void Heap::decrementDeferralDepth()
     m_deferralDepth--;
 }
 
-inline bool Heap::collectIfNecessaryOrDefer()
+inline bool Heap::collectIfNecessaryOrDefer(GCDeferralContext* deferralContext)
 {
     if (!shouldCollect())
         return false;
 
-    collect();
+    if (deferralContext)
+        deferralContext->m_shouldGC = true;
+    else
+        collect();
     return true;
 }
 
index ae634e5..e011eff 100644 (file)
@@ -175,45 +175,49 @@ void* MarkedAllocator::tryAllocateIn(MarkedBlock::Handle* block)
     return result;
 }
 
-ALWAYS_INLINE void MarkedAllocator::doTestCollectionsIfNeeded()
+ALWAYS_INLINE void MarkedAllocator::doTestCollectionsIfNeeded(GCDeferralContext* deferralContext)
 {
     if (!Options::slowPathAllocsBetweenGCs())
         return;
 
     static unsigned allocationCount = 0;
     if (!allocationCount) {
-        if (!m_heap->isDeferred())
-            m_heap->collectAllGarbage();
+        if (!m_heap->isDeferred()) {
+            if (deferralContext)
+                deferralContext->m_shouldGC = true;
+            else
+                m_heap->collectAllGarbage();
+        }
         ASSERT(m_heap->m_operationInProgress == NoOperation);
     }
     if (++allocationCount >= Options::slowPathAllocsBetweenGCs())
         allocationCount = 0;
 }
 
-void* MarkedAllocator::allocateSlowCase()
+void* MarkedAllocator::allocateSlowCase(GCDeferralContext* deferralContext)
 {
     bool crashOnFailure = true;
-    return allocateSlowCaseImpl(crashOnFailure);
+    return allocateSlowCaseImpl(deferralContext, crashOnFailure);
 }
 
-void* MarkedAllocator::tryAllocateSlowCase()
+void* MarkedAllocator::tryAllocateSlowCase(GCDeferralContext* deferralContext)
 {
     bool crashOnFailure = false;
-    return allocateSlowCaseImpl(crashOnFailure);
+    return allocateSlowCaseImpl(deferralContext, crashOnFailure);
 }
 
-void* MarkedAllocator::allocateSlowCaseImpl(bool crashOnFailure)
+void* MarkedAllocator::allocateSlowCaseImpl(GCDeferralContext* deferralContext, bool crashOnFailure)
 {
     SuperSamplerScope superSamplerScope(false);
     ASSERT(m_heap->vm()->currentThreadIsHoldingAPILock());
-    doTestCollectionsIfNeeded();
+    doTestCollectionsIfNeeded(deferralContext);
 
     ASSERT(!m_markedSpace->isIterating());
     m_heap->didAllocate(m_freeList.originalSize);
     
     didConsumeFreeList();
     
-    m_heap->collectIfNecessaryOrDefer();
+    m_heap->collectIfNecessaryOrDefer(deferralContext);
     
     AllocationScope allocationScope(*m_heap);
 
index 969cedd..37e4b20 100644 (file)
@@ -34,6 +34,7 @@
 
 namespace JSC {
 
+class GCDeferralContext;
 class Heap;
 class MarkedSpace;
 class LLIntOffsetsExtractor;
@@ -151,8 +152,8 @@ public:
     bool needsDestruction() const { return m_attributes.destruction == NeedsDestruction; }
     DestructionMode destruction() const { return m_attributes.destruction; }
     HeapCell::Kind cellKind() const { return m_attributes.cellKind; }
-    void* allocate();
-    void* tryAllocate();
+    void* allocate(GCDeferralContext* = nullptr);
+    void* tryAllocate(GCDeferralContext* = nullptr);
     Heap* heap() { return m_heap; }
     MarkedBlock::Handle* takeLastActiveBlock()
     {
@@ -215,15 +216,15 @@ private:
     
     bool shouldStealEmptyBlocksFromOtherAllocators() const;
     
-    JS_EXPORT_PRIVATE void* allocateSlowCase();
-    JS_EXPORT_PRIVATE void* tryAllocateSlowCase();
-    void* allocateSlowCaseImpl(bool crashOnFailure);
+    JS_EXPORT_PRIVATE void* allocateSlowCase(GCDeferralContext*);
+    JS_EXPORT_PRIVATE void* tryAllocateSlowCase(GCDeferralContext*);
+    void* allocateSlowCaseImpl(GCDeferralContext*, bool crashOnFailure);
     void didConsumeFreeList();
     void* tryAllocateWithoutCollecting();
     MarkedBlock::Handle* tryAllocateBlock();
     void* tryAllocateIn(MarkedBlock::Handle*);
     void* allocateIn(MarkedBlock::Handle*);
-    ALWAYS_INLINE void doTestCollectionsIfNeeded();
+    ALWAYS_INLINE void doTestCollectionsIfNeeded(GCDeferralContext*);
     
     void setFreeList(const FreeList&);
     
@@ -264,7 +265,7 @@ inline ptrdiff_t MarkedAllocator::offsetOfCellSize()
     return OBJECT_OFFSETOF(MarkedAllocator, m_cellSize);
 }
 
-ALWAYS_INLINE void* MarkedAllocator::tryAllocate()
+ALWAYS_INLINE void* MarkedAllocator::tryAllocate(GCDeferralContext* deferralContext)
 {
     unsigned remaining = m_freeList.remaining;
     if (remaining) {
@@ -276,13 +277,13 @@ ALWAYS_INLINE void* MarkedAllocator::tryAllocate()
     
     FreeCell* head = m_freeList.head;
     if (UNLIKELY(!head))
-        return tryAllocateSlowCase();
+        return tryAllocateSlowCase(deferralContext);
     
     m_freeList.head = head->next;
     return head;
 }
 
-ALWAYS_INLINE void* MarkedAllocator::allocate()
+ALWAYS_INLINE void* MarkedAllocator::allocate(GCDeferralContext* deferralContext)
 {
     unsigned remaining = m_freeList.remaining;
     if (remaining) {
@@ -294,7 +295,7 @@ ALWAYS_INLINE void* MarkedAllocator::allocate()
     
     FreeCell* head = m_freeList.head;
     if (UNLIKELY(!head))
-        return allocateSlowCase();
+        return allocateSlowCase(deferralContext);
     
     m_freeList.head = head->next;
     return head;
index 3b4e033..377d04e 100644 (file)
@@ -235,32 +235,58 @@ void MarkedSpace::lastChanceToFinalize()
 
 void* MarkedSpace::allocate(Subspace& subspace, size_t bytes)
 {
+    if (false)
+        dataLog("Allocating ", bytes, " bytes in ", subspace.attributes, ".\n");
     if (MarkedAllocator* allocator = allocatorFor(subspace, bytes)) {
         void* result = allocator->allocate();
         return result;
     }
-    return allocateLarge(subspace, bytes);
+    return allocateLarge(subspace, nullptr, bytes);
+}
+
+void* MarkedSpace::allocate(Subspace& subspace, GCDeferralContext* deferralContext, size_t bytes)
+{
+    if (false)
+        dataLog("Allocating ", bytes, " deferred bytes in ", subspace.attributes, ".\n");
+    if (MarkedAllocator* allocator = allocatorFor(subspace, bytes)) {
+        void* result = allocator->allocate(deferralContext);
+        return result;
+    }
+    return allocateLarge(subspace, deferralContext, bytes);
 }
 
 void* MarkedSpace::tryAllocate(Subspace& subspace, size_t bytes)
 {
+    if (false)
+        dataLog("Try-allocating ", bytes, " bytes in ", subspace.attributes, ".\n");
     if (MarkedAllocator* allocator = allocatorFor(subspace, bytes)) {
         void* result = allocator->tryAllocate();
         return result;
     }
-    return tryAllocateLarge(subspace, bytes);
+    return tryAllocateLarge(subspace, nullptr, bytes);
+}
+
+void* MarkedSpace::tryAllocate(Subspace& subspace, GCDeferralContext* deferralContext, size_t bytes)
+{
+    if (false)
+        dataLog("Try-allocating ", bytes, " deferred bytes in ", subspace.attributes, ".\n");
+    if (MarkedAllocator* allocator = allocatorFor(subspace, bytes)) {
+        void* result = allocator->tryAllocate(deferralContext);
+        return result;
+    }
+    return tryAllocateLarge(subspace, deferralContext, bytes);
 }
 
-void* MarkedSpace::allocateLarge(Subspace& subspace, size_t size)
+void* MarkedSpace::allocateLarge(Subspace& subspace, GCDeferralContext* deferralContext, size_t size)
 {
-    void* result = tryAllocateLarge(subspace, size);
+    void* result = tryAllocateLarge(subspace, deferralContext, size);
     RELEASE_ASSERT(result);
     return result;
 }
 
-void* MarkedSpace::tryAllocateLarge(Subspace& subspace, size_t size)
+void* MarkedSpace::tryAllocateLarge(Subspace& subspace, GCDeferralContext* deferralContext, size_t size)
 {
-    m_heap->collectIfNecessaryOrDefer();
+    m_heap->collectIfNecessaryOrDefer(deferralContext);
     
     size = WTF::roundUpToMultipleOf<sizeStep>(size);
     LargeAllocation* allocation = LargeAllocation::tryCreate(*m_heap, size, subspace.attributes);
index 7d58b75..3fd3e50 100644 (file)
@@ -111,12 +111,17 @@ public:
     MarkedAllocator* auxiliaryAllocatorFor(size_t);
 
     JS_EXPORT_PRIVATE void* allocate(Subspace&, size_t);
+    JS_EXPORT_PRIVATE void* allocate(Subspace&, GCDeferralContext*, size_t);
     JS_EXPORT_PRIVATE void* tryAllocate(Subspace&, size_t);
+    JS_EXPORT_PRIVATE void* tryAllocate(Subspace&, GCDeferralContext*, size_t);
     
     void* allocateWithDestructor(size_t);
     void* allocateWithoutDestructor(size_t);
+    void* allocateWithDestructor(GCDeferralContext*, size_t);
+    void* allocateWithoutDestructor(GCDeferralContext*, size_t);
     void* allocateAuxiliary(size_t);
     void* tryAllocateAuxiliary(size_t);
+    void* tryAllocateAuxiliary(GCDeferralContext*, size_t);
     
     Subspace& subspaceForObjectsWithDestructor() { return m_destructorSpace; }
     Subspace& subspaceForObjectsWithoutDestructor() { return m_normalSpace; }
@@ -194,8 +199,8 @@ private:
     
     JS_EXPORT_PRIVATE static std::array<size_t, numSizeClasses> s_sizeClassForSizeStep;
     
-    JS_EXPORT_PRIVATE void* allocateLarge(Subspace&, size_t);
-    JS_EXPORT_PRIVATE void* tryAllocateLarge(Subspace&, size_t);
+    void* allocateLarge(Subspace&, GCDeferralContext*, size_t);
+    void* tryAllocateLarge(Subspace&, GCDeferralContext*, size_t);
 
     static void initializeSizeClassForStepSize();
     
@@ -263,6 +268,16 @@ inline void* MarkedSpace::allocateWithDestructor(size_t bytes)
     return allocate(m_destructorSpace, bytes);
 }
 
+inline void* MarkedSpace::allocateWithoutDestructor(GCDeferralContext* deferralContext, size_t bytes)
+{
+    return allocate(m_normalSpace, deferralContext, bytes);
+}
+
+inline void* MarkedSpace::allocateWithDestructor(GCDeferralContext* deferralContext, size_t bytes)
+{
+    return allocate(m_destructorSpace, deferralContext, bytes);
+}
+
 inline void* MarkedSpace::allocateAuxiliary(size_t bytes)
 {
     return allocate(m_auxiliarySpace, bytes);
@@ -273,6 +288,11 @@ inline void* MarkedSpace::tryAllocateAuxiliary(size_t bytes)
     return tryAllocate(m_auxiliarySpace, bytes);
 }
 
+inline void* MarkedSpace::tryAllocateAuxiliary(GCDeferralContext* deferralContext, size_t bytes)
+{
+    return tryAllocate(m_auxiliarySpace, deferralContext, bytes);
+}
+
 template <typename Functor> inline void MarkedSpace::forEachBlock(const Functor& functor)
 {
     forEachAllocator(
index ff0766c..769491a 100644 (file)
@@ -1305,17 +1305,44 @@ public:
 
     static void emitStoreStructureWithTypeInfo(AssemblyHelpers& jit, TrustedImmPtr structure, RegisterID dest);
 
-    Jump jumpIfIsRememberedOrInEden(GPRReg cell)
+    Jump barrierBranchWithoutFence(GPRReg cell)
     {
         return branch8(Above, Address(cell, JSCell::cellStateOffset()), TrustedImm32(blackThreshold));
     }
 
-    Jump jumpIfIsRememberedOrInEden(JSCell* cell)
+    Jump barrierBranchWithoutFence(JSCell* cell)
     {
         uint8_t* address = reinterpret_cast<uint8_t*>(cell) + JSCell::cellStateOffset();
         return branch8(Above, AbsoluteAddress(address), TrustedImm32(blackThreshold));
     }
     
+    Jump barrierBranch(GPRReg cell, GPRReg scratchGPR)
+    {
+        load8(Address(cell, JSCell::cellStateOffset()), scratchGPR);
+        return branch32(Above, scratchGPR, AbsoluteAddress(vm()->heap.addressOfBarrierThreshold()));
+    }
+
+    Jump barrierBranch(JSCell* cell, GPRReg scratchGPR)
+    {
+        uint8_t* address = reinterpret_cast<uint8_t*>(cell) + JSCell::cellStateOffset();
+        load8(address, scratchGPR);
+        return branch32(Above, scratchGPR, AbsoluteAddress(vm()->heap.addressOfBarrierThreshold()));
+    }
+    
+    void barrierStoreLoadFence()
+    {
+        if (!Options::useConcurrentBarriers())
+            return;
+        Jump ok = jumpIfBarrierStoreLoadFenceNotNeeded();
+        memoryFence();
+        ok.link(this);
+    }
+    
+    Jump jumpIfBarrierStoreLoadFenceNotNeeded()
+    {
+        return branchTest8(Zero, AbsoluteAddress(vm()->heap.addressOfBarrierShouldBeFenced()));
+    }
+    
     // Emits the branch structure for typeof. The code emitted by this doesn't fall through. The
     // functor is called at those points where we have pinpointed a type. One way to use this is to
     // have the functor emit the code to put the type string into an appropriate register and then
index 189730a..e65a76b 100644 (file)
@@ -2165,14 +2165,11 @@ void JIT_OPERATION operationOSRWriteBarrier(ExecState* exec, JSCell* cell)
     vm->heap.writeBarrier(cell);
 }
 
-// NB: We don't include the value as part of the barrier because the write barrier elision
-// phase in the DFG only tracks whether the object being stored to has been barriered. It 
-// would be much more complicated to try to model the value being stored as well.
-void JIT_OPERATION operationUnconditionalWriteBarrier(ExecState* exec, JSCell* cell)
+void JIT_OPERATION operationWriteBarrierSlowPath(ExecState* exec, JSCell* cell)
 {
     VM* vm = &exec->vm();
     NativeCallFrameTracer tracer(vm, exec);
-    vm->heap.writeBarrier(cell);
+    vm->heap.writeBarrierSlowPath(cell);
 }
 
 void JIT_OPERATION lookupExceptionHandler(VM* vm, ExecState* exec)
index cf5ad1a..1d4ad38 100644 (file)
@@ -410,8 +410,7 @@ char* JIT_OPERATION operationReallocateButterflyToHavePropertyStorageWithInitial
 char* JIT_OPERATION operationReallocateButterflyToGrowPropertyStorage(ExecState*, JSObject*, size_t newSize) WTF_INTERNAL;
 
 void JIT_OPERATION operationFlushWriteBarrierBuffer(ExecState*, JSCell*);
-void JIT_OPERATION operationWriteBarrier(ExecState*, JSCell*, JSCell*);
-void JIT_OPERATION operationUnconditionalWriteBarrier(ExecState*, JSCell*);
+void JIT_OPERATION operationWriteBarrierSlowPath(ExecState*, JSCell*);
 void JIT_OPERATION operationOSRWriteBarrier(ExecState*, JSCell*);
 
 void JIT_OPERATION operationExceptionFuzz(ExecState*);
index 560268b..5218fa8 100644 (file)
@@ -666,8 +666,6 @@ void JIT::emit_op_put_by_id(Instruction* currentInstruction)
     int valueVReg = currentInstruction[3].u.operand;
     unsigned direct = currentInstruction[8].u.putByIdFlags & PutByIdIsDirect;
 
-    emitWriteBarrier(baseVReg, valueVReg, ShouldFilterBase);
-
     // In order to be able to patch both the Structure, and the object offset, we store one pointer,
     // to just after the arguments have been loaded into registers 'hotPathBegin', and we generate code
     // such that the Structure & offset are always at the same distance from this.
@@ -684,6 +682,8 @@ void JIT::emit_op_put_by_id(Instruction* currentInstruction)
     gen.generateFastPath(*this);
     addSlowCase(gen.slowPathJump());
     
+    emitWriteBarrier(baseVReg, valueVReg, ShouldFilterBase);
+
     m_putByIds.append(gen);
 }
 
@@ -1180,8 +1180,8 @@ void JIT::emitWriteBarrier(unsigned owner, unsigned value, WriteBarrierMode mode
     if (mode == ShouldFilterBaseAndValue || mode == ShouldFilterBase)
         ownerNotCell = branchTest64(NonZero, regT0, tagMaskRegister);
 
-    Jump ownerIsRememberedOrInEden = jumpIfIsRememberedOrInEden(regT0);
-    callOperation(operationUnconditionalWriteBarrier, regT0);
+    Jump ownerIsRememberedOrInEden = barrierBranch(regT0, regT1);
+    callOperation(operationWriteBarrierSlowPath, regT0);
     ownerIsRememberedOrInEden.link(this);
 
     if (mode == ShouldFilterBaseAndValue || mode == ShouldFilterBase)
@@ -1218,8 +1218,8 @@ void JIT::emitWriteBarrier(unsigned owner, unsigned value, WriteBarrierMode mode
     if (mode == ShouldFilterBase || mode == ShouldFilterBaseAndValue)
         ownerNotCell = branch32(NotEqual, regT0, TrustedImm32(JSValue::CellTag));
 
-    Jump ownerIsRememberedOrInEden = jumpIfIsRememberedOrInEden(regT1);
-    callOperation(operationUnconditionalWriteBarrier, regT1);
+    Jump ownerIsRememberedOrInEden = barrierBranch(regT1, regT2);
+    callOperation(operationWriteBarrierSlowPath, regT1);
     ownerIsRememberedOrInEden.link(this);
 
     if (mode == ShouldFilterBase || mode == ShouldFilterBaseAndValue)
@@ -1246,12 +1246,9 @@ void JIT::emitWriteBarrier(JSCell* owner, unsigned value, WriteBarrierMode mode)
 
 void JIT::emitWriteBarrier(JSCell* owner)
 {
-    if (!owner->cellContainer().isMarked(owner)) {
-        Jump ownerIsRememberedOrInEden = jumpIfIsRememberedOrInEden(owner);
-        callOperation(operationUnconditionalWriteBarrier, owner);
-        ownerIsRememberedOrInEden.link(this);
-    } else
-        callOperation(operationUnconditionalWriteBarrier, owner);
+    Jump ownerIsRememberedOrInEden = barrierBranch(owner, regT0);
+    callOperation(operationWriteBarrierSlowPath, owner);
+    ownerIsRememberedOrInEden.link(this);
 }
 
 void JIT::emitByValIdentifierCheck(ByValInfo* byValInfo, RegisterID cell, RegisterID scratch, const Identifier& propertyName, JumpList& slowCases)
@@ -1390,8 +1387,8 @@ void JIT::privateCompilePutByVal(ByValInfo* byValInfo, ReturnAddressPtr returnAd
     patchBuffer.link(slowCases, CodeLocationLabel(MacroAssemblerCodePtr::createFromExecutableAddress(returnAddress.value())).labelAtOffset(byValInfo->returnAddressToSlowPath));
     patchBuffer.link(done, byValInfo->badTypeJump.labelAtOffset(byValInfo->badTypeJumpToDone));
     if (needsLinkForWriteBarrier) {
-        ASSERT(m_calls.last().to == operationUnconditionalWriteBarrier);
-        patchBuffer.link(m_calls.last().from, operationUnconditionalWriteBarrier);
+        ASSERT(m_calls.last().to == operationWriteBarrierSlowPath);
+        patchBuffer.link(m_calls.last().from, operationWriteBarrierSlowPath);
     }
     
     bool isDirect = m_interpreter->getOpcodeID(currentInstruction->u.opcode) == op_put_by_val_direct;
index 32a8648..5f14e90 100644 (file)
@@ -673,8 +673,6 @@ void JIT::emit_op_put_by_id(Instruction* currentInstruction)
     int value = currentInstruction[3].u.operand;
     int direct = currentInstruction[8].u.putByIdFlags & PutByIdIsDirect;
     
-    emitWriteBarrier(base, value, ShouldFilterBase);
-
     emitLoad2(base, regT1, regT0, value, regT3, regT2);
     
     emitJumpSlowCaseIfNotJSCell(base, regT1);
@@ -687,6 +685,8 @@ void JIT::emit_op_put_by_id(Instruction* currentInstruction)
     gen.generateFastPath(*this);
     addSlowCase(gen.slowPathJump());
     
+    emitWriteBarrier(base, value, ShouldFilterBase);
+
     m_putByIds.append(gen);
 }
 
index 10037af..d0f1ecb 100644 (file)
@@ -891,6 +891,7 @@ macro arrayProfile(cellAndIndexingType, profile, scratch)
 end
 
 macro skipIfIsRememberedOrInEden(cell, slowPath)
+    memfence
     bba JSCell::m_cellState[cell], BlackThreshold, .done
     slowPath()
 .done:
index 607085b..2e5427f 100644 (file)
@@ -1527,7 +1527,12 @@ class Instruction
         when "leap"
             $asm.puts "lea#{x86Suffix(:ptr)} #{orderOperands(operands[0].x86AddressOperand(:ptr), operands[1].x86Operand(:ptr))}"
         when "memfence"
-            $asm.puts "mfence"
+            sp = RegisterID.new(nil, "sp")
+            if isIntelSyntax
+                $asm.puts "mfence"
+            else
+                $asm.puts "lock; orl $0, (#{sp.x86Operand(:ptr)})"
+            end
         else
             lowerDefault
         end
index ff73148..7849072 100644 (file)
@@ -59,7 +59,7 @@ Butterfly* createArrayButterflyInDictionaryIndexingMode(
     return butterfly;
 }
 
-JSArray* JSArray::tryCreateUninitialized(VM& vm, Structure* structure, unsigned initialLength)
+JSArray* JSArray::tryCreateUninitialized(VM& vm, GCDeferralContext* deferralContext, Structure* structure, unsigned initialLength)
 {
     if (initialLength > MAX_STORAGE_VECTOR_LENGTH)
         return 0;
@@ -76,7 +76,7 @@ JSArray* JSArray::tryCreateUninitialized(VM& vm, Structure* structure, unsigned
             || hasContiguous(indexingType));
 
         unsigned vectorLength = Butterfly::optimalContiguousVectorLength(structure, initialLength);
-        void* temp = vm.heap.tryAllocateAuxiliary(nullptr, Butterfly::totalSize(0, outOfLineStorage, true, vectorLength * sizeof(EncodedJSValue)));
+        void* temp = vm.heap.tryAllocateAuxiliary(deferralContext, nullptr, Butterfly::totalSize(0, outOfLineStorage, true, vectorLength * sizeof(EncodedJSValue)));
         if (!temp)
             return nullptr;
         butterfly = Butterfly::fromBase(temp, 0, outOfLineStorage);
@@ -104,7 +104,7 @@ JSArray* JSArray::tryCreateUninitialized(VM& vm, Structure* structure, unsigned
             storage->m_vector[i].clear();
     }
 
-    return createWithButterfly(vm, structure, butterfly);
+    return createWithButterfly(vm, deferralContext, structure, butterfly);
 }
 
 void JSArray::setLengthWritable(ExecState* exec, bool writable)
index 57bfcd4..0a57f4a 100644 (file)
@@ -53,13 +53,17 @@ protected:
 
 public:
     static JSArray* create(VM&, Structure*, unsigned initialLength = 0);
-    static JSArray* createWithButterfly(VM&, Structure*, Butterfly*);
+    static JSArray* createWithButterfly(VM&, GCDeferralContext*, Structure*, Butterfly*);
 
     // tryCreateUninitialized is used for fast construction of arrays whose size and
     // contents are known at time of creation. Clients of this interface must:
     //   - null-check the result (indicating out of memory, or otherwise unable to allocate vector).
     //   - call 'initializeIndex' for all properties in sequence, for 0 <= i < initialLength.
-    JS_EXPORT_PRIVATE static JSArray* tryCreateUninitialized(VM&, Structure*, unsigned initialLength);
+    JS_EXPORT_PRIVATE static JSArray* tryCreateUninitialized(VM&, GCDeferralContext*, Structure*, unsigned initialLength);
+    static JSArray* tryCreateUninitialized(VM& vm, Structure* structure, unsigned initialLength)
+    {
+        return tryCreateUninitialized(vm, nullptr, structure, initialLength);
+    }
 
     JS_EXPORT_PRIVATE static bool defineOwnProperty(JSObject*, ExecState*, PropertyName, const PropertyDescriptor&, bool throwException);
 
@@ -230,12 +234,12 @@ inline JSArray* JSArray::create(VM& vm, Structure* structure, unsigned initialLe
             butterfly->arrayStorage()->m_vector[i].clear();
     }
 
-    return createWithButterfly(vm, structure, butterfly);
+    return createWithButterfly(vm, nullptr, structure, butterfly);
 }
 
-inline JSArray* JSArray::createWithButterfly(VM& vm, Structure* structure, Butterfly* butterfly)
+inline JSArray* JSArray::createWithButterfly(VM& vm, GCDeferralContext* deferralContext, Structure* structure, Butterfly* butterfly)
 {
-    JSArray* array = new (NotNull, allocateCell<JSArray>(vm.heap)) JSArray(vm, structure, butterfly);
+    JSArray* array = new (NotNull, allocateCell<JSArray>(vm.heap, deferralContext)) JSArray(vm, structure, butterfly);
     array->finishCreation(vm);
     return array;
 }
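Putting the pieces together, the contract documented above combined with the new deferral-context parameter looks roughly like this. This is a hypothetical helper, and it assumes JSObject's three-argument initializeIndex(VM&, unsigned, JSValue) overload:

    JSArray* createFilledArray(ExecState* exec, Structure* structure, unsigned length, JSValue fill)
    {
        VM& vm = exec->vm();
        GCDeferralContext deferralContext(vm.heap);
        JSArray* array = JSArray::tryCreateUninitialized(vm, &deferralContext, structure, length);
        if (!array)
            return nullptr; // out of memory, or the vector could not be allocated
        for (unsigned i = 0; i < length; ++i)
            array->initializeIndex(vm, i, fill);
        return array;
    } // Any collection requested while allocating runs when deferralContext goes out of scope.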
index 00e79e2..c40f3e2 100644 (file)
@@ -39,6 +39,7 @@
 namespace JSC {
 
 class CopyVisitor;
+class GCDeferralContext;
 class ExecState;
 class Identifier;
 class JSArrayBufferView;
@@ -52,6 +53,9 @@ class Structure;
 template<typename T> void* allocateCell(Heap&);
 template<typename T> void* allocateCell(Heap&, size_t);
 
+template<typename T> void* allocateCell(Heap&, GCDeferralContext*);
+template<typename T> void* allocateCell(Heap&, GCDeferralContext*, size_t);
+
 #define DECLARE_EXPORT_INFO                                             \
     protected:                                                          \
         static JS_EXPORTDATA const ::JSC::ClassInfo s_info;             \
@@ -69,6 +73,8 @@ class JSCell : public HeapCell {
     friend class MarkedBlock;
     template<typename T> friend void* allocateCell(Heap&);
     template<typename T> friend void* allocateCell(Heap&, size_t);
+    template<typename T> friend void* allocateCell(Heap&, GCDeferralContext*);
+    template<typename T> friend void* allocateCell(Heap&, GCDeferralContext*, size_t);
 
 public:
     static const unsigned StructureFlags = 0;
index 20becec..49ee34e 100644 (file)
@@ -141,6 +141,25 @@ void* allocateCell(Heap& heap)
     return allocateCell<T>(heap, sizeof(T));
 }
     
+template<typename T>
+void* allocateCell(Heap& heap, GCDeferralContext* deferralContext, size_t size)
+{
+    ASSERT(size >= sizeof(T));
+    JSCell* result = static_cast<JSCell*>(heap.allocateObjectOfType<T>(deferralContext, size));
+#if ENABLE(GC_VALIDATION)
+    ASSERT(!heap.vm()->isInitializingObject());
+    heap.vm()->setInitializingObjectClass(T::info());
+#endif
+    result->clearStructure();
+    return result;
+}
+    
+template<typename T>
+void* allocateCell(Heap& heap, GCDeferralContext* deferralContext)
+{
+    return allocateCell<T>(heap, deferralContext, sizeof(T));
+}
+    
 inline bool JSCell::isObject() const
 {
     return TypeInfo::isObject(m_type);
index 437dbcf..695bfb7 100644 (file)
@@ -417,7 +417,7 @@ public:
 
     // NOTE: Clients of this method may call it more than once for any index, and this is supposed
     // to work.
-    void initializeIndex(VM& vm, unsigned i, JSValue v, IndexingType indexingType)
+    ALWAYS_INLINE void initializeIndex(VM& vm, unsigned i, JSValue v, IndexingType indexingType)
     {
         Butterfly* butterfly = m_butterfly.get();
         switch (indexingType) {
@@ -467,6 +467,54 @@ public:
         }
     }
         
+    void initializeIndexWithoutBarrier(unsigned i, JSValue v)
+    {
+        initializeIndexWithoutBarrier(i, v, indexingType());
+    }
+
+    // This version of initializeIndex is for cases where you know that you will not need any
+    // barriers. This implies not having any data format conversions.
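+    // The value must already match the given indexing type (see the RELEASE_ASSERTs below); this
+    // method never converts the array's format.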
+    ALWAYS_INLINE void initializeIndexWithoutBarrier(unsigned i, JSValue v, IndexingType indexingType)
+    {
+        Butterfly* butterfly = m_butterfly.get();
+        switch (indexingType) {
+        case ALL_UNDECIDED_INDEXING_TYPES: {
+            RELEASE_ASSERT_NOT_REACHED();
+            break;
+        }
+        case ALL_INT32_INDEXING_TYPES: {
+            ASSERT(i < butterfly->publicLength());
+            ASSERT(i < butterfly->vectorLength());
+            RELEASE_ASSERT(v.isInt32());
+            FALLTHROUGH;
+        }
+        case ALL_CONTIGUOUS_INDEXING_TYPES: {
+            ASSERT(i < butterfly->publicLength());
+            ASSERT(i < butterfly->vectorLength());
+            butterfly->contiguous()[i].setWithoutWriteBarrier(v);
+            break;
+        }
+        case ALL_DOUBLE_INDEXING_TYPES: {
+            ASSERT(i < butterfly->publicLength());
+            ASSERT(i < butterfly->vectorLength());
+            RELEASE_ASSERT(v.isNumber());
+            double value = v.asNumber();
+            RELEASE_ASSERT(value == value);
+            butterfly->contiguousDouble()[i] = value;
+            break;
+        }
+        case ALL_ARRAY_STORAGE_INDEXING_TYPES: {
+            ArrayStorage* storage = butterfly->arrayStorage();
+            ASSERT(i < storage->length());
+            ASSERT(i < storage->m_numValuesInVector);
+            storage->m_vector[i].setWithoutWriteBarrier(v);
+            break;
+        }
+        default:
+            RELEASE_ASSERT_NOT_REACHED();
+        }
+    }
+        
     bool hasSparseMap()
     {
         switch (indexingType()) {
@@ -641,6 +689,7 @@ public:
     // Fast access to known property offsets.
     JSValue getDirect(PropertyOffset offset) const { return locationForOffset(offset)->get(); }
     void putDirect(VM& vm, PropertyOffset offset, JSValue value) { locationForOffset(offset)->set(vm, this, value); }
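+    // Like putDirect(), but skips the write barrier. Only for stores that provably do not need one,
+    // e.g. while the object is still being initialized and the collector cannot scan it yet.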
+    void putDirectWithoutBarrier(PropertyOffset offset, JSValue value) { locationForOffset(offset)->setWithoutWriteBarrier(value); }
     void putDirectUndefined(PropertyOffset offset) { locationForOffset(offset)->setUndefined(); }
 
     JS_EXPORT_PRIVATE bool putDirectNativeIntrinsicGetter(VM&, JSGlobalObject*, Identifier, NativeFunction, Intrinsic, unsigned attributes);
index 14177fc..672aae7 100644 (file)
@@ -365,13 +365,18 @@ public:
         return newString;
     }
 
-    ALWAYS_INLINE static JSString* createSubstringOfResolved(VM& vm, JSString* base, unsigned offset, unsigned length)
+    ALWAYS_INLINE static JSString* createSubstringOfResolved(VM& vm, GCDeferralContext* deferralContext, JSString* base, unsigned offset, unsigned length)
     {
-        JSRopeString* newString = new (NotNull, allocateCell<JSRopeString>(vm.heap)) JSRopeString(vm);
+        JSRopeString* newString = new (NotNull, allocateCell<JSRopeString>(vm.heap, deferralContext)) JSRopeString(vm);
         newString->finishCreationSubstringOfResolved(vm, base, offset, length);
         return newString;
     }
 
+    ALWAYS_INLINE static JSString* createSubstringOfResolved(VM& vm, JSString* base, unsigned offset, unsigned length)
+    {
+        return createSubstringOfResolved(vm, nullptr, base, offset, length);
+    }
+
     void visitFibers(SlotVisitor&);
 
     static ptrdiff_t offsetOfFibers() { return OBJECT_OFFSETOF(JSRopeString, u); }
@@ -491,6 +496,7 @@ inline JSString* asString(JSValue value)
     return jsCast<JSString*>(value.asCell());
 }
 
+// This MUST NOT GC.
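+// (It only returns the VM's cached empty string, so it allocates nothing.)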
 inline JSString* jsEmptyString(VM* vm)
 {
     return vm->smallStrings.emptyString();
@@ -581,7 +587,7 @@ inline JSString* jsSubstring(VM& vm, ExecState* exec, JSString* s, unsigned offs
     return JSRopeString::create(vm, exec, s, offset, length);
 }
 
-inline JSString* jsSubstringOfResolved(VM& vm, JSString* s, unsigned offset, unsigned length)
+inline JSString* jsSubstringOfResolved(VM& vm, GCDeferralContext* deferralContext, JSString* s, unsigned offset, unsigned length)
 {
     ASSERT(offset <= static_cast<unsigned>(s->length()));
     ASSERT(length <= static_cast<unsigned>(s->length()));
@@ -590,7 +596,12 @@ inline JSString* jsSubstringOfResolved(VM& vm, JSString* s, unsigned offset, uns
         return vm.smallStrings.emptyString();
     if (!offset && length == s->length())
         return s;
-    return JSRopeString::createSubstringOfResolved(vm, s, offset, length);
+    return JSRopeString::createSubstringOfResolved(vm, deferralContext, s, offset, length);
+}
+
+inline JSString* jsSubstringOfResolved(VM& vm, JSString* s, unsigned offset, unsigned length)
+{
+    return jsSubstringOfResolved(vm, nullptr, s, offset, length);
 }
 
 inline JSString* jsSubstring(ExecState* exec, JSString* s, unsigned offset, unsigned length)
index 67fdadb..db5c572 100644 (file)
@@ -181,6 +181,8 @@ typedef const char* optionString;
     v(bool, testTheFTL, false, Normal, nullptr) \
     v(bool, verboseSanitizeStack, false, Normal, nullptr) \
     v(bool, useGenerationalGC, true, Normal, nullptr) \
+    v(bool, useConcurrentBarriers, true, Normal, nullptr) \
+    v(bool, forceFencedBarrier, false, Normal, nullptr) \
     v(bool, scribbleFreeCells, false, Normal, nullptr) \
     v(double, sizeClassProgression, 1.4, Normal, nullptr) \
     v(unsigned, largeAllocationCutoff, 100000, Normal, nullptr) \
index a4cfdd9..163c726 100644 (file)
@@ -36,29 +36,31 @@ JSArray* createEmptyRegExpMatchesArray(JSGlobalObject* globalObject, JSString* i
     // FIXME: This should handle array allocation errors gracefully.
     // https://bugs.webkit.org/show_bug.cgi?id=155144
     
+    GCDeferralContext deferralContext(vm.heap);
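+    // Any GC these allocations would trigger is deferred until the context is destroyed, so the
+    // collector never scans the array while it still holds uninitialized slots; that is why the
+    // stores below can skip the write barrier.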
+    
     if (UNLIKELY(globalObject->isHavingABadTime())) {
-        array = JSArray::tryCreateUninitialized(vm, globalObject->regExpMatchesArrayStructure(), regExp->numSubpatterns() + 1);
+        array = JSArray::tryCreateUninitialized(vm, &deferralContext, globalObject->regExpMatchesArrayStructure(), regExp->numSubpatterns() + 1);
         
-        array->initializeIndex(vm, 0, jsEmptyString(&vm));
+        array->initializeIndexWithoutBarrier(0, jsEmptyString(&vm));
         
         if (unsigned numSubpatterns = regExp->numSubpatterns()) {
             for (unsigned i = 1; i <= numSubpatterns; ++i)
-                array->initializeIndex(vm, i, jsUndefined());
+                array->initializeIndexWithoutBarrier(i, jsUndefined());
         }
     } else {
-        array = tryCreateUninitializedRegExpMatchesArray(vm, globalObject->regExpMatchesArrayStructure(), regExp->numSubpatterns() + 1);
+        array = tryCreateUninitializedRegExpMatchesArray(vm, &deferralContext, globalObject->regExpMatchesArrayStructure(), regExp->numSubpatterns() + 1);
         RELEASE_ASSERT(array);
         
-        array->initializeIndex(vm, 0, jsEmptyString(&vm), ArrayWithContiguous);
+        array->initializeIndexWithoutBarrier(0, jsEmptyString(&vm), ArrayWithContiguous);
         
         if (unsigned numSubpatterns = regExp->numSubpatterns()) {
             for (unsigned i = 1; i <= numSubpatterns; ++i)
-                array->initializeIndex(vm, i, jsUndefined(), ArrayWithContiguous);
+                array->initializeIndexWithoutBarrier(i, jsUndefined(), ArrayWithContiguous);
         }
     }
 
-    array->putDirect(vm, RegExpMatchesArrayIndexPropertyOffset, jsNumber(-1));
-    array->putDirect(vm, RegExpMatchesArrayInputPropertyOffset, input);
+    array->putDirectWithoutBarrier(RegExpMatchesArrayIndexPropertyOffset, jsNumber(-1));
+    array->putDirectWithoutBarrier(RegExpMatchesArrayInputPropertyOffset, input);
     return array;
 }
 
index 233678c..237e0df 100644 (file)
@@ -20,6 +20,7 @@
 #pragma once
 
 #include "ButterflyInlines.h"
+#include "GCDeferralContextInlines.h"
 #include "JSArray.h"
 #include "JSCInlines.h"
 #include "JSGlobalObject.h"
@@ -31,13 +32,13 @@ namespace JSC {
 static const PropertyOffset RegExpMatchesArrayIndexPropertyOffset = 100;
 static const PropertyOffset RegExpMatchesArrayInputPropertyOffset = 101;
 
-ALWAYS_INLINE JSArray* tryCreateUninitializedRegExpMatchesArray(VM& vm, Structure* structure, unsigned initialLength)
+ALWAYS_INLINE JSArray* tryCreateUninitializedRegExpMatchesArray(VM& vm, GCDeferralContext* deferralContext, Structure* structure, unsigned initialLength)
 {
     unsigned vectorLength = initialLength;
     if (vectorLength > MAX_STORAGE_VECTOR_LENGTH)
         return 0;
 
-    void* temp = vm.heap.tryAllocateAuxiliary(nullptr, Butterfly::totalSize(0, structure->outOfLineCapacity(), true, vectorLength * sizeof(EncodedJSValue)));
+    void* temp = vm.heap.tryAllocateAuxiliary(deferralContext, nullptr, Butterfly::totalSize(0, structure->outOfLineCapacity(), true, vectorLength * sizeof(EncodedJSValue)));
     if (!temp)
         return nullptr;
     Butterfly* butterfly = Butterfly::fromBase(temp, 0, structure->outOfLineCapacity());
@@ -47,7 +48,7 @@ ALWAYS_INLINE JSArray* tryCreateUninitializedRegExpMatchesArray(VM& vm, Structur
     for (unsigned i = initialLength; i < vectorLength; ++i)
         butterfly->contiguous()[i].clear();
     
-    return JSArray::createWithButterfly(vm, structure, butterfly);
+    return JSArray::createWithButterfly(vm, deferralContext, structure, butterfly);
 }
 
 ALWAYS_INLINE JSArray* createRegExpMatchesArray(
@@ -76,47 +77,44 @@ ALWAYS_INLINE JSArray* createRegExpMatchesArray(
     
     unsigned numSubpatterns = regExp->numSubpatterns();
     
+    GCDeferralContext deferralContext(vm.heap);
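+    // GC stays deferred until this scope ends, so the barrier-free initialization below is safe.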
+    
     if (UNLIKELY(globalObject->isHavingABadTime())) {
-        array = JSArray::tryCreateUninitialized(vm, globalObject->regExpMatchesArrayStructure(), numSubpatterns + 1);
+        array = JSArray::tryCreateUninitialized(vm, &deferralContext, globalObject->regExpMatchesArrayStructure(), numSubpatterns + 1);
         
         setProperties();
         
-        array->initializeIndex(vm, 0, jsUndefined());
-        
-        for (unsigned i = 1; i <= numSubpatterns; ++i)
-            array->initializeIndex(vm, i, jsUndefined());
-        
-        // Now the object is safe to scan by GC.
-        
-        array->initializeIndex(vm, 0, jsSubstringOfResolved(vm, input, result.start, result.end - result.start));
+        array->initializeIndexWithoutBarrier(0, jsSubstringOfResolved(vm, &deferralContext, input, result.start, result.end - result.start));
         
         for (unsigned i = 1; i <= numSubpatterns; ++i) {
             int start = subpatternResults[2 * i];
+            JSValue value;
             if (start >= 0)
-                array->initializeIndex(vm, i, JSRopeString::createSubstringOfResolved(vm, input, start, subpatternResults[2 * i + 1] - start));
+                value = JSRopeString::createSubstringOfResolved(vm, &deferralContext, input, start, subpatternResults[2 * i + 1] - start);
+            else
+                value = jsUndefined();
+            array->initializeIndexWithoutBarrier(i, value);
         }
     } else {
-        array = tryCreateUninitializedRegExpMatchesArray(vm, globalObject->regExpMatchesArrayStructure(), numSubpatterns + 1);
+        array = tryCreateUninitializedRegExpMatchesArray(vm, &deferralContext, globalObject->regExpMatchesArrayStructure(), numSubpatterns + 1);
         RELEASE_ASSERT(array);
         
         setProperties();
         
-        array->initializeIndex(vm, 0, jsUndefined(), ArrayWithContiguous);
-        
-        for (unsigned i = 1; i <= numSubpatterns; ++i)
-            array->initializeIndex(vm, i, jsUndefined(), ArrayWithContiguous);
-        
-        // Now the object is safe to scan by GC.
-
-        array->initializeIndex(vm, 0, jsSubstringOfResolved(vm, input, result.start, result.end - result.start), ArrayWithContiguous);
+        array->initializeIndexWithoutBarrier(0, jsSubstringOfResolved(vm, &deferralContext, input, result.start, result.end - result.start), ArrayWithContiguous);
         
         for (unsigned i = 1; i <= numSubpatterns; ++i) {
             int start = subpatternResults[2 * i];
+            JSValue value;
             if (start >= 0)
-                array->initializeIndex(vm, i, JSRopeString::createSubstringOfResolved(vm, input, start, subpatternResults[2 * i + 1] - start), ArrayWithContiguous);
+                value = JSRopeString::createSubstringOfResolved(vm, &deferralContext, input, start, subpatternResults[2 * i + 1] - start);
+            else
+                value = jsUndefined();
+            array->initializeIndexWithoutBarrier(i, value, ArrayWithContiguous);
         }
     }
-
     return array;
 }
 
index eabf5f2..6b7bcfa 100644 (file)
@@ -1,3 +1,17 @@
+2016-09-28  Filip Pizlo  <fpizlo@apple.com>
+
+        The write barrier should be down with TSO
+        https://bugs.webkit.org/show_bug.cgi?id=162316
+
+        Reviewed by Geoffrey Garen.
+        
+        Added clearRange(), which quickly clears a range of bits. This turned out to be useful for
+        a DFG optimization pass.
+
+        * wtf/FastBitVector.cpp:
+        (WTF::FastBitVector::clearRange):
+        * wtf/FastBitVector.h:
+
 2016-09-28  Mark Lam  <mark.lam@apple.com>
 
         Fix race condition in StringView's UnderlyingString lifecycle management.
index 5a76ad7..eed3169 100644 (file)
@@ -53,5 +53,24 @@ void FastBitVectorWordOwner::resizeSlow(size_t numBits)
     m_words = newArray;
 }
 
+void FastBitVector::clearRange(size_t begin, size_t end)
+{
+    if (end - begin < 32) {
+        for (size_t i = begin; i < end; ++i)
+            at(i) = false;
+        return;
+    }
+    
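+    // endBeginSlop is begin rounded up to a 32-bit word boundary; beginEndSlop is end rounded down.
+    // Clear the unaligned bits at either end individually, then zero the whole words in between.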
+    size_t endBeginSlop = (begin + 31) & ~31;
+    size_t beginEndSlop = end & ~31;
+    
+    for (size_t i = begin; i < endBeginSlop; ++i)
+        at(i) = false;
+    for (size_t i = beginEndSlop; i < end; ++i)
+        at(i) = false;
+    for (size_t i = endBeginSlop / 32; i < beginEndSlop / 32; ++i)
+        m_words.word(i) = 0;
+}
+
 } // namespace WTF
 
index 2a65ad4..ba25e72 100644 (file)
@@ -468,6 +468,8 @@ public:
     {
         m_words.clearAll();
     }
+    
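+    // Sets all bits in [begin, end) to false.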
+    WTF_EXPORT_PRIVATE void clearRange(size_t begin, size_t end);
 
     // Returns true if the contents of this bitvector changed.
     template<typename OtherWords>