Concurrent GC should be able to run splay in debug mode and earley/raytrace in releas...
authorfpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Fri, 18 Nov 2016 22:11:51 +0000 (22:11 +0000)
committerfpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Fri, 18 Nov 2016 22:11:51 +0000 (22:11 +0000)
https://bugs.webkit.org/show_bug.cgi?id=164282

Reviewed by Geoffrey Garen and Oliver Hunt.

PerformanceTests:

CDjs is a fun benchmark for stressing concurrent GCs, but to really give the GC a good
workout you need to increase the amount of work that the test does. This adds a second
configuration of the benchmark that has more aircraft. It uses much more memory and causes us
to do more GCs and those GCs take longer.

* JetStream/cdjs/benchmark.js:
(benchmarkImpl):
(benchmark):
* JetStream/cdjs/large.js: Added.

Source/JavaScriptCore:

The two three remaining bugs were:

- Improper ordering inside putDirectWithoutTransition() and friends. We need to make sure
  that the GC doesn't see the store to Structure::m_offset until we've resized the butterfly.
  That proved a bit tricky. On the other hand, this means that we could probably remove the
  requirement that the GC holds the Structure lock in some cases. I haven't removed that lock
  yet because I still think it might protect some weird cases, and it doesn't seem to cost us
  anything.

- CodeBlock's GC strategy needed to be made thread-safe (visitWeakly, visitChildren, and
  their friends now hold locks) and incremental-safe (we need to update predictions in the
  finalizer to make sure we clear anything that was put into a value profile towards the end
  of GC).

- The GC timeslicing scheduler needed to be made a bit more aggressive to deal with
  generational workloads like earley, raytrace, and CDjs. Once I got those benchmarks to run,
  I found that they would do many useless iterations of GC because they wouldn't pause long
  enough after rescanning weak references and roots. I added a bunch of knobs for forcing a
  pause. In the end, I realized that I could get the desired effect by putting a ceiling on
  mutator utilization. We want the GC to finish quickly if it is possible to do so, even if
  the amount of allocation that the mutator had done is low. Having a utilization ceiling
  seems to accomplish this for benchmarks with trivial heaps (earley and raytrace) as well as
  huge heaps (like CDjs in its "large" configuration).

This preserves splay performance, makes the concurrent GC more stable, and makes the
concurrent GC not a perf regression on earley or raytrace. It seems to give us great CDjs
performance as well, but this is still hard to tell because we crash a lot in that benchmark.

* bytecode/CodeBlock.cpp:
(JSC::CodeBlock::CodeBlock):
(JSC::CodeBlock::visitWeakly):
(JSC::CodeBlock::visitChildren):
(JSC::CodeBlock::shouldVisitStrongly):
(JSC::CodeBlock::shouldJettisonDueToOldAge):
(JSC::CodeBlock::propagateTransitions):
(JSC::CodeBlock::determineLiveness):
(JSC::CodeBlock::WeakReferenceHarvester::visitWeakReferences):
(JSC::CodeBlock::UnconditionalFinalizer::finalizeUnconditionally):
(JSC::CodeBlock::visitOSRExitTargets):
(JSC::CodeBlock::stronglyVisitStrongReferences):
(JSC::CodeBlock::stronglyVisitWeakReferences):
* bytecode/CodeBlock.h:
(JSC::CodeBlock::clearVisitWeaklyHasBeenCalled):
* heap/CodeBlockSet.cpp:
(JSC::CodeBlockSet::deleteUnmarkedAndUnreferenced):
* heap/Heap.cpp:
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope):
(JSC::Heap::markToFixpoint):
(JSC::Heap::beginMarking):
(JSC::Heap::addToRememberedSet):
(JSC::Heap::collectInThread):
* heap/Heap.h:
* heap/HeapInlines.h:
(JSC::Heap::mutatorFence):
* heap/MarkedBlock.cpp:
* runtime/JSCellInlines.h:
(JSC::JSCell::finishCreation):
* runtime/JSObjectInlines.h:
(JSC::JSObject::putDirectWithoutTransition):
(JSC::JSObject::putDirectInternal):
* runtime/Options.h:
* runtime/Structure.cpp:
(JSC::Structure::add):
* runtime/Structure.h:
* runtime/StructureInlines.h:
(JSC::Structure::add):

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208897 268f45cc-cd09-0410-ab3c-d52691b4dbfc

17 files changed:
PerformanceTests/ChangeLog
PerformanceTests/JetStream/cdjs/benchmark.js
PerformanceTests/JetStream/cdjs/large.js [new file with mode: 0644]
Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/bytecode/CodeBlock.cpp
Source/JavaScriptCore/bytecode/CodeBlock.h
Source/JavaScriptCore/heap/CodeBlockSet.cpp
Source/JavaScriptCore/heap/Heap.cpp
Source/JavaScriptCore/heap/Heap.h
Source/JavaScriptCore/heap/HeapInlines.h
Source/JavaScriptCore/heap/MarkedBlock.cpp
Source/JavaScriptCore/runtime/JSCellInlines.h
Source/JavaScriptCore/runtime/JSObjectInlines.h
Source/JavaScriptCore/runtime/Options.h
Source/JavaScriptCore/runtime/Structure.cpp
Source/JavaScriptCore/runtime/Structure.h
Source/JavaScriptCore/runtime/StructureInlines.h

index c9279a1..8ee3482 100644 (file)
@@ -1,3 +1,20 @@
+2016-11-18  Filip Pizlo  <fpizlo@apple.com>
+
+        Concurrent GC should be able to run splay in debug mode and earley/raytrace in release mode with no perf regression
+        https://bugs.webkit.org/show_bug.cgi?id=164282
+
+        Reviewed by Geoffrey Garen and Oliver Hunt.
+        
+        CDjs is a fun benchmark for stressing concurrent GCs, but to really give the GC a good
+        workout you need to increase the amount of work that the test does. This adds a second
+        configuration of the benchmark that has more aircraft. It uses much more memory and causes us
+        to do more GCs and those GCs take longer.
+
+        * JetStream/cdjs/benchmark.js:
+        (benchmarkImpl):
+        (benchmark):
+        * JetStream/cdjs/large.js: Added.
+
 2016-11-14  Filip Pizlo  <fpizlo@apple.com>
 
         Unreviewed, revert unintended change.
index 9f8d108..49b06be 100644 (file)
@@ -1,5 +1,5 @@
 // Copyright (c) 2001-2010, Purdue University. All rights reserved.
-// Copyright (C) 2015 Apple Inc. All rights reserved.
+// Copyright (C) 2015-2016 Apple Inc. All rights reserved.
 // 
 // Redistribution and use in source and binary forms, with or without
 // modification, are permitted provided that the following conditions are met:
 // (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
-function benchmark() {
-    var verbosity = 0;
-    var numAircraft = 1000;
-    var numFrames = 200;
-    var expectedCollisions = 14484;
-    var percentile = 95;
+function benchmarkImpl(configuration) {
+    var verbosity = configuration.verbosity;
+    var numAircraft = configuration.numAircraft;
+    var numFrames = configuration.numFrames;
+    var expectedCollisions = configuration.expectedCollisions;
+    var percentile = configuration.percentile;
 
     var simulator = new Simulator(numAircraft);
     var detector = new CollisionDetector();
@@ -78,3 +78,24 @@ function benchmark() {
     
     return averageAbovePercentile(times, percentile);
 }
+
+function benchmark() {
+    return benchmarkImpl({
+        verbosity: 0,
+        numAircraft: 1000,
+        numFrames: 200,
+        expectedCollisions: 14484,
+        percentile: 95
+    });
+}
+
+function largeBenchmark() {
+    return benchmarkImpl({
+        verbosity: 0,
+        numAircraft: 20000,
+        numFrames: 100,
+        expectedCollisions: 5827,
+        percentile: 95
+    });
+}
+
diff --git a/PerformanceTests/JetStream/cdjs/large.js b/PerformanceTests/JetStream/cdjs/large.js
new file mode 100644 (file)
index 0000000..8c1388c
--- /dev/null
@@ -0,0 +1,45 @@
+// Copyright (c) 2001-2010, Purdue University. All rights reserved.
+// Copyright (C) 2016 Apple Inc. All rights reserved.
+// 
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//  * Redistributions of source code must retain the above copyright
+//    notice, this list of conditions and the following disclaimer.
+//  * Redistributions in binary form must reproduce the above copyright
+//    notice, this list of conditions and the following disclaimer in the
+//    documentation and/or other materials provided with the distribution.
+//  * Neither the name of the Purdue University nor the
+//    names of its contributors may be used to endorse or promote products
+//    derived from this software without specific prior written permission.
+// 
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE LIABLE FOR ANY
+// DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+// (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+// LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+// ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+// This is run as a JSC stress test. Let the harness know that this is a slow test.
+//@ slow!
+
+load("constants.js");
+load("util.js");
+load("red_black_tree.js");
+load("call_sign.js");
+load("vector_2d.js");
+load("vector_3d.js");
+load("motion.js");
+load("reduce_collision_set.js");
+load("simulator.js");
+load("collision.js");
+load("collision_detector.js");
+load("benchmark.js");
+
+var result = largeBenchmark();
+
+print("Average worst case: " + result + " ms.");
+
index 3cefd6a..64cfff7 100644 (file)
@@ -1,3 +1,77 @@
+2016-11-18  Filip Pizlo  <fpizlo@apple.com>
+
+        Concurrent GC should be able to run splay in debug mode and earley/raytrace in release mode with no perf regression
+        https://bugs.webkit.org/show_bug.cgi?id=164282
+
+        Reviewed by Geoffrey Garen and Oliver Hunt.
+        
+        The two three remaining bugs were:
+
+        - Improper ordering inside putDirectWithoutTransition() and friends. We need to make sure
+          that the GC doesn't see the store to Structure::m_offset until we've resized the butterfly.
+          That proved a bit tricky. On the other hand, this means that we could probably remove the
+          requirement that the GC holds the Structure lock in some cases. I haven't removed that lock
+          yet because I still think it might protect some weird cases, and it doesn't seem to cost us
+          anything.
+        
+        - CodeBlock's GC strategy needed to be made thread-safe (visitWeakly, visitChildren, and
+          their friends now hold locks) and incremental-safe (we need to update predictions in the
+          finalizer to make sure we clear anything that was put into a value profile towards the end
+          of GC).
+        
+        - The GC timeslicing scheduler needed to be made a bit more aggressive to deal with
+          generational workloads like earley, raytrace, and CDjs. Once I got those benchmarks to run,
+          I found that they would do many useless iterations of GC because they wouldn't pause long
+          enough after rescanning weak references and roots. I added a bunch of knobs for forcing a
+          pause. In the end, I realized that I could get the desired effect by putting a ceiling on
+          mutator utilization. We want the GC to finish quickly if it is possible to do so, even if
+          the amount of allocation that the mutator had done is low. Having a utilization ceiling
+          seems to accomplish this for benchmarks with trivial heaps (earley and raytrace) as well as
+          huge heaps (like CDjs in its "large" configuration).
+        
+        This preserves splay performance, makes the concurrent GC more stable, and makes the
+        concurrent GC not a perf regression on earley or raytrace. It seems to give us great CDjs
+        performance as well, but this is still hard to tell because we crash a lot in that benchmark.
+
+        * bytecode/CodeBlock.cpp:
+        (JSC::CodeBlock::CodeBlock):
+        (JSC::CodeBlock::visitWeakly):
+        (JSC::CodeBlock::visitChildren):
+        (JSC::CodeBlock::shouldVisitStrongly):
+        (JSC::CodeBlock::shouldJettisonDueToOldAge):
+        (JSC::CodeBlock::propagateTransitions):
+        (JSC::CodeBlock::determineLiveness):
+        (JSC::CodeBlock::WeakReferenceHarvester::visitWeakReferences):
+        (JSC::CodeBlock::UnconditionalFinalizer::finalizeUnconditionally):
+        (JSC::CodeBlock::visitOSRExitTargets):
+        (JSC::CodeBlock::stronglyVisitStrongReferences):
+        (JSC::CodeBlock::stronglyVisitWeakReferences):
+        * bytecode/CodeBlock.h:
+        (JSC::CodeBlock::clearVisitWeaklyHasBeenCalled):
+        * heap/CodeBlockSet.cpp:
+        (JSC::CodeBlockSet::deleteUnmarkedAndUnreferenced):
+        * heap/Heap.cpp:
+        (JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope):
+        (JSC::Heap::markToFixpoint):
+        (JSC::Heap::beginMarking):
+        (JSC::Heap::addToRememberedSet):
+        (JSC::Heap::collectInThread):
+        * heap/Heap.h:
+        * heap/HeapInlines.h:
+        (JSC::Heap::mutatorFence):
+        * heap/MarkedBlock.cpp:
+        * runtime/JSCellInlines.h:
+        (JSC::JSCell::finishCreation):
+        * runtime/JSObjectInlines.h:
+        (JSC::JSObject::putDirectWithoutTransition):
+        (JSC::JSObject::putDirectInternal):
+        * runtime/Options.h:
+        * runtime/Structure.cpp:
+        (JSC::Structure::add):
+        * runtime/Structure.h:
+        * runtime/StructureInlines.h:
+        (JSC::Structure::add):
+
 2016-11-18  Joseph Pecoraro  <pecoraro@apple.com>
 
         Web Inspector: Generator functions should have a displayable name when shown in stack traces
index fbd8549..6456340 100644 (file)
@@ -1873,7 +1873,7 @@ CodeBlock::CodeBlock(VM* vm, Structure* structure, CopyParsedBlockTag, CodeBlock
     , m_reoptimizationRetryCounter(0)
     , m_creationTime(std::chrono::steady_clock::now())
 {
-    m_visitWeaklyHasBeenCalled.store(false, std::memory_order_relaxed);
+    m_visitWeaklyHasBeenCalled = false;
 
     ASSERT(heap()->isDeferred());
     ASSERT(m_scopeRegister.isLocal());
@@ -1932,7 +1932,7 @@ CodeBlock::CodeBlock(VM* vm, Structure* structure, ScriptExecutable* ownerExecut
     , m_reoptimizationRetryCounter(0)
     , m_creationTime(std::chrono::steady_clock::now())
 {
-    m_visitWeaklyHasBeenCalled.store(false, std::memory_order_relaxed);
+    m_visitWeaklyHasBeenCalled = false;
 
     ASSERT(heap()->isDeferred());
     ASSERT(m_scopeRegister.isLocal());
@@ -2508,18 +2508,20 @@ CodeBlock* CodeBlock::specialOSREntryBlockOrNull()
 
 void CodeBlock::visitWeakly(SlotVisitor& visitor)
 {
-    bool setByMe = !m_visitWeaklyHasBeenCalled.compareExchangeStrong(false, true);
-    if (!setByMe)
+    ConcurrentJSLocker locker(m_lock);
+    if (m_visitWeaklyHasBeenCalled)
         return;
+    
+    m_visitWeaklyHasBeenCalled = true;
 
     if (Heap::isMarkedConcurrently(this))
         return;
 
-    if (shouldVisitStrongly()) {
+    if (shouldVisitStrongly(locker)) {
         visitor.appendUnbarrieredReadOnlyPointer(this);
         return;
     }
-
+    
     // There are two things that may use unconditional finalizers: inline cache clearing
     // and jettisoning. The probability of us wanting to do at least one of those things
     // is probably quite close to 1. So we add one no matter what and when it runs, it
@@ -2549,10 +2551,10 @@ void CodeBlock::visitWeakly(SlotVisitor& visitor)
     // decision by calling harvestWeakReferences().
 
     m_allTransitionsHaveBeenMarked = false;
-    propagateTransitions(visitor);
+    propagateTransitions(locker, visitor);
 
     m_jitCode->dfgCommon()->livenessHasBeenProved = false;
-    determineLiveness(visitor);
+    determineLiveness(locker, visitor);
 #endif // ENABLE(DFG_JIT)
 }
 
@@ -2575,6 +2577,7 @@ void CodeBlock::visitChildren(JSCell* cell, SlotVisitor& visitor)
 
 void CodeBlock::visitChildren(SlotVisitor& visitor)
 {
+    ConcurrentJSLocker locker(m_lock);
     // There are two things that may use unconditional finalizers: inline cache clearing
     // and jettisoning. The probability of us wanting to do at least one of those things
     // is probably quite close to 1. So we add one no matter what and when it runs, it
@@ -2592,19 +2595,19 @@ void CodeBlock::visitChildren(SlotVisitor& visitor)
         visitor.reportExtraMemoryVisited(m_instructions.size() * sizeof(Instruction) / refCount);
     }
 
-    stronglyVisitStrongReferences(visitor);
-    stronglyVisitWeakReferences(visitor);
+    stronglyVisitStrongReferences(locker, visitor);
+    stronglyVisitWeakReferences(locker, visitor);
 
     m_allTransitionsHaveBeenMarked = false;
-    propagateTransitions(visitor);
+    propagateTransitions(locker, visitor);
 }
 
-bool CodeBlock::shouldVisitStrongly()
+bool CodeBlock::shouldVisitStrongly(const ConcurrentJSLocker& locker)
 {
     if (Options::forceCodeBlockLiveness())
         return true;
 
-    if (shouldJettisonDueToOldAge())
+    if (shouldJettisonDueToOldAge(locker))
         return false;
 
     // Interpreter and Baseline JIT CodeBlocks don't need to be jettisoned when
@@ -2656,7 +2659,7 @@ static std::chrono::milliseconds timeToLive(JITCode::JITType jitType)
     }
 }
 
-bool CodeBlock::shouldJettisonDueToOldAge()
+bool CodeBlock::shouldJettisonDueToOldAge(const ConcurrentJSLocker&)
 {
     if (Heap::isMarkedConcurrently(this))
         return false;
@@ -2683,7 +2686,7 @@ static bool shouldMarkTransition(DFG::WeakReferenceTransition& transition)
 }
 #endif // ENABLE(DFG_JIT)
 
-void CodeBlock::propagateTransitions(SlotVisitor& visitor)
+void CodeBlock::propagateTransitions(const ConcurrentJSLocker&, SlotVisitor& visitor)
 {
     UNUSED_PARAM(visitor);
 
@@ -2764,7 +2767,7 @@ void CodeBlock::propagateTransitions(SlotVisitor& visitor)
         m_allTransitionsHaveBeenMarked = true;
 }
 
-void CodeBlock::determineLiveness(SlotVisitor& visitor)
+void CodeBlock::determineLiveness(const ConcurrentJSLocker&, SlotVisitor& visitor)
 {
     UNUSED_PARAM(visitor);
     
@@ -2810,9 +2813,9 @@ void CodeBlock::WeakReferenceHarvester::visitWeakReferences(SlotVisitor& visitor
     CodeBlock* codeBlock =
         bitwise_cast<CodeBlock*>(
             bitwise_cast<char*>(this) - OBJECT_OFFSETOF(CodeBlock, m_weakReferenceHarvester));
-
-    codeBlock->propagateTransitions(visitor);
-    codeBlock->determineLiveness(visitor);
+    
+    codeBlock->propagateTransitions(NoLockingNecessary, visitor);
+    codeBlock->determineLiveness(NoLockingNecessary, visitor);
 }
 
 void CodeBlock::clearLLIntGetByIdCache(Instruction* instruction)
@@ -2951,7 +2954,9 @@ void CodeBlock::UnconditionalFinalizer::finalizeUnconditionally()
 {
     CodeBlock* codeBlock = bitwise_cast<CodeBlock*>(
         bitwise_cast<char*>(this) - OBJECT_OFFSETOF(CodeBlock, m_unconditionalFinalizer));
-
+    
+    codeBlock->updateAllPredictions();
+    
 #if ENABLE(DFG_JIT)
     if (codeBlock->shouldJettisonDueToWeakReference()) {
         codeBlock->jettison(Profiler::JettisonDueToWeakReference);
@@ -2959,7 +2964,7 @@ void CodeBlock::UnconditionalFinalizer::finalizeUnconditionally()
     }
 #endif // ENABLE(DFG_JIT)
 
-    if (codeBlock->shouldJettisonDueToOldAge()) {
+    if (codeBlock->shouldJettisonDueToOldAge(NoLockingNecessary)) {
         codeBlock->jettison(Profiler::JettisonDueToOldAge);
         return;
     }
@@ -3099,7 +3104,7 @@ void CodeBlock::resetJITData()
 }
 #endif
 
-void CodeBlock::visitOSRExitTargets(SlotVisitor& visitor)
+void CodeBlock::visitOSRExitTargets(const ConcurrentJSLocker&, SlotVisitor& visitor)
 {
     // We strongly visit OSR exits targets because we don't want to deal with
     // the complexity of generating an exit target CodeBlock on demand and
@@ -3119,7 +3124,7 @@ void CodeBlock::visitOSRExitTargets(SlotVisitor& visitor)
 #endif
 }
 
-void CodeBlock::stronglyVisitStrongReferences(SlotVisitor& visitor)
+void CodeBlock::stronglyVisitStrongReferences(const ConcurrentJSLocker& locker, SlotVisitor& visitor)
 {
     visitor.append(&m_globalObject);
     visitor.append(&m_ownerExecutable);
@@ -3141,13 +3146,11 @@ void CodeBlock::stronglyVisitStrongReferences(SlotVisitor& visitor)
 
 #if ENABLE(DFG_JIT)
     if (JITCode::isOptimizingJIT(jitType()))
-        visitOSRExitTargets(visitor);
+        visitOSRExitTargets(locker, visitor);
 #endif
-
-    updateAllPredictions();
 }
 
-void CodeBlock::stronglyVisitWeakReferences(SlotVisitor& visitor)
+void CodeBlock::stronglyVisitWeakReferences(const ConcurrentJSLocker&, SlotVisitor& visitor)
 {
     UNUSED_PARAM(visitor);
 
index 819834b..6bb91db 100644 (file)
@@ -836,7 +836,7 @@ public:
     // concurrent compilation threads finish what they're doing.
     mutable ConcurrentJSLock m_lock;
 
-    Atomic<bool> m_visitWeaklyHasBeenCalled;
+    bool m_visitWeaklyHasBeenCalled;
 
     bool m_shouldAlwaysBeInlined; // Not a bitfield because the JIT wants to store to it.
 
@@ -947,16 +947,16 @@ private:
     void dumpRareCaseProfile(PrintStream&, const char* name, RareCaseProfile*, bool& hasPrintedProfiling);
     void dumpArithProfile(PrintStream&, ArithProfile*, bool& hasPrintedProfiling);
 
-    bool shouldVisitStrongly();
+    bool shouldVisitStrongly(const ConcurrentJSLocker&);
     bool shouldJettisonDueToWeakReference();
-    bool shouldJettisonDueToOldAge();
+    bool shouldJettisonDueToOldAge(const ConcurrentJSLocker&);
     
-    void propagateTransitions(SlotVisitor&);
-    void determineLiveness(SlotVisitor&);
+    void propagateTransitions(const ConcurrentJSLocker&, SlotVisitor&);
+    void determineLiveness(const ConcurrentJSLocker&, SlotVisitor&);
         
-    void stronglyVisitStrongReferences(SlotVisitor&);
-    void stronglyVisitWeakReferences(SlotVisitor&);
-    void visitOSRExitTargets(SlotVisitor&);
+    void stronglyVisitStrongReferences(const ConcurrentJSLocker&, SlotVisitor&);
+    void stronglyVisitWeakReferences(const ConcurrentJSLocker&, SlotVisitor&);
+    void visitOSRExitTargets(const ConcurrentJSLocker&, SlotVisitor&);
 
     std::chrono::milliseconds timeSinceCreation()
     {
@@ -1079,7 +1079,7 @@ inline Register& ExecState::uncheckedR(VirtualRegister reg)
 
 inline void CodeBlock::clearVisitWeaklyHasBeenCalled()
 {
-    m_visitWeaklyHasBeenCalled.store(false, std::memory_order_relaxed);
+    m_visitWeaklyHasBeenCalled = false;
 }
 
 template <typename ExecutableType>
index ec07db2..06c63b6 100644 (file)
@@ -94,8 +94,7 @@ void CodeBlockSet::deleteUnmarkedAndUnreferenced(CollectionScope scope)
     }
 
     // Any remaining young CodeBlocks are live and need to be promoted to the set of old CodeBlocks.
-    if (scope == CollectionScope::Eden)
-        promoteYoungCodeBlocks(locker);
+    promoteYoungCodeBlocks(locker);
 }
 
 bool CodeBlockSet::contains(const LockHolder&, void* candidateCodeBlock)
index 4540548..124d36f 100644 (file)
@@ -98,7 +98,7 @@ public:
         
         if (Options::logGC()) {
             double thisPauseMS = (MonotonicTime::now() - m_heap.m_stopTime).milliseconds();
-            dataLog(thisPauseMS, " ms (max ", maxPauseMS(thisPauseMS), ")...]\n");
+            dataLog("p=", thisPauseMS, " ms (max ", maxPauseMS(thisPauseMS), ")...]\n");
         }
         
         m_heap.resumeTheWorld();
@@ -542,19 +542,40 @@ void Heap::markToFixpoint(double gcStartTime)
     const Seconds period = Seconds::fromMilliseconds(Options::concurrentGCPeriodMS());
     
     const double bytesAllocatedThisCycleAtTheBeginning = m_bytesAllocatedThisCycle;
-    const double bytesAllocatedThisCycleAtTheEnd = bytesAllocatedThisCycleAtTheBeginning * Options::concurrentGCHeadroomRatio();
+    const double bytesAllocatedThisCycleAtTheEnd =
+        Options::concurrentGCMaxHeadroom() *
+        std::max(
+            bytesAllocatedThisCycleAtTheBeginning,
+            static_cast<double>(m_maxEdenSize));
     
-    auto targetCollectorUtilization = [&] () -> double {
+    auto targetMutatorUtilization = [&] () -> double {
         double headroomFullness =
             (m_bytesAllocatedThisCycle - bytesAllocatedThisCycleAtTheBeginning) /
             (bytesAllocatedThisCycleAtTheEnd - bytesAllocatedThisCycleAtTheBeginning);
         
+        // headroomFullness can be NaN and other interesting things if
+        // bytesAllocatedThisCycleAtTheBeginning is zero. We see that in debug tests. This code
+        // defends against all floating point dragons.
+        
         if (!(headroomFullness >= 0))
             headroomFullness = 0;
         if (!(headroomFullness <= 1))
             headroomFullness = 1;
         
-        return headroomFullness;
+        double mutatorUtilization = 1 - headroomFullness;
+        
+        // Scale the mutator utilization into the permitted window.
+        mutatorUtilization =
+            Options::minimumMutatorUtilization() +
+            mutatorUtilization * (
+                Options::maximumMutatorUtilization() -
+                Options::minimumMutatorUtilization());
+        
+        return mutatorUtilization;
+    };
+    
+    auto targetCollectorUtilization = [&] () -> double {
+        return 1 - targetMutatorUtilization();
     };
     
     auto elapsedInPeriod = [&] (MonotonicTime now) -> Seconds {
@@ -566,22 +587,41 @@ void Heap::markToFixpoint(double gcStartTime)
     };
     
     auto shouldBeResumed = [&] (MonotonicTime now) -> bool {
+        if (Options::collectorShouldResumeFirst())
+            return phase(now) <= targetMutatorUtilization();
         return phase(now) > targetCollectorUtilization();
     };
     
     auto timeToResume = [&] (MonotonicTime now) -> MonotonicTime {
         ASSERT(!shouldBeResumed(now));
+        if (Options::collectorShouldResumeFirst())
+            return now - elapsedInPeriod(now) + period;
         return now - elapsedInPeriod(now) + period * targetCollectorUtilization();
     };
     
     auto timeToStop = [&] (MonotonicTime now) -> MonotonicTime {
         ASSERT(shouldBeResumed(now));
-        return now - elapsedInPeriod(now) + period;
+        if (Options::collectorShouldResumeFirst())
+            return now - elapsedInPeriod(now) + period * targetMutatorUtilization();
+        return now -  - elapsedInPeriod(now) + period;
     };
     
+    // Adjust the target extra pause ratio as necessary.
+    double rateOfCollection =
+        (m_lastGCEndTime - m_lastGCStartTime) /
+        (m_currentGCStartTime - m_lastGCStartTime);
+    
+    if (Options::logGC())
+        dataLog("cr=", rateOfCollection, " ");
+    
+    // FIXME: Determine if this is useful or get rid of it.
+    // https://bugs.webkit.org/show_bug.cgi?id=164940
+    double extraPauseRatio = Options::initialExtraPauseRatio();
+    
     for (unsigned iteration = 1; ; ++iteration) {
         if (Options::logGC())
             dataLog("i#", iteration, " ");
+        MonotonicTime topOfLoop = MonotonicTime::now();
         {
             TimingScope preConvergenceTimingScope(*this, "Heap::markToFixpoint conservative scan");
             ConservativeRoots conservativeRoots(*this);
@@ -642,6 +682,9 @@ void Heap::markToFixpoint(double gcStartTime)
         DFG::markCodeBlocks(*m_vm, *m_collectorSlotVisitor);
         bool shouldTerminate = m_collectorSlotVisitor->isEmpty() && m_mutatorMarkStack->isEmpty();
         
+        if (Options::logGC())
+            dataLog(m_collectorSlotVisitor->collectorMarkStack().size(), "+", m_mutatorMarkStack->size() + m_collectorSlotVisitor->mutatorMarkStack().size(), ", a=", m_bytesAllocatedThisCycle / 1024, " kb, b=", m_barriersExecuted, ", mu=", targetMutatorUtilization(), " ");
+        
         // We want to do this to conservatively ensure that we rescan any code blocks that are
         // running right now. However, we need to be sure to do it *after* we mark the code block
         // so that we know for sure if it really needs a barrier. Also, this has to happen after the
@@ -661,23 +704,30 @@ void Heap::markToFixpoint(double gcStartTime)
         if (Options::logGC() == GCLogging::Verbose)
             dataLog("Live Weak Handles:\n", *m_collectorSlotVisitor);
         
-        if (Options::logGC())
-            dataLog(m_collectorSlotVisitor->collectorMarkStack().size(), "+", m_collectorSlotVisitor->mutatorMarkStack().size(), " mu=", 1 - targetCollectorUtilization(), " ");
+        MonotonicTime beforeConvergence = MonotonicTime::now();
         
         {
             TimingScope traceTimingScope(*this, "Heap::markToFixpoint tracing");
             ParallelModeEnabler enabler(*m_collectorSlotVisitor);
+            
             if (Options::useCollectorTimeslicing()) {
-                for (;;) {
+                // Before we yield to the mutator, we should do GC work proportional to the time we
+                // spent paused. We initialize the timeslicer to start after this "mandatory" pause
+                // completes.
+                
+                SlotVisitor::SharedDrainResult drainResult;
+                
+                Seconds extraPause = (beforeConvergence - topOfLoop) * extraPauseRatio;
+                initialTime = beforeConvergence + extraPause;
+                drainResult = m_collectorSlotVisitor->drainInParallel(initialTime);
+                
+                while (drainResult != SlotVisitor::SharedDrainResult::Done) {
                     MonotonicTime now = MonotonicTime::now();
-                    SlotVisitor::SharedDrainResult drainResult;
                     if (shouldBeResumed(now)) {
                         ResumeTheWorldScope resumeTheWorldScope(*this);
                         drainResult = m_collectorSlotVisitor->drainInParallel(timeToStop(now));
                     } else
                         drainResult = m_collectorSlotVisitor->drainInParallel(timeToResume(now));
-                    if (drainResult == SlotVisitor::SharedDrainResult::Done)
-                        break;
                 }
             } else {
                 // Disabling collector timeslicing is meant to be used together with
@@ -686,6 +736,8 @@ void Heap::markToFixpoint(double gcStartTime)
                 m_collectorSlotVisitor->drainInParallel();
             }
         }
+        
+        extraPauseRatio *= Options::extraPauseRatioIterationGrowthRate();
     }
 
     {
@@ -730,6 +782,7 @@ void Heap::beginMarking()
     m_objectSpace.beginMarking();
     m_mutatorShouldBeFenced = true;
     m_barrierThreshold = tautologicalThreshold;
+    m_barriersExecuted = 0;
 }
 
 void Heap::visitConservativeRoots(ConservativeRoots& roots)
@@ -986,6 +1039,7 @@ void Heap::addToRememberedSet(const JSCell* cell)
 {
     ASSERT(cell);
     ASSERT(!Options::useConcurrentJIT() || !isCompilationThread());
+    m_barriersExecuted++;
     if (!Heap::isMarkedConcurrently(cell)) {
         // During a full collection a store into an unmarked object that had surivived past
         // collections will manifest as a store to an unmarked black object. If the object gets
@@ -1082,6 +1136,8 @@ bool Heap::shouldCollectInThread(const LockHolder&)
 
 void Heap::collectInThread()
 {
+    m_currentGCStartTime = MonotonicTime::now();
+    
     Optional<CollectionScope> scope;
     {
         LockHolder locker(*m_threadLock);
@@ -1166,7 +1222,7 @@ void Heap::collectInThread()
     if (Options::logGC()) {
         MonotonicTime after = MonotonicTime::now();
         double thisPauseMS = (after - m_stopTime).milliseconds();
-        dataLog(thisPauseMS, " ms (max ", maxPauseMS(thisPauseMS), "), cycle ", (after - before).milliseconds(), " ms END]\n");
+        dataLog("p=", thisPauseMS, " ms (max ", maxPauseMS(thisPauseMS), "), cycle ", (after - before).milliseconds(), " ms END]\n");
     }
     
     {
@@ -1179,6 +1235,9 @@ void Heap::collectInThread()
 
     setNeedFinalize();
     resumeTheWorld();
+    
+    m_lastGCStartTime = m_currentGCStartTime;
+    m_lastGCEndTime = MonotonicTime::now();
 }
 
 void Heap::stopTheWorld()
@@ -1698,7 +1757,7 @@ void Heap::updateAllocationLimits()
     m_bytesAllocatedThisCycle = 0;
 
     if (Options::logGC())
-        dataLog(currentHeapSize / 1024, " kb, ");
+        dataLog("=> ", currentHeapSize / 1024, " kb, ");
 }
 
 void Heap::didFinishCollection(double gcStartTime)
index 355fcd8..411f92a 100644 (file)
@@ -116,6 +116,8 @@ public:
     
     void writeBarrierWithoutFence(const JSCell* from);
     
+    void mutatorFence();
+    
     // Take this if you know that from->cellState() < barrierThreshold.
     JS_EXPORT_PRIVATE void writeBarrierSlowPath(const JSCell* from);
 
@@ -605,6 +607,12 @@ private:
     Box<Lock> m_threadLock;
     RefPtr<AutomaticThreadCondition> m_threadCondition; // The mutator must not wait on this. It would cause a deadlock.
     RefPtr<AutomaticThread> m_thread;
+    
+    MonotonicTime m_lastGCStartTime;
+    MonotonicTime m_lastGCEndTime;
+    MonotonicTime m_currentGCStartTime;
+    
+    uintptr_t m_barriersExecuted { 0 };
 };
 
 } // namespace JSC
index 1fa9d85..6f27b30 100644 (file)
@@ -157,6 +157,12 @@ inline void Heap::writeBarrierWithoutFence(const JSCell* from)
         addToRememberedSet(from);
 }
 
+inline void Heap::mutatorFence()
+{
+    if (isX86() || UNLIKELY(mutatorShouldBeFenced()))
+        WTF::storeStoreFence();
+}
+
 template<typename Functor> inline void Heap::forEachCodeBlock(const Functor& func)
 {
     forEachCodeBlockImpl(scopedLambdaRef<bool(CodeBlock*)>(func));
index 89d8fd1..b04102c 100644 (file)
@@ -34,6 +34,8 @@
 
 namespace JSC {
 
+const size_t MarkedBlock::blockSize;
+
 static const bool computeBalance = false;
 static size_t balance;
 
index 4ee8fbd..a951904 100644 (file)
@@ -60,8 +60,7 @@ inline void JSCell::finishCreation(VM& vm)
 {
     // This object is ready to be escaped so the concurrent GC may see it at any time. We have
     // to make sure that none of our stores sink below here.
-    if (isX86() || UNLIKELY(vm.heap.mutatorShouldBeFenced()))
-        WTF::storeStoreFence();
+    vm.heap.mutatorFence();
 #if ENABLE(GC_VALIDATION)
     ASSERT(vm.isInitializingObject());
     vm.setInitializingObjectClass(0);
index 179bcf8..613efc5 100644 (file)
@@ -167,9 +167,9 @@ inline void JSObject::putDirectWithoutTransition(VM& vm, PropertyName propertyNa
     unsigned oldOutOfLineCapacity = structure->outOfLineCapacity();
     structure->addPropertyWithoutTransition(
         vm, propertyName, attributes,
-        [&] (const GCSafeConcurrentJSLocker&, PropertyOffset offset) {
-            if (structure->outOfLineCapacity() != oldOutOfLineCapacity) {
-                butterfly = allocateMoreOutOfLineStorage(vm, oldOutOfLineCapacity, structure->outOfLineCapacity());
+        [&] (const GCSafeConcurrentJSLocker&, PropertyOffset offset, unsigned newOutOfLineCapacity) {
+            if (newOutOfLineCapacity != oldOutOfLineCapacity) {
+                butterfly = allocateMoreOutOfLineStorage(vm, oldOutOfLineCapacity, newOutOfLineCapacity);
                 WTF::storeStoreFence();
                 setButterfly(vm, butterfly);
             }
@@ -266,15 +266,14 @@ ALWAYS_INLINE bool JSObject::putDirectInternal(VM& vm, PropertyName propertyName
         unsigned oldOutOfLineCapacity = structure->outOfLineCapacity();
         offset = structure->addPropertyWithoutTransition(
             vm, propertyName, attributes,
-            [&] (const GCSafeConcurrentJSLocker&, PropertyOffset offset) {
+            [&] (const GCSafeConcurrentJSLocker&, PropertyOffset offset, unsigned newOutOfLineCapacity) {
                 Butterfly* butterfly = this->butterfly();
-                if (structure->outOfLineCapacity() != oldOutOfLineCapacity) {
-                    butterfly = allocateMoreOutOfLineStorage(vm, oldOutOfLineCapacity, structure->outOfLineCapacity());
+                if (newOutOfLineCapacity != oldOutOfLineCapacity) {
+                    butterfly = allocateMoreOutOfLineStorage(vm, oldOutOfLineCapacity, newOutOfLineCapacity);
                     WTF::storeStoreFence();
                     setButterfly(vm, butterfly);
                 }
                 validateOffset(offset);
-                ASSERT(structure->isValidOffset(offset));
                 putDirect(vm, offset, value);
             });
 
index 7c3e941..092a812 100644 (file)
@@ -195,8 +195,13 @@ typedef const char* optionString;
     v(double, mediumHeapGrowthFactor, 1.5, Normal, nullptr) \
     v(double, largeHeapGrowthFactor, 1.24, Normal, nullptr) \
     v(bool, useCollectorTimeslicing, true, Normal, nullptr) \
-    v(double, concurrentGCHeadroomRatio, 1.5, Normal, nullptr) \
+    v(double, minimumMutatorUtilization, 0, Normal, nullptr) \
+    v(double, maximumMutatorUtilization, 0.7, Normal, nullptr) \
+    v(double, concurrentGCMaxHeadroom, 1.5, Normal, nullptr) \
     v(double, concurrentGCPeriodMS, 2, Normal, nullptr) \
+    v(double, initialExtraPauseRatio, 0, Normal, nullptr) \
+    v(double, extraPauseRatioIterationGrowthRate, 1.1, Normal, nullptr) \
+    v(bool, collectorShouldResumeFirst, false, Normal, nullptr) \
     v(bool, scribbleFreeCells, false, Normal, nullptr) \
     v(double, sizeClassProgression, 1.4, Normal, nullptr) \
     v(unsigned, largeAllocationCutoff, 100000, Normal, nullptr) \
index 353d859..279729a 100644 (file)
@@ -943,7 +943,7 @@ Vector<PropertyMapEntry> Structure::getPropertiesConcurrently()
 
 PropertyOffset Structure::add(VM& vm, PropertyName propertyName, unsigned attributes)
 {
-    return add(vm, propertyName, attributes, [] (const GCSafeConcurrentJSLocker&, PropertyOffset) { });
+    return add(vm, propertyName, attributes, [] (const GCSafeConcurrentJSLocker&, PropertyOffset, unsigned) { });
 }
 
 PropertyOffset Structure::remove(PropertyName propertyName)
index 36fb34a..413d3eb 100644 (file)
@@ -294,26 +294,11 @@ public:
 
     unsigned outOfLineCapacity() const
     {
-        ASSERT(checkOffsetConsistency());
-            
-        unsigned outOfLineSize = this->outOfLineSize();
-
-        if (!outOfLineSize)
-            return 0;
-
-        if (outOfLineSize <= initialOutOfLineCapacity)
-            return initialOutOfLineCapacity;
-
-        ASSERT(outOfLineSize > initialOutOfLineCapacity);
-        COMPILE_ASSERT(outOfLineGrowthFactor == 2, outOfLineGrowthFactor_is_two);
-        return WTF::roundUpToPowerOfTwo(outOfLineSize);
+        return outOfLineCapacity(m_offset);
     }
     unsigned outOfLineSize() const
     {
-        ASSERT(checkOffsetConsistency());
-        ASSERT(structure()->classInfo() == info());
-            
-        return numberOfOutOfLineSlotsForLastOffset(m_offset);
+        return outOfLineSize(m_offset);
     }
     bool hasInlineStorage() const
     {
@@ -636,6 +621,33 @@ private:
     void findStructuresAndMapForMaterialization(Vector<Structure*, 8>& structures, Structure*&, PropertyTable*&);
     
     static Structure* toDictionaryTransition(VM&, Structure*, DictionaryKind, DeferredStructureTransitionWatchpointFire* = nullptr);
+    
+    unsigned outOfLineCapacity(PropertyOffset lastOffset) const
+    {
+        unsigned outOfLineSize = this->outOfLineSize(lastOffset);
+
+        // This algorithm completely determines the out-of-line property storage growth algorithm.
+        // The JSObject code will only trigger a resize if the value returned by this algorithm
+        // changed between the new and old structure. So, it's important to keep this simple because
+        // it's on a fast path.
+        
+        if (!outOfLineSize)
+            return 0;
+
+        if (outOfLineSize <= initialOutOfLineCapacity)
+            return initialOutOfLineCapacity;
+
+        ASSERT(outOfLineSize > initialOutOfLineCapacity);
+        COMPILE_ASSERT(outOfLineGrowthFactor == 2, outOfLineGrowthFactor_is_two);
+        return WTF::roundUpToPowerOfTwo(outOfLineSize);
+    }
+    
+    unsigned outOfLineSize(PropertyOffset lastOffset) const
+    {
+        ASSERT(structure()->classInfo() == info());
+            
+        return numberOfOutOfLineSlotsForLastOffset(lastOffset);
+    }
 
     template<typename Func>
     PropertyOffset add(VM&, PropertyName, unsigned attributes, const Func&);
index 1dba9c7..2a01c3a 100644 (file)
@@ -304,11 +304,14 @@ inline PropertyOffset Structure::add(VM& vm, PropertyName propertyName, unsigned
 
     PropertyOffset newOffset = table->nextOffset(m_inlineCapacity);
     
-    table->add(PropertyMapEntry(rep, newOffset, attributes), m_offset, PropertyTable::PropertyOffsetMayChange);
+    PropertyOffset newLastOffset = m_offset;
+    table->add(PropertyMapEntry(rep, newOffset, attributes), newLastOffset, PropertyTable::PropertyOffsetMayChange);
     
-    checkConsistency();
+    func(locker, newOffset, outOfLineCapacity(newLastOffset));
+    vm.heap.mutatorFence();
+    m_offset = newLastOffset;
 
-    func(locker, newOffset);
+    checkConsistency();
     return newOffset;
 }