Make MarkedBlock state tracking support overlapped allocation and marking state
authorfpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Tue, 20 Sep 2016 18:12:18 +0000 (18:12 +0000)
committerfpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Tue, 20 Sep 2016 18:12:18 +0000 (18:12 +0000)
commitb53d34eb16c16516c69315619b8ee7247b29a616
tree75eb1d6c831d176e13ddb58314287c07943ff2e7
parent926296aa4af8944679e9930ae3e5e86c5e706139
Make MarkedBlock state tracking support overlapped allocation and marking state
https://bugs.webkit.org/show_bug.cgi?id=161581

Reviewed by Geoffrey Garen.

JSTests:

Add a microbenchmark for why we want to reclaim empty blocks from other allocators.

* microbenchmarks/switching-size-classes.js: Added.

Source/JavaScriptCore:

Concurrent GCs must allow for mutation and allocation during collection. We already know
how to mutate during collection. We have a write barrier for that. Allocation during
collection is more involved: the collector modifies the the mark bits, as well as other
kinds of MarkedBlock state, in-place during a collection. The allocator uses that same
MarkedBlock state to decide which regions of memory are free. This works if the allocator
never runs while the collector is running, but if we want to allow them to run at the same
time, then we need to have two versions of the state: one version built up by the
collector and another consumed by the allocator. We clear the collector state at the
beginning of collection, and splat the collector state onto the allocator state after
collection.

This could be super expensive, but we can make it cheap with some cleverness. The biggest
observation is just that most of the state is a handful of bits per block: is the block
free-listed? is it completely full? completely empty? in the incremental sweeper's
snapshot? is it retired? is it in eden? There is also state inside blocks, like the mark
bits, but I have a solid plan there and I'll save it for another patch. Once we view the
state of blocks as bits, we can put that state into bitvectors, so that if the collector
needs to transform the state of some blocks, it can do it with a single operation over
bitvectors. I like to think of this as 32-way parallelizing block operations, since
doing one operation on a 32-bit word in one of those bitvectors instantly affects 32
blocks.

This change converts all previous collections of MarkedBlocks, along with the MarkedBlock
state, into 8 bitvectors (live, empty, allocated, canAllocateButNotEmpty, eden, unswept,
markingNotEmpty, and markingRetired). The bitvectors separate allocator state (empty,
allocated, canAllocateButNotEmpty) from marking state (markingNotEmpty, markingRetired).

As a nice side-effect of switching to bitvectors, we get size class rebalancing for free.
It used to be that if a MarkedAllocator had an empty block, we would only allow that
memory to be reused by a different MarkedAllocator if we did an incremental sweep or a
full eager sweep. Now we hunt down all destructorless empty blocks before allocating new
MarkedBlocks. It would be relatively easy to also hunt down destructor empty blocks, but
the theory is that those might be expensive to sweep, so it might still be better to leave
those to the incremental sweeper.

This change is perf-neutral all around. I did some tests with two different kinds of
allocation strategies - something that is somewhat easier to do now that you can look for
blocks that are candidates for allocation by just scanning some bitvectors. I tried two
variants:

- Allocate out of non-empty blocks first, leaving empty blocks for last in case a
  different allocator needed them. This is sort of a best-fit strategy. I tried this
  first, and it can be expressed as:

  m_allocationCursor = m_canAllocateButNotEmpty.findBit(m_allocationCursor, true)

- Allocate out of lower-indexed blocks first, treating empty and canAllocateButNotEmpty
  blocks equally. This is sort of a first-fit strategy. This is what I ended up settling
  on, and it can be expressed as:

  m_allocationCursor = (m_canAllocateButNotEmpty | m_empty).findBit(m_allocationCursor, true)

The best-fit strategy meant 1% regressions in LongSpider and Octane overall, and a 11%
regression on Octane/earley. First-fit means perf-neutrality. Most great allocators skew
towards first-fit because it's empirically better, so this result is not surprising.

Overall, the performance of this patch on my machine is as follows, where "neutral" means
less than 1% and not statistically significant.

run-jsc-benchmarks:
    SunSpider: neutral
    LongSpider: 0.6% slower
    V8Spider: neutral
    Octane: neutral
    Kraken: neutral
    Microbenchmarks: 0.37% slower
    AsmBench: neutral
    CompressionBench: maybe 1% faster

For browser benchmarks, I report the ratio of means (bigger / smaller) along with a T-test
from Mathematica reported as % chance of not [sic] the null hypothesis. Note that we
normally consider anything less than 95% confidence to be inconclusive.

Browser benchmarks:
    PLT3: 0.3% faster with 67% confidence
    membuster:
        Snap2FinishedLoadingPost: 0.68% more memory with 50% confidence
        Snap3EndPost: 2.4% more memory with 61% confidence
    JetStream: 0.2% slower with 32% confidence
    Speedometer: 0.7% faster with 82% confidence

Additionally, Octane/splay's heap capacity goes down to ~180KB from ~200KB, so about a 10%
progression. This is due to the allocator rebalancing feature.

Finally, this breaks --useImmortalObjects. It was already broken as far as I can tell. I
filed a bug to reimplement it (bug 162296). Unless someone urgently needs this internal
tool, it's probably best to reimplement it after I'm done refactoring MarkedSpace.

* JavaScriptCore.xcodeproj/project.pbxproj:
* debugger/Debugger.cpp:
* heap/CellContainer.h:
* heap/CellContainerInlines.h:
(JSC::CellContainer::vm):
(JSC::CellContainer::heap):
(JSC::CellContainer::isMarkedOrNewlyAllocated):
(JSC::CellContainer::aboutToMark):
(JSC::CellContainer::isMarked): Deleted.
(JSC::CellContainer::flipIfNecessary): Deleted.
* heap/ConservativeRoots.cpp:
* heap/Heap.cpp:
(JSC::Heap::beginMarking):
(JSC::Heap::endMarking):
(JSC::Heap::collectAllGarbage):
(JSC::Heap::collectImpl):
(JSC::Heap::snapshotMarkedSpace):
(JSC::Heap::prepareForAllocation):
(JSC::Heap::zombifyDeadObjects):
(JSC::MarkedBlockSnapshotFunctor::MarkedBlockSnapshotFunctor): Deleted.
(JSC::MarkedBlockSnapshotFunctor::operator()): Deleted.
(JSC::Heap::resetAllocators): Deleted.
* heap/Heap.h:
* heap/HeapInlines.h:
(JSC::Heap::isMarked):
(JSC::Heap::isMarkedConcurrently):
(JSC::Heap::testAndSetMarked):
* heap/HeapStatistics.cpp:
* heap/HeapUtil.h:
(JSC::HeapUtil::findGCObjectPointersForMarking):
(JSC::HeapUtil::isPointerGCObjectJSCell):
* heap/HeapVerifier.cpp:
* heap/IncrementalSweeper.cpp:
(JSC::IncrementalSweeper::IncrementalSweeper):
(JSC::IncrementalSweeper::doSweep):
(JSC::IncrementalSweeper::sweepNextBlock):
(JSC::IncrementalSweeper::startSweeping):
(JSC::IncrementalSweeper::willFinishSweeping):
* heap/IncrementalSweeper.h:
* heap/LargeAllocation.h:
(JSC::LargeAllocation::isMarked):
(JSC::LargeAllocation::isMarkedConcurrently):
(JSC::LargeAllocation::isMarkedOrNewlyAllocated):
(JSC::LargeAllocation::aboutToMark):
(JSC::LargeAllocation::isMarkedDuringWeakVisiting): Deleted.
(JSC::LargeAllocation::flipIfNecessary): Deleted.
(JSC::LargeAllocation::flipIfNecessaryDuringMarking): Deleted.
* heap/MarkedAllocator.cpp:
(JSC::MarkedAllocator::MarkedAllocator):
(JSC::MarkedAllocator::isPagedOut):
(JSC::MarkedAllocator::findEmptyBlock):
(JSC::MarkedAllocator::tryAllocateWithoutCollectingImpl):
(JSC::MarkedAllocator::allocateIn):
(JSC::MarkedAllocator::tryAllocateIn):
(JSC::MarkedAllocator::allocateSlowCaseImpl):
(JSC::MarkedAllocator::tryAllocateBlock):
(JSC::MarkedAllocator::addBlock):
(JSC::MarkedAllocator::removeBlock):
(JSC::MarkedAllocator::stopAllocating):
(JSC::MarkedAllocator::prepareForAllocation):
(JSC::MarkedAllocator::lastChanceToFinalize):
(JSC::MarkedAllocator::resumeAllocating):
(JSC::MarkedAllocator::beginMarkingForFullCollection):
(JSC::MarkedAllocator::endMarking):
(JSC::MarkedAllocator::snapshotForEdenCollection):
(JSC::MarkedAllocator::snapshotForFullCollection):
(JSC::MarkedAllocator::findBlockToSweep):
(JSC::MarkedAllocator::sweep):
(JSC::MarkedAllocator::shrink):
(JSC::MarkedAllocator::assertSnapshotEmpty):
(JSC::MarkedAllocator::dump):
(JSC::MarkedAllocator::dumpBits):
(JSC::MarkedAllocator::retire): Deleted.
(JSC::MarkedAllocator::filterNextBlock): Deleted.
(JSC::MarkedAllocator::setNextBlockToSweep): Deleted.
(JSC::MarkedAllocator::reset): Deleted.
* heap/MarkedAllocator.h:
(JSC::MarkedAllocator::forEachBitVector):
(JSC::MarkedAllocator::forEachBitVectorWithName):
(JSC::MarkedAllocator::nextAllocator):
(JSC::MarkedAllocator::setNextAllocator):
(JSC::MarkedAllocator::forEachBlock):
(JSC::MarkedAllocator::resumeAllocating): Deleted.
* heap/MarkedBlock.cpp:
(JSC::MarkedBlock::tryCreate):
(JSC::MarkedBlock::Handle::Handle):
(JSC::MarkedBlock::Handle::~Handle):
(JSC::MarkedBlock::MarkedBlock):
(JSC::MarkedBlock::Handle::specializedSweep):
(JSC::MarkedBlock::Handle::sweep):
(JSC::MarkedBlock::Handle::sweepHelperSelectScribbleMode):
(JSC::MarkedBlock::Handle::sweepHelperSelectEmptyMode):
(JSC::MarkedBlock::Handle::sweepHelperSelectHasNewlyAllocated):
(JSC::MarkedBlock::Handle::sweepHelperSelectSweepMode):
(JSC::MarkedBlock::Handle::sweepHelperSelectFlipMode):
(JSC::MarkedBlock::Handle::unsweepWithNoNewlyAllocated):
(JSC::MarkedBlock::Handle::setIsFreeListed):
(JSC::MarkedBlock::Handle::stopAllocating):
(JSC::MarkedBlock::Handle::lastChanceToFinalize):
(JSC::MarkedBlock::Handle::resumeAllocating):
(JSC::MarkedBlock::aboutToMarkSlow):
(JSC::MarkedBlock::clearMarks):
(JSC::MarkedBlock::isMarked):
(JSC::MarkedBlock::Handle::isMarkedOrNewlyAllocated):
(JSC::MarkedBlock::isMarkedOrNewlyAllocated):
(JSC::MarkedBlock::Handle::didConsumeFreeList):
(JSC::MarkedBlock::markCount):
(JSC::MarkedBlock::Handle::isEmpty):
(JSC::MarkedBlock::noteMarkedSlow):
(JSC::MarkedBlock::Handle::removeFromAllocator):
(JSC::MarkedBlock::Handle::didAddToAllocator):
(JSC::MarkedBlock::Handle::didRemoveFromAllocator):
(JSC::MarkedBlock::Handle::isLive):
(JSC::MarkedBlock::Handle::isLiveCell):
(JSC::MarkedBlock::Handle::sweepHelperSelectStateAndSweepMode): Deleted.
(JSC::MarkedBlock::flipIfNecessary): Deleted.
(JSC::MarkedBlock::Handle::flipIfNecessary): Deleted.
(JSC::MarkedBlock::flipIfNecessarySlow): Deleted.
(JSC::MarkedBlock::flipIfNecessaryDuringMarkingSlow): Deleted.
(JSC::MarkedBlock::Handle::willRemoveBlock): Deleted.
(WTF::printInternal): Deleted.
* heap/MarkedBlock.h:
(JSC::MarkedBlock::Handle::isFreeListed):
(JSC::MarkedBlock::Handle::index):
(JSC::MarkedBlock::aboutToMark):
(JSC::MarkedBlock::isMarked):
(JSC::MarkedBlock::isMarkedConcurrently):
(JSC::MarkedBlock::Handle::isMarkedOrNewlyAllocated):
(JSC::MarkedBlock::isMarkedOrNewlyAllocated):
(JSC::MarkedBlock::Handle::isOnBlocksToSweep): Deleted.
(JSC::MarkedBlock::Handle::setIsOnBlocksToSweep): Deleted.
(JSC::MarkedBlock::Handle::state): Deleted.
(JSC::MarkedBlock::flipIfNecessary): Deleted.
(JSC::MarkedBlock::flipIfNecessaryDuringMarking): Deleted.
(JSC::MarkedBlock::Handle::flipIfNecessary): Deleted.
(JSC::MarkedBlock::Handle::flipIfNecessaryDuringMarking): Deleted.
(JSC::MarkedBlock::Handle::flipForEdenCollection): Deleted.
(JSC::MarkedBlock::isMarkedDuringWeakVisiting): Deleted.
(JSC::MarkedBlock::Handle::isLive): Deleted.
(JSC::MarkedBlock::Handle::isLiveCell): Deleted.
(JSC::MarkedBlock::Handle::forEachLiveCell): Deleted.
(JSC::MarkedBlock::Handle::forEachDeadCell): Deleted.
(JSC::MarkedBlock::Handle::needsSweeping): Deleted.
(JSC::MarkedBlock::Handle::isAllocated): Deleted.
(JSC::MarkedBlock::Handle::isMarked): Deleted.
* heap/MarkedBlockInlines.h: Added.
(JSC::MarkedBlock::Handle::isLive):
(JSC::MarkedBlock::Handle::isLiveCell):
(JSC::MarkedBlock::Handle::forEachLiveCell):
(JSC::MarkedBlock::Handle::forEachDeadCell):
(JSC::MarkedBlock::resetVersion):
* heap/MarkedSpace.cpp:
(JSC::MarkedSpace::MarkedSpace):
(JSC::MarkedSpace::allocate):
(JSC::MarkedSpace::tryAllocate):
(JSC::MarkedSpace::sweep):
(JSC::MarkedSpace::prepareForAllocation):
(JSC::MarkedSpace::shrink):
(JSC::MarkedSpace::clearNewlyAllocated):
(JSC::MarkedSpace::beginMarking):
(JSC::MarkedSpace::endMarking):
(JSC::MarkedSpace::didAllocateInBlock):
(JSC::MarkedSpace::findEmptyBlock):
(JSC::MarkedSpace::snapshot):
(JSC::MarkedSpace::assertSnapshotEmpty):
(JSC::MarkedSpace::dumpBits):
(JSC::MarkedSpace::zombifySweep): Deleted.
(JSC::MarkedSpace::resetAllocators): Deleted.
(JSC::VerifyMarked::operator()): Deleted.
(JSC::MarkedSpace::flip): Deleted.
* heap/MarkedSpace.h:
(JSC::MarkedSpace::nextVersion):
(JSC::MarkedSpace::firstAllocator):
(JSC::MarkedSpace::allocatorForEmptyAllocation):
(JSC::MarkedSpace::forEachAllocator):
(JSC::MarkedSpace::blocksWithNewObjects): Deleted.
(JSC::MarkedSpace::setIsMarking): Deleted.
(JSC::MarkedSpace::forEachLiveCell): Deleted.
(JSC::MarkedSpace::forEachDeadCell): Deleted.
* heap/MarkedSpaceInlines.h: Added.
(JSC::MarkedSpace::forEachLiveCell):
(JSC::MarkedSpace::forEachDeadCell):
* heap/SlotVisitor.cpp:
(JSC::SlotVisitor::setMarkedAndAppendToMarkStack):
(JSC::SlotVisitor::markAuxiliary):
(JSC::SlotVisitor::visitChildren):
* heap/Weak.h:
(WTF::HashTraits<JSC::Weak<T>>::emptyValue):
(WTF::HashTraits<JSC::Weak<T>>::peek):
* heap/WeakBlock.cpp:
(JSC::WeakBlock::specializedVisit):
(JSC::WeakBlock::reap):
* heap/WeakInlines.h:
(WTF::HashTraits<JSC::Weak<T>>::emptyValue): Deleted.
(WTF::HashTraits<JSC::Weak<T>>::peek): Deleted.
* jit/JITThunks.h:
* runtime/JSGlobalObject.cpp:
* runtime/PrototypeMap.h:
* runtime/SamplingProfiler.cpp:
* runtime/WeakGCMap.h:
* tools/JSDollarVMPrototype.cpp:

Source/WTF:

The main change here is to bring back FastBitVector.cpp, so that I could outline some
large slow path functions. This also adds some utilities, like atomicSetAndCheck() and
isEmpty(). The GC uses these.

* WTF.xcodeproj/project.pbxproj:
* wtf/CMakeLists.txt:
* wtf/FastBitVector.cpp: Added.
(WTF::FastBitVectorWordOwner::setEqualsSlow):
(WTF::FastBitVectorWordOwner::resizeSlow):
* wtf/FastBitVector.h:
(WTF::FastBitVectorWordOwner::operator=):
(WTF::FastBitVectorWordOwner::resize):
(WTF::FastBitVectorImpl::isEmpty):
(WTF::FastBitVector::atomicSetAndCheck):
(WTF::FastBitVector::operator[]): Deleted.

Tools:

Remove the always-trigger-copy-phase configuration.

* Scripts/run-jsc-stress-tests:

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@206154 268f45cc-cd09-0410-ab3c-d52691b4dbfc
52 files changed:
JSTests/ChangeLog
JSTests/microbenchmarks/switching-size-classes.js [new file with mode: 0644]
Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj
Source/JavaScriptCore/Scripts/builtins/builtins_generate_internals_wrapper_header.py
Source/JavaScriptCore/debugger/Debugger.cpp
Source/JavaScriptCore/heap/AllocationScope.h [new file with mode: 0644]
Source/JavaScriptCore/heap/CellContainer.h
Source/JavaScriptCore/heap/CellContainerInlines.h
Source/JavaScriptCore/heap/ConservativeRoots.cpp
Source/JavaScriptCore/heap/Heap.cpp
Source/JavaScriptCore/heap/Heap.h
Source/JavaScriptCore/heap/HeapInlines.h
Source/JavaScriptCore/heap/HeapStatistics.cpp
Source/JavaScriptCore/heap/HeapUtil.h
Source/JavaScriptCore/heap/HeapVerifier.cpp
Source/JavaScriptCore/heap/IncrementalSweeper.cpp
Source/JavaScriptCore/heap/IncrementalSweeper.h
Source/JavaScriptCore/heap/LargeAllocation.h
Source/JavaScriptCore/heap/MarkedAllocator.cpp
Source/JavaScriptCore/heap/MarkedAllocator.h
Source/JavaScriptCore/heap/MarkedBlock.cpp
Source/JavaScriptCore/heap/MarkedBlock.h
Source/JavaScriptCore/heap/MarkedBlockInlines.h [new file with mode: 0644]
Source/JavaScriptCore/heap/MarkedSpace.cpp
Source/JavaScriptCore/heap/MarkedSpace.h
Source/JavaScriptCore/heap/MarkedSpaceInlines.h [new file with mode: 0644]
Source/JavaScriptCore/heap/SlotVisitor.cpp
Source/JavaScriptCore/heap/Weak.h
Source/JavaScriptCore/heap/WeakBlock.cpp
Source/JavaScriptCore/heap/WeakInlines.h
Source/JavaScriptCore/jit/JITThunks.h
Source/JavaScriptCore/runtime/JSGlobalObject.cpp
Source/JavaScriptCore/runtime/JSObject.cpp
Source/JavaScriptCore/runtime/JSProxy.cpp
Source/JavaScriptCore/runtime/Options.h
Source/JavaScriptCore/runtime/PrototypeMap.cpp
Source/JavaScriptCore/runtime/PrototypeMap.h
Source/JavaScriptCore/runtime/PrototypeMapInlines.h [new file with mode: 0644]
Source/JavaScriptCore/runtime/RegExpCache.h
Source/JavaScriptCore/runtime/SamplingProfiler.cpp
Source/JavaScriptCore/runtime/WeakGCMap.h
Source/JavaScriptCore/runtime/WeakGCMapInlines.h
Source/JavaScriptCore/tools/JSDollarVMPrototype.cpp
Source/WTF/ChangeLog
Source/WTF/WTF.xcodeproj/project.pbxproj
Source/WTF/wtf/CMakeLists.txt
Source/WTF/wtf/FastBitVector.cpp [new file with mode: 0644]
Source/WTF/wtf/FastBitVector.h
Source/WebCore/testing/GCObservation.cpp
Tools/ChangeLog
Tools/Scripts/run-jsc-stress-tests