Air should support linear scan for optLevel<2
authorfpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Thu, 30 Mar 2017 22:55:44 +0000 (22:55 +0000)
committerfpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Thu, 30 Mar 2017 22:55:44 +0000 (22:55 +0000)
commit73c9828f7965d9c822c26c3e23b03ed9935aa7a3
tree95ba9bbc8769fd8bdd1d1989d5b5296284a5fd9a
parentbebaffbe3d7f44df48440f8669ae6e44f4e268bc
Air should support linear scan for optLevel<2
https://bugs.webkit.org/show_bug.cgi?id=170161

Reviewed by Saam Barati.

Source/JavaScriptCore:

This changes the default opt level of B3 to 2. It makes the other opt levels useful by adding a
new register allocator. This new linear scan allocator will produce significantly worse code.
But it will produce that code a lot faster than IRC or Briggs.

The opt levels are:
    0: no optimizations, linear scan
    1: some optimizations, linear scan
    2: full optimizations, graph coloring (IRC or Briggs based on CPU)

What we used to call optLevel=1 is not called optLevel=2, or better yet,
optLevel=B3::defaultOptLevel(). We no longer have anything like the old optLevel=0 (which did no
optimizations but ran graph coloring).

allocateRegistersByLinearScan() faithfully implements Massimiliano Poletto and Vivek Sarkar's
famous algorithm. It uses the variant that handles clobbered registers by avoiding assigning
ranges to those registers if the range overlaps a clobber. It's engineered to allocate registers
very quickly and generate inefficient code without falling off a cliff.

The new optLevel=1 speeds up B3 by a factor of 2, and results in a 80% throughput regression.
Linear scan runs 4.7x faster than graph coloring on average.

* CMakeLists.txt:
* JavaScriptCore.xcodeproj/project.pbxproj:
* b3/B3BasicBlockUtils.h:
(JSC::B3::blocksInPreOrder):
(JSC::B3::blocksInPostOrder):
* b3/B3BlockWorklist.h:
* b3/B3CFG.h:
(JSC::B3::CFG::newMap):
* b3/B3Common.h:
(JSC::B3::defaultOptLevel):
* b3/B3Compile.h:
* b3/B3DuplicateTails.cpp:
* b3/B3EliminateCommonSubexpressions.cpp:
* b3/B3FixSSA.cpp:
(JSC::B3::demoteValues):
(JSC::B3::fixSSA):
* b3/B3FixSSA.h:
* b3/B3Generate.cpp:
(JSC::B3::prepareForGeneration):
(JSC::B3::generateToAir):
* b3/B3Generate.h:
* b3/B3HeapRange.cpp: Removed.
* b3/B3HeapRange.h:
(JSC::B3::HeapRange::HeapRange): Deleted.
(JSC::B3::HeapRange::top): Deleted.
(JSC::B3::HeapRange::operator==): Deleted.
(JSC::B3::HeapRange::operator!=): Deleted.
(JSC::B3::HeapRange::operator|): Deleted.
(JSC::B3::HeapRange::operator bool): Deleted.
(JSC::B3::HeapRange::begin): Deleted.
(JSC::B3::HeapRange::end): Deleted.
(JSC::B3::HeapRange::overlaps): Deleted.
* b3/B3LowerToAir.cpp:
* b3/B3MoveConstants.cpp:
* b3/B3PhiChildren.h:
* b3/B3Procedure.cpp:
(JSC::B3::Procedure::dump):
(JSC::B3::Procedure::deleteOrphans):
(JSC::B3::Procedure::setBlockOrderImpl):
* b3/B3ReduceDoubleToFloat.cpp:
* b3/B3ReduceStrength.cpp:
* b3/B3SSACalculator.h:
* b3/B3UseCounts.h:
* b3/air/AirAllocateRegistersByGraphColoring.cpp:
* b3/air/AirAllocateRegistersByLinearScan.cpp: Added.
(JSC::B3::Air::allocateRegistersByLinearScan):
* b3/air/AirAllocateRegistersByLinearScan.h: Added.
* b3/air/AirAllocateStack.cpp:
(JSC::B3::Air::allocateStack):
* b3/air/AirArg.cpp:
(WTF::printInternal):
* b3/air/AirArg.h:
(JSC::B3::Air::Arg::activeAt):
(JSC::B3::Air::Arg::timing):
(JSC::B3::Air::Arg::forEachPhase):
* b3/air/AirBasicBlock.h:
* b3/air/AirBlockWorklist.h:
* b3/air/AirCFG.h:
(JSC::B3::Air::CFG::newMap):
* b3/air/AirEliminateDeadCode.cpp:
(JSC::B3::Air::eliminateDeadCode):
* b3/air/AirFixObviousSpills.cpp:
* b3/air/AirFixPartialRegisterStalls.cpp:
(JSC::B3::Air::fixPartialRegisterStalls):
* b3/air/AirFixSpillsAfterTerminals.cpp: Added.
(JSC::B3::Air::fixSpillsAfterTerminals):
* b3/air/AirFixSpillsAfterTerminals.h: Added.
* b3/air/AirGenerate.cpp:
(JSC::B3::Air::prepareForGeneration):
(JSC::B3::Air::generate):
* b3/air/AirGenerate.h:
* b3/air/AirGenerationContext.h:
* b3/air/AirInsertionSet.h:
* b3/air/AirInst.cpp:
(JSC::B3::Air::Inst::needsPadding):
* b3/air/AirLowerAfterRegAlloc.cpp:
(JSC::B3::Air::lowerAfterRegAlloc):
* b3/air/AirLowerEntrySwitch.cpp:
(JSC::B3::Air::lowerEntrySwitch):
* b3/air/AirOpcode.opcodes:
* b3/air/AirPhaseInsertionSet.cpp: Added.
(JSC::B3::Air::PhaseInsertionSet::execute):
* b3/air/AirPhaseInsertionSet.h: Added.
(JSC::B3::Air::PhaseInsertion::PhaseInsertion):
(JSC::B3::Air::PhaseInsertion::phase):
(JSC::B3::Air::PhaseInsertion::operator<):
(JSC::B3::Air::PhaseInsertionSet::PhaseInsertionSet):
(JSC::B3::Air::PhaseInsertionSet::appendInsertion):
(JSC::B3::Air::PhaseInsertionSet::insertInst):
(JSC::B3::Air::PhaseInsertionSet::insert):
* b3/air/AirRegLiveness.h:
(JSC::B3::Air::RegLiveness::LocalCalc::LocalCalc):
* b3/air/AirSpillEverything.cpp:
(JSC::B3::Air::spillEverything):
* b3/air/AirTmp.cpp:
* b3/air/AirTmp.h:
(JSC::B3::Air::Tmp::tmpForIndex):
* b3/air/AirTmpInlines.h:
(JSC::B3::Air::Tmp::Indexed::Indexed):
(JSC::B3::Air::Tmp::Indexed::index):
(JSC::B3::Air::Tmp::AbsolutelyIndexed::AbsolutelyIndexed):
(JSC::B3::Air::Tmp::AbsolutelyIndexed::index):
(JSC::B3::Air::Tmp::indexed):
(JSC::B3::Air::Tmp::absolutelyIndexed):
(JSC::B3::Air::Tmp::tmpForAbsoluteIndex):
* b3/testb3.cpp:
(JSC::B3::compile):
(JSC::B3::testMulLoadTwice):
* jit/RegisterSet.h:
(JSC::RegisterSet::add):
(JSC::RegisterSet::remove):
* runtime/Options.h:
* wasm/WasmB3IRGenerator.h:

Source/WTF:

This change introduces a new low-latency register allocator. It can allocate registers very
quickly by doing a relatively poor job. Implementing this algorithm required beefing up some of
our core algorithms.

* WTF.xcodeproj/project.pbxproj:
* wtf/CMakeLists.txt:
* wtf/Deque.h: Make it possible to do some basic priority queueing with this data structure.
(WTF::inlineCapacity>::removeAllMatching):
(WTF::inlineCapacity>::appendAndBubble):
(WTF::inlineCapacity>::takeLast):
* wtf/IndexKeyType.h: Added. This makes it possible to use IndexMap and IndexSet with value or pointer types. Previously they just worked with pointer types.
(WTF::IndexKeyType::index):
* wtf/IndexMap.h: Adopt IndexKeyType.
(WTF::IndexMap::operator[]):
(WTF::IndexMap::append):
* wtf/IndexSet.h: Adopt IndexKeyType.
(WTF::IndexSet::add):
(WTF::IndexSet::addAll):
(WTF::IndexSet::remove):
(WTF::IndexSet::contains):
(WTF::IndexSet::Iterable::iterator::operator*):
* wtf/Range.h: Added. This used to be B3::HeapRange. This generalizes that data structure to any kind of range stuff.
(WTF::Range::Range):
(WTF::Range::top):
(WTF::Range::operator==):
(WTF::Range::operator!=):
(WTF::Range::operator bool):
(WTF::Range::operator|):
(WTF::Range::operator|=):
(WTF::Range::begin):
(WTF::Range::end):
(WTF::Range::overlaps):
(WTF::Range::dump):
* wtf/RangeSet.h:
(WTF::RangeSet::add):

Tools:

This makes us run a bunch of JS tests at optLevel=1 to force testing of this new compiler
pipeline.

* Scripts/run-jsc-stress-tests:

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@214636 268f45cc-cd09-0410-ab3c-d52691b4dbfc
67 files changed:
Source/JavaScriptCore/CMakeLists.txt
Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj
Source/JavaScriptCore/b3/B3BasicBlockUtils.h
Source/JavaScriptCore/b3/B3BlockWorklist.h
Source/JavaScriptCore/b3/B3CFG.h
Source/JavaScriptCore/b3/B3Common.h
Source/JavaScriptCore/b3/B3Compile.h
Source/JavaScriptCore/b3/B3DuplicateTails.cpp
Source/JavaScriptCore/b3/B3EliminateCommonSubexpressions.cpp
Source/JavaScriptCore/b3/B3FixSSA.cpp
Source/JavaScriptCore/b3/B3FixSSA.h
Source/JavaScriptCore/b3/B3Generate.cpp
Source/JavaScriptCore/b3/B3Generate.h
Source/JavaScriptCore/b3/B3HeapRange.h
Source/JavaScriptCore/b3/B3LowerToAir.cpp
Source/JavaScriptCore/b3/B3MoveConstants.cpp
Source/JavaScriptCore/b3/B3PhiChildren.h
Source/JavaScriptCore/b3/B3Procedure.cpp
Source/JavaScriptCore/b3/B3ReduceDoubleToFloat.cpp
Source/JavaScriptCore/b3/B3ReduceStrength.cpp
Source/JavaScriptCore/b3/B3SSACalculator.h
Source/JavaScriptCore/b3/B3UseCounts.h
Source/JavaScriptCore/b3/air/AirAllocateRegistersByGraphColoring.cpp
Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp [new file with mode: 0644]
Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.h [new file with mode: 0644]
Source/JavaScriptCore/b3/air/AirAllocateStack.cpp
Source/JavaScriptCore/b3/air/AirArg.cpp
Source/JavaScriptCore/b3/air/AirArg.h
Source/JavaScriptCore/b3/air/AirBasicBlock.h
Source/JavaScriptCore/b3/air/AirBlockWorklist.h
Source/JavaScriptCore/b3/air/AirCFG.h
Source/JavaScriptCore/b3/air/AirEliminateDeadCode.cpp
Source/JavaScriptCore/b3/air/AirFixObviousSpills.cpp
Source/JavaScriptCore/b3/air/AirFixPartialRegisterStalls.cpp
Source/JavaScriptCore/b3/air/AirFixSpillsAfterTerminals.cpp [new file with mode: 0644]
Source/JavaScriptCore/b3/air/AirFixSpillsAfterTerminals.h [new file with mode: 0644]
Source/JavaScriptCore/b3/air/AirGenerate.cpp
Source/JavaScriptCore/b3/air/AirGenerate.h
Source/JavaScriptCore/b3/air/AirGenerationContext.h
Source/JavaScriptCore/b3/air/AirInsertionSet.h
Source/JavaScriptCore/b3/air/AirInst.cpp
Source/JavaScriptCore/b3/air/AirLowerAfterRegAlloc.cpp
Source/JavaScriptCore/b3/air/AirLowerEntrySwitch.cpp
Source/JavaScriptCore/b3/air/AirOpcode.opcodes
Source/JavaScriptCore/b3/air/AirPhaseInsertionSet.cpp [moved from Source/JavaScriptCore/b3/B3HeapRange.cpp with 77% similarity]
Source/JavaScriptCore/b3/air/AirPhaseInsertionSet.h [new file with mode: 0644]
Source/JavaScriptCore/b3/air/AirRegLiveness.h
Source/JavaScriptCore/b3/air/AirSpillEverything.cpp
Source/JavaScriptCore/b3/air/AirTmp.cpp
Source/JavaScriptCore/b3/air/AirTmp.h
Source/JavaScriptCore/b3/air/AirTmpInlines.h
Source/JavaScriptCore/b3/testb3.cpp
Source/JavaScriptCore/jit/RegisterSet.h
Source/JavaScriptCore/runtime/Options.h
Source/JavaScriptCore/wasm/WasmB3IRGenerator.h
Source/WTF/ChangeLog
Source/WTF/WTF.xcodeproj/project.pbxproj
Source/WTF/wtf/CMakeLists.txt
Source/WTF/wtf/Deque.h
Source/WTF/wtf/IndexKeyType.h [new file with mode: 0644]
Source/WTF/wtf/IndexMap.h
Source/WTF/wtf/IndexSet.h
Source/WTF/wtf/Range.h [new file with mode: 0644]
Source/WTF/wtf/RangeSet.h
Tools/ChangeLog
Tools/Scripts/run-jsc-stress-tests