B3 -O1 should not allocateStackByGraphColoring
author fpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Wed, 12 Apr 2017 21:22:14 +0000 (21:22 +0000)
committer fpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Wed, 12 Apr 2017 21:22:14 +0000 (21:22 +0000)
commit cd1dab5c3a04ea915a23aaa49aea1c1c331b0140
tree fdd99cb8fb226984836708e6158e5b36c65cd6db
parent 1d15530fe8a58c8faa1f188edf3be658b9eb97f9
https://bugs.webkit.org/show_bug.cgi?id=170742

Reviewed by Keith Miller.

One of the longest-running phases in B3 -O1 is allocateStackByGraphColoring. One way to
address this would be to make that phase cheaper. But it's odd that this phase reruns
liveness after register allocation has already run it. If it could reuse the liveness
computed by register allocation, it would run a lot faster. At -O2, we do not want
this, since we run phases between register allocation and stack allocation, and those
phases are free to change the liveness of spill slots (in fact, fixObviousSpills both
shortens and lengthens live ranges, because of load and store elimination,
respectively). But at -O1, we don't really need to run any phases between register and
stack allocation.
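
To illustrate what "liveness" means as an input shared by both allocators, here is a
minimal sketch that computes one live interval per temporary in a single pass over
straight-line code. It is not the Air implementation: Inst, LiveInterval,
computeLiveIntervals, and overlaps are hypothetical names, and real liveness analysis
also has to handle control flow.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical, simplified instruction: the temporaries it reads and writes.
struct Inst {
    std::vector<int> uses;
    std::vector<int> defs;
};

// Closed range [first, last] of instruction indices that mention a temporary.
struct LiveInterval {
    std::size_t first;
    std::size_t last;
};

// One pass over straight-line code. The result can be handed to any later
// consumer (register assignment, spill slot assignment) without recomputation.
std::map<int, LiveInterval> computeLiveIntervals(const std::vector<Inst>& code)
{
    std::map<int, LiveInterval> intervals;
    for (std::size_t index = 0; index < code.size(); ++index) {
        auto touch = [&](int tmp) {
            auto inserted = intervals.insert(std::make_pair(tmp, LiveInterval { index, index }));
            if (!inserted.second)
                inserted.first->second.last = index; // extend an existing interval
        };
        for (int tmp : code[index].uses)
            touch(tmp);
        for (int tmp : code[index].defs)
            touch(tmp);
    }
    return intervals;
}

// Two temporaries may share a register or a spill slot only if this is false.
bool overlaps(const LiveInterval& a, const LiveInterval& b)
{
    return a.first <= b.last && b.first <= a.last;
}
```

If a later phase rewrote the code between the two allocators, these intervals would go
stale, which is the reason given above for not sharing them at -O2.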

This changes Air's backend in the following ways:

- Linear scan does stack allocation. This means that we don't need to run
  allocateStackByGraphColoring at all. In reality, we reuse some of its innards, but
  we don't run the expensive part of it (liveness->interference->coalescing->coloring).
  This is a speed-up because we only run liveness once and reuse it for both register
  and stack allocation.

- Phases that previously ran between register and stack allocation are taken care of,
  each in its own special way:

  -> handleCalleeSaves: this is now a utility function called by both
     allocateStackByGraphColoring and allocateRegistersAndStackByLinearScan.

  -> fixObviousSpills: we didn't run this at -O1, so nothing needs to be done.

  -> lowerAfterRegAlloc: this needed to be able to run before stack allocation because
     it could change register usage (with respect to callee saves) and it could
     introduce spill slots. I changed this phase to have a secondary mode for when it
     runs after stack allocation.

- The part of allocateStackByGraphColoring that lowered stack addresses and took care
  of the call arg area is now a separate phase called lowerStackArgs. We run this phase
  regardless of optimization level. It's a cheap and general lowering.
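
The idea of a linear scan that also does stack allocation can be sketched as follows.
This is a textbook-style illustration under simplifying assumptions, not the code in
AirAllocateRegistersAndStackByLinearScan.cpp: Interval, Location, Allocation, and
linearScan are hypothetical names, there is a single register class, and spilled
temporaries live on the stack for their whole interval. The point is that the same
intervals drive both register assignment and spill slot reuse, so no second liveness
pass is needed.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <vector>

struct Interval {
    int tmp;           // hypothetical temporary id
    std::size_t first; // first instruction index where the temporary is live
    std::size_t last;  // last instruction index where the temporary is live
};

struct Location {
    bool inRegister;
    std::size_t index; // register number if inRegister, otherwise spill slot number
};

struct Allocation {
    std::map<int, Location> locations;
    std::size_t numSlots;
};

Allocation linearScan(std::vector<Interval> intervals, std::size_t numRegisters)
{
    std::sort(intervals.begin(), intervals.end(),
        [](const Interval& a, const Interval& b) { return a.first < b.first; });

    struct Active { Interval interval; std::size_t reg; };
    std::vector<Active> active;
    std::vector<std::size_t> freeRegs;
    for (std::size_t reg = numRegisters; reg--;)
        freeRegs.push_back(reg); // pop_back hands out register 0 first

    // Per spill slot: the largest 'last' of any interval stored in it so far.
    std::vector<std::size_t> slotBusyUntil;
    Allocation result { {}, 0 };

    auto spill = [&](const Interval& interval) {
        // Reuse the first slot whose occupants all ended before this interval starts.
        std::size_t slot = 0;
        while (slot < slotBusyUntil.size() && slotBusyUntil[slot] >= interval.first)
            ++slot;
        if (slot == slotBusyUntil.size())
            slotBusyUntil.push_back(interval.last);
        else
            slotBusyUntil[slot] = std::max(slotBusyUntil[slot], interval.last);
        result.locations[interval.tmp] = { false, slot };
    };

    for (const Interval& current : intervals) {
        // Expire intervals that ended before 'current' starts and free their registers.
        for (std::size_t i = 0; i < active.size();) {
            if (active[i].interval.last < current.first) {
                freeRegs.push_back(active[i].reg);
                active[i] = active.back();
                active.pop_back();
            } else
                ++i;
        }

        if (!freeRegs.empty()) {
            std::size_t reg = freeRegs.back();
            freeRegs.pop_back();
            active.push_back({ current, reg });
            result.locations[current.tmp] = { true, reg };
            continue;
        }

        // No register is free: spill whichever live interval ends last.
        auto victim = std::max_element(active.begin(), active.end(),
            [](const Active& a, const Active& b) { return a.interval.last < b.interval.last; });
        if (victim != active.end() && victim->interval.last > current.last) {
            std::size_t reg = victim->reg;
            spill(victim->interval);
            *victim = Active { current, reg };
            result.locations[current.tmp] = { true, reg };
        } else
            spill(current);
    }

    result.numSlots = slotBusyUntil.size();
    return result;
}
```

The slot reuse check is deliberately conservative: it only asks whether everything
previously stored in a slot has already ended, which is cheap but can use more slots
than an interference-graph coloring would.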

This also removes spillEverything, because we never use that phase, we never test it,
and it got in the way in this refactoring.

This is a 21% speed-up on wasm -O1 compile times. This does not significantly change
-O1 throughput. We had already disabled allocateStack's most important optimization
(spill coalescing). This probably regresses average stack frame size, but I didn't
measure by how much. Stack frame size is really not that important. The algorithm in
allocateStackByGraphColoring is about much more than optimal frame size; it also
tries to avoid having to zero-extend 32-bit spills, it kills dead code, and of course
it coalesces.
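
As a rough illustration of why slot sharing affects frame size, the sketch below lays
out spill slots at negative offsets from the frame pointer and rounds the total up to
the stack alignment. Everything here is an assumption made for the example: Slot,
roundUp, and layOutSlots are hypothetical names, slot sizes are powers of two, and the
16-byte alignment is a typical value rather than something stated in this change.

```cpp
#include <cstdint>
#include <vector>

struct Slot {
    unsigned byteSize;         // assumed to be a power of two, e.g. 4 or 8
    std::int64_t offsetFromFP; // filled in by layOutSlots
};

// Round value up to a multiple of alignment; alignment must be a power of two.
std::uint64_t roundUp(std::uint64_t value, std::uint64_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

// Assign each slot a naturally aligned offset below the frame pointer and
// return the resulting frame size in bytes.
std::uint64_t layOutSlots(std::vector<Slot>& slots, std::uint64_t stackAlignment = 16)
{
    std::uint64_t bytesUsed = 0;
    for (Slot& slot : slots) {
        bytesUsed = roundUp(bytesUsed + slot.byteSize, slot.byteSize);
        slot.offsetFromFP = -static_cast<std::int64_t>(bytesUsed);
    }
    return roundUp(bytesUsed, stackAlignment);
}
```

With slots of 4, 8, and 4 bytes this yields a 32-byte frame. If the two 4-byte spills
could share one slot, the same layout would need only 16 bytes.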

* CMakeLists.txt:
* JavaScriptCore.xcodeproj/project.pbxproj:
* b3/B3Procedure.cpp:
(JSC::B3::Procedure::calleeSaveRegisterAtOffsetList):
(JSC::B3::Procedure::calleeSaveRegisters): Deleted.
* b3/B3Procedure.h:
* b3/B3StackmapGenerationParams.cpp:
(JSC::B3::StackmapGenerationParams::unavailableRegisters):
* b3/air/AirAllocateRegistersAndStackByLinearScan.cpp: Copied from Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp.
(JSC::B3::Air::allocateRegistersAndStackByLinearScan):
(JSC::B3::Air::allocateRegistersByLinearScan): Deleted.
* b3/air/AirAllocateRegistersAndStackByLinearScan.h: Copied from Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.h.
* b3/air/AirAllocateRegistersByLinearScan.cpp: Removed.
* b3/air/AirAllocateRegistersByLinearScan.h: Removed.
* b3/air/AirAllocateStackByGraphColoring.cpp:
(JSC::B3::Air::allocateEscapedStackSlots):
(JSC::B3::Air::updateFrameSizeBasedOnStackSlots):
(JSC::B3::Air::allocateStackByGraphColoring):
* b3/air/AirAllocateStackByGraphColoring.h:
* b3/air/AirArg.cpp:
(JSC::B3::Air::Arg::stackAddr):
* b3/air/AirArg.h:
(JSC::B3::Air::Arg::stackAddr): Deleted.
* b3/air/AirCode.cpp:
(JSC::B3::Air::Code::addStackSlot):
(JSC::B3::Air::Code::setCalleeSaveRegisterAtOffsetList):
(JSC::B3::Air::Code::calleeSaveRegisterAtOffsetList):
(JSC::B3::Air::Code::dump):
* b3/air/AirCode.h:
(JSC::B3::Air::Code::setStackIsAllocated):
(JSC::B3::Air::Code::stackIsAllocated):
(JSC::B3::Air::Code::calleeSaveRegisters):
* b3/air/AirGenerate.cpp:
(JSC::B3::Air::prepareForGeneration):
(JSC::B3::Air::generate):
* b3/air/AirHandleCalleeSaves.cpp:
(JSC::B3::Air::handleCalleeSaves):
* b3/air/AirHandleCalleeSaves.h:
* b3/air/AirLowerAfterRegAlloc.cpp:
(JSC::B3::Air::lowerAfterRegAlloc):
* b3/air/AirLowerStackArgs.cpp: Added.
(JSC::B3::Air::lowerStackArgs):
* b3/air/AirLowerStackArgs.h: Added.
* b3/testb3.cpp:
(JSC::B3::testPinRegisters):
* ftl/FTLCompile.cpp:
(JSC::FTL::compile):
* jit/RegisterAtOffsetList.h:
* wasm/WasmB3IRGenerator.cpp:
(JSC::Wasm::parseAndCompile):

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@215292 268f45cc-cd09-0410-ab3c-d52691b4dbfc
27 files changed:
Source/JavaScriptCore/CMakeLists.txt
Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj
Source/JavaScriptCore/b3/B3Procedure.cpp
Source/JavaScriptCore/b3/B3Procedure.h
Source/JavaScriptCore/b3/B3StackmapGenerationParams.cpp
Source/JavaScriptCore/b3/air/AirAllocateRegistersAndStackByLinearScan.cpp [moved from Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp with 87% similarity]
Source/JavaScriptCore/b3/air/AirAllocateRegistersAndStackByLinearScan.h [moved from Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.h with 78% similarity]
Source/JavaScriptCore/b3/air/AirAllocateStackByGraphColoring.cpp
Source/JavaScriptCore/b3/air/AirAllocateStackByGraphColoring.h
Source/JavaScriptCore/b3/air/AirArg.cpp
Source/JavaScriptCore/b3/air/AirArg.h
Source/JavaScriptCore/b3/air/AirCode.cpp
Source/JavaScriptCore/b3/air/AirCode.h
Source/JavaScriptCore/b3/air/AirGenerate.cpp
Source/JavaScriptCore/b3/air/AirHandleCalleeSaves.cpp
Source/JavaScriptCore/b3/air/AirHandleCalleeSaves.h
Source/JavaScriptCore/b3/air/AirLowerAfterRegAlloc.cpp
Source/JavaScriptCore/b3/air/AirLowerStackArgs.cpp [new file with mode: 0644]
Source/JavaScriptCore/b3/air/AirLowerStackArgs.h [moved from Source/JavaScriptCore/b3/air/AirSpillEverything.h with 63% similarity]
Source/JavaScriptCore/b3/air/AirSpillEverything.cpp [deleted file]
Source/JavaScriptCore/b3/testb3.cpp
Source/JavaScriptCore/ftl/FTLCompile.cpp
Source/JavaScriptCore/jit/RegisterAtOffsetList.h
Source/JavaScriptCore/runtime/Options.h
Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp
Tools/Scripts/run-jsc-stress-tests