[JSC] Speed up Air Liveness Analysis on Tmps
authorbenjamin@webkit.org <benjamin@webkit.org@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Tue, 1 Dec 2015 02:26:57 +0000 (02:26 +0000)
committerbenjamin@webkit.org <benjamin@webkit.org@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Tue, 1 Dec 2015 02:26:57 +0000 (02:26 +0000)
commitd3d574149a42299526c94903318f0d05ea9b30b6
treef9e0c1760aef41480963c0ca29c2be9f84583075
parent58ce9d834aee827fc02e6c616d90716feb2c0f0f
[JSC] Speed up Air Liveness Analysis on Tmps
https://bugs.webkit.org/show_bug.cgi?id=151556

Patch by Benjamin Poulain <bpoulain@apple.com> on 2015-11-30
Reviewed by Filip Pizlo.

Source/JavaScriptCore:

Liveness Analysis scales poorly on large graphs like the ones
generated by testComplex().
This patch introduces a faster of Liveness using the continuous indices
of values instead of the values themselves.

There are two main areas of improvements:
1) Reduce the cost of doing a LocalCalc over a BasicBlock.
2) Reduce how many LocalCalc are needed to converge to a solution.

Most of the costs of LocalCalc are from HashSet manipulations.
The HashSet operations are O(1) but the constant is large enough
to be a problem.

I used a similar trick as the Register Allocator to remove hashing
and collision handling: the absolute value of the Tmp is used as an index
into a flat array.

I used Briggs's Sparse Set implementation for the local live information
at each instruction. It has great properties for doing the local calculation:
-No memory reallocation.
-O(1) add() and remove() with a small constant.
-Strict O(n) iteration.
-O(1) clear().

The values Live-At-Head are now stored into a Vector. The Sparse Set
is used to maintain the Tmp uniqueness.

When forwarding new liveness at head to the predecessor, I start by removing
everything that was already in live-at-head. We can assume that any value
in that list has already been added to the predecessors.
This leaves us with a small-ish number of Tmps to add to live-at-head
and to the predecessors.

The speed up convergence, I used the same trick as DFG's liveness: keep
a set of dirty blocks to process. In practice, all the blocks without
back-edges converge quickly, and we only propagate liveness as needed.

This patch reduces the time taken by "testComplex(64, 384)" by another 5%.

The remaining things to do for Liveness are:
-Skip the first block for the fix point (it is often large and doing a local
 calc on it is useless).
-Find a better Data Structure for live-at-tail (updating the HashSet takes
 > 50% of the total convergence time).

* JavaScriptCore.xcodeproj/project.pbxproj:
* b3/air/AirIteratedRegisterCoalescing.cpp:
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::build):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::getAlias):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::getAliasWhenSpilling):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::allocatedReg):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::tmpArraySize):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::initializeDegrees):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::addEdges):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::addEdge):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::makeWorkList):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::simplify):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::forEachAdjacent):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::hasBeenSimplified):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::decrementDegree):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::forEachNodeMoves):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::isMoveRelated):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::enableMovesOnValue):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::precoloredCoalescingHeuristic):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::conservativeHeuristic):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::addWorkList):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::combine):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::freezeMoves):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::selectSpill):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::assignColors):
(JSC::B3::Air::IteratedRegisterCoalescingAllocator::dumpInterferenceGraphInDot):
(JSC::B3::Air::iteratedRegisterCoalescingOnType):
(JSC::B3::Air::iteratedRegisterCoalescing):
(JSC::B3::Air::AbsoluteTmpHelper<Arg::GP>::absoluteIndex): Deleted.
(JSC::B3::Air::AbsoluteTmpHelper<Arg::GP>::tmpFromAbsoluteIndex): Deleted.
(JSC::B3::Air::AbsoluteTmpHelper<Arg::FP>::absoluteIndex): Deleted.
(JSC::B3::Air::AbsoluteTmpHelper<Arg::FP>::tmpFromAbsoluteIndex): Deleted.
* b3/air/AirReportUsedRegisters.cpp:
(JSC::B3::Air::reportUsedRegisters):
* b3/air/AirTmpInlines.h:
(JSC::B3::Air::AbsoluteTmpMapper<Arg::GP>::absoluteIndex):
(JSC::B3::Air::AbsoluteTmpMapper<Arg::GP>::tmpFromAbsoluteIndex):
(JSC::B3::Air::AbsoluteTmpMapper<Arg::FP>::absoluteIndex):
(JSC::B3::Air::AbsoluteTmpMapper<Arg::FP>::tmpFromAbsoluteIndex):
* b3/air/AirLiveness.h: Added.

Source/WTF:

* WTF.xcodeproj/project.pbxproj:
* wtf/IndexSparseSet.h: Added.
(WTF::IndexSparseSet<OverflowHandler>::IndexSparseSet):
(WTF::IndexSparseSet<OverflowHandler>::add):
(WTF::IndexSparseSet<OverflowHandler>::remove):
(WTF::IndexSparseSet<OverflowHandler>::clear):
(WTF::IndexSparseSet<OverflowHandler>::size):
(WTF::IndexSparseSet<OverflowHandler>::isEmpty):
(WTF::IndexSparseSet<OverflowHandler>::contains):

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@192851 268f45cc-cd09-0410-ab3c-d52691b4dbfc
12 files changed:
Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj
Source/JavaScriptCore/b3/B3IndexMap.h
Source/JavaScriptCore/b3/air/AirAllocateStack.cpp
Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp
Source/JavaScriptCore/b3/air/AirLiveness.h
Source/JavaScriptCore/b3/air/AirReportUsedRegisters.cpp
Source/JavaScriptCore/b3/air/AirSpillEverything.cpp
Source/JavaScriptCore/b3/air/AirTmpInlines.h
Source/WTF/ChangeLog
Source/WTF/WTF.xcodeproj/project.pbxproj
Source/WTF/wtf/IndexSparseSet.h [new file with mode: 0644]