B3 should have comprehensive support for atomic operations
author fpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Fri, 10 Mar 2017 17:49:42 +0000 (17:49 +0000)
committer fpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Fri, 10 Mar 2017 17:49:42 +0000 (17:49 +0000)
https://bugs.webkit.org/show_bug.cgi?id=162349

Reviewed by Keith Miller.

Source/JavaScriptCore:

This adds the following capabilities to B3:

- Atomic weak/strong unfenced/fenced compare-and-swap
- Atomic add/sub/or/and/xor/xchg
- Acquire/release fencing on loads/stores
- Fenceless load-load dependencies
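
For orientation, here is roughly how a client would construct the new opcodes. This is a
sketch in the style of testb3, not code lifted from this patch: the exact AtomicValue
argument order (width, operands, pointer) and the helper names are illustrative.

    using namespace JSC::B3;

    // Assumes the usual testb3 setup: a Procedure `proc` and the B3 headers
    // (B3AtomicValue.h, B3BasicBlockInlines.h, B3ConstPtrValue.h, ...).
    static void buildAtomicSketch(Procedure& proc, int* cell)
    {
        BasicBlock* root = proc.addBlock();
        Value* ptr = root->appendNew<ConstPtrValue>(proc, Origin(), cell);
        Value* expected = root->appendNew<Const32Value>(proc, Origin(), 42);
        Value* newValue = root->appendNew<Const32Value>(proc, Origin(), 0xbeef);

        // Weak CAS: returns an Int32 flag saying whether the swap happened.
        Value* didSwap = root->appendNew<AtomicValue>(
            proc, AtomicWeakCAS, Origin(), Width32, expected, newValue, ptr);

        // Atomic add that also returns the old value (x86: lock xadd; ARM64: LL/SC loop).
        root->appendNew<AtomicValue>(
            proc, AtomicXchgAdd, Origin(), Width32,
            root->appendNew<Const32Value>(proc, Origin(), 1), ptr);

        root->appendNewControlValue(proc, Return, Origin(), didSwap);
    }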

This adds lowering to the following instructions on x86:

- lock cmpxchg
- lock xadd
- lock add/sub/or/and/xor/xchg

This adds lowering to the following instructions on ARM64:

- ldar and friends
- stlr and friends
- ldxr and friends (unfenced LL)
- stxr and friends (unfenced SC)
- ldaxr and friends (fenced LL)
- stlxr and friends (fenced SC)
- eor as a fenceless load-load dependency
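
For readers less familiar with LL/SC: the ldxr/stxr family gives a reservation-based retry
loop rather than a single atomic instruction, which is why weak CAS may fail spuriously.
The portable C++ below (illustrative only, not code from this patch) has the same shape as
the loop that the fenced pair ldaxr/stlxr implements in hardware.

    #include <atomic>

    // Weak CAS with an explicit retry loop. On ARM64 this is roughly
    // ldaxr (load-acquire exclusive) + compare + stlxr (store-release
    // conditional) + branch back if the store-conditional failed.
    bool casOrFail(std::atomic<int>& cell, int expected, int desired)
    {
        int observed = expected;
        while (!cell.compare_exchange_weak(observed, desired,
                                           std::memory_order_acq_rel,
                                           std::memory_order_acquire)) {
            if (observed != expected)
                return false;   // the value really differed
            // Otherwise the store-conditional lost its reservation: retry.
        }
        return true;            // the swap happened
    }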

This does instruction selection pattern matching to ensure that weak/strong CAS and all of the
variants of fences and atomic math ops get lowered to the best possible instruction sequence.
For example, we support the Equal(AtomicStrongCAS(expected, ...), expected) pattern and a bunch
of its friends. You can say Branch(Equal(AtomicStrongCAS(expected, ...), expected)) and it will
generate the best possible branch sequence on x86 and ARM64.
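
Concretely, the branchy CAS idiom looks like this in B3 IR. Again a sketch: the value
constructors are shown with illustrative argument order, and successBlock/failureBlock are
placeholders.

    // CAS, compare the returned old value against `expected`, and branch on it.
    // The instruction selector fuses this into a single CAS-and-branch sequence.
    Value* oldValue = root->appendNew<AtomicValue>(
        proc, AtomicStrongCAS, Origin(), Width32, expected, newValue, ptr);
    Value* succeeded = root->appendNew<Value>(proc, Equal, Origin(), oldValue, expected);
    root->appendNewControlValue(
        proc, Branch, Origin(), succeeded,
        FrequentedBlock(successBlock), FrequentedBlock(failureBlock));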

B3 now knows how to model all of the kinds of fencing. It knows that acq loads are ordered with
respect to each other and with respect to rel stores, creating sequential consistency that
transcends just the acq/rel fences themselves (see Effects::fence). It knows that the phantom
fence effects may only target some abstract heaps but not others, so that load elimination and
store sinking can still operate across fences if you just tell B3 that the fence does not alias
those accesses. This makes it super easy to teach B3 that some of your heap is thread-local.
Even better, it lets you express fine-grained dependencies where the atomics that affect one
property in shared memory do not clobber non-atomics that affect some other property in shared
memory.
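
As a sketch of how that scoping is expressed (field names follow the Effects description
above; `sharedHeap` is a placeholder HeapRange and the exact Effects helpers are
assumptions, not verbatim API):

    // An opaque operation that acts as a fence, but only for the abstract
    // heap ranges it names. Accesses outside `sharedHeap` can still be
    // hoisted or sunk across it.
    PatchpointValue* barrier = root->appendNew<PatchpointValue>(proc, Void, Origin());
    Effects effects = Effects::none();
    effects.fence = true;         // participates in acq/rel ordering
    effects.reads = sharedHeap;   // the only ranges it may observe...
    effects.writes = sharedHeap;  // ...or clobber
    barrier->effects = effects;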

One of my favorite features is Depend, which allows you to express load-load dependencies. On
x86 it lowers to nothing, while on ARM64 it lowers to eor.
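
A sketch of the intended use (names like headPtr and payloadBase are placeholders):

    // Publish/consume style: the second load's address is made to depend on
    // the first load's value via Depend, so ARM64 keeps the loads ordered
    // without a fence; on x86 the Depend disappears entirely.
    Value* head = root->appendNew<MemoryValue>(proc, Load, Int32, Origin(), headPtr);
    Value* dependency = root->appendNew<Value>(proc, Depend, Origin(), head);
    Value* payloadAddr = root->appendNew<Value>(
        proc, Add, Origin(), payloadBase,
        root->appendNew<Value>(proc, ZExt32, Origin(), dependency));
    Value* payload = root->appendNew<MemoryValue>(proc, Load, Int32, Origin(), payloadAddr);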

This also exposes a common atomicWeakCAS API to the x86_64/ARM64 MacroAssemblers. Same for
acq/rel. JSC's 64-bit JITs are now a happy concurrency playground.
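
In MacroAssembler terms that surface looks roughly like the following (a sketch; `jit` is a
hypothetical MacroAssembler and the argument order mirrors the existing load32/store32
conventions rather than being copied from the patch):

    // Acquire load, plain arithmetic, release store on a shared cell.
    jit.loadAcq32(MacroAssembler::Address(GPRInfo::regT0), GPRInfo::regT1);
    jit.add32(MacroAssembler::TrustedImm32(1), GPRInfo::regT1);
    jit.storeRel32(GPRInfo::regT1, MacroAssembler::Address(GPRInfo::regT0));
    // The weak/strong CAS helpers (atomicWeakCAS32, branchAtomicWeakCAS32, ...)
    // follow the same naming scheme on both x86_64 and ARM64.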

This doesn't yet expose the functionality to JS or wasm. SAB still uses the non-intrinsic
implementations of the Atomics object, for now.

* CMakeLists.txt:
* JavaScriptCore.xcodeproj/project.pbxproj:
* assembler/ARM64Assembler.h:
(JSC::ARM64Assembler::ldar):
(JSC::ARM64Assembler::ldxr):
(JSC::ARM64Assembler::ldaxr):
(JSC::ARM64Assembler::stxr):
(JSC::ARM64Assembler::stlr):
(JSC::ARM64Assembler::stlxr):
(JSC::ARM64Assembler::excepnGenerationImmMask):
(JSC::ARM64Assembler::exoticLoad):
(JSC::ARM64Assembler::storeRelease):
(JSC::ARM64Assembler::exoticStore):
* assembler/AbstractMacroAssembler.cpp: Added.
(WTF::printInternal):
* assembler/AbstractMacroAssembler.h:
(JSC::AbstractMacroAssemblerBase::invert):
* assembler/MacroAssembler.h:
* assembler/MacroAssemblerARM64.h:
(JSC::MacroAssemblerARM64::loadAcq8SignedExtendTo32):
(JSC::MacroAssemblerARM64::loadAcq8):
(JSC::MacroAssemblerARM64::storeRel8):
(JSC::MacroAssemblerARM64::loadAcq16SignedExtendTo32):
(JSC::MacroAssemblerARM64::loadAcq16):
(JSC::MacroAssemblerARM64::storeRel16):
(JSC::MacroAssemblerARM64::loadAcq32):
(JSC::MacroAssemblerARM64::loadAcq64):
(JSC::MacroAssemblerARM64::storeRel32):
(JSC::MacroAssemblerARM64::storeRel64):
(JSC::MacroAssemblerARM64::loadLink8):
(JSC::MacroAssemblerARM64::loadLinkAcq8):
(JSC::MacroAssemblerARM64::storeCond8):
(JSC::MacroAssemblerARM64::storeCondRel8):
(JSC::MacroAssemblerARM64::loadLink16):
(JSC::MacroAssemblerARM64::loadLinkAcq16):
(JSC::MacroAssemblerARM64::storeCond16):
(JSC::MacroAssemblerARM64::storeCondRel16):
(JSC::MacroAssemblerARM64::loadLink32):
(JSC::MacroAssemblerARM64::loadLinkAcq32):
(JSC::MacroAssemblerARM64::storeCond32):
(JSC::MacroAssemblerARM64::storeCondRel32):
(JSC::MacroAssemblerARM64::loadLink64):
(JSC::MacroAssemblerARM64::loadLinkAcq64):
(JSC::MacroAssemblerARM64::storeCond64):
(JSC::MacroAssemblerARM64::storeCondRel64):
(JSC::MacroAssemblerARM64::atomicStrongCAS8):
(JSC::MacroAssemblerARM64::atomicStrongCAS16):
(JSC::MacroAssemblerARM64::atomicStrongCAS32):
(JSC::MacroAssemblerARM64::atomicStrongCAS64):
(JSC::MacroAssemblerARM64::atomicRelaxedStrongCAS8):
(JSC::MacroAssemblerARM64::atomicRelaxedStrongCAS16):
(JSC::MacroAssemblerARM64::atomicRelaxedStrongCAS32):
(JSC::MacroAssemblerARM64::atomicRelaxedStrongCAS64):
(JSC::MacroAssemblerARM64::branchAtomicWeakCAS8):
(JSC::MacroAssemblerARM64::branchAtomicWeakCAS16):
(JSC::MacroAssemblerARM64::branchAtomicWeakCAS32):
(JSC::MacroAssemblerARM64::branchAtomicWeakCAS64):
(JSC::MacroAssemblerARM64::branchAtomicRelaxedWeakCAS8):
(JSC::MacroAssemblerARM64::branchAtomicRelaxedWeakCAS16):
(JSC::MacroAssemblerARM64::branchAtomicRelaxedWeakCAS32):
(JSC::MacroAssemblerARM64::branchAtomicRelaxedWeakCAS64):
(JSC::MacroAssemblerARM64::depend32):
(JSC::MacroAssemblerARM64::depend64):
(JSC::MacroAssemblerARM64::loadLink):
(JSC::MacroAssemblerARM64::loadLinkAcq):
(JSC::MacroAssemblerARM64::storeCond):
(JSC::MacroAssemblerARM64::storeCondRel):
(JSC::MacroAssemblerARM64::signExtend):
(JSC::MacroAssemblerARM64::branch):
(JSC::MacroAssemblerARM64::atomicStrongCAS):
(JSC::MacroAssemblerARM64::atomicRelaxedStrongCAS):
(JSC::MacroAssemblerARM64::branchAtomicWeakCAS):
(JSC::MacroAssemblerARM64::branchAtomicRelaxedWeakCAS):
(JSC::MacroAssemblerARM64::extractSimpleAddress):
(JSC::MacroAssemblerARM64::signExtend<8>):
(JSC::MacroAssemblerARM64::signExtend<16>):
(JSC::MacroAssemblerARM64::branch<64>):
* assembler/MacroAssemblerX86Common.h:
(JSC::MacroAssemblerX86Common::add32):
(JSC::MacroAssemblerX86Common::and32):
(JSC::MacroAssemblerX86Common::and16):
(JSC::MacroAssemblerX86Common::and8):
(JSC::MacroAssemblerX86Common::neg32):
(JSC::MacroAssemblerX86Common::neg16):
(JSC::MacroAssemblerX86Common::neg8):
(JSC::MacroAssemblerX86Common::or32):
(JSC::MacroAssemblerX86Common::or16):
(JSC::MacroAssemblerX86Common::or8):
(JSC::MacroAssemblerX86Common::sub16):
(JSC::MacroAssemblerX86Common::sub8):
(JSC::MacroAssemblerX86Common::sub32):
(JSC::MacroAssemblerX86Common::xor32):
(JSC::MacroAssemblerX86Common::xor16):
(JSC::MacroAssemblerX86Common::xor8):
(JSC::MacroAssemblerX86Common::not32):
(JSC::MacroAssemblerX86Common::not16):
(JSC::MacroAssemblerX86Common::not8):
(JSC::MacroAssemblerX86Common::store16):
(JSC::MacroAssemblerX86Common::atomicStrongCAS8):
(JSC::MacroAssemblerX86Common::atomicStrongCAS16):
(JSC::MacroAssemblerX86Common::atomicStrongCAS32):
(JSC::MacroAssemblerX86Common::branchAtomicStrongCAS8):
(JSC::MacroAssemblerX86Common::branchAtomicStrongCAS16):
(JSC::MacroAssemblerX86Common::branchAtomicStrongCAS32):
(JSC::MacroAssemblerX86Common::atomicWeakCAS8):
(JSC::MacroAssemblerX86Common::atomicWeakCAS16):
(JSC::MacroAssemblerX86Common::atomicWeakCAS32):
(JSC::MacroAssemblerX86Common::branchAtomicWeakCAS8):
(JSC::MacroAssemblerX86Common::branchAtomicWeakCAS16):
(JSC::MacroAssemblerX86Common::branchAtomicWeakCAS32):
(JSC::MacroAssemblerX86Common::atomicRelaxedWeakCAS8):
(JSC::MacroAssemblerX86Common::atomicRelaxedWeakCAS16):
(JSC::MacroAssemblerX86Common::atomicRelaxedWeakCAS32):
(JSC::MacroAssemblerX86Common::branchAtomicRelaxedWeakCAS8):
(JSC::MacroAssemblerX86Common::branchAtomicRelaxedWeakCAS16):
(JSC::MacroAssemblerX86Common::branchAtomicRelaxedWeakCAS32):
(JSC::MacroAssemblerX86Common::atomicAdd8):
(JSC::MacroAssemblerX86Common::atomicAdd16):
(JSC::MacroAssemblerX86Common::atomicAdd32):
(JSC::MacroAssemblerX86Common::atomicSub8):
(JSC::MacroAssemblerX86Common::atomicSub16):
(JSC::MacroAssemblerX86Common::atomicSub32):
(JSC::MacroAssemblerX86Common::atomicAnd8):
(JSC::MacroAssemblerX86Common::atomicAnd16):
(JSC::MacroAssemblerX86Common::atomicAnd32):
(JSC::MacroAssemblerX86Common::atomicOr8):
(JSC::MacroAssemblerX86Common::atomicOr16):
(JSC::MacroAssemblerX86Common::atomicOr32):
(JSC::MacroAssemblerX86Common::atomicXor8):
(JSC::MacroAssemblerX86Common::atomicXor16):
(JSC::MacroAssemblerX86Common::atomicXor32):
(JSC::MacroAssemblerX86Common::atomicNeg8):
(JSC::MacroAssemblerX86Common::atomicNeg16):
(JSC::MacroAssemblerX86Common::atomicNeg32):
(JSC::MacroAssemblerX86Common::atomicNot8):
(JSC::MacroAssemblerX86Common::atomicNot16):
(JSC::MacroAssemblerX86Common::atomicNot32):
(JSC::MacroAssemblerX86Common::atomicXchgAdd8):
(JSC::MacroAssemblerX86Common::atomicXchgAdd16):
(JSC::MacroAssemblerX86Common::atomicXchgAdd32):
(JSC::MacroAssemblerX86Common::atomicXchg8):
(JSC::MacroAssemblerX86Common::atomicXchg16):
(JSC::MacroAssemblerX86Common::atomicXchg32):
(JSC::MacroAssemblerX86Common::loadAcq8):
(JSC::MacroAssemblerX86Common::loadAcq8SignedExtendTo32):
(JSC::MacroAssemblerX86Common::loadAcq16):
(JSC::MacroAssemblerX86Common::loadAcq16SignedExtendTo32):
(JSC::MacroAssemblerX86Common::loadAcq32):
(JSC::MacroAssemblerX86Common::storeRel8):
(JSC::MacroAssemblerX86Common::storeRel16):
(JSC::MacroAssemblerX86Common::storeRel32):
(JSC::MacroAssemblerX86Common::storeFence):
(JSC::MacroAssemblerX86Common::loadFence):
(JSC::MacroAssemblerX86Common::replaceWithJump):
(JSC::MacroAssemblerX86Common::maxJumpReplacementSize):
(JSC::MacroAssemblerX86Common::patchableJumpSize):
(JSC::MacroAssemblerX86Common::supportsFloatingPointRounding):
(JSC::MacroAssemblerX86Common::supportsAVX):
(JSC::MacroAssemblerX86Common::updateEax1EcxFlags):
(JSC::MacroAssemblerX86Common::x86Condition):
(JSC::MacroAssemblerX86Common::atomicStrongCAS):
(JSC::MacroAssemblerX86Common::branchAtomicStrongCAS):
* assembler/MacroAssemblerX86_64.h:
(JSC::MacroAssemblerX86_64::add64):
(JSC::MacroAssemblerX86_64::and64):
(JSC::MacroAssemblerX86_64::neg64):
(JSC::MacroAssemblerX86_64::or64):
(JSC::MacroAssemblerX86_64::sub64):
(JSC::MacroAssemblerX86_64::xor64):
(JSC::MacroAssemblerX86_64::not64):
(JSC::MacroAssemblerX86_64::store64):
(JSC::MacroAssemblerX86_64::atomicStrongCAS64):
(JSC::MacroAssemblerX86_64::branchAtomicStrongCAS64):
(JSC::MacroAssemblerX86_64::atomicWeakCAS64):
(JSC::MacroAssemblerX86_64::branchAtomicWeakCAS64):
(JSC::MacroAssemblerX86_64::atomicRelaxedWeakCAS64):
(JSC::MacroAssemblerX86_64::branchAtomicRelaxedWeakCAS64):
(JSC::MacroAssemblerX86_64::atomicAdd64):
(JSC::MacroAssemblerX86_64::atomicSub64):
(JSC::MacroAssemblerX86_64::atomicAnd64):
(JSC::MacroAssemblerX86_64::atomicOr64):
(JSC::MacroAssemblerX86_64::atomicXor64):
(JSC::MacroAssemblerX86_64::atomicNeg64):
(JSC::MacroAssemblerX86_64::atomicNot64):
(JSC::MacroAssemblerX86_64::atomicXchgAdd64):
(JSC::MacroAssemblerX86_64::atomicXchg64):
(JSC::MacroAssemblerX86_64::loadAcq64):
(JSC::MacroAssemblerX86_64::storeRel64):
* assembler/X86Assembler.h:
(JSC::X86Assembler::addl_mr):
(JSC::X86Assembler::addq_mr):
(JSC::X86Assembler::addq_rm):
(JSC::X86Assembler::addq_im):
(JSC::X86Assembler::andl_mr):
(JSC::X86Assembler::andl_rm):
(JSC::X86Assembler::andw_rm):
(JSC::X86Assembler::andb_rm):
(JSC::X86Assembler::andl_im):
(JSC::X86Assembler::andw_im):
(JSC::X86Assembler::andb_im):
(JSC::X86Assembler::andq_mr):
(JSC::X86Assembler::andq_rm):
(JSC::X86Assembler::andq_im):
(JSC::X86Assembler::incq_m):
(JSC::X86Assembler::negq_m):
(JSC::X86Assembler::negl_m):
(JSC::X86Assembler::negw_m):
(JSC::X86Assembler::negb_m):
(JSC::X86Assembler::notl_m):
(JSC::X86Assembler::notw_m):
(JSC::X86Assembler::notb_m):
(JSC::X86Assembler::notq_m):
(JSC::X86Assembler::orl_mr):
(JSC::X86Assembler::orl_rm):
(JSC::X86Assembler::orw_rm):
(JSC::X86Assembler::orb_rm):
(JSC::X86Assembler::orl_im):
(JSC::X86Assembler::orw_im):
(JSC::X86Assembler::orb_im):
(JSC::X86Assembler::orq_mr):
(JSC::X86Assembler::orq_rm):
(JSC::X86Assembler::orq_im):
(JSC::X86Assembler::subl_mr):
(JSC::X86Assembler::subl_rm):
(JSC::X86Assembler::subw_rm):
(JSC::X86Assembler::subb_rm):
(JSC::X86Assembler::subl_im):
(JSC::X86Assembler::subw_im):
(JSC::X86Assembler::subb_im):
(JSC::X86Assembler::subq_mr):
(JSC::X86Assembler::subq_rm):
(JSC::X86Assembler::subq_im):
(JSC::X86Assembler::xorl_mr):
(JSC::X86Assembler::xorl_rm):
(JSC::X86Assembler::xorl_im):
(JSC::X86Assembler::xorw_rm):
(JSC::X86Assembler::xorw_im):
(JSC::X86Assembler::xorb_rm):
(JSC::X86Assembler::xorb_im):
(JSC::X86Assembler::xorq_im):
(JSC::X86Assembler::xorq_rm):
(JSC::X86Assembler::xorq_mr):
(JSC::X86Assembler::xchgb_rm):
(JSC::X86Assembler::xchgw_rm):
(JSC::X86Assembler::xchgl_rm):
(JSC::X86Assembler::xchgq_rm):
(JSC::X86Assembler::movw_im):
(JSC::X86Assembler::movq_i32m):
(JSC::X86Assembler::cmpxchgb_rm):
(JSC::X86Assembler::cmpxchgw_rm):
(JSC::X86Assembler::cmpxchgl_rm):
(JSC::X86Assembler::cmpxchgq_rm):
(JSC::X86Assembler::xaddb_rm):
(JSC::X86Assembler::xaddw_rm):
(JSC::X86Assembler::xaddl_rm):
(JSC::X86Assembler::xaddq_rm):
(JSC::X86Assembler::X86InstructionFormatter::SingleInstructionBufferWriter::memoryModRM):
* b3/B3AtomicValue.cpp: Added.
(JSC::B3::AtomicValue::~AtomicValue):
(JSC::B3::AtomicValue::dumpMeta):
(JSC::B3::AtomicValue::cloneImpl):
(JSC::B3::AtomicValue::AtomicValue):
* b3/B3AtomicValue.h: Added.
* b3/B3BasicBlock.h:
* b3/B3BlockInsertionSet.cpp:
(JSC::B3::BlockInsertionSet::BlockInsertionSet):
(JSC::B3::BlockInsertionSet::insert): Deleted.
(JSC::B3::BlockInsertionSet::insertBefore): Deleted.
(JSC::B3::BlockInsertionSet::insertAfter): Deleted.
(JSC::B3::BlockInsertionSet::execute): Deleted.
* b3/B3BlockInsertionSet.h:
* b3/B3Effects.cpp:
(JSC::B3::Effects::interferes):
(JSC::B3::Effects::operator==):
(JSC::B3::Effects::dump):
* b3/B3Effects.h:
(JSC::B3::Effects::forCall):
(JSC::B3::Effects::mustExecute):
* b3/B3EliminateCommonSubexpressions.cpp:
* b3/B3Generate.cpp:
(JSC::B3::generateToAir):
* b3/B3GenericBlockInsertionSet.h: Added.
(JSC::B3::GenericBlockInsertionSet::GenericBlockInsertionSet):
(JSC::B3::GenericBlockInsertionSet::insert):
(JSC::B3::GenericBlockInsertionSet::insertBefore):
(JSC::B3::GenericBlockInsertionSet::insertAfter):
(JSC::B3::GenericBlockInsertionSet::execute):
* b3/B3HeapRange.h:
(JSC::B3::HeapRange::operator|):
* b3/B3InsertionSet.cpp:
(JSC::B3::InsertionSet::insertClone):
* b3/B3InsertionSet.h:
* b3/B3LegalizeMemoryOffsets.cpp:
* b3/B3LowerMacros.cpp:
(JSC::B3::lowerMacros):
* b3/B3LowerMacrosAfterOptimizations.cpp:
* b3/B3LowerToAir.cpp:
(JSC::B3::Air::LowerToAir::LowerToAir):
(JSC::B3::Air::LowerToAir::run):
(JSC::B3::Air::LowerToAir::effectiveAddr):
(JSC::B3::Air::LowerToAir::addr):
(JSC::B3::Air::LowerToAir::loadPromiseAnyOpcode):
(JSC::B3::Air::LowerToAir::appendShift):
(JSC::B3::Air::LowerToAir::tryAppendStoreBinOp):
(JSC::B3::Air::LowerToAir::storeOpcode):
(JSC::B3::Air::LowerToAir::createStore):
(JSC::B3::Air::LowerToAir::finishAppendingInstructions):
(JSC::B3::Air::LowerToAir::newBlock):
(JSC::B3::Air::LowerToAir::splitBlock):
(JSC::B3::Air::LowerToAir::fillStackmap):
(JSC::B3::Air::LowerToAir::appendX86Div):
(JSC::B3::Air::LowerToAir::appendX86UDiv):
(JSC::B3::Air::LowerToAir::loadLinkOpcode):
(JSC::B3::Air::LowerToAir::storeCondOpcode):
(JSC::B3::Air::LowerToAir::appendCAS):
(JSC::B3::Air::LowerToAir::appendVoidAtomic):
(JSC::B3::Air::LowerToAir::appendGeneralAtomic):
(JSC::B3::Air::LowerToAir::lower):
(JSC::B3::Air::LowerToAir::lowerX86Div): Deleted.
(JSC::B3::Air::LowerToAir::lowerX86UDiv): Deleted.
* b3/B3LowerToAir.h:
* b3/B3MemoryValue.cpp:
(JSC::B3::MemoryValue::isLegalOffset):
(JSC::B3::MemoryValue::accessType):
(JSC::B3::MemoryValue::accessBank):
(JSC::B3::MemoryValue::accessByteSize):
(JSC::B3::MemoryValue::dumpMeta):
(JSC::B3::MemoryValue::MemoryValue):
(JSC::B3::MemoryValue::accessWidth): Deleted.
* b3/B3MemoryValue.h:
* b3/B3MemoryValueInlines.h: Added.
(JSC::B3::MemoryValue::isLegalOffset):
(JSC::B3::MemoryValue::requiresSimpleAddr):
(JSC::B3::MemoryValue::accessWidth):
* b3/B3MoveConstants.cpp:
* b3/B3NativeTraits.h: Added.
* b3/B3Opcode.cpp:
(JSC::B3::storeOpcode):
(WTF::printInternal):
* b3/B3Opcode.h:
(JSC::B3::isLoad):
(JSC::B3::isStore):
(JSC::B3::isLoadStore):
(JSC::B3::isAtomic):
(JSC::B3::isAtomicCAS):
(JSC::B3::isAtomicXchg):
(JSC::B3::isMemoryAccess):
(JSC::B3::signExtendOpcode):
* b3/B3Procedure.cpp:
(JSC::B3::Procedure::dump):
* b3/B3Procedure.h:
(JSC::B3::Procedure::hasQuirks):
(JSC::B3::Procedure::setHasQuirks):
* b3/B3PureCSE.cpp:
(JSC::B3::pureCSE):
* b3/B3PureCSE.h:
* b3/B3ReduceStrength.cpp:
* b3/B3Validate.cpp:
* b3/B3Value.cpp:
(JSC::B3::Value::returnsBool):
(JSC::B3::Value::effects):
(JSC::B3::Value::key):
(JSC::B3::Value::performSubstitution):
(JSC::B3::Value::typeFor):
* b3/B3Value.h:
* b3/B3Width.cpp:
(JSC::B3::bestType):
* b3/B3Width.h:
(JSC::B3::canonicalWidth):
(JSC::B3::isCanonicalWidth):
(JSC::B3::mask):
* b3/air/AirArg.cpp:
(JSC::B3::Air::Arg::jsHash):
(JSC::B3::Air::Arg::dump):
(WTF::printInternal):
* b3/air/AirArg.h:
(JSC::B3::Air::Arg::isAnyUse):
(JSC::B3::Air::Arg::isColdUse):
(JSC::B3::Air::Arg::cooled):
(JSC::B3::Air::Arg::isEarlyUse):
(JSC::B3::Air::Arg::isLateUse):
(JSC::B3::Air::Arg::isAnyDef):
(JSC::B3::Air::Arg::isEarlyDef):
(JSC::B3::Air::Arg::isLateDef):
(JSC::B3::Air::Arg::isZDef):
(JSC::B3::Air::Arg::simpleAddr):
(JSC::B3::Air::Arg::statusCond):
(JSC::B3::Air::Arg::isSimpleAddr):
(JSC::B3::Air::Arg::isMemory):
(JSC::B3::Air::Arg::isStatusCond):
(JSC::B3::Air::Arg::isCondition):
(JSC::B3::Air::Arg::ptr):
(JSC::B3::Air::Arg::base):
(JSC::B3::Air::Arg::isGP):
(JSC::B3::Air::Arg::isFP):
(JSC::B3::Air::Arg::isValidForm):
(JSC::B3::Air::Arg::forEachTmpFast):
(JSC::B3::Air::Arg::forEachTmp):
(JSC::B3::Air::Arg::asAddress):
(JSC::B3::Air::Arg::asStatusCondition):
(JSC::B3::Air::Arg::isInvertible):
(JSC::B3::Air::Arg::inverted):
* b3/air/AirBasicBlock.cpp:
(JSC::B3::Air::BasicBlock::setSuccessors):
* b3/air/AirBasicBlock.h:
* b3/air/AirBlockInsertionSet.cpp: Added.
(JSC::B3::Air::BlockInsertionSet::BlockInsertionSet):
(JSC::B3::Air::BlockInsertionSet::~BlockInsertionSet):
* b3/air/AirBlockInsertionSet.h: Added.
* b3/air/AirDumpAsJS.cpp: Removed.
* b3/air/AirDumpAsJS.h: Removed.
* b3/air/AirEliminateDeadCode.cpp:
(JSC::B3::Air::eliminateDeadCode):
* b3/air/AirGenerate.cpp:
(JSC::B3::Air::prepareForGeneration):
* b3/air/AirInstInlines.h:
(JSC::B3::Air::isAtomicStrongCASValid):
(JSC::B3::Air::isBranchAtomicStrongCASValid):
(JSC::B3::Air::isAtomicStrongCAS8Valid):
(JSC::B3::Air::isAtomicStrongCAS16Valid):
(JSC::B3::Air::isAtomicStrongCAS32Valid):
(JSC::B3::Air::isAtomicStrongCAS64Valid):
(JSC::B3::Air::isBranchAtomicStrongCAS8Valid):
(JSC::B3::Air::isBranchAtomicStrongCAS16Valid):
(JSC::B3::Air::isBranchAtomicStrongCAS32Valid):
(JSC::B3::Air::isBranchAtomicStrongCAS64Valid):
* b3/air/AirOpcode.opcodes:
* b3/air/AirOptimizeBlockOrder.cpp:
(JSC::B3::Air::optimizeBlockOrder):
* b3/air/AirPadInterference.cpp:
(JSC::B3::Air::padInterference):
* b3/air/AirSpillEverything.cpp:
(JSC::B3::Air::spillEverything):
* b3/air/opcode_generator.rb:
* b3/testb3.cpp:
(JSC::B3::testLoadAcq42):
(JSC::B3::testStoreRelAddLoadAcq32):
(JSC::B3::testStoreRelAddLoadAcq8):
(JSC::B3::testStoreRelAddFenceLoadAcq8):
(JSC::B3::testStoreRelAddLoadAcq16):
(JSC::B3::testStoreRelAddLoadAcq64):
(JSC::B3::testTrappingStoreElimination):
(JSC::B3::testX86LeaAddAdd):
(JSC::B3::testX86LeaAddShlLeftScale1):
(JSC::B3::testAtomicWeakCAS):
(JSC::B3::testAtomicStrongCAS):
(JSC::B3::testAtomicXchg):
(JSC::B3::testDepend32):
(JSC::B3::testDepend64):
(JSC::B3::run):
* runtime/Options.h:

Websites/webkit.org:

Document the new opcodes!

* docs/b3/intermediate-representation.html:

git-svn-id: http://svn.webkit.org/repository/webkit/trunk@213714 268f45cc-cd09-0410-ab3c-d52691b4dbfc

66 files changed:
Source/JavaScriptCore/CMakeLists.txt
Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj
Source/JavaScriptCore/assembler/ARM64Assembler.h
Source/JavaScriptCore/assembler/AbstractMacroAssembler.cpp [new file with mode: 0644]
Source/JavaScriptCore/assembler/AbstractMacroAssembler.h
Source/JavaScriptCore/assembler/MacroAssembler.h
Source/JavaScriptCore/assembler/MacroAssemblerARM64.h
Source/JavaScriptCore/assembler/MacroAssemblerX86Common.h
Source/JavaScriptCore/assembler/MacroAssemblerX86_64.h
Source/JavaScriptCore/assembler/X86Assembler.h
Source/JavaScriptCore/b3/B3AtomicValue.cpp [new file with mode: 0644]
Source/JavaScriptCore/b3/B3AtomicValue.h [new file with mode: 0644]
Source/JavaScriptCore/b3/B3BasicBlock.h
Source/JavaScriptCore/b3/B3BlockInsertionSet.cpp
Source/JavaScriptCore/b3/B3BlockInsertionSet.h
Source/JavaScriptCore/b3/B3Effects.cpp
Source/JavaScriptCore/b3/B3Effects.h
Source/JavaScriptCore/b3/B3EliminateCommonSubexpressions.cpp
Source/JavaScriptCore/b3/B3Generate.cpp
Source/JavaScriptCore/b3/B3GenericBlockInsertionSet.h [new file with mode: 0644]
Source/JavaScriptCore/b3/B3HeapRange.h
Source/JavaScriptCore/b3/B3InsertionSet.cpp
Source/JavaScriptCore/b3/B3InsertionSet.h
Source/JavaScriptCore/b3/B3LegalizeMemoryOffsets.cpp
Source/JavaScriptCore/b3/B3LowerMacros.cpp
Source/JavaScriptCore/b3/B3LowerMacrosAfterOptimizations.cpp
Source/JavaScriptCore/b3/B3LowerToAir.cpp
Source/JavaScriptCore/b3/B3LowerToAir.h
Source/JavaScriptCore/b3/B3MemoryValue.cpp
Source/JavaScriptCore/b3/B3MemoryValue.h
Source/JavaScriptCore/b3/B3MemoryValueInlines.h [new file with mode: 0644]
Source/JavaScriptCore/b3/B3MoveConstants.cpp
Source/JavaScriptCore/b3/B3NativeTraits.h [new file with mode: 0644]
Source/JavaScriptCore/b3/B3Opcode.cpp
Source/JavaScriptCore/b3/B3Opcode.h
Source/JavaScriptCore/b3/B3Procedure.cpp
Source/JavaScriptCore/b3/B3Procedure.h
Source/JavaScriptCore/b3/B3PureCSE.cpp
Source/JavaScriptCore/b3/B3PureCSE.h
Source/JavaScriptCore/b3/B3ReduceStrength.cpp
Source/JavaScriptCore/b3/B3Validate.cpp
Source/JavaScriptCore/b3/B3Value.cpp
Source/JavaScriptCore/b3/B3Value.h
Source/JavaScriptCore/b3/B3ValueRep.h
Source/JavaScriptCore/b3/B3Width.cpp
Source/JavaScriptCore/b3/B3Width.h
Source/JavaScriptCore/b3/air/AirArg.cpp
Source/JavaScriptCore/b3/air/AirArg.h
Source/JavaScriptCore/b3/air/AirBasicBlock.cpp
Source/JavaScriptCore/b3/air/AirBasicBlock.h
Source/JavaScriptCore/b3/air/AirBlockInsertionSet.cpp [moved from Source/JavaScriptCore/b3/air/AirDumpAsJS.h with 78% similarity]
Source/JavaScriptCore/b3/air/AirBlockInsertionSet.h [new file with mode: 0644]
Source/JavaScriptCore/b3/air/AirDumpAsJS.cpp [deleted file]
Source/JavaScriptCore/b3/air/AirEliminateDeadCode.cpp
Source/JavaScriptCore/b3/air/AirGenerate.cpp
Source/JavaScriptCore/b3/air/AirInstInlines.h
Source/JavaScriptCore/b3/air/AirOpcode.opcodes
Source/JavaScriptCore/b3/air/AirOptimizeBlockOrder.cpp
Source/JavaScriptCore/b3/air/AirPadInterference.cpp
Source/JavaScriptCore/b3/air/AirSpillEverything.cpp
Source/JavaScriptCore/b3/air/opcode_generator.rb
Source/JavaScriptCore/b3/testb3.cpp
Source/JavaScriptCore/runtime/Options.h
Websites/webkit.org/ChangeLog
Websites/webkit.org/docs/b3/intermediate-representation.html

index 603dee9..8cb3b76 100644 (file)
@@ -64,6 +64,7 @@ set(JavaScriptCore_SOURCES
     API/OpaqueJSString.cpp
 
     assembler/ARMAssembler.cpp
+    assembler/AbstractMacroAssembler.cpp
     assembler/LinkBuffer.cpp
     assembler/MacroAssembler.cpp
     assembler/MacroAssemblerARM.cpp
@@ -76,12 +77,12 @@ set(JavaScriptCore_SOURCES
     b3/air/AirAllocateStack.cpp
     b3/air/AirArg.cpp
     b3/air/AirBasicBlock.cpp
+    b3/air/AirBlockInsertionSet.cpp
     b3/air/AirCCallSpecial.cpp
     b3/air/AirCCallingConvention.cpp
     b3/air/AirCode.cpp
     b3/air/AirCustom.cpp
     b3/air/AirDisassembler.cpp
-    b3/air/AirDumpAsJS.cpp
     b3/air/AirEliminateDeadCode.cpp
     b3/air/AirEmitShuffle.cpp
     b3/air/AirFixObviousSpills.cpp
@@ -110,6 +111,7 @@ set(JavaScriptCore_SOURCES
     b3/air/AirValidate.cpp
 
     b3/B3ArgumentRegValue.cpp
+    b3/B3AtomicValue.cpp
     b3/B3Bank.cpp
     b3/B3BasicBlock.cpp
     b3/B3BlockInsertionSet.cpp
index b49690b..22676d5 100644 (file)
@@ -1,3 +1,510 @@
+2017-03-04  Filip Pizlo  <fpizlo@apple.com>
+
+        B3 should have comprehensive support for atomic operations
+        https://bugs.webkit.org/show_bug.cgi?id=162349
+
+        Reviewed by Keith Miller.
+        
+        This adds the following capabilities to B3:
+        
+        - Atomic weak/strong unfenced/fenced compare-and-swap
+        - Atomic add/sub/or/and/xor/xchg
+        - Acquire/release fencing on loads/stores
+        - Fenceless load-load dependencies
+        
+        This adds lowering to the following instructions on x86:
+        
+        - lock cmpxchg
+        - lock xadd
+        - lock add/sub/or/and/xor/xchg
+        
+        This adds lowering to the following instructions on ARM64:
+        
+        - ldar and friends
+        - stlr and friends
+        - ldxr and friends (unfenced LL)
+        - stxr and friends (unfenced SC)
+        - ldaxr and friends (fenced LL)
+        - stlxr and friends (fenced SC)
+        - eor as a fenceless load-load dependency
+        
+        This does instruction selection pattern matching to ensure that weak/strong CAS and all of the
+        variants of fences and atomic math ops get lowered to the best possible instruction sequence.
+        For example, we support the Equal(AtomicStrongCAS(expected, ...), expected) pattern and a bunch
+        of its friends. You can say Branch(Equal(AtomicStrongCAS(expected, ...), expected)) and it will
+        generate the best possible branch sequence on x86 and ARM64.
+        
+        B3 now knows how to model all of the kinds of fencing. It knows that acq loads are ordered with
+        respect to each other and with respect to rel stores, creating sequential consistency that
+        transcends just the acq/rel fences themselves (see Effects::fence). It knows that the phantom
+        fence effects may only target some abstract heaps but not others, so that load elimination and
+        store sinking can still operate across fences if you just tell B3 that the fence does not alias
+        those accesses. This makes it super easy to teach B3 that some of your heap is thread-local.
+        Even better, it lets you express fine-grained dependencies where the atomics that affect one
+        property in shared memory do not clobber non-atomics that affect some other property in shared
+        memory.
+        
+        One of my favorite features is Depend, which allows you to express load-load dependencies. On
+        x86 it lowers to nothing, while on ARM64 it lowers to eor.
+        
+        This also exposes a common atomicWeakCAS API to the x86_64/ARM64 MacroAssemblers. Same for
+        acq/rel. JSC's 64-bit JITs are now a happy concurrency playground.
+        
+        This doesn't yet expose the functionality to JS or wasm. SAB still uses the non-intrinsic
+        implementations of the Atomics object, for now.
+        
+        * CMakeLists.txt:
+        * JavaScriptCore.xcodeproj/project.pbxproj:
+        * assembler/ARM64Assembler.h:
+        (JSC::ARM64Assembler::ldar):
+        (JSC::ARM64Assembler::ldxr):
+        (JSC::ARM64Assembler::ldaxr):
+        (JSC::ARM64Assembler::stxr):
+        (JSC::ARM64Assembler::stlr):
+        (JSC::ARM64Assembler::stlxr):
+        (JSC::ARM64Assembler::excepnGenerationImmMask):
+        (JSC::ARM64Assembler::exoticLoad):
+        (JSC::ARM64Assembler::storeRelease):
+        (JSC::ARM64Assembler::exoticStore):
+        * assembler/AbstractMacroAssembler.cpp: Added.
+        (WTF::printInternal):
+        * assembler/AbstractMacroAssembler.h:
+        (JSC::AbstractMacroAssemblerBase::invert):
+        * assembler/MacroAssembler.h:
+        * assembler/MacroAssemblerARM64.h:
+        (JSC::MacroAssemblerARM64::loadAcq8SignedExtendTo32):
+        (JSC::MacroAssemblerARM64::loadAcq8):
+        (JSC::MacroAssemblerARM64::storeRel8):
+        (JSC::MacroAssemblerARM64::loadAcq16SignedExtendTo32):
+        (JSC::MacroAssemblerARM64::loadAcq16):
+        (JSC::MacroAssemblerARM64::storeRel16):
+        (JSC::MacroAssemblerARM64::loadAcq32):
+        (JSC::MacroAssemblerARM64::loadAcq64):
+        (JSC::MacroAssemblerARM64::storeRel32):
+        (JSC::MacroAssemblerARM64::storeRel64):
+        (JSC::MacroAssemblerARM64::loadLink8):
+        (JSC::MacroAssemblerARM64::loadLinkAcq8):
+        (JSC::MacroAssemblerARM64::storeCond8):
+        (JSC::MacroAssemblerARM64::storeCondRel8):
+        (JSC::MacroAssemblerARM64::loadLink16):
+        (JSC::MacroAssemblerARM64::loadLinkAcq16):
+        (JSC::MacroAssemblerARM64::storeCond16):
+        (JSC::MacroAssemblerARM64::storeCondRel16):
+        (JSC::MacroAssemblerARM64::loadLink32):
+        (JSC::MacroAssemblerARM64::loadLinkAcq32):
+        (JSC::MacroAssemblerARM64::storeCond32):
+        (JSC::MacroAssemblerARM64::storeCondRel32):
+        (JSC::MacroAssemblerARM64::loadLink64):
+        (JSC::MacroAssemblerARM64::loadLinkAcq64):
+        (JSC::MacroAssemblerARM64::storeCond64):
+        (JSC::MacroAssemblerARM64::storeCondRel64):
+        (JSC::MacroAssemblerARM64::atomicStrongCAS8):
+        (JSC::MacroAssemblerARM64::atomicStrongCAS16):
+        (JSC::MacroAssemblerARM64::atomicStrongCAS32):
+        (JSC::MacroAssemblerARM64::atomicStrongCAS64):
+        (JSC::MacroAssemblerARM64::atomicRelaxedStrongCAS8):
+        (JSC::MacroAssemblerARM64::atomicRelaxedStrongCAS16):
+        (JSC::MacroAssemblerARM64::atomicRelaxedStrongCAS32):
+        (JSC::MacroAssemblerARM64::atomicRelaxedStrongCAS64):
+        (JSC::MacroAssemblerARM64::branchAtomicWeakCAS8):
+        (JSC::MacroAssemblerARM64::branchAtomicWeakCAS16):
+        (JSC::MacroAssemblerARM64::branchAtomicWeakCAS32):
+        (JSC::MacroAssemblerARM64::branchAtomicWeakCAS64):
+        (JSC::MacroAssemblerARM64::branchAtomicRelaxedWeakCAS8):
+        (JSC::MacroAssemblerARM64::branchAtomicRelaxedWeakCAS16):
+        (JSC::MacroAssemblerARM64::branchAtomicRelaxedWeakCAS32):
+        (JSC::MacroAssemblerARM64::branchAtomicRelaxedWeakCAS64):
+        (JSC::MacroAssemblerARM64::depend32):
+        (JSC::MacroAssemblerARM64::depend64):
+        (JSC::MacroAssemblerARM64::loadLink):
+        (JSC::MacroAssemblerARM64::loadLinkAcq):
+        (JSC::MacroAssemblerARM64::storeCond):
+        (JSC::MacroAssemblerARM64::storeCondRel):
+        (JSC::MacroAssemblerARM64::signExtend):
+        (JSC::MacroAssemblerARM64::branch):
+        (JSC::MacroAssemblerARM64::atomicStrongCAS):
+        (JSC::MacroAssemblerARM64::atomicRelaxedStrongCAS):
+        (JSC::MacroAssemblerARM64::branchAtomicWeakCAS):
+        (JSC::MacroAssemblerARM64::branchAtomicRelaxedWeakCAS):
+        (JSC::MacroAssemblerARM64::extractSimpleAddress):
+        (JSC::MacroAssemblerARM64::signExtend<8>):
+        (JSC::MacroAssemblerARM64::signExtend<16>):
+        (JSC::MacroAssemblerARM64::branch<64>):
+        * assembler/MacroAssemblerX86Common.h:
+        (JSC::MacroAssemblerX86Common::add32):
+        (JSC::MacroAssemblerX86Common::and32):
+        (JSC::MacroAssemblerX86Common::and16):
+        (JSC::MacroAssemblerX86Common::and8):
+        (JSC::MacroAssemblerX86Common::neg32):
+        (JSC::MacroAssemblerX86Common::neg16):
+        (JSC::MacroAssemblerX86Common::neg8):
+        (JSC::MacroAssemblerX86Common::or32):
+        (JSC::MacroAssemblerX86Common::or16):
+        (JSC::MacroAssemblerX86Common::or8):
+        (JSC::MacroAssemblerX86Common::sub16):
+        (JSC::MacroAssemblerX86Common::sub8):
+        (JSC::MacroAssemblerX86Common::sub32):
+        (JSC::MacroAssemblerX86Common::xor32):
+        (JSC::MacroAssemblerX86Common::xor16):
+        (JSC::MacroAssemblerX86Common::xor8):
+        (JSC::MacroAssemblerX86Common::not32):
+        (JSC::MacroAssemblerX86Common::not16):
+        (JSC::MacroAssemblerX86Common::not8):
+        (JSC::MacroAssemblerX86Common::store16):
+        (JSC::MacroAssemblerX86Common::atomicStrongCAS8):
+        (JSC::MacroAssemblerX86Common::atomicStrongCAS16):
+        (JSC::MacroAssemblerX86Common::atomicStrongCAS32):
+        (JSC::MacroAssemblerX86Common::branchAtomicStrongCAS8):
+        (JSC::MacroAssemblerX86Common::branchAtomicStrongCAS16):
+        (JSC::MacroAssemblerX86Common::branchAtomicStrongCAS32):
+        (JSC::MacroAssemblerX86Common::atomicWeakCAS8):
+        (JSC::MacroAssemblerX86Common::atomicWeakCAS16):
+        (JSC::MacroAssemblerX86Common::atomicWeakCAS32):
+        (JSC::MacroAssemblerX86Common::branchAtomicWeakCAS8):
+        (JSC::MacroAssemblerX86Common::branchAtomicWeakCAS16):
+        (JSC::MacroAssemblerX86Common::branchAtomicWeakCAS32):
+        (JSC::MacroAssemblerX86Common::atomicRelaxedWeakCAS8):
+        (JSC::MacroAssemblerX86Common::atomicRelaxedWeakCAS16):
+        (JSC::MacroAssemblerX86Common::atomicRelaxedWeakCAS32):
+        (JSC::MacroAssemblerX86Common::branchAtomicRelaxedWeakCAS8):
+        (JSC::MacroAssemblerX86Common::branchAtomicRelaxedWeakCAS16):
+        (JSC::MacroAssemblerX86Common::branchAtomicRelaxedWeakCAS32):
+        (JSC::MacroAssemblerX86Common::atomicAdd8):
+        (JSC::MacroAssemblerX86Common::atomicAdd16):
+        (JSC::MacroAssemblerX86Common::atomicAdd32):
+        (JSC::MacroAssemblerX86Common::atomicSub8):
+        (JSC::MacroAssemblerX86Common::atomicSub16):
+        (JSC::MacroAssemblerX86Common::atomicSub32):
+        (JSC::MacroAssemblerX86Common::atomicAnd8):
+        (JSC::MacroAssemblerX86Common::atomicAnd16):
+        (JSC::MacroAssemblerX86Common::atomicAnd32):
+        (JSC::MacroAssemblerX86Common::atomicOr8):
+        (JSC::MacroAssemblerX86Common::atomicOr16):
+        (JSC::MacroAssemblerX86Common::atomicOr32):
+        (JSC::MacroAssemblerX86Common::atomicXor8):
+        (JSC::MacroAssemblerX86Common::atomicXor16):
+        (JSC::MacroAssemblerX86Common::atomicXor32):
+        (JSC::MacroAssemblerX86Common::atomicNeg8):
+        (JSC::MacroAssemblerX86Common::atomicNeg16):
+        (JSC::MacroAssemblerX86Common::atomicNeg32):
+        (JSC::MacroAssemblerX86Common::atomicNot8):
+        (JSC::MacroAssemblerX86Common::atomicNot16):
+        (JSC::MacroAssemblerX86Common::atomicNot32):
+        (JSC::MacroAssemblerX86Common::atomicXchgAdd8):
+        (JSC::MacroAssemblerX86Common::atomicXchgAdd16):
+        (JSC::MacroAssemblerX86Common::atomicXchgAdd32):
+        (JSC::MacroAssemblerX86Common::atomicXchg8):
+        (JSC::MacroAssemblerX86Common::atomicXchg16):
+        (JSC::MacroAssemblerX86Common::atomicXchg32):
+        (JSC::MacroAssemblerX86Common::loadAcq8):
+        (JSC::MacroAssemblerX86Common::loadAcq8SignedExtendTo32):
+        (JSC::MacroAssemblerX86Common::loadAcq16):
+        (JSC::MacroAssemblerX86Common::loadAcq16SignedExtendTo32):
+        (JSC::MacroAssemblerX86Common::loadAcq32):
+        (JSC::MacroAssemblerX86Common::storeRel8):
+        (JSC::MacroAssemblerX86Common::storeRel16):
+        (JSC::MacroAssemblerX86Common::storeRel32):
+        (JSC::MacroAssemblerX86Common::storeFence):
+        (JSC::MacroAssemblerX86Common::loadFence):
+        (JSC::MacroAssemblerX86Common::replaceWithJump):
+        (JSC::MacroAssemblerX86Common::maxJumpReplacementSize):
+        (JSC::MacroAssemblerX86Common::patchableJumpSize):
+        (JSC::MacroAssemblerX86Common::supportsFloatingPointRounding):
+        (JSC::MacroAssemblerX86Common::supportsAVX):
+        (JSC::MacroAssemblerX86Common::updateEax1EcxFlags):
+        (JSC::MacroAssemblerX86Common::x86Condition):
+        (JSC::MacroAssemblerX86Common::atomicStrongCAS):
+        (JSC::MacroAssemblerX86Common::branchAtomicStrongCAS):
+        * assembler/MacroAssemblerX86_64.h:
+        (JSC::MacroAssemblerX86_64::add64):
+        (JSC::MacroAssemblerX86_64::and64):
+        (JSC::MacroAssemblerX86_64::neg64):
+        (JSC::MacroAssemblerX86_64::or64):
+        (JSC::MacroAssemblerX86_64::sub64):
+        (JSC::MacroAssemblerX86_64::xor64):
+        (JSC::MacroAssemblerX86_64::not64):
+        (JSC::MacroAssemblerX86_64::store64):
+        (JSC::MacroAssemblerX86_64::atomicStrongCAS64):
+        (JSC::MacroAssemblerX86_64::branchAtomicStrongCAS64):
+        (JSC::MacroAssemblerX86_64::atomicWeakCAS64):
+        (JSC::MacroAssemblerX86_64::branchAtomicWeakCAS64):
+        (JSC::MacroAssemblerX86_64::atomicRelaxedWeakCAS64):
+        (JSC::MacroAssemblerX86_64::branchAtomicRelaxedWeakCAS64):
+        (JSC::MacroAssemblerX86_64::atomicAdd64):
+        (JSC::MacroAssemblerX86_64::atomicSub64):
+        (JSC::MacroAssemblerX86_64::atomicAnd64):
+        (JSC::MacroAssemblerX86_64::atomicOr64):
+        (JSC::MacroAssemblerX86_64::atomicXor64):
+        (JSC::MacroAssemblerX86_64::atomicNeg64):
+        (JSC::MacroAssemblerX86_64::atomicNot64):
+        (JSC::MacroAssemblerX86_64::atomicXchgAdd64):
+        (JSC::MacroAssemblerX86_64::atomicXchg64):
+        (JSC::MacroAssemblerX86_64::loadAcq64):
+        (JSC::MacroAssemblerX86_64::storeRel64):
+        * assembler/X86Assembler.h:
+        (JSC::X86Assembler::addl_mr):
+        (JSC::X86Assembler::addq_mr):
+        (JSC::X86Assembler::addq_rm):
+        (JSC::X86Assembler::addq_im):
+        (JSC::X86Assembler::andl_mr):
+        (JSC::X86Assembler::andl_rm):
+        (JSC::X86Assembler::andw_rm):
+        (JSC::X86Assembler::andb_rm):
+        (JSC::X86Assembler::andl_im):
+        (JSC::X86Assembler::andw_im):
+        (JSC::X86Assembler::andb_im):
+        (JSC::X86Assembler::andq_mr):
+        (JSC::X86Assembler::andq_rm):
+        (JSC::X86Assembler::andq_im):
+        (JSC::X86Assembler::incq_m):
+        (JSC::X86Assembler::negq_m):
+        (JSC::X86Assembler::negl_m):
+        (JSC::X86Assembler::negw_m):
+        (JSC::X86Assembler::negb_m):
+        (JSC::X86Assembler::notl_m):
+        (JSC::X86Assembler::notw_m):
+        (JSC::X86Assembler::notb_m):
+        (JSC::X86Assembler::notq_m):
+        (JSC::X86Assembler::orl_mr):
+        (JSC::X86Assembler::orl_rm):
+        (JSC::X86Assembler::orw_rm):
+        (JSC::X86Assembler::orb_rm):
+        (JSC::X86Assembler::orl_im):
+        (JSC::X86Assembler::orw_im):
+        (JSC::X86Assembler::orb_im):
+        (JSC::X86Assembler::orq_mr):
+        (JSC::X86Assembler::orq_rm):
+        (JSC::X86Assembler::orq_im):
+        (JSC::X86Assembler::subl_mr):
+        (JSC::X86Assembler::subl_rm):
+        (JSC::X86Assembler::subw_rm):
+        (JSC::X86Assembler::subb_rm):
+        (JSC::X86Assembler::subl_im):
+        (JSC::X86Assembler::subw_im):
+        (JSC::X86Assembler::subb_im):
+        (JSC::X86Assembler::subq_mr):
+        (JSC::X86Assembler::subq_rm):
+        (JSC::X86Assembler::subq_im):
+        (JSC::X86Assembler::xorl_mr):
+        (JSC::X86Assembler::xorl_rm):
+        (JSC::X86Assembler::xorl_im):
+        (JSC::X86Assembler::xorw_rm):
+        (JSC::X86Assembler::xorw_im):
+        (JSC::X86Assembler::xorb_rm):
+        (JSC::X86Assembler::xorb_im):
+        (JSC::X86Assembler::xorq_im):
+        (JSC::X86Assembler::xorq_rm):
+        (JSC::X86Assembler::xorq_mr):
+        (JSC::X86Assembler::xchgb_rm):
+        (JSC::X86Assembler::xchgw_rm):
+        (JSC::X86Assembler::xchgl_rm):
+        (JSC::X86Assembler::xchgq_rm):
+        (JSC::X86Assembler::movw_im):
+        (JSC::X86Assembler::movq_i32m):
+        (JSC::X86Assembler::cmpxchgb_rm):
+        (JSC::X86Assembler::cmpxchgw_rm):
+        (JSC::X86Assembler::cmpxchgl_rm):
+        (JSC::X86Assembler::cmpxchgq_rm):
+        (JSC::X86Assembler::xaddb_rm):
+        (JSC::X86Assembler::xaddw_rm):
+        (JSC::X86Assembler::xaddl_rm):
+        (JSC::X86Assembler::xaddq_rm):
+        (JSC::X86Assembler::X86InstructionFormatter::SingleInstructionBufferWriter::memoryModRM):
+        * b3/B3AtomicValue.cpp: Added.
+        (JSC::B3::AtomicValue::~AtomicValue):
+        (JSC::B3::AtomicValue::dumpMeta):
+        (JSC::B3::AtomicValue::cloneImpl):
+        (JSC::B3::AtomicValue::AtomicValue):
+        * b3/B3AtomicValue.h: Added.
+        * b3/B3BasicBlock.h:
+        * b3/B3BlockInsertionSet.cpp:
+        (JSC::B3::BlockInsertionSet::BlockInsertionSet):
+        (JSC::B3::BlockInsertionSet::insert): Deleted.
+        (JSC::B3::BlockInsertionSet::insertBefore): Deleted.
+        (JSC::B3::BlockInsertionSet::insertAfter): Deleted.
+        (JSC::B3::BlockInsertionSet::execute): Deleted.
+        * b3/B3BlockInsertionSet.h:
+        * b3/B3Effects.cpp:
+        (JSC::B3::Effects::interferes):
+        (JSC::B3::Effects::operator==):
+        (JSC::B3::Effects::dump):
+        * b3/B3Effects.h:
+        (JSC::B3::Effects::forCall):
+        (JSC::B3::Effects::mustExecute):
+        * b3/B3EliminateCommonSubexpressions.cpp:
+        * b3/B3Generate.cpp:
+        (JSC::B3::generateToAir):
+        * b3/B3GenericBlockInsertionSet.h: Added.
+        (JSC::B3::GenericBlockInsertionSet::GenericBlockInsertionSet):
+        (JSC::B3::GenericBlockInsertionSet::insert):
+        (JSC::B3::GenericBlockInsertionSet::insertBefore):
+        (JSC::B3::GenericBlockInsertionSet::insertAfter):
+        (JSC::B3::GenericBlockInsertionSet::execute):
+        * b3/B3HeapRange.h:
+        (JSC::B3::HeapRange::operator|):
+        * b3/B3InsertionSet.cpp:
+        (JSC::B3::InsertionSet::insertClone):
+        * b3/B3InsertionSet.h:
+        * b3/B3LegalizeMemoryOffsets.cpp:
+        * b3/B3LowerMacros.cpp:
+        (JSC::B3::lowerMacros):
+        * b3/B3LowerMacrosAfterOptimizations.cpp:
+        * b3/B3LowerToAir.cpp:
+        (JSC::B3::Air::LowerToAir::LowerToAir):
+        (JSC::B3::Air::LowerToAir::run):
+        (JSC::B3::Air::LowerToAir::effectiveAddr):
+        (JSC::B3::Air::LowerToAir::addr):
+        (JSC::B3::Air::LowerToAir::loadPromiseAnyOpcode):
+        (JSC::B3::Air::LowerToAir::appendShift):
+        (JSC::B3::Air::LowerToAir::tryAppendStoreBinOp):
+        (JSC::B3::Air::LowerToAir::storeOpcode):
+        (JSC::B3::Air::LowerToAir::createStore):
+        (JSC::B3::Air::LowerToAir::finishAppendingInstructions):
+        (JSC::B3::Air::LowerToAir::newBlock):
+        (JSC::B3::Air::LowerToAir::splitBlock):
+        (JSC::B3::Air::LowerToAir::fillStackmap):
+        (JSC::B3::Air::LowerToAir::appendX86Div):
+        (JSC::B3::Air::LowerToAir::appendX86UDiv):
+        (JSC::B3::Air::LowerToAir::loadLinkOpcode):
+        (JSC::B3::Air::LowerToAir::storeCondOpcode):
+        (JSC::B3::Air::LowerToAir::appendCAS):
+        (JSC::B3::Air::LowerToAir::appendVoidAtomic):
+        (JSC::B3::Air::LowerToAir::appendGeneralAtomic):
+        (JSC::B3::Air::LowerToAir::lower):
+        (JSC::B3::Air::LowerToAir::lowerX86Div): Deleted.
+        (JSC::B3::Air::LowerToAir::lowerX86UDiv): Deleted.
+        * b3/B3LowerToAir.h:
+        * b3/B3MemoryValue.cpp:
+        (JSC::B3::MemoryValue::isLegalOffset):
+        (JSC::B3::MemoryValue::accessType):
+        (JSC::B3::MemoryValue::accessBank):
+        (JSC::B3::MemoryValue::accessByteSize):
+        (JSC::B3::MemoryValue::dumpMeta):
+        (JSC::B3::MemoryValue::MemoryValue):
+        (JSC::B3::MemoryValue::accessWidth): Deleted.
+        * b3/B3MemoryValue.h:
+        * b3/B3MemoryValueInlines.h: Added.
+        (JSC::B3::MemoryValue::isLegalOffset):
+        (JSC::B3::MemoryValue::requiresSimpleAddr):
+        (JSC::B3::MemoryValue::accessWidth):
+        * b3/B3MoveConstants.cpp:
+        * b3/B3NativeTraits.h: Added.
+        * b3/B3Opcode.cpp:
+        (JSC::B3::storeOpcode):
+        (WTF::printInternal):
+        * b3/B3Opcode.h:
+        (JSC::B3::isLoad):
+        (JSC::B3::isStore):
+        (JSC::B3::isLoadStore):
+        (JSC::B3::isAtomic):
+        (JSC::B3::isAtomicCAS):
+        (JSC::B3::isAtomicXchg):
+        (JSC::B3::isMemoryAccess):
+        (JSC::B3::signExtendOpcode):
+        * b3/B3Procedure.cpp:
+        (JSC::B3::Procedure::dump):
+        * b3/B3Procedure.h:
+        (JSC::B3::Procedure::hasQuirks):
+        (JSC::B3::Procedure::setHasQuirks):
+        * b3/B3PureCSE.cpp:
+        (JSC::B3::pureCSE):
+        * b3/B3PureCSE.h:
+        * b3/B3ReduceStrength.cpp:
+        * b3/B3Validate.cpp:
+        * b3/B3Value.cpp:
+        (JSC::B3::Value::returnsBool):
+        (JSC::B3::Value::effects):
+        (JSC::B3::Value::key):
+        (JSC::B3::Value::performSubstitution):
+        (JSC::B3::Value::typeFor):
+        * b3/B3Value.h:
+        * b3/B3Width.cpp:
+        (JSC::B3::bestType):
+        * b3/B3Width.h:
+        (JSC::B3::canonicalWidth):
+        (JSC::B3::isCanonicalWidth):
+        (JSC::B3::mask):
+        * b3/air/AirArg.cpp:
+        (JSC::B3::Air::Arg::jsHash):
+        (JSC::B3::Air::Arg::dump):
+        (WTF::printInternal):
+        * b3/air/AirArg.h:
+        (JSC::B3::Air::Arg::isAnyUse):
+        (JSC::B3::Air::Arg::isColdUse):
+        (JSC::B3::Air::Arg::cooled):
+        (JSC::B3::Air::Arg::isEarlyUse):
+        (JSC::B3::Air::Arg::isLateUse):
+        (JSC::B3::Air::Arg::isAnyDef):
+        (JSC::B3::Air::Arg::isEarlyDef):
+        (JSC::B3::Air::Arg::isLateDef):
+        (JSC::B3::Air::Arg::isZDef):
+        (JSC::B3::Air::Arg::simpleAddr):
+        (JSC::B3::Air::Arg::statusCond):
+        (JSC::B3::Air::Arg::isSimpleAddr):
+        (JSC::B3::Air::Arg::isMemory):
+        (JSC::B3::Air::Arg::isStatusCond):
+        (JSC::B3::Air::Arg::isCondition):
+        (JSC::B3::Air::Arg::ptr):
+        (JSC::B3::Air::Arg::base):
+        (JSC::B3::Air::Arg::isGP):
+        (JSC::B3::Air::Arg::isFP):
+        (JSC::B3::Air::Arg::isValidForm):
+        (JSC::B3::Air::Arg::forEachTmpFast):
+        (JSC::B3::Air::Arg::forEachTmp):
+        (JSC::B3::Air::Arg::asAddress):
+        (JSC::B3::Air::Arg::asStatusCondition):
+        (JSC::B3::Air::Arg::isInvertible):
+        (JSC::B3::Air::Arg::inverted):
+        * b3/air/AirBasicBlock.cpp:
+        (JSC::B3::Air::BasicBlock::setSuccessors):
+        * b3/air/AirBasicBlock.h:
+        * b3/air/AirBlockInsertionSet.cpp: Added.
+        (JSC::B3::Air::BlockInsertionSet::BlockInsertionSet):
+        (JSC::B3::Air::BlockInsertionSet::~BlockInsertionSet):
+        * b3/air/AirBlockInsertionSet.h: Added.
+        * b3/air/AirDumpAsJS.cpp: Removed.
+        * b3/air/AirDumpAsJS.h: Removed.
+        * b3/air/AirEliminateDeadCode.cpp:
+        (JSC::B3::Air::eliminateDeadCode):
+        * b3/air/AirGenerate.cpp:
+        (JSC::B3::Air::prepareForGeneration):
+        * b3/air/AirInstInlines.h:
+        (JSC::B3::Air::isAtomicStrongCASValid):
+        (JSC::B3::Air::isBranchAtomicStrongCASValid):
+        (JSC::B3::Air::isAtomicStrongCAS8Valid):
+        (JSC::B3::Air::isAtomicStrongCAS16Valid):
+        (JSC::B3::Air::isAtomicStrongCAS32Valid):
+        (JSC::B3::Air::isAtomicStrongCAS64Valid):
+        (JSC::B3::Air::isBranchAtomicStrongCAS8Valid):
+        (JSC::B3::Air::isBranchAtomicStrongCAS16Valid):
+        (JSC::B3::Air::isBranchAtomicStrongCAS32Valid):
+        (JSC::B3::Air::isBranchAtomicStrongCAS64Valid):
+        * b3/air/AirOpcode.opcodes:
+        * b3/air/AirOptimizeBlockOrder.cpp:
+        (JSC::B3::Air::optimizeBlockOrder):
+        * b3/air/AirPadInterference.cpp:
+        (JSC::B3::Air::padInterference):
+        * b3/air/AirSpillEverything.cpp:
+        (JSC::B3::Air::spillEverything):
+        * b3/air/opcode_generator.rb:
+        * b3/testb3.cpp:
+        (JSC::B3::testLoadAcq42):
+        (JSC::B3::testStoreRelAddLoadAcq32):
+        (JSC::B3::testStoreRelAddLoadAcq8):
+        (JSC::B3::testStoreRelAddFenceLoadAcq8):
+        (JSC::B3::testStoreRelAddLoadAcq16):
+        (JSC::B3::testStoreRelAddLoadAcq64):
+        (JSC::B3::testTrappingStoreElimination):
+        (JSC::B3::testX86LeaAddAdd):
+        (JSC::B3::testX86LeaAddShlLeftScale1):
+        (JSC::B3::testAtomicWeakCAS):
+        (JSC::B3::testAtomicStrongCAS):
+        (JSC::B3::testAtomicXchg):
+        (JSC::B3::testDepend32):
+        (JSC::B3::testDepend64):
+        (JSC::B3::run):
+        * runtime/Options.h:
+
 2017-03-10  Csaba Osztrogonác  <ossy@webkit.org>
 
         Unreviewed typo fixes after r213652.
index 3e8c274..1cc3cf1 100644 (file)
                0F2C63B01E60AE4300C13839 /* B3Bank.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F2C63AC1E60AE3C00C13839 /* B3Bank.h */; };
                0F2C63B11E60AE4500C13839 /* B3Width.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F2C63AD1E60AE3C00C13839 /* B3Width.cpp */; };
                0F2C63B21E60AE4700C13839 /* B3Width.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F2C63AE1E60AE3D00C13839 /* B3Width.h */; };
+               0F2C63B61E6343EA00C13839 /* B3AtomicValue.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F2C63B31E6343E800C13839 /* B3AtomicValue.cpp */; };
+               0F2C63B71E6343ED00C13839 /* B3AtomicValue.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F2C63B41E6343E800C13839 /* B3AtomicValue.h */; };
+               0F2C63B81E6343F700C13839 /* B3GenericBlockInsertionSet.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F2C63B51E6343E800C13839 /* B3GenericBlockInsertionSet.h */; };
+               0F2C63BB1E63440A00C13839 /* AirBlockInsertionSet.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F2C63B91E63440800C13839 /* AirBlockInsertionSet.cpp */; };
+               0F2C63BC1E63440C00C13839 /* AirBlockInsertionSet.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F2C63BA1E63440800C13839 /* AirBlockInsertionSet.h */; };
+               0F2C63C01E660EA700C13839 /* AbstractMacroAssembler.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F2C63BF1E660EA500C13839 /* AbstractMacroAssembler.cpp */; };
+               0F2C63C21E664A5C00C13839 /* B3NativeTraits.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F2C63C11E664A5A00C13839 /* B3NativeTraits.h */; };
+               0F2C63C41E69EF9400C13839 /* B3MemoryValueInlines.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F2C63C31E69EF9200C13839 /* B3MemoryValueInlines.h */; };
                0F2D4DDD19832D34007D4B19 /* DebuggerScope.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F2D4DDB19832D34007D4B19 /* DebuggerScope.cpp */; };
                0F2D4DDE19832D34007D4B19 /* DebuggerScope.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F2D4DDC19832D34007D4B19 /* DebuggerScope.h */; settings = {ATTRIBUTES = (Private, ); }; };
                0F2D4DE819832DAC007D4B19 /* ToThisStatus.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F2D4DE519832DAC007D4B19 /* ToThisStatus.cpp */; };
                DC2143081CA32E58000A8869 /* ICStats.cpp in Sources */ = {isa = PBXBuildFile; fileRef = DC2143051CA32E52000A8869 /* ICStats.cpp */; };
                DC3D2B0A1D34316200BA918C /* HeapCell.h in Headers */ = {isa = PBXBuildFile; fileRef = DC3D2B091D34316100BA918C /* HeapCell.h */; settings = {ATTRIBUTES = (Private, ); }; };
                DC3D2B0C1D34377000BA918C /* HeapCell.cpp in Sources */ = {isa = PBXBuildFile; fileRef = DC3D2B0B1D34376E00BA918C /* HeapCell.cpp */; };
-               DC454B8C1D00E822004C18AF /* AirDumpAsJS.cpp in Sources */ = {isa = PBXBuildFile; fileRef = DC454B8A1D00E81F004C18AF /* AirDumpAsJS.cpp */; };
-               DC454B8D1D00E824004C18AF /* AirDumpAsJS.h in Headers */ = {isa = PBXBuildFile; fileRef = DC454B8B1D00E81F004C18AF /* AirDumpAsJS.h */; };
                DC605B5D1CE26EA000593718 /* ProfilerEvent.cpp in Sources */ = {isa = PBXBuildFile; fileRef = DC605B591CE26E9800593718 /* ProfilerEvent.cpp */; };
                DC605B5E1CE26EA200593718 /* ProfilerEvent.h in Headers */ = {isa = PBXBuildFile; fileRef = DC605B5A1CE26E9800593718 /* ProfilerEvent.h */; settings = {ATTRIBUTES = (Private, ); }; };
                DC605B5F1CE26EA500593718 /* ProfilerUID.cpp in Sources */ = {isa = PBXBuildFile; fileRef = DC605B5B1CE26E9800593718 /* ProfilerUID.cpp */; };
                0F2C63AC1E60AE3C00C13839 /* B3Bank.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = B3Bank.h; path = b3/B3Bank.h; sourceTree = "<group>"; };
                0F2C63AD1E60AE3C00C13839 /* B3Width.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = B3Width.cpp; path = b3/B3Width.cpp; sourceTree = "<group>"; };
                0F2C63AE1E60AE3D00C13839 /* B3Width.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = B3Width.h; path = b3/B3Width.h; sourceTree = "<group>"; };
+               0F2C63B31E6343E800C13839 /* B3AtomicValue.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = B3AtomicValue.cpp; path = b3/B3AtomicValue.cpp; sourceTree = "<group>"; };
+               0F2C63B41E6343E800C13839 /* B3AtomicValue.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = B3AtomicValue.h; path = b3/B3AtomicValue.h; sourceTree = "<group>"; };
+               0F2C63B51E6343E800C13839 /* B3GenericBlockInsertionSet.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = B3GenericBlockInsertionSet.h; path = b3/B3GenericBlockInsertionSet.h; sourceTree = "<group>"; };
+               0F2C63B91E63440800C13839 /* AirBlockInsertionSet.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirBlockInsertionSet.cpp; path = b3/air/AirBlockInsertionSet.cpp; sourceTree = "<group>"; };
+               0F2C63BA1E63440800C13839 /* AirBlockInsertionSet.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirBlockInsertionSet.h; path = b3/air/AirBlockInsertionSet.h; sourceTree = "<group>"; };
+               0F2C63BF1E660EA500C13839 /* AbstractMacroAssembler.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = AbstractMacroAssembler.cpp; sourceTree = "<group>"; };
+               0F2C63C11E664A5A00C13839 /* B3NativeTraits.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = B3NativeTraits.h; path = b3/B3NativeTraits.h; sourceTree = "<group>"; };
+               0F2C63C31E69EF9200C13839 /* B3MemoryValueInlines.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = B3MemoryValueInlines.h; path = b3/B3MemoryValueInlines.h; sourceTree = "<group>"; };
                0F2D4DDB19832D34007D4B19 /* DebuggerScope.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = DebuggerScope.cpp; sourceTree = "<group>"; };
                0F2D4DDC19832D34007D4B19 /* DebuggerScope.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = DebuggerScope.h; sourceTree = "<group>"; };
                0F2D4DDF19832D91007D4B19 /* TypeProfilerLog.cpp */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.cpp.cpp; path = TypeProfilerLog.cpp; sourceTree = "<group>"; };
                DC2143061CA32E52000A8869 /* ICStats.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = ICStats.h; sourceTree = "<group>"; };
                DC3D2B091D34316100BA918C /* HeapCell.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = HeapCell.h; sourceTree = "<group>"; };
                DC3D2B0B1D34376E00BA918C /* HeapCell.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = HeapCell.cpp; sourceTree = "<group>"; };
-               DC454B8A1D00E81F004C18AF /* AirDumpAsJS.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirDumpAsJS.cpp; path = b3/air/AirDumpAsJS.cpp; sourceTree = "<group>"; };
-               DC454B8B1D00E81F004C18AF /* AirDumpAsJS.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirDumpAsJS.h; path = b3/air/AirDumpAsJS.h; sourceTree = "<group>"; };
                DC605B591CE26E9800593718 /* ProfilerEvent.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = ProfilerEvent.cpp; path = profiler/ProfilerEvent.cpp; sourceTree = "<group>"; };
                DC605B5A1CE26E9800593718 /* ProfilerEvent.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = ProfilerEvent.h; path = profiler/ProfilerEvent.h; sourceTree = "<group>"; };
                DC605B5B1CE26E9800593718 /* ProfilerUID.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = ProfilerUID.cpp; path = profiler/ProfilerUID.cpp; sourceTree = "<group>"; };
                                0FEC84B31BDACD880080FF74 /* air */,
                                0FEC84B41BDACDAC0080FF74 /* B3ArgumentRegValue.cpp */,
                                0FEC84B51BDACDAC0080FF74 /* B3ArgumentRegValue.h */,
+                               0F2C63B31E6343E800C13839 /* B3AtomicValue.cpp */,
+                               0F2C63B41E6343E800C13839 /* B3AtomicValue.h */,
                                0F2C63AB1E60AE3C00C13839 /* B3Bank.cpp */,
                                0F2C63AC1E60AE3C00C13839 /* B3Bank.h */,
                                0FEC84B61BDACDAC0080FF74 /* B3BasicBlock.cpp */,
                                0FEC84CD1BDACDAC0080FF74 /* B3FrequentedBlock.h */,
                                0FEC84CE1BDACDAC0080FF74 /* B3Generate.cpp */,
                                0FEC84CF1BDACDAC0080FF74 /* B3Generate.h */,
+                               0F2C63B51E6343E800C13839 /* B3GenericBlockInsertionSet.h */,
                                0FEC84D01BDACDAC0080FF74 /* B3GenericFrequentedBlock.h */,
                                0FEC85BF1BE167A00080FF74 /* B3HeapRange.cpp */,
                                0FEC85C01BE167A00080FF74 /* B3HeapRange.h */,
                                43AB26C51C1A52F700D82AE6 /* B3MathExtras.h */,
                                0FEC84D51BDACDAC0080FF74 /* B3MemoryValue.cpp */,
                                0FEC84D61BDACDAC0080FF74 /* B3MemoryValue.h */,
+                               0F2C63C31E69EF9200C13839 /* B3MemoryValueInlines.h */,
                                0F338E031BF0276C0013C88F /* B3MoveConstants.cpp */,
                                0F338E041BF0276C0013C88F /* B3MoveConstants.h */,
+                               0F2C63C11E664A5A00C13839 /* B3NativeTraits.h */,
                                0F338E051BF0276C0013C88F /* B3OpaqueByproduct.h */,
                                0F338E061BF0276C0013C88F /* B3OpaqueByproducts.cpp */,
                                0F338E071BF0276C0013C88F /* B3OpaqueByproducts.h */,
                                0F64EAF21C4ECD0600621E9B /* AirArgInlines.h */,
                                0FEC854C1BDACDC70080FF74 /* AirBasicBlock.cpp */,
                                0FEC854D1BDACDC70080FF74 /* AirBasicBlock.h */,
+                               0F2C63B91E63440800C13839 /* AirBlockInsertionSet.cpp */,
+                               0F2C63BA1E63440800C13839 /* AirBlockInsertionSet.h */,
                                0FB3878B1BFBC44D00E3AB1E /* AirBlockWorklist.h */,
                                0F6183201C45BF070072450B /* AirCCallingConvention.cpp */,
                                0F6183211C45BF070072450B /* AirCCallingConvention.h */,
                                0F10F1A21C420BF0001C07D2 /* AirCustom.h */,
                                79ABB17B1E5CCB570045B9A6 /* AirDisassembler.cpp */,
                                79ABB17C1E5CCB570045B9A6 /* AirDisassembler.h */,
-                               DC454B8A1D00E81F004C18AF /* AirDumpAsJS.cpp */,
-                               DC454B8B1D00E81F004C18AF /* AirDumpAsJS.h */,
                                0F4570361BE44C910062A629 /* AirEliminateDeadCode.cpp */,
                                0F4570371BE44C910062A629 /* AirEliminateDeadCode.h */,
                                0F6183231C45BF070072450B /* AirEmitShuffle.cpp */,
                        isa = PBXGroup;
                        children = (
                                0F1FE51B1922A3BC006987C5 /* AbortReason.h */,
+                               0F2C63BF1E660EA500C13839 /* AbstractMacroAssembler.cpp */,
                                860161DF0F3A83C100F84710 /* AbstractMacroAssembler.h */,
                                0F3730901C0CD70C00052BFA /* AllowMacroScratchRegisterUsage.h */,
                                8640923B156EED3B00566CB2 /* ARM64Assembler.h */,
                                86C568DE11A213EE0007F7F0 /* MacroAssemblerMIPS.h */,
                                FE68C6351B90DDD90042BCB3 /* MacroAssemblerPrinter.cpp */,
                                FE68C6361B90DDD90042BCB3 /* MacroAssemblerPrinter.h */,
-                               860161E00F3A83C100F84710 /* MacroAssemblerX86.h */,
                                860161E10F3A83C100F84710 /* MacroAssemblerX86_64.h */,
+                               860161E00F3A83C100F84710 /* MacroAssemblerX86.h */,
                                A7A4AE0717973B26005612B1 /* MacroAssemblerX86Common.cpp */,
                                860161E20F3A83C100F84710 /* MacroAssemblerX86Common.h */,
                                65860177185A8F5E00030EEE /* MaxFrameExtentForSlowPathCall.h */,
                                658824AF1E5CFDB000FB7359 /* ConfigFile.h in Headers */,
                                0FEC85761BDACDC70080FF74 /* AirCode.h in Headers */,
                                0F10F1A31C420BF0001C07D2 /* AirCustom.h in Headers */,
-                               DC454B8D1D00E824004C18AF /* AirDumpAsJS.h in Headers */,
                                0F4570391BE44C910062A629 /* AirEliminateDeadCode.h in Headers */,
                                0F61832D1C45BF070072450B /* AirEmitShuffle.h in Headers */,
                                0F4DE1CF1C4C1B54004D6C11 /* AirFixObviousSpills.h in Headers */,
                                0FEC85801BDACDC70080FF74 /* AirInst.h in Headers */,
                                0FEC85811BDACDC70080FF74 /* AirInstInlines.h in Headers */,
                                0FDF67D71D9DC442001B9825 /* AirKind.h in Headers */,
+                               0F2C63BC1E63440C00C13839 /* AirBlockInsertionSet.h in Headers */,
                                2684D4381C00161C0081D663 /* AirLiveness.h in Headers */,
                                0FE34C1A1C4B39AE0003A512 /* AirLogRegisterPressure.h in Headers */,
                                0F61832F1C45BF070072450B /* AirLowerAfterRegAlloc.h in Headers */,
                                43422A631C158E6D00E2EB98 /* B3ConstFloatValue.h in Headers */,
                                0FEC85B31BDED9570080FF74 /* B3ConstPtrValue.h in Headers */,
                                0F338DF61BE93D550013C88F /* B3ConstrainedValue.h in Headers */,
+                               0F2C63B81E6343F700C13839 /* B3GenericBlockInsertionSet.h in Headers */,
                                0F338E0E1BF0276C0013C88F /* B3DataSection.h in Headers */,
                                0F33FCFC1C1625BE00323F67 /* B3Dominators.h in Headers */,
                                0F6B8AD91C4EDDA200969052 /* B3DuplicateTails.h in Headers */,
                                A77A423E17A0BBFD00A8DB81 /* DFGAbstractHeap.h in Headers */,
                                A704D90317A0BAA8006BA554 /* DFGAbstractInterpreter.h in Headers */,
                                A704D90417A0BAA8006BA554 /* DFGAbstractInterpreterInlines.h in Headers */,
+                               0F2C63C21E664A5C00C13839 /* B3NativeTraits.h in Headers */,
                                0F620177143FCD3F0068B77C /* DFGAbstractValue.h in Headers */,
                                0FD3E4021B618AAF00C80E1E /* DFGAdaptiveInferredPropertyValueWatchpoint.h in Headers */,
                                0F18D3D01B55A6E0002C5C9F /* DFGAdaptiveStructureWatchpoint.h in Headers */,
                                0F392C8A1B46188400844728 /* DFGOSRExitFuzz.h in Headers */,
                                0FEFC9AB1681A3B600567F53 /* DFGOSRExitJumpPlaceholder.h in Headers */,
                                0F235BEE17178E7300690C7F /* DFGOSRExitPreparation.h in Headers */,
+                               0F2C63B71E6343ED00C13839 /* B3AtomicValue.h in Headers */,
                                0F6237981AE45CA700D402EA /* DFGPhantomInsertionPhase.h in Headers */,
                                0FFFC95C14EF90AF00C72532 /* DFGPhase.h in Headers */,
                                0F2B9CEB19D0BA7D00B1D1B5 /* DFGPhiChildren.h in Headers */,
                                79B819931DD25CF500DDC714 /* JSGlobalObjectInlines.h in Headers */,
                                A51007C1187CC3C600B38879 /* JSGlobalObjectInspectorController.h in Headers */,
                                A50E4B6418809DD50068A46D /* JSGlobalObjectRuntimeAgent.h in Headers */,
+                               0F2C63C41E69EF9400C13839 /* B3MemoryValueInlines.h in Headers */,
                                A503FA2A188F105900110F14 /* JSGlobalObjectScriptDebugServer.h in Headers */,
                                A513E5C0185BFACC007E95AD /* JSInjectedScriptHost.h in Headers */,
                                A513E5C2185BFACC007E95AD /* JSInjectedScriptHostPrototype.h in Headers */,
                        buildActionMask = 2147483647;
                        files = (
                                0FFA549716B8835000B3A982 /* A64DOpcode.cpp in Sources */,
+                               0F2C63C01E660EA700C13839 /* AbstractMacroAssembler.cpp in Sources */,
                                AD4937C31DDBE6140077C807 /* AbstractModuleRecord.cpp in Sources */,
                                0F55F0F414D1063900AC7649 /* AbstractPC.cpp in Sources */,
                                5370B4F51BF26202005C40FC /* AdaptiveInferredPropertyValueWatchpointBase.cpp in Sources */,
                                0FEC85731BDACDC70080FF74 /* AirCCallSpecial.cpp in Sources */,
                                0FEC85751BDACDC70080FF74 /* AirCode.cpp in Sources */,
                                0F61832B1C45BF070072450B /* AirCustom.cpp in Sources */,
-                               DC454B8C1D00E822004C18AF /* AirDumpAsJS.cpp in Sources */,
                                0F4570381BE44C910062A629 /* AirEliminateDeadCode.cpp in Sources */,
                                0F61832C1C45BF070072450B /* AirEmitShuffle.cpp in Sources */,
                                0F4DE1CE1C4C1B54004D6C11 /* AirFixObviousSpills.cpp in Sources */,
                                9E729407190F01A5001A91B5 /* InitializeThreading.cpp in Sources */,
                                A513E5B7185B8BD3007E95AD /* InjectedScript.cpp in Sources */,
                                A514B2C2185A684400F3C7CB /* InjectedScriptBase.cpp in Sources */,
+                               0F2C63B61E6343EA00C13839 /* B3AtomicValue.cpp in Sources */,
                                A58E35911860DECF001F24FE /* InjectedScriptHost.cpp in Sources */,
                                A513E5CA185F9624007E95AD /* InjectedScriptManager.cpp in Sources */,
                                A5840E20187B7B8600843B10 /* InjectedScriptModule.cpp in Sources */,
                                E35E035F1B7AB43E0073AD2A /* InspectorInstrumentationObject.cpp in Sources */,
                                A532438B18568335002ED692 /* InspectorProtocolObjects.cpp in Sources */,
                                A50E4B6118809DD50068A46D /* InspectorRuntimeAgent.cpp in Sources */,
+                               0F2C63BB1E63440A00C13839 /* AirBlockInsertionSet.cpp in Sources */,
                                A55165D21BDF0B98003B75C1 /* InspectorScriptProfilerAgent.cpp in Sources */,
                                A593CF821840377100BFCE27 /* InspectorValues.cpp in Sources */,
                                147F39CF107EC37600427A48 /* InternalFunction.cpp in Sources */,
index 7c22a55..c8e035e 100644
@@ -694,6 +694,21 @@ private:
         LdrLiteralOp_LDRSW = 2,
         LdrLiteralOp_128BIT = 2
     };
+    
+    enum ExoticLoadFence {
+        ExoticLoadFence_None,
+        ExoticLoadFence_Acquire
+    };
+    
+    enum ExoticLoadAtomic {
+        ExoticLoadAtomic_Link,
+        ExoticLoadAtomic_None
+    };
+
+    enum ExoticStoreFence {
+        ExoticStoreFence_None,
+        ExoticStoreFence_Release,
+    };
 
     static unsigned memPairOffsetShift(bool V, MemPairOpSize size)
     {
@@ -1531,6 +1546,48 @@ public:
     {
         insn(0xd5033abf);
     }
+    
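+    // Acquire loads (ldar), exclusive loads (ldxr/ldaxr), release stores (stlr) and exclusive
+    // stores (stxr/stlxr). The status register written by stxr/stlxr is 0 on success and
+    // non-zero if the exclusive monitor was lost.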
+    template<int datasize>
+    void ldar(RegisterID dst, RegisterID src)
+    {
+        CHECK_DATASIZE();
+        insn(exoticLoad(MEMOPSIZE, ExoticLoadFence_Acquire, ExoticLoadAtomic_None, dst, src));
+    }
+
+    template<int datasize>
+    void ldxr(RegisterID dst, RegisterID src)
+    {
+        CHECK_DATASIZE();
+        insn(exoticLoad(MEMOPSIZE, ExoticLoadFence_None, ExoticLoadAtomic_Link, dst, src));
+    }
+
+    template<int datasize>
+    void ldaxr(RegisterID dst, RegisterID src)
+    {
+        CHECK_DATASIZE();
+        insn(exoticLoad(MEMOPSIZE, ExoticLoadFence_Acquire, ExoticLoadAtomic_Link, dst, src));
+    }
+    
+    template<int datasize>
+    void stxr(RegisterID result, RegisterID src, RegisterID dst)
+    {
+        CHECK_DATASIZE();
+        insn(exoticStore(MEMOPSIZE, ExoticStoreFence_None, result, src, dst));
+    }
+
+    template<int datasize>
+    void stlr(RegisterID src, RegisterID dst)
+    {
+        CHECK_DATASIZE();
+        insn(storeRelease(MEMOPSIZE, src, dst));
+    }
+
+    template<int datasize>
+    void stlxr(RegisterID result, RegisterID src, RegisterID dst)
+    {
+        CHECK_DATASIZE();
+        insn(exoticStore(MEMOPSIZE, ExoticStoreFence_Release, result, src, dst));
+    }
 
     template<int datasize>
     ALWAYS_INLINE void orn(RegisterID rd, RegisterID rn, RegisterID rm)
@@ -3608,7 +3665,22 @@ private:
         const int op4 = 0;
         return (0xd6000000 | opc << 21 | op2 << 16 | op3 << 10 | xOrZr(rn) << 5 | op4);
     }
-
+    
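+    // Encodings for the exclusive/ordered load-store group: size goes in bits 31:30, the
+    // store-exclusive status register in bits 20:16, the acquire/release bit in bit 15, and the
+    // base and data registers in bits 9:5 and 4:0. Bit 23 distinguishes the ordered forms
+    // (ldar/stlr) from the exclusive forms (ldxr/stxr).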
+    static int exoticLoad(MemOpSize size, ExoticLoadFence fence, ExoticLoadAtomic atomic, RegisterID dst, RegisterID src)
+    {
+        return 0x085f7c00 | size << 30 | fence << 15 | atomic << 23 | src << 5 | dst;
+    }
+    
+    static int storeRelease(MemOpSize size, RegisterID src, RegisterID dst)
+    {
+        return 0x089ffc00 | size << 30 | dst << 5 | src;
+    }
+    
+    static int exoticStore(MemOpSize size, ExoticStoreFence fence, RegisterID result, RegisterID src, RegisterID dst)
+    {
+        return 0x08007c00 | size << 30 | result << 16 | fence << 15 | dst << 5 | src;
+    }
+    
     // Workaround for Cortex-A53 erratum (835769). Emit an extra nop if the
     // last instruction in the buffer is a load, store or prefetch. Needed
     // before 64-bit multiply-accumulate instructions.
diff --git a/Source/JavaScriptCore/assembler/AbstractMacroAssembler.cpp b/Source/JavaScriptCore/assembler/AbstractMacroAssembler.cpp
new file mode 100644
index 0000000..810b3ce
--- /dev/null
@@ -0,0 +1,52 @@
+/*
+ * Copyright (C) 2017 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+#include "config.h"
+#include "MacroAssembler.h" // Have to break with style because AbstractMacroAssembler.h is a shady header.
+
+#if ENABLE(ASSEMBLER)
+
+#include <wtf/PrintStream.h>
+
+using namespace JSC;
+
+namespace WTF {
+
+void printInternal(PrintStream& out, AbstractMacroAssemblerBase::StatusCondition condition)
+{
+    switch (condition) {
+    case AbstractMacroAssemblerBase::Success:
+        out.print("Success");
+        return;
+    case AbstractMacroAssemblerBase::Failure:
+        out.print("Failure");
+        return;
+    }
+    RELEASE_ASSERT_NOT_REACHED();
+}
+
+} // namespace WTF
+
+#endif // ENABLE(ASSEMBLER)
index b791e5c..87ae698 100644
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2008, 2012, 2014-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2008-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -36,6 +36,7 @@
 #include <wtf/CryptographicallyRandomNumber.h>
 #include <wtf/Noncopyable.h>
 #include <wtf/SharedTask.h>
+#include <wtf/Vector.h>
 #include <wtf/WeakRandom.h>
 
 namespace JSC {
@@ -50,8 +51,28 @@ namespace DFG {
 struct OSRExit;
 }
 
+class AbstractMacroAssemblerBase {
+public:
+    enum StatusCondition {
+        Success,
+        Failure
+    };
+    
+    static StatusCondition invert(StatusCondition condition)
+    {
+        switch (condition) {
+        case Success:
+            return Failure;
+        case Failure:
+            return Success;
+        }
+        RELEASE_ASSERT_NOT_REACHED();
+        return Success;
+    }
+};
+
 template <class AssemblerType, class MacroAssemblerType>
-class AbstractMacroAssembler {
+class AbstractMacroAssembler : public AbstractMacroAssemblerBase {
 public:
     typedef AbstractMacroAssembler<AssemblerType, MacroAssemblerType> AbstractMacroAssemblerType;
     typedef AssemblerType AssemblerType_T;
@@ -1108,3 +1129,16 @@ AbstractMacroAssembler<AssemblerType, MacroAssemblerType>::Address::indexedBy(
 #endif // ENABLE(ASSEMBLER)
 
 } // namespace JSC
+
+#if ENABLE(ASSEMBLER)
+
+namespace WTF {
+
+class PrintStream;
+
+void printInternal(PrintStream& out, JSC::AbstractMacroAssemblerBase::StatusCondition);
+
+} // namespace WTF
+
+#endif // ENABLE(ASSEMBLER)
+
index b6aba87..0fb2552 100644
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2008, 2012-2015 Apple Inc. All rights reserved.
+ * Copyright (C) 2008-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -136,6 +136,7 @@ public:
     static const double twoToThe32; // This is super useful for some double code.
 
     // Utilities used by the DFG JIT.
+    using AbstractMacroAssemblerBase::invert;
     using MacroAssemblerBase::invert;
     
     static DoubleCondition invert(DoubleCondition cond)
index 0dc5045..a8dc771 100644
@@ -3375,7 +3375,229 @@ public:
     {
         m_assembler.dmbISH();
     }
-
+    
+    void loadAcq8SignedExtendTo32(ImplicitAddress address, RegisterID dest)
+    {
+        m_assembler.ldar<8>(dest, extractSimpleAddress(address));
+    }
+    
+    void loadAcq8(ImplicitAddress address, RegisterID dest)
+    {
+        loadAcq8SignedExtendTo32(address, dest);
+        and32(TrustedImm32(0xff), dest);
+    }
+    
+    void storeRel8(RegisterID src, ImplicitAddress address)
+    {
+        m_assembler.stlr<8>(src, extractSimpleAddress(address));
+    }
+    
+    void loadAcq16SignedExtendTo32(ImplicitAddress address, RegisterID dest)
+    {
+        m_assembler.ldar<16>(dest, extractSimpleAddress(address));
+    }
+    
+    void loadAcq16(ImplicitAddress address, RegisterID dest)
+    {
+        loadAcq16SignedExtendTo32(address, dest);
+        and32(TrustedImm32(0xffff), dest);
+    }
+    
+    void storeRel16(RegisterID src, ImplicitAddress address)
+    {
+        m_assembler.stlr<16>(src, extractSimpleAddress(address));
+    }
+    
+    void loadAcq32(ImplicitAddress address, RegisterID dest)
+    {
+        m_assembler.ldar<32>(dest, extractSimpleAddress(address));
+    }
+    
+    void loadAcq64(ImplicitAddress address, RegisterID dest)
+    {
+        m_assembler.ldar<64>(dest, extractSimpleAddress(address));
+    }
+    
+    void storeRel32(RegisterID dest, ImplicitAddress address)
+    {
+        m_assembler.stlr<32>(dest, extractSimpleAddress(address));
+    }
+    
+    void storeRel64(RegisterID dest, ImplicitAddress address)
+    {
+        m_assembler.stlr<64>(dest, extractSimpleAddress(address));
+    }
+    
+    void loadLink8(ImplicitAddress address, RegisterID dest)
+    {
+        m_assembler.ldxr<8>(dest, extractSimpleAddress(address));
+    }
+    
+    void loadLinkAcq8(ImplicitAddress address, RegisterID dest)
+    {
+        m_assembler.ldaxr<8>(dest, extractSimpleAddress(address));
+    }
+    
+    void storeCond8(RegisterID src, ImplicitAddress address, RegisterID result)
+    {
+        m_assembler.stxr<8>(result, src, extractSimpleAddress(address));
+    }
+    
+    void storeCondRel8(RegisterID src, ImplicitAddress address, RegisterID result)
+    {
+        m_assembler.stlxr<8>(result, src, extractSimpleAddress(address));
+    }
+    
+    void loadLink16(ImplicitAddress address, RegisterID dest)
+    {
+        m_assembler.ldxr<16>(dest, extractSimpleAddress(address));
+    }
+    
+    void loadLinkAcq16(ImplicitAddress address, RegisterID dest)
+    {
+        m_assembler.ldaxr<16>(dest, extractSimpleAddress(address));
+    }
+    
+    void storeCond16(RegisterID src, ImplicitAddress address, RegisterID result)
+    {
+        m_assembler.stxr<16>(result, src, extractSimpleAddress(address));
+    }
+    
+    void storeCondRel16(RegisterID src, ImplicitAddress address, RegisterID result)
+    {
+        m_assembler.stlxr<16>(result, src, extractSimpleAddress(address));
+    }
+    
+    void loadLink32(ImplicitAddress address, RegisterID dest)
+    {
+        m_assembler.ldxr<32>(dest, extractSimpleAddress(address));
+    }
+    
+    void loadLinkAcq32(ImplicitAddress address, RegisterID dest)
+    {
+        m_assembler.ldaxr<32>(dest, extractSimpleAddress(address));
+    }
+    
+    void storeCond32(RegisterID src, ImplicitAddress address, RegisterID result)
+    {
+        m_assembler.stxr<32>(result, src, extractSimpleAddress(address));
+    }
+    
+    void storeCondRel32(RegisterID src, ImplicitAddress address, RegisterID result)
+    {
+        m_assembler.stlxr<32>(result, src, extractSimpleAddress(address));
+    }
+    
+    void loadLink64(ImplicitAddress address, RegisterID dest)
+    {
+        m_assembler.ldxr<64>(dest, extractSimpleAddress(address));
+    }
+    
+    void loadLinkAcq64(ImplicitAddress address, RegisterID dest)
+    {
+        m_assembler.ldaxr<64>(dest, extractSimpleAddress(address));
+    }
+    
+    void storeCond64(RegisterID src, ImplicitAddress address, RegisterID result)
+    {
+        m_assembler.stxr<64>(result, src, extractSimpleAddress(address));
+    }
+    
+    void storeCondRel64(RegisterID src, ImplicitAddress address, RegisterID result)
+    {
+        m_assembler.stlxr<64>(result, src, extractSimpleAddress(address));
+    }
+    
+    void atomicStrongCAS8(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicStrongCAS<8>(cond, expectedAndResult, newValue, address, result);
+    }
+    
+    void atomicStrongCAS16(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicStrongCAS<16>(cond, expectedAndResult, newValue, address, result);
+    }
+    
+    void atomicStrongCAS32(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicStrongCAS<32>(cond, expectedAndResult, newValue, address, result);
+    }
+    
+    void atomicStrongCAS64(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicStrongCAS<64>(cond, expectedAndResult, newValue, address, result);
+    }
+    
+    void atomicRelaxedStrongCAS8(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicRelaxedStrongCAS<8>(cond, expectedAndResult, newValue, address, result);
+    }
+    
+    void atomicRelaxedStrongCAS16(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicRelaxedStrongCAS<16>(cond, expectedAndResult, newValue, address, result);
+    }
+    
+    void atomicRelaxedStrongCAS32(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicRelaxedStrongCAS<32>(cond, expectedAndResult, newValue, address, result);
+    }
+    
+    void atomicRelaxedStrongCAS64(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicRelaxedStrongCAS<64>(cond, expectedAndResult, newValue, address, result);
+    }
+    
+    JumpList branchAtomicWeakCAS8(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        return branchAtomicWeakCAS<8>(cond, expectedAndClobbered, newValue, address);
+    }
+    
+    JumpList branchAtomicWeakCAS16(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        return branchAtomicWeakCAS<16>(cond, expectedAndClobbered, newValue, address);
+    }
+    
+    JumpList branchAtomicWeakCAS32(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        return branchAtomicWeakCAS<32>(cond, expectedAndClobbered, newValue, address);
+    }
+    
+    JumpList branchAtomicWeakCAS64(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        return branchAtomicWeakCAS<64>(cond, expectedAndClobbered, newValue, address);
+    }
+    
+    JumpList branchAtomicRelaxedWeakCAS8(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        return branchAtomicRelaxedWeakCAS<8>(cond, expectedAndClobbered, newValue, address);
+    }
+    
+    JumpList branchAtomicRelaxedWeakCAS16(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        return branchAtomicRelaxedWeakCAS<16>(cond, expectedAndClobbered, newValue, address);
+    }
+    
+    JumpList branchAtomicRelaxedWeakCAS32(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        return branchAtomicRelaxedWeakCAS<32>(cond, expectedAndClobbered, newValue, address);
+    }
+    
+    JumpList branchAtomicRelaxedWeakCAS64(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        return branchAtomicRelaxedWeakCAS<64>(cond, expectedAndClobbered, newValue, address);
+    }
+    
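+    // Depend is the fenceless load-load dependency: eor dest, src, src always produces zero, but
+    // the register dependency forces any load whose address uses dest to be ordered after the
+    // load that produced src.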
+    void depend32(RegisterID src, RegisterID dest)
+    {
+        m_assembler.eor<32>(dest, src, src);
+    }
+    
+    void depend64(RegisterID src, RegisterID dest)
+    {
+        m_assembler.eor<64>(dest, src, src);
+    }
+    
     // Misc helper functions.
 
     // Invert a relational condition, e.g. == becomes !=, < becomes >=, etc.
@@ -3882,6 +4104,158 @@ private:
         }
         return false;
     }
+    
+    template<int datasize>
+    void loadLink(RegisterID src, RegisterID dest)
+    {
+        m_assembler.ldxr<datasize>(dest, src);
+    }
+    
+    template<int datasize>
+    void loadLinkAcq(RegisterID src, RegisterID dest)
+    {
+        m_assembler.ldaxr<datasize>(dest, src);
+    }
+    
+    template<int datasize>
+    void storeCond(RegisterID src, RegisterID dest, RegisterID result)
+    {
+        m_assembler.stxr<datasize>(result, src, dest);
+    }
+    
+    template<int datasize>
+    void storeCondRel(RegisterID src, RegisterID dest, RegisterID result)
+    {
+        m_assembler.stlxr<datasize>(result, src, dest);
+    }
+    
+    template<int datasize>
+    void signExtend(RegisterID src, RegisterID dest)
+    {
+        move(src, dest);
+    }
+    
+    template<int datasize>
+    Jump branch(RelationalCondition cond, RegisterID left, RegisterID right)
+    {
+        return branch32(cond, left, right);
+    }
+    
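+    // Strong CAS as an LL/SC loop: retry whenever the store-conditional fails spuriously, so the
+    // loop only exits on a genuine match or mismatch. On the mismatch path the observed value is
+    // written back (retrying if that store-conditional fails too), so even a failed CAS performs
+    // a release store, and the observed value is left in expectedAndResult.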
+    template<int datasize>
+    void atomicStrongCAS(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
+    {
+        signExtend<datasize>(expectedAndResult, expectedAndResult);
+        
+        RegisterID simpleAddress = extractSimpleAddress(address);
+        RegisterID tmp = getCachedDataTempRegisterIDAndInvalidate();
+        
+        Label reloop = label();
+        loadLinkAcq<datasize>(simpleAddress, tmp);
+        Jump failure = branch<datasize>(NotEqual, expectedAndResult, tmp);
+        
+        storeCondRel<datasize>(newValue, simpleAddress, result);
+        branchTest32(NonZero, result).linkTo(reloop, this);
+        move(TrustedImm32(cond == Success), result);
+        Jump done = jump();
+        
+        failure.link(this);
+        move(tmp, expectedAndResult);
+        storeCondRel<datasize>(tmp, simpleAddress, result);
+        branchTest32(NonZero, result).linkTo(reloop, this);
+        move(TrustedImm32(cond == Failure), result);
+        
+        done.link(this);
+    }
+    
+    template<int datasize>
+    void atomicRelaxedStrongCAS(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
+    {
+        signExtend<datasize>(expectedAndResult, expectedAndResult);
+        
+        RegisterID simpleAddress = extractSimpleAddress(address);
+        RegisterID tmp = getCachedDataTempRegisterIDAndInvalidate();
+        
+        Label reloop = label();
+        loadLink<datasize>(simpleAddress, tmp);
+        Jump failure = branch<datasize>(NotEqual, expectedAndResult, tmp);
+        
+        storeCond<datasize>(newValue, simpleAddress, result);
+        branchTest32(NonZero, result).linkTo(reloop, this);
+        move(TrustedImm32(cond == Success), result);
+        Jump done = jump();
+        
+        failure.link(this);
+        move(tmp, expectedAndResult);
+        move(TrustedImm32(cond == Failure), result);
+        
+        done.link(this);
+    }
+    
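+    // Weak CAS makes a single LL/SC attempt: a spurious store-conditional failure is reported as
+    // a CAS failure rather than being retried, leaving any retry loop to the caller.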
+    template<int datasize>
+    JumpList branchAtomicWeakCAS(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        signExtend<datasize>(expectedAndClobbered, expectedAndClobbered);
+        
+        RegisterID simpleAddress = extractSimpleAddress(address);
+        RegisterID tmp = getCachedDataTempRegisterIDAndInvalidate();
+        
+        JumpList success;
+        JumpList failure;
+
+        loadLinkAcq<datasize>(simpleAddress, tmp);
+        failure.append(branch<datasize>(NotEqual, expectedAndClobbered, tmp));
+        storeCondRel<datasize>(newValue, simpleAddress, expectedAndClobbered);
+        
+        switch (cond) {
+        case Success:
+            success.append(branchTest32(Zero, expectedAndClobbered));
+            failure.link(this);
+            return success;
+        case Failure:
+            failure.append(branchTest32(NonZero, expectedAndClobbered));
+            return failure;
+        }
+        
+        RELEASE_ASSERT_NOT_REACHED();
+    }
+    
+    template<int datasize>
+    JumpList branchAtomicRelaxedWeakCAS(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        signExtend<datasize>(expectedAndClobbered, expectedAndClobbered);
+        
+        RegisterID simpleAddress = extractSimpleAddress(address);
+        RegisterID tmp = getCachedDataTempRegisterIDAndInvalidate();
+        
+        JumpList success;
+        JumpList failure;
+
+        loadLink<datasize>(simpleAddress, tmp);
+        failure.append(branch<datasize>(NotEqual, expectedAndClobbered, tmp));
+        storeCond<datasize>(newValue, simpleAddress, expectedAndClobbered);
+        
+        switch (cond) {
+        case Success:
+            success.append(branchTest32(Zero, expectedAndClobbered));
+            failure.link(this);
+            return success;
+        case Failure:
+            failure.append(branchTest32(NonZero, expectedAndClobbered));
+            return failure;
+        }
+        
+        RELEASE_ASSERT_NOT_REACHED();
+    }
+    
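+    // The exclusive and acquire/release forms only accept a plain base register, so any offset is
+    // first folded into memoryTempRegister.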
+    RegisterID extractSimpleAddress(ImplicitAddress address)
+    {
+        if (!address.offset)
+            return address.base;
+        
+        signExtend32ToPtr(TrustedImm32(address.offset), getCachedMemoryTempRegisterIDAndInvalidate());
+        add64(address.base, memoryTempRegister);
+        return memoryTempRegister;
+    }
 
     Jump jumpAfterFloatingPointCompare(DoubleCondition cond)
     {
@@ -3994,6 +4368,24 @@ ALWAYS_INLINE void MacroAssemblerARM64::storeUnscaledImmediate<16>(RegisterID rt
     m_assembler.sturh(rt, rn, simm);
 }
 
+template<>
+inline void MacroAssemblerARM64::signExtend<8>(RegisterID src, RegisterID dest)
+{
+    signExtend8To32(src, dest);
+}
+
+template<>
+inline void MacroAssemblerARM64::signExtend<16>(RegisterID src, RegisterID dest)
+{
+    signExtend16To32(src, dest);
+}
+
+template<>
+inline MacroAssemblerARM64::Jump MacroAssemblerARM64::branch<64>(RelationalCondition cond, RegisterID left, RegisterID right)
+{
+    return branch64(cond, left, right);
+}
+
 } // namespace JSC
 
 #endif // ENABLE(ASSEMBLER)
index 8d05e2e..8ee10ca 100644
@@ -171,6 +171,11 @@ public:
         m_assembler.addl_mr(src.offset, src.base, dest);
     }
 
+    void add32(BaseIndex src, RegisterID dest)
+    {
+        m_assembler.addl_mr(src.offset, src.base, src.index, src.scale, dest);
+    }
+
     void add32(RegisterID src, Address dest)
     {
         m_assembler.addl_rm(src, dest.offset, dest.base);
@@ -251,16 +256,71 @@ public:
         m_assembler.andl_rm(src, dest.offset, dest.base);
     }
 
+    void and32(RegisterID src, BaseIndex dest)
+    {
+        m_assembler.andl_rm(src, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
+    void and16(RegisterID src, Address dest)
+    {
+        m_assembler.andw_rm(src, dest.offset, dest.base);
+    }
+
+    void and16(RegisterID src, BaseIndex dest)
+    {
+        m_assembler.andw_rm(src, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
+    void and8(RegisterID src, Address dest)
+    {
+        m_assembler.andb_rm(src, dest.offset, dest.base);
+    }
+
+    void and8(RegisterID src, BaseIndex dest)
+    {
+        m_assembler.andb_rm(src, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
     void and32(Address src, RegisterID dest)
     {
         m_assembler.andl_mr(src.offset, src.base, dest);
     }
 
+    void and32(BaseIndex src, RegisterID dest)
+    {
+        m_assembler.andl_mr(src.offset, src.base, src.index, src.scale, dest);
+    }
+
     void and32(TrustedImm32 imm, Address address)
     {
         m_assembler.andl_im(imm.m_value, address.offset, address.base);
     }
 
+    void and32(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.andl_im(imm.m_value, address.offset, address.base, address.index, address.scale);
+    }
+
+    void and16(TrustedImm32 imm, Address address)
+    {
+        m_assembler.andw_im(static_cast<int16_t>(imm.m_value), address.offset, address.base);
+    }
+
+    void and16(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.andw_im(static_cast<int16_t>(imm.m_value), address.offset, address.base, address.index, address.scale);
+    }
+
+    void and8(TrustedImm32 imm, Address address)
+    {
+        m_assembler.andb_im(static_cast<int8_t>(imm.m_value), address.offset, address.base);
+    }
+
+    void and8(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.andb_im(static_cast<int8_t>(imm.m_value), address.offset, address.base, address.index, address.scale);
+    }
+
     void and32(RegisterID op1, RegisterID op2, RegisterID dest)
     {
         if (op1 == op2)
@@ -456,6 +516,31 @@ public:
         m_assembler.negl_m(srcDest.offset, srcDest.base);
     }
 
+    void neg32(BaseIndex srcDest)
+    {
+        m_assembler.negl_m(srcDest.offset, srcDest.base, srcDest.index, srcDest.scale);
+    }
+
+    void neg16(Address srcDest)
+    {
+        m_assembler.negw_m(srcDest.offset, srcDest.base);
+    }
+
+    void neg16(BaseIndex srcDest)
+    {
+        m_assembler.negw_m(srcDest.offset, srcDest.base, srcDest.index, srcDest.scale);
+    }
+
+    void neg8(Address srcDest)
+    {
+        m_assembler.negb_m(srcDest.offset, srcDest.base);
+    }
+
+    void neg8(BaseIndex srcDest)
+    {
+        m_assembler.negb_m(srcDest.offset, srcDest.base, srcDest.index, srcDest.scale);
+    }
+
     void or32(RegisterID src, RegisterID dest)
     {
         m_assembler.orl_rr(src, dest);
@@ -471,16 +556,71 @@ public:
         m_assembler.orl_rm(src, dest.offset, dest.base);
     }
 
+    void or32(RegisterID src, BaseIndex dest)
+    {
+        m_assembler.orl_rm(src, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
+    void or16(RegisterID src, Address dest)
+    {
+        m_assembler.orw_rm(src, dest.offset, dest.base);
+    }
+
+    void or16(RegisterID src, BaseIndex dest)
+    {
+        m_assembler.orw_rm(src, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
+    void or8(RegisterID src, Address dest)
+    {
+        m_assembler.orb_rm(src, dest.offset, dest.base);
+    }
+
+    void or8(RegisterID src, BaseIndex dest)
+    {
+        m_assembler.orb_rm(src, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
     void or32(Address src, RegisterID dest)
     {
         m_assembler.orl_mr(src.offset, src.base, dest);
     }
 
+    void or32(BaseIndex src, RegisterID dest)
+    {
+        m_assembler.orl_mr(src.offset, src.base, src.index, src.scale, dest);
+    }
+
     void or32(TrustedImm32 imm, Address address)
     {
         m_assembler.orl_im(imm.m_value, address.offset, address.base);
     }
 
+    void or32(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.orl_im(imm.m_value, address.offset, address.base, address.index, address.scale);
+    }
+
+    void or16(TrustedImm32 imm, Address address)
+    {
+        m_assembler.orw_im(static_cast<int16_t>(imm.m_value), address.offset, address.base);
+    }
+
+    void or16(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.orw_im(static_cast<int16_t>(imm.m_value), address.offset, address.base, address.index, address.scale);
+    }
+
+    void or8(TrustedImm32 imm, Address address)
+    {
+        m_assembler.orb_im(static_cast<int8_t>(imm.m_value), address.offset, address.base);
+    }
+
+    void or8(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.orb_im(static_cast<int8_t>(imm.m_value), address.offset, address.base, address.index, address.scale);
+    }
+
     void or32(RegisterID op1, RegisterID op2, RegisterID dest)
     {
         if (op1 == op2)
@@ -654,16 +794,71 @@ public:
         m_assembler.subl_im(imm.m_value, address.offset, address.base);
     }
 
+    void sub16(TrustedImm32 imm, Address address)
+    {
+        m_assembler.subw_im(static_cast<int16_t>(imm.m_value), address.offset, address.base);
+    }
+
+    void sub8(TrustedImm32 imm, Address address)
+    {
+        m_assembler.subb_im(static_cast<int8_t>(imm.m_value), address.offset, address.base);
+    }
+
+    void sub32(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.subl_im(imm.m_value, address.offset, address.base, address.index, address.scale);
+    }
+
+    void sub16(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.subw_im(static_cast<int16_t>(imm.m_value), address.offset, address.base, address.index, address.scale);
+    }
+
+    void sub8(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.subb_im(static_cast<int8_t>(imm.m_value), address.offset, address.base, address.index, address.scale);
+    }
+
     void sub32(Address src, RegisterID dest)
     {
         m_assembler.subl_mr(src.offset, src.base, dest);
     }
 
+    void sub32(BaseIndex src, RegisterID dest)
+    {
+        m_assembler.subl_mr(src.offset, src.base, src.index, src.scale, dest);
+    }
+
     void sub32(RegisterID src, Address dest)
     {
         m_assembler.subl_rm(src, dest.offset, dest.base);
     }
 
+    void sub16(RegisterID src, Address dest)
+    {
+        m_assembler.subw_rm(src, dest.offset, dest.base);
+    }
+
+    void sub8(RegisterID src, Address dest)
+    {
+        m_assembler.subb_rm(src, dest.offset, dest.base);
+    }
+
+    void sub32(RegisterID src, BaseIndex dest)
+    {
+        m_assembler.subl_rm(src, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
+    void sub16(RegisterID src, BaseIndex dest)
+    {
+        m_assembler.subw_rm(src, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
+    void sub8(RegisterID src, BaseIndex dest)
+    {
+        m_assembler.subb_rm(src, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
     void xor32(RegisterID src, RegisterID dest)
     {
         m_assembler.xorl_rr(src, dest);
@@ -677,6 +872,50 @@ public:
             m_assembler.xorl_im(imm.m_value, dest.offset, dest.base);
     }
 
+    void xor32(TrustedImm32 imm, BaseIndex dest)
+    {
+        if (imm.m_value == -1)
+            m_assembler.notl_m(dest.offset, dest.base, dest.index, dest.scale);
+        else
+            m_assembler.xorl_im(imm.m_value, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
+    void xor16(TrustedImm32 imm, Address dest)
+    {
+        imm.m_value = static_cast<int16_t>(imm.m_value);
+        if (imm.m_value == -1)
+            m_assembler.notw_m(dest.offset, dest.base);
+        else
+            m_assembler.xorw_im(imm.m_value, dest.offset, dest.base);
+    }
+
+    void xor16(TrustedImm32 imm, BaseIndex dest)
+    {
+        imm.m_value = static_cast<int16_t>(imm.m_value);
+        if (imm.m_value == -1)
+            m_assembler.notw_m(dest.offset, dest.base, dest.index, dest.scale);
+        else
+            m_assembler.xorw_im(imm.m_value, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
+    void xor8(TrustedImm32 imm, Address dest)
+    {
+        imm.m_value = static_cast<int8_t>(imm.m_value);
+        if (imm.m_value == -1)
+            m_assembler.notb_m(dest.offset, dest.base);
+        else
+            m_assembler.xorb_im(imm.m_value, dest.offset, dest.base);
+    }
+
+    void xor8(TrustedImm32 imm, BaseIndex dest)
+    {
+        imm.m_value = static_cast<int8_t>(imm.m_value);
+        if (imm.m_value == -1)
+            m_assembler.notb_m(dest.offset, dest.base, dest.index, dest.scale);
+        else
+            m_assembler.xorb_im(imm.m_value, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
     void xor32(TrustedImm32 imm, RegisterID dest)
     {
         if (imm.m_value == -1)
@@ -690,11 +929,41 @@ public:
         m_assembler.xorl_rm(src, dest.offset, dest.base);
     }
 
+    void xor32(RegisterID src, BaseIndex dest)
+    {
+        m_assembler.xorl_rm(src, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
+    void xor16(RegisterID src, Address dest)
+    {
+        m_assembler.xorw_rm(src, dest.offset, dest.base);
+    }
+
+    void xor16(RegisterID src, BaseIndex dest)
+    {
+        m_assembler.xorw_rm(src, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
+    void xor8(RegisterID src, Address dest)
+    {
+        m_assembler.xorb_rm(src, dest.offset, dest.base);
+    }
+
+    void xor8(RegisterID src, BaseIndex dest)
+    {
+        m_assembler.xorb_rm(src, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
     void xor32(Address src, RegisterID dest)
     {
         m_assembler.xorl_mr(src.offset, src.base, dest);
     }
     
+    void xor32(BaseIndex src, RegisterID dest)
+    {
+        m_assembler.xorl_mr(src.offset, src.base, src.index, src.scale, dest);
+    }
+    
     void xor32(RegisterID op1, RegisterID op2, RegisterID dest)
     {
         if (op1 == op2)
@@ -741,6 +1010,31 @@ public:
         m_assembler.notl_m(dest.offset, dest.base);
     }
 
+    void not32(BaseIndex dest)
+    {
+        m_assembler.notl_m(dest.offset, dest.base, dest.index, dest.scale);
+    }
+
+    void not16(Address dest)
+    {
+        m_assembler.notw_m(dest.offset, dest.base);
+    }
+
+    void not16(BaseIndex dest)
+    {
+        m_assembler.notw_m(dest.offset, dest.base, dest.index, dest.scale);
+    }
+
+    void not8(Address dest)
+    {
+        m_assembler.notb_m(dest.offset, dest.base);
+    }
+
+    void not8(BaseIndex dest)
+    {
+        m_assembler.notb_m(dest.offset, dest.base, dest.index, dest.scale);
+    }
+
     void sqrtDouble(FPRegisterID src, FPRegisterID dst)
     {
         m_assembler.sqrtsd_rr(src, dst);
@@ -1071,42 +1365,23 @@ public:
 
     void store16(RegisterID src, BaseIndex address)
     {
-#if CPU(X86)
-        // On 32-bit x86 we can only store from the first 4 registers;
-        // esp..edi are mapped to the 'h' registers!
-        if (src >= 4) {
-            // Pick a temporary register.
-            RegisterID temp = getUnusedRegister(address);
-
-            // Swap to the temporary register to perform the store.
-            swap(src, temp);
-            m_assembler.movw_rm(temp, address.offset, address.base, address.index, address.scale);
-            swap(src, temp);
-            return;
-        }
-#endif
         m_assembler.movw_rm(src, address.offset, address.base, address.index, address.scale);
     }
 
     void store16(RegisterID src, Address address)
     {
-#if CPU(X86)
-        // On 32-bit x86 we can only store from the first 4 registers;
-        // esp..edi are mapped to the 'h' registers!
-        if (src >= 4) {
-            // Pick a temporary register.
-            RegisterID temp = getUnusedRegister(address);
-
-            // Swap to the temporary register to perform the store.
-            swap(src, temp);
-            m_assembler.movw_rm(temp, address.offset, address.base);
-            swap(src, temp);
-            return;
-        }
-#endif
         m_assembler.movw_rm(src, address.offset, address.base);
     }
 
+    void store16(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.movw_im(static_cast<int16_t>(imm.m_value), address.offset, address.base, address.index, address.scale);
+    }
+
+    void store16(TrustedImm32 imm, Address address)
+    {
+        m_assembler.movw_im(static_cast<int16_t>(imm.m_value), address.offset, address.base);
+    }
 
     // Floating-point operation:
     //
@@ -2745,77 +3020,904 @@ public:
         m_assembler.lock();
         m_assembler.orl_im(0, 0, X86Registers::esp);
     }
+    
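+    // These wrappers hand the size-specific cmpxchg to a shared helper as a lambda. cmpxchg
+    // compares memory against eax and, on a mismatch, loads the observed value into eax, which is
+    // why expectedAndResult ends up holding the value that was actually in memory.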
+    void atomicStrongCAS8(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicStrongCAS(cond, expectedAndResult, result, [&] { m_assembler.cmpxchgb_rm(newValue, address.offset, address.base); });
+    }
 
-    // We take this to mean that it prevents motion of normal stores. So, it's a no-op on x86.
-    void storeFence()
+    void atomicStrongCAS8(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, BaseIndex address, RegisterID result)
     {
+        atomicStrongCAS(cond, expectedAndResult, result, [&] { m_assembler.cmpxchgb_rm(newValue, address.offset, address.base, address.index, address.scale); });
     }
 
-    // We take this to mean that it prevents motion of normal loads. So, it's a no-op on x86.
-    void loadFence()
+    void atomicStrongCAS16(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
     {
+        atomicStrongCAS(cond, expectedAndResult, result, [&] { m_assembler.cmpxchgw_rm(newValue, address.offset, address.base); });
     }
 
-    static void replaceWithBreakpoint(CodeLocationLabel instructionStart)
+    void atomicStrongCAS16(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, BaseIndex address, RegisterID result)
     {
-        X86Assembler::replaceWithInt3(instructionStart.executableAddress());
+        atomicStrongCAS(cond, expectedAndResult, result, [&] { m_assembler.cmpxchgw_rm(newValue, address.offset, address.base, address.index, address.scale); });
     }
 
-    static void replaceWithJump(CodeLocationLabel instructionStart, CodeLocationLabel destination)
+    void atomicStrongCAS32(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
     {
-        X86Assembler::replaceWithJump(instructionStart.executableAddress(), destination.executableAddress());
+        atomicStrongCAS(cond, expectedAndResult, result, [&] { m_assembler.cmpxchgl_rm(newValue, address.offset, address.base); });
     }
-    
-    static ptrdiff_t maxJumpReplacementSize()
+
+    void atomicStrongCAS32(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, BaseIndex address, RegisterID result)
     {
-        return X86Assembler::maxJumpReplacementSize();
+        atomicStrongCAS(cond, expectedAndResult, result, [&] { m_assembler.cmpxchgl_rm(newValue, address.offset, address.base, address.index, address.scale); });
     }
 
-    static ptrdiff_t patchableJumpSize()
+    void atomicStrongCAS8(RegisterID expectedAndResult, RegisterID newValue, Address address)
     {
-        return X86Assembler::patchableJumpSize();
+        atomicStrongCAS(expectedAndResult, [&] { m_assembler.cmpxchgb_rm(newValue, address.offset, address.base); });
     }
 
-    static bool supportsFloatingPointRounding()
+    void atomicStrongCAS8(RegisterID expectedAndResult, RegisterID newValue, BaseIndex address)
     {
-        if (s_sse4_1CheckState == CPUIDCheckState::NotChecked)
-            updateEax1EcxFlags();
-        return s_sse4_1CheckState == CPUIDCheckState::Set;
+        atomicStrongCAS(expectedAndResult, [&] { m_assembler.cmpxchgb_rm(newValue, address.offset, address.base, address.index, address.scale); });
     }
 
-    static bool supportsAVX()
+    void atomicStrongCAS16(RegisterID expectedAndResult, RegisterID newValue, Address address)
     {
-        // AVX still causes mysterious regressions and those regressions can be massive.
-        return false;
+        atomicStrongCAS(expectedAndResult, [&] { m_assembler.cmpxchgw_rm(newValue, address.offset, address.base); });
     }
 
-    static void updateEax1EcxFlags()
+    void atomicStrongCAS16(RegisterID expectedAndResult, RegisterID newValue, BaseIndex address)
     {
-        int flags = 0;
-#if COMPILER(MSVC)
-        int cpuInfo[4];
-        __cpuid(cpuInfo, 0x1);
-        flags = cpuInfo[2];
-#elif COMPILER(GCC_OR_CLANG)
-#if CPU(X86_64)
-        asm (
-            "movl $0x1, %%eax;"
-            "cpuid;"
-            "movl %%ecx, %0;"
-            : "=g" (flags)
-            :
-            : "%eax", "%ebx", "%ecx", "%edx"
-            );
-#else
-        asm (
-            "movl $0x1, %%eax;"
-            "pushl %%ebx;"
-            "cpuid;"
-            "popl %%ebx;"
-            "movl %%ecx, %0;"
-            : "=g" (flags)
-            :
-            : "%eax", "%ecx", "%edx"
+        atomicStrongCAS(expectedAndResult, [&] { m_assembler.cmpxchgw_rm(newValue, address.offset, address.base, address.index, address.scale); });
+    }
+
+    void atomicStrongCAS32(RegisterID expectedAndResult, RegisterID newValue, Address address)
+    {
+        atomicStrongCAS(expectedAndResult, [&] { m_assembler.cmpxchgl_rm(newValue, address.offset, address.base); });
+    }
+
+    void atomicStrongCAS32(RegisterID expectedAndResult, RegisterID newValue, BaseIndex address)
+    {
+        atomicStrongCAS(expectedAndResult, [&] { m_assembler.cmpxchgl_rm(newValue, address.offset, address.base, address.index, address.scale); });
+    }
+
+    Jump branchAtomicStrongCAS8(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address)
+    {
+        return branchAtomicStrongCAS(cond, expectedAndResult, [&] { m_assembler.cmpxchgb_rm(newValue, address.offset, address.base); });
+    }
+
+    Jump branchAtomicStrongCAS8(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, BaseIndex address)
+    {
+        return branchAtomicStrongCAS(cond, expectedAndResult, [&] { m_assembler.cmpxchgb_rm(newValue, address.offset, address.base, address.index, address.scale); });
+    }
+
+    Jump branchAtomicStrongCAS16(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address)
+    {
+        return branchAtomicStrongCAS(cond, expectedAndResult, [&] { m_assembler.cmpxchgw_rm(newValue, address.offset, address.base); });
+    }
+
+    Jump branchAtomicStrongCAS16(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, BaseIndex address)
+    {
+        return branchAtomicStrongCAS(cond, expectedAndResult, [&] { m_assembler.cmpxchgw_rm(newValue, address.offset, address.base, address.index, address.scale); });
+    }
+
+    Jump branchAtomicStrongCAS32(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address)
+    {
+        return branchAtomicStrongCAS(cond, expectedAndResult, [&] { m_assembler.cmpxchgl_rm(newValue, address.offset, address.base); });
+    }
+
+    Jump branchAtomicStrongCAS32(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, BaseIndex address)
+    {
+        return branchAtomicStrongCAS(cond, expectedAndResult, [&] { m_assembler.cmpxchgl_rm(newValue, address.offset, address.base, address.index, address.scale); });
+    }
+
+    // If you use weak CAS, you cannot rely on expectedAndClobbered to have any particular value after
+    // this completes. On x86, it will contain the result of the strong CAS. On ARM, it will still have
+    // the expected value.
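+    // On x86, lock cmpxchg cannot fail spuriously, so the weak and relaxed variants below simply
+    // reuse the strong CAS.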
+    void atomicWeakCAS8(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicStrongCAS8(cond, expectedAndClobbered, newValue, address, result);
+    }
+
+    void atomicWeakCAS8(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, BaseIndex address, RegisterID result)
+    {
+        atomicStrongCAS8(cond, expectedAndClobbered, newValue, address, result);
+    }
+
+    void atomicWeakCAS16(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicStrongCAS16(cond, expectedAndClobbered, newValue, address, result);
+    }
+
+    void atomicWeakCAS16(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, BaseIndex address, RegisterID result)
+    {
+        atomicStrongCAS16(cond, expectedAndClobbered, newValue, address, result);
+    }
+
+    void atomicWeakCAS32(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicStrongCAS32(cond, expectedAndClobbered, newValue, address, result);
+    }
+
+    void atomicWeakCAS32(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, BaseIndex address, RegisterID result)
+    {
+        atomicStrongCAS32(cond, expectedAndClobbered, newValue, address, result);
+    }
+
+    Jump branchAtomicWeakCAS8(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        return branchAtomicStrongCAS8(cond, expectedAndClobbered, newValue, address);
+    }
+
+    Jump branchAtomicWeakCAS8(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, BaseIndex address)
+    {
+        return branchAtomicStrongCAS8(cond, expectedAndClobbered, newValue, address);
+    }
+
+    Jump branchAtomicWeakCAS16(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        return branchAtomicStrongCAS16(cond, expectedAndClobbered, newValue, address);
+    }
+
+    Jump branchAtomicWeakCAS16(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, BaseIndex address)
+    {
+        return branchAtomicStrongCAS16(cond, expectedAndClobbered, newValue, address);
+    }
+
+    Jump branchAtomicWeakCAS32(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        return branchAtomicStrongCAS32(cond, expectedAndClobbered, newValue, address);
+    }
+
+    Jump branchAtomicWeakCAS32(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, BaseIndex address)
+    {
+        return branchAtomicStrongCAS32(cond, expectedAndClobbered, newValue, address);
+    }
+    
+    void atomicRelaxedWeakCAS8(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicStrongCAS8(cond, expectedAndClobbered, newValue, address, result);
+    }
+
+    void atomicRelaxedWeakCAS8(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, BaseIndex address, RegisterID result)
+    {
+        atomicStrongCAS8(cond, expectedAndClobbered, newValue, address, result);
+    }
+
+    void atomicRelaxedWeakCAS16(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicStrongCAS16(cond, expectedAndClobbered, newValue, address, result);
+    }
+
+    void atomicRelaxedWeakCAS16(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, BaseIndex address, RegisterID result)
+    {
+        atomicStrongCAS16(cond, expectedAndClobbered, newValue, address, result);
+    }
+
+    void atomicRelaxedWeakCAS32(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicStrongCAS32(cond, expectedAndClobbered, newValue, address, result);
+    }
+
+    void atomicRelaxedWeakCAS32(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, BaseIndex address, RegisterID result)
+    {
+        atomicStrongCAS32(cond, expectedAndClobbered, newValue, address, result);
+    }
+
+    Jump branchAtomicRelaxedWeakCAS8(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        return branchAtomicStrongCAS8(cond, expectedAndClobbered, newValue, address);
+    }
+
+    Jump branchAtomicRelaxedWeakCAS8(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, BaseIndex address)
+    {
+        return branchAtomicStrongCAS8(cond, expectedAndClobbered, newValue, address);
+    }
+
+    Jump branchAtomicRelaxedWeakCAS16(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        return branchAtomicStrongCAS16(cond, expectedAndClobbered, newValue, address);
+    }
+
+    Jump branchAtomicRelaxedWeakCAS16(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, BaseIndex address)
+    {
+        return branchAtomicStrongCAS16(cond, expectedAndClobbered, newValue, address);
+    }
+
+    Jump branchAtomicRelaxedWeakCAS32(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        return branchAtomicStrongCAS32(cond, expectedAndClobbered, newValue, address);
+    }
+
+    Jump branchAtomicRelaxedWeakCAS32(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, BaseIndex address)
+    {
+        return branchAtomicStrongCAS32(cond, expectedAndClobbered, newValue, address);
+    }
+    
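+    // The following read-modify-write operations are made atomic by emitting the x86 lock prefix in front
+    // of the ordinary memory-destination instruction.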
+    void atomicAdd8(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        add8(imm, address);
+    }
+    
+    void atomicAdd8(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        add8(imm, address);
+    }
+    
+    void atomicAdd8(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        add8(reg, address);
+    }
+    
+    void atomicAdd8(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        add8(reg, address);
+    }
+    
+    void atomicAdd16(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        add16(imm, address);
+    }
+    
+    void atomicAdd16(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        add16(imm, address);
+    }
+    
+    void atomicAdd16(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        add16(reg, address);
+    }
+    
+    void atomicAdd16(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        add16(reg, address);
+    }
+    
+    void atomicAdd32(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        add32(imm, address);
+    }
+    
+    void atomicAdd32(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        add32(imm, address);
+    }
+    
+    void atomicAdd32(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        add32(reg, address);
+    }
+    
+    void atomicAdd32(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        add32(reg, address);
+    }
+    
+    void atomicSub8(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        sub8(imm, address);
+    }
+    
+    void atomicSub8(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        sub8(imm, address);
+    }
+    
+    void atomicSub8(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        sub8(reg, address);
+    }
+    
+    void atomicSub8(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        sub8(reg, address);
+    }
+    
+    void atomicSub16(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        sub16(imm, address);
+    }
+    
+    void atomicSub16(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        sub16(imm, address);
+    }
+    
+    void atomicSub16(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        sub16(reg, address);
+    }
+    
+    void atomicSub16(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        sub16(reg, address);
+    }
+    
+    void atomicSub32(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        sub32(imm, address);
+    }
+    
+    void atomicSub32(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        sub32(imm, address);
+    }
+    
+    void atomicSub32(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        sub32(reg, address);
+    }
+    
+    void atomicSub32(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        sub32(reg, address);
+    }
+    
+    void atomicAnd8(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        and8(imm, address);
+    }
+    
+    void atomicAnd8(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        and8(imm, address);
+    }
+    
+    void atomicAnd8(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        and8(reg, address);
+    }
+    
+    void atomicAnd8(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        and8(reg, address);
+    }
+    
+    void atomicAnd16(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        and16(imm, address);
+    }
+    
+    void atomicAnd16(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        and16(imm, address);
+    }
+    
+    void atomicAnd16(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        and16(reg, address);
+    }
+    
+    void atomicAnd16(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        and16(reg, address);
+    }
+    
+    void atomicAnd32(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        and32(imm, address);
+    }
+    
+    void atomicAnd32(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        and32(imm, address);
+    }
+    
+    void atomicAnd32(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        and32(reg, address);
+    }
+    
+    void atomicAnd32(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        and32(reg, address);
+    }
+    
+    void atomicOr8(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        or8(imm, address);
+    }
+    
+    void atomicOr8(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        or8(imm, address);
+    }
+    
+    void atomicOr8(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        or8(reg, address);
+    }
+    
+    void atomicOr8(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        or8(reg, address);
+    }
+    
+    void atomicOr16(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        or16(imm, address);
+    }
+    
+    void atomicOr16(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        or16(imm, address);
+    }
+    
+    void atomicOr16(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        or16(reg, address);
+    }
+    
+    void atomicOr16(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        or16(reg, address);
+    }
+    
+    void atomicOr32(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        or32(imm, address);
+    }
+    
+    void atomicOr32(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        or32(imm, address);
+    }
+    
+    void atomicOr32(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        or32(reg, address);
+    }
+    
+    void atomicOr32(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        or32(reg, address);
+    }
+    
+    void atomicXor8(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        xor8(imm, address);
+    }
+    
+    void atomicXor8(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        xor8(imm, address);
+    }
+    
+    void atomicXor8(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        xor8(reg, address);
+    }
+    
+    void atomicXor8(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        xor8(reg, address);
+    }
+    
+    void atomicXor16(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        xor16(imm, address);
+    }
+    
+    void atomicXor16(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        xor16(imm, address);
+    }
+    
+    void atomicXor16(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        xor16(reg, address);
+    }
+    
+    void atomicXor16(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        xor16(reg, address);
+    }
+    
+    void atomicXor32(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        xor32(imm, address);
+    }
+    
+    void atomicXor32(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        xor32(imm, address);
+    }
+    
+    void atomicXor32(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        xor32(reg, address);
+    }
+    
+    void atomicXor32(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        xor32(reg, address);
+    }
+    
+    void atomicNeg8(Address address)
+    {
+        m_assembler.lock();
+        neg8(address);
+    }
+    
+    void atomicNeg8(BaseIndex address)
+    {
+        m_assembler.lock();
+        neg8(address);
+    }
+    
+    void atomicNeg16(Address address)
+    {
+        m_assembler.lock();
+        neg16(address);
+    }
+    
+    void atomicNeg16(BaseIndex address)
+    {
+        m_assembler.lock();
+        neg16(address);
+    }
+    
+    void atomicNeg32(Address address)
+    {
+        m_assembler.lock();
+        neg32(address);
+    }
+    
+    void atomicNeg32(BaseIndex address)
+    {
+        m_assembler.lock();
+        neg32(address);
+    }
+    
+    void atomicNot8(Address address)
+    {
+        m_assembler.lock();
+        not8(address);
+    }
+    
+    void atomicNot8(BaseIndex address)
+    {
+        m_assembler.lock();
+        not8(address);
+    }
+    
+    void atomicNot16(Address address)
+    {
+        m_assembler.lock();
+        not16(address);
+    }
+    
+    void atomicNot16(BaseIndex address)
+    {
+        m_assembler.lock();
+        not16(address);
+    }
+    
+    void atomicNot32(Address address)
+    {
+        m_assembler.lock();
+        not32(address);
+    }
+    
+    void atomicNot32(BaseIndex address)
+    {
+        m_assembler.lock();
+        not32(address);
+    }
+    
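+    // lock xadd atomically adds reg to memory and leaves the previous memory value in reg.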
+    void atomicXchgAdd8(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        m_assembler.xaddb_rm(reg, address.offset, address.base);
+    }
+    
+    void atomicXchgAdd8(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        m_assembler.xaddb_rm(reg, address.offset, address.base, address.index, address.scale);
+    }
+    
+    void atomicXchgAdd16(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        m_assembler.xaddw_rm(reg, address.offset, address.base);
+    }
+    
+    void atomicXchgAdd16(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        m_assembler.xaddw_rm(reg, address.offset, address.base, address.index, address.scale);
+    }
+    
+    void atomicXchgAdd32(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        m_assembler.xaddl_rm(reg, address.offset, address.base);
+    }
+    
+    void atomicXchgAdd32(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        m_assembler.xaddl_rm(reg, address.offset, address.base, address.index, address.scale);
+    }
+    
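+    // xchg with a memory operand is implicitly locked; the explicit lock prefix is redundant but harmless.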
+    void atomicXchg8(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        m_assembler.xchgb_rm(reg, address.offset, address.base);
+    }
+    
+    void atomicXchg8(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        m_assembler.xchgb_rm(reg, address.offset, address.base, address.index, address.scale);
+    }
+    
+    void atomicXchg16(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        m_assembler.xchgw_rm(reg, address.offset, address.base);
+    }
+    
+    void atomicXchg16(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        m_assembler.xchgw_rm(reg, address.offset, address.base, address.index, address.scale);
+    }
+    
+    void atomicXchg32(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        m_assembler.xchgl_rm(reg, address.offset, address.base);
+    }
+    
+    void atomicXchg32(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        m_assembler.xchgl_rm(reg, address.offset, address.base, address.index, address.scale);
+    }
+    
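+    // Under x86's TSO memory model, every load already has acquire semantics and every store already has
+    // release semantics, so the Acq/Rel accessors lower to plain loads and stores.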
+    void loadAcq8(Address src, RegisterID dest)
+    {
+        load8(src, dest);
+    }
+    
+    void loadAcq8(BaseIndex src, RegisterID dest)
+    {
+        load8(src, dest);
+    }
+    
+    void loadAcq8SignedExtendTo32(Address src, RegisterID dest)
+    {
+        load8SignedExtendTo32(src, dest);
+    }
+    
+    void loadAcq8SignedExtendTo32(BaseIndex src, RegisterID dest)
+    {
+        load8SignedExtendTo32(src, dest);
+    }
+    
+    void loadAcq16(Address src, RegisterID dest)
+    {
+        load16(src, dest);
+    }
+    
+    void loadAcq16(BaseIndex src, RegisterID dest)
+    {
+        load16(src, dest);
+    }
+    
+    void loadAcq16SignedExtendTo32(Address src, RegisterID dest)
+    {
+        load16SignedExtendTo32(src, dest);
+    }
+    
+    void loadAcq16SignedExtendTo32(BaseIndex src, RegisterID dest)
+    {
+        load16SignedExtendTo32(src, dest);
+    }
+    
+    void loadAcq32(Address src, RegisterID dest)
+    {
+        load32(src, dest);
+    }
+    
+    void loadAcq32(BaseIndex src, RegisterID dest)
+    {
+        load32(src, dest);
+    }
+    
+    void storeRel8(RegisterID src, Address dest)
+    {
+        store8(src, dest);
+    }
+    
+    void storeRel8(RegisterID src, BaseIndex dest)
+    {
+        store8(src, dest);
+    }
+    
+    void storeRel8(TrustedImm32 imm, Address dest)
+    {
+        store8(imm, dest);
+    }
+    
+    void storeRel8(TrustedImm32 imm, BaseIndex dest)
+    {
+        store8(imm, dest);
+    }
+    
+    void storeRel16(RegisterID src, Address dest)
+    {
+        store16(src, dest);
+    }
+    
+    void storeRel16(RegisterID src, BaseIndex dest)
+    {
+        store16(src, dest);
+    }
+    
+    void storeRel16(TrustedImm32 imm, Address dest)
+    {
+        store16(imm, dest);
+    }
+    
+    void storeRel16(TrustedImm32 imm, BaseIndex dest)
+    {
+        store16(imm, dest);
+    }
+    
+    void storeRel32(RegisterID src, Address dest)
+    {
+        store32(src, dest);
+    }
+    
+    void storeRel32(RegisterID src, BaseIndex dest)
+    {
+        store32(src, dest);
+    }
+    
+    void storeRel32(TrustedImm32 imm, Address dest)
+    {
+        store32(imm, dest);
+    }
+    
+    void storeRel32(TrustedImm32 imm, BaseIndex dest)
+    {
+        store32(imm, dest);
+    }
+    
+    // We take this to mean that it prevents motion of normal stores. Since x86's TSO memory model never
+    // reorders stores with other stores, this is a no-op on x86.
+    void storeFence()
+    {
+    }
+
+    // We take this to mean that it prevents motion of normal loads. Since x86 never reorders loads with
+    // other loads, this is a no-op on x86.
+    void loadFence()
+    {
+    }
+
+    static void replaceWithBreakpoint(CodeLocationLabel instructionStart)
+    {
+        X86Assembler::replaceWithInt3(instructionStart.executableAddress());
+    }
+
+    static void replaceWithJump(CodeLocationLabel instructionStart, CodeLocationLabel destination)
+    {
+        X86Assembler::replaceWithJump(instructionStart.executableAddress(), destination.executableAddress());
+    }
+    
+    static ptrdiff_t maxJumpReplacementSize()
+    {
+        return X86Assembler::maxJumpReplacementSize();
+    }
+
+    static ptrdiff_t patchableJumpSize()
+    {
+        return X86Assembler::patchableJumpSize();
+    }
+
+    static bool supportsFloatingPointRounding()
+    {
+        if (s_sse4_1CheckState == CPUIDCheckState::NotChecked)
+            updateEax1EcxFlags();
+        return s_sse4_1CheckState == CPUIDCheckState::Set;
+    }
+
+    static bool supportsAVX()
+    {
+        // AVX still causes mysterious regressions and those regressions can be massive.
+        return false;
+    }
+
+    static void updateEax1EcxFlags()
+    {
+        int flags = 0;
+#if COMPILER(MSVC)
+        int cpuInfo[4];
+        __cpuid(cpuInfo, 0x1);
+        flags = cpuInfo[2];
+#elif COMPILER(GCC_OR_CLANG)
+#if CPU(X86_64)
+        asm (
+            "movl $0x1, %%eax;"
+            "cpuid;"
+            "movl %%ecx, %0;"
+            : "=g" (flags)
+            :
+            : "%eax", "%ebx", "%ecx", "%edx"
+            );
+#else
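+        // On 32-bit x86, %ebx may hold the PIC base register, so preserve it around cpuid.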
+        asm (
+            "movl $0x1, %%eax;"
+            "pushl %%ebx;"
+            "cpuid;"
+            "popl %%ebx;"
+            "movl %%ecx, %0;"
+            : "=g" (flags)
+            :
+            : "%eax", "%ecx", "%edx"
             );
 #endif
 #endif // COMPILER(GCC_OR_CLANG)
@@ -2838,6 +3940,18 @@ protected:
         return static_cast<X86Assembler::Condition>(cond);
     }
 
+    X86Assembler::Condition x86Condition(StatusCondition cond)
+    {
+        switch (cond) {
+        case Success:
+            return X86Assembler::ConditionE;
+        case Failure:
+            return X86Assembler::ConditionNE;
+        }
+        RELEASE_ASSERT_NOT_REACHED();
+        return X86Assembler::ConditionE;
+    }
+
     void set32(X86Assembler::Condition cond, RegisterID dest)
     {
 #if CPU(X86)
@@ -2932,6 +4046,35 @@ protected:
         move(TrustedImm32(sizeOfRegister), dst);
         srcIsNonZero.link(this);
     }
+    
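+    // cmpxchg implicitly uses eax as both the expected-value input and the old-value output, so these
+    // helpers swap expectedAndResult into eax around the cmpxchg emitted by func. cmpxchg sets ZF on
+    // success, and xchg does not modify flags, so the condition survives the swap back.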
+    template<typename Func>
+    void atomicStrongCAS(StatusCondition cond, RegisterID expectedAndResult, RegisterID result, const Func& func)
+    {
+        swap(expectedAndResult, X86Registers::eax);
+        m_assembler.lock();
+        func();
+        swap(expectedAndResult, X86Registers::eax);
+        set32(x86Condition(cond), result);
+    }
+
+    template<typename Func>
+    void atomicStrongCAS(RegisterID expectedAndResult, const Func& func)
+    {
+        swap(expectedAndResult, X86Registers::eax);
+        m_assembler.lock();
+        func();
+        swap(expectedAndResult, X86Registers::eax);
+    }
+
+    template<typename Func>
+    Jump branchAtomicStrongCAS(StatusCondition cond, RegisterID expectedAndResult, const Func& func)
+    {
+        swap(expectedAndResult, X86Registers::eax);
+        m_assembler.lock();
+        func();
+        swap(expectedAndResult, X86Registers::eax);
+        return Jump(m_assembler.jCC(x86Condition(cond)));
+    }
 
 private:
     // Only MacroAssemblerX86 should be using the following method; SSE2 is always available on
index 7e18412..41b2ed4 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2008, 2012, 2014-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2008-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -264,11 +264,21 @@ public:
         m_assembler.addq_mr(src.offset, src.base, dest);
     }
 
+    void add64(BaseIndex src, RegisterID dest)
+    {
+        m_assembler.addq_mr(src.offset, src.base, src.index, src.scale, dest);
+    }
+
     void add64(RegisterID src, Address dest)
     {
         m_assembler.addq_rm(src, dest.offset, dest.base);
     }
 
+    void add64(RegisterID src, BaseIndex dest)
+    {
+        m_assembler.addq_rm(src, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
     void add64(AbsoluteAddress src, RegisterID dest)
     {
         move(TrustedImmPtr(src.m_ptr), scratchRegister());
@@ -306,6 +316,14 @@ public:
             m_assembler.addq_im(imm.m_value, address.offset, address.base);
     }
 
+    void add64(TrustedImm32 imm, BaseIndex address)
+    {
+        if (imm.m_value == 1)
+            m_assembler.incq_m(address.offset, address.base, address.index, address.scale);
+        else
+            m_assembler.addq_im(imm.m_value, address.offset, address.base, address.index, address.scale);
+    }
+
     void add64(TrustedImm32 imm, AbsoluteAddress address)
     {
         move(TrustedImmPtr(address.m_ptr), scratchRegister());
@@ -342,11 +360,41 @@ public:
         m_assembler.andq_rr(src, dest);
     }
 
+    void and64(RegisterID src, Address dest)
+    {
+        m_assembler.andq_rm(src, dest.offset, dest.base);
+    }
+
+    void and64(RegisterID src, BaseIndex dest)
+    {
+        m_assembler.andq_rm(src, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
+    void and64(Address src, RegisterID dest)
+    {
+        m_assembler.andq_mr(src.offset, src.base, dest);
+    }
+
+    void and64(BaseIndex src, RegisterID dest)
+    {
+        m_assembler.andq_mr(src.offset, src.base, src.index, src.scale, dest);
+    }
+
     void and64(TrustedImm32 imm, RegisterID srcDest)
     {
         m_assembler.andq_ir(imm.m_value, srcDest);
     }
 
+    void and64(TrustedImm32 imm, Address dest)
+    {
+        m_assembler.andq_im(imm.m_value, dest.offset, dest.base);
+    }
+
+    void and64(TrustedImm32 imm, BaseIndex dest)
+    {
+        m_assembler.andq_im(imm.m_value, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
     void and64(TrustedImmPtr imm, RegisterID srcDest)
     {
         intptr_t intValue = imm.asIntptr();
@@ -552,11 +600,51 @@ public:
         m_assembler.negq_r(dest);
     }
 
+    void neg64(Address dest)
+    {
+        m_assembler.negq_m(dest.offset, dest.base);
+    }
+
+    void neg64(BaseIndex dest)
+    {
+        m_assembler.negq_m(dest.offset, dest.base, dest.index, dest.scale);
+    }
+
     void or64(RegisterID src, RegisterID dest)
     {
         m_assembler.orq_rr(src, dest);
     }
 
+    void or64(RegisterID src, Address dest)
+    {
+        m_assembler.orq_rm(src, dest.offset, dest.base);
+    }
+
+    void or64(RegisterID src, BaseIndex dest)
+    {
+        m_assembler.orq_rm(src, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
+    void or64(Address src, RegisterID dest)
+    {
+        m_assembler.orq_mr(src.offset, src.base, dest);
+    }
+
+    void or64(BaseIndex src, RegisterID dest)
+    {
+        m_assembler.orq_mr(src.offset, src.base, src.index, src.scale, dest);
+    }
+
+    void or64(TrustedImm32 imm, Address dest)
+    {
+        m_assembler.orq_im(imm.m_value, dest.offset, dest.base);
+    }
+
+    void or64(TrustedImm32 imm, BaseIndex dest)
+    {
+        m_assembler.orq_im(imm.m_value, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
     void or64(TrustedImm64 imm, RegisterID srcDest)
     {
         if (imm.m_value <= std::numeric_limits<int32_t>::max()
@@ -619,16 +707,31 @@ public:
         m_assembler.subq_im(imm.m_value, address.offset, address.base);
     }
 
+    void sub64(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.subq_im(imm.m_value, address.offset, address.base, address.index, address.scale);
+    }
+
     void sub64(Address src, RegisterID dest)
     {
         m_assembler.subq_mr(src.offset, src.base, dest);
     }
 
+    void sub64(BaseIndex src, RegisterID dest)
+    {
+        m_assembler.subq_mr(src.offset, src.base, src.index, src.scale, dest);
+    }
+
     void sub64(RegisterID src, Address dest)
     {
         m_assembler.subq_rm(src, dest.offset, dest.base);
     }
 
+    void sub64(RegisterID src, BaseIndex dest)
+    {
+        m_assembler.subq_rm(src, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
     void xor64(RegisterID src, RegisterID dest)
     {
         m_assembler.xorq_rr(src, dest);
@@ -651,6 +754,31 @@ public:
         m_assembler.xorq_rm(src, dest.offset, dest.base);
     }
 
+    void xor64(RegisterID src, BaseIndex dest)
+    {
+        m_assembler.xorq_rm(src, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
+    void xor64(Address src, RegisterID dest)
+    {
+        m_assembler.xorq_mr(src.offset, src.base, dest);
+    }
+
+    void xor64(BaseIndex src, RegisterID dest)
+    {
+        m_assembler.xorq_mr(src.offset, src.base, src.index, src.scale, dest);
+    }
+
+    void xor64(TrustedImm32 imm, Address dest)
+    {
+        m_assembler.xorq_im(imm.m_value, dest.offset, dest.base);
+    }
+
+    void xor64(TrustedImm32 imm, BaseIndex dest)
+    {
+        m_assembler.xorq_im(imm.m_value, dest.offset, dest.base, dest.index, dest.scale);
+    }
+
     void xor64(TrustedImm32 imm, RegisterID srcDest)
     {
         m_assembler.xorq_ir(imm.m_value, srcDest);
@@ -666,6 +794,11 @@ public:
         m_assembler.notq_m(dest.offset, dest.base);
     }
 
+    void not64(BaseIndex dest)
+    {
+        m_assembler.notq_m(dest.offset, dest.base, dest.index, dest.scale);
+    }
+
     void load64(ImplicitAddress address, RegisterID dest)
     {
         m_assembler.movq_mr(address.offset, address.base, dest);
@@ -725,6 +858,11 @@ public:
         m_assembler.movq_i32m(imm.m_value, address.offset, address.base);
     }
 
+    void store64(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.movq_i32m(imm.m_value, address.offset, address.base, address.index, address.scale);
+    }
+
     void store64(TrustedImm64 imm, ImplicitAddress address)
     {
         if (CAN_SIGN_EXTEND_32_64(imm.m_value)) {
@@ -1294,7 +1432,275 @@ public:
         MacroAssemblerX86Common::move(TrustedImmPtr(address.m_ptr), scratchRegister());
         return MacroAssemblerX86Common::branchTest8(cond, Address(scratchRegister()), mask8);
     }
+    
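+    // The 64-bit CAS operations follow the same pattern as the 32-bit ones: expectedAndResult is swapped
+    // into rax around lock cmpxchgq by the shared atomicStrongCAS helpers. For example (sketch), to install
+    // newValueReg at address iff it still holds expectedReg:
+    //     atomicStrongCAS64(Success, expectedReg, newValueReg, Address(base, offset), resultReg);
+    // leaves resultReg nonzero exactly when the exchange happened, with the old value in expectedReg.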
+    void atomicStrongCAS64(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicStrongCAS(cond, expectedAndResult, result, [&] { m_assembler.cmpxchgq_rm(newValue, address.offset, address.base); });
+    }
 
+    void atomicStrongCAS64(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, BaseIndex address, RegisterID result)
+    {
+        atomicStrongCAS(cond, expectedAndResult, result, [&] { m_assembler.cmpxchgq_rm(newValue, address.offset, address.base, address.index, address.scale); });
+    }
+
+    void atomicStrongCAS64(RegisterID expectedAndResult, RegisterID newValue, Address address)
+    {
+        atomicStrongCAS(expectedAndResult, [&] { m_assembler.cmpxchgq_rm(newValue, address.offset, address.base); });
+    }
+
+    void atomicStrongCAS64(RegisterID expectedAndResult, RegisterID newValue, BaseIndex address)
+    {
+        atomicStrongCAS(expectedAndResult, [&] { m_assembler.cmpxchgq_rm(newValue, address.offset, address.base, address.index, address.scale); });
+    }
+
+    Jump branchAtomicStrongCAS64(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address)
+    {
+        return branchAtomicStrongCAS(cond, expectedAndResult, [&] { m_assembler.cmpxchgq_rm(newValue, address.offset, address.base); });
+    }
+
+    Jump branchAtomicStrongCAS64(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, BaseIndex address)
+    {
+        return branchAtomicStrongCAS(cond, expectedAndResult, [&] { m_assembler.cmpxchgq_rm(newValue, address.offset, address.base, address.index, address.scale); });
+    }
+
+    void atomicWeakCAS64(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicStrongCAS64(cond, expectedAndClobbered, newValue, address, result);
+    }
+
+    void atomicWeakCAS64(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, BaseIndex address, RegisterID result)
+    {
+        atomicStrongCAS64(cond, expectedAndClobbered, newValue, address, result);
+    }
+
+    Jump branchAtomicWeakCAS64(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        return branchAtomicStrongCAS64(cond, expectedAndClobbered, newValue, address);
+    }
+
+    Jump branchAtomicWeakCAS64(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, BaseIndex address)
+    {
+        return branchAtomicStrongCAS64(cond, expectedAndClobbered, newValue, address);
+    }
+
+    void atomicRelaxedWeakCAS64(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address, RegisterID result)
+    {
+        atomicStrongCAS64(cond, expectedAndClobbered, newValue, address, result);
+    }
+
+    void atomicRelaxedWeakCAS64(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, BaseIndex address, RegisterID result)
+    {
+        atomicStrongCAS64(cond, expectedAndClobbered, newValue, address, result);
+    }
+
+    Jump branchAtomicRelaxedWeakCAS64(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
+    {
+        return branchAtomicStrongCAS64(cond, expectedAndClobbered, newValue, address);
+    }
+
+    Jump branchAtomicRelaxedWeakCAS64(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, BaseIndex address)
+    {
+        return branchAtomicStrongCAS64(cond, expectedAndClobbered, newValue, address);
+    }
+
+    void atomicAdd64(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        add64(imm, address);
+    }
+    
+    void atomicAdd64(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        add64(imm, address);
+    }
+    
+    void atomicAdd64(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        add64(reg, address);
+    }
+    
+    void atomicAdd64(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        add64(reg, address);
+    }
+    
+    void atomicSub64(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        sub64(imm, address);
+    }
+    
+    void atomicSub64(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        sub64(imm, address);
+    }
+    
+    void atomicSub64(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        sub64(reg, address);
+    }
+    
+    void atomicSub64(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        sub64(reg, address);
+    }
+    
+    void atomicAnd64(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        and64(imm, address);
+    }
+    
+    void atomicAnd64(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        and64(imm, address);
+    }
+    
+    void atomicAnd64(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        and64(reg, address);
+    }
+    
+    void atomicAnd64(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        and64(reg, address);
+    }
+    
+    void atomicOr64(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        or64(imm, address);
+    }
+    
+    void atomicOr64(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        or64(imm, address);
+    }
+    
+    void atomicOr64(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        or64(reg, address);
+    }
+    
+    void atomicOr64(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        or64(reg, address);
+    }
+    
+    void atomicXor64(TrustedImm32 imm, Address address)
+    {
+        m_assembler.lock();
+        xor64(imm, address);
+    }
+    
+    void atomicXor64(TrustedImm32 imm, BaseIndex address)
+    {
+        m_assembler.lock();
+        xor64(imm, address);
+    }
+    
+    void atomicXor64(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        xor64(reg, address);
+    }
+    
+    void atomicXor64(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        xor64(reg, address);
+    }
+    
+    void atomicNeg64(Address address)
+    {
+        m_assembler.lock();
+        neg64(address);
+    }
+    
+    void atomicNeg64(BaseIndex address)
+    {
+        m_assembler.lock();
+        neg64(address);
+    }
+    
+    void atomicNot64(Address address)
+    {
+        m_assembler.lock();
+        not64(address);
+    }
+    
+    void atomicNot64(BaseIndex address)
+    {
+        m_assembler.lock();
+        not64(address);
+    }
+    
+    void atomicXchgAdd64(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        m_assembler.xaddq_rm(reg, address.offset, address.base);
+    }
+    
+    void atomicXchgAdd64(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        m_assembler.xaddq_rm(reg, address.offset, address.base, address.index, address.scale);
+    }
+    
+    void atomicXchg64(RegisterID reg, Address address)
+    {
+        m_assembler.lock();
+        m_assembler.xchgq_rm(reg, address.offset, address.base);
+    }
+    
+    void atomicXchg64(RegisterID reg, BaseIndex address)
+    {
+        m_assembler.lock();
+        m_assembler.xchgq_rm(reg, address.offset, address.base, address.index, address.scale);
+    }
+    
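+    // As with the narrower sizes, plain 64-bit loads and stores already have acquire/release semantics on
+    // x86, so no fencing is emitted here.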
+    void loadAcq64(Address src, RegisterID dest)
+    {
+        load64(src, dest);
+    }
+    
+    void loadAcq64(BaseIndex src, RegisterID dest)
+    {
+        load64(src, dest);
+    }
+    
+    void storeRel64(RegisterID src, Address dest)
+    {
+        store64(src, dest);
+    }
+    
+    void storeRel64(RegisterID src, BaseIndex dest)
+    {
+        store64(src, dest);
+    }
+    
+    void storeRel64(TrustedImm32 imm, Address dest)
+    {
+        store64(imm, dest);
+    }
+    
+    void storeRel64(TrustedImm32 imm, BaseIndex dest)
+    {
+        store64(imm, dest);
+    }
+    
     void truncateDoubleToUint32(FPRegisterID src, RegisterID dest)
     {
         m_assembler.cvttsd2siq_rr(src, dest);
index 6671f2c..e25f18e 100644 (file)
@@ -190,16 +190,20 @@ private:
         OP_ADD_EvGv                     = 0x01,
         OP_ADD_GvEv                     = 0x03,
         OP_ADD_EAXIv                    = 0x05,
+        OP_OR_EvGb                      = 0x08,
         OP_OR_EvGv                      = 0x09,
         OP_OR_GvEv                      = 0x0B,
         OP_OR_EAXIv                     = 0x0D,
         OP_2BYTE_ESCAPE                 = 0x0F,
+        OP_AND_EvGb                     = 0x20,
         OP_AND_EvGv                     = 0x21,
         OP_AND_GvEv                     = 0x23,
+        OP_SUB_EvGb                     = 0x28,
         OP_SUB_EvGv                     = 0x29,
         OP_SUB_GvEv                     = 0x2B,
         OP_SUB_EAXIv                    = 0x2D,
         PRE_PREDICT_BRANCH_NOT_TAKEN    = 0x2E,
+        OP_XOR_EvGb                     = 0x30,
         OP_XOR_EvGv                     = 0x31,
         OP_XOR_GvEv                     = 0x33,
         OP_XOR_EAXIv                    = 0x35,
@@ -223,6 +227,7 @@ private:
         OP_GROUP1_EvIb                  = 0x83,
         OP_TEST_EbGb                    = 0x84,
         OP_TEST_EvGv                    = 0x85,
+        OP_XCHG_EvGb                    = 0x86,
         OP_XCHG_EvGv                    = 0x87,
         OP_MOV_EbGb                     = 0x88,
         OP_MOV_EvGv                     = 0x89,
@@ -252,6 +257,7 @@ private:
         PRE_SSE_F2                      = 0xF2,
         PRE_SSE_F3                      = 0xF3,
         OP_HLT                          = 0xF4,
+        OP_GROUP3_Eb                    = 0xF6,
         OP_GROUP3_EbIb                  = 0xF6,
         OP_GROUP3_Ev                    = 0xF7,
         OP_GROUP3_EvIz                  = 0xF7, // OP_GROUP3_Ev has an immediate, when instruction is a test. 
@@ -290,6 +296,8 @@ private:
         OP_SETCC            = 0x90,
         OP2_3BYTE_ESCAPE_AE = 0xAE,
         OP2_IMUL_GvEv       = 0xAF,
+        OP2_CMPXCHGb        = 0xB0,
+        OP2_CMPXCHG         = 0xB1,
         OP2_MOVZX_GvEb      = 0xB6,
         OP2_BSF             = 0xBC,
         OP2_TZCNT           = 0xBC,
@@ -298,6 +306,8 @@ private:
         OP2_MOVSX_GvEb      = 0xBE,
         OP2_MOVZX_GvEw      = 0xB7,
         OP2_MOVSX_GvEw      = 0xBF,
+        OP2_XADDb           = 0xC0,
+        OP2_XADD            = 0xC1,
         OP2_PEXTRW_GdUdIb   = 0xC5,
         OP2_PSLLQ_UdqIb     = 0x73,
         OP2_PSRLQ_UdqIb     = 0x73,
@@ -440,6 +450,11 @@ public:
         m_formatter.oneByteOp(OP_ADD_GvEv, dst, base, offset);
     }
     
+    void addl_mr(int offset, RegisterID base, RegisterID index, int scale, RegisterID dst)
+    {
+        m_formatter.oneByteOp(OP_ADD_GvEv, dst, base, index, scale, offset);
+    }
+    
 #if !CPU(X86_64)
     void addl_mr(const void* addr, RegisterID dst)
     {
@@ -562,11 +577,21 @@ public:
         m_formatter.oneByteOp64(OP_ADD_GvEv, dst, base, offset);
     }
 
+    void addq_mr(int offset, RegisterID base, RegisterID index, int scale, RegisterID dst)
+    {
+        m_formatter.oneByteOp64(OP_ADD_GvEv, dst, base, index, scale, offset);
+    }
+
     void addq_rm(RegisterID src, int offset, RegisterID base)
     {
         m_formatter.oneByteOp64(OP_ADD_EvGv, src, base, offset);
     }
 
+    void addq_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp64(OP_ADD_EvGv, src, base, index, scale, offset);
+    }
+
     void addq_ir(int imm, RegisterID dst)
     {
         if (CAN_SIGN_EXTEND_8_32(imm)) {
@@ -591,6 +616,17 @@ public:
             m_formatter.immediate32(imm);
         }
     }
+
+    void addq_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp64(OP_GROUP1_EvIb, GROUP1_OP_ADD, base, index, scale, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp64(OP_GROUP1_EvIz, GROUP1_OP_ADD, base, index, scale, offset);
+            m_formatter.immediate32(imm);
+        }
+    }
 #else
     void addl_im(int imm, const void* addr)
     {
@@ -614,11 +650,43 @@ public:
         m_formatter.oneByteOp(OP_AND_GvEv, dst, base, offset);
     }
 
+    void andl_mr(int offset, RegisterID base, RegisterID index, int scale, RegisterID dst)
+    {
+        m_formatter.oneByteOp(OP_AND_GvEv, dst, base, index, scale, offset);
+    }
+
     void andl_rm(RegisterID src, int offset, RegisterID base)
     {
         m_formatter.oneByteOp(OP_AND_EvGv, src, base, offset);
     }
 
+    void andl_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_AND_EvGv, src, base, index, scale, offset);
+    }
+
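+    // The 0x66 operand-size prefix turns the 32-bit forms into their 16-bit counterparts.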
+    void andw_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        andl_rm(src, offset, base);
+    }
+
+    void andw_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        andl_rm(src, offset, base, index, scale);
+    }
+
+    void andb_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.oneByteOp(OP_AND_EvGb, src, base, offset);
+    }
+
+    void andb_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_AND_EvGb, src, base, index, scale, offset);
+    }
+
     void andl_ir(int imm, RegisterID dst)
     {
         if (CAN_SIGN_EXTEND_8_32(imm)) {
@@ -641,6 +709,53 @@ public:
         }
     }
 
+    void andl_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp(OP_GROUP1_EvIb, GROUP1_OP_AND, base, index, scale, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp(OP_GROUP1_EvIz, GROUP1_OP_AND, base, index, scale, offset);
+            m_formatter.immediate32(imm);
+        }
+    }
+
+    void andw_im(int imm, int offset, RegisterID base)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp(OP_GROUP1_EvIb, GROUP1_OP_AND, base, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp(OP_GROUP1_EvIz, GROUP1_OP_AND, base, offset);
+            m_formatter.immediate16(imm);
+        }
+    }
+
+    void andw_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp(OP_GROUP1_EvIb, GROUP1_OP_AND, base, index, scale, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp(OP_GROUP1_EvIz, GROUP1_OP_AND, base, index, scale, offset);
+            m_formatter.immediate16(imm);
+        }
+    }
+
+    void andb_im(int imm, int offset, RegisterID base)
+    {
+        m_formatter.oneByteOp(OP_GROUP1_EbIb, GROUP1_OP_AND, base, offset);
+        m_formatter.immediate8(imm);
+    }
+
+    void andb_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_GROUP1_EbIb, GROUP1_OP_AND, base, index, scale, offset);
+        m_formatter.immediate8(imm);
+    }
+
 #if CPU(X86_64)
     void andq_rr(RegisterID src, RegisterID dst)
     {
@@ -657,6 +772,48 @@ public:
             m_formatter.immediate32(imm);
         }
     }
+
+    void andq_mr(int offset, RegisterID base, RegisterID dst)
+    {
+        m_formatter.oneByteOp64(OP_AND_GvEv, dst, base, offset);
+    }
+
+    void andq_mr(int offset, RegisterID base, RegisterID index, int scale, RegisterID dst)
+    {
+        m_formatter.oneByteOp64(OP_AND_GvEv, dst, base, index, scale, offset);
+    }
+
+    void andq_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.oneByteOp64(OP_AND_EvGv, src, base, offset);
+    }
+
+    void andq_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp64(OP_AND_EvGv, src, base, index, scale, offset);
+    }
+
+    void andq_im(int imm, int offset, RegisterID base)
+    {
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp64(OP_GROUP1_EvIb, GROUP1_OP_AND, base, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp64(OP_GROUP1_EvIz, GROUP1_OP_AND, base, offset);
+            m_formatter.immediate32(imm);
+        }
+    }
+
+    void andq_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp64(OP_GROUP1_EvIb, GROUP1_OP_AND, base, index, scale, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp64(OP_GROUP1_EvIz, GROUP1_OP_AND, base, index, scale, offset);
+            m_formatter.immediate32(imm);
+        }
+    }
 #else
     void andl_im(int imm, const void* addr)
     {
@@ -703,6 +860,11 @@ public:
     {
         m_formatter.oneByteOp64(OP_GROUP5_Ev, GROUP1_OP_ADD, base, offset);
     }
+
+    void incq_m(int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp64(OP_GROUP5_Ev, GROUP1_OP_ADD, base, index, scale, offset);
+    }
 #endif // CPU(X86_64)
 
     void negl_r(RegisterID dst)
@@ -715,6 +877,16 @@ public:
     {
         m_formatter.oneByteOp64(OP_GROUP3_Ev, GROUP3_OP_NEG, dst);
     }
+
+    void negq_m(int offset, RegisterID base)
+    {
+        m_formatter.oneByteOp64(OP_GROUP3_Ev, GROUP3_OP_NEG, base, offset);
+    }
+
+    void negq_m(int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp64(OP_GROUP3_Ev, GROUP3_OP_NEG, base, index, scale, offset);
+    }
 #endif
 
     void negl_m(int offset, RegisterID base)
@@ -722,6 +894,33 @@ public:
         m_formatter.oneByteOp(OP_GROUP3_Ev, GROUP3_OP_NEG, base, offset);
     }
 
+    void negl_m(int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_GROUP3_Ev, GROUP3_OP_NEG, base, index, scale, offset);
+    }
+
+    void negw_m(int offset, RegisterID base)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        negl_m(offset, base);
+    }
+
+    void negw_m(int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        negl_m(offset, base, index, scale);
+    }
+
+    void negb_m(int offset, RegisterID base)
+    {
+        m_formatter.oneByteOp(OP_GROUP3_Eb, GROUP3_OP_NEG, base, offset);
+    }
+
+    void negb_m(int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_GROUP3_Eb, GROUP3_OP_NEG, base, index, scale, offset);
+    }
+
     void notl_r(RegisterID dst)
     {
         m_formatter.oneByteOp(OP_GROUP3_Ev, GROUP3_OP_NOT, dst);
@@ -732,6 +931,33 @@ public:
         m_formatter.oneByteOp(OP_GROUP3_Ev, GROUP3_OP_NOT, base, offset);
     }
 
+    void notl_m(int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_GROUP3_Ev, GROUP3_OP_NOT, base, index, scale, offset);
+    }
+
+    void notw_m(int offset, RegisterID base)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        notl_m(offset, base);
+    }
+
+    void notw_m(int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        notl_m(offset, base, index, scale);
+    }
+
+    void notb_m(int offset, RegisterID base)
+    {
+        m_formatter.oneByteOp(OP_GROUP3_Eb, GROUP3_OP_NOT, base, offset);
+    }
+
+    void notb_m(int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_GROUP3_Eb, GROUP3_OP_NOT, base, index, scale, offset);
+    }
+
 #if CPU(X86_64)
     void notq_r(RegisterID dst)
     {
@@ -742,6 +968,11 @@ public:
     {
         m_formatter.oneByteOp64(OP_GROUP3_Ev, GROUP3_OP_NOT, base, offset);
     }
+
+    void notq_m(int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp64(OP_GROUP3_Ev, GROUP3_OP_NOT, base, index, scale, offset);
+    }
 #endif
 
     void orl_rr(RegisterID src, RegisterID dst)
@@ -754,11 +985,43 @@ public:
         m_formatter.oneByteOp(OP_OR_GvEv, dst, base, offset);
     }
 
+    void orl_mr(int offset, RegisterID base, RegisterID index, int scale, RegisterID dst)
+    {
+        m_formatter.oneByteOp(OP_OR_GvEv, dst, base, index, scale, offset);
+    }
+
     void orl_rm(RegisterID src, int offset, RegisterID base)
     {
         m_formatter.oneByteOp(OP_OR_EvGv, src, base, offset);
     }
 
+    void orl_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_OR_EvGv, src, base, index, scale, offset);
+    }
+
+    void orw_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        orl_rm(src, offset, base);
+    }
+
+    void orw_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        orl_rm(src, offset, base, index, scale);
+    }
+
+    void orb_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.oneByteOp(OP_OR_EvGb, src, base, offset);
+    }
+
+    void orb_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_OR_EvGb, src, base, index, scale, offset);
+    }
+
     void orl_ir(int imm, RegisterID dst)
     {
         if (CAN_SIGN_EXTEND_8_32(imm)) {
@@ -784,12 +1047,101 @@ public:
         }
     }
 
+    void orl_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp(OP_GROUP1_EvIb, GROUP1_OP_OR, base, index, scale, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp(OP_GROUP1_EvIz, GROUP1_OP_OR, base, index, scale, offset);
+            m_formatter.immediate32(imm);
+        }
+    }
+
+    void orw_im(int imm, int offset, RegisterID base)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp(OP_GROUP1_EvIb, GROUP1_OP_OR, base, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp(OP_GROUP1_EvIz, GROUP1_OP_OR, base, offset);
+            m_formatter.immediate16(imm);
+        }
+    }
+
+    void orw_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp(OP_GROUP1_EvIb, GROUP1_OP_OR, base, index, scale, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp(OP_GROUP1_EvIz, GROUP1_OP_OR, base, index, scale, offset);
+            m_formatter.immediate16(imm);
+        }
+    }
+
+    void orb_im(int imm, int offset, RegisterID base)
+    {
+        m_formatter.oneByteOp(OP_GROUP1_EbIb, GROUP1_OP_OR, base, offset);
+        m_formatter.immediate8(imm);
+    }
+
+    void orb_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_GROUP1_EbIb, GROUP1_OP_OR, base, index, scale, offset);
+        m_formatter.immediate8(imm);
+    }
+
 #if CPU(X86_64)
     void orq_rr(RegisterID src, RegisterID dst)
     {
         m_formatter.oneByteOp64(OP_OR_EvGv, src, dst);
     }
 
+    void orq_mr(int offset, RegisterID base, RegisterID dst)
+    {
+        m_formatter.oneByteOp64(OP_OR_GvEv, dst, base, offset);
+    }
+
+    void orq_mr(int offset, RegisterID base, RegisterID index, int scale, RegisterID dst)
+    {
+        m_formatter.oneByteOp64(OP_OR_GvEv, dst, base, index, scale, offset);
+    }
+
+    void orq_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.oneByteOp64(OP_OR_EvGv, src, base, offset);
+    }
+
+    void orq_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp64(OP_OR_EvGv, src, base, index, scale, offset);
+    }
+
+    void orq_im(int imm, int offset, RegisterID base)
+    {
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp64(OP_GROUP1_EvIb, GROUP1_OP_OR, base, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp64(OP_GROUP1_EvIz, GROUP1_OP_OR, base, offset);
+            m_formatter.immediate32(imm);
+        }
+    }
+
+    void orq_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp64(OP_GROUP1_EvIb, GROUP1_OP_OR, base, index, scale, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp64(OP_GROUP1_EvIz, GROUP1_OP_OR, base, index, scale, offset);
+            m_formatter.immediate32(imm);
+        }
+    }
+
     void orq_ir(int imm, RegisterID dst)
     {
         if (CAN_SIGN_EXTEND_8_32(imm)) {
@@ -831,11 +1183,43 @@ public:
         m_formatter.oneByteOp(OP_SUB_GvEv, dst, base, offset);
     }
 
+    void subl_mr(int offset, RegisterID base, RegisterID index, int scale, RegisterID dst)
+    {
+        m_formatter.oneByteOp(OP_SUB_GvEv, dst, base, index, scale, offset);
+    }
+
     void subl_rm(RegisterID src, int offset, RegisterID base)
     {
         m_formatter.oneByteOp(OP_SUB_EvGv, src, base, offset);
     }
 
+    void subl_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_SUB_EvGv, src, base, index, scale, offset);
+    }
+
+    void subw_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        m_formatter.oneByteOp(OP_SUB_EvGv, src, base, offset);
+    }
+
+    void subw_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        m_formatter.oneByteOp(OP_SUB_EvGv, src, base, index, scale, offset);
+    }
+
+    void subb_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.oneByteOp(OP_SUB_EvGb, src, base, offset);
+    }
+
+    void subb_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_SUB_EvGb, src, base, index, scale, offset);
+    }
+
     void subl_ir(int imm, RegisterID dst)
     {
         if (CAN_SIGN_EXTEND_8_32(imm)) {
@@ -861,6 +1245,53 @@ public:
         }
     }
 
+    void subl_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp(OP_GROUP1_EvIb, GROUP1_OP_SUB, base, index, scale, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp(OP_GROUP1_EvIz, GROUP1_OP_SUB, base, index, scale, offset);
+            m_formatter.immediate32(imm);
+        }
+    }
+
+    void subw_im(int imm, int offset, RegisterID base)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp(OP_GROUP1_EvIb, GROUP1_OP_SUB, base, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp(OP_GROUP1_EvIz, GROUP1_OP_SUB, base, offset);
+            m_formatter.immediate16(imm);
+        }
+    }
+
+    void subw_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp(OP_GROUP1_EvIb, GROUP1_OP_SUB, base, index, scale, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp(OP_GROUP1_EvIz, GROUP1_OP_SUB, base, index, scale, offset);
+            m_formatter.immediate16(imm);
+        }
+    }
+
+    void subb_im(int imm, int offset, RegisterID base)
+    {
+        m_formatter.oneByteOp(OP_GROUP1_EbIb, GROUP1_OP_SUB, base, offset);
+        m_formatter.immediate8(imm);
+    }
+
+    void subb_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_GROUP1_EbIb, GROUP1_OP_SUB, base, index, scale, offset);
+        m_formatter.immediate8(imm);
+    }
+
 #if CPU(X86_64)
     void subq_rr(RegisterID src, RegisterID dst)
     {
@@ -872,11 +1303,21 @@ public:
         m_formatter.oneByteOp64(OP_SUB_GvEv, dst, base, offset);
     }
 
+    void subq_mr(int offset, RegisterID base, RegisterID index, int scale, RegisterID dst)
+    {
+        m_formatter.oneByteOp64(OP_SUB_GvEv, dst, base, index, scale, offset);
+    }
+
     void subq_rm(RegisterID src, int offset, RegisterID base)
     {
         m_formatter.oneByteOp64(OP_SUB_EvGv, src, base, offset);
     }
 
+    void subq_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp64(OP_SUB_EvGv, src, base, index, scale, offset);
+    }
+
     void subq_ir(int imm, RegisterID dst)
     {
         if (CAN_SIGN_EXTEND_8_32(imm)) {
@@ -901,6 +1342,17 @@ public:
             m_formatter.immediate32(imm);
         }
     }
+
+    void subq_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp64(OP_GROUP1_EvIb, GROUP1_OP_SUB, base, index, scale, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp64(OP_GROUP1_EvIz, GROUP1_OP_SUB, base, index, scale, offset);
+            m_formatter.immediate32(imm);
+        }
+    }
 #else
     void subl_im(int imm, const void* addr)
     {
@@ -924,11 +1376,21 @@ public:
         m_formatter.oneByteOp(OP_XOR_GvEv, dst, base, offset);
     }
 
+    void xorl_mr(int offset, RegisterID base, RegisterID index, int scale, RegisterID dst)
+    {
+        m_formatter.oneByteOp(OP_XOR_GvEv, dst, base, index, scale, offset);
+    }
+
     void xorl_rm(RegisterID src, int offset, RegisterID base)
     {
         m_formatter.oneByteOp(OP_XOR_EvGv, src, base, offset);
     }
 
+    void xorl_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_XOR_EvGv, src, base, index, scale, offset);
+    }
+
     void xorl_im(int imm, int offset, RegisterID base)
     {
         if (CAN_SIGN_EXTEND_8_32(imm)) {
@@ -940,6 +1402,75 @@ public:
         }
     }
 
+    void xorl_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp(OP_GROUP1_EvIb, GROUP1_OP_XOR, base, index, scale, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp(OP_GROUP1_EvIz, GROUP1_OP_XOR, base, index, scale, offset);
+            m_formatter.immediate32(imm);
+        }
+    }
+
+    void xorw_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        xorl_rm(src, offset, base);
+    }
+
+    void xorw_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        xorl_rm(src, offset, base, index, scale);
+    }
+
+    void xorw_im(int imm, int offset, RegisterID base)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp(OP_GROUP1_EvIb, GROUP1_OP_XOR, base, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp(OP_GROUP1_EvIz, GROUP1_OP_XOR, base, offset);
+            m_formatter.immediate16(imm);
+        }
+    }
+
+    void xorw_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp(OP_GROUP1_EvIb, GROUP1_OP_XOR, base, index, scale, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp(OP_GROUP1_EvIz, GROUP1_OP_XOR, base, index, scale, offset);
+            m_formatter.immediate16(imm);
+        }
+    }
+
+    void xorb_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.oneByteOp(OP_XOR_EvGb, src, base, offset);
+    }
+
+    void xorb_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_XOR_EvGb, src, base, index, scale, offset);
+    }
+
+    void xorb_im(int imm, int offset, RegisterID base)
+    {
+        m_formatter.oneByteOp(OP_GROUP1_EbIb, GROUP1_OP_XOR, base, offset);
+        m_formatter.immediate8(imm);
+    }
+
+    void xorb_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_GROUP1_EbIb, GROUP1_OP_XOR, base, index, scale, offset);
+        m_formatter.immediate8(imm);
+    }
+
     void xorl_ir(int imm, RegisterID dst)
     {
         if (CAN_SIGN_EXTEND_8_32(imm)) {
@@ -974,11 +1505,47 @@ public:
         }
     }
     
+    void xorq_im(int imm, int offset, RegisterID base)
+    {
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp64(OP_GROUP1_EvIb, GROUP1_OP_XOR, base, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp64(OP_GROUP1_EvIz, GROUP1_OP_XOR, base, offset);
+            m_formatter.immediate32(imm);
+        }
+    }
+    
+    void xorq_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        if (CAN_SIGN_EXTEND_8_32(imm)) {
+            m_formatter.oneByteOp64(OP_GROUP1_EvIb, GROUP1_OP_XOR, base, index, scale, offset);
+            m_formatter.immediate8(imm);
+        } else {
+            m_formatter.oneByteOp64(OP_GROUP1_EvIz, GROUP1_OP_XOR, base, index, scale, offset);
+            m_formatter.immediate32(imm);
+        }
+    }
+    
     void xorq_rm(RegisterID src, int offset, RegisterID base)
     {
         m_formatter.oneByteOp64(OP_XOR_EvGv, src, base, offset);
     }
 
+    void xorq_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp64(OP_XOR_EvGv, src, base, index, scale, offset);
+    }
+
+    void xorq_mr(int offset, RegisterID base, RegisterID dest)
+    {
+        m_formatter.oneByteOp64(OP_XOR_GvEv, dest, base, offset);
+    }
+
+    void xorq_mr(int offset, RegisterID base, RegisterID index, int scale, RegisterID dest)
+    {
+        m_formatter.oneByteOp64(OP_XOR_GvEv, dest, base, index, scale, offset);
+    }
 #endif
 
     void lzcnt_rr(RegisterID src, RegisterID dst)
@@ -1586,11 +2153,38 @@ public:
             m_formatter.oneByteOp(OP_XCHG_EvGv, src, dst);
     }
 
+    void xchgb_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.oneByteOp(OP_XCHG_EvGb, src, base, offset);
+    }
+
+    void xchgb_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_XCHG_EvGb, src, base, index, scale, offset);
+    }
+
+    void xchgw_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        m_formatter.oneByteOp(OP_XCHG_EvGv, src, base, offset);
+    }
+
+    void xchgw_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        m_formatter.oneByteOp(OP_XCHG_EvGv, src, base, index, scale, offset);
+    }
+
     void xchgl_rm(RegisterID src, int offset, RegisterID base)
     {
         m_formatter.oneByteOp(OP_XCHG_EvGv, src, base, offset);
     }
 
+    void xchgl_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp(OP_XCHG_EvGv, src, base, index, scale, offset);
+    }
+
 #if CPU(X86_64)
     void xchgq_rr(RegisterID src, RegisterID dst)
     {
@@ -1606,6 +2200,11 @@ public:
     {
         m_formatter.oneByteOp64(OP_XCHG_EvGv, src, base, offset);
     }
+
+    void xchgq_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp64(OP_XCHG_EvGv, src, base, index, scale, offset);
+    }
 #endif
 
     void movl_rr(RegisterID src, RegisterID dst)
@@ -1731,6 +2330,20 @@ public:
         m_formatter.oneByteOp8(OP_MOV_EvGv, src, base, index, scale, offset);
     }
 
+    void movw_im(int imm, int offset, RegisterID base)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        m_formatter.oneByteOp(OP_GROUP11_EvIz, GROUP11_MOV, base, offset);
+        m_formatter.immediate16(imm);
+    }
+
+    void movw_im(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        m_formatter.oneByteOp(OP_GROUP11_EvIz, GROUP11_MOV, base, index, scale, offset);
+        m_formatter.immediate16(imm);
+    }
+
     void movl_EAXm(const void* addr)
     {
         m_formatter.oneByteOp(OP_MOV_OvEAX);
@@ -1800,6 +2413,12 @@ public:
         m_formatter.immediate32(imm);
     }
 
+    void movq_i32m(int imm, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.oneByteOp64(OP_GROUP11_EvIz, GROUP11_MOV, base, index, scale, offset);
+        m_formatter.immediate32(imm);
+    }
+
     void movq_i64r(int64_t imm, RegisterID dst)
     {
         m_formatter.oneByteOp64(OP_MOV_EAXIv, dst);
@@ -2781,6 +3400,94 @@ public:
         m_formatter.prefix(PRE_LOCK);
     }
     
+    void cmpxchgb_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.twoByteOp8(OP2_CMPXCHGb, src, base, offset);
+    }
+    
+    void cmpxchgb_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.twoByteOp8(OP2_CMPXCHGb, src, base, index, scale, offset);
+    }
+    
+    void cmpxchgw_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        m_formatter.twoByteOp(OP2_CMPXCHG, src, base, offset);
+    }
+    
+    void cmpxchgw_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        m_formatter.twoByteOp(OP2_CMPXCHG, src, base, index, scale, offset);
+    }
+    
+    void cmpxchgl_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.twoByteOp(OP2_CMPXCHG, src, base, offset);
+    }
+    
+    void cmpxchgl_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.twoByteOp(OP2_CMPXCHG, src, base, index, scale, offset);
+    }
+
+#if CPU(X86_64)    
+    void cmpxchgq_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.twoByteOp64(OP2_CMPXCHG, src, base, offset);
+    }
+    
+    void cmpxchgq_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.twoByteOp64(OP2_CMPXCHG, src, base, index, scale, offset);
+    }
+#endif // CPU(X86_64)
+    
+    void xaddb_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.twoByteOp8(OP2_XADDb, src, base, offset);
+    }
+    
+    void xaddb_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.twoByteOp8(OP2_XADDb, src, base, index, scale, offset);
+    }
+    
+    void xaddw_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        m_formatter.twoByteOp(OP2_XADD, src, base, offset);
+    }
+    
+    void xaddw_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.prefix(PRE_OPERAND_SIZE);
+        m_formatter.twoByteOp(OP2_XADD, src, base, index, scale, offset);
+    }
+    
+    void xaddl_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.twoByteOp(OP2_XADD, src, base, offset);
+    }
+    
+    void xaddl_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.twoByteOp(OP2_XADD, src, base, index, scale, offset);
+    }
+
+#if CPU(X86_64)    
+    void xaddq_rm(RegisterID src, int offset, RegisterID base)
+    {
+        m_formatter.twoByteOp64(OP2_XADD, src, base, offset);
+    }
+    
+    void xaddq_rm(RegisterID src, int offset, RegisterID base, RegisterID index, int scale)
+    {
+        m_formatter.twoByteOp64(OP2_XADD, src, base, index, scale, offset);
+    }
+#endif // CPU(X86_64)
+    
     void mfence()
     {
         m_formatter.threeByteOp(OP2_3BYTE_ESCAPE_AE, OP3_MFENCE);
@@ -3763,6 +4470,24 @@ private:
             writer.registerModRM(groupOp, rm);
         }
 
+        void twoByteOp8(TwoByteOpcodeID opcode, int reg, RegisterID base, int offset)
+        {
+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIf(byteRegRequiresRex(reg, base), reg, 0, base);
+            writer.putByteUnchecked(OP_2BYTE_ESCAPE);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM(reg, base, offset);
+        }
+
+        void twoByteOp8(TwoByteOpcodeID opcode, int reg, RegisterID base, RegisterID index, int scale, int offset)
+        {
+            SingleInstructionBufferWriter writer(m_buffer);
+            writer.emitRexIf(byteRegRequiresRex(reg) || regRequiresRex(index, base), reg, index, base);
+            writer.putByteUnchecked(OP_2BYTE_ESCAPE);
+            writer.putByteUnchecked(opcode);
+            writer.memoryModRM(reg, base, index, scale, offset);
+        }
+
         // Immediates:
         //
         // An immediate should be appended where appropriate after an op has been emitted.
diff --git a/Source/JavaScriptCore/b3/B3AtomicValue.cpp b/Source/JavaScriptCore/b3/B3AtomicValue.cpp
new file mode 100644 (file)
index 0000000..70c35dc
--- /dev/null
@@ -0,0 +1,86 @@
+/*
+ * Copyright (C) 2017 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "config.h"
+#include "B3AtomicValue.h"
+
+#if ENABLE(B3_JIT)
+
+namespace JSC { namespace B3 {
+
+AtomicValue::~AtomicValue()
+{
+}
+
+void AtomicValue::dumpMeta(CommaPrinter& comma, PrintStream& out) const
+{
+    out.print(comma, "width = ", m_width);
+    
+    MemoryValue::dumpMeta(comma, out);
+}
+
+Value* AtomicValue::cloneImpl() const
+{
+    return new AtomicValue(*this);
+}
+
+AtomicValue::AtomicValue(Kind kind, Origin origin, Width width, Value* operand, Value* pointer, int32_t offset, HeapRange range, HeapRange fenceRange)
+    : MemoryValue(CheckedOpcode, kind, operand->type(), origin, offset, range, fenceRange, operand, pointer)
+    , m_width(width)
+{
+    ASSERT(bestType(GP, accessWidth()) == accessType());
+    
+    switch (kind.opcode()) {
+    case AtomicXchgAdd:
+    case AtomicXchgAnd:
+    case AtomicXchgOr:
+    case AtomicXchgSub:
+    case AtomicXchgXor:
+    case AtomicXchg:
+        break;
+    default:
+        ASSERT_NOT_REACHED();
+    }
+}
+
+AtomicValue::AtomicValue(Kind kind, Origin origin, Width width, Value* expectedValue, Value* newValue, Value* pointer, int32_t offset, HeapRange range, HeapRange fenceRange)
+    : MemoryValue(CheckedOpcode, kind, kind.opcode() == AtomicWeakCAS ? Int32 : expectedValue->type(), origin, offset, range, fenceRange, expectedValue, newValue, pointer)
+    , m_width(width)
+{
+    ASSERT(bestType(GP, accessWidth()) == accessType());
+
+    switch (kind.opcode()) {
+    case AtomicWeakCAS:
+    case AtomicStrongCAS:
+        break;
+    default:
+        ASSERT_NOT_REACHED();
+    }
+}
+
+} } // namespace JSC::B3
+
+#endif // ENABLE(B3_JIT)
+
diff --git a/Source/JavaScriptCore/b3/B3AtomicValue.h b/Source/JavaScriptCore/b3/B3AtomicValue.h
new file mode 100644 (file)
index 0000000..d6bcef0
--- /dev/null
@@ -0,0 +1,66 @@
+/*
+ * Copyright (C) 2017 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#pragma once
+
+#if ENABLE(B3_JIT)
+
+#include "B3MemoryValue.h"
+#include "B3Width.h"
+
+namespace JSC { namespace B3 {
+
+class JS_EXPORT_PRIVATE AtomicValue : public MemoryValue {
+public:
+    static bool accepts(Kind kind)
+    {
+        return isAtomic(kind.opcode());
+    }
+    
+    ~AtomicValue();
+    
+    Type accessType() const { return child(0)->type(); }
+    
+    Width accessWidth() const { return m_width; }
+    
+protected:
+    void dumpMeta(CommaPrinter&, PrintStream&) const override;
+    
+    Value* cloneImpl() const override;
+    
+private:
+    friend class Procedure;
+    
+    AtomicValue(Kind, Origin, Width, Value* operand, Value* pointer, int32_t offset = 0, HeapRange range = HeapRange::top(), HeapRange fenceRange = HeapRange::top());
+    
+    AtomicValue(Kind, Origin, Width, Value* expectedValue, Value* newValue, Value* pointer, int32_t offset = 0, HeapRange range = HeapRange::top(), HeapRange fenceRange = HeapRange::top());
+    
+    Width m_width;
+};
+
+} } // namespace JSC::B3
+
+#endif // ENABLE(B3_JIT)
+
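For orientation, here is a hedged sketch (not part of this patch) of how a client could construct one of these nodes through the usual B3 builder path; `proc`, `block`, and `ptr` are placeholder names, and the argument order follows the private constructors declared above.

    // Sketch only: atomically add 1 to the 32-bit cell at 'ptr' and yield the old value.
    #include "B3AtomicValue.h"
    #include "B3BasicBlockInlines.h"
    #include "B3Const32Value.h"

    using namespace JSC::B3;

    static Value* emitAtomicIncrement(Procedure& proc, BasicBlock* block, Value* ptr)
    {
        Value* one = block->appendNew<Const32Value>(proc, Origin(), 1);
        // AtomicXchgAdd returns the value the memory location held before the add.
        return block->appendNew<AtomicValue>(
            proc, AtomicXchgAdd, Origin(), Width32, one, ptr);
    }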
index 11f4668..08cd4e9 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -40,6 +40,7 @@ class BlockInsertionSet;
 class InsertionSet;
 class Procedure;
 class Value;
+template<typename> class GenericBlockInsertionSet;
 
 class BasicBlock {
     WTF_MAKE_NONCOPYABLE(BasicBlock);
@@ -158,6 +159,7 @@ private:
     friend class BlockInsertionSet;
     friend class InsertionSet;
     friend class Procedure;
+    template<typename> friend class GenericBlockInsertionSet;
     
     // Instantiate via Procedure.
     BasicBlock(unsigned index, double frequency);
index 76a1668..b00b454 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
 namespace JSC { namespace B3 {
 
 BlockInsertionSet::BlockInsertionSet(Procedure &proc)
-    : m_proc(proc)
+    : GenericBlockInsertionSet(proc.m_blocks)
+    , m_proc(proc)
 {
 }
 
 BlockInsertionSet::~BlockInsertionSet() { }
 
-void BlockInsertionSet::insert(BlockInsertion&& insertion)
-{
-    m_insertions.append(WTFMove(insertion));
-}
-
-BasicBlock* BlockInsertionSet::insert(unsigned index, double frequency)
-{
-    std::unique_ptr<BasicBlock> block(new BasicBlock(UINT_MAX, frequency));
-    BasicBlock* result = block.get();
-    insert(BlockInsertion(index, WTFMove(block)));
-    return result;
-}
-
-BasicBlock* BlockInsertionSet::insertBefore(BasicBlock* before, double frequency)
-{
-    return insert(before->index(), frequency == frequency ? frequency : before->frequency());
-}
-
-BasicBlock* BlockInsertionSet::insertAfter(BasicBlock* after, double frequency)
-{
-    return insert(after->index() + 1, frequency == frequency ? frequency : after->frequency());
-}
-
 BasicBlock* BlockInsertionSet::splitForward(
     BasicBlock* block, unsigned& valueIndex, InsertionSet* insertionSet, double frequency)
 {
@@ -102,32 +80,6 @@ BasicBlock* BlockInsertionSet::splitForward(
     return result;
 }
 
-bool BlockInsertionSet::execute()
-{
-    if (m_insertions.isEmpty())
-        return false;
-    
-    // We allow insertions to be given to us in any order. So, we need to sort them before
-    // running WTF::executeInsertions. We strongly prefer a stable sort and we want it to be
-    // fast, so we use bubble sort.
-    bubbleSort(m_insertions.begin(), m_insertions.end());
-
-    executeInsertions(m_proc.m_blocks, m_insertions);
-    
-    // Prune out empty entries. This isn't strictly necessary but it's
-    // healthy to keep the block list from growing.
-    m_proc.m_blocks.removeAllMatching(
-        [&] (std::unique_ptr<BasicBlock>& blockPtr) -> bool {
-            return !blockPtr;
-        });
-    
-    // Make sure that the blocks know their new indices.
-    for (unsigned i = 0; i < m_proc.m_blocks.size(); ++i)
-        m_proc.m_blocks[i]->m_index = i;
-    
-    return true;
-}
-
 } } // namespace JSC::B3
 
 #endif // ENABLE(B3_JIT)
index b316f64..eb92868 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -27,6 +27,7 @@
 
 #if ENABLE(B3_JIT)
 
+#include "B3GenericBlockInsertionSet.h"
 #include "B3Procedure.h"
 #include <wtf/Insertion.h>
 #include <wtf/Vector.h>
@@ -35,26 +36,13 @@ namespace JSC { namespace B3 {
 
 class InsertionSet;
 
-typedef WTF::Insertion<std::unique_ptr<BasicBlock>> BlockInsertion;
+typedef GenericBlockInsertionSet<BasicBlock>::BlockInsertion BlockInsertion;
 
-class BlockInsertionSet {
+class BlockInsertionSet : public GenericBlockInsertionSet<BasicBlock> {
 public:
     BlockInsertionSet(Procedure&);
     ~BlockInsertionSet();
     
-    void insert(BlockInsertion&&);
-
-    // Insert a new block at a given index.
-    BasicBlock* insert(unsigned index, double frequency = PNaN);
-
-    // Inserts a new block before the given block. Usually you will not pass the frequency
-    // argument. Passing PNaN causes us to just use the frequency of the 'before' block. That's
-    // usually what you want.
-    BasicBlock* insertBefore(BasicBlock* before, double frequency = PNaN);
-
-    // Inserts a new block after the given block.
-    BasicBlock* insertAfter(BasicBlock* after, double frequency = PNaN);
-
     // A helper to split a block when forward iterating over it. It creates a new block to hold
     // everything before the instruction at valueIndex. The current block is left with
     // everything at and after valueIndex. If the optional InsertionSet is provided, it will get
@@ -80,12 +68,9 @@ public:
     BasicBlock* splitForward(
         BasicBlock*, unsigned& valueIndex, InsertionSet* = nullptr,
         double frequency = PNaN);
-    
-    bool execute();
 
 private:
     Procedure& m_proc;
-    Vector<BlockInsertion, 8> m_insertions;
 };
 
 } } // namespace JSC::B3
index aeda46f..6c8684a 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -82,7 +82,8 @@ bool Effects::interferes(const Effects& other) const
         || interferesWithWritesPinned(other, *this)
         || writes.overlaps(other.writes)
         || writes.overlaps(other.reads)
-        || reads.overlaps(other.writes);
+        || reads.overlaps(other.writes)
+        || (fence && other.fence);
 }
 
 bool Effects::operator==(const Effects& other) const
@@ -95,7 +96,8 @@ bool Effects::operator==(const Effects& other) const
         && writesPinned == other.writesPinned
         && readsPinned == other.readsPinned
         && writes == other.writes
-        && reads == other.reads;
+        && reads == other.reads
+        && fence == other.fence;
 }
 
 bool Effects::operator!=(const Effects& other) const
@@ -120,6 +122,8 @@ void Effects::dump(PrintStream& out) const
         out.print(comma, "WritesPinned");
     if (readsPinned)
         out.print(comma, "ReadsPinned");
+    if (fence)
+        out.print(comma, "Fence");
     if (writes)
         out.print(comma, "Writes:", writes);
     if (reads)
index 7a08853..b2e3fd2 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -71,6 +71,10 @@ struct Effects {
     // https://bugs.webkit.org/show_bug.cgi?id=163173
     bool readsPinned { false };
     bool writesPinned { false };
+    
+    // Memory fences cannot be reordered around each other regardless of their effects. This is flagged
+    // if the operation is a memory fence.
+    bool fence { false };
 
     HeapRange writes;
     HeapRange reads;
@@ -89,6 +93,7 @@ struct Effects {
         result.reads = HeapRange::top();
         result.readsPinned = true;
         result.writesPinned = true;
+        result.fence = true;
         return result;
     }
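As a hedged illustration (not part of the patch) of what the new flag means: two effect sets that touch no abstract heap at all still interfere once both carry the fence bit, which is what keeps fences ordered with respect to one another.

    // Sketch only: the fence bit alone is enough to create interference.
    #include "B3Effects.h"

    using JSC::B3::Effects;

    static bool fencesInterfere()
    {
        Effects a = Effects::none();
        a.fence = true;             // e.g. the phantom fence of an acquire load
        Effects b = Effects::none();
        b.fence = true;             // e.g. the phantom fence of a release store
        // Neither reads nor writes anything, yet they interfere, so the
        // optimizer will not reorder them around each other.
        return a.interferes(b);     // true
    }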
 
@@ -103,7 +108,7 @@ struct Effects {
 
     bool mustExecute() const
     {
-        return terminal || exitsSideways || writesLocalState || writes || writesPinned;
+        return terminal || exitsSideways || writesLocalState || writes || writesPinned || fence;
     }
 
     // Returns true if reordering instructions with these respective effects would change program
index feaacda..8c519a8 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2016-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -142,6 +142,7 @@ struct ImpureBlockData {
 
     RangeSet<HeapRange> reads; // This only gets used for forward store elimination.
     RangeSet<HeapRange> writes; // This gets used for both load and store elimination.
+    bool fence;
 
     MemoryValueMap storesAtHead;
     MemoryValueMap memoryValuesAtTail;
@@ -174,12 +175,14 @@ public:
                 
                 if (memory && memory->isStore()
                     && !data.reads.overlaps(memory->range())
-                    && !data.writes.overlaps(memory->range()))
+                    && !data.writes.overlaps(memory->range())
+                    && (!data.fence || !memory->hasFence()))
                     data.storesAtHead.add(memory);
                 data.reads.add(effects.reads);
 
                 if (HeapRange writes = effects.writes)
                     clobber(data, writes);
+                data.fence |= effects.fence;
 
                 if (memory)
                     data.memoryValuesAtTail.add(memory);
@@ -445,8 +448,6 @@ private:
         }
 
         default:
-            dataLog("Bad memory value: ", deepDump(m_proc, m_value), "\n");
-            RELEASE_ASSERT_NOT_REACHED();
             break;
         }
     }
@@ -478,6 +479,9 @@ private:
     template<typename Filter>
     bool findStoreAfterClobber(Value* ptr, HeapRange range, const Filter& filter)
     {
+        if (m_value->as<MemoryValue>()->hasFence())
+            return false;
+        
         // We can eliminate a store if every forward path hits a store to the same location before
         // hitting any operation that observes the store. This search seems like it should be
         // expensive, but in the overwhelming majority of cases it will almost immediately hit an 
@@ -614,6 +618,12 @@ private:
         if (verbose)
             dataLog(*m_value, ": looking backward for ", *ptr, "...\n");
         
+        if (m_value->as<MemoryValue>()->hasFence()) {
+            if (verbose)
+                dataLog("    Giving up because fences.\n");
+            return { };
+        }
+        
         if (MemoryValue* match = m_data.memoryValuesAtTail.find(ptr, filter)) {
             if (verbose)
                 dataLog("    Found ", *match, " locally.\n");
index e328c6a..d141be7 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -43,6 +43,7 @@
 #include "B3LowerToAir.h"
 #include "B3MoveConstants.h"
 #include "B3Procedure.h"
+#include "B3PureCSE.h"
 #include "B3ReduceDoubleToFloat.h"
 #include "B3ReduceStrength.h"
 #include "B3TimingScope.h"
@@ -92,6 +93,7 @@ void generateToAir(Procedure& procedure, unsigned optLevel)
         // https://bugs.webkit.org/show_bug.cgi?id=150507
     }
 
+    // This puts the IR in quirks mode.
     lowerMacros(procedure);
 
     if (optLevel >= 1) {
diff --git a/Source/JavaScriptCore/b3/B3GenericBlockInsertionSet.h b/Source/JavaScriptCore/b3/B3GenericBlockInsertionSet.h
new file mode 100644 (file)
index 0000000..f20fafd
--- /dev/null
@@ -0,0 +1,111 @@
+/*
+ * Copyright (C) 2017 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+#pragma once
+
+#if ENABLE(B3_JIT)
+
+#include "PureNaN.h"
+#include <climits>
+#include <wtf/BubbleSort.h>
+#include <wtf/Insertion.h>
+#include <wtf/Vector.h>
+
+namespace JSC { namespace B3 {
+
+class InsertionSet;
+
+template<typename BasicBlock>
+class GenericBlockInsertionSet {
+public:
+    typedef WTF::Insertion<std::unique_ptr<BasicBlock>> BlockInsertion;
+    
+    GenericBlockInsertionSet(Vector<std::unique_ptr<BasicBlock>>& blocks)
+        : m_blocks(blocks)
+    {
+    }
+    
+    void insert(BlockInsertion&& insertion)
+    {
+        m_insertions.append(WTFMove(insertion));
+    }
+
+    // Insert a new block at a given index.
+    BasicBlock* insert(unsigned index, double frequency = PNaN)
+    {
+        std::unique_ptr<BasicBlock> block(new BasicBlock(UINT_MAX, frequency));
+        BasicBlock* result = block.get();
+        insert(BlockInsertion(index, WTFMove(block)));
+        return result;
+    }
+
+    // Inserts a new block before the given block. Usually you will not pass the frequency
+    // argument. Passing PNaN causes us to just use the frequency of the 'before' block. That's
+    // usually what you want.
+    BasicBlock* insertBefore(BasicBlock* before, double frequency = PNaN)
+    {
+        return insert(before->index(), frequency == frequency ? frequency : before->frequency());
+    }
+
+    // Inserts a new block after the given block.
+    BasicBlock* insertAfter(BasicBlock* after, double frequency = PNaN)
+    {
+        return insert(after->index() + 1, frequency == frequency ? frequency : after->frequency());
+    }
+
+    bool execute()
+    {
+        if (m_insertions.isEmpty())
+            return false;
+        
+        // We allow insertions to be given to us in any order. So, we need to sort them before
+        // running WTF::executeInsertions. We strongly prefer a stable sort and we want it to be
+        // fast, so we use bubble sort.
+        bubbleSort(m_insertions.begin(), m_insertions.end());
+        
+        executeInsertions(m_blocks, m_insertions);
+        
+        // Prune out empty entries. This isn't strictly necessary but it's
+        // healthy to keep the block list from growing.
+        m_blocks.removeAllMatching(
+            [&] (std::unique_ptr<BasicBlock>& blockPtr) -> bool {
+                return !blockPtr;
+            });
+        
+        // Make sure that the blocks know their new indices.
+        for (unsigned i = 0; i < m_blocks.size(); ++i)
+            m_blocks[i]->m_index = i;
+        
+        return true;
+    }
+
+private:
+    Vector<std::unique_ptr<BasicBlock>>& m_blocks;
+    Vector<BlockInsertion, 8> m_insertions;
+};
+
+} } // namespace JSC::B3
+
+#endif // ENABLE(B3_JIT)
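A hedged usage sketch (not in the patch): B3's BlockInsertionSet now simply inherits this template, so a phase keeps the familiar workflow of queuing blocks and committing them in one pass. Names like `proc` and `target` are placeholders.

    // Sketch only: queue a new block in front of 'target', then splice it in.
    #include "B3BasicBlockInlines.h"
    #include "B3BlockInsertionSet.h"
    #include "B3ProcedureInlines.h"

    using namespace JSC::B3;

    static void addPrologueBlock(Procedure& proc, BasicBlock* target)
    {
        BlockInsertionSet insertionSet(proc);

        // Queued but not yet spliced in; with no frequency argument it
        // inherits 'target's frequency.
        BasicBlock* prologue = insertionSet.insertBefore(target);
        prologue->appendNew<Value>(proc, Jump, Origin());
        prologue->setSuccessors(FrequentedBlock(target));

        // Sorts the queued insertions, splices them into the block list,
        // and renumbers every block.
        insertionSet.execute();
    }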
index 03866bd..3c15a6e 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -88,6 +88,13 @@ public:
         return !(*this == other);
     }
     
+    HeapRange operator|(const HeapRange& other) const
+    {
+        return HeapRange(
+            std::min(m_begin, other.m_begin),
+            std::max(m_end, other.m_end));
+    }
+    
     explicit operator bool() const { return m_begin != m_end; }
 
     unsigned begin() const { return m_begin; }
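A hedged note on the new operator (not in the patch): since a HeapRange is a single contiguous interval, the union is conservative and swallows any gap between its operands.

    // Fragment, for illustration only.
    HeapRange a(0, 4);
    HeapRange b(8, 16);
    HeapRange c = a | b;   // begin() == 0, end() == 16; the gap [4, 8) is included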
index a6e119f..f583c20 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -58,6 +58,11 @@ Value* InsertionSet::insertBottom(size_t index, Value* likeValue)
     return insertBottom(index, likeValue->origin(), likeValue->type());
 }
 
+Value* InsertionSet::insertClone(size_t index, Value* value)
+{
+    return insertValue(index, m_procedure.clone(value));
+}
+
 void InsertionSet::execute(BasicBlock* block)
 {
     bubbleSort(m_insertions.begin(), m_insertions.end());
index 1eb5272..08bf481 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -71,6 +71,8 @@ public:
 
     Value* insertBottom(size_t index, Origin, Type);
     Value* insertBottom(size_t index, Value*);
+    
+    Value* insertClone(size_t index, Value*);
 
     void execute(BasicBlock*);
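A hedged sketch of the intended use (not in the patch): insertClone() lets a phase keep the original value where it is while consuming a copy of it at the same index; lowerMacros uses exactly this shape later in this patch to widen sub-width atomics.

    // Fragment, for illustration only: 'insertionSet', 'index', 'block', and
    // 'original' are assumed to come from the surrounding phase.
    Value* copy = insertionSet.insertClone(index, original);
    Value* widened = insertionSet.insert<Value>(index, SExt16, original->origin(), copy);
    original->replaceWithIdentity(widened);
    insertionSet.execute(block);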
 
index d5c2936..e85cd22 100644 (file)
@@ -28,9 +28,8 @@
 
 #if ENABLE(B3_JIT)
 
-#include "AirArg.h"
 #include "B3InsertionSetInlines.h"
-#include "B3MemoryValue.h"
+#include "B3MemoryValueInlines.h"
 #include "B3PhaseScope.h"
 #include "B3ProcedureInlines.h"
 #include "B3ValueInlines.h"
@@ -49,20 +48,21 @@ public:
 
     void run()
     {
-        if (!isARM64())
-            return;
-
+        // FIXME: Perhaps this should be moved to lowerMacros, and quirks mode can impose the requirement
+        // that the offset is legal. But for now this is sort of OK because we run pureCSE after. Also,
+        // we should probably have something better than just pureCSE to clean up the code that this
+        // introduces.
+        // https://bugs.webkit.org/show_bug.cgi?id=169246
+        
         for (BasicBlock* block : m_proc) {
             for (unsigned index = 0; index < block->size(); ++index) {
                 MemoryValue* memoryValue = block->at(index)->as<MemoryValue>();
                 if (!memoryValue)
                     continue;
-
-                int32_t offset = memoryValue->offset();
-                Width width = memoryValue->accessWidth();
-                if (!Air::Arg::isValidAddrForm(offset, width)) {
+                
+                if (!memoryValue->isLegalOffset(memoryValue->offset())) {
                     Value* base = memoryValue->lastChild();
-                    Value* offsetValue = m_insertionSet.insertIntConstant(index, memoryValue->origin(), pointerType(), offset);
+                    Value* offsetValue = m_insertionSet.insertIntConstant(index, memoryValue->origin(), pointerType(), memoryValue->offset());
                     Value* resolvedAddress = m_proc.add<Value>(Add, memoryValue->origin(), base, offsetValue);
                     m_insertionSet.insertValue(index, resolvedAddress);
 
index 6841510..b1a49ef 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
 #if ENABLE(B3_JIT)
 
 #include "AllowMacroScratchRegisterUsage.h"
+#include "B3AtomicValue.h"
 #include "B3BasicBlockInlines.h"
 #include "B3BlockInsertionSet.h"
 #include "B3CCallValue.h"
 #include "B3CaseCollectionInlines.h"
 #include "B3ConstPtrValue.h"
+#include "B3FenceValue.h"
 #include "B3InsertionSetInlines.h"
-#include "B3MemoryValue.h"
+#include "B3MemoryValueInlines.h"
 #include "B3PatchpointValue.h"
 #include "B3PhaseScope.h"
 #include "B3ProcedureInlines.h"
 #include "B3StackmapGenerationParams.h"
 #include "B3SwitchValue.h"
 #include "B3UpsilonValue.h"
+#include "B3UseCounts.h"
 #include "B3ValueInlines.h"
 #include "CCallHelpers.h"
 #include "LinkBuffer.h"
@@ -58,11 +61,14 @@ public:
         : m_proc(proc)
         , m_blockInsertionSet(proc)
         , m_insertionSet(proc)
+        , m_useCounts(proc)
     {
     }
 
     bool run()
     {
+        RELEASE_ASSERT(!m_proc.hasQuirks());
+        
         for (BasicBlock* block : m_proc) {
             m_block = block;
             processCurrentBlock();
@@ -72,6 +78,10 @@ public:
             m_proc.resetReachability();
             m_proc.invalidateCFG();
         }
+        
+        // This indicates that we've put the IR into quirks mode.
+        m_proc.setHasQuirks(true);
+        
         return m_changed;
     }
     
@@ -185,7 +195,127 @@ private:
                 m_changed = true;
                 break;
             }
+                
+            case Depend: {
+                if (isX86()) {
+                    // Create a load-load fence. This codegens to nothing on X86. We use it to tell the
+                    // compiler not to block load motion.
+                    FenceValue* fence = m_insertionSet.insert<FenceValue>(m_index, m_origin);
+                    fence->read = HeapRange();
+                    fence->write = HeapRange::top();
+                    
+                    // Kill the Depend, which should unlock a bunch of code simplification.
+                    m_value->replaceWithBottom(m_insertionSet, m_index);
+                    
+                    m_changed = true;
+                }
+                break;
+            }
 
+            case AtomicWeakCAS:
+            case AtomicStrongCAS: {
+                AtomicValue* atomic = m_value->as<AtomicValue>();
+                Width width = atomic->accessWidth();
+                
+                if (isCanonicalWidth(width))
+                    break;
+                
+                Value* expectedValue = atomic->child(0);
+                
+                if (!isX86()) {
+                    // On ARM, the load part of the CAS does a load with zero extension. Therefore, we need
+                    // to zero-extend the input.
+                    Value* maskedExpectedValue = m_insertionSet.insert<Value>(
+                        m_index, BitAnd, m_origin, expectedValue,
+                        m_insertionSet.insertIntConstant(m_index, expectedValue, mask(width)));
+                    
+                    atomic->child(0) = maskedExpectedValue;
+                }
+                
+                if (atomic->opcode() == AtomicStrongCAS) {
+                    Value* newValue = m_insertionSet.insert<Value>(
+                        m_index, signExtendOpcode(width), m_origin,
+                        m_insertionSet.insertClone(m_index, atomic));
+                    
+                    atomic->replaceWithIdentity(newValue);
+                }
+                
+                m_changed = true;
+                break;
+            }
+                
+            case AtomicXchgAdd:
+            case AtomicXchgAnd:
+            case AtomicXchgOr:
+            case AtomicXchgSub:
+            case AtomicXchgXor:
+            case AtomicXchg: {
+                // On X86, these may actually return garbage in the high bits. On ARM64, these sorta
+                // zero-extend their high bits, except that the high bits might get polluted by high
+                // bits in the operand. So, either way, we need to throw a sign-extend on these
+                // things.
+                
+                if (isX86()) {
+                    if (m_value->opcode() == AtomicXchgSub && m_useCounts.numUses(m_value)) {
+                        // On x86, xchgadd is better than xchgsub if it has any users.
+                        m_value->setOpcodeUnsafely(AtomicXchgAdd);
+                        m_value->child(0) = m_insertionSet.insert<Value>(
+                            m_index, Neg, m_origin, m_value->child(0));
+                    }
+                    
+                    bool exempt = false;
+                    switch (m_value->opcode()) {
+                    case AtomicXchgAnd:
+                    case AtomicXchgOr:
+                    case AtomicXchgSub:
+                    case AtomicXchgXor:
+                        exempt = true;
+                        break;
+                    default:
+                        break;
+                    }
+                    if (exempt)
+                        break;
+                }
+                
+                AtomicValue* atomic = m_value->as<AtomicValue>();
+                Width width = atomic->accessWidth();
+                
+                if (isCanonicalWidth(width))
+                    break;
+                
+                Value* newValue = m_insertionSet.insert<Value>(
+                    m_index, signExtendOpcode(width), m_origin,
+                    m_insertionSet.insertClone(m_index, atomic));
+                
+                atomic->replaceWithIdentity(newValue);
+                m_changed = true;
+                break;
+            }
+                
+            case Load8Z:
+            case Load16Z: {
+                if (isX86())
+                    break;
+                
+                MemoryValue* memory = m_value->as<MemoryValue>();
+                if (!memory->hasFence())
+                    break;
+                
+                // Sub-width load-acq on ARM64 always sign extends.
+                Value* newLoad = m_insertionSet.insertClone(m_index, memory);
+                newLoad->setOpcodeUnsafely(memory->opcode() == Load8Z ? Load8S : Load16S);
+                
+                Value* newValue = m_insertionSet.insert<Value>(
+                    m_index, BitAnd, m_origin, newLoad,
+                    m_insertionSet.insertIntConstant(
+                        m_index, m_origin, Int32, mask(memory->accessWidth())));
+
+                m_value->replaceWithIdentity(newValue);
+                m_changed = true;
+                break;
+            }
+                
             default:
                 break;
             }
@@ -470,6 +600,7 @@ private:
     Procedure& m_proc;
     BlockInsertionSet m_blockInsertionSet;
     InsertionSet m_insertionSet;
+    UseCounts m_useCounts;
     BasicBlock* m_block;
     unsigned m_index;
     Value* m_value;
@@ -477,21 +608,12 @@ private:
     bool m_changed { false };
 };
 
-bool lowerMacrosImpl(Procedure& proc)
-{
-    LowerMacros lowerMacros(proc);
-    return lowerMacros.run();
-}
-
 } // anonymous namespace
 
 bool lowerMacros(Procedure& proc)
 {
-    PhaseScope phaseScope(proc, "lowerMacros");
-    bool result = lowerMacrosImpl(proc);
-    if (shouldValidateIR())
-        RELEASE_ASSERT(!lowerMacrosImpl(proc));
-    return result;
+    LowerMacros lowerMacros(proc);
+    return lowerMacros.run();
 }
 
 } } // namespace JSC::B3
index dbe158b..963c10e 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -156,15 +156,14 @@ private:
             case RotL: {
                 // ARM64 doesn't have a rotate left.
                 if (isARM64()) {
-                    if (isARM64()) {
-                        Value* newShift = m_insertionSet.insert<Value>(m_index, Neg, m_value->origin(), m_value->child(1));
-                        Value* rotate = m_insertionSet.insert<Value>(m_index, RotR, m_value->origin(), m_value->child(0), newShift);
-                        m_value->replaceWithIdentity(rotate);
-                        break;
-                    }
+                    Value* newShift = m_insertionSet.insert<Value>(m_index, Neg, m_value->origin(), m_value->child(1));
+                    Value* rotate = m_insertionSet.insert<Value>(m_index, RotR, m_value->origin(), m_value->child(0), newShift);
+                    m_value->replaceWithIdentity(rotate);
+                    break;
                 }
                 break;
             }
+                
             default:
                 break;
             }
index 5cbcc97..969680b 100644 (file)
 
 #if ENABLE(B3_JIT)
 
+#include "AirBlockInsertionSet.h"
 #include "AirCCallSpecial.h"
 #include "AirCode.h"
 #include "AirInsertionSet.h"
 #include "AirInstInlines.h"
 #include "AirStackSlot.h"
 #include "B3ArgumentRegValue.h"
+#include "B3AtomicValue.h"
 #include "B3BasicBlockInlines.h"
 #include "B3BlockWorklist.h"
 #include "B3CCallValue.h"
@@ -41,7 +43,7 @@
 #include "B3Commutativity.h"
 #include "B3Dominators.h"
 #include "B3FenceValue.h"
-#include "B3MemoryValue.h"
+#include "B3MemoryValueInlines.h"
 #include "B3PatchpointSpecial.h"
 #include "B3PatchpointValue.h"
 #include "B3PhaseScope.h"
@@ -72,6 +74,16 @@ namespace {
 
 const bool verbose = false;
 
+// FIXME: We wouldn't need this if Air supported Width modifiers in Air::Kind.
+// https://bugs.webkit.org/show_bug.cgi?id=169247
+#define OPCODE_FOR_WIDTH(opcode, width) ( \
+    (width) == Width8 ? opcode ## 8 : \
+    (width) == Width16 ? opcode ## 16 : \
+    (width) == Width32 ? opcode ## 32 : \
+    opcode ## 64)
+#define OPCODE_FOR_CANONICAL_WIDTH(opcode, width) ( \
+    (width) == Width64 ? opcode ## 64 : opcode ## 32)
+
 class LowerToAir {
 public:
     LowerToAir(Procedure& procedure)
@@ -83,6 +95,12 @@ public:
         , m_dominators(procedure.dominators())
         , m_procedure(procedure)
         , m_code(procedure.code())
+        , m_blockInsertionSet(m_code)
+#if CPU(X86) || CPU(X86_64)
+        , m_eax(X86Registers::eax)
+        , m_ecx(X86Registers::ecx)
+        , m_edx(X86Registers::edx)
+#endif
     {
     }
 
@@ -124,14 +142,18 @@ public:
         // hoisted address expression before we duplicate it back into the loop.
         for (B3::BasicBlock* block : m_procedure.blocksInPreOrder()) {
             m_block = block;
-            // Reset some state.
-            m_insts.resize(0);
 
             m_isRare = !m_fastWorklist.saw(block);
 
             if (verbose)
                 dataLog("Lowering Block ", *block, ":\n");
             
+            // Make sure that the successors are set up correctly.
+            for (B3::FrequentedBlock successor : block->successors()) {
+                m_blockToBlock[block]->successors().append(
+                    Air::FrequentedBlock(m_blockToBlock[successor.block()], successor.frequency()));
+            }
+
             // Process blocks in reverse order so we see uses before defs. That's what allows us
             // to match patterns effectively.
             for (unsigned i = block->size(); i--;) {
@@ -149,19 +171,10 @@ public:
                 }
             }
 
-            // Now append the instructions. m_insts contains them in reverse order, so we process
-            // it in reverse.
-            for (unsigned i = m_insts.size(); i--;) {
-                for (Inst& inst : m_insts[i])
-                    m_blockToBlock[block]->appendInst(WTFMove(inst));
-            }
-
-            // Make sure that the successors are set up correctly.
-            for (B3::FrequentedBlock successor : block->successors()) {
-                m_blockToBlock[block]->successors().append(
-                    Air::FrequentedBlock(m_blockToBlock[successor.block()], successor.frequency()));
-            }
+            finishAppendingInstructions(m_blockToBlock[block]);
         }
+        
+        m_blockInsertionSet.execute();
 
         Air::InsertionSet insertionSet(m_code);
         for (Inst& inst : m_prologue)
@@ -448,12 +461,12 @@ private:
             return std::nullopt;
         return scale;
     }
-
+    
     // This turns the given operand into an address.
     Arg effectiveAddr(Value* address, int32_t offset, Width width)
     {
         ASSERT(Arg::isValidAddrForm(offset, width));
-
+        
         auto fallback = [&] () -> Arg {
             return Arg::addr(tmp(address), offset);
         };
@@ -535,12 +548,15 @@ private:
         MemoryValue* value = memoryValue->as<MemoryValue>();
         if (!value)
             return Arg();
+        
+        if (value->requiresSimpleAddr())
+            return Arg::simpleAddr(tmp(value->lastChild()));
 
         int32_t offset = value->offset();
         Width width = value->accessWidth();
 
         Arg result = effectiveAddr(value->lastChild(), offset, width);
-        ASSERT(result.isValidForm(width));
+        RELEASE_ASSERT(result.isValidForm(width));
 
         return result;
     }
@@ -561,11 +577,19 @@ private:
     
     ArgPromise loadPromiseAnyOpcode(Value* loadValue)
     {
+        RELEASE_ASSERT(loadValue->as<MemoryValue>());
         if (!canBeInternal(loadValue))
             return Arg();
         if (crossesInterference(loadValue))
             return Arg();
-        ArgPromise result(addr(loadValue), loadValue);
+        // On x86, all loads have fences. Doing this kind of instruction selection will move the load,
+        // but that's fine because our interference analysis stops the motion of fences around other
+        // fences. So, any load motion we introduce here would not be observable.
+        if (!isX86() && loadValue->as<MemoryValue>()->hasFence())
+            return Arg();
+        Arg loadAddr = addr(loadValue);
+        RELEASE_ASSERT(loadAddr);
+        ArgPromise result(loadAddr, loadValue);
         if (loadValue->traps())
             result.setTraps(true);
         return result;
@@ -738,7 +762,7 @@ private:
 
         return true;
     }
-
+    
     template<Air::Opcode opcode32, Air::Opcode opcode64, Air::Opcode opcodeDouble, Air::Opcode opcodeFloat, Commutativity commutativity = NotCommutative>
     void appendBinOp(Value* left, Value* right)
     {
@@ -897,11 +921,9 @@ private:
             return;
         }
 
-#if CPU(X86) || CPU(X86_64)
         append(Move, tmp(value), tmp(m_value));
-        append(Move, tmp(amount), Tmp(X86Registers::ecx));
-        append(opcode, Tmp(X86Registers::ecx), tmp(m_value));
-#endif
+        append(Move, tmp(amount), m_ecx);
+        append(opcode, m_ecx, tmp(m_value));
     }
 
     template<Air::Opcode opcode32, Air::Opcode opcode64>
@@ -930,10 +952,16 @@ private:
         Air::Opcode opcode32, Air::Opcode opcode64, Commutativity commutativity = NotCommutative>
     bool tryAppendStoreBinOp(Value* left, Value* right)
     {
+        RELEASE_ASSERT(m_value->as<MemoryValue>());
+        
         Air::Opcode opcode = tryOpcodeForType(opcode32, opcode64, left->type());
         if (opcode == Air::Oops)
             return false;
         
+        // On x86, all stores have fences, and this isn't reordering the store itself.
+        if (!isX86() && m_value->as<MemoryValue>()->hasFence())
+            return false;
+        
         Arg storeAddr = addr(m_value);
         ASSERT(storeAddr);
 
@@ -993,11 +1021,50 @@ private:
 
         return Inst(move, m_value, tmp(value), dest);
     }
+    
+    Air::Opcode storeOpcode(Width width, Bank bank, bool release)
+    {
+        switch (width) {
+        case Width8:
+            RELEASE_ASSERT(bank == GP);
+            return release ? StoreRel8 : Air::Store8;
+        case Width16:
+            RELEASE_ASSERT(bank == GP);
+            return release ? StoreRel16 : Air::Store16;
+        case Width32:
+            switch (bank) {
+            case GP:
+                return release ? StoreRel32 : Move32;
+            case FP:
+                RELEASE_ASSERT(!release);
+                return MoveFloat;
+            }
+            break;
+        case Width64:
+            RELEASE_ASSERT(is64Bit());
+            switch (bank) {
+            case GP:
+                return release ? StoreRel64 : Move;
+            case FP:
+                RELEASE_ASSERT(!release);
+                return MoveDouble;
+            }
+            break;
+        }
+        RELEASE_ASSERT_NOT_REACHED();
+    }
+    
+    Air::Opcode storeOpcode(Value* value)
+    {
+        MemoryValue* memory = value->as<MemoryValue>();
+        RELEASE_ASSERT(memory->isStore());
+        return storeOpcode(memory->accessWidth(), memory->accessBank(), memory->hasFence());
+    }
 
     Inst createStore(Value* value, const Arg& dest)
     {
-        Air::Opcode moveOpcode = moveForType(value->type());
-        return createStore(moveOpcode, value, dest);
+        Air::Opcode moveOpcode = storeOpcode(value);
+        return createStore(moveOpcode, value->child(0), dest);
     }
 
     template<typename... Args>
@@ -1005,7 +1072,7 @@ private:
     {
         append(trappingInst(m_value, createStore(std::forward<Args>(args)...)));
     }
-
+    
     Air::Opcode moveForType(Type type)
     {
         switch (type) {
@@ -1072,7 +1139,41 @@ private:
     {
         m_insts.last().append(inst);
     }
+    
+    void finishAppendingInstructions(Air::BasicBlock* target)
+    {
+        // Now append the instructions. m_insts contains them in reverse order, so we process
+        // it in reverse.
+        for (unsigned i = m_insts.size(); i--;) {
+            for (Inst& inst : m_insts[i])
+                target->appendInst(WTFMove(inst));
+        }
+        m_insts.resize(0);
+    }
+    
+    Air::BasicBlock* newBlock()
+    {
+        return m_blockInsertionSet.insertAfter(m_blockToBlock[m_block]);
+    }
 
+    // NOTE: This will create a continuation block (`nextBlock`) *after* any blocks you've created using
+    // newBlock(). So, it's preferable to create all of your blocks upfront using newBlock(). Also note
+    // that any code you emit before this will be prepended to the continuation, and any code you emit
+    // after this will be appended to the previous block.
+    void splitBlock(Air::BasicBlock*& previousBlock, Air::BasicBlock*& nextBlock)
+    {
+        Air::BasicBlock* block = m_blockToBlock[m_block];
+        
+        previousBlock = block;
+        nextBlock = m_blockInsertionSet.insertAfter(block);
+        
+        finishAppendingInstructions(nextBlock);
+        nextBlock->successors() = block->successors();
+        block->successors().clear();
+        
+        m_insts.append(Vector<Inst>());
+    }
+    
     template<typename T, typename... Arguments>
     T* ensureSpecial(T*& field, Arguments&&... arguments)
     {
@@ -1122,7 +1223,7 @@ private:
                 break;
             case ValueRep::StackArgument:
                 arg = Arg::callArg(value.rep().offsetFromSP());
-                appendStore(value.value(), arg);
+                appendStore(moveForType(value.value()->type()), value.value(), arg);
                 break;
             default:
                 RELEASE_ASSERT_NOT_REACHED();
@@ -1253,7 +1354,9 @@ private:
 
             // It's safe to share value, but since we're sharing, it means that we aren't locking it.
             // If we don't lock it, then fusing loads is off limits and all of value's children will
-            // have to go through the sharing path as well.
+            // have to go through the sharing path as well. Fusing loads is off limits because the load
+            // could already have been emitted elsewhere, so fusing it here would duplicate the load.
+            // We don't consider that to be a legal optimization.
             canCommitInternal = false;
             
             return Fuse;
@@ -1471,6 +1574,7 @@ private:
                     // FIXME: If this is unsigned then we can chop things off of the immediate.
                     // This might make the immediate more legal. Perhaps that's a job for
                     // strength reduction?
+                    // https://bugs.webkit.org/show_bug.cgi?id=169248
                     
                     if (rightImm) {
                         if (Inst result = tryTest(width, loadPromise(left, loadOpcode), rightImm)) {
@@ -1960,6 +2064,314 @@ private:
         return true;
     }
 
+    void appendX86Div(B3::Opcode op)
+    {
+        Air::Opcode convertToDoubleWord;
+        Air::Opcode div;
+        switch (m_value->type()) {
+        case Int32:
+            convertToDoubleWord = X86ConvertToDoubleWord32;
+            div = X86Div32;
+            break;
+        case Int64:
+            convertToDoubleWord = X86ConvertToQuadWord64;
+            div = X86Div64;
+            break;
+        default:
+            RELEASE_ASSERT_NOT_REACHED();
+            return;
+        }
+
+        ASSERT(op == Div || op == Mod);
+        Tmp result = op == Div ? m_eax : m_edx;
+
+        append(Move, tmp(m_value->child(0)), m_eax);
+        append(convertToDoubleWord, m_eax, m_edx);
+        append(div, m_eax, m_edx, tmp(m_value->child(1)));
+        append(Move, result, tmp(m_value));
+    }
+
+    void appendX86UDiv(B3::Opcode op)
+    {
+        Air::Opcode div = m_value->type() == Int32 ? X86UDiv32 : X86UDiv64;
+
+        ASSERT(op == UDiv || op == UMod);
+        Tmp result = op == UDiv ? m_eax : m_edx;
+
+        append(Move, tmp(m_value->child(0)), m_eax);
+        append(Xor64, m_edx, m_edx);
+        append(div, m_eax, m_edx, tmp(m_value->child(1)));
+        append(Move, result, tmp(m_value));
+    }
+    
+    Air::Opcode loadLinkOpcode(Width width, bool fence)
+    {
+        return fence ? OPCODE_FOR_WIDTH(LoadLinkAcq, width) : OPCODE_FOR_WIDTH(LoadLink, width);
+    }
+    
+    Air::Opcode storeCondOpcode(Width width, bool fence)
+    {
+        return fence ? OPCODE_FOR_WIDTH(StoreCondRel, width) : OPCODE_FOR_WIDTH(StoreCond, width);
+    }
+    
+    // This can emit code for the following patterns:
+    // AtomicWeakCAS
+    // BitXor(AtomicWeakCAS, 1)
+    // AtomicStrongCAS
+    // Equal(AtomicStrongCAS, expected)
+    // NotEqual(AtomicStrongCAS, expected)
+    // Branch(AtomicWeakCAS)
+    // Branch(Equal(AtomicStrongCAS, expected))
+    // Branch(NotEqual(AtomicStrongCAS, expected))
+    //
+    // It assumes that atomicValue points to the CAS, and m_value points to the instruction being
+    // generated. It assumes that you've consumed everything that needs to be consumed.
+    void appendCAS(Value* atomicValue, bool invert)
+    {
+        AtomicValue* atomic = atomicValue->as<AtomicValue>();
+        RELEASE_ASSERT(atomic);
+        
+        bool isBranch = m_value->opcode() == Branch;
+        bool isStrong = atomic->opcode() == AtomicStrongCAS;
+        bool returnsOldValue = m_value->opcode() == AtomicStrongCAS;
+        bool hasFence = atomic->hasFence();
+        
+        Width width = atomic->accessWidth();
+        Arg address = addr(atomic);
+
+        Tmp valueResultTmp;
+        Tmp boolResultTmp;
+        if (returnsOldValue) {
+            RELEASE_ASSERT(!invert);
+            valueResultTmp = tmp(m_value);
+            boolResultTmp = m_code.newTmp(GP);
+        } else if (isBranch) {
+            valueResultTmp = m_code.newTmp(GP);
+            boolResultTmp = m_code.newTmp(GP);
+        } else {
+            valueResultTmp = m_code.newTmp(GP);
+            boolResultTmp = tmp(m_value);
+        }
+        
+        Tmp successBoolResultTmp;
+        if (isStrong && !isBranch)
+            successBoolResultTmp = m_code.newTmp(GP);
+        else
+            successBoolResultTmp = boolResultTmp;
+
+        Tmp expectedValueTmp = tmp(atomic->child(0));
+        Tmp newValueTmp = tmp(atomic->child(1));
+        
+        Air::FrequentedBlock success;
+        Air::FrequentedBlock failure;
+        if (isBranch) {
+            success = m_blockToBlock[m_block]->successor(invert);
+            failure = m_blockToBlock[m_block]->successor(!invert);
+        }
+        
+        if (isX86()) {
+            append(relaxedMoveForType(atomic->accessType()), immOrTmp(atomic->child(0)), m_eax);
+            if (returnsOldValue) {
+                append(OPCODE_FOR_WIDTH(AtomicStrongCAS, width), m_eax, newValueTmp, address);
+                append(relaxedMoveForType(atomic->accessType()), m_eax, valueResultTmp);
+            } else if (isBranch) {
+                append(OPCODE_FOR_WIDTH(BranchAtomicStrongCAS, width), Arg::statusCond(MacroAssembler::Success), m_eax, newValueTmp, address);
+                m_blockToBlock[m_block]->setSuccessors(success, failure);
+            } else
+                append(OPCODE_FOR_WIDTH(AtomicStrongCAS, width), Arg::statusCond(invert ? MacroAssembler::Failure : MacroAssembler::Success), m_eax, tmp(atomic->child(1)), address, boolResultTmp);
+            return;
+        }
+        
+        RELEASE_ASSERT(isARM64());
+        // We wish to emit:
+        //
+        // Block #reloop:
+        //     LoadLink
+        //     Branch NotEqual
+        //   Successors: Then:#fail, Else: #store
+        // Block #store:
+        //     StoreCond
+        //     Xor $1, %result    <--- only if !invert
+        //     Jump
+        //   Successors: #done
+        // Block #fail:
+        //     Move $invert, %result
+        //     Jump
+        //   Successors: #done
+        // Block #done:
+        
+        Air::BasicBlock* reloopBlock = newBlock();
+        Air::BasicBlock* storeBlock = newBlock();
+        Air::BasicBlock* successBlock = nullptr;
+        if (!isBranch && isStrong)
+            successBlock = newBlock();
+        Air::BasicBlock* failBlock = nullptr;
+        if (!isBranch) {
+            failBlock = newBlock();
+            failure = failBlock;
+        }
+        Air::BasicBlock* strongFailBlock;
+        if (isStrong && hasFence)
+            strongFailBlock = newBlock();
+        Air::FrequentedBlock comparisonFail = failure;
+        Air::FrequentedBlock weakFail;
+        if (isStrong) {
+            if (hasFence)
+                comparisonFail = strongFailBlock;
+            weakFail = reloopBlock;
+        } else 
+            weakFail = failure;
+        Air::BasicBlock* beginBlock;
+        Air::BasicBlock* doneBlock;
+        splitBlock(beginBlock, doneBlock);
+        
+        append(Air::Jump);
+        beginBlock->setSuccessors(reloopBlock);
+        
+        reloopBlock->append(loadLinkOpcode(width, atomic->hasFence()), m_value, address, valueResultTmp);
+        reloopBlock->append(OPCODE_FOR_CANONICAL_WIDTH(Branch, width), m_value, Arg::relCond(MacroAssembler::NotEqual), valueResultTmp, expectedValueTmp);
+        reloopBlock->setSuccessors(comparisonFail, storeBlock);
+        
+        storeBlock->append(storeCondOpcode(width, atomic->hasFence()), m_value, newValueTmp, address, successBoolResultTmp);
+        if (isBranch) {
+            storeBlock->append(BranchTest32, m_value, Arg::resCond(MacroAssembler::Zero), boolResultTmp, boolResultTmp);
+            storeBlock->setSuccessors(success, weakFail);
+            doneBlock->successors().clear();
+            RELEASE_ASSERT(!doneBlock->size());
+            doneBlock->append(Air::Oops, m_value);
+        } else {
+            if (isStrong) {
+                storeBlock->append(BranchTest32, m_value, Arg::resCond(MacroAssembler::Zero), successBoolResultTmp, successBoolResultTmp);
+                storeBlock->setSuccessors(successBlock, reloopBlock);
+                
+                successBlock->append(Move, m_value, Arg::imm(!invert), boolResultTmp);
+                successBlock->append(Air::Jump, m_value);
+                successBlock->setSuccessors(doneBlock);
+            } else {
+                if (!invert)
+                    storeBlock->append(Xor32, m_value, Arg::bitImm(1), boolResultTmp, boolResultTmp);
+                
+                storeBlock->append(Air::Jump, m_value);
+                storeBlock->setSuccessors(doneBlock);
+            }
+            
+            failBlock->append(Move, m_value, Arg::imm(invert), boolResultTmp);
+            failBlock->append(Air::Jump, m_value);
+            failBlock->setSuccessors(doneBlock);
+        }
+        
+        if (isStrong && hasFence) {
+            Tmp tmp = m_code.newTmp(GP);
+            strongFailBlock->append(storeCondOpcode(width, atomic->hasFence()), m_value, valueResultTmp, address, tmp);
+            strongFailBlock->append(BranchTest32, m_value, Arg::resCond(MacroAssembler::Zero), tmp, tmp);
+            strongFailBlock->setSuccessors(failure, reloopBlock);
+        }
+    }
+    
+    bool appendVoidAtomic(Air::Opcode atomicOpcode)
+    {
+        if (m_useCounts.numUses(m_value))
+            return false;
+        
+        Arg address = addr(m_value);
+        
+        if (isValidForm(atomicOpcode, Arg::Imm, address.kind()) && imm(m_value->child(0))) {
+            append(atomicOpcode, imm(m_value->child(0)), address);
+            return true;
+        }
+        
+        if (isValidForm(atomicOpcode, Arg::Tmp, address.kind())) {
+            append(atomicOpcode, tmp(m_value->child(0)), address);
+            return true;
+        }
+        
+        return false;
+    }
+    
+    void appendGeneralAtomic(Air::Opcode opcode, Commutativity commutativity = NotCommutative)
+    {
+        AtomicValue* atomic = m_value->as<AtomicValue>();
+        
+        Arg address = addr(m_value);
+        Tmp oldValue = m_code.newTmp(GP);
+        Tmp newValue = opcode == Air::Nop ? tmp(atomic->child(0)) : m_code.newTmp(GP);
+        
+        // We need a CAS loop or a LL/SC loop. Using prepare/attempt jargon, we want:
+        //
+        // Block #reloop:
+        //     Prepare
+        //     opcode
+        //     Attempt
+        //   Successors: Then:#done, Else:#reloop
+        // Block #done:
+        //     Move oldValue, result
+        
+        append(relaxedMoveForType(atomic->type()), oldValue, tmp(atomic));
+        
+        Air::BasicBlock* reloopBlock = newBlock();
+        Air::BasicBlock* beginBlock;
+        Air::BasicBlock* doneBlock;
+        splitBlock(beginBlock, doneBlock);
+        
+        append(Air::Jump);
+        beginBlock->setSuccessors(reloopBlock);
+        
+        Air::Opcode prepareOpcode;
+        if (isX86()) {
+            switch (atomic->accessWidth()) {
+            case Width8:
+                prepareOpcode = Load8SignedExtendTo32;
+                break;
+            case Width16:
+                prepareOpcode = Load16SignedExtendTo32;
+                break;
+            case Width32:
+                prepareOpcode = Move32;
+                break;
+            case Width64:
+                prepareOpcode = Move;
+                break;
+            }
+        } else {
+            RELEASE_ASSERT(isARM64());
+            prepareOpcode = loadLinkOpcode(atomic->accessWidth(), atomic->hasFence());
+        }
+        reloopBlock->append(prepareOpcode, m_value, address, oldValue);
+        
+        if (opcode != Air::Nop) {
+            // FIXME: If we ever have to write this again, we need to find a way to share the code with
+            // appendBinOp.
+            // https://bugs.webkit.org/show_bug.cgi?id=169249
+            if (commutativity == Commutative && imm(atomic->child(0)) && isValidForm(opcode, Arg::Imm, Arg::Tmp, Arg::Tmp))
+                reloopBlock->append(opcode, m_value, imm(atomic->child(0)), oldValue, newValue);
+            else if (imm(atomic->child(0)) && isValidForm(opcode, Arg::Tmp, Arg::Imm, Arg::Tmp))
+                reloopBlock->append(opcode, m_value, oldValue, imm(atomic->child(0)), newValue);
+            else if (commutativity == Commutative && bitImm(atomic->child(0)) && isValidForm(opcode, Arg::BitImm, Arg::Tmp, Arg::Tmp))
+                reloopBlock->append(opcode, m_value, bitImm(atomic->child(0)), oldValue, newValue);
+            else if (isValidForm(opcode, Arg::Tmp, Arg::Tmp, Arg::Tmp))
+                reloopBlock->append(opcode, m_value, oldValue, tmp(atomic->child(0)), newValue);
+            else {
+                reloopBlock->append(relaxedMoveForType(atomic->type()), m_value, oldValue, newValue);
+                if (imm(atomic->child(0)) && isValidForm(opcode, Arg::Imm, Arg::Tmp))
+                    reloopBlock->append(opcode, m_value, imm(atomic->child(0)), newValue);
+                else
+                    reloopBlock->append(opcode, m_value, tmp(atomic->child(0)), newValue);
+            }
+        }
+
+        if (isX86()) {
+            Air::Opcode casOpcode = OPCODE_FOR_WIDTH(BranchAtomicStrongCAS, atomic->accessWidth());
+            reloopBlock->append(relaxedMoveForType(atomic->type()), m_value, oldValue, m_eax);
+            reloopBlock->append(casOpcode, m_value, Arg::statusCond(MacroAssembler::Success), m_eax, newValue, address);
+        } else {
+            RELEASE_ASSERT(isARM64());
+            Tmp boolResult = m_code.newTmp(GP);
+            reloopBlock->append(storeCondOpcode(atomic->accessWidth(), atomic->hasFence()), m_value, newValue, address, boolResult);
+            reloopBlock->append(BranchTest32, m_value, Arg::resCond(MacroAssembler::Zero), boolResult, boolResult);
+        }
+        reloopBlock->setSuccessors(doneBlock, reloopBlock);
+    }
+    
     void lower()
     {
         switch (m_value->opcode()) {
@@ -1970,30 +2382,50 @@ private:
         }
             
         case Load: {
-            append(trappingInst(m_value, moveForType(m_value->type()), m_value, addr(m_value), tmp(m_value)));
+            MemoryValue* memory = m_value->as<MemoryValue>();
+            Air::Opcode opcode = Air::Oops;
+            if (memory->hasFence()) {
+                switch (memory->type()) {
+                case Int32:
+                    opcode = LoadAcq32;
+                    break;
+                case Int64:
+                    opcode = LoadAcq64;
+                    break;
+                default:
+                    RELEASE_ASSERT_NOT_REACHED();
+                    break;
+                }
+            } else
+                opcode = moveForType(memory->type());
+            append(trappingInst(m_value, opcode, m_value, addr(m_value), tmp(m_value)));
             return;
         }
             
         case Load8S: {
-            append(trappingInst(m_value, Load8SignedExtendTo32, m_value, addr(m_value), tmp(m_value)));
+            Air::Opcode opcode = m_value->as<MemoryValue>()->hasFence() ? LoadAcq8SignedExtendTo32 : Load8SignedExtendTo32;
+            append(trappingInst(m_value, opcode, m_value, addr(m_value), tmp(m_value)));
             return;
         }
 
         case Load8Z: {
-            append(trappingInst(m_value, Load8, m_value, addr(m_value), tmp(m_value)));
+            Air::Opcode opcode = m_value->as<MemoryValue>()->hasFence() ? LoadAcq8 : Load8;
+            append(trappingInst(m_value, opcode, m_value, addr(m_value), tmp(m_value)));
             return;
         }
 
         case Load16S: {
-            append(trappingInst(m_value, Load16SignedExtendTo32, m_value, addr(m_value), tmp(m_value)));
+            Air::Opcode opcode = m_value->as<MemoryValue>()->hasFence() ? LoadAcq16SignedExtendTo32 : Load16SignedExtendTo32;
+            append(trappingInst(m_value, opcode, m_value, addr(m_value), tmp(m_value)));
             return;
         }
 
         case Load16Z: {
-            append(trappingInst(m_value, Load16, m_value, addr(m_value), tmp(m_value)));
+            Air::Opcode opcode = m_value->as<MemoryValue>()->hasFence() ? LoadAcq16 : Load16;
+            append(trappingInst(m_value, opcode, m_value, addr(m_value), tmp(m_value)));
             return;
         }
-
+            
         case Add: {
             if (tryAppendLea())
                 return;
@@ -2091,7 +2523,7 @@ private:
             if (m_value->isChill())
                 RELEASE_ASSERT(isARM64());
             if (isInt(m_value->type()) && isX86()) {
-                lowerX86Div(Div);
+                appendX86Div(Div);
                 return;
             }
             ASSERT(!isX86() || isFloat(m_value->type()));
@@ -2102,7 +2534,7 @@ private:
 
         case UDiv: {
             if (isInt(m_value->type()) && isX86()) {
-                lowerX86UDiv(UDiv);
+                appendX86UDiv(UDiv);
                 return;
             }
 
@@ -2116,13 +2548,13 @@ private:
         case Mod: {
             RELEASE_ASSERT(isX86());
             RELEASE_ASSERT(!m_value->isChill());
-            lowerX86Div(Mod);
+            appendX86Div(Mod);
             return;
         }
 
         case UMod: {
             RELEASE_ASSERT(isX86());
-            lowerX86UDiv(UMod);
+            appendX86UDiv(UMod);
             return;
         }
 
@@ -2161,10 +2593,28 @@ private:
                 appendUnOp<Not32, Not64>(m_value->child(0));
                 return;
             }
+            
+            // This pattern is super useful on both x86 and ARM64, since the inversion of the CAS result
+            // can be done with zero cost on x86 (just flip the set from E to NE) and it's a progression
+            // on ARM64 (since STX returns 0 on success, so ordinarily we have to flip it).
+            if (m_value->child(1)->isInt(1)
+                && isAtomicCAS(m_value->child(0)->opcode())
+                && canBeInternal(m_value->child(0))) {
+                commitInternal(m_value->child(0));
+                appendCAS(m_value->child(0), true);
+                return;
+            }
+            
             appendBinOp<Xor32, Xor64, XorDouble, XorFloat, Commutative>(
                 m_value->child(0), m_value->child(1));
             return;
         }
+            
+        case Depend: {
+            RELEASE_ASSERT(isARM64());
+            appendUnOp<Depend32, Depend64>(m_value->child(0));
+            return;
+        }
 
         case Shl: {
             if (m_value->child(1)->isInt32(1)) {
@@ -2265,7 +2715,7 @@ private:
                 }
             }
 
-            appendStore(valueToStore, addr(m_value));
+            appendStore(m_value, addr(m_value));
             return;
         }
 
@@ -2286,7 +2736,7 @@ private:
                     return;
                 }
             }
-            appendStore(Air::Store8, valueToStore, addr(m_value));
+            appendStore(m_value, addr(m_value));
             return;
         }
 
@@ -2307,7 +2757,7 @@ private:
                     return;
                 }
             }
-            appendStore(Air::Store16, valueToStore, addr(m_value));
+            appendStore(m_value, addr(m_value));
             return;
         }
 
@@ -2415,7 +2865,27 @@ private:
         }
 
         case Equal:
-        case NotEqual:
+        case NotEqual: {
+            // FIXME: Teach this to match patterns that arise from subwidth CAS. The CAS's result has to
+            // be either zero- or sign-extended, and the value it's compared to should also be zero- or
+            // sign-extended in a matching way. It's not super clear that this is very profitable.
+            // https://bugs.webkit.org/show_bug.cgi?id=169250
+            if (m_value->child(0)->opcode() == AtomicStrongCAS
+                && m_value->child(0)->as<AtomicValue>()->isCanonicalWidth()
+                && m_value->child(0)->child(0) == m_value->child(1)
+                && canBeInternal(m_value->child(0))) {
+                ASSERT(!m_locked.contains(m_value->child(0)->child(1)));
+                ASSERT(!m_locked.contains(m_value->child(1)));
+                
+                commitInternal(m_value->child(0));
+                appendCAS(m_value->child(0), m_value->opcode() == NotEqual);
+                return;
+            }
+                
+            m_insts.last().append(createCompare(m_value));
+            return;
+        }
+            
         case LessThan:
         case GreaterThan:
         case LessEqual:
@@ -2440,6 +2910,7 @@ private:
                 config.moveConditionallyFloat = MoveConditionallyFloat;
             } else {
                 // FIXME: it's not obvious that these are particularly efficient.
+                // https://bugs.webkit.org/show_bug.cgi?id=169251
                 config.moveConditionally32 = MoveDoubleConditionally32;
                 config.moveConditionally64 = MoveDoubleConditionally64;
                 config.moveConditionallyTest32 = MoveDoubleConditionallyTest32;
@@ -2726,6 +3197,46 @@ private:
         }
 
         case Branch: {
+            if (canBeInternal(m_value->child(0))) {
+                Value* branchChild = m_value->child(0);
+                switch (branchChild->opcode()) {
+                case AtomicWeakCAS:
+                    commitInternal(branchChild);
+                    appendCAS(branchChild, false);
+                    return;
+                    
+                case AtomicStrongCAS:
+                    // A branch is a comparison to zero.
+                    // FIXME: Teach this to match patterns that arise from subwidth CAS.
+                    // https://bugs.webkit.org/show_bug.cgi?id=169250
+                    if (branchChild->child(0)->isInt(0)
+                        && branchChild->as<AtomicValue>()->isCanonicalWidth()) {
+                        commitInternal(branchChild);
+                        appendCAS(branchChild, true);
+                        return;
+                    }
+                    break;
+                    
+                case Equal:
+                case NotEqual:
+                    // FIXME: Teach this to match patterns that arise from subwidth CAS.
+                    // https://bugs.webkit.org/show_bug.cgi?id=169250
+                    if (branchChild->child(0)->opcode() == AtomicStrongCAS
+                        && branchChild->child(0)->as<AtomicValue>()->isCanonicalWidth()
+                        && canBeInternal(branchChild->child(0))
+                        && branchChild->child(0)->child(0) == branchChild->child(1)) {
+                        commitInternal(branchChild);
+                        commitInternal(branchChild->child(0));
+                        appendCAS(branchChild->child(0), branchChild->opcode() == NotEqual);
+                        return;
+                    }
+                    break;
+                    
+                default:
+                    break;
+                }
+            }
+            
             m_insts.last().append(createBranch(m_value->child(0)));
             return;
         }
@@ -2783,7 +3294,81 @@ private:
             append(Air::EntrySwitch);
             return;
         }
+            
+        case AtomicWeakCAS:
+        case AtomicStrongCAS: {
+            appendCAS(m_value, false);
+            return;
+        }
+            
+        case AtomicXchgAdd: {
+            AtomicValue* atomic = m_value->as<AtomicValue>();
+            if (appendVoidAtomic(OPCODE_FOR_WIDTH(AtomicAdd, atomic->accessWidth())))
+                return;
+            
+            Arg address = addr(atomic);
+            Air::Opcode opcode = OPCODE_FOR_WIDTH(AtomicXchgAdd, atomic->accessWidth());
+            if (isValidForm(opcode, Arg::Tmp, address.kind())) {
+                append(relaxedMoveForType(atomic->type()), tmp(atomic->child(0)), tmp(atomic));
+                append(opcode, tmp(atomic), address);
+                return;
+            }
+
+            appendGeneralAtomic(OPCODE_FOR_CANONICAL_WIDTH(Add, atomic->accessWidth()), Commutative);
+            return;
+        }
 
+        case AtomicXchgSub: {
+            AtomicValue* atomic = m_value->as<AtomicValue>();
+            if (appendVoidAtomic(OPCODE_FOR_WIDTH(AtomicSub, atomic->accessWidth())))
+                return;
+            
+            appendGeneralAtomic(OPCODE_FOR_CANONICAL_WIDTH(Sub, atomic->accessWidth()));
+            return;
+        }
+            
+        case AtomicXchgAnd: {
+            AtomicValue* atomic = m_value->as<AtomicValue>();
+            if (appendVoidAtomic(OPCODE_FOR_WIDTH(AtomicAnd, atomic->accessWidth())))
+                return;
+            
+            appendGeneralAtomic(OPCODE_FOR_CANONICAL_WIDTH(And, atomic->accessWidth()), Commutative);
+            return;
+        }
+            
+        case AtomicXchgOr: {
+            AtomicValue* atomic = m_value->as<AtomicValue>();
+            if (appendVoidAtomic(OPCODE_FOR_WIDTH(AtomicOr, atomic->accessWidth())))
+                return;
+            
+            appendGeneralAtomic(OPCODE_FOR_CANONICAL_WIDTH(Or, atomic->accessWidth()), Commutative);
+            return;
+        }
+            
+        case AtomicXchgXor: {
+            AtomicValue* atomic = m_value->as<AtomicValue>();
+            if (appendVoidAtomic(OPCODE_FOR_WIDTH(AtomicXor, atomic->accessWidth())))
+                return;
+            
+            appendGeneralAtomic(OPCODE_FOR_CANONICAL_WIDTH(Xor, atomic->accessWidth()), Commutative);
+            return;
+        }
+            
+        case AtomicXchg: {
+            AtomicValue* atomic = m_value->as<AtomicValue>();
+            
+            Arg address = addr(atomic);
+            Air::Opcode opcode = OPCODE_FOR_WIDTH(AtomicXchg, atomic->accessWidth());
+            if (isValidForm(opcode, Arg::Tmp, address.kind())) {
+                append(relaxedMoveForType(atomic->type()), tmp(atomic->child(0)), tmp(atomic));
+                append(opcode, tmp(atomic), address);
+                return;
+            }
+            
+            appendGeneralAtomic(Air::Nop);
+            return;
+        }
+            
         default:
             break;
         }
@@ -2791,64 +3376,7 @@ private:
         dataLog("FATAL: could not lower ", deepDump(m_procedure, m_value), "\n");
         RELEASE_ASSERT_NOT_REACHED();
     }
-
-    void lowerX86Div(B3::Opcode op)
-    {
-#if CPU(X86) || CPU(X86_64)
-        Tmp eax = Tmp(X86Registers::eax);
-        Tmp edx = Tmp(X86Registers::edx);
-
-        Air::Opcode convertToDoubleWord;
-        Air::Opcode div;
-        switch (m_value->type()) {
-        case Int32:
-            convertToDoubleWord = X86ConvertToDoubleWord32;
-            div = X86Div32;
-            break;
-        case Int64:
-            convertToDoubleWord = X86ConvertToQuadWord64;
-            div = X86Div64;
-            break;
-        default:
-            RELEASE_ASSERT_NOT_REACHED();
-            return;
-        }
-
-        ASSERT(op == Div || op == Mod);
-        X86Registers::RegisterID result = op == Div ? X86Registers::eax : X86Registers::edx;
-
-        append(Move, tmp(m_value->child(0)), eax);
-        append(convertToDoubleWord, eax, edx);
-        append(div, eax, edx, tmp(m_value->child(1)));
-        append(Move, Tmp(result), tmp(m_value));
-
-#else
-        UNUSED_PARAM(op);
-        UNREACHABLE_FOR_PLATFORM();
-#endif
-    }
-
-    void lowerX86UDiv(B3::Opcode op)
-    {
-#if CPU(X86) || CPU(X86_64)
-        Tmp eax = Tmp(X86Registers::eax);
-        Tmp edx = Tmp(X86Registers::edx);
-
-        Air::Opcode div = m_value->type() == Int32 ? X86UDiv32 : X86UDiv64;
-
-        ASSERT(op == UDiv || op == UMod);
-        X86Registers::RegisterID result = op == UDiv ? X86Registers::eax : X86Registers::edx;
-
-        append(Move, tmp(m_value->child(0)), eax);
-        append(Xor64, edx, edx);
-        append(div, eax, edx, tmp(m_value->child(1)));
-        append(Move, Tmp(result), tmp(m_value));
-#else
-        UNUSED_PARAM(op);
-        UNREACHABLE_FOR_PLATFORM();
-#endif
-    }
-
+    
     IndexSet<Value> m_locked; // These are values that will have no Tmp in Air.
     IndexMap<Value, Tmp> m_valueToTmp; // These are values that must have a Tmp in Air. We say that a Value* with a non-null Tmp is "pinned".
     IndexMap<Value, Tmp> m_phiToTmp; // Each Phi gets its own Tmp.
@@ -2863,7 +3391,7 @@ private:
 
     Vector<Vector<Inst, 4>> m_insts;
     Vector<Inst> m_prologue;
-
+    
     B3::BasicBlock* m_block;
     bool m_isRare;
     unsigned m_index;
@@ -2874,6 +3402,12 @@ private:
 
     Procedure& m_procedure;
     Code& m_code;
+
+    Air::BlockInsertionSet m_blockInsertionSet;
+
+    Tmp m_eax;
+    Tmp m_ecx;
+    Tmp m_edx;
 };
 
 } // anonymous namespace
index a668376..9365749 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -32,7 +32,8 @@ namespace JSC { namespace B3 {
 class Procedure;
 namespace Air { class Code; }
 
-// This lowers the current B3 procedure to an Air code.
+// This lowers the current B3 procedure to an Air code. Note that this assumes that the procedure is in
+// quirks mode, but it does not assert this, to simplify how we write tests.
 
 JS_EXPORT_PRIVATE void lowerToAir(Procedure&);
 
index e73a0fd..5301d41 100644 (file)
 
 #if ENABLE(B3_JIT)
 
+#include "B3AtomicValue.h"
+#include "B3MemoryValueInlines.h"
+#include "B3ValueInlines.h"
+
 namespace JSC { namespace B3 {
 
 MemoryValue::~MemoryValue()
 {
 }
 
-size_t MemoryValue::accessByteSize() const
+bool MemoryValue::isLegalOffset(int64_t offset) const
 {
-    switch (opcode()) {
-    case Load8Z:
-    case Load8S:
-    case Store8:
-        return 1;
-    case Load16Z:
-    case Load16S:
-    case Store16:
-        return 2;
-    case Load:
-        return sizeofType(type());
-    case Store:
-        return sizeofType(child(0)->type());
-    default:
-        RELEASE_ASSERT_NOT_REACHED();
-        return 0;
-    }
+    return B3::isRepresentableAs<int32_t>(offset) && isLegalOffset(static_cast<int32_t>(offset));
+}
+
+Type MemoryValue::accessType() const
+{
+    if (isLoad())
+        return type();
+    // This happens to work for atomics, too. That's why AtomicValue does not need to override this.
+    return child(0)->type();
 }
 
-Width MemoryValue::accessWidth() const
+Bank MemoryValue::accessBank() const
 {
-    return widthForBytes(accessByteSize());
+    return bankForType(accessType());
+}
+
+size_t MemoryValue::accessByteSize() const
+{
+    return bytes(accessWidth());
 }
 
 void MemoryValue::dumpMeta(CommaPrinter& comma, PrintStream& out) const
@@ -65,8 +66,11 @@ void MemoryValue::dumpMeta(CommaPrinter& comma, PrintStream& out) const
     if (m_offset)
         out.print(comma, "offset = ", m_offset);
     if ((isLoad() && effects().reads != range())
-        || (isStore() && effects().writes != range()))
+        || (isStore() && effects().writes != range())
+        || isExotic())
         out.print(comma, "range = ", range());
+    if (isExotic())
+        out.print(comma, "fenceRange = ", fenceRange());
 }
 
 Value* MemoryValue::cloneImpl() const
@@ -74,6 +78,62 @@ Value* MemoryValue::cloneImpl() const
     return new MemoryValue(*this);
 }
 
+// Use this form for Load (but not Load8Z, Load8S, or any of the Loads that have a suffix that
+// describes the returned type).
+MemoryValue::MemoryValue(Kind kind, Type type, Origin origin, Value* pointer, int32_t offset, HeapRange range, HeapRange fenceRange)
+    : Value(CheckedOpcode, kind, type, origin, pointer)
+    , m_offset(offset)
+    , m_range(range)
+    , m_fenceRange(fenceRange)
+{
+    if (!ASSERT_DISABLED) {
+        switch (kind.opcode()) {
+        case Load:
+            break;
+        case Load8Z:
+        case Load8S:
+        case Load16Z:
+        case Load16S:
+            ASSERT(type == Int32);
+            break;
+        case Store8:
+        case Store16:
+        case Store:
+            ASSERT(type == Void);
+            break;
+        default:
+            ASSERT_NOT_REACHED();
+        }
+    }
+}
+
+// Use this form for loads where the return type is implied.
+MemoryValue::MemoryValue(Kind kind, Origin origin, Value* pointer, int32_t offset, HeapRange range, HeapRange fenceRange)
+    : MemoryValue(kind, Int32, origin, pointer, offset, range, fenceRange)
+{
+    if (!ASSERT_DISABLED) {
+        switch (kind.opcode()) {
+        case Load8Z:
+        case Load8S:
+        case Load16Z:
+        case Load16S:
+            break;
+        default:
+            ASSERT_NOT_REACHED();
+        }
+    }
+}
+
+// Use this form for stores.
+MemoryValue::MemoryValue(Kind kind, Origin origin, Value* value, Value* pointer, int32_t offset, HeapRange range, HeapRange fenceRange)
+    : Value(CheckedOpcode, kind, Void, origin, value, pointer)
+    , m_offset(offset)
+    , m_range(range)
+    , m_fenceRange(fenceRange)
+{
+    ASSERT(B3::isStore(kind.opcode()));
+}
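As a usage illustration (not something this patch adds), a fenced load could be built through the usual appendNew path; the names `proc`, `block`, and `ptr` are assumed, and passing HeapRange::top() for both ranges is just one possible choice:

    // Hypothetical client code: an acquire-style Int32 load at offset 0.
    MemoryValue* acquireLoad = block->appendNew<MemoryValue>(
        proc, Load, Int32, Origin(), ptr,
        0,                   // offset
        HeapRange::top(),    // range: the abstract heap this load reads
        HeapRange::top());   // fenceRange: non-empty, so hasFence() is true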
+
 } } // namespace JSC::B3
 
 #endif // ENABLE(B3_JIT)
index 5b53939..cec0380 100644 (file)
 
 #if ENABLE(B3_JIT)
 
+#include "AirArg.h"
+#include "B3Bank.h"
 #include "B3HeapRange.h"
 #include "B3Value.h"
 
 namespace JSC { namespace B3 {
 
-// FIXME: We want to allow fenced memory accesses on ARM.
-// https://bugs.webkit.org/show_bug.cgi?id=162349
-
 class JS_EXPORT_PRIVATE MemoryValue : public Value {
 public:
     static bool accepts(Kind kind)
     {
-        switch (kind.opcode()) {
-        case Load8Z:
-        case Load8S:
-        case Load16Z:
-        case Load16S:
-        case Load:
-        case Store8:
-        case Store16:
-        case Store:
-            return true;
-        default:
-            return false;
-        }
-    }
-
-    static bool isStore(Kind kind)
-    {
-        switch (kind.opcode()) {
-        case Store8:
-        case Store16:
-        case Store:
-            return true;
-        default:
-            return false;
-        }
-    }
-
-    static bool isLoad(Kind kind)
-    {
-        return accepts(kind) && !isStore(kind);
+        return isMemoryAccess(kind.opcode());
     }
 
     ~MemoryValue();
 
     int32_t offset() const { return m_offset; }
     void setOffset(int32_t offset) { m_offset = offset; }
+    
+    // You don't have to worry about using legal offsets unless you've entered quirks mode.
+    bool isLegalOffset(int32_t offset) const;
+    bool isLegalOffset(int64_t offset) const;
+    
+    // A necessary consequence of MemoryValue having an offset is that it participates in instruction
+    // selection. This tells you whether this value will get lowered to something that requires an
+    // offsetless address.
+    bool requiresSimpleAddr() const;
 
     const HeapRange& range() const { return m_range; }
     void setRange(const HeapRange& range) { m_range = range; }
-
-    bool isStore() const { return type() == Void; }
-    bool isLoad() const { return type() != Void; }
-
+    
+    // This is an alias for range.
+    const HeapRange& accessRange() const { return range(); }
+    void setAccessRange(const HeapRange& range) { setRange(range); }
+    
+    const HeapRange& fenceRange() const { return m_fenceRange; }
+    void setFenceRange(const HeapRange& range) { m_fenceRange = range; }
+
+    bool isStore() const { return B3::isStore(opcode()); }
+    bool isLoad() const { return B3::isLoad(opcode()); }
+
+    bool hasFence() const { return !!fenceRange(); }
+    bool isExotic() const { return hasFence() || isAtomic(opcode()); }
+
+    Type accessType() const;
+    Bank accessBank() const;
     size_t accessByteSize() const;
+    
     Width accessWidth() const;
 
+    bool isCanonicalWidth() const { return B3::isCanonicalWidth(accessWidth()); }
+
 protected:
-    void dumpMeta(CommaPrinter& comma, PrintStream&) const override;
+    void dumpMeta(CommaPrinter&, PrintStream&) const override;
 
     Value* cloneImpl() const override;
 
+    template<typename... Arguments>
+    MemoryValue(CheckedOpcodeTag, Kind kind, Type type, Origin origin, int32_t offset, HeapRange range, HeapRange fenceRange, Arguments... arguments)
+        : Value(CheckedOpcode, kind, type, origin, arguments...)
+        , m_offset(offset)
+        , m_range(range)
+        , m_fenceRange(fenceRange)
+    {
+    }
+    
 private:
     friend class Procedure;
-
+    
     // Use this form for Load (but not Load8Z, Load8S, or any of the Loads that have a suffix that
     // describes the returned type).
-    MemoryValue(Kind kind, Type type, Origin origin, Value* pointer, int32_t offset = 0)
-        : Value(CheckedOpcode, kind, type, origin, pointer)
-        , m_offset(offset)
-        , m_range(HeapRange::top())
-    {
-        if (!ASSERT_DISABLED) {
-            switch (kind.opcode()) {
-            case Load:
-                break;
-            case Load8Z:
-            case Load8S:
-            case Load16Z:
-            case Load16S:
-                ASSERT(type == Int32);
-                break;
-            case Store8:
-            case Store16:
-            case Store:
-                ASSERT(type == Void);
-                break;
-            default:
-                ASSERT_NOT_REACHED();
-            }
-        }
-    }
+    MemoryValue(Kind, Type, Origin, Value* pointer, int32_t offset = 0, HeapRange range = HeapRange::top(), HeapRange fenceRange = HeapRange());
 
     // Use this form for loads where the return type is implied.
-    MemoryValue(Kind kind, Origin origin, Value* pointer, int32_t offset = 0)
-        : MemoryValue(kind, Int32, origin, pointer, offset)
-    {
-    }
+    MemoryValue(Kind, Origin, Value* pointer, int32_t offset = 0, HeapRange range = HeapRange::top(), HeapRange fenceRange = HeapRange());
 
     // Use this form for stores.
-    MemoryValue(Kind kind, Origin origin, Value* value, Value* pointer, int32_t offset = 0)
-        : Value(CheckedOpcode, kind, Void, origin, value, pointer)
-        , m_offset(offset)
-        , m_range(HeapRange::top())
-    {
-        if (!ASSERT_DISABLED) {
-            switch (kind.opcode()) {
-            case Store8:
-            case Store16:
-            case Store:
-                break;
-            default:
-                ASSERT_NOT_REACHED();
-                break;
-            }
-        }
-    }
-
+    MemoryValue(Kind, Origin, Value* value, Value* pointer, int32_t offset = 0, HeapRange range = HeapRange::top(), HeapRange fenceRange = HeapRange());
+    
     int32_t m_offset { 0 };
-    HeapRange m_range;
+    HeapRange m_range { HeapRange::top() };
+    HeapRange m_fenceRange { HeapRange() };
 };
 
 } } // namespace JSC::B3
diff --git a/Source/JavaScriptCore/b3/B3MemoryValueInlines.h b/Source/JavaScriptCore/b3/B3MemoryValueInlines.h
new file mode 100644 (file)
index 0000000..a47232c
--- /dev/null
@@ -0,0 +1,84 @@
+/*
+ * Copyright (C) 2017 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#pragma once
+
+#if ENABLE(B3_JIT)
+
+#include "AirArg.h"
+#include "B3AtomicValue.h"
+
+namespace JSC { namespace B3 {
+
+inline bool MemoryValue::isLegalOffset(int32_t offset) const
+{
+    // NOTE: This is inline because it constant-folds to true on x86!
+    
+    // So far only X86 allows exotic loads to have an offset.
+    if (requiresSimpleAddr())
+        return !offset;
+    
+    return Air::Arg::isValidAddrForm(offset, accessWidth());
+}
+
+inline bool MemoryValue::requiresSimpleAddr() const
+{
+    return !isX86() && isExotic();
+}
+
+inline Width MemoryValue::accessWidth() const
+{
+    switch (opcode()) {
+    case Load8Z:
+    case Load8S:
+    case Store8:
+        return Width8;
+    case Load16Z:
+    case Load16S:
+    case Store16:
+        return Width16;
+    case Load:
+        return widthForType(type());
+    case Store:
+        return widthForType(child(0)->type());
+    case AtomicWeakCAS:
+    case AtomicStrongCAS:
+    case AtomicXchgAdd:
+    case AtomicXchgAnd:
+    case AtomicXchgOr:
+    case AtomicXchgSub:
+    case AtomicXchgXor:
+    case AtomicXchg:
+        return as<AtomicValue>()->accessWidth();
+    default:
+        RELEASE_ASSERT_NOT_REACHED();
+        return Width8;
+    }
+}
+
+} } // namespace JSC::B3
+
+#endif // ENABLE(B3_JIT)
+
index 5b29376..1f9d8a0 100644 (file)
@@ -32,7 +32,7 @@
 #include "B3BasicBlockInlines.h"
 #include "B3Dominators.h"
 #include "B3InsertionSetInlines.h"
-#include "B3MemoryValue.h"
+#include "B3MemoryValueInlines.h"
 #include "B3PhaseScope.h"
 #include "B3ProcedureInlines.h"
 #include "B3ValueInlines.h"
@@ -203,12 +203,8 @@ private:
                                 if (!candidatePointer->hasIntPtr())
                                     return false;
                                 
-                                intptr_t offset = desiredOffset(candidatePointer);
-                                if (!B3::isRepresentableAs<int32_t>(static_cast<int64_t>(offset)))
-                                    return false;
-                                return Air::Arg::isValidAddrForm(
-                                    static_cast<int32_t>(offset),
-                                    widthForBytes(memoryValue->accessByteSize()));
+                                int64_t offset = desiredOffset(candidatePointer);
+                                return memoryValue->isLegalOffset(offset);
                             });
                         
                         if (bestPointer) {
diff --git a/Source/JavaScriptCore/b3/B3NativeTraits.h b/Source/JavaScriptCore/b3/B3NativeTraits.h
new file mode 100644 (file)
index 0000000..5b1787d
--- /dev/null
@@ -0,0 +1,111 @@
+/*
+ * Copyright (C) 2017 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+#pragma once
+
+#if ENABLE(B3_JIT)
+
+#include "B3Bank.h"
+#include "B3Type.h"
+#include "B3Width.h"
+
+namespace JSC { namespace B3 {
+
+template<typename> struct NativeTraits;
+
+template<> struct NativeTraits<int8_t> {
+    typedef int32_t CanonicalType;
+    static const Bank bank = GP;
+    static const Width width = Width8;
+    static const Type type = Int32;
+};
+
+template<> struct NativeTraits<uint8_t> {
+    typedef int32_t CanonicalType;
+    static const Bank bank = GP;
+    static const Width width = Width8;
+    static const Type type = Int32;
+};
+
+template<> struct NativeTraits<int16_t> {
+    typedef int32_t CanonicalType;
+    static const Bank bank = GP;
+    static const Width width = Width16;
+    static const Type type = Int32;
+};
+
+template<> struct NativeTraits<uint16_t> {
+    typedef int32_t CanonicalType;
+    static const Bank bank = GP;
+    static const Width width = Width16;
+    static const Type type = Int32;
+};
+
+template<> struct NativeTraits<int32_t> {
+    typedef int32_t CanonicalType;
+    static const Bank bank = GP;
+    static const Width width = Width32;
+    static const Type type = Int32;
+};
+
+template<> struct NativeTraits<uint32_t> {
+    typedef int32_t CanonicalType;
+    static const Bank bank = GP;
+    static const Width width = Width32;
+    static const Type type = Int32;
+};
+
+template<> struct NativeTraits<int64_t> {
+    typedef int64_t CanonicalType;
+    static const Bank bank = GP;
+    static const Width width = Width64;
+    static const Type type = Int64;
+};
+
+template<> struct NativeTraits<uint64_t> {
+    typedef int64_t CanonicalType;
+    static const Bank bank = GP;
+    static const Width width = Width64;
+    static const Type type = Int64;
+};
+
+template<> struct NativeTraits<float> {
+    typedef float CanonicalType;
+    static const Bank bank = FP;
+    static const Width width = Width32;
+    static const Type type = Float;
+};
+
+template<> struct NativeTraits<double> {
+    typedef double CanonicalType;
+    static const Bank bank = FP;
+    static const Width width = Width64;
+    static const Type type = Double;
+};
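A hedged example of how generic code might consume these traits; the helper below is hypothetical and only exists to show the mapping:

    // Hypothetical generic helper, e.g. for a test harness that exercises atomics per native type.
    template<typename T>
    Type canonicalTypeFor()
    {
        static_assert(sizeof(typename NativeTraits<T>::CanonicalType) >= sizeof(T), "canonical type is at least as wide");
        return NativeTraits<T>::type;
    }

    // canonicalTypeFor<uint16_t>() == Int32, canonicalTypeFor<double>() == Double.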
+
+} } // namespace JSC::B3
+
+#endif // ENABLE(B3_JIT)
+
index a0aa5a9..be076fd 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -68,6 +68,23 @@ std::optional<Opcode> invertedCompare(Opcode opcode, Type type)
     }
 }
 
+Opcode storeOpcode(Bank bank, Width width)
+{
+    switch (bank) {
+    case GP:
+        switch (width) {
+        case Width8:
+            return Store8;
+        case Width16:
+            return Store16;
+        default:
+            return Store;
+        }
+    case FP:
+        return Store;
+    }
+}
+
 } } // namespace JSC::B3
 
 namespace WTF {
@@ -263,6 +280,33 @@ void printInternal(PrintStream& out, Opcode opcode)
     case Store:
         out.print("Store");
         return;
+    case AtomicWeakCAS:
+        out.print("AtomicWeakCAS");
+        return;
+    case AtomicStrongCAS:
+        out.print("AtomicStrongCAS");
+        return;
+    case AtomicXchgAdd:
+        out.print("AtomicXchgAdd");
+        return;
+    case AtomicXchgAnd:
+        out.print("AtomicXchgAnd");
+        return;
+    case AtomicXchgOr:
+        out.print("AtomicXchgOr");
+        return;
+    case AtomicXchgSub:
+        out.print("AtomicXchgSub");
+        return;
+    case AtomicXchgXor:
+        out.print("AtomicXchgXor");
+        return;
+    case AtomicXchg:
+        out.print("AtomicXchg");
+        return;
+    case Depend:
+        out.print("Depend");
+        return;
     case WasmAddress:
         out.print("WasmAddress");
         return;
index 956dba9..77715fe 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -28,6 +28,7 @@
 #if ENABLE(B3_JIT)
 
 #include "B3Type.h"
+#include "B3Width.h"
 #include <wtf/Optional.h>
 #include <wtf/StdLibExtras.h>
 
@@ -82,7 +83,6 @@ enum Opcode : int16_t {
     Mod, // All bets are off as to what will happen when you execute this for -2^31%-1 and x%0.
     UMod,
 
-
     // Polymorphic negation. Note that we only need this for floating point, since integer negation
     // is exactly like Sub(0, x). But that's not true for floating point. Sub(0, 0) is 0, while
     // Neg(0) is -0. Also, we canonicalize Sub(0, x) into Neg(x) in case of integers.
@@ -162,6 +162,110 @@ enum Opcode : int16_t {
     Store16,
     // This is a polymorphic store for Int32, Int64, Float, and Double.
     Store,
+    
+    // Atomic compare and swap that returns a boolean. May choose to do nothing and return false. You can
+    // usually assume that this is faster and results in less code than AtomicStrongCAS, though that's
+    // not necessarily true on Intel, if instruction selection does its job. Imagine that this opcode is
+    // as if you did this atomically:
+    //
+    // template<typename T>
+    // bool AtomicWeakCAS(T expectedValue, T newValue, T* ptr)
+    // {
+    //     if (!rand())
+    //         return false; // Real world example of this: context switch on ARM while doing CAS.
+    //     if (*ptr != expectedValue)
+    //         return false;
+    //     *ptr = newValue;
+    //     return true;
+    // }
+    //
+    // Note that all atomics put the pointer last to be consistent with how loads and stores work. This
+    // is a goofy tradition, but it's harmless, and better than being inconsistent.
+    //
+    // Note that weak CAS has no fencing guarantees when it fails. This means that the following
+    // transformation is always valid:
+    //
+    // Before:
+    //
+    //         Branch(AtomicWeakCAS(expected, new, ptr))
+    //       Successors: Then:#success, Else:#fail
+    //
+    // After:
+    //
+    //         Branch(Equal(Load(ptr), expected))
+    //       Successors: Then:#attempt, Else:#fail
+    //     BB#attempt:
+    //         Branch(AtomicWeakCAS(expected, new, ptr))
+    //       Successors: Then:#success, Else:#fail
+    //
+    // Both kinds of CAS for non-canonical widths (Width8 and Width16) ignore the irrelevant bits of the
+    // input.
+    AtomicWeakCAS,
+    
+    // Atomic compare and swap that returns the old value. Does not have the nondeterminism of WeakCAS.
+    // This is a bit more code and a bit slower in some cases, though not by a lot. Imagine that this
+    // opcode is as if you did this atomically:
+    //
+    // template<typename T>
+    // T AtomicStrongCAS(T expectedValue, T newValue, T* ptr)
+    // {
+    //     T oldValue = *ptr;
+    //     if (oldValue == expectedValue)
+    //         *ptr = newValue;
+    //     return oldValue;
+    // }
+    //
+    // AtomicStrongCAS sign-extends its result for subwidth operations.
+    //
+    // Note that AtomicWeakCAS and AtomicStrongCAS sort of have this kind of equivalence:
+    //
+    // AtomicWeakCAS(@exp, @new, @ptr) == Equal(AtomicStrongCAS(@exp, @new, @ptr), @exp)
+    //
+    // Assuming that the WeakCAS does not spuriously fail, of course.
+    AtomicStrongCAS,
+    
+    // Atomically ___ a memory location and return the old value. Syntax:
+    //
+    // @oldValue = AtomicXchg___(@operand, @ptr)
+    //
+    // For non-canonical widths (Width8 and Width16), these return sign-extended results and ignore the
+    // irrelevant bits of their inputs.
+    AtomicXchgAdd,
+    AtomicXchgAnd,
+    AtomicXchgOr,
+    AtomicXchgSub,
+    AtomicXchgXor,
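As a semantics-only sketch in the same style as the pseudocode above (not an implementation, and with the subwidth sign-extension mentioned above omitted), AtomicXchgAdd behaves as if this ran atomically:

    template<typename T>
    T AtomicXchgAdd(T operand, T* ptr)
    {
        T oldValue = *ptr;          // the load, the add, and the store happen as one atomic step
        *ptr = oldValue + operand;
        return oldValue;
    }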
+    
+    // FIXME: Maybe we should have AtomicXchgNeg.
+    // https://bugs.webkit.org/show_bug.cgi?id=169252
+    
+    // Atomically exchange a value with a memory location. Syntax:
+    //
+    // @oldValue = AtomicXchg(@newValue, @ptr)
+    AtomicXchg,
+    
+    // Introduce an invisible dependency for blocking motion of loads with respect to each other. Syntax:
+    //
+    // @result = Depend(@phantom)
+    //
+    // This is eventually codegenerated to have local semantics as if we did:
+    //
+    // @result = $0
+    //
+    // But it ensures that the users of @result cannot execute until @phantom is computed.
+    //
+    // The compiler is not allowed to reason about the fact that Depend codegenerates this way. Any kind
+    // of transformation or analysis that relies on the insight that Depend is really zero is unsound,
+    // because it unlocks reordering of users of @result and @phantom.
+    //
+    // On X86, this is lowered to a load-load fence and @result uses @phantom directly.
+    //
+    // On ARM, this is lowered as if we did:
+    //
+    // @result = BitXor(@phantom, @phantom)
+    //
+    // Except that the compiler never gets an opportunity to simplify out the BitXor.
+    Depend,
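For illustration only, a hypothetical client could chain a dependency from one load into the address of the next. This assumes the usual appendNew pattern, that the generic Value constructor deduces Depend's type from its child as it does for other unary ops, and invented names (`proc`, `block`, `ptr`, `base`):

    // Sketch: the second load cannot be reordered before the first, because its address
    // depends on the first load's result through Depend.
    Value* first  = block->appendNew<MemoryValue>(proc, Load, Int64, Origin(), ptr);
    Value* dep    = block->appendNew<Value>(proc, Depend, Origin(), first);
    Value* addr   = block->appendNew<Value>(proc, Add, Origin(), base, dep);
    Value* second = block->appendNew<MemoryValue>(proc, Load, Int64, Origin(), addr);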
 
     // This is used to compute the actual address of a Wasm memory operation. It takes an IntPtr
     // and a pinned register then computes the appropriate IntPtr address. For the use-case of
@@ -301,6 +405,112 @@ inline bool isDefinitelyTerminal(Opcode opcode)
     }
 }
 
+inline bool isLoad(Opcode opcode)
+{
+    switch (opcode) {
+    case Load8Z:
+    case Load8S:
+    case Load16Z:
+    case Load16S:
+    case Load:
+        return true;
+    default:
+        return false;
+    }
+}
+
+inline bool isStore(Opcode opcode)
+{
+    switch (opcode) {
+    case Store8:
+    case Store16:
+    case Store:
+        return true;
+    default:
+        return false;
+    }
+}
+
+inline bool isLoadStore(Opcode opcode)
+{
+    switch (opcode) {
+    case Load8Z:
+    case Load8S:
+    case Load16Z:
+    case Load16S:
+    case Load:
+    case Store8:
+    case Store16:
+    case Store:
+        return true;
+    default:
+        return false;
+    }
+}
+
+inline bool isAtomic(Opcode opcode)
+{
+    switch (opcode) {
+    case AtomicWeakCAS:
+    case AtomicStrongCAS:
+    case AtomicXchgAdd:
+    case AtomicXchgAnd:
+    case AtomicXchgOr:
+    case AtomicXchgSub:
+    case AtomicXchgXor:
+    case AtomicXchg:
+        return true;
+    default:
+        return false;
+    }
+}
+
+inline bool isAtomicCAS(Opcode opcode)
+{
+    switch (opcode) {
+    case AtomicWeakCAS:
+    case AtomicStrongCAS:
+        return true;
+    default:
+        return false;
+    }
+}
+
+inline bool isAtomicXchg(Opcode opcode)
+{
+    switch (opcode) {
+    case AtomicXchgAdd:
+    case AtomicXchgAnd:
+    case AtomicXchgOr:
+    case AtomicXchgSub:
+    case AtomicXchgXor:
+    case AtomicXchg:
+        return true;
+    default:
+        return false;
+    }
+}
+
+inline bool isMemoryAccess(Opcode opcode)
+{
+    return isAtomic(opcode) || isLoadStore(opcode);
+}
+
+inline Opcode signExtendOpcode(Width width)
+{
+    switch (width) {
+    case Width8:
+        return SExt8;
+    case Width16:
+        return SExt16;
+    default:
+        RELEASE_ASSERT_NOT_REACHED();
+        return Oops;
+    }
+}
+
+JS_EXPORT_PRIVATE Opcode storeOpcode(Bank bank, Width width);
+
 } } // namespace JSC::B3
 
 namespace WTF {
index 0cb48c4..857952c 100644 (file)
@@ -202,6 +202,8 @@ void Procedure::dump(PrintStream& out) const
         }
         dataLog("    ", deepDump(*this, value), "\n");
     }
+    if (hasQuirks())
+        out.print("Has Quirks: True\n");
     if (variables().size()) {
         out.print("Variables:\n");
         for (Variable* variable : variables())
index 2236145..dcb0cac 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -117,6 +117,7 @@ public:
     Value* addIntConstant(Origin, Type, int64_t value);
     Value* addIntConstant(Value*, int64_t value);
 
+    // You're guaranteed that bottom is zero.
     Value* addBottom(Origin, Type);
     Value* addBottom(Value*);
 
@@ -195,6 +196,17 @@ public:
     // alive. Great for compiler-generated data sections, like switch jump tables and constant pools.
     // This returns memory that has been zero-initialized.
     JS_EXPORT_PRIVATE void* addDataSection(size_t);
+    
+    // Some operations are specified in B3 IR to behave one way, but on the target CPU they behave a
+    // different way. When this is true, those B3 IR ops switch to behaving the CPU way, and the optimizer may
+    // start taking advantage of it.
+    //
+    // One way to think of it is like this. Imagine that you find that the cleanest way of lowering
+    // something in lowerMacros is to unconditionally replace one opcode with another. This is a shortcut
+    // where you instead keep the same opcode, but rely on the opcode's meaning changing once lowerMacros
+    // sets hasQuirks.
+    bool hasQuirks() const { return m_hasQuirks; }
+    void setHasQuirks(bool value) { m_hasQuirks = value; }
 
     OpaqueByproducts& byproducts() { return *m_byproducts; }
 
@@ -252,6 +264,7 @@ private:
     RefPtr<SharedTask<void(PrintStream&, Origin)>> m_originPrinter;
     const void* m_frontendData;
     PCToOriginMap m_pcToOriginMap;
+    bool m_hasQuirks { false };
 };
 
 } } // namespace JSC::B3
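A hedged sketch of how the hasQuirks()/setHasQuirks() API added above is meant to be used; the phase
name is made up for illustration, and only the two accessors come from this patch:

    #include "B3ProcedureInlines.h"

    // Illustration only: the two halves of the quirks workflow.
    static void exampleQuirkUsage(JSC::B3::Procedure& proc)
    {
        using namespace JSC::B3;

        // An optimization that is only sound under the IR-specified semantics checks the bit first,
        // the same way reduceStrength guards its SExt8(AtomicXchg___) folding later in this patch.
        if (!proc.hasQuirks()) {
            // ... rewrites that rely on the B3-specified meaning of an opcode ...
        }

        // A lowering phase that keeps an opcode but wants it to mean "whatever the CPU does" flips
        // the bit once; every later phase then sees the quirky semantics.
        proc.setHasQuirks(true);
    }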
index 0ea3447..eb9c46b 100644
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2016-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -29,6 +29,7 @@
 #if ENABLE(B3_JIT)
 
 #include "B3Dominators.h"
+#include "B3PhaseScope.h"
 #include "B3Value.h"
 
 namespace JSC { namespace B3 {
@@ -89,6 +90,23 @@ bool PureCSE::process(Value* value, Dominators& dominators)
     return false;
 }
 
+bool pureCSE(Procedure& proc)
+{
+    PhaseScope phaseScope(proc, "pureCSE");
+    
+    Dominators& dominators = proc.dominators();
+    PureCSE pureCSE;
+    bool result = false;
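+    // blocksInPreOrder() visits a block before every block it dominates, so a dominating value is
+    // already in the PureCSE match map by the time we reach a value it can replace. The call to
+    // performSubstitution() also forwards children through Identity values left by earlier matches.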
+    for (BasicBlock* block : proc.blocksInPreOrder()) {
+        for (Value* value : *block) {
+            result |= value->performSubstitution();
+            result |= pureCSE.process(value, dominators);
+        }
+    }
+    
+    return result;
+}
+
 } } // namespace JSC::B3
 
 #endif // ENABLE(B3_JIT)
index 942966c..d26e411 100644
@@ -56,6 +56,8 @@ private:
     HashMap<ValueKey, Matches> m_map;
 };
 
+bool pureCSE(Procedure&);
+
 } } // namespace JSC::B3
 
 #endif // ENABLE(B3_JIT)
index 43c7302..3c6bf7c 100644
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
 
 #if ENABLE(B3_JIT)
 
+#include "B3AtomicValue.h"
 #include "B3BasicBlockInlines.h"
 #include "B3BlockInsertionSet.h"
 #include "B3ComputeDivisionMagic.h"
 #include "B3Dominators.h"
 #include "B3InsertionSetInlines.h"
-#include "B3MemoryValue.h"
+#include "B3MemoryValueInlines.h"
 #include "B3PhaseScope.h"
 #include "B3PhiChildren.h"
 #include "B3ProcedureInlines.h"
@@ -602,7 +603,7 @@ private:
             }
 
             break;
-
+            
         case Neg:
             // Turn this: Neg(constant)
             // Into this: -constant
@@ -1295,6 +1296,16 @@ private:
                     break;
                 }
             }
+            
+            if (!m_proc.hasQuirks()) {
+                // Turn this: SExt8(AtomicXchg___)
+                // Into this: AtomicXchg___
+                if (isAtomicXchg(m_value->child(0)->opcode())
+                    && m_value->child(0)->as<AtomicValue>()->accessWidth() == Width8) {
+                    replaceWithIdentity(m_value->child(0));
+                    break;
+                }
+            }
             break;
 
         case SExt16:
@@ -1343,6 +1354,16 @@ private:
                     break;
                 }
             }
+
+            if (!m_proc.hasQuirks()) {
+                // Turn this: SExt16(AtomicXchg___)
+                // Into this: AtomicXchg___
+                if (isAtomicXchg(m_value->child(0)->opcode())
+                    && m_value->child(0)->as<AtomicValue>()->accessWidth() == Width16) {
+                    replaceWithIdentity(m_value->child(0));
+                    break;
+                }
+            }
             break;
 
         case SExt32:
@@ -2101,22 +2122,25 @@ private:
         ASSERT(m_value->numChildren() >= 2);
         
         // Leave it alone if the right child is a constant.
-        if (m_value->child(1)->isConstant())
+        if (m_value->child(1)->isConstant()
+            || m_value->child(0)->opcode() == AtomicStrongCAS)
             return;
         
-        if (m_value->child(0)->isConstant()) {
+        auto swap = [&] () {
             std::swap(m_value->child(0), m_value->child(1));
             m_changed = true;
-            return;
-        }
-
+        };
+        
+        if (m_value->child(0)->isConstant())
+            return swap();
+        
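+        // Canonicalize AtomicStrongCAS into the left child (and leave it there, per the early return
+        // above), so instruction selection only has to match Equal(AtomicStrongCAS(...), expected) in
+        // one orientation.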
+        if (m_value->child(1)->opcode() == AtomicStrongCAS)
+            return swap();
+        
         // Sort the operands. This is an important canonicalization. We use the index instead of
         // the address to make this at least slightly deterministic.
-        if (m_value->child(0)->index() > m_value->child(1)->index()) {
-            std::swap(m_value->child(0), m_value->child(1));
-            m_changed = true;
-            return;
-        }
+        if (m_value->child(0)->index() > m_value->child(1)->index())
+            return swap();
     }
 
     // FIXME: This should really be a forward analysis. Instead, we use a bounded-search backwards
index 8df8ace..6e6a93d 100644
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -30,6 +30,7 @@
 
 #include "AirCode.h"
 #include "B3ArgumentRegValue.h"
+#include "B3AtomicValue.h"
 #include "B3BasicBlockInlines.h"
 #include "B3Dominators.h"
 #include "B3MemoryValue.h"
@@ -346,6 +347,7 @@ public:
                 VALIDATE(value->numChildren() == 1, ("At ", *value));
                 VALIDATE(value->child(0)->type() == pointerType(), ("At ", *value));
                 VALIDATE(value->type() == Int32, ("At ", *value));
+                validateFence(value);
                 validateStackAccess(value);
                 break;
             case Load:
@@ -353,6 +355,7 @@ public:
                 VALIDATE(value->numChildren() == 1, ("At ", *value));
                 VALIDATE(value->child(0)->type() == pointerType(), ("At ", *value));
                 VALIDATE(value->type() != Void, ("At ", *value));
+                validateFence(value);
                 validateStackAccess(value);
                 break;
             case Store8:
@@ -362,6 +365,7 @@ public:
                 VALIDATE(value->child(0)->type() == Int32, ("At ", *value));
                 VALIDATE(value->child(1)->type() == pointerType(), ("At ", *value));
                 VALIDATE(value->type() == Void, ("At ", *value));
+                validateFence(value);
                 validateStackAccess(value);
                 break;
             case Store:
@@ -369,8 +373,49 @@ public:
                 VALIDATE(value->numChildren() == 2, ("At ", *value));
                 VALIDATE(value->child(1)->type() == pointerType(), ("At ", *value));
                 VALIDATE(value->type() == Void, ("At ", *value));
+                validateFence(value);
                 validateStackAccess(value);
                 break;
+            case AtomicWeakCAS:
+                VALIDATE(!value->kind().isChill(), ("At ", *value));
+                VALIDATE(value->numChildren() == 3, ("At ", *value));
+                VALIDATE(value->type() == Int32, ("At ", *value));
+                VALIDATE(value->child(0)->type() == value->child(1)->type(), ("At ", *value));
+                VALIDATE(isInt(value->child(0)->type()), ("At ", *value));
+                VALIDATE(value->child(2)->type() == pointerType(), ("At ", *value));
+                validateAtomic(value);
+                validateStackAccess(value);
+                break;
+            case AtomicStrongCAS:
+                VALIDATE(!value->kind().isChill(), ("At ", *value));
+                VALIDATE(value->numChildren() == 3, ("At ", *value));
+                VALIDATE(value->type() == value->child(0)->type(), ("At ", *value));
+                VALIDATE(value->type() == value->child(1)->type(), ("At ", *value));
+                VALIDATE(isInt(value->type()), ("At ", *value));
+                VALIDATE(value->child(2)->type() == pointerType(), ("At ", *value));
+                validateAtomic(value);
+                validateStackAccess(value);
+                break;
+            case AtomicXchgAdd:
+            case AtomicXchgAnd:
+            case AtomicXchgOr:
+            case AtomicXchgSub:
+            case AtomicXchgXor:
+            case AtomicXchg:
+                VALIDATE(!value->kind().isChill(), ("At ", *value));
+                VALIDATE(value->numChildren() == 2, ("At ", *value));
+                VALIDATE(value->type() == value->child(0)->type(), ("At ", *value));
+                VALIDATE(isInt(value->type()), ("At ", *value));
+                VALIDATE(value->child(1)->type() == pointerType(), ("At ", *value));
+                validateAtomic(value);
+                validateStackAccess(value);
+                break;
+            case Depend:
+                VALIDATE(!value->kind().hasExtraBits(), ("At ", *value));
+                VALIDATE(value->numChildren() == 1, ("At ", *value));
+                VALIDATE(value->type() == value->child(0)->type(), ("At ", *value));
+                VALIDATE(isInt(value->type()), ("At ", *value));
+                break;
             case WasmAddress:
                 VALIDATE(!value->kind().hasExtraBits(), ("At ", *value));
                 VALIDATE(value->numChildren() == 1, ("At ", *value));
@@ -537,6 +582,20 @@ private:
             break;
         }
     }
+    
+    void validateFence(Value* value)
+    {
+        MemoryValue* memory = value->as<MemoryValue>();
+        if (memory->hasFence())
+            VALIDATE(memory->accessBank() == GP, ("Fence at ", *memory));
+    }
+    
+    void validateAtomic(Value* value)
+    {
+        AtomicValue* atomic = value->as<AtomicValue>();
+        
+        VALIDATE(bestType(GP, atomic->accessWidth()) == atomic->accessType(), ("At ", *value));
+    }
 
     void validateStackAccess(Value* value)
     {
index b4fc433..93483ec 100644
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -29,6 +29,7 @@
 #if ENABLE(B3_JIT)
 
 #include "B3ArgumentRegValue.h"
+#include "B3AtomicValue.h"
 #include "B3BasicBlockInlines.h"
 #include "B3BottomProvider.h"
 #include "B3CCallValue.h"
@@ -510,6 +511,7 @@ bool Value::returnsBool() const
     case AboveEqual:
     case BelowEqual:
     case EqualOrUnordered:
+    case AtomicWeakCAS:
         return true;
     case Phi:
         // FIXME: We should have a story here.
@@ -589,6 +591,7 @@ Effects Value::effects() const
     case BelowEqual:
     case EqualOrUnordered:
     case Select:
+    case Depend:
         break;
     case Div:
     case UDiv:
@@ -600,16 +603,44 @@ Effects Value::effects() const
     case Load8S:
     case Load16Z:
     case Load16S:
-    case Load:
-        result.reads = as<MemoryValue>()->range();
+    case Load: {
+        const MemoryValue* memory = as<MemoryValue>();
+        result.reads = memory->range();
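+        // A load that carries acquire semantics also claims to write its fenceRange and sets the fence
+        // bit, so other accesses to that range cannot be moved across it.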
+        if (memory->hasFence()) {
+            result.writes = memory->fenceRange();
+            result.fence = true;
+        }
         result.controlDependent = true;
         break;
+    }
     case Store8:
     case Store16:
-    case Store:
-        result.writes = as<MemoryValue>()->range();
+    case Store: {
+        const MemoryValue* memory = as<MemoryValue>();
+        result.writes = memory->range();
+        if (memory->hasFence()) {
+            result.reads = memory->fenceRange();
+            result.fence = true;
+        }
+        result.controlDependent = true;
+        break;
+    }
+    case AtomicWeakCAS:
+    case AtomicStrongCAS:
+    case AtomicXchgAdd:
+    case AtomicXchgAnd:
+    case AtomicXchgOr:
+    case AtomicXchgSub:
+    case AtomicXchgXor:
+    case AtomicXchg: {
+        const AtomicValue* atomic = as<AtomicValue>();
+        result.reads = atomic->range() | atomic->fenceRange();
+        result.writes = atomic->range() | atomic->fenceRange();
+        if (atomic->hasFence())
+            result.fence = true;
         result.controlDependent = true;
         break;
+    }
     case WasmAddress:
         result.readsPinned = true;
         break;
@@ -617,18 +648,7 @@ Effects Value::effects() const
         const FenceValue* fence = as<FenceValue>();
         result.reads = fence->read;
         result.writes = fence->write;
-        
-        // Prevent killing of fences that claim not to write anything. It's a bit weird that we use
-        // local state as the way to do this, but it happens to work: we must assume that we cannot
-        // kill writesLocalState unless we understands exactly what the instruction is doing (like
-        // the way that fixSSA understands Set/Get and the way that reduceStrength and others
-        // understand Upsilon). This would only become a problem if we had some analysis that was
-        // looking to use the writesLocalState bit to invalidate a CSE over local state operations.
-        // Then a Fence would look block, say, the elimination of a redundant Get. But it like
-        // that's not at all how our optimizations for Set/Get/Upsilon/Phi work - they grok their
-        // operations deeply enough that they have no need to check this bit - so this cheat is
-        // fine.
-        result.writesLocalState = true;
+        result.fence = true;
         break;
     }
     case CCall:
@@ -694,6 +714,7 @@ ValueKey Value::key() const
     case Check:
     case BitwiseCast:
     case Neg:
+    case Depend:
         return ValueKey(kind(), type(), child(0));
     case Add:
     case Sub:
@@ -746,12 +767,16 @@ ValueKey Value::key() const
     }
 }
 
-void Value::performSubstitution()
+bool Value::performSubstitution()
 {
+    bool result = false;
     for (Value*& child : children()) {
-        while (child->opcode() == Identity)
+        while (child->opcode() == Identity) {
+            result = true;
             child = child->child(0);
+        }
     }
+    return result;
 }
 
 bool Value::isFree() const
@@ -801,6 +826,7 @@ Type Value::typeFor(Kind kind, Value* firstChild, Value* secondChild)
     case CheckAdd:
     case CheckSub:
     case CheckMul:
+    case Depend:
         return firstChild->type();
     case FramePointer:
         return pointerType();
index d57c2c4..5dfc284 100644
@@ -68,6 +68,11 @@ public:
     
     Opcode opcode() const { return kind().opcode(); }
     
+    // Note that the kind is meant to be immutable. Do this when you know that this is safe. It's not
+    // usually safe.
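+    // (Illustration, not from this patch: a typical safe use is a late lowering phase swapping in an
+    // opcode that has the same children, type, and effects, so no earlier analysis is invalidated.)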
+    void setKindUnsafely(Kind kind) { m_kind = kind; }
+    void setOpcodeUnsafely(Opcode opcode) { m_kind.setOpcode(opcode); }
+    
     // It's good practice to mirror Kind methods here, so you can say value->isBlah()
     // instead of value->kind().isBlah().
     bool isChill() const { return kind().isChill(); }
@@