Switch FTL GetById/PutById IC's over to using AnyRegCC
author     fpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
           Mon, 11 Nov 2013 07:30:50 +0000 (07:30 +0000)
committer  fpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
           Mon, 11 Nov 2013 07:30:50 +0000 (07:30 +0000)
https://bugs.webkit.org/show_bug.cgi?id=124094

Source/JavaScriptCore:

Reviewed by Sam Weinig.

This closes the loop on inline caches (IC's) in the FTL. The goal is to have IC's
in LLVM-generated code that are just as efficient as (if not more efficient than)
what a custom JIT could do. As in, zero sources of overhead. Not a single extra instruction
or even register allocation pathology. We accomplish this by having two thingies in
LLVM. First is the llvm.experimental.patchpoint intrinsic, which is sort of an
inline machine code snippet that we can fill in with whatever we want and then
modify subsequently. But you have only two choices of how to pass values to a
patchpoint: (1) via the calling convention or (2) via the stackmap. Neither is good
for operands to an IC (like the base pointer for a GetById, for example). (1) is bad
because it results in things being pinned to certain registers a priori; a custom
JIT (like the DFG) will not pin IC operands to any registers a priori but will allow
the register allocator to do whatever it wants. (2) is bad because the operands may
be spilled or may be represented in other crazy ways. You generally want an IC to
have its operands in registers. Also, patchpoints only return values using the
calling convention, which is unfortunate since it pins the return value to a
register a priori. This is where the second thingy comes in: the AnyRegCC. This is
a special calling convention only for use with patchpoints. It means that arguments
passed "by CC" in the patchpoint can be placed in any register, and the register
that gets used is reported as part of the stackmap. It also means that the return
value (if there is one) can be placed in any register, and the stackmap will tell
you which one it was. Thus, patchpoints combined with AnyRegCC mean that you not
only get the kind of self-modifying code that you want for IC's, but you also get
all of the register allocation goodness that a custom JIT would have given you.
Except that you're getting it from LLVM and not a custom JIT. Awesome.
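
To make this concrete, here is what GetById emission now looks like, condensed
from compileGetById (FTLLowerDFGToLLVM.cpp, in the diff below) with comments
added:

    // The IC is a patchpoint: stackmapID names its stackmap record,
    // sizeOfGetById() reserves that many bytes of machine code, the target
    // is null (we patch the code in ourselves later), and there are two
    // "by CC" arguments: the call frame and the base.
    unsigned stackmapID = m_stackmapIDs++;
    LValue call = m_out.call(
        m_out.patchpointInt64Intrinsic(),
        m_out.constInt32(stackmapID), m_out.constInt32(sizeOfGetById()),
        constNull(m_out.ref8), m_out.constInt32(2), m_callFrame, base);
    // Tagging the call AnyRegCC lets LLVM put those two arguments, and the
    // i64 result, in whatever registers it likes; the stackmap record for
    // stackmapID reports which ones it picked.
    setInstructionCallingConvention(call, LLVMAnyRegCallConv);
    setJSValue(call);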

Even though all of the fun stuff is on the LLVM side, this patch was harder than
you'd expect.

First the obvious bits:

- IC patchpoints now use AnyRegCC instead of the C CC. (CC = calling convention.)

- FTL::fixFunctionBasedOnStackMaps() now correctly figures out which registers the
  IC is supposed to use instead of assuming C CC argument registers.
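
  For GetById, that means reading the register assignments straight out of the
  stackmap record (condensed from fixFunctionBasedOnStackMaps in FTLCompile.cpp,
  in the diff below); location 0 is the AnyRegCC return value, and the arguments
  follow:

      GPRReg result = record.locations[0].directGPR();
      GPRReg callFrameRegister = record.locations[1].directGPR();
      GPRReg base = record.locations[2].directGPR();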

And then all of the stuff that broke and that this patch fixes:

- IC sizing based on generating a dummy IC (what FTLInlineCacheSize did) is totally
  bad on x86-64, where various register permutations lead to bizarre prefix bytes
  and eclectic SIB encodings; an IC generated with one register assignment can be
  larger than a dummy generated with another. I changed that to use magic
  constants, for now.

- Slow path calls didn't preserve the CC return register.

- Repatch's scratch register allocation would get totally confused if an operand
  register wasn't one of the DFG-style "temp" registers. And by "totally confused"
  I mean that it would crash.

- We assumed that r10 is callee-saved. It's not: on System V x86-64 the
  callee-saved GPRs are rbx, rbp, and r12-r15, while r10 is caller-saved. That
  one dude's PPT about x86-64 cdecl that I found on the intertubes was not a
  trustworthy source of information, apparently.

- Call repatching didn't know that the FTL does its IC slow calls via specially
  generated thunks. This was particularly fun to fix: basically, now when we relink
  an IC call in the FTL, we use the old call target to find the SlowPathCallKey,
  which tells us everything we need to know to generate (or look up) a new thunk for
  the new function we want to call.
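
  The heart of that fix, condensed from the new repatchCall overload in
  jit/Repatch.cpp (in the diff below) with comments added:

      static void repatchCall(
          RepatchBuffer& repatchBuffer, CodeLocationCall call,
          FunctionPtr newCalleeFunction)
      {
      #if ENABLE(FTL_JIT)
          CodeBlock* codeBlock = repatchBuffer.codeBlock();
          if (codeBlock->jitType() == JITCode::FTLJIT) {
              VM& vm = *codeBlock->vm();
              FTL::Thunks& thunks = *vm.ftlThunks;
              // In FTL code the call instruction points at a thunk, not at
              // the slow path function itself. Map the old thunk back to its
              // SlowPathCallKey (used registers, call target, offset)...
              FTL::SlowPathCallKey key = thunks.keyForSlowPathCallThunk(
                  MacroAssemblerCodePtr::createFromExecutableAddress(
                      MacroAssembler::readCallTarget(call).executableAddress()));
              // ...swap in the new target, keeping registers and offset...
              key = key.withCallTarget(newCalleeFunction.executableAddress());
              // ...and relink to a thunk for the new key, generated on demand.
              newCalleeFunction = FunctionPtr(
                  thunks.getSlowPathCallThunk(vm, key).code().executableAddress());
          }
      #endif // ENABLE(FTL_JIT)
          repatchBuffer.relink(call, newCalleeFunction);
      }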

* assembler/MacroAssemblerCodeRef.h:
(JSC::MacroAssemblerCodePtr::MacroAssemblerCodePtr):
(JSC::MacroAssemblerCodePtr::isEmptyValue):
(JSC::MacroAssemblerCodePtr::isDeletedValue):
(JSC::MacroAssemblerCodePtr::hash):
(JSC::MacroAssemblerCodePtr::emptyValue):
(JSC::MacroAssemblerCodePtr::deletedValue):
(JSC::MacroAssemblerCodePtrHash::hash):
(JSC::MacroAssemblerCodePtrHash::equal):
* assembler/MacroAssemblerX86Common.h:
* assembler/RepatchBuffer.h:
(JSC::RepatchBuffer::RepatchBuffer):
(JSC::RepatchBuffer::codeBlock):
* ftl/FTLAbbreviations.h:
(JSC::FTL::setInstructionCallingConvention):
* ftl/FTLCompile.cpp:
(JSC::FTL::fixFunctionBasedOnStackMaps):
* ftl/FTLInlineCacheSize.cpp:
(JSC::FTL::sizeOfGetById):
(JSC::FTL::sizeOfPutById):
* ftl/FTLJITFinalizer.cpp:
(JSC::FTL::JITFinalizer::finalizeFunction):
* ftl/FTLLocation.cpp:
(JSC::FTL::Location::forStackmaps):
* ftl/FTLLocation.h:
* ftl/FTLLowerDFGToLLVM.cpp:
(JSC::FTL::LowerDFGToLLVM::compileGetById):
(JSC::FTL::LowerDFGToLLVM::compilePutById):
* ftl/FTLOSRExitCompiler.cpp:
(JSC::FTL::compileStub):
* ftl/FTLSlowPathCall.cpp:
* ftl/FTLSlowPathCallKey.h:
(JSC::FTL::SlowPathCallKey::withCallTarget):
* ftl/FTLStackMaps.cpp:
(JSC::FTL::StackMaps::Location::directGPR):
(JSC::FTL::StackMaps::Location::restoreInto):
* ftl/FTLStackMaps.h:
* ftl/FTLThunks.h:
(JSC::FTL::generateIfNecessary):
(JSC::FTL::keyForThunk):
(JSC::FTL::Thunks::keyForSlowPathCallThunk):
* jit/FPRInfo.h:
(JSC::FPRInfo::toIndex):
* jit/GPRInfo.h:
(JSC::GPRInfo::toIndex):
(JSC::GPRInfo::debugName):
* jit/RegisterSet.cpp:
(JSC::RegisterSet::calleeSaveRegisters):
* jit/RegisterSet.h:
(JSC::RegisterSet::filter):
* jit/Repatch.cpp:
(JSC::readCallTarget):
(JSC::repatchCall):
(JSC::repatchByIdSelfAccess):
(JSC::tryCacheGetByID):
(JSC::tryCachePutByID):
(JSC::tryBuildPutByIdList):
(JSC::resetGetByID):
(JSC::resetPutByID):
* jit/ScratchRegisterAllocator.h:
(JSC::ScratchRegisterAllocator::lock):

Source/WTF:

Reviewed by Sam Weinig.

I needed to add another set operation, namely filter(), which is an in-place set
intersection.
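
A minimal sketch of the semantics (the usage example is mine, assuming the
usual BitVector accessors ensureSize()/set()/get()): filter() is the in-place
&= counterpart to merge()'s |= and exclude()'s &~. RegisterSet::filter()
forwards to it.

    WTF::BitVector a;
    a.ensureSize(8);
    a.set(1);
    a.set(3);
    a.set(5);        // a = { 1, 3, 5 }

    WTF::BitVector b;
    b.ensureSize(8);
    b.set(3);
    b.set(4);
    b.set(5);        // b = { 3, 4, 5 }

    a.filter(b);     // in-place intersection: a = { 3, 5 }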

* wtf/BitVector.cpp:
(WTF::BitVector::filterSlow):
* wtf/BitVector.h:
(WTF::BitVector::filter):

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@159039 268f45cc-cd09-0410-ab3c-d52691b4dbfc

26 files changed:
Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/assembler/MacroAssemblerCodeRef.h
Source/JavaScriptCore/assembler/MacroAssemblerX86Common.h
Source/JavaScriptCore/assembler/RepatchBuffer.h
Source/JavaScriptCore/ftl/FTLAbbreviations.h
Source/JavaScriptCore/ftl/FTLCompile.cpp
Source/JavaScriptCore/ftl/FTLInlineCacheSize.cpp
Source/JavaScriptCore/ftl/FTLJITFinalizer.cpp
Source/JavaScriptCore/ftl/FTLLocation.cpp
Source/JavaScriptCore/ftl/FTLLocation.h
Source/JavaScriptCore/ftl/FTLLowerDFGToLLVM.cpp
Source/JavaScriptCore/ftl/FTLOSRExitCompiler.cpp
Source/JavaScriptCore/ftl/FTLSlowPathCall.cpp
Source/JavaScriptCore/ftl/FTLSlowPathCallKey.h
Source/JavaScriptCore/ftl/FTLStackMaps.cpp
Source/JavaScriptCore/ftl/FTLStackMaps.h
Source/JavaScriptCore/ftl/FTLThunks.h
Source/JavaScriptCore/jit/FPRInfo.h
Source/JavaScriptCore/jit/GPRInfo.h
Source/JavaScriptCore/jit/RegisterSet.cpp
Source/JavaScriptCore/jit/RegisterSet.h
Source/JavaScriptCore/jit/Repatch.cpp
Source/JavaScriptCore/jit/ScratchRegisterAllocator.h
Source/WTF/ChangeLog
Source/WTF/wtf/BitVector.cpp
Source/WTF/wtf/BitVector.h

index aa37255..4bf7632 100644 (file)
@@ -1,3 +1,129 @@
+2013-11-09  Filip Pizlo  <fpizlo@apple.com>
+
+        Switch FTL GetById/PutById IC's over to using AnyRegCC
+        https://bugs.webkit.org/show_bug.cgi?id=124094
+
+        Reviewed by Sam Weinig.
+        
+        This closes the loop on inline caches (IC's) in the FTL. The goal is to have IC's
+        in LLVM-generated code that are just as efficient (if not more so) than what a
+        custom JIT could do. As in zero sources of overhead. Not a single extra instruction
+        or even register allocation pathology. We accomplish this by having two thingies in
+        LLVM. First is the llvm.experimental.patchpoint intrinsic, which is sort of an
+        inline machine code snippet that we can fill in with whatever we want and then
+        modify subsequently. But you have only two choices of how to pass values to a
+        patchpoint: (1) via the calling convention or (2) via the stackmap. Neither are good
+        for operands to an IC (like the base pointer for a GetById, for example). (1) is bad
+        because it results in things being pinned to certain registers a priori; a custom
+        JIT (like the DFG) will not pin IC operands to any registers a priori but will allow
+        the register allocator to do whatever it wants. (2) is bad because the operands may
+        be spilled or may be represented in other crazy ways. You generally want an IC to
+        have its operands in registers. Also, patchpoints only return values using the
+        calling convention, which is unfortunate since it pins the return value to a
+        register a priori. This is where the second thingy comes in: the AnyRegCC. This is
+        a special calling convention only for use with patchpoints. It means that arguments
+        passed "by CC" in the patchpoint can be placed in any register, and the register
+        that gets used is reported as part of the stackmap. It also means that the return
+        value (if there is one) can be placed in any register, and the stackmap will tell
+        you which one it was. Thus, patchpoints combined with AnyRegCC mean that you not
+        only get the kind of self-modifying code that you want for IC's, but you also get
+        all of the register allocation goodness that a custom JIT would have given you.
+        Except that you're getting it from LLVM and not a custom JIT. Awesome.
+        
+        Even though all of the fun stuff is on the LLVM side, this patch was harder than
+        you'd expect.
+        
+        First the obvious bits:
+        
+        - IC patchpoints now use AnyRegCC instead of the C CC. (CC = calling convention.)
+        
+        - FTL::fixFunctionBasedOnStackMaps() now correctly figures out which registers the
+          IC is supposed to use instead of assuming C CC argument registers.
+        
+        And then all of the stuff that broke and that this patch fixes:
+        
+        - IC sizing based on generating a dummy IC (what FTLInlineCacheSize did) is totally
+          bad on x86-64, where various register permutations lead to bizarre header bytes
+          and eclectic SIB encodings. I changed that to have magic constants, for now.
+        
+        - Slow path calls didn't preserve the CC return register.
+        
+        - Repatch's scratch register allocation would get totally confused if the operand
+          registers weren't one of the DFG-style "temp" registers. And by "totally confused"
+          I mean that it would crash.
+        
+        - We assumed that r10 is callee-saved. It's not. That one dude's PPT about x86-64
+          cdecl that I found on the intertubes was not a trustworthy source of information,
+          apparently.
+        
+        - Call repatching didn't know that the FTL does its IC slow calls via specially
+          generated thunks. This was particularly fun to fix: basically, now when we relink
+          an IC call in the FTL, we use the old call target to find the SlowPathCallKey,
+          which tells us everything we need to know to generate (or look up) a new thunk for
+          the new function we want to call.
+        
+        * assembler/MacroAssemblerCodeRef.h:
+        (JSC::MacroAssemblerCodePtr::MacroAssemblerCodePtr):
+        (JSC::MacroAssemblerCodePtr::isEmptyValue):
+        (JSC::MacroAssemblerCodePtr::isDeletedValue):
+        (JSC::MacroAssemblerCodePtr::hash):
+        (JSC::MacroAssemblerCodePtr::emptyValue):
+        (JSC::MacroAssemblerCodePtr::deletedValue):
+        (JSC::MacroAssemblerCodePtrHash::hash):
+        (JSC::MacroAssemblerCodePtrHash::equal):
+        * assembler/MacroAssemblerX86Common.h:
+        * assembler/RepatchBuffer.h:
+        (JSC::RepatchBuffer::RepatchBuffer):
+        (JSC::RepatchBuffer::codeBlock):
+        * ftl/FTLAbbreviations.h:
+        (JSC::FTL::setInstructionCallingConvention):
+        * ftl/FTLCompile.cpp:
+        (JSC::FTL::fixFunctionBasedOnStackMaps):
+        * ftl/FTLInlineCacheSize.cpp:
+        (JSC::FTL::sizeOfGetById):
+        (JSC::FTL::sizeOfPutById):
+        * ftl/FTLJITFinalizer.cpp:
+        (JSC::FTL::JITFinalizer::finalizeFunction):
+        * ftl/FTLLocation.cpp:
+        (JSC::FTL::Location::forStackmaps):
+        * ftl/FTLLocation.h:
+        * ftl/FTLLowerDFGToLLVM.cpp:
+        (JSC::FTL::LowerDFGToLLVM::compileGetById):
+        (JSC::FTL::LowerDFGToLLVM::compilePutById):
+        * ftl/FTLOSRExitCompiler.cpp:
+        (JSC::FTL::compileStub):
+        * ftl/FTLSlowPathCall.cpp:
+        * ftl/FTLSlowPathCallKey.h:
+        (JSC::FTL::SlowPathCallKey::withCallTarget):
+        * ftl/FTLStackMaps.cpp:
+        (JSC::FTL::StackMaps::Location::directGPR):
+        (JSC::FTL::StackMaps::Location::restoreInto):
+        * ftl/FTLStackMaps.h:
+        * ftl/FTLThunks.h:
+        (JSC::FTL::generateIfNecessary):
+        (JSC::FTL::keyForThunk):
+        (JSC::FTL::Thunks::keyForSlowPathCallThunk):
+        * jit/FPRInfo.h:
+        (JSC::FPRInfo::toIndex):
+        * jit/GPRInfo.h:
+        (JSC::GPRInfo::toIndex):
+        (JSC::GPRInfo::debugName):
+        * jit/RegisterSet.cpp:
+        (JSC::RegisterSet::calleeSaveRegisters):
+        * jit/RegisterSet.h:
+        (JSC::RegisterSet::filter):
+        * jit/Repatch.cpp:
+        (JSC::readCallTarget):
+        (JSC::repatchCall):
+        (JSC::repatchByIdSelfAccess):
+        (JSC::tryCacheGetByID):
+        (JSC::tryCachePutByID):
+        (JSC::tryBuildPutByIdList):
+        (JSC::resetGetByID):
+        (JSC::resetPutByID):
+        * jit/ScratchRegisterAllocator.h:
+        (JSC::ScratchRegisterAllocator::lock):
+
 2013-11-10  Oliver Hunt  <oliver@apple.com>
 
         Implement Set iterators
index 5e0c580..758742d 100644 (file)
@@ -334,11 +334,41 @@ public:
     {
         dumpWithName("CodePtr", out);
     }
+    
+    enum EmptyValueTag { EmptyValue };
+    enum DeletedValueTag { DeletedValue };
+    
+    MacroAssemblerCodePtr(EmptyValueTag)
+        : m_value(emptyValue())
+    {
+    }
+    
+    MacroAssemblerCodePtr(DeletedValueTag)
+        : m_value(deletedValue())
+    {
+    }
+    
+    bool isEmptyValue() const { return m_value == emptyValue(); }
+    bool isDeletedValue() const { return m_value == deletedValue(); }
+    
+    unsigned hash() const { return PtrHash<void*>::hash(m_value); }
 
 private:
+    static void* emptyValue() { return bitwise_cast<void*>(static_cast<intptr_t>(1)); }
+    static void* deletedValue() { return bitwise_cast<void*>(static_cast<intptr_t>(2)); }
+    
     void* m_value;
 };
 
+struct MacroAssemblerCodePtrHash {
+    static unsigned hash(const MacroAssemblerCodePtr& ptr) { return ptr.hash(); }
+    static bool equal(const MacroAssemblerCodePtr& a, const MacroAssemblerCodePtr& b)
+    {
+        return a == b;
+    }
+    static const bool safeToCompareToEmptyOrDeleted = true;
+};
+
 // MacroAssemblerCodeRef:
 //
 // A reference to a section of JIT generated code.  A CodeRef consists of a
@@ -420,4 +450,16 @@ private:
 
 } // namespace JSC
 
+namespace WTF {
+
+template<typename T> struct DefaultHash;
+template<> struct DefaultHash<JSC::MacroAssemblerCodePtr> {
+    typedef JSC::MacroAssemblerCodePtrHash Hash;
+};
+
+template<typename T> struct HashTraits;
+template<> struct HashTraits<JSC::MacroAssemblerCodePtr> : public CustomHashTraits<JSC::MacroAssemblerCodePtr> { };
+
+} // namespace WTF
+
 #endif // MacroAssemblerCodeRef_h
index 37e28ff..23d0ffb 100644 (file)
 namespace JSC {
 
 class MacroAssemblerX86Common : public AbstractMacroAssembler<X86Assembler> {
-protected:
+public:
 #if CPU(X86_64)
     static const X86Registers::RegisterID scratchRegister = X86Registers::r11;
 #endif
 
+protected:
     static const int DoubleConditionBitInvert = 0x10;
     static const int DoubleConditionBitSpecial = 0x20;
     static const int DoubleConditionBits = DoubleConditionBitInvert | DoubleConditionBitSpecial;
index 5fa84c2..41e950a 100644 (file)
@@ -45,6 +45,7 @@ class RepatchBuffer {
 
 public:
     RepatchBuffer(CodeBlock* codeBlock)
+        : m_codeBlock(codeBlock)
     {
 #if ENABLE(ASSEMBLER_WX_EXCLUSIVE)
         RefPtr<JITCode> code = codeBlock->jitCode();
@@ -52,8 +53,6 @@ public:
         m_size = code->size();
 
         ExecutableAllocator::makeWritable(m_start, m_size);
-#else
-        UNUSED_PARAM(codeBlock);
 #endif
     }
 
@@ -63,6 +62,8 @@ public:
         ExecutableAllocator::makeExecutable(m_start, m_size);
 #endif
     }
+    
+    CodeBlock* codeBlock() const { return m_codeBlock; }
 
     void relink(CodeLocationJump jump, CodeLocationLabel destination)
     {
@@ -176,6 +177,7 @@ public:
     }
 
 private:
+    CodeBlock* m_codeBlock;
 #if ENABLE(ASSEMBLER_WX_EXCLUSIVE)
     void* m_start;
     size_t m_size;
index 2af8aca..2dac00a 100644 (file)
@@ -269,6 +269,7 @@ static inline LValue buildCall(LBuilder builder, LValue function, LValue arg1, L
     LValue args[] = { arg1, arg2, arg3, arg4, arg5, arg6, arg7, arg8 };
     return buildCall(builder, function, args, 8);
 }
+static inline void setInstructionCallingConvention(LValue instruction, LCallConv callingConvention) { llvm->SetInstructionCallConv(instruction, callingConvention); }
 static inline LValue buildExtractValue(LBuilder builder, LValue aggVal, unsigned index) { return llvm->BuildExtractValue(builder, aggVal, index, ""); }
 static inline LValue buildSelect(LBuilder builder, LValue condition, LValue taken, LValue notTaken) { return llvm->BuildSelect(builder, condition, taken, notTaken, ""); }
 static inline LValue buildBr(LBuilder builder, LBasicBlock destination) { return llvm->BuildBr(builder, destination); }
index 2de7057..1fb9696 100644 (file)
@@ -188,14 +188,12 @@ static void fixFunctionBasedOnStackMaps(
             
             StackMaps::Record& record = iter->value;
             
-            UNUSED_PARAM(record); // FIXME: use AnyRegs.
-
             // FIXME: LLVM should tell us which registers are live.
             RegisterSet usedRegisters = RegisterSet::allRegisters();
             
-            GPRReg callFrameRegister = GPRInfo::argumentGPR0;
-            GPRReg base = GPRInfo::argumentGPR1;
-            GPRReg result = GPRInfo::returnValueGPR;
+            GPRReg result = record.locations[0].directGPR();
+            GPRReg callFrameRegister = record.locations[1].directGPR();
+            GPRReg base = record.locations[2].directGPR();
             
             JITGetByIdGenerator gen(
                 codeBlock, getById.codeOrigin(), usedRegisters, callFrameRegister,
@@ -224,19 +222,17 @@ static void fixFunctionBasedOnStackMaps(
             
             StackMaps::Record& record = iter->value;
             
-            UNUSED_PARAM(record); // FIXME: use AnyRegs.
-
             // FIXME: LLVM should tell us which registers are live.
             RegisterSet usedRegisters = RegisterSet::allRegisters();
             
-            GPRReg callFrameRegister = GPRInfo::argumentGPR0;
-            GPRReg base = GPRInfo::argumentGPR1;
-            GPRReg value = GPRInfo::argumentGPR2;
+            GPRReg callFrameRegister = record.locations[0].directGPR();
+            GPRReg base = record.locations[1].directGPR();
+            GPRReg value = record.locations[2].directGPR();
             
             JITPutByIdGenerator gen(
                 codeBlock, putById.codeOrigin(), usedRegisters, callFrameRegister,
-                JSValueRegs(base), JSValueRegs(value), GPRInfo::argumentGPR3, false,
-                putById.ecmaMode(), putById.putKind());
+                JSValueRegs(base), JSValueRegs(value), MacroAssembler::scratchRegister,
+                false, putById.ecmaMode(), putById.putKind());
             
             MacroAssembler::Label begin = slowPathJIT.label();
             
index 8f2984a..f023da7 100644 (file)
 
 namespace JSC { namespace FTL {
 
-static size_t s_sizeOfGetById;
-static size_t s_sizeOfPutById;
+// These sizes are x86-64-specific, and were found empirically. They have to cover the worst
+// possible combination of registers leading to the largest possible encoding of each
+// instruction in the IC.
 
 size_t sizeOfGetById()
 {
-    if (s_sizeOfGetById)
-        return s_sizeOfGetById;
-    
-    MacroAssembler jit;
-    
-    JITGetByIdGenerator generator(
-        0, CodeOrigin(), RegisterSet(), GPRInfo::callFrameRegister,
-        JSValueRegs(GPRInfo::regT6), JSValueRegs(GPRInfo::regT7), false);
-    generator.generateFastPath(jit);
-    
-    return s_sizeOfGetById = jit.m_assembler.codeSize();
+    return 29;
 }
 
 size_t sizeOfPutById()
 {
-    if (s_sizeOfPutById)
-        return s_sizeOfPutById;
-    
-    MacroAssembler jit;
-    
-    JITPutByIdGenerator generator(
-        0, CodeOrigin(), RegisterSet(), GPRInfo::callFrameRegister,
-        JSValueRegs(GPRInfo::regT6), JSValueRegs(GPRInfo::regT7), GPRInfo::regT8, false,
-        NotStrictMode, NotDirect);
-    generator.generateFastPath(jit);
-    
-    return s_sizeOfPutById = jit.m_assembler.codeSize();
+    return 32;
 }
 
 } } // namespace JSC::FTL
index 178fd27..a8a99b6 100644 (file)
@@ -75,7 +75,7 @@ bool JITFinalizer::finalizeFunction()
                 CodeLocationLabel(
                     m_plan.vm.ftlThunks->getOSRExitGenerationThunk(
                         m_plan.vm, Location::forStackmaps(
-                            jitCode->stackmaps, iter->value.locations[0])).code()));
+                            &jitCode->stackmaps, iter->value.locations[0])).code()));
         }
         
         jitCode->initializeExitThunks(
index 0ecdf7b..3f7ab7d 100644 (file)
@@ -35,7 +35,7 @@
 
 namespace JSC { namespace FTL {
 
-Location Location::forStackmaps(const StackMaps& stackmaps, const StackMaps::Location& location)
+Location Location::forStackmaps(const StackMaps* stackmaps, const StackMaps::Location& location)
 {
     switch (location.kind) {
     case StackMaps::Location::Unprocessed:
@@ -53,7 +53,8 @@ Location Location::forStackmaps(const StackMaps& stackmaps, const StackMaps::Loc
         return forConstant(location.offset);
         
     case StackMaps::Location::ConstantIndex:
-        return forConstant(stackmaps.constants[location.offset].integer);
+        ASSERT(stackmaps);
+        return forConstant(stackmaps->constants[location.offset].integer);
     }
     
     RELEASE_ASSERT_NOT_REACHED();
index 05ddf5b..90f5dde 100644 (file)
@@ -84,7 +84,9 @@ public:
         return result;
     }
 
-    static Location forStackmaps(const StackMaps&, const StackMaps::Location&);
+    // You can pass a null StackMaps if you are confident that the location doesn't
+    // involve a wide constant.
+    static Location forStackmaps(const StackMaps*, const StackMaps::Location&);
     
     Kind kind() const { return m_kind; }
     
index d058cff..6222964 100644 (file)
@@ -1275,10 +1275,12 @@ private:
 
         // Arguments: id, bytes, target, numArgs, args...
         unsigned stackmapID = m_stackmapIDs++;
-        setJSValue(m_out.call(
+        LValue call = m_out.call(
             m_out.patchpointInt64Intrinsic(),
             m_out.constInt32(stackmapID), m_out.constInt32(sizeOfGetById()),
-            constNull(m_out.ref8), m_out.constInt32(2), m_callFrame, base));
+            constNull(m_out.ref8), m_out.constInt32(2), m_callFrame, base);
+        setInstructionCallingConvention(call, LLVMAnyRegCallConv);
+        setJSValue(call);
         
         m_ftlState.getByIds.append(GetByIdDescriptor(stackmapID, m_node->codeOrigin, uid));
     }
@@ -1294,10 +1296,11 @@ private:
 
         // Arguments: id, bytes, target, numArgs, args...
         unsigned stackmapID = m_stackmapIDs++;
-        m_out.call(
+        LValue call = m_out.call(
             m_out.patchpointVoidIntrinsic(),
             m_out.constInt32(stackmapID), m_out.constInt32(sizeOfPutById()),
             constNull(m_out.ref8), m_out.constInt32(3), m_callFrame, base, value);
+        setInstructionCallingConvention(call, LLVMAnyRegCallConv);
         
         m_ftlState.putByIds.append(PutByIdDescriptor(
             stackmapID, m_node->codeOrigin, uid,
index f0f3a95..1813993 100644 (file)
@@ -176,8 +176,8 @@ static void compileStub(
     exit.m_code = FINALIZE_CODE_IF(
         shouldShowDisassembly(),
         patchBuffer,
-        ("FTL OSR exit #%u (bc#%u, %s) from %s, with operands = %s, and record = %s",
-            exitID, exit.m_codeOrigin.bytecodeIndex,
+        ("FTL OSR exit #%u (%s, %s) from %s, with operands = %s, and record = %s",
+            exitID, toCString(exit.m_codeOrigin).data(),
             exitKindToString(exit.m_kind), toCString(*codeBlock).data(),
             toCString(ignoringContext<DumpContext>(exit.m_values)).data(),
             toCString(*record).data()));
index 7f90443..3cf6ae0 100644 (file)
@@ -64,20 +64,23 @@ public:
         m_offsetToSavingArea =
             (std::max(m_numArgs, NUMBER_OF_ARGUMENT_REGISTERS) - NUMBER_OF_ARGUMENT_REGISTERS) * wordSize;
         
-        unsigned numArgumentRegistersThatNeedSaving = 0;
-        for (unsigned i = std::min(NUMBER_OF_ARGUMENT_REGISTERS, numArgs); i--;) {
-            if (m_usedRegisters.get(GPRInfo::toArgumentRegister(i)))
-                numArgumentRegistersThatNeedSaving++;
-        }
+        for (unsigned i = std::min(NUMBER_OF_ARGUMENT_REGISTERS, numArgs); i--;)
+            m_callingConventionRegisters.set(GPRInfo::toArgumentRegister(i));
+        if (returnRegister != InvalidGPRReg)
+            m_callingConventionRegisters.set(GPRInfo::returnValueGPR);
+        m_callingConventionRegisters.filter(m_usedRegisters);
+        
+        unsigned numberOfCallingConventionRegisters =
+            m_callingConventionRegisters.numberOfSetRegisters();
         
         size_t offsetToThunkSavingArea =
             m_offsetToSavingArea +
-            numArgumentRegistersThatNeedSaving * wordSize;
+            numberOfCallingConventionRegisters * wordSize;
         
         m_stackBytesNeeded =
             offsetToThunkSavingArea +
             stackBytesNeededForReturnAddress +
-            (m_usedRegisters.numberOfSetRegisters() - numArgumentRegistersThatNeedSaving) * wordSize;
+            (m_usedRegisters.numberOfSetRegisters() - numberOfCallingConventionRegisters) * wordSize;
         
         size_t stackAlignment = 16;
         
@@ -87,11 +90,13 @@ public:
         
         m_thunkSaveSet = m_usedRegisters;
         
-        for (unsigned i = std::min(NUMBER_OF_ARGUMENT_REGISTERS, numArgs); i--;) {
-            if (!m_usedRegisters.get(GPRInfo::toArgumentRegister(i)))
+        // This relies on all calling convention registers also being temp registers.
+        unsigned stackIndex = 0;
+        for (unsigned i = GPRInfo::numberOfRegisters; i--;) {
+            GPRReg reg = GPRInfo::toRegister(i);
+            if (!m_callingConventionRegisters.get(reg))
                 continue;
-            GPRReg reg = GPRInfo::toArgumentRegister(i);
-            m_jit.storePtr(reg, CCallHelpers::Address(CCallHelpers::stackPointerRegister, m_offsetToSavingArea + i * wordSize));
+            m_jit.storePtr(reg, CCallHelpers::Address(CCallHelpers::stackPointerRegister, m_offsetToSavingArea + (stackIndex++) * wordSize));
             m_thunkSaveSet.clear(reg);
         }
         
@@ -103,11 +108,12 @@ public:
         if (m_returnRegister != InvalidGPRReg)
             m_jit.move(GPRInfo::returnValueGPR, m_returnRegister);
         
-        for (unsigned i = std::min(NUMBER_OF_ARGUMENT_REGISTERS, m_numArgs); i--;) {
-            if (!m_usedRegisters.get(GPRInfo::toArgumentRegister(i)))
+        unsigned stackIndex = 0;
+        for (unsigned i = GPRInfo::numberOfRegisters; i--;) {
+            GPRReg reg = GPRInfo::toRegister(i);
+            if (!m_callingConventionRegisters.get(reg))
                 continue;
-            GPRReg reg = GPRInfo::toArgumentRegister(i);
-            m_jit.loadPtr(CCallHelpers::Address(CCallHelpers::stackPointerRegister, m_offsetToSavingArea + i * wordSize), reg);
+            m_jit.loadPtr(CCallHelpers::Address(CCallHelpers::stackPointerRegister, m_offsetToSavingArea + (stackIndex++) * wordSize), reg);
         }
         
         m_jit.addPtr(CCallHelpers::TrustedImm32(m_stackBytesNeeded), CCallHelpers::stackPointerRegister);
@@ -139,6 +145,7 @@ public:
 private:
     State& m_state;
     RegisterSet m_usedRegisters;
+    RegisterSet m_callingConventionRegisters;
     CCallHelpers& m_jit;
     unsigned m_numArgs;
     GPRReg m_returnRegister;
index 4a3569e..0c7c329 100644 (file)
@@ -61,6 +61,11 @@ public:
     void* callTarget() const { return m_callTarget; }
     ptrdiff_t offset() const { return m_offset; }
     
+    SlowPathCallKey withCallTarget(void* callTarget)
+    {
+        return SlowPathCallKey(usedRegisters(), callTarget, offset());
+    }
+    
     void dump(PrintStream&) const;
     
     enum EmptyValueTag { EmptyValue };
index f688cb1..5ef364c 100644 (file)
@@ -66,15 +66,15 @@ void StackMaps::Location::dump(PrintStream& out) const
     out.print("(", kind, ", reg", dwarfRegNum, ", ", offset, ")");
 }
 
-GPRReg StackMaps::Location::directGPR(StackMaps& stackmaps) const
+GPRReg StackMaps::Location::directGPR() const
 {
-    return FTL::Location::forStackmaps(stackmaps, *this).directGPR();
+    return FTL::Location::forStackmaps(nullptr, *this).directGPR();
 }
 
 void StackMaps::Location::restoreInto(
     MacroAssembler& jit, StackMaps& stackmaps, char* savedRegisters, GPRReg result) const
 {
-    FTL::Location::forStackmaps(stackmaps, *this).restoreInto(jit, savedRegisters, result);
+    FTL::Location::forStackmaps(&stackmaps, *this).restoreInto(jit, savedRegisters, result);
 }
 
 bool StackMaps::Record::parse(DataView* view, unsigned& offset)
index 13d1966..0d9e94a 100644 (file)
@@ -65,7 +65,7 @@ struct StackMaps {
         void parse(DataView*, unsigned& offset);
         void dump(PrintStream& out) const;
         
-        GPRReg directGPR(StackMaps&) const;
+        GPRReg directGPR() const;
         void restoreInto(MacroAssembler&, StackMaps&, char* savedRegisters, GPRReg result) const;
     };
     
index ffa0245..bbcdbdd 100644 (file)
@@ -44,19 +44,38 @@ namespace FTL {
 MacroAssemblerCodeRef osrExitGenerationThunkGenerator(VM&, const Location&);
 MacroAssemblerCodeRef slowPathCallThunkGenerator(VM&, const SlowPathCallKey&);
 
-template<typename MapType, typename KeyType, typename GeneratorType>
+template<typename KeyTypeArgument>
+struct ThunkMap {
+    typedef KeyTypeArgument KeyType;
+    typedef HashMap<KeyType, MacroAssemblerCodeRef> ToThunkMap;
+    typedef HashMap<MacroAssemblerCodePtr, KeyType> FromThunkMap;
+    
+    ToThunkMap m_toThunk;
+    FromThunkMap m_fromThunk;
+};
+
+template<typename MapType, typename GeneratorType>
 MacroAssemblerCodeRef generateIfNecessary(
-    VM& vm, MapType& map, const KeyType& key, GeneratorType generator)
+    VM& vm, MapType& map, const typename MapType::KeyType& key, GeneratorType generator)
 {
-    typename MapType::iterator iter = map.find(key);
-    if (iter != map.end())
+    typename MapType::ToThunkMap::iterator iter = map.m_toThunk.find(key);
+    if (iter != map.m_toThunk.end())
         return iter->value;
     
     MacroAssemblerCodeRef result = generator(vm, key);
-    map.add(key, result);
+    map.m_toThunk.add(key, result);
+    map.m_fromThunk.add(result.code(), key);
     return result;
 }
 
+template<typename MapType>
+typename MapType::KeyType keyForThunk(MapType& map, MacroAssemblerCodePtr ptr)
+{
+    typename MapType::FromThunkMap::iterator iter = map.m_fromThunk.find(ptr);
+    RELEASE_ASSERT(iter != map.m_fromThunk.end());
+    return iter->value;
+}
+
 class Thunks {
 public:
     MacroAssemblerCodeRef getOSRExitGenerationThunk(VM& vm, const Location& location)
@@ -71,9 +90,14 @@ public:
             vm, m_slowPathCallThunks, key, slowPathCallThunkGenerator);
     }
     
+    SlowPathCallKey keyForSlowPathCallThunk(MacroAssemblerCodePtr ptr)
+    {
+        return keyForThunk(m_slowPathCallThunks, ptr);
+    }
+    
 private:
-    HashMap<Location, MacroAssemblerCodeRef> m_osrExitThunks;
-    HashMap<SlowPathCallKey, MacroAssemblerCodeRef> m_slowPathCallThunks;
+    ThunkMap<Location> m_osrExitThunks;
+    ThunkMap<SlowPathCallKey> m_slowPathCallThunks;
 };
 
 } } // namespace JSC::FTL
index 4e3b97d..40f3757 100644 (file)
@@ -73,7 +73,10 @@ public:
     }
     static unsigned toIndex(FPRReg reg)
     {
-        return (unsigned)reg;
+        unsigned result = (unsigned)reg;
+        if (result >= numberOfRegisters)
+            return InvalidIndex;
+        return result;
     }
     
     static FPRReg toArgumentRegister(unsigned index)
@@ -101,6 +104,8 @@ public:
 #endif
         return nameForRegister[reg];
     }
+    
+    static const unsigned InvalidIndex = 0xffffffff;
 };
 
 #endif
index ddcb520..0869db4 100644 (file)
@@ -339,7 +339,6 @@ public:
         };
         return nameForRegister[reg];
     }
-private:
 
     static const unsigned InvalidIndex = 0xffffffff;
 };
@@ -421,9 +420,7 @@ public:
         ASSERT(reg != InvalidGPRReg);
         ASSERT(static_cast<int>(reg) < 16);
         static const unsigned indexForRegister[16] = { 0, 2, 1, 3, InvalidIndex, InvalidIndex, 5, 4, 6, 7, 8, InvalidIndex, InvalidIndex, 9, InvalidIndex, InvalidIndex };
-        unsigned result = indexForRegister[reg];
-        ASSERT(result != InvalidIndex);
-        return result;
+        return indexForRegister[reg];
     }
 
     static const char* debugName(GPRReg reg)
@@ -438,7 +435,6 @@ public:
         };
         return nameForRegister[reg];
     }
-private:
 
     static const unsigned InvalidIndex = 0xffffffff;
 };
@@ -511,7 +507,6 @@ public:
         };
         return nameForRegister[reg];
     }
-private:
 
     static const unsigned InvalidIndex = 0xffffffff;
 };
@@ -610,7 +605,6 @@ public:
         };
         return nameForRegister[reg];
     }
-private:
 
     static const unsigned InvalidIndex = 0xffffffff;
 };
index 6fc2fba..362ada0 100644 (file)
@@ -59,7 +59,6 @@ RegisterSet RegisterSet::calleeSaveRegisters()
 #if CPU(X86_64)
     result.set(X86Registers::ebx);
     result.set(X86Registers::ebp);
-    result.set(X86Registers::r10);
     result.set(X86Registers::r12);
     result.set(X86Registers::r13);
     result.set(X86Registers::r14);
index a1304a9..84ad226 100644 (file)
@@ -81,6 +81,7 @@ public:
     bool get(FPRReg reg) const { return m_vector.get(MacroAssembler::registerIndex(reg)); }
     
     void merge(const RegisterSet& other) { m_vector.merge(other.m_vector); }
+    void filter(const RegisterSet& other) { m_vector.filter(other.m_vector); }
     void exclude(const RegisterSet& other) { m_vector.exclude(other.m_vector); }
     
     size_t numberOfSetRegisters() const { return m_vector.bitCount(); }
index 5e7f95f..3ab74a8 100644 (file)
@@ -30,6 +30,7 @@
 
 #include "CCallHelpers.h"
 #include "CallFrameInlines.h"
+#include "FTLThunks.h"
 #include "GCAwareJITStubRoutine.h"
 #include "LinkBuffer.h"
 #include "Operations.h"
@@ -52,10 +53,44 @@ namespace JSC {
 // give the FTL closure call patching support until we switch to the C stack - but when we do that,
 // callFrameRegister will disappear.
 
+static FunctionPtr readCallTarget(RepatchBuffer& repatchBuffer, CodeLocationCall call)
+{
+    FunctionPtr result = MacroAssembler::readCallTarget(call);
+#if ENABLE(FTL_JIT)
+    CodeBlock* codeBlock = repatchBuffer.codeBlock();
+    if (codeBlock->jitType() == JITCode::FTLJIT) {
+        return FunctionPtr(codeBlock->vm()->ftlThunks->keyForSlowPathCallThunk(
+            MacroAssemblerCodePtr::createFromExecutableAddress(
+                result.executableAddress())).callTarget());
+    }
+#else
+    UNUSED_PARAM(repatchBuffer);
+#endif // ENABLE(FTL_JIT)
+    return result;
+}
+
+static void repatchCall(RepatchBuffer& repatchBuffer, CodeLocationCall call, FunctionPtr newCalleeFunction)
+{
+#if ENABLE(FTL_JIT)
+    CodeBlock* codeBlock = repatchBuffer.codeBlock();
+    if (codeBlock->jitType() == JITCode::FTLJIT) {
+        VM& vm = *codeBlock->vm();
+        FTL::Thunks& thunks = *vm.ftlThunks;
+        FTL::SlowPathCallKey key = thunks.keyForSlowPathCallThunk(
+            MacroAssemblerCodePtr::createFromExecutableAddress(
+                MacroAssembler::readCallTarget(call).executableAddress()));
+        key = key.withCallTarget(newCalleeFunction.executableAddress());
+        newCalleeFunction = FunctionPtr(
+            thunks.getSlowPathCallThunk(vm, key).code().executableAddress());
+    }
+#endif // ENABLE(FTL_JIT)
+    repatchBuffer.relink(call, newCalleeFunction);
+}
+
 static void repatchCall(CodeBlock* codeblock, CodeLocationCall call, FunctionPtr newCalleeFunction)
 {
     RepatchBuffer repatchBuffer(codeblock);
-    repatchBuffer.relink(call, newCalleeFunction);
+    repatchCall(repatchBuffer, call, newCalleeFunction);
 }
 
 static void repatchByIdSelfAccess(CodeBlock* codeBlock, StructureStubInfo& stubInfo, Structure* structure, PropertyOffset offset, const FunctionPtr &slowPathFunction, bool compact)
@@ -63,7 +98,7 @@ static void repatchByIdSelfAccess(CodeBlock* codeBlock, StructureStubInfo& stubI
     RepatchBuffer repatchBuffer(codeBlock);
 
     // Only optimize once!
-    repatchBuffer.relink(stubInfo.callReturnLocation, slowPathFunction);
+    repatchCall(repatchBuffer, stubInfo.callReturnLocation, slowPathFunction);
 
     // Patch the structure check & the offset of the load.
     repatchBuffer.repatch(stubInfo.callReturnLocation.dataLabelPtrAtOffset(-(intptr_t)stubInfo.patch.deltaCheckImmToCall), structure);
@@ -314,7 +349,7 @@ static bool tryCacheGetByID(ExecState* exec, JSValue baseValue, const Identifier
         
         RepatchBuffer repatchBuffer(codeBlock);
         replaceWithJump(repatchBuffer, stubInfo, stubInfo.stubRoutine->code().code());
-        repatchBuffer.relink(stubInfo.callReturnLocation, operationGetById);
+        repatchCall(repatchBuffer, stubInfo.callReturnLocation, operationGetById);
         
         return true;
     }
@@ -362,7 +397,7 @@ static bool tryCacheGetByID(ExecState* exec, JSValue baseValue, const Identifier
     
     RepatchBuffer repatchBuffer(codeBlock);
     replaceWithJump(repatchBuffer, stubInfo, stubInfo.stubRoutine->code().code());
-    repatchBuffer.relink(stubInfo.callReturnLocation, operationGetByIdBuildList);
+    repatchCall(repatchBuffer, stubInfo.callReturnLocation, operationGetByIdBuildList);
     
     stubInfo.initGetByIdChain(*vm, codeBlock->ownerExecutable(), structure, prototypeChain, count, true);
     return true;
@@ -1000,7 +1035,7 @@ static bool tryCachePutByID(ExecState* exec, JSValue baseValue, const Identifier
                 stubInfo.callReturnLocation.jumpAtOffset(
                     stubInfo.patch.deltaCallToStructCheck),
                 CodeLocationLabel(stubInfo.stubRoutine->code().code()));
-            repatchBuffer.relink(stubInfo.callReturnLocation, appropriateListBuildingPutByIdFunction(slot, putKind));
+            repatchCall(repatchBuffer, stubInfo.callReturnLocation, appropriateListBuildingPutByIdFunction(slot, putKind));
             
             stubInfo.initPutByIdTransition(*vm, codeBlock->ownerExecutable(), oldStructure, structure, prototypeChain, putKind == Direct);
             
@@ -1105,7 +1140,7 @@ static bool tryBuildPutByIdList(ExecState* exec, JSValue baseValue, const Identi
         repatchBuffer.relink(stubInfo.callReturnLocation.jumpAtOffset(stubInfo.patch.deltaCallToStructCheck), CodeLocationLabel(stubRoutine->code().code()));
         
         if (list->isFull())
-            repatchBuffer.relink(stubInfo.callReturnLocation, appropriateGenericPutByIdFunction(slot, putKind));
+            repatchCall(repatchBuffer, stubInfo.callReturnLocation, appropriateGenericPutByIdFunction(slot, putKind));
         
         return true;
     }
@@ -1380,7 +1415,7 @@ void linkClosureCall(ExecState* exec, CallLinkInfo& callLinkInfo, CodeBlock* cal
 
 void resetGetByID(RepatchBuffer& repatchBuffer, StructureStubInfo& stubInfo)
 {
-    repatchBuffer.relink(stubInfo.callReturnLocation, operationGetByIdOptimize);
+    repatchCall(repatchBuffer, stubInfo.callReturnLocation, operationGetByIdOptimize);
     CodeLocationDataLabelPtr structureLabel = stubInfo.callReturnLocation.dataLabelPtrAtOffset(-(intptr_t)stubInfo.patch.deltaCheckImmToCall);
     if (MacroAssembler::canJumpReplacePatchableBranchPtrWithPatch()) {
         repatchBuffer.revertJumpReplacementToPatchableBranchPtrWithPatch(
@@ -1402,7 +1437,7 @@ void resetGetByID(RepatchBuffer& repatchBuffer, StructureStubInfo& stubInfo)
 
 void resetPutByID(RepatchBuffer& repatchBuffer, StructureStubInfo& stubInfo)
 {
-    V_JITOperation_ESsiJJI unoptimizedFunction = bitwise_cast<V_JITOperation_ESsiJJI>(MacroAssembler::readCallTarget(stubInfo.callReturnLocation).executableAddress());
+    V_JITOperation_ESsiJJI unoptimizedFunction = bitwise_cast<V_JITOperation_ESsiJJI>(readCallTarget(repatchBuffer, stubInfo.callReturnLocation).executableAddress());
     V_JITOperation_ESsiJJI optimizedFunction;
     if (unoptimizedFunction == operationPutByIdStrict || unoptimizedFunction == operationPutByIdStrictBuildList)
         optimizedFunction = operationPutByIdStrictOptimize;
@@ -1414,7 +1449,7 @@ void resetPutByID(RepatchBuffer& repatchBuffer, StructureStubInfo& stubInfo)
         ASSERT(unoptimizedFunction == operationPutByIdDirectNonStrict || unoptimizedFunction == operationPutByIdDirectNonStrictBuildList);
         optimizedFunction = operationPutByIdDirectNonStrictOptimize;
     }
-    repatchBuffer.relink(stubInfo.callReturnLocation, optimizedFunction);
+    repatchCall(repatchBuffer, stubInfo.callReturnLocation, optimizedFunction);
     CodeLocationDataLabelPtr structureLabel = stubInfo.callReturnLocation.dataLabelPtrAtOffset(-(intptr_t)stubInfo.patch.deltaCheckImmToCall);
     if (MacroAssembler::canJumpReplacePatchableBranchPtrWithPatch()) {
         repatchBuffer.revertJumpReplacementToPatchableBranchPtrWithPatch(
index db3fbc6..de81f46 100644 (file)
@@ -44,9 +44,21 @@ public:
         , m_didReuseRegisters(false)
     {
     }
-    
-    template<typename T>
-    void lock(T reg) { m_lockedRegisters.set(reg); }
+
+    void lock(GPRReg reg)
+    {
+        unsigned index = GPRInfo::toIndex(reg);
+        if (index == GPRInfo::InvalidIndex)
+            return;
+        m_lockedRegisters.setGPRByIndex(index);
+    }
+    void lock(FPRReg reg)
+    {
+        unsigned index = FPRInfo::toIndex(reg);
+        if (index == FPRInfo::InvalidIndex)
+            return;
+        m_lockedRegisters.setFPRByIndex(index);
+    }
     
     template<typename BankInfo>
     typename BankInfo::RegisterType allocateScratch()
index ad59f2c..609d090 100644 (file)
@@ -1,3 +1,18 @@
+2013-11-09  Filip Pizlo  <fpizlo@apple.com>
+
+        Switch FTL GetById/PutById IC's over to using AnyRegCC
+        https://bugs.webkit.org/show_bug.cgi?id=124094
+
+        Reviewed by Sam Weinig.
+        
+        I needed to add another set operation, namely filter(), which is an in-place set
+        intersection.
+
+        * wtf/BitVector.cpp:
+        (WTF::BitVector::filterSlow):
+        * wtf/BitVector.h:
+        (WTF::BitVector::filter):
+
 2013-11-10  Ryuan Choi  <ryuan.choi@samsung.com>
 
         [EFL] Build break on Ubuntu 13.10
index 50dbd82..f60856c 100644 (file)
@@ -124,6 +124,31 @@ void BitVector::mergeSlow(const BitVector& other)
         a->bits()[i] |= b->bits()[i];
 }
 
+void BitVector::filterSlow(const BitVector& other)
+{
+    if (other.isInline()) {
+        ASSERT(!isInline());
+        *bits() &= cleanseInlineBits(other.m_bitsOrPointer);
+        return;
+    }
+    
+    if (isInline()) {
+        ASSERT(!other.isInline());
+        m_bitsOrPointer &= *other.outOfLineBits()->bits();
+        m_bitsOrPointer |= (static_cast<uintptr_t>(1) << maxInlineBits());
+        ASSERT(isInline());
+        return;
+    }
+    
+    OutOfLineBits* a = outOfLineBits();
+    const OutOfLineBits* b = other.outOfLineBits();
+    for (unsigned i = std::min(a->numWords(), b->numWords()); i--;)
+        a->bits()[i] &= b->bits()[i];
+    
+    for (unsigned i = b->numWords(); i < a->numWords(); ++i)
+        a->bits()[i] = 0;
+}
+
 void BitVector::excludeSlow(const BitVector& other)
 {
     if (other.isInline()) {
index d6eb9f8..77d95f6 100644 (file)
@@ -181,6 +181,16 @@ public:
         ASSERT(isInline());
     }
     
+    void filter(const BitVector& other)
+    {
+        if (!isInline() || !other.isInline()) {
+            filterSlow(other);
+            return;
+        }
+        m_bitsOrPointer &= other.m_bitsOrPointer;
+        ASSERT(isInline());
+    }
+    
     void exclude(const BitVector& other)
     {
         if (!isInline() || !other.isInline()) {
@@ -302,6 +312,7 @@ private:
     WTF_EXPORT_PRIVATE void setSlow(const BitVector& other);
     
     WTF_EXPORT_PRIVATE void mergeSlow(const BitVector& other);
+    WTF_EXPORT_PRIVATE void filterSlow(const BitVector& other);
     WTF_EXPORT_PRIVATE void excludeSlow(const BitVector& other);
     
     WTF_EXPORT_PRIVATE size_t bitCountSlow() const;