Need a store-load fence between setting cell state and visiting the object in SlotVisitor
authorfpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Sat, 24 Sep 2016 00:47:30 +0000 (00:47 +0000)
committerfpizlo@apple.com <fpizlo@apple.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Sat, 24 Sep 2016 00:47:30 +0000 (00:47 +0000)
https://bugs.webkit.org/show_bug.cgi?id=162354

Reviewed by Mark Lam.

Source/JavaScriptCore:

This was meant to be a small change, but then it became bigger as I found small
opportunities for improving this code. This adds a store-load fence and is performance-
neutral. That's probably partly due to other optimizations that I did to visitChildren().

Initially, I found that adding an mfence as a store-load fence was terribly expensive. So,
I thought that I needed to buffer up a bunch of objects, set their states, do one mfence,
and then visit all of them. This seemed like a win, so I went with it. Unfortunately, this
made no sense for two reasons:

- I shouldn't use mfence. I should use ortop (lock orl $0, (%rsp)) instead. Ortop is
  basically free, and it's what WTF now uses for storeLoadFence().

- My data saying that buffering up objects was not a slow-down was wrong. That was actually
  almost as expensive as the mfence.

But in order to implement that, I made some other improvements that I think we should stick
with:

- SlotVisitor::visitChildren() now uses a switch on type. This replaces what used to be
  some nasty ClassInfo look-ups.

- We no longer save the object's old CellState. We would do that so that we would know what
  state the object had been before we blackened it. But I believe that the more logical
  solution is to have two kinds of black - one for black-for-the-first-time objects and one
  for repeat offenders. This is a lot easier to reason about, since you can now just figure
  this out by looking at the cell directly.

The latter change meant rewiring a bunch of barriers. It didn't make them any more
expensive.

Relanding after fixing a nasty build failure in cloop and elsewhere.

* JavaScriptCore.xcodeproj/project.pbxproj:
* assembler/AbstractMacroAssembler.h:
(JSC::isARMv7IDIVSupported): Deleted.
(JSC::isARM64): Deleted.
(JSC::isX86): Deleted.
(JSC::isX86_64): Deleted.
(JSC::optimizeForARMv7IDIVSupported): Deleted.
(JSC::optimizeForARM64): Deleted.
(JSC::optimizeForX86): Deleted.
(JSC::optimizeForX86_64): Deleted.
* assembler/CPU.h: Added.
(JSC::isARMv7IDIVSupported):
(JSC::isARM64):
(JSC::isX86):
(JSC::isX86_64):
(JSC::optimizeForARMv7IDIVSupported):
(JSC::optimizeForARM64):
(JSC::optimizeForX86):
(JSC::optimizeForX86_64):
* ftl/FTLLowerDFGToB3.cpp:
(JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier):
* heap/CellState.h:
(JSC::isBlack):
(JSC::blacken):
* heap/Heap.cpp:
(JSC::Heap::addToRememberedSet):
* heap/Heap.h:
* heap/HeapInlines.h:
(JSC::Heap::writeBarrier):
(JSC::Heap::reportExtraMemoryVisited):
(JSC::Heap::reportExternalMemoryVisited):
* heap/MarkStack.cpp:
* heap/MarkStack.h:
* heap/SlotVisitor.cpp:
(JSC::SlotVisitor::visitChildren):
* heap/SlotVisitor.h:
* heap/SlotVisitorInlines.h:
(JSC::SlotVisitor::reportExtraMemoryVisited):
(JSC::SlotVisitor::reportExternalMemoryVisited):
* jit/AssemblyHelpers.h:
(JSC::AssemblyHelpers::jumpIfIsRememberedOrInEden):
* llint/LLIntData.cpp:
(JSC::LLInt::Data::performAssertions):
* llint/LowLevelInterpreter.asm:
* llint/LowLevelInterpreter32_64.asm:
* llint/LowLevelInterpreter64.asm:
* runtime/JSObject.h:
(JSC::isJSFinalObject):

Source/WTF:

Fix this on x86-32.

* wtf/Atomics.h:
(WTF::x86_ortop):

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@206344 268f45cc-cd09-0410-ab3c-d52691b4dbfc

22 files changed:
Source/JavaScriptCore/ChangeLog
Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj
Source/JavaScriptCore/assembler/AbstractMacroAssembler.h
Source/JavaScriptCore/assembler/CPU.h [new file with mode: 0644]
Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp
Source/JavaScriptCore/heap/CellState.h
Source/JavaScriptCore/heap/Heap.cpp
Source/JavaScriptCore/heap/Heap.h
Source/JavaScriptCore/heap/HeapInlines.h
Source/JavaScriptCore/heap/MarkStack.cpp
Source/JavaScriptCore/heap/MarkStack.h
Source/JavaScriptCore/heap/SlotVisitor.cpp
Source/JavaScriptCore/heap/SlotVisitor.h
Source/JavaScriptCore/heap/SlotVisitorInlines.h
Source/JavaScriptCore/jit/AssemblyHelpers.h
Source/JavaScriptCore/llint/LLIntData.cpp
Source/JavaScriptCore/llint/LowLevelInterpreter.asm
Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm
Source/JavaScriptCore/llint/LowLevelInterpreter64.asm
Source/JavaScriptCore/runtime/JSObject.h
Source/WTF/ChangeLog
Source/WTF/wtf/Atomics.h

index 3caf218..f6e2ba8 100644 (file)
@@ -1,3 +1,91 @@
+2016-09-23  Filip Pizlo  <fpizlo@apple.com>
+
+        Need a store-load fence between setting cell state and visiting the object in SlotVisitor
+        https://bugs.webkit.org/show_bug.cgi?id=162354
+
+        Reviewed by Mark Lam.
+        
+        This was meant to be a small change, but then it became bigger as I found small
+        opportunities for improving this code. This adds a store-load fence and is performance-
+        neutral. That's probably partly due to other optimizations that I did to visitChildren().
+        
+        Initially, I found that adding an mfence as a store-load fence was terribly expensive. So,
+        I thought that I needed to buffer up a bunch of objects, set their states, do one mfence,
+        and then visit all of them. This seemed like a win, so I went with it. Unfortunately, this
+        made no sense for two reasons:
+        
+        - I shouldn't use mfence. I should use ortop (lock orl $0, (%rsp)) instead. Ortop is
+          basically free, and it's what WTF now uses for storeLoadFence().
+        
+        - My data saying that buffering up objects was not a slow-down was wrong. That was actually
+          almost as expensive as the mfence.
+        
+        But in order to implement that, I made some other improvements that I think we should stick
+        with:
+        
+        - SlotVisitor::visitChildren() now uses a switch on type. This replaces what used to be
+          some nasty ClassInfo look-ups.
+        
+        - We no longer save the object's old CellState. We would do that so that we would know what
+          state the object had been before we blackened it. But I believe that the more logical
+          solution is to have two kinds of black - one for black-for-the-first-time objects and one
+          for repeat offenders. This is a lot easier to reason about, since you can now just figure
+          this out by looking at the cell directly.
+        
+        The latter change meant rewiring a bunch of barriers. It didn't make them any more
+        expensive.
+        
+        Relanding after fixing a nasty build failure in cloop and elsewhere.
+
+        * JavaScriptCore.xcodeproj/project.pbxproj:
+        * assembler/AbstractMacroAssembler.h:
+        (JSC::isARMv7IDIVSupported): Deleted.
+        (JSC::isARM64): Deleted.
+        (JSC::isX86): Deleted.
+        (JSC::isX86_64): Deleted.
+        (JSC::optimizeForARMv7IDIVSupported): Deleted.
+        (JSC::optimizeForARM64): Deleted.
+        (JSC::optimizeForX86): Deleted.
+        (JSC::optimizeForX86_64): Deleted.
+        * assembler/CPU.h: Added.
+        (JSC::isARMv7IDIVSupported):
+        (JSC::isARM64):
+        (JSC::isX86):
+        (JSC::isX86_64):
+        (JSC::optimizeForARMv7IDIVSupported):
+        (JSC::optimizeForARM64):
+        (JSC::optimizeForX86):
+        (JSC::optimizeForX86_64):
+        * ftl/FTLLowerDFGToB3.cpp:
+        (JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier):
+        * heap/CellState.h:
+        (JSC::isBlack):
+        (JSC::blacken):
+        * heap/Heap.cpp:
+        (JSC::Heap::addToRememberedSet):
+        * heap/Heap.h:
+        * heap/HeapInlines.h:
+        (JSC::Heap::writeBarrier):
+        (JSC::Heap::reportExtraMemoryVisited):
+        (JSC::Heap::reportExternalMemoryVisited):
+        * heap/MarkStack.cpp:
+        * heap/MarkStack.h:
+        * heap/SlotVisitor.cpp:
+        (JSC::SlotVisitor::visitChildren):
+        * heap/SlotVisitor.h:
+        * heap/SlotVisitorInlines.h:
+        (JSC::SlotVisitor::reportExtraMemoryVisited):
+        (JSC::SlotVisitor::reportExternalMemoryVisited):
+        * jit/AssemblyHelpers.h:
+        (JSC::AssemblyHelpers::jumpIfIsRememberedOrInEden):
+        * llint/LLIntData.cpp:
+        (JSC::LLInt::Data::performAssertions):
+        * llint/LowLevelInterpreter.asm:
+        * llint/LowLevelInterpreter32_64.asm:
+        * llint/LowLevelInterpreter64.asm:
+        * runtime/JSObject.h:
+        (JSC::isJSFinalObject):
+
 2016-09-23  Caitlin Potter  <caitp@igalia.com>
 
         [JSC] Implement parsing of Async Functions
index 9c6bd7a..f734f91 100644 (file)
                0F300B7818AB051100A6D72E /* DFGNodeOrigin.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F300B7718AB051100A6D72E /* DFGNodeOrigin.h */; };
                0F300B7B18AB1B1400A6D72E /* DFGIntegerCheckCombiningPhase.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F300B7918AB1B1400A6D72E /* DFGIntegerCheckCombiningPhase.cpp */; };
                0F300B7C18AB1B1400A6D72E /* DFGIntegerCheckCombiningPhase.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F300B7A18AB1B1400A6D72E /* DFGIntegerCheckCombiningPhase.h */; };
+               0F30D7C01D95D6320053089D /* CPU.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F30D7BF1D95D62F0053089D /* CPU.h */; settings = {ATTRIBUTES = (Private, ); }; };
                0F32BD101BB34F190093A57F /* HeapHelperPool.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F32BD0E1BB34F190093A57F /* HeapHelperPool.cpp */; };
                0F32BD111BB34F190093A57F /* HeapHelperPool.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F32BD0F1BB34F190093A57F /* HeapHelperPool.h */; };
                0F338DF11BE93AD10013C88F /* B3StackmapValue.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F338DEF1BE93AD10013C88F /* B3StackmapValue.cpp */; };
                0F300B7718AB051100A6D72E /* DFGNodeOrigin.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = DFGNodeOrigin.h; path = dfg/DFGNodeOrigin.h; sourceTree = "<group>"; };
                0F300B7918AB1B1400A6D72E /* DFGIntegerCheckCombiningPhase.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = DFGIntegerCheckCombiningPhase.cpp; path = dfg/DFGIntegerCheckCombiningPhase.cpp; sourceTree = "<group>"; };
                0F300B7A18AB1B1400A6D72E /* DFGIntegerCheckCombiningPhase.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = DFGIntegerCheckCombiningPhase.h; path = dfg/DFGIntegerCheckCombiningPhase.h; sourceTree = "<group>"; };
+               0F30D7BF1D95D62F0053089D /* CPU.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = CPU.h; sourceTree = "<group>"; };
                0F32BD0E1BB34F190093A57F /* HeapHelperPool.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = HeapHelperPool.cpp; sourceTree = "<group>"; };
                0F32BD0F1BB34F190093A57F /* HeapHelperPool.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = HeapHelperPool.h; sourceTree = "<group>"; };
                0F338DEF1BE93AD10013C88F /* B3StackmapValue.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = B3StackmapValue.cpp; path = b3/B3StackmapValue.cpp; sourceTree = "<group>"; };
                                86D3B2C110156BDE002865E7 /* AssemblerBufferWithConstantPool.h */,
                                43C392AA1C3BEB0000241F53 /* AssemblerCommon.h */,
                                86E116B00FE75AC800B512BC /* CodeLocation.h */,
+                               0F30D7BF1D95D62F0053089D /* CPU.h */,
                                0F37308E1C0CD68500052BFA /* DisallowMacroScratchRegisterUsage.h */,
                                0FF4275615914A20004CB9FF /* LinkBuffer.cpp */,
                                86D3B3C110159D7F002865E7 /* LinkBuffer.h */,
                                0F21C27D14BE727A00ADC64B /* CodeSpecializationKind.h in Headers */,
                                0F0B83A714BCF50700885B4F /* CodeType.h in Headers */,
                                A53243981856A489002ED692 /* CombinedDomains.json in Headers */,
+                               0F30D7C01D95D6320053089D /* CPU.h in Headers */,
                                BC18C3F30E16F5CD00B34460 /* CommonIdentifiers.h in Headers */,
                                53F6BF6D1C3F060A00F41E5D /* InternalFunctionAllocationProfile.h in Headers */,
                                0F15F15F14B7A73E005DE37D /* CommonSlowPaths.h in Headers */,
index d05c913..0948dfa 100644 (file)
@@ -27,6 +27,9 @@
 #define AbstractMacroAssembler_h
 
 #include "AbortReason.h"
+#include "AssemblerBuffer.h"
+#include "AssemblerCommon.h"
+#include "CPU.h"
 #include "CodeLocation.h"
 #include "MacroAssemblerCodeRef.h"
 #include "Options.h"
 #include <wtf/SharedTask.h>
 #include <wtf/WeakRandom.h>
 
-#if ENABLE(ASSEMBLER)
-
 namespace JSC {
 
-inline bool isARMv7IDIVSupported()
-{
-#if HAVE(ARM_IDIV_INSTRUCTIONS)
-    return true;
-#else
-    return false;
-#endif
-}
-
-inline bool isARM64()
-{
-#if CPU(ARM64)
-    return true;
-#else
-    return false;
-#endif
-}
-
-inline bool isX86()
-{
-#if CPU(X86_64) || CPU(X86)
-    return true;
-#else
-    return false;
-#endif
-}
-
-inline bool isX86_64()
-{
-#if CPU(X86_64)
-    return true;
-#else
-    return false;
-#endif
-}
-
-inline bool optimizeForARMv7IDIVSupported()
-{
-    return isARMv7IDIVSupported() && Options::useArchitectureSpecificOptimizations();
-}
-
-inline bool optimizeForARM64()
-{
-    return isARM64() && Options::useArchitectureSpecificOptimizations();
-}
-
-inline bool optimizeForX86()
-{
-    return isX86() && Options::useArchitectureSpecificOptimizations();
-}
-
-inline bool optimizeForX86_64()
-{
-    return isX86_64() && Options::useArchitectureSpecificOptimizations();
-}
+#if ENABLE(ASSEMBLER)
 
 class AllowMacroScratchRegisterUsage;
 class DisallowMacroScratchRegisterUsage;
@@ -1165,8 +1112,8 @@ AbstractMacroAssembler<AssemblerType, MacroAssemblerType>::Address::indexedBy(
     return BaseIndex(base, index, scale, offset);
 }
 
-} // namespace JSC
-
 #endif // ENABLE(ASSEMBLER)
 
+} // namespace JSC
+
 #endif // AbstractMacroAssembler_h
diff --git a/Source/JavaScriptCore/assembler/CPU.h b/Source/JavaScriptCore/assembler/CPU.h
new file mode 100644 (file)
index 0000000..2d2b486
--- /dev/null
@@ -0,0 +1,89 @@
+/*
+ * Copyright (C) 2008, 2012-2015 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+#pragma once
+
+#include "Options.h"
+
+namespace JSC {
+
+inline bool isARMv7IDIVSupported()
+{
+#if HAVE(ARM_IDIV_INSTRUCTIONS)
+    return true;
+#else
+    return false;
+#endif
+}
+
+inline bool isARM64()
+{
+#if CPU(ARM64)
+    return true;
+#else
+    return false;
+#endif
+}
+
+inline bool isX86()
+{
+#if CPU(X86_64) || CPU(X86)
+    return true;
+#else
+    return false;
+#endif
+}
+
+inline bool isX86_64()
+{
+#if CPU(X86_64)
+    return true;
+#else
+    return false;
+#endif
+}
+
+inline bool optimizeForARMv7IDIVSupported()
+{
+    return isARMv7IDIVSupported() && Options::useArchitectureSpecificOptimizations();
+}
+
+inline bool optimizeForARM64()
+{
+    return isARM64() && Options::useArchitectureSpecificOptimizations();
+}
+
+inline bool optimizeForX86()
+{
+    return isX86() && Options::useArchitectureSpecificOptimizations();
+}
+
+inline bool optimizeForX86_64()
+{
+    return isX86_64() && Options::useArchitectureSpecificOptimizations();
+}
+
+} // namespace JSC
+
index af5cc39..a085556 100644 (file)
@@ -11435,7 +11435,8 @@ private:
         LBasicBlock continuation = m_out.newBlock();
 
         m_out.branch(
-            m_out.notZero32(loadCellState(base)), usually(continuation), rarely(slowPath));
+            m_out.above(loadCellState(base), m_out.constInt32(blackThreshold)),
+            usually(continuation), rarely(slowPath));
 
         LBasicBlock lastNext = m_out.appendTo(slowPath, continuation);
 
index edc716c..2cb9d4b 100644 (file)
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2015 Apple Inc. All rights reserved.
+ * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
 #ifndef CellState_h
 #define CellState_h
 
+#include <wtf/Assertions.h>
+
 namespace JSC {
 
 enum class CellState : uint8_t {
-    // The object is black as far as this GC is concerned. When not in GC, this just means that it's an
-    // old gen object. Note that we deliberately arrange OldBlack to be zero, so that the store barrier on
-    // a target object "from" is just:
-    //
-    // if (!from->cellState())
-    //     slowPath(from);
-    //
-    // There is a bunch of code in the LLInt and JITs that rely on this being the case. You'd have to
-    // change a lot of code if you ever wanted the store barrier to be anything but a non-zero check on
-    // cellState.
-    OldBlack = 0,
+    // The object is black for the first time during this GC.
+    NewBlack = 0,
+    
+    // The object is black for the Nth time during this full GC cycle (N > 1). An object may get to
+    // this state if it transitions from black back to grey during a concurrent GC, or because it
+    // wound up in the remembered set because of a generational barrier.
+    OldBlack = 1,
     
     // The object is in eden. During GC, this means that the object has not been marked yet.
-    NewWhite = 1,
+    NewWhite = 2,
+
+    // The object is grey - i.e. it will be scanned - and this is the first time in this GC that we are
+    // going to scan it. If this is an eden GC, this also means that the object is in eden.
+    NewGrey = 3,
 
     // The object is grey - i.e. it will be scanned - but it either belongs to old gen (if this is eden
     // GC) or it is grey a second time in this current GC (because a concurrent store barrier requested
     // re-greying).
-    OldGrey = 2,
-
-    // The object is grey - i.e. it will be scanned - and this is the first time in this GC that we are
-    // going to scan it. If this is an eden GC, this also means that the object is in eden.
-    NewGrey = 3
+    OldGrey = 4
 };
 
+static const unsigned blackThreshold = 1; // x <= blackThreshold means x is black.
+
+inline bool isBlack(CellState cellState)
+{
+    return static_cast<unsigned>(cellState) <= blackThreshold;
+}
+
+inline CellState blacken(CellState cellState)
+{
+    if (cellState == CellState::NewGrey)
+        return CellState::NewBlack;
+    ASSERT(cellState == CellState::NewBlack || cellState == CellState::OldBlack || cellState == CellState::OldGrey);
+    return CellState::OldBlack;
+}
+
 } // namespace JSC
 
 #endif // CellState_h
index 667fdf8..81da6bd 100644 (file)
@@ -28,6 +28,7 @@
 #include "FullGCActivityCallback.h"
 #include "GCActivityCallback.h"
 #include "GCIncomingRefCountedSetInlines.h"
+#include "GCSegmentedArrayInlines.h"
 #include "GCTypeMap.h"
 #include "HasOwnPropertyCache.h"
 #include "HeapHelperPool.h"
@@ -914,7 +915,7 @@ void Heap::addToRememberedSet(const JSCell* cell)
 {
     ASSERT(cell);
     ASSERT(!Options::useConcurrentJIT() || !isCompilationThread());
-    ASSERT(cell->cellState() == CellState::OldBlack);
+    ASSERT(isBlack(cell->cellState()));
     // Indicate that this object is grey and that it's one of the following:
     // - A re-greyed object during a concurrent collection.
     // - An old remembered object.
index bdb23ed..65c8a36 100644 (file)
@@ -175,11 +175,11 @@ public:
     // call both of these functions: Calling only one may trigger catastropic
     // memory growth.
     void reportExtraMemoryAllocated(size_t);
-    void reportExtraMemoryVisited(CellState cellStateBeforeVisiting, size_t);
+    void reportExtraMemoryVisited(JSCell*, size_t);
 
 #if ENABLE(RESOURCE_USAGE)
     // Use this API to report the subset of extra memory that lives outside this process.
-    void reportExternalMemoryVisited(CellState cellStateBeforeVisiting, size_t);
+    void reportExternalMemoryVisited(JSCell*, size_t);
     size_t externalMemorySize() { return m_externalMemorySize; }
 #endif
 
index 0168f93..89af914 100644 (file)
@@ -125,7 +125,7 @@ inline void Heap::writeBarrier(const JSCell* from, JSCell* to)
 #if ENABLE(WRITE_BARRIER_PROFILING)
     WriteBarrierCounters::countWriteBarrier();
 #endif
-    if (!from || from->cellState() != CellState::OldBlack)
+    if (!from || !isBlack(from->cellState()))
         return;
     if (!to || to->cellState() != CellState::NewWhite)
         return;
@@ -135,7 +135,7 @@ inline void Heap::writeBarrier(const JSCell* from, JSCell* to)
 inline void Heap::writeBarrier(const JSCell* from)
 {
     ASSERT_GC_OBJECT_LOOKS_VALID(const_cast<JSCell*>(from));
-    if (!from || from->cellState() != CellState::OldBlack)
+    if (!from || !isBlack(from->cellState()))
         return;
     addToRememberedSet(from);
 }
@@ -146,10 +146,10 @@ inline void Heap::reportExtraMemoryAllocated(size_t size)
         reportExtraMemoryAllocatedSlowCase(size);
 }
 
-inline void Heap::reportExtraMemoryVisited(CellState dataBeforeVisiting, size_t size)
+inline void Heap::reportExtraMemoryVisited(JSCell* cell, size_t size)
 {
     // We don't want to double-count the extra memory that was reported in previous collections.
-    if (operationInProgress() == EdenCollection && dataBeforeVisiting == CellState::OldGrey)
+    if (operationInProgress() == EdenCollection && cell->cellState() == CellState::OldBlack)
         return;
 
     size_t* counter = &m_extraMemorySize;
@@ -162,10 +162,10 @@ inline void Heap::reportExtraMemoryVisited(CellState dataBeforeVisiting, size_t
 }
 
 #if ENABLE(RESOURCE_USAGE)
-inline void Heap::reportExternalMemoryVisited(CellState dataBeforeVisiting, size_t size)
+inline void Heap::reportExternalMemoryVisited(JSCell* cell, size_t size)
 {
     // We don't want to double-count the external memory that was reported in previous collections.
-    if (operationInProgress() == EdenCollection && dataBeforeVisiting == CellState::OldGrey)
+    if (operationInProgress() == EdenCollection && cell->cellState() == CellState::OldBlack)
         return;
 
     size_t* counter = &m_externalMemorySize;
index 318183a..c61079d 100644 (file)
@@ -26,6 +26,7 @@
 #include "config.h"
 #include "MarkStack.h"
 
+#include "GCSegmentedArrayInlines.h"
 #include "JSCInlines.h"
 
 namespace JSC {
index 04f19c6..1d04292 100644 (file)
@@ -26,7 +26,7 @@
 #ifndef MarkStack_h
 #define MarkStack_h
 
-#include "GCSegmentedArrayInlines.h"
+#include "GCSegmentedArray.h"
 
 namespace JSC {
 
index 8371721..637481a 100644 (file)
 
 #include "config.h"
 #include "SlotVisitor.h"
-#include "SlotVisitorInlines.h"
 
+#include "CPU.h"
 #include "ConservativeRoots.h"
+#include "GCSegmentedArrayInlines.h"
 #include "HeapCellInlines.h"
 #include "HeapProfiler.h"
 #include "HeapSnapshotBuilder.h"
@@ -36,6 +37,7 @@
 #include "JSObject.h"
 #include "JSString.h"
 #include "JSCInlines.h"
+#include "SlotVisitorInlines.h"
 #include "SuperSampler.h"
 #include "VM.h"
 #include <wtf/Lock.h>
@@ -296,25 +298,32 @@ ALWAYS_INLINE void SlotVisitor::visitChildren(const JSCell* cell)
     
     SetCurrentCellScope currentCellScope(*this, cell);
     
-    m_currentObjectCellStateBeforeVisiting = cell->cellState();
-    cell->setCellState(CellState::OldBlack);
+    cell->setCellState(blacken(cell->cellState()));
+    
+    // FIXME: Make this work on ARM also.
+    // https://bugs.webkit.org/show_bug.cgi?id=162461
+    if (isX86())
+        WTF::storeLoadFence();
     
-    if (isJSString(cell)) {
+    switch (cell->type()) {
+    case StringType:
         JSString::visitChildren(const_cast<JSCell*>(cell), *this);
-        return;
-    }
-
-    if (isJSFinalObject(cell)) {
+        break;
+        
+    case FinalObjectType:
         JSFinalObject::visitChildren(const_cast<JSCell*>(cell), *this);
-        return;
-    }
+        break;
 
-    if (isJSArray(cell)) {
+    case ArrayType:
         JSArray::visitChildren(const_cast<JSCell*>(cell), *this);
-        return;
+        break;
+        
+    default:
+        // FIXME: This could be so much better.
+        // https://bugs.webkit.org/show_bug.cgi?id=162462
+        cell->methodTable()->visitChildren(const_cast<JSCell*>(cell), *this);
+        break;
     }
-
-    cell->methodTable()->visitChildren(const_cast<JSCell*>(cell), *this);
 }
 
 void SlotVisitor::donateKnownParallel()
index b1e4b8b..f66251b 100644 (file)
@@ -168,8 +168,6 @@ private:
     HeapSnapshotBuilder* m_heapSnapshotBuilder { nullptr };
     JSCell* m_currentCell { nullptr };
 
-    CellState m_currentObjectCellStateBeforeVisiting { CellState::NewWhite };
-
 public:
 #if !ASSERT_DISABLED
     bool m_isCheckingForDefaultMarkViolation;
index 8ac49fa..bd5e719 100644 (file)
@@ -106,13 +106,13 @@ inline void SlotVisitor::addUnconditionalFinalizer(UnconditionalFinalizer* uncon
 
 inline void SlotVisitor::reportExtraMemoryVisited(size_t size)
 {
-    heap()->reportExtraMemoryVisited(m_currentObjectCellStateBeforeVisiting, size);
+    heap()->reportExtraMemoryVisited(m_currentCell, size);
 }
 
 #if ENABLE(RESOURCE_USAGE)
 inline void SlotVisitor::reportExternalMemoryVisited(size_t size)
 {
-    heap()->reportExternalMemoryVisited(m_currentObjectCellStateBeforeVisiting, size);
+    heap()->reportExternalMemoryVisited(m_currentCell, size);
 }
 #endif
 
index 639da49..8668a6b 100644 (file)
@@ -1308,13 +1308,13 @@ public:
 
     Jump jumpIfIsRememberedOrInEden(GPRReg cell)
     {
-        return branchTest8(MacroAssembler::NonZero, MacroAssembler::Address(cell, JSCell::cellStateOffset()));
+        return branch8(Above, Address(cell, JSCell::cellStateOffset()), TrustedImm32(blackThreshold));
     }
 
     Jump jumpIfIsRememberedOrInEden(JSCell* cell)
     {
         uint8_t* address = reinterpret_cast<uint8_t*>(cell) + JSCell::cellStateOffset();
-        return branchTest8(MacroAssembler::NonZero, MacroAssembler::AbsoluteAddress(address));
+        return branch8(Above, AbsoluteAddress(address), TrustedImm32(blackThreshold));
     }
     
     // Emits the branch structure for typeof. The code emitted by this doesn't fall through. The
index 2ffbf09..dfa9aa9 100644 (file)
@@ -214,6 +214,7 @@ void Data::performAssertions(VM& vm)
     STATIC_ASSERT(GetPutInfo::initializationBits == 0xffc00);
 
     STATIC_ASSERT(MarkedBlock::blockSize == 16 * 1024);
+    STATIC_ASSERT(blackThreshold == 1);
 
     ASSERT(bitwise_cast<uintptr_t>(ShadowChicken::Packet::tailMarker()) == static_cast<uintptr_t>(0x7a11));
 
index 997ef35..10037af 100644 (file)
@@ -409,6 +409,8 @@ const NotInitialization = 2
 const MarkedBlockSize = 16 * 1024
 const MarkedBlockMask = ~(MarkedBlockSize - 1)
 
+const BlackThreshold = 1
+
 # Allocation constants
 if JSVALUE64
     const JSFinalObjectSizeClassIndex = 1
@@ -888,9 +890,10 @@ macro arrayProfile(cellAndIndexingType, profile, scratch)
     loadb JSCell::m_indexingType[cell], indexingType
 end
 
-macro skipIfIsRememberedOrInEden(cell, scratch1, scratch2, continuation)
-    loadb JSCell::m_cellState[cell], scratch1
-    continuation(scratch1)
+macro skipIfIsRememberedOrInEden(cell, slowPath)
+    bba JSCell::m_cellState[cell], BlackThreshold, .done
+    slowPath()
+.done:
 end
 
 macro notifyWrite(set, slow)
index c3ebc62..90d2c36 100644 (file)
@@ -500,9 +500,9 @@ end
 macro writeBarrierOnOperand(cellOperand)
     loadisFromInstruction(cellOperand, t1)
     loadConstantOrVariablePayload(t1, CellTag, t2, .writeBarrierDone)
-    skipIfIsRememberedOrInEden(t2, t1, t3, 
-        macro(cellState)
-            btbnz cellState, .writeBarrierDone
+    skipIfIsRememberedOrInEden(
+        t2, 
+        macro()
             push cfr, PC
             # We make two extra slots because cCall2 will poke.
             subp 8, sp
@@ -511,8 +511,7 @@ macro writeBarrierOnOperand(cellOperand)
             cCall2Void(_llint_write_barrier_slow)
             addp 8, sp
             pop PC, cfr
-        end
-    )
+        end)
 .writeBarrierDone:
 end
 
@@ -532,9 +531,9 @@ macro writeBarrierOnGlobal(valueOperand, loadHelper)
 
     loadHelper(t3)
 
-    skipIfIsRememberedOrInEden(t3, t1, t2,
-        macro(gcData)
-            btbnz gcData, .writeBarrierDone
+    skipIfIsRememberedOrInEden(
+        t3,
+        macro()
             push cfr, PC
             # We make two extra slots because cCall2 will poke.
             subp 8, sp
@@ -543,8 +542,7 @@ macro writeBarrierOnGlobal(valueOperand, loadHelper)
             cCall2Void(_llint_write_barrier_slow)
             addp 8, sp
             pop PC, cfr
-        end
-    )
+        end)
 .writeBarrierDone:
 end
 
index d34a439..e345652 100644 (file)
@@ -404,16 +404,15 @@ end
 macro writeBarrierOnOperand(cellOperand)
     loadisFromInstruction(cellOperand, t1)
     loadConstantOrVariableCell(t1, t2, .writeBarrierDone)
-    skipIfIsRememberedOrInEden(t2, t1, t3, 
-        macro(cellState)
-            btbnz cellState, .writeBarrierDone
+    skipIfIsRememberedOrInEden(
+        t2,
+        macro()
             push PB, PC
             move t2, a1 # t2 can be a0 (not on 64 bits, but better safe than sorry)
             move cfr, a0
             cCall2Void(_llint_write_barrier_slow)
             pop PC, PB
-        end
-    )
+        end)
 .writeBarrierDone:
 end
 
@@ -432,9 +431,9 @@ macro writeBarrierOnGlobal(valueOperand, loadHelper)
     btpz t0, .writeBarrierDone
 
     loadHelper(t3)
-    skipIfIsRememberedOrInEden(t3, t1, t2,
-        macro(gcData)
-            btbnz gcData, .writeBarrierDone
+    skipIfIsRememberedOrInEden(
+        t3,
+        macro()
             push PB, PC
             move cfr, a0
             move t3, a1
index a928760..ed15ae1 100644 (file)
@@ -1093,7 +1093,7 @@ inline JSFinalObject* JSFinalObject::create(VM& vm, Structure* structure)
 
 inline bool isJSFinalObject(JSCell* cell)
 {
-    return cell->classInfo() == JSFinalObject::info();
+    return cell->type() == FinalObjectType;
 }
 
 inline bool isJSFinalObject(JSValue value)
index 2951ca7..2902a6c 100644 (file)
@@ -1,3 +1,15 @@
+2016-09-23  Filip Pizlo  <fpizlo@apple.com>
+
+        Need a store-load fence between setting cell state and visiting the object in SlotVisitor
+        https://bugs.webkit.org/show_bug.cgi?id=162354
+
+        Reviewed by Mark Lam.
+        
+        Fix this on x86-32.
+
+        * wtf/Atomics.h:
+        (WTF::x86_ortop):
+
 2016-09-23  Caitlin Potter  <caitp@igalia.com>
 
         [JSC] Implement parsing of Async Functions
index c48879f..e46a4ba 100644 (file)
@@ -175,10 +175,12 @@ inline void x86_ortop()
     // know that it is equivalent for our purposes, but it would be good to
     // investigate if that is actually better.
     MemoryBarrier();
-#else
+#elif CPU(X86_64)
     // This has acqrel semantics and is much cheaper than mfence. For exampe, in the JSC GC, using
     // mfence as a store-load fence was a 9% slow-down on Octane/splay while using this was neutral.
     asm volatile("lock; orl $0, (%%rsp)" ::: "memory");
+#else
+    asm volatile("lock; orl $0, (%%esp)" ::: "memory");
 #endif
 }