[Linux] Port MallocBench
author utatane.tea@gmail.com <utatane.tea@gmail.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Thu, 5 Oct 2017 07:05:44 +0000 (07:05 +0000)
committer utatane.tea@gmail.com <utatane.tea@gmail.com@268f45cc-cd09-0410-ab3c-d52691b4dbfc>
Thu, 5 Oct 2017 07:05:44 +0000 (07:05 +0000)
https://bugs.webkit.org/show_bug.cgi?id=177856

Reviewed by Filip Pizlo.

.:

* CMakeLists.txt:

PerformanceTests:

We would like to optimize locking in bmalloc on Linux by using the futex APIs, so we need
a way to verify that this actually improves, or at least does not regress, performance.

This patch ports MallocBench to Linux so that we can measure the effect of bmalloc patches on Linux.

While we replace the dispatch serial queue in message.cpp, we still use libdispatch in Benchmark.cpp
on Apple platforms, since the C++11 threading library has no priority mechanism.

We also extend run-malloc-benchmarks to accept the CMake-style layout of the build product directory,
and we support building MallocBench in CMake environments, including CMake Mac ports.
Windows is not supported yet.

The measurements suggest that glibc's malloc performance is not so bad. While bmalloc shows a 3.8x
(geomean) performance improvement, bmalloc on Linux shows a 2.0x improvement. Since both numbers
for bmalloc are similar, we can infer that bmalloc's optimizations are working on Linux too.
And even though glibc's malloc performance is not so bad, bmalloc still offers a performance
improvement.
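
For readers unfamiliar with the "geomean" figure quoted above, the following is a minimal sketch of how a geometric mean of per-benchmark speedup ratios can be computed. The function name geometricMean is hypothetical and is not part of MallocBench or run-malloc-benchmarks; how the script itself aggregates results is not shown in this commit.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: geometric mean of per-benchmark speedup ratios.
// The product is accumulated in log space to avoid overflow or underflow
// when many ratios are multiplied together.
double geometricMean(const std::vector<double>& ratios)
{
    if (ratios.empty())
        return 0;

    double logSum = 0;
    for (double ratio : ratios)
        logSum += std::log(ratio); // assumes every ratio is > 0

    return std::exp(logSum / ratios.size());
}
```

A geometric mean is the usual choice for averaging ratios because a 2x win on one benchmark and a 2x loss on another cancel out to 1x, which an arithmetic mean would not report.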

* CMakeLists.txt: Added.
* MallocBench/CMakeLists.txt: Added.
* MallocBench/MallocBench.xcodeproj/project.pbxproj:
* MallocBench/MallocBench/Benchmark.cpp:
(Benchmark::Benchmark):
(Benchmark::runOnce):
(Benchmark::currentMemoryBytes): Deleted.
* MallocBench/MallocBench/Benchmark.h:
(Benchmark::Memory::Memory): Deleted.
(Benchmark::Memory::operator-): Deleted.
* MallocBench/MallocBench/CMakeLists.txt: Added.
* MallocBench/MallocBench/CPUCount.cpp:
(cpuCount):
* MallocBench/MallocBench/Interpreter.cpp:
(Interpreter::doMallocOp):
* MallocBench/MallocBench/Memory.cpp: Added.
(currentMemoryBytes):
* MallocBench/MallocBench/Memory.h: Copied from PerformanceTests/MallocBench/MallocBench/CPUCount.cpp.
(Memory::Memory):
(Memory::operator-):
* MallocBench/MallocBench/balloon.cpp:
(benchmark_balloon):
* MallocBench/MallocBench/mbmalloc.cpp:
* MallocBench/MallocBench/message.cpp:
(WorkQueue::WorkQueue):
(WorkQueue::~WorkQueue):
(WorkQueue::dispatchAsync):
(WorkQueue::dispatchSync):
(benchmark_message_one):
(benchmark_message_many):
* MallocBench/MallocBench/nimlang.cpp:
(benchmark_nimlang):
* MallocBench/MallocBench/stress.cpp:
(SizeStream::next):
* MallocBench/MallocBench/stress_aligned.cpp:
* MallocBench/run-malloc-benchmarks:

Source/bmalloc:

* CMakeLists.txt:

git-svn-id: https://svn.webkit.org/repository/webkit/trunk@222900 268f45cc-cd09-0410-ab3c-d52691b4dbfc

22 files changed:
CMakeLists.txt
ChangeLog
PerformanceTests/CMakeLists.txt [new file with mode: 0644]
PerformanceTests/ChangeLog
PerformanceTests/MallocBench/CMakeLists.txt [new file with mode: 0644]
PerformanceTests/MallocBench/MallocBench.xcodeproj/project.pbxproj
PerformanceTests/MallocBench/MallocBench/Benchmark.cpp
PerformanceTests/MallocBench/MallocBench/Benchmark.h
PerformanceTests/MallocBench/MallocBench/CMakeLists.txt [new file with mode: 0644]
PerformanceTests/MallocBench/MallocBench/CPUCount.cpp
PerformanceTests/MallocBench/MallocBench/Interpreter.cpp
PerformanceTests/MallocBench/MallocBench/Memory.cpp [new file with mode: 0644]
PerformanceTests/MallocBench/MallocBench/Memory.h [new file with mode: 0644]
PerformanceTests/MallocBench/MallocBench/balloon.cpp
PerformanceTests/MallocBench/MallocBench/mbmalloc.cpp
PerformanceTests/MallocBench/MallocBench/message.cpp
PerformanceTests/MallocBench/MallocBench/nimlang.cpp
PerformanceTests/MallocBench/MallocBench/stress.cpp
PerformanceTests/MallocBench/MallocBench/stress_aligned.cpp
PerformanceTests/MallocBench/run-malloc-benchmarks
Source/bmalloc/CMakeLists.txt
Source/bmalloc/ChangeLog

index 4c3b940..837ba02 100644 (file)
@@ -166,6 +166,8 @@ if (ENABLE_TOOLS)
     add_subdirectory(Tools)
 endif ()
 
+add_subdirectory(PerformanceTests)
+
 # -----------------------------------------------------------------------------
 # Print the features list last, for maximum visibility.
 # -----------------------------------------------------------------------------
index 2501cfc..e63dd3a 100644 (file)
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2017-10-05  Yusuke Suzuki  <utatane.tea@gmail.com>
+
+        [Linux] Port MallocBench
+        https://bugs.webkit.org/show_bug.cgi?id=177856
+
+        Reviewed by Filip Pizlo.
+
+        * CMakeLists.txt:
+
 2017-10-04  Ryan Haddad  <ryanhaddad@apple.com>
 
         Unreviewed, rolling out r222840.
diff --git a/PerformanceTests/CMakeLists.txt b/PerformanceTests/CMakeLists.txt
new file mode 100644 (file)
index 0000000..cf60b94
--- /dev/null
@@ -0,0 +1,5 @@
+if (NOT USE_SYSTEM_MALLOC)
+    add_subdirectory(MallocBench)
+endif ()
+
+WEBKIT_INCLUDE_CONFIG_FILES_IF_EXISTS()
index 2cab909..d00b0a8 100644 (file)
@@ -1,3 +1,65 @@
+2017-10-05  Yusuke Suzuki  <utatane.tea@gmail.com>
+
+        [Linux] Port MallocBench
+        https://bugs.webkit.org/show_bug.cgi?id=177856
+
+        Reviewed by Filip Pizlo.
+
+        We would like to optimize locking in bmalloc on Linux by using the futex APIs, so we need
+        a way to verify that this actually improves, or at least does not regress, performance.
+
+        This patch ports MallocBench to Linux so that we can measure the effect of bmalloc patches on Linux.
+
+        While we replace the dispatch serial queue in message.cpp, we still use libdispatch in Benchmark.cpp
+        on Apple platforms, since the C++11 threading library has no priority mechanism.
+
+        We also extend run-malloc-benchmarks to accept the CMake-style layout of the build product directory,
+        and we support building MallocBench in CMake environments, including CMake Mac ports.
+        Windows is not supported yet.
+
+        The measurements suggest that glibc's malloc performance is not so bad. While bmalloc shows a 3.8x
+        (geomean) performance improvement, bmalloc on Linux shows a 2.0x improvement. Since both numbers
+        for bmalloc are similar, we can infer that bmalloc's optimizations are working on Linux too.
+        And even though glibc's malloc performance is not so bad, bmalloc still offers a performance
+        improvement.
+
+        * CMakeLists.txt: Added.
+        * MallocBench/CMakeLists.txt: Added.
+        * MallocBench/MallocBench.xcodeproj/project.pbxproj:
+        * MallocBench/MallocBench/Benchmark.cpp:
+        (Benchmark::Benchmark):
+        (Benchmark::runOnce):
+        (Benchmark::currentMemoryBytes): Deleted.
+        * MallocBench/MallocBench/Benchmark.h:
+        (Benchmark::Memory::Memory): Deleted.
+        (Benchmark::Memory::operator-): Deleted.
+        * MallocBench/MallocBench/CMakeLists.txt: Added.
+        * MallocBench/MallocBench/CPUCount.cpp:
+        (cpuCount):
+        * MallocBench/MallocBench/Interpreter.cpp:
+        (Interpreter::doMallocOp):
+        * MallocBench/MallocBench/Memory.cpp: Added.
+        (currentMemoryBytes):
+        * MallocBench/MallocBench/Memory.h: Copied from PerformanceTests/MallocBench/MallocBench/CPUCount.cpp.
+        (Memory::Memory):
+        (Memory::operator-):
+        * MallocBench/MallocBench/balloon.cpp:
+        (benchmark_balloon):
+        * MallocBench/MallocBench/mbmalloc.cpp:
+        * MallocBench/MallocBench/message.cpp:
+        (WorkQueue::WorkQueue):
+        (WorkQueue::~WorkQueue):
+        (WorkQueue::dispatchAsync):
+        (WorkQueue::dispatchSync):
+        (benchmark_message_one):
+        (benchmark_message_many):
+        * MallocBench/MallocBench/nimlang.cpp:
+        (benchmark_nimlang):
+        * MallocBench/MallocBench/stress.cpp:
+        (SizeStream::next):
+        * MallocBench/MallocBench/stress_aligned.cpp:
+        * MallocBench/run-malloc-benchmarks:
+
 2017-09-26  Mathias Bynens  <mathias@qiwi.be>
 
         Speedometer: ensure all TodoMVC tests use the complete latest CSS
diff --git a/PerformanceTests/MallocBench/CMakeLists.txt b/PerformanceTests/MallocBench/CMakeLists.txt
new file mode 100644 (file)
index 0000000..8cba7b6
--- /dev/null
@@ -0,0 +1 @@
+add_subdirectory(MallocBench)
index 2be36cb..3416f81 100644 (file)
@@ -40,6 +40,7 @@
                14FCA36119A7C917001CFDA9 /* stress.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 14FCA35F19A7C917001CFDA9 /* stress.cpp */; };
                65E401A61C657A87003C6E9C /* nimlang.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 65E401A41C657A87003C6E9C /* nimlang.cpp */; };
                65E401AC1C73B068003C6E9C /* alloc_free.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 65E401AA1C73B068003C6E9C /* alloc_free.cpp */; };
+               E37681031F8529EF00617E4C /* Memory.cpp in Sources */ = {isa = PBXBuildFile; fileRef = E37681011F8529EF00617E4C /* Memory.cpp */; };
 /* End PBXBuildFile section */
 
 /* Begin PBXCopyFilesBuildPhase section */
                        dstPath = "";
                        dstSubfolderSpec = 7;
                        files = (
+                               14C5009018401841007A531D /* facebook.ops in Copy Files */,
+                               1447AE9318FB588600B3D7FF /* flickr.ops in Copy Files */,
+                               1447AE9918FB59E000B3D7FF /* flickr_memory_warning.ops in Copy Files */,
                                1486502A1C7CF1CC008AABFE /* nimlang.ops in Copy Files */,
                                1447AE9418FB589400B3D7FF /* reddit.ops in Copy Files */,
-                               1447AE9918FB59E000B3D7FF /* flickr_memory_warning.ops in Copy Files */,
-                               1447AE9B18FB59E600B3D7FF /* theverge_memory_warning.ops in Copy Files */,
                                1447AE9A18FB59E300B3D7FF /* reddit_memory_warning.ops in Copy Files */,
                                1447AE9518FB58A300B3D7FF /* theverge.ops in Copy Files */,
-                               1447AE9318FB588600B3D7FF /* flickr.ops in Copy Files */,
-                               14C5009018401841007A531D /* facebook.ops in Copy Files */,
+                               1447AE9B18FB59E600B3D7FF /* theverge_memory_warning.ops in Copy Files */,
                        );
                        name = "Copy Files";
                        runOnlyForDeploymentPostprocessing = 0;
                65E401A51C657A87003C6E9C /* nimlang.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = nimlang.h; path = MallocBench/nimlang.h; sourceTree = "<group>"; };
                65E401AA1C73B068003C6E9C /* alloc_free.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = alloc_free.cpp; path = MallocBench/alloc_free.cpp; sourceTree = "<group>"; };
                65E401AB1C73B068003C6E9C /* alloc_free.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = alloc_free.h; path = MallocBench/alloc_free.h; sourceTree = "<group>"; };
+               E37681011F8529EF00617E4C /* Memory.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = Memory.cpp; sourceTree = "<group>"; };
+               E37681021F8529EF00617E4C /* Memory.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = Memory.h; sourceTree = "<group>"; };
 /* End PBXFileReference section */
 
 /* Begin PBXFrameworksBuildPhase section */
                        isa = PBXGroup;
                        children = (
                                14E11933177F51AC003A8D15 /* Benchmarks */,
+                               1447AE8618FB4B5100B3D7FF /* Libraries */,
                                14452CAE177D24460097E057 /* MallocBench */,
                                14CC391B18EA6722004AFE34 /* mbmalloc */,
-                               1447AE8618FB4B5100B3D7FF /* Libraries */,
                                14452CAA177D24460097E057 /* Products */,
                        );
                        sourceTree = "<group>";
                                14C5009118403DA0007A531D /* Interpreter.cpp */,
                                14C5009218403DA0007A531D /* Interpreter.h */,
                                14452CAF177D24460097E057 /* main.cpp */,
+                               E37681011F8529EF00617E4C /* Memory.cpp */,
+                               E37681021F8529EF00617E4C /* Memory.h */,
                        );
                        path = MallocBench;
                        sourceTree = "<group>";
                                14C5008B184016CF007A531D /* facebook.cpp */,
                                14C5008C184016CF007A531D /* facebook.h */,
                                14C5008E18401726007A531D /* facebook.ops */,
-                               1447AE9618FB59D900B3D7FF /* flickr_memory_warning.ops */,
                                1447AE8718FB584200B3D7FF /* flickr.cpp */,
                                1447AE8818FB584200B3D7FF /* flickr.h */,
                                1447AE8918FB584200B3D7FF /* flickr.ops */,
+                               1447AE9618FB59D900B3D7FF /* flickr_memory_warning.ops */,
                                1444AE91177E79BB00F8030A /* fragment.cpp */,
                                1444AE92177E79BB00F8030A /* fragment.h */,
                                14976EC6177E3649006B819A /* list.cpp */,
                                1451FAEC18B14B7100DB6D47 /* medium.h */,
                                1444AE94177E8DF200F8030A /* message.cpp */,
                                1444AE95177E8DF200F8030A /* message.h */,
-                               14105E8018E13EEC003A106E /* realloc.cpp */,
                                65E401A41C657A87003C6E9C /* nimlang.cpp */,
                                65E401A51C657A87003C6E9C /* nimlang.h */,
                                148650291C7CF182008AABFE /* nimlang.ops */,
+                               14105E8018E13EEC003A106E /* realloc.cpp */,
                                14105E8118E13EEC003A106E /* realloc.h */,
-                               1447AE9718FB59D900B3D7FF /* reddit_memory_warning.ops */,
                                1447AE8A18FB584200B3D7FF /* reddit.cpp */,
                                1447AE8B18FB584200B3D7FF /* reddit.h */,
                                1447AE8C18FB584200B3D7FF /* reddit.ops */,
+                               1447AE9718FB59D900B3D7FF /* reddit_memory_warning.ops */,
                                14FCA35F19A7C917001CFDA9 /* stress.cpp */,
                                14FCA36019A7C917001CFDA9 /* stress.h */,
                                14D0BFF11A6F4D3B00109F31 /* stress_aligned.cpp */,
                                14D0BFF21A6F4D3B00109F31 /* stress_aligned.h */,
-                               1447AE9818FB59D900B3D7FF /* theverge_memory_warning.ops */,
                                1447AE8D18FB584200B3D7FF /* theverge.cpp */,
                                1447AE8E18FB584200B3D7FF /* theverge.h */,
                                1447AE8F18FB584200B3D7FF /* theverge.ops */,
+                               1447AE9818FB59D900B3D7FF /* theverge_memory_warning.ops */,
                                14976ECF177E4AF7006B819A /* tree.cpp */,
                                14976ED0177E4AF7006B819A /* tree.h */,
                        );
                        isa = PBXSourcesBuildPhase;
                        buildActionMask = 2147483647;
                        files = (
-                               14CE4A6017BD355800288DAA /* big.cpp in Sources */,
-                               14976ED1177E4AF7006B819A /* tree.cpp in Sources */,
-                               1444AE96177E8DF200F8030A /* message.cpp in Sources */,
                                65E401AC1C73B068003C6E9C /* alloc_free.cpp in Sources */,
+                               14105E7F18DF7D73003A106E /* balloon.cpp in Sources */,
+                               14976ECE177E3D67006B819A /* Benchmark.cpp in Sources */,
+                               14CE4A6017BD355800288DAA /* big.cpp in Sources */,
                                14452CEF177D47110097E057 /* churn.cpp in Sources */,
-                               14452CB0177D24460097E057 /* main.cpp in Sources */,
-                               14FCA36119A7C917001CFDA9 /* stress.cpp in Sources */,
+                               14976ECC177E3C87006B819A /* CommandLine.cpp in Sources */,
+                               14E11932177ECC8B003A8D15 /* CPUCount.cpp in Sources */,
                                14C5008D184016CF007A531D /* facebook.cpp in Sources */,
                                1447AE9018FB584200B3D7FF /* flickr.cpp in Sources */,
+                               1444AE93177E79BB00F8030A /* fragment.cpp in Sources */,
+                               14C5009318403DA0007A531D /* Interpreter.cpp in Sources */,
                                14976EC8177E3649006B819A /* list.cpp in Sources */,
-                               14976ECC177E3C87006B819A /* CommandLine.cpp in Sources */,
-                               65E401A61C657A87003C6E9C /* nimlang.cpp in Sources */,
-                               1447AE9218FB584200B3D7FF /* theverge.cpp in Sources */,
-                               14D0BFF31A6F4D3B00109F31 /* stress_aligned.cpp in Sources */,
+                               14452CB0177D24460097E057 /* main.cpp in Sources */,
                                1451FAED18B14B7100DB6D47 /* medium.cpp in Sources */,
-                               14C5009318403DA0007A531D /* Interpreter.cpp in Sources */,
-                               1447AE9118FB584200B3D7FF /* reddit.cpp in Sources */,
-                               14E11932177ECC8B003A8D15 /* CPUCount.cpp in Sources */,
-                               1444AE93177E79BB00F8030A /* fragment.cpp in Sources */,
+                               E37681031F8529EF00617E4C /* Memory.cpp in Sources */,
+                               1444AE96177E8DF200F8030A /* message.cpp in Sources */,
+                               65E401A61C657A87003C6E9C /* nimlang.cpp in Sources */,
                                14105E8218E13EEC003A106E /* realloc.cpp in Sources */,
-                               14105E7F18DF7D73003A106E /* balloon.cpp in Sources */,
-                               14976ECE177E3D67006B819A /* Benchmark.cpp in Sources */,
+                               1447AE9118FB584200B3D7FF /* reddit.cpp in Sources */,
+                               14FCA36119A7C917001CFDA9 /* stress.cpp in Sources */,
+                               14D0BFF31A6F4D3B00109F31 /* stress_aligned.cpp in Sources */,
+                               1447AE9218FB584200B3D7FF /* theverge.cpp in Sources */,
+                               14976ED1177E4AF7006B819A /* tree.cpp in Sources */,
                        );
                        runOnlyForDeploymentPostprocessing = 0;
                };
index d2d842c..bcbd268 100644 (file)
 #include "stress_aligned.h"
 #include "theverge.h"
 #include "tree.h"
-#include <dispatch/dispatch.h>
+#include <algorithm>
 #include <iostream>
-#include <mach/mach.h>
-#include <mach/task_info.h>
 #include <map>
+#include <stdio.h>
 #include <string>
+#include <strings.h>
 #include <sys/time.h>
 #include <thread>
-#include <unistd.h>
+#include <vector>
+
+#ifdef __APPLE__
+#include <dispatch/dispatch.h>
+#endif
 
 #include "mbmalloc.h"
 
@@ -130,9 +134,7 @@ static void deallocateHeap(void*** chunks, size_t heapSize, size_t chunkSize, si
 }
 
 Benchmark::Benchmark(CommandLine& commandLine)
-    : m_benchmarkPair()
-    , m_elapsedTime()
-    , m_commandLine(commandLine)
+    : m_commandLine(commandLine)
 {
     const BenchmarkPair* benchmarkPair = std::find(
         benchmarkPairs, benchmarkPairs + benchmarksPairsCount, m_commandLine.benchmarkName());
@@ -156,6 +158,7 @@ void Benchmark::runOnce()
         return;
     }
 
+#ifdef __APPLE__
     dispatch_group_t group = dispatch_group_create();
 
     for (size_t i = 0; i < cpuCount(); ++i) {
@@ -167,6 +170,16 @@ void Benchmark::runOnce()
     dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
 
     dispatch_release(group);
+#else
+    std::vector<std::thread> threads;
+    for (size_t i = 0; i < cpuCount(); ++i) {
+        threads.emplace_back([&] {
+            m_benchmarkPair->function(m_commandLine);
+        });
+    }
+    for (auto& thread : threads)
+        thread.join();
+#endif
 }
 
 void Benchmark::run()
@@ -212,18 +225,3 @@ double Benchmark::currentTimeMS()
     return (now.tv_sec * 1000.0) + now.tv_usec / 1000.0;
 }
 
-Benchmark::Memory Benchmark::currentMemoryBytes()
-{
-    Memory memory;
-
-    task_vm_info_data_t vm_info;
-    mach_msg_type_number_t vm_size = TASK_VM_INFO_COUNT;
-    if (KERN_SUCCESS != task_info(mach_task_self(), TASK_VM_INFO_PURGEABLE, (task_info_t)(&vm_info), &vm_size)) {
-        cout << "Failed to get mach task info" << endl;
-        exit(1);
-    }
-
-    memory.resident = vm_info.internal + vm_info.compressed - vm_info.purgeable_volatile_pmap;
-    memory.residentMax = vm_info.resident_size_peak;
-    return memory;
-}
index 9a449b9..4c83f12 100644 (file)
@@ -27,6 +27,7 @@
 #define Benchmark_h
 
 #include "CommandLine.h"
+#include "Memory.h"
 #include <map>
 #include <string>
 
@@ -35,30 +36,7 @@ struct BenchmarkPair;
 
 class Benchmark {
 public:
-    struct Memory {
-        Memory()
-            : resident()
-            , residentMax()
-        {
-        }
-        
-        Memory(size_t resident, size_t residentMax)
-            : resident(resident)
-            , residentMax(residentMax)
-        {
-        }
-
-        Memory operator-(const Memory& other)
-        {
-            return Memory(resident - other.resident, residentMax - other.residentMax);
-        }
-    
-        size_t resident;
-        size_t residentMax;
-    };
-
     static double currentTimeMS();
-    static Memory currentMemoryBytes();
 
     Benchmark(CommandLine&);
     
@@ -75,12 +53,12 @@ private:
 
     MapType m_map;
 
-    const BenchmarkPair* m_benchmarkPair;
+    const BenchmarkPair* m_benchmarkPair { nullptr };
 
     CommandLine& m_commandLine;
 
     Memory m_memory;
-    double m_elapsedTime;
+    double m_elapsedTime { 0 };
 };
 
 #endif // Benchmark_h
diff --git a/PerformanceTests/MallocBench/MallocBench/CMakeLists.txt b/PerformanceTests/MallocBench/MallocBench/CMakeLists.txt
new file mode 100644 (file)
index 0000000..df0b650
--- /dev/null
@@ -0,0 +1,69 @@
+add_library(sysmalloc SHARED mbmalloc.cpp)
+set_target_properties(sysmalloc PROPERTIES OUTPUT_NAME "mbmalloc")
+set_target_properties(sysmalloc PROPERTIES LIBRARY_OUTPUT_DIRECTORY ${PROJECT_BINARY_DIR}/lib/system/)
+
+set(MALLOC_BENCH_SOURCES
+    Benchmark.cpp
+    CPUCount.cpp
+    CommandLine.cpp
+    Interpreter.cpp
+    Memory.cpp
+    alloc_free.cpp
+    balloon.cpp
+    big.cpp
+    churn.cpp
+    facebook.cpp
+    flickr.cpp
+    fragment.cpp
+    list.cpp
+    main.cpp
+    medium.cpp
+    message.cpp
+    nimlang.cpp
+    realloc.cpp
+    reddit.cpp
+    stress.cpp
+    stress_aligned.cpp
+    theverge.cpp
+    tree.cpp
+)
+
+set(MALLOC_BENCH_INCLUDE_DIRECTORIES
+    "${BMALLOC_DIR}"
+    "${CMAKE_BINARY_DIR}"
+    "${DERIVED_SOURCES_DIR}"
+    "${THIRDPARTY_DIR}"
+)
+
+set(MALLOC_BENCH_LIBRARIES
+    ${CMAKE_DL_LIBS}
+)
+
+WEBKIT_INCLUDE_CONFIG_FILES_IF_EXISTS()
+
+WEBKIT_WRAP_SOURCELIST(${MALLOC_BENCH_SOURCES})
+
+
+include_directories(${MALLOC_BENCH_INCLUDE_DIRECTORIES})
+
+SET(CMAKE_SKIP_BUILD_RPATH  TRUE)
+add_executable(MallocBench ${MALLOC_BENCH_SOURCES})
+target_link_libraries(MallocBench ${CMAKE_THREAD_LIBS_INIT} ${MALLOC_BENCH_LIBRARIES} mbmalloc)
+add_dependencies(MallocBench sysmalloc mbmalloc)
+
+set(MALLOC_BENCH_OPS
+    facebook.ops
+    flickr.ops
+    flickr_memory_warning.ops
+    nimlang.ops
+    reddit.ops
+    reddit_memory_warning.ops
+    theverge.ops
+    theverge_memory_warning.ops
+)
+
+file(COPY
+    ${MALLOC_BENCH_OPS}
+    DESTINATION
+    ${PROJECT_BINARY_DIR}
+)
index 0f20160..31d6d2d 100644 (file)
@@ -28,6 +28,7 @@
 #include <sys/param.h>
 #include <sys/sysctl.h>
 #include <sys/types.h>
+#include <unistd.h>
 
 static size_t count;
 
@@ -36,6 +37,7 @@ size_t cpuCount()
     if (count)
         return count;
 
+#ifdef __APPLE__
     size_t length = sizeof(count);
     int name[] = {
             CTL_HW,
@@ -44,6 +46,12 @@ size_t cpuCount()
     int sysctlResult = sysctl(name, sizeof(name) / sizeof(int), &count, &length, 0, 0);
     if (sysctlResult < 0)
         abort();
+#else
+    long sysconfResult = sysconf(_SC_NPROCESSORS_ONLN);
+    if (sysconfResult < 0)
+        abort();
+    count = sysconfResult;
+#endif
 
     return count;
 }
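
The Linux branch of cpuCount() above aborts when sysconf() fails. A softer variant, shown here only as a sketch, could fall back to std::thread::hardware_concurrency() and then to 1. The function name onlineCpuCount is hypothetical, and the fallback policy is an assumption rather than what the patch does.

```cpp
#include <cstddef>
#include <thread>
#include <unistd.h>

// Hypothetical variant of cpuCount(): instead of aborting when sysconf()
// fails, fall back to std::thread::hardware_concurrency(), then to 1.
std::size_t onlineCpuCount()
{
    long result = sysconf(_SC_NPROCESSORS_ONLN);
    if (result > 0)
        return static_cast<std::size_t>(result);

    // hardware_concurrency() is only a hint and may return 0.
    unsigned hint = std::thread::hardware_concurrency();
    return hint ? hint : 1;
}
```

For a benchmark, aborting as the patch does is a defensible choice too, since silently running on one thread would skew the results.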
index 7a1bb8f..8c6cf3f 100644 (file)
@@ -227,7 +227,7 @@ static size_t compute2toPower(unsigned log2n)
     return result;
 }
 
-void Interpreter::doMallocOp(Op op, ThreadId threadId)
+void Interpreter::doMallocOp(Op op, ThreadId)
 {
     switch (op.opcode) {
         case op_malloc: {
diff --git a/PerformanceTests/MallocBench/MallocBench/Memory.cpp b/PerformanceTests/MallocBench/MallocBench/Memory.cpp
new file mode 100644 (file)
index 0000000..169596e
--- /dev/null
@@ -0,0 +1,87 @@
+/*
+ * Copyright (C) 2014 Apple Inc. All rights reserved.
+ * Copyright (C) 2017 Yusuke Suzuki <utatane.tea@gmail.com>
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "Memory.h"
+#include <iostream>
+#include <stdlib.h>
+
+#ifdef __APPLE__
+#include <mach/mach.h>
+#include <mach/task_info.h>
+#else
+#include <stdio.h>
+#include <unistd.h>
+#endif
+
+Memory currentMemoryBytes()
+{
+    Memory memory;
+
+#ifdef __APPLE__
+    task_vm_info_data_t vm_info;
+    mach_msg_type_number_t vm_size = TASK_VM_INFO_COUNT;
+    if (KERN_SUCCESS != task_info(mach_task_self(), TASK_VM_INFO_PURGEABLE, (task_info_t)(&vm_info), &vm_size)) {
+        std::cout << "Failed to get mach task info" << std::endl;
+        exit(1);
+    }
+
+    memory.resident = vm_info.internal + vm_info.compressed - vm_info.purgeable_volatile_pmap;
+    memory.residentMax = vm_info.resident_size_peak;
+#else
+    FILE* file = fopen("/proc/self/status", "r");
+
+    auto forEachLine = [] (FILE* file, auto functor) {
+        char* buffer = nullptr;
+        size_t size = 0;
+        while (getline(&buffer, &size, file) != -1) {
+            functor(buffer, size);
+            ::free(buffer); // Be careful. getline's memory allocation is done by system malloc.
+            buffer = nullptr;
+            size = 0;
+        }
+    };
+
+    unsigned long vmHWM = 0;
+    unsigned long vmRSS = 0;
+    unsigned long rssFile = 0;
+    unsigned long rssShmem = 0;
+    forEachLine(file, [&] (char* buffer, size_t) {
+        unsigned long sizeInKB = 0;
+        if (sscanf(buffer, "VmHWM: %lu kB", &sizeInKB) == 1)
+            vmHWM = sizeInKB * 1024;
+        else if (sscanf(buffer, "VmRSS: %lu kB", &sizeInKB) == 1)
+            vmRSS = sizeInKB * 1024;
+        else if (sscanf(buffer, "RssFile: %lu kB", &sizeInKB) == 1)
+            rssFile = sizeInKB * 1024;
+        else if (sscanf(buffer, "RssShmem: %lu kB", &sizeInKB) == 1)
+            rssShmem = sizeInKB * 1024;
+    });
+    fclose(file);
+    memory.resident = vmRSS - (rssFile + rssShmem);
+    memory.residentMax = vmHWM - (rssFile + rssShmem); // We do not have any way to get the peak of RSS of anonymous pages. Here, we subtract RSS of files and shmem to estimate the peak of RSS of anonymous pages.
+#endif
+    return memory;
+}
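
The Linux branch of currentMemoryBytes() above does not check the result of fopen(), and its unsigned subtractions would wrap if the file and shmem counters ever exceeded VmRSS or VmHWM. The following sketch shows one defensive way to do the same parsing. All names (parseStatusKB, saturatingSubtract, AnonymousRSS, readAnonymousRSS) are hypothetical, and using fgets() with a fixed buffer instead of getline() is a swapped-in technique that sidesteps the system-malloc concern noted in the patch's comment.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: if `line` starts with `key` (e.g. "VmRSS:"), parse the
// number that follows, which /proc/self/status reports in kB, and store it
// in bytes. Returns false when the line does not match.
bool parseStatusKB(const char* line, const char* key, unsigned long& bytes)
{
    std::size_t keyLength = std::strlen(key);
    if (std::strncmp(line, key, keyLength) != 0)
        return false;

    char* end = nullptr;
    unsigned long kb = std::strtoul(line + keyLength, &end, 10);
    if (end == line + keyLength)
        return false; // no digits after the key

    bytes = kb * 1024;
    return true;
}

// Unsigned subtraction that clamps at zero instead of wrapping around.
unsigned long saturatingSubtract(unsigned long a, unsigned long b)
{
    return a > b ? a - b : 0;
}

struct AnonymousRSS {
    unsigned long current = 0;
    unsigned long peak = 0;
    bool valid = false; // false when the status file could not be opened
};

// Estimates anonymous RSS the same way the patch does: VmRSS and VmHWM
// minus the file-backed and shmem portions.
AnonymousRSS readAnonymousRSS(const char* path = "/proc/self/status")
{
    AnonymousRSS result;
    std::FILE* file = std::fopen(path, "r");
    if (!file)
        return result;

    unsigned long vmHWM = 0, vmRSS = 0, rssFile = 0, rssShmem = 0;
    char line[256]; // fixed buffer: no allocation while measuring
    while (std::fgets(line, sizeof(line), file)) {
        if (parseStatusKB(line, "VmHWM:", vmHWM))
            continue;
        if (parseStatusKB(line, "VmRSS:", vmRSS))
            continue;
        if (parseStatusKB(line, "RssFile:", rssFile))
            continue;
        parseStatusKB(line, "RssShmem:", rssShmem);
    }
    std::fclose(file);

    unsigned long shared = rssFile + rssShmem;
    result.current = saturatingSubtract(vmRSS, shared);
    result.peak = saturatingSubtract(vmHWM, shared);
    result.valid = true;
    return result;
}
```

On kernels that do not report RssFile and RssShmem, both counters stay at zero and the estimate degrades to plain VmRSS and VmHWM.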
diff --git a/PerformanceTests/MallocBench/MallocBench/Memory.h b/PerformanceTests/MallocBench/MallocBench/Memory.h
new file mode 100644 (file)
index 0000000..5c7e91e
--- /dev/null
@@ -0,0 +1,53 @@
+/*
+ * Copyright (C) 2014 Apple Inc. All rights reserved.
+ * Copyright (C) 2017 Yusuke Suzuki <utatane.tea@gmail.com>
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#pragma once
+
+#include <cstddef>
+
+struct Memory {
+    Memory()
+        : resident()
+        , residentMax()
+    {
+    }
+
+    Memory(size_t resident, size_t residentMax)
+        : resident(resident)
+        , residentMax(residentMax)
+    {
+    }
+
+    Memory operator-(const Memory& other)
+    {
+        return Memory(resident - other.resident, residentMax - other.residentMax);
+    }
+
+    size_t resident;
+    size_t residentMax;
+};
+
+Memory currentMemoryBytes();
index 8c0d43f..070e860 100644 (file)
  * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
  */
 
-#include "Benchmark.h"
 #include "CPUCount.h"
+#include "Memory.h"
 #include "balloon.h"
 #include <array>
 #include <chrono>
 #include <memory>
 #include <stddef.h>
+#include <strings.h>
 
 #include "mbmalloc.h"
 
@@ -61,7 +62,7 @@ void benchmark_balloon(CommandLine&)
     // Converts bytes to time -- for reporting's sake -- by waiting a while until
     // the heap shrinks back down. This isn't great for pooling with other
     // benchmarks in a geometric mean of throughput, but it's OK for basic testing.
-    while (Benchmark::currentMemoryBytes().resident > 2 * steadySize
+    while (currentMemoryBytes().resident > 2 * steadySize
         && std::chrono::steady_clock::now() - start < 8 * benchmarkTime) {
         for (size_t i = 0; i < steady.size(); ++i) {
             steady[i] = mbmalloc(chunkSize);
index 62537b8..6a89e64 100644 (file)
 #include <limits>
 #include <stdio.h>
 #include <stdlib.h>
+
+#ifdef __APPLE__
 #import <malloc/malloc.h>
+#else
+#include <malloc.h>
+#endif
 
 extern "C" {
 
@@ -38,7 +43,8 @@ void* mbmalloc(size_t size)
 void* mbmemalign(size_t alignment, size_t size)
 {
     void* result;
-    posix_memalign(&result, alignment, size);
+    if (posix_memalign(&result, alignment, size))
+        return nullptr;
     return result;
 }
 
@@ -54,7 +60,11 @@ void* mbrealloc(void* p, size_t, size_t newSize)
 
 void mbscavenge()
 {
+#ifdef __APPLE__
     malloc_zone_pressure_relief(nullptr, 0);
+#else
+    malloc_trim(0);
+#endif
 }
 
 } // extern "C"
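
The mbmemalign change above stops returning an uninitialized pointer when posix_memalign fails. A minimal sketch of the same check, assuming a POSIX system; alignedAllocOrNull and isAligned are hypothetical names for illustration and are not part of the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>

// Hypothetical helper mirroring the patched mbmemalign: posix_memalign
// returns nonzero on failure and leaves the out-pointer unspecified,
// so the result is only used when the call reports success.
void* alignedAllocOrNull(std::size_t alignment, std::size_t size)
{
    void* result = nullptr;
    if (posix_memalign(&result, alignment, size))
        return nullptr;
    return result;
}

// Hypothetical helper: true when p is a multiple of alignment.
bool isAligned(const void* p, std::size_t alignment)
{
    assert(alignment);
    return !(reinterpret_cast<std::uintptr_t>(p) % alignment);
}
```

A caller would test the returned pointer before use and release it with free(). An alignment that is not a power of two (3, for example) makes posix_memalign fail, and the helper then returns nullptr.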
index 1b5f9d0..7633f9f 100644 (file)
 
 #include "CPUCount.h"
 #include "message.h"
-#include <dispatch/dispatch.h>
+#include <condition_variable>
+#include <deque>
+#include <functional>
+#include <mutex>
 #include <stdlib.h>
 #include <strings.h>
+#include <thread>
 
 #include "mbmalloc.h"
 
@@ -110,6 +114,68 @@ private:
 
 } // namespace
 
+class WorkQueue {
+public:
+    WorkQueue()
+    {
+        m_thread = std::thread([&] {
+            while (true) {
+                std::function<void()> target;
+                {
+                    std::unique_lock<std::mutex> locker(m_mutex);
+                    m_condition.wait(locker, [&] { return !m_queue.empty(); });
+                    auto queued = std::move(m_queue.front());
+                    m_queue.pop_front();
+                    if (!queued)
+                        return;
+                    target = std::move(queued);
+                }
+                target();
+            }
+        });
+    }
+
+    ~WorkQueue() {
+        {
+            std::unique_lock<std::mutex> locker(m_mutex);
+            m_queue.push_back(nullptr);
+            m_condition.notify_one();
+        }
+        m_thread.join();
+    }
+
+    void dispatchAsync(std::function<void()> target)
+    {
+        std::unique_lock<std::mutex> locker(m_mutex);
+        m_queue.push_back(std::move(target));
+        m_condition.notify_one();
+    }
+
+    void dispatchSync(std::function<void()> target)
+    {
+        std::mutex syncMutex;
+        std::condition_variable syncCondition;
+
+        std::unique_lock<std::mutex> locker(syncMutex);
+        bool done = false;
+        dispatchAsync([&] {
+            target();
+            {
+                std::unique_lock<std::mutex> locker(syncMutex);
+                done = true;
+                syncCondition.notify_one();
+            }
+        });
+        syncCondition.wait(locker, [&] { return done; });
+    }
+
+private:
+    std::mutex m_mutex;
+    std::condition_variable m_condition;
+    std::deque<std::function<void()>> m_queue;
+    std::thread m_thread;
+};
+
 void benchmark_message_one(CommandLine& commandLine)
 {
     if (commandLine.isParallel())
@@ -118,24 +184,20 @@ void benchmark_message_one(CommandLine& commandLine)
     const size_t times = 2048;
     const size_t quantum = 16;
 
-    dispatch_queue_t queue = dispatch_queue_create("message", 0);
-
+    WorkQueue workQueue;
     for (size_t i = 0; i < times; i += quantum) {
         for (size_t j = 0; j < quantum; ++j) {
             Message* message = new Message;
-            dispatch_async(queue, ^{
+            workQueue.dispatchAsync([message] {
                 size_t hash = message->hash();
                 if (hash)
                     abort();
                 delete message;
             });
         }
-        dispatch_sync(queue, ^{ });
+        workQueue.dispatchSync([] { });
     }
-
-    dispatch_sync(queue, ^{ });
-
-    dispatch_release(queue);
+    workQueue.dispatchSync([] { });
 }
 
 void benchmark_message_many(CommandLine& commandLine)
@@ -147,15 +209,15 @@ void benchmark_message_many(CommandLine& commandLine)
     const size_t quantum = 16;
 
     const size_t queueCount = cpuCount() - 1;
-    dispatch_queue_t queues[queueCount];
+    std::unique_ptr<WorkQueue> queues[queueCount];
     for (size_t i = 0; i < queueCount; ++i)
-        queues[i] = dispatch_queue_create("message", 0);
+        queues[i] = std::make_unique<WorkQueue>();
 
     for (size_t i = 0; i < times; i += quantum) {
         for (size_t j = 0; j < quantum; ++j) {
             for (size_t k = 0; k < queueCount; ++k) {
                 Message* message = new Message;
-                dispatch_async(queues[k], ^{
+                queues[k]->dispatchAsync([message] {
                     size_t hash = message->hash();
                     if (hash)
                         abort();
@@ -165,12 +227,9 @@ void benchmark_message_many(CommandLine& commandLine)
         }
 
         for (size_t i = 0; i < queueCount; ++i)
-            dispatch_sync(queues[i], ^{ });
+            queues[i]->dispatchSync([] { });
     }
 
     for (size_t i = 0; i < queueCount; ++i)
-        dispatch_sync(queues[i], ^{ });
-
-    for (size_t i = 0; i < queueCount; ++i)
-        dispatch_release(queues[i]);
+        queues[i]->dispatchSync([] { });
 }
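
The WorkQueue above replaces a libdispatch serial queue with one worker thread draining a FIFO, using an empty std::function as the shutdown sentinel. The following is a standalone sketch of that pattern, not the code from the patch: SerialQueue and runInOrder are hypothetical names, and the sketch assumes only the C++11 standard library.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the patch's WorkQueue: one worker thread runs
// queued tasks in FIFO order; an empty std::function tells it to exit.
class SerialQueue {
public:
    SerialQueue()
        : m_thread([this] { run(); })
    {
    }

    ~SerialQueue()
    {
        dispatchAsync(nullptr); // Sentinel: runs after all earlier tasks.
        m_thread.join();
    }

    void dispatchAsync(std::function<void()> task)
    {
        {
            std::lock_guard<std::mutex> locker(m_mutex);
            m_queue.push_back(std::move(task));
        }
        m_condition.notify_one();
    }

    // Blocks until the task, and therefore every task queued before it, has run.
    void dispatchSync(std::function<void()> task)
    {
        std::mutex syncMutex;
        std::condition_variable syncCondition;
        bool done = false;
        dispatchAsync([&] {
            task();
            // Notify while holding the lock so the waiter cannot destroy
            // syncCondition before notify_one() has returned.
            std::lock_guard<std::mutex> locker(syncMutex);
            done = true;
            syncCondition.notify_one();
        });
        std::unique_lock<std::mutex> locker(syncMutex);
        syncCondition.wait(locker, [&] { return done; });
    }

private:
    void run()
    {
        while (true) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> locker(m_mutex);
                m_condition.wait(locker, [this] { return !m_queue.empty(); });
                task = std::move(m_queue.front());
                m_queue.pop_front();
            }
            if (!task)
                return;
            task(); // Run outside the lock so producers are not blocked.
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_condition;
    std::deque<std::function<void()>> m_queue;
    std::thread m_thread; // Declared last so the other members exist before the worker starts.
};

// Hypothetical driver: queues count tasks that record their index, then drains the queue.
std::vector<std::size_t> runInOrder(std::size_t count)
{
    std::vector<std::size_t> order;
    SerialQueue queue;
    for (std::size_t i = 0; i < count; ++i)
        queue.dispatchAsync([&order, i] { order.push_back(i); });
    queue.dispatchSync([] { });
    assert(order.size() == count);
    return order;
}
```

Because a single thread drains the queue, tasks run in submission order, which is the property the benchmark relies on when it uses an empty dispatchSync as a barrier after each batch of messages.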
index ae1eb37..335f07e 100644 (file)
@@ -51,6 +51,6 @@ void benchmark_nimlang(CommandLine& commandLine)
     for (size_t i = 0; i < times; ++i)
         interpreter.run();
 
-        if (commandLine.detailedReport())
-            interpreter.detailedReport();
+    if (commandLine.detailedReport())
+        interpreter.detailedReport();
 }
index 6bc1ba3..1c14747 100644 (file)
@@ -27,6 +27,7 @@
 #include "CPUCount.h"
 #include "stress.h"
 #include <array>
+#include <cassert>
 #include <chrono>
 #include <cstdlib>
 #include <memory>
@@ -86,6 +87,8 @@ public:
             return random() % largeMax;
         }
         }
+        assert(0);
+        return 0;
     }
 
 private:
index 49b91bc..89c8ea2 100644 (file)
@@ -27,6 +27,7 @@
 #include "CPUCount.h"
 #include "stress_aligned.h"
 #include <array>
+#include <cassert>
 #include <chrono>
 #include <cmath>
 #include <cstdlib>
@@ -89,6 +90,8 @@ public:
             return random() % largeMax;
         }
         }
+        assert(0);
+        return 0;
     }
 
 private:
index b5bd0a0..eeda5d9 100755 (executable)
@@ -7,6 +7,29 @@ require 'pathname'
 $binDir = "#{File.expand_path(File.dirname(__FILE__))}"
 $productDir = `perl -e 'use lib \"#{$binDir}/../../Tools/Scripts\"; use webkitdirs; print productDir()'`
 
+def determineOS
+    case RbConfig::CONFIG["host_os"]
+    when /darwin/i
+        "darwin"
+    when /linux/i
+        "linux"
+    when /mswin|mingw|cygwin/
+        "windows"
+    else
+        $stderr.puts "Warning: unable to determine host operating system"
+        nil
+    end
+end
+
+$hostOS = determineOS unless $hostOS
+$cmake = false
+
+if $hostOS == 'darwin'
+    $libraryExtension = "dylib"
+else
+    $libraryExtension = "so"
+end
+
 $benchmarks_all = [
     # Single-threaded benchmarks.
     "churn",
@@ -91,7 +114,8 @@ def usage
        puts "Options:"
     puts
     puts "    --benchmark <benchmark>      Select a single benchmark to run instead of the full suite."
-    puts "    --heap <heap>           Set a baseline heap size."
+    puts "    --heap <heap>                Set a baseline heap size."
+    puts "    --cmake                      Use the CMake layout of the build product directory."
     puts
 end
 
@@ -101,7 +125,7 @@ class Dylib
 
     def initialize(name, path)
         @name = name
-        @path = File.join(path, "libmbmalloc.dylib")
+        @path = File.join(path, "libmbmalloc.#{$libraryExtension}")
     end
 end
 
@@ -214,6 +238,7 @@ end
 def parseOptions
     GetoptLong.new(
         ['--benchmark', GetoptLong::REQUIRED_ARGUMENT],
+        ['--cmake', GetoptLong::NO_ARGUMENT],
         ['--memory', GetoptLong::NO_ARGUMENT],
         ['--memory_warning', GetoptLong::NO_ARGUMENT],
         ['--heap', GetoptLong::REQUIRED_ARGUMENT],
@@ -227,6 +252,8 @@ def parseOptions
             $benchmarks = $benchmarks_memory
         when '--memory_warning'
             $benchmarks = $benchmarks_memory_warning
+        when '--cmake'
+            $cmake = true
         when '--heap'
             $heap = arg
         when '--help'
@@ -237,6 +264,16 @@ def parseOptions
         end
     }
 
+    if $cmake
+        $libraryDir = "#{$productDir}/lib"
+        $systemMallocLibraryDir = "#{$productDir}/lib/system"
+        $binaryDir = "#{$productDir}/bin"
+    else
+        $libraryDir = $productDir
+        $binaryDir = $productDir
+        $systemMallocLibraryDir = $productDir
+    end
+
     if ARGV.length < 1
         puts "Error: No dylib specified."
         exit 1
@@ -246,9 +283,9 @@ def parseOptions
     ARGV.each {
         | arg |
         if arg == "SystemMalloc"
-            dylib = Dylib.new("SystemMalloc", $productDir)
+            dylib = Dylib.new("SystemMalloc", $systemMallocLibraryDir)
         elsif arg == "NanoMalloc"
-            dylib = Dylib.new("NanoMalloc", $productDir)
+            dylib = Dylib.new("NanoMalloc", $libraryDir)
         else
             name = arg.split(":")[0]
             path = arg.split(":")[1]
@@ -288,10 +325,11 @@ def runBenchmarks(dylibs)
 
             $stderr.print "\rRUNNING #{dylib.name}: #{benchmark}...                                "
             env = "DYLD_LIBRARY_PATH='#{Pathname.new(dylib.path).dirname}' "
+            env += "LD_LIBRARY_PATH='#{Pathname.new(dylib.path).dirname}' "
             if dylib.name == "NanoMalloc"
                 env += "MallocNanoZone=1 "
             end
-            input = "cd '#{$productDir}'; #{env} '#{$productDir}/MallocBench' --benchmark #{benchmark} --heap #{$heap}}"
+            input = "cd '#{$productDir}'; #{env} '#{$binaryDir}/MallocBench' --benchmark #{benchmark} --heap #{$heap}"
             output =`#{input}`
             splitOutput = output.split("\n")
 
index bf604ce..3fab69f 100644 (file)
@@ -37,3 +37,7 @@ include_directories(${bmalloc_INCLUDE_DIRECTORIES})
 add_library(bmalloc STATIC ${bmalloc_SOURCES})
 target_link_libraries(bmalloc ${bmalloc_LIBRARIES})
 set_target_properties(bmalloc PROPERTIES COMPILE_DEFINITIONS "BUILDING_bmalloc")
+
+add_library(mbmalloc SHARED bmalloc/mbmalloc.cpp)
+target_link_libraries(mbmalloc bmalloc ${CMAKE_THREAD_LIBS_INIT} ${bmalloc_LIBRARIES})
+set_target_properties(mbmalloc PROPERTIES COMPILE_DEFINITIONS "BUILDING_mbmalloc")
index d8dac47..ad0ac34 100644 (file)
@@ -1,3 +1,12 @@
+2017-10-05  Yusuke Suzuki  <utatane.tea@gmail.com>
+
+        [Linux] Port MallocBench
+        https://bugs.webkit.org/show_bug.cgi?id=177856
+
+        Reviewed by Filip Pizlo.
+
+        * CMakeLists.txt:
+
 2017-10-04  Filip Pizlo  <fpizlo@apple.com>
 
         bmalloc mutex should be adaptive