* Add more test cases. Categories we'd like to cover (with reasonably
  real-world tests, preferably not microbenchmarks) include:
  - function calls / recursion (see the sketch below)
  - object access (unclear if it is possible to make a realistic
    benchmark that isolates this)

  I'd specifically like to add all the Computer Language Shootout
  tests that Mozilla is using.
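
  As a very rough sketch of the shape a test file in the function
  calls / recursion category could take (this particular example is a
  shootout-style microbenchmark, so a real test should do more
  realistic work):

      // Hypothetical sketch only; file layout and repeat count are
      // illustrative, not the benchmark's actual conventions.
      function ackermann(m, n) {
          if (m == 0)
              return n + 1;
          if (n == 0)
              return ackermann(m - 1, 1);
          return ackermann(m - 1, ackermann(m, n - 1));
      }

      // Repeat count of some sort, tunable for normalization below.
      for (var i = 0; i < 10; i++)
          ackermann(3, 5);
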
* Normalize tests. Most of the test cases available have a repeat
  count of some sort, so the time they take can be tuned. The tests
  should be tuned so that each category contributes about the same
  total, and so that each test in each category contributes about the
  same amount. The question is, what implementation should be the
  baseline? My current thought is to either pick some specific browser
  on a specific platform (IE 7 or Firefox 2, perhaps), or to target
  the average that some set of same-generation release browsers get on
  each test. The latter is more work. IE 7 is probably a reasonable
  normalization target since it is the latest version of the most
  popular browser, so results on this benchmark will tell you how much
  you have to gain or lose by using a different browser.
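
  A hypothetical sketch of how the tuning could be computed, assuming
  per-test times measured on the normalization target with a repeat
  count of 1 (the test names and numbers below are made up):

      // Given baseline times in ms, pick repeat counts so every test
      // contributes roughly the same amount to the total.
      var baselineTimes = {
          "controlflow-recursive": 5.0,
          "string-fasta": 50.0,
          "math-partial-sums": 12.5
      };
      var targetMsPerTest = 50;  // arbitrary per-test budget

      function computeRepeatCounts(times) {
          var counts = {};
          for (var test in times)
              counts[test] = Math.max(1,
                  Math.round(targetMsPerTest / times[test]));
          return counts;
      }

  Equalizing whole categories would additionally scale each test's
  budget by the number of tests in its category.
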
* Add support to compare two different engines (or two builds of the
  same engine) interleaved.
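
  One plausible way to interleave, sketched hypothetically: generate a
  run schedule that alternates engines on every run, so slow drift in
  machine state affects both engines about equally, and have the
  wrapper script execute it in order:

      // Returns e.g. [A, B, A, B, ...] for two engine binaries.
      function interleavedSchedule(engines, runsPerEngine) {
          var schedule = [];
          for (var run = 0; run < runsPerEngine; run++)
              for (var i = 0; i < engines.length; i++)
                  schedule.push({ engine: engines[i], run: run });
          return schedule;
      }

      // Engine paths are made up for illustration.
      var schedule = interleavedSchedule(
          ["./jsc-baseline", "./jsc-patched"], 10);
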
* Add support to compare two existing sets of saved results.
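
  A minimal sketch of what such a comparison might report, assuming
  each saved result set is an array of per-run total times in ms and
  using a normal approximation for the 95% error margin:

      function mean(xs) {
          var sum = 0;
          for (var i = 0; i < xs.length; i++)
              sum += xs[i];
          return sum / xs.length;
      }

      // Standard error of the mean.
      function stdErr(xs) {
          var m = mean(xs), sumSq = 0;
          for (var i = 0; i < xs.length; i++)
              sumSq += (xs[i] - m) * (xs[i] - m);
          return Math.sqrt(sumSq / (xs.length - 1)) / Math.sqrt(xs.length);
      }

      // Ratio of means with a first-order propagated error margin.
      function compare(before, after) {
          var ratio = mean(after) / mean(before);
          var relErr = Math.sqrt(
              Math.pow(stdErr(before) / mean(before), 2) +
              Math.pow(stdErr(after) / mean(after), 2));
          return ratio.toFixed(3) + "x +/- " +
              (1.96 * ratio * relErr).toFixed(3) + "x";
      }
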
* Allow repeat count to be controlled from the browser-hosted version
  and the WebKitTools wrapper script.
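
  For the browser-hosted version, one hypothetical approach is a query
  parameter override (the parameter name here is made up):

      // e.g. sunspider.html?repeatCount=20
      function getRepeatCount(defaultCount) {
          var match = /[?&]repeatCount=(\d+)/.exec(window.location.search);
          return match ? parseInt(match[1], 10) : defaultCount;
      }
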
* Add support to run only a subset of the tests (both command-line and
  browser-hosted versions).

* Add a profile mode for the command-line version that runs the tests
  repeatedly in the same command-line interpreter instance, for ease
  of profiling.
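
  Sketched hypothetically, the driver for such a mode could just loop
  the whole suite inside one interpreter process, so a sampling
  profiler sees a single long-lived target (this assumes the shell
  provides load(), as the jsc shell does; the driver file name is
  made up):

      var profileIterations = 100;  // illustrative knob
      for (var i = 0; i < profileIterations; i++)
          load("run-all-tests.js");
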
* Make the browser-hosted version prettier, both in its general design
  and perhaps by using bar graphs for the output.

* Make it possible to track change over time and generate a graph per
  result, showing the result and an error bar for each version.

* Hook up to automated testing / buildbot infrastructure.

* Possibly... add the ability to download iBench from its original
  server, pull out the JS test content, preprocess it, and add it as a
  category to the benchmark.