Performance testing - best practices?


#1

(on behalf of the CLR perf team)
We are looking for best practices on how to do performance testing in open source, cross-platform projects.
We have a bunch of home-grown reusable practices, but they depend heavily on monitoring on certified dedicated machines in our perf lab (i.e. no interference from other processes or disk I/O).

Any pointers on how other open source projects handle perf test suites in their CI systems?

What we do today in-house:
There’s a set of tests. Many of them are microbenchmarks, which are the easiest: each microbenchmark is wrapped in a runner that warms up the scenario, then runs it 5 times and measures the time. It prints the result with some basic statistics to the output/log. A tool then parses all the logs and displays them (through a DB) in an HTML-based UI with history (graphs for trends, etc.). A simplified sketch of such a runner is shown below.
Currently we run it on the same dedicated machines, so results are comparable and one can reason about changes over time.
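For illustration, here is a rough sketch of the warm-up / measure / report pattern the runner follows (the type, method names, and output format are made up for this example; it is not our actual harness):

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical example runner, not the real in-house harness.
static class MicroBenchmarkRunner
{
    public static void Run(string name, Action scenario, int iterations = 5)
    {
        // Warm up: let the JIT compile the code paths and caches settle before measuring.
        scenario();

        var samplesMs = new double[iterations];
        for (int i = 0; i < iterations; i++)
        {
            var sw = Stopwatch.StartNew();
            scenario();
            sw.Stop();
            samplesMs[i] = sw.Elapsed.TotalMilliseconds;
        }

        // Print basic statistics in a single line that a log parser can pick up later.
        Console.WriteLine(
            $"{name}: min={samplesMs.Min():F3}ms avg={samplesMs.Average():F3}ms max={samplesMs.Max():F3}ms");
    }
}
```

A benchmark then boils down to something like `MicroBenchmarkRunner.Run("MyScenario", () => DoWork())`, where `DoWork` stands in for the measured code.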

What we could do:
Keep running tests on dedicated machines and publish results somewhere. Is that the best practice?
To support the dev scenario on a local box, we could generalize the infrastructure to run on any machine and provide a tool that compares two runs (baseline vs. PR), so any dev can check the performance impact prior to submitting a PR if there’s perf risk.
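A rough sketch of what such a comparison tool could look like, assuming the hypothetical log format from the runner sketch above:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical comparison tool; assumes log lines in the form
// "<benchmark>: min=1.234ms avg=2.345ms max=3.456ms".
static class CompareRuns
{
    static Dictionary<string, double> ParseAverages(string path)
    {
        var averages = new Dictionary<string, double>();
        var pattern = new Regex(@"^(?<name>.+?): .*avg=(?<avg>[0-9.]+)ms");
        foreach (var line in File.ReadLines(path))
        {
            var match = pattern.Match(line);
            if (match.Success)
                averages[match.Groups["name"].Value] =
                    double.Parse(match.Groups["avg"].Value, CultureInfo.InvariantCulture);
        }
        return averages;
    }

    // Usage: CompareRuns <baseline.log> <pr.log>
    static void Main(string[] args)
    {
        var baseline = ParseAverages(args[0]);
        var pr = ParseAverages(args[1]);
        foreach (var entry in baseline)
        {
            if (pr.TryGetValue(entry.Key, out double prAvg))
            {
                // Ratio > 1.0 means the PR run was slower than the baseline.
                Console.WriteLine(
                    $"{entry.Key}: baseline={entry.Value:F3}ms PR={prAvg:F3}ms ratio={prAvg / entry.Value:F2}x");
            }
        }
    }
}
```

On a noisy dev box the ratio would only be a rough signal; the dedicated machines would remain the source of truth.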


#2

Keep running tests on dedicated machines and publish results somewhere. Is that the best practice?

Sounds like a perfectly fine approach to me. E.g. here’s a blog post from the HHVM team showing some of their benchmarks. I couldn’t find out whether they publish this somewhere publicly for every commit/test run, so if you do that, it’s even better.

The Mono team also publishes GC perf test results; see the docs and an example result.

To support the dev scenario on a local box, we could generalize the infrastructure to run on any machine and provide a tool that compares two runs (baseline vs. PR), so any dev can check the performance impact prior to submitting a PR if there’s perf risk.

:+1:

Another interesting thing the HHVM team is doing is including major OSS projects in perf runs, e.g. https://github.com/hhvm/oss-performance. I’m not sure if you’re already doing something similar internally, but I’d imagine seeing how a change impacts a complex application/framework like Orchard, RavenDB, Nancy, ServiceStack, etc. could be very helpful.


#3

Would it be a problem to list the hardware (“certified dedicated machines”)?

So if anyone wants to run the performance tests, this would be one of the conditions for a meaningful comparison.


#4

Slightly OT, but would you ever consider releasing your micro-benchmark tools so that others can use them?


#5

@akoeplinger Thanks for your suggestions, we will look into the resources you sent.

I don’t mind listing the HW on the results page. However, I don’t expect anyone will go as far as building a similar machine just for testing (a waste of money). So it is more informative than useful for other devs. Or did you have anything else in mind?

Sounds like a good idea; I don’t see any reason why we should not also release the tooling.


#7

I expected that the hardware is specialized, high-end, and thus expensive.

Thanks for your answer.


#8

Thanks, that’s cool. I’ll keep an eye out for an announcement about it.


#9

Just to set the right expectations: it will likely take a few months before we get to it, but we will definitely get to it at some point.

