Benchmarking your code can take many forms. At one end, Application Performance Monitoring (APM) solutions such as New Relic can be considered live benchmarking tools if you are using A/B testing. At the other end is wrapping a stopwatch object around your code and running it inside a loop. This article looks at the latter end of that spectrum. Benchmarking specific lines of code, either against each other or on their own, gives you performance metrics that can be extremely important in understanding how your code will run at scale.
While wrapping your code in a timer and running it a few hundred times is a good start, it’s not exactly reliable. There are far too many pitfalls that can completely skew your results. Luckily there is always a NuGet package to cover you! That package in this case is BenchmarkDotNet. It takes care of things like warming up your code, isolating benchmarks from each other, and giving you metrics on code performance. Let’s jump straight in!
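For context, the “timer in a loop” approach mentioned above might look something like the sketch below. The NaiveTimer class and its method name are made up for illustration. It is deliberately naive: there is no warm-up, no isolation between measurements and no statistics, which is exactly the gap BenchmarkDotNet fills.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of BenchmarkDotNet.
public static class NaiveTimer
{
    // Runs the action the given number of times and returns the
    // average elapsed time per call in milliseconds.
    // The first call also pays for JIT compilation, so it skews the average.
    public static double AverageMilliseconds(Action action, int iterations)
    {
        var stopwatch = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();
        return stopwatch.Elapsed.TotalMilliseconds / iterations;
    }
}
```

You would call it with something like NaiveTimer.AverageMilliseconds(() => DoWork(), 100), where DoWork stands in for whatever code you want to time.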
Code Benchmarking
Code Benchmarking is when you want to compare two pieces of code/methods against each other. It’s a great way to quantify a code rewrite or refactor and it’s going to be the most common use case for BenchmarkDotNet.
To get started, create a blank .NET Core console application. Now, most of this “should” work when using .NET Full Framework too, but I’ll be doing everything here in .NET Core.
Next you need to run the following from your Package Manager console to install the BenchmarkDotNet NuGet package :
Install-Package BenchmarkDotNet
Next we need to build up our code. For this we are going to use a classic “needle in a haystack”. We are going to build up a large list in C# with random items within it, and place a “needle” right in the middle of the list. Then we will compare how doing “SingleOrDefault” on a list compares to “FirstOrDefault”. Here is our complete code :
using System;
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

namespace BenchmarkExample
{
    public class SingleVsFirst
    {
        private readonly List<string> _haystack = new List<string>();
        private readonly int _haystackSize = 1000000;
        private readonly string _needle = "needle";

        public SingleVsFirst()
        {
            //Add a large amount of items to our list.
            Enumerable.Range(1, _haystackSize).ToList().ForEach(x => _haystack.Add(x.ToString()));
            //Insert the needle right in the middle.
            _haystack.Insert(_haystackSize / 2, _needle);
        }

        [Benchmark]
        public string Single() => _haystack.SingleOrDefault(x => x == _needle);

        [Benchmark]
        public string First() => _haystack.FirstOrDefault(x => x == _needle);
    }

    class Program
    {
        static void Main(string[] args)
        {
            var summary = BenchmarkRunner.Run<SingleVsFirst>();
            Console.ReadLine();
        }
    }
}
Walking through this a bit, first we create a class to hold our benchmarks within it. This can contain any number of private methods and can include setup code within the constructor. Any code within the constructor is not included in the timing of the method. We can then create public methods and add the attribute of [Benchmark] to have them listed as items that should be compared and benchmarked.
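As an alternative to the constructor, BenchmarkDotNet also has a [GlobalSetup] attribute for a method that runs before the benchmark is measured. The sketch below is a variation on the class above, not the code used for the results in this article, and the class name SingleVsFirstWithSetup is made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;

// Hypothetical variation of SingleVsFirst that uses [GlobalSetup].
public class SingleVsFirstWithSetup
{
    private const int HaystackSize = 1000000;
    private const string Needle = "needle";
    private List<string> _haystack;

    // Builds the haystack outside of the timed benchmark methods.
    [GlobalSetup]
    public void Setup()
    {
        _haystack = Enumerable.Range(1, HaystackSize).Select(x => x.ToString()).ToList();
        // Insert the needle right in the middle.
        _haystack.Insert(HaystackSize / 2, Needle);
    }

    [Benchmark]
    public string Single() => _haystack.SingleOrDefault(x => x == Needle);

    [Benchmark]
    public string First() => _haystack.FirstOrDefault(x => x == Needle);
}
```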
Finally inside our main method of our console application, we use the “BenchmarkRunner” class to run our benchmark.
A word of caution when running the benchmarking tool: it must be built in “Release” mode and run from the command line. You should not rely on benchmarks run from inside Visual Studio, as this attaches a debugger and the code is not compiled as “optimized”. To run from the command line, head to your application’s bin/Release/netcoreappxx/ folder, then run dotnet {YourDLLName}.dll
And the results?
 Method |      Mean |    StdDev |    Median |
------- |----------:|----------:|----------:|
 Single | 15.591 ms | 0.4429 ms | 15.507 ms |
  First |  7.638 ms | 0.4399 ms |  7.475 ms |
So it looks like Single is twice as slow as First! If you understand what Single does under the hood, this is to be expected. When First finds an item, it immediately returns (After all, it only wants the “First” item). However when Single finds an item, it still needs to traverse the entire rest of the list because if there is more than one, it needs to throw an exception. This makes sense when we are placing the item in the middle of the list!
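To make the difference concrete, here is a simplified sketch of the two behaviours. These are not the real LINQ implementations, and the names SimplifiedLinq, MyFirstOrDefault and MySingleOrDefault are made up for illustration. They only show why one can stop early and the other cannot.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-ins for the LINQ methods.
public static class SimplifiedLinq
{
    public static T MyFirstOrDefault<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (var item in source)
        {
            // Stop as soon as the first match is found.
            if (predicate(item)) return item;
        }
        return default(T);
    }

    public static T MySingleOrDefault<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        var found = false;
        var result = default(T);
        foreach (var item in source)
        {
            if (!predicate(item)) continue;
            // A second match means the "single" contract is broken.
            if (found) throw new InvalidOperationException("More than one matching element.");
            found = true;
            result = item;
            // No early return: the rest of the list must still be checked.
        }
        return result;
    }
}
```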
Input Benchmarking
Let’s say that we’ve found Single is slower than First, and we have a theory on why that is (that Single needs to continue through the list). We then need a way to try different “configurations” without having to re-run the test by hand with minor details changed. For that we can use the “Input” feature of BenchmarkDotNet, which feeds different parameter values into the same benchmark.
Let’s modify our code a bit :
using System;
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

namespace BenchmarkExample
{
    public class SingleVsFirst
    {
        private readonly List<string> _haystack = new List<string>();
        private readonly int _haystackSize = 1000000;

        public List<string> _needles => new List<string> { "StartNeedle", "MiddleNeedle", "EndNeedle" };

        public SingleVsFirst()
        {
            //Add a large amount of items to our list.
            Enumerable.Range(1, _haystackSize).ToList().ForEach(x => _haystack.Add(x.ToString()));
            //One at the start.
            _haystack.Insert(0, _needles[0]);
            //One right in the middle.
            _haystack.Insert(_haystackSize / 2, _needles[1]);
            //One at the very end. (Insert at Count - 1 would leave it second to last.)
            _haystack.Add(_needles[2]);
        }

        [ParamsSource(nameof(_needles))]
        public string Needle { get; set; }

        [Benchmark]
        public string Single() => _haystack.SingleOrDefault(x => x == Needle);

        [Benchmark]
        public string First() => _haystack.FirstOrDefault(x => x == Needle);
    }

    class Program
    {
        static void Main(string[] args)
        {
            var summary = BenchmarkRunner.Run<SingleVsFirst>();
            Console.ReadLine();
        }
    }
}
What we have done here is create a “_needles” property to hold the different needles we may wish to find, and we’ve inserted them at different indexes within our list. We then create a “Needle” property with the “ParamsSource” attribute. This tells BenchmarkDotNet to rotate through these values and run a separate benchmark for each one.
A big tip is that the ParamsSource member must be public, and it must be a property or a method. It cannot be a field.
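ParamsSource can also point at a public method instead of a property. Below is a minimal sketch of that form, using a much smaller haystack. The class name FirstByPosition and the needle values are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;

// Hypothetical example: parameter values supplied by a public method.
public class FirstByPosition
{
    // A small list of "1" through "1000", so the needles below sit at
    // the start, the middle and the end.
    private readonly List<string> _haystack =
        Enumerable.Range(1, 1000).Select(x => x.ToString()).ToList();

    // Must be public so that BenchmarkDotNet can find it.
    public IEnumerable<string> Needles() => new[] { "1", "500", "1000" };

    [ParamsSource(nameof(Needles))]
    public string Needle { get; set; }

    [Benchmark]
    public string First() => _haystack.FirstOrDefault(x => x == Needle);
}
```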
Running this, our report now looks like so :
 Method |       Needle |             Mean |           StdDev |
------- |------------- |-----------------:|-----------------:|
 Single |    EndNeedle | 19,741,752.75 ns | 1,078,431.672 ns |
  First |    EndNeedle | 18,422,088.07 ns |   998,023.064 ns |
 Single | MiddleNeedle | 19,326,424.98 ns | 1,356,796.153 ns |
  First | MiddleNeedle |  9,586,518.55 ns |   649,534.186 ns |
 Single |  StartNeedle | 18,509,550.74 ns | 1,113,976.063 ns |
  First |  StartNeedle |         77.90 ns |         7.782 ns |
It’s a little harder to see because we are now down to nanoseconds based on the time that “First” takes to return the StartNeedle. But the results are very clear.
When Single is run, the time it takes to return the needle is the same regardless of where it is in the list. Whereas First’s response time is totally dependent on where the item is in the list.
The Input feature can be a huge help in understanding how or why applications slow down given different inputs. For example, does your password hashing function get slower when passwords are longer? Or is it not a factor at all?
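As a rough sketch of how the password question could be set up, the class below uses the [Params] attribute to list input lengths inline. The class name HashByLength and the chosen lengths are made up for illustration, and a plain SHA-256 hash is used here only as a stand-in for whatever hashing function your application really uses.

```csharp
using System.Security.Cryptography;
using System.Text;
using BenchmarkDotNet.Attributes;

// Hypothetical example: does hashing time depend on input length?
public class HashByLength
{
    private byte[] _passwordBytes;

    // Each value becomes a separate row in the report.
    [Params(8, 64, 1024)]
    public int PasswordLength { get; set; }

    // Builds a dummy password of the requested length before measuring.
    [GlobalSetup]
    public void Setup()
    {
        _passwordBytes = Encoding.UTF8.GetBytes(new string('a', PasswordLength));
    }

    // SHA-256 is only a stand-in for a real password hashing function.
    [Benchmark]
    public byte[] Hash()
    {
        using (var sha256 = SHA256.Create())
        {
            return sha256.ComputeHash(_passwordBytes);
        }
    }
}
```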
Creating A Baseline
One last helpful tip that does nothing more than create a nice little “multiplier” on the report is to mark one of your benchmarks as the “baseline”. If we go back to our first example (Without Inputs), we just need to mark one of our Benchmarks as baseline like so : [Benchmark(Baseline = true)]
Now when we run our test with “First” marked as the baseline, the output now looks like :
 Method |     Mean | Scaled |
------- |---------:|-------:|
 Single | 22.77 ms |   1.99 |
  First | 11.42 ms |   1.00 |
So now it’s easier to see the “factor” by which our other methods are slower (Or faster). In this case our Single call is almost twice as slow as the First call.
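For reference, the benchmark class from the first example with “First” marked as the baseline might look like the sketch below. The class name SingleVsFirstBaseline is made up for illustration, and it is run with BenchmarkRunner in the same way as before.

```csharp
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;

// Hypothetical name; same haystack as the first example.
public class SingleVsFirstBaseline
{
    private const int HaystackSize = 1000000;
    private const string Needle = "needle";
    private readonly List<string> _haystack;

    public SingleVsFirstBaseline()
    {
        // Build the list and place the needle right in the middle.
        _haystack = Enumerable.Range(1, HaystackSize).Select(x => x.ToString()).ToList();
        _haystack.Insert(HaystackSize / 2, Needle);
    }

    [Benchmark]
    public string Single() => _haystack.SingleOrDefault(x => x == Needle);

    // Other benchmarks in this class are reported relative to this one.
    [Benchmark(Baseline = true)]
    public string First() => _haystack.FirstOrDefault(x => x == Needle);
}
```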
What Have You Benchmarked?
As programmers we love seeing how a tiny little change can improve performance by leaps and bounds. If you’ve used BenchmarkDotNet to benchmark a piece of code and you’ve been amazed by the results, drop a comment below with a GitHub Gist of the code and a little about what was slow and why!
Is it possible to benchmark a class in a big project without creating a new solution?
I am asking because this project is connected to several other projects that provide the data it exports.