C# Tip: Improve memory allocation by initializing collection size
Sometimes just a minor change can affect performance. Here’s a simple trick: initialize your collections by specifying the initial size!
Table of Contents
Just a second! π«·
If you are here, it means that you are a software developer. So, you know that storage, networking, and domain management have a cost .
If you want to support this blog, please ensure that you have disabled the adblocker for this site. I configured Google AdSense to show as few ADS as possible - I don't want to bother you with lots of ads, but I still need to add some to pay for the resources for my site.
Thank you for your understanding.
- Davide
When you initialize a collection, like a List
, you create it with the default size.
Whenever you add an item to a collection, .NET checks that there is enough capacity to hold the new item. If not, it resizes the collection by doubling the inner capacity.
Resizing the collection takes time and memory.
Therefore, when possible, you should initialize the collection with the expected number of items it will contain.
Initialize a List
In the case of a List
, you can simply replace new List<T>()
with new List<T>(size)
. By specifying the initial size in the constructor’s parameters, you’ll have a good performance improvement.
Let’s create a benchmark using BenchmarkDotNet and .NET 8.0.100-rc.1.23455.8 (at the time of writing, .NET 8 is still in preview. However, we can get an idea of the average performance).
The benchmark is pretty simple:
[MemoryDiagnoser]
public class CollectionWithSizeInitializationBenchmarks
{
[Params(100, 1000, 10000, 100000)]
public int Size;
[Benchmark]
public void WithoutInitialization()
{
List<int> list = new List<int>();
for (int i = 0; i < Size; i++)
{
list.Add(i);
}
}
[Benchmark(Baseline = true)]
public void WithInitialization()
{
List<int> list = new List<int>(Size);
for (int i = 0; i < Size; i++)
{
list.Add(i);
}
}
}
The only difference is in the list initialization: in the WithInitialization
, we have List<int> list = new List<int>(Size);
.
Have a look at the benchmark result, split by time and memory execution.
Starting with the execution time, we can see that without list initialization, we have an average 1.7x performance degradation.
Method | Size | Mean | Ratio |
---|---|---|---|
WithoutInitialization | 100 | 299.659 ns | 1.77 |
WithInitialization | 100 | 169.121 ns | 1.00 |
WithoutInitialization | 1000 | 1,549.343 ns | 1.58 |
WithInitialization | 1000 | 944.862 ns | 1.00 |
WithoutInitialization | 10000 | 16,307.082 ns | 1.80 |
WithInitialization | 10000 | 9,035.945 ns | 1.00 |
WithoutInitialization | 100000 | 388,089.153 ns | 1.73 |
WithInitialization | 100000 | 227,040.318 ns | 1.00 |
If we talk about memory allocation, we waste an overage of 2.5x memory if compared to collections with size initialized.
Method | Size | Allocated | Alloc Ratio |
---|---|---|---|
WithoutInitialization | 100 | 1184 B | 2.60 |
WithInitialization | 100 | 456 B | 1.00 |
WithoutInitialization | 1000 | 8424 B | 2.08 |
WithInitialization | 1000 | 4056 B | 1.00 |
WithoutInitialization | 10000 | 131400 B | 3.28 |
WithInitialization | 10000 | 40056 B | 1.00 |
WithoutInitialization | 100000 | 1049072 B | 2.62 |
WithInitialization | 100000 | 400098 B | 1.00 |
Initialize an HashSet
Similar to what we’ve done with List
’s, we can see significant improvements when initializing correctly other data types, such as HashSet
’s.
Let’s run the same benchmarks, but this time, let’s initialize a HashSet<int>
instead of a List<int>
.
The code is pretty similar:
[Benchmark]
public void WithoutInitialization()
{
var set = new HashSet<int>();
for (int i = 0; i < Size; i++)
{
set.Add(i);
}
}
[Benchmark(Baseline = true)]
public void WithInitialization()
{
var set = new HashSet<int>(Size);
for (int i = 0; i < Size; i++)
{
set.Add(i);
}
}
What can we say about performance improvements?
If we talk about execution time, we can see an average of 2x improvements.
Method | Size | Mean | Ratio |
---|---|---|---|
WithoutInitialization | 100 | 1,122.2 ns | 2.02 |
WithInitialization | 100 | 558.4 ns | 1.00 |
WithoutInitialization | 1000 | 12,215.6 ns | 2.74 |
WithInitialization | 1000 | 4,478.4 ns | 1.00 |
WithoutInitialization | 10000 | 148,603.7 ns | 1.90 |
WithInitialization | 10000 | 78,293.3 ns | 1.00 |
WithoutInitialization | 100000 | 1,511,011.6 ns | 1.96 |
WithInitialization | 100000 | 810,657.8 ns | 1.00 |
If we look at memory allocation, if we don’t initialize the HashSet
, we are slowing down the application by a factor of 3x. Impressive!
Method | Size | Allocated | Alloc Ratio |
---|---|---|---|
WithoutInitialization | 100 | 5.86 KB | 3.28 |
WithInitialization | 100 | 1.79 KB | 1.00 |
WithoutInitialization | 1000 | 57.29 KB | 3.30 |
WithInitialization | 1000 | 17.35 KB | 1.00 |
WithoutInitialization | 10000 | 526.03 KB | 3.33 |
WithInitialization | 10000 | 157.99 KB | 1.00 |
WithoutInitialization | 100000 | 4717.4 KB | 2.78 |
WithInitialization | 100000 | 1697.64 KB | 1.00 |
Wrapping up
Do you need other good reasons to initialize your collection capacity when possible? π
I used BenchmarkDotNet to create these benchmarks. If you want an introduction to this tool, you can have a look at how I used it to measure the performance of Enums:
π Enum.HasFlag performance with BenchmarkDotNet | Code4IT
I hope you enjoyed this article! Let’s keep in touch on Twitter or LinkedIn! π€π€
Happy coding!
π§