The 16-Byte Rule: Unraveling the Performance Mystery of C# Structures
#csharp #benchmarkMany C# developers are familiar with the following statements. By default, when passing to or returning from a method, an instance of a value type is copied, while an instance of a reference type is passed by reference. This has led to a belief that utilizing structures may degrade the overall app performance, especially if the structure has a size greater than 16 bytes. The discussion about this is still ongoing. In this article, we will try to find the truth.
Disclaimer
The information from this article is а true only under specific conditions. I admit that the benchmark may yield different results on a machine with a different CPU or using different structures and classes. Always verify your code on your specific hardware and don’t rely solely on only articles from the internet. :)
Benchmark
The benchmark is very straightforward. It contains structures and classes with sizes ranging 4 to 160 bytes, incrementing by 4 bytes (int value).
public record struct Struct04(int Param);
// other structures
public record struct Struct160(
int Param1, int Param2, int Param3, int Param4,
int Param5, int Param6, int Param7, int Param8,
int Param9, int Param10, int Param11, int Param12,
int Param13, int Param14, int Param15, int Param16,
int Param17, int Param18, int Param19, int Param20,
int Param21, int Param22, int Param23, int Param24,
int Param25, int Param26, int Param27, int Param28,
int Param29, int Param30, int Param31, int Param32,
int Param33, int Param34, int Param35, int Param36,
int Param37, int Param38, int Param39, int Param40);
public record struct Class04(int Param);
// other classes
public record class Class160(
int Param1, int Param2, int Param3, int Param4,
int Param5, int Param6, int Param7, int Param8,
int Param9, int Param10, int Param11, int Param12,
int Param13, int Param14, int Param15, int Param16,
int Param17, int Param18, int Param19, int Param20,
int Param21, int Param22, int Param23, int Param24,
int Param25, int Param26, int Param27, int Param28,
int Param29, int Param30, int Param31, int Param32,
int Param33, int Param34, int Param35, int Param36,
int Param37, int Param38, int Param39, int Param40);
For each structure and class, there is an appropriate method that creates an instance from a single int
value.
// other methods
public static Struct20 GetStruct20(int value) => new(value, value, value, value, value);
// other methods
public static Class20 GetClass20(int value) => new(value, value, value, value, value);
// other methods
In the end, there are benchmark methods, each creating a list of structures or classes with a size of 1000.
public int Iterations { get; set; } = 1000;
private static void Add<T>(List<T> list, T value) => list.Add(value);
[Benchmark(Baseline = true)]
public List<Struct04> GetStruct4()
{
var list = new List<Struct04>(Iterations);
for (int i = 0; i < Iterations; i++) Add(list, GetStruct04(i));
return list;
}
// other benchmark methods
[Benchmark(Baseline = true)]
public List<Class04> GetClass4()
{
var list = new List<Class04>(Iterations);
for (int i = 0; i < Iterations; i++) Add(list, GetClass04(i));
return list;
}
// other benchmark methods
As usual, for benchmarking, I used the BenchmarkDotNet library. The whole code of the benchmark class can be found here.
Results
Time measurements
The plot indicates that structures yield better results when their size is 64 bytes or fewer.
If we take a closer look at the execution time for entities up to 64 bytes, we see, that using structures can be 40-70% faster than using classes.
To understand why .NET behaves in this way, we need to examine the complied code. Let’s take a look at GetStruct64
and GetStruct128
methods, for example. At the screenshot below, we can observe that these methods are compiled to the same IL code. The only difference is in names.
It means that we have to delve deeper and examine the assembly code generated by the JIT compiler. Hopefully, BenchmarkDotNet provides such functionality.
Comparing both methods shows that there is no invocation of the GetStruct64(int value)
method. Instead, multiple mov
operations (mov
operation moves data between registers and memory) are present. These operations actually initialize Struct64
. But where is GetStruct64(int value)
? The JIT compiler optimized the code and replaced the method invocation by its body. This optimization is known as function or method inlining.
Another interesting observation is that there are no stack allocations for structures. The JIT compiler optimized the code in such a way that only registers are used, even for structures of 128 bytes or more (thanks to AVX 256-bit registers).
Memory measurements
The memory usage graph displays a smooth pattern without sudden spikes. It indicates that the difference in memory consumption between structures and classes remains constant. Thus, as the size of the entity increases, the use of structures becomes less advantageous in relative terms, but still advantageous. For structures with a size less or equal to 64 bytes, memory usage less from 27% to 87%.
Conclusion
This article proved, that utilizing structures not always degrades application performance. JIT compiler is smart enough to optimize the code during runtime, so structs may behave better than classes. Using structures up to 64 bytes more efficient in terms of both execution time and memory consumption.