Optimizing CSharp Null-Coalescing Operator
Posted on November 21, 2020 in dotnet
Consider the following code:
public Func<int> F { get; set; }
private int GetValue()
{
// return the value produced by F,
// if F is not null, else zero
}
I naïvely implemented the function as:
if (F == null) return default;
return F();
Does the job.
Elegant Code
Yet, ReSharper squiggled the if
statement, suggesting I might want to convert to:
return F == null ? default : F();
Which is nice indeed. Ah, ReSharper now squiggles the == null
statement, suggesting I merge the conditional expression into:
return F?.Invoke() ?? default;
And... I started to wonder. It certainly is more compact. But is it cleaner? It reminds me of a recent thead initiated by Frans, on the topic of C# getting more and more syntax-complex, to the point that C++ looks elegant and you need ReSharper to remind you of all the new bells and whistles.
Thank you, but no. Obfuscation can sure be fun, but one day I might have to troubleshoot that code, and I do prefer the ?:
pattern over the ?.Invoke ??
one. As Steve replied:
It's just obfuscating the intent of your code. Clarity over minimal keystrokes. I gotta support this code for years.
(rant) Every developer should have to spend some time doing 24/7 support on production-critical code. Once you have been waken up at 3am to spend hours trying to figure out what is wrong with some recent code changes while the entire factory waits for you to restart operations... you get a different idea of what elegant code is. (/rant)
Anyways.
And besides, it is not the same!
At that point I got a carried on by my rant, and started to think:
- With the
?.
pattern, if the function isnull
, we return the value type default value, else execute the function and return the value type result - With the
?.Invoke ??
pattern, if the function isnull
,?.Invoke
will benull
i.e. a reference type, so there has to be some boxing and unboxing happening under the hood for the ?? operator to work
And so, the ?.Invoke ??
pattern has to be worse, performance-wise!
Time to turn to Sharplab.io. This amazing tool let you type C# code on the left, and presents you with the resulting IL or assembly code on the right:
We are going to compile the following code:
public Func<int> F { get; set; }
public int M1()
{
return F == null
? 0
: F();
}
public int M2()
{
return F?.Invoke() ?? 0;
}
For M1
the IL code (in Release mode) is:
IL_0000: ldarg.0
IL_0001: call instance class [System.Private.CoreLib]System.Func`1<int32> C::get_F()
IL_0006: brfalse.s IL_0014
IL_0008: ldarg.0
IL_0009: call instance class [System.Private.CoreLib]System.Func`1<int32> C::get_F()
IL_000e: callvirt instance !0 class [System.Private.CoreLib]System.Func`1<int32>::Invoke()
IL_0013: ret
IL_0014: ldc.i4.0
IL_0015: ret
And for M2
:
IL_0000: ldarg.0
IL_0001: call instance class [System.Private.CoreLib]System.Func`1<int32> C::get_F()
IL_0006: dup
IL_0007: brtrue.s IL_000c
IL_0009: pop
IL_000a: ldc.i4.0
IL_000b: ret
IL_000c: callvirt instance !0 class [System.Private.CoreLib]System.Func`1<int32>::Invoke()
IL_0011: ret
First, note that the ?.
coalesce operator does not imply any kind of boxing, as I originally thought. So it is safe. In fact, the code produced for M2
looks simpler, especially with M1
invoking get_F()
twice in case F
is not null
. However, remember this is only IL code. There is another level of optimization in the JIT, which leads to the following x86 assembly code:
C.M1()
L0000: mov eax, [ecx+4]
L0003: test eax, eax
L0005: je short L000e
L0007: mov ecx, [eax+4]
L000a: call dword ptr [eax+0xc]
L000d: ret
L000e: xor eax, eax
L0010: ret
C.M2()
L0000: mov edx, [ecx+4]
L0003: test edx, edx
L0005: jne short L000a
L0007: xor eax, eax
L0009: ret
L000a: mov ecx, [edx+4]
L000d: call dword ptr [edx+0xc]
L0010: ret
In other words, both methods end up doing almost exactly the same, at assembly level. And benchmarks (see below, if you really are into this) show that they perform the same. This is a good thing: it means that one can use one syntax or the other, without worrying about performance.
As for code elegance, I know what I prefer.
Appendix
This is more details, for people who are into it.
Just to be sure, I have benchmarked the two methods. The BenchmarkDotNet code is quite simple (see this Gist) and I have executed it with F
being null
, or returning a constant, or computing a value... and, funny enough, I never get the exact same duration for both methods, and one or the other is always randomly slightly faster that the other. I guess that is due to glitches on my benchmark machine, and it means that the methods are equivalent.
Just to be super sure, I fired WinDBG at the benchmark, after it had run. First you want to locate the class:
> !name2ee Benchmarks!Benchmarks.NullCoalesce
Module: 00007ff98a5cf800
Assembly: Benchmarks.dll
Token: 0000000002000005
MethodTable: 00007ff98a69ea80
EEClass: 00007ff98a689038
Name: Benchmarks.NullCoalesce
Then you want to dump the method table:
> !dumpmt -md 00007ff98a69ea80
EEClass: 00007ff98a689038
Module: 00007ff98a5cf800
Name: Benchmarks.NullCoalesce
mdToken: 0000000002000005
File: D:\d\Benchmarks\bin\Release\netcoreapp3.1\Benchmarks.dll
BaseSize: 0x20
ComponentSize: 0x0
Slots in VTable: 11
Number of IFaces in IFaceMap: 0
--------------------------------------
MethodDesc Table
Entry MethodDesc JIT Name
00007FF98A520090 00007ff98a4f0a80 NONE System.Object.Finalize()
00007FF98A520098 00007ff98a4f0a90 NONE System.Object.ToString()
00007FF98A5200A0 00007ff98a4f0aa0 JIT System.Object.Equals(System.Object)
00007FF98A5200B8 00007ff98a4f0ae0 JIT System.Object.GetHashCode()
00007FF98A538F30 00007ff98a69e9f8 NONE Hazelcast.Benchmarks.NullCoalesce..ctor()
00007FF98A538F20 00007ff98a69e9c8 JIT Benchmarks.NullCoalesce.get_F()
00007FF98A538F28 00007ff98a69e9e0 NONE Benchmarks.NullCoalesce.set_F(System.Func`1)
00007FF98A538F38 00007ff98a69ea08 JIT Benchmarks.NullCoalesce.BenchmarkM1()
00007FF98A538F40 00007ff98a69ea20 JIT Benchmarks.NullCoalesce.M1()
00007FF98A538F48 00007ff98a69ea38 JIT Benchmarks.NullCoalesce.BenchmarkM2()
00007FF98A538F50 00007ff98a69ea50 JIT Benchmarks.NullCoalesce.M2()
And... !dumpil
can show us the IL code, but we really want the assembly code:
> !u 00007ff98a69ea20
Normal JIT generated code
Benchmarks.NullCoalesce.M1()
Begin 00007FF98AAB5F20, size 17
00007ff9`8aab5f20 488b4108 mov rax,qword ptr [rcx+8]
00007ff9`8aab5f24 4885c0 test rax,rax
00007ff9`8aab5f27 740b je 00007ff9`8aab5f34
00007ff9`8aab5f29 488b4808 mov rcx,qword ptr [rax+8]
00007ff9`8aab5f2d 488b4018 mov rax,qword ptr [rax+18h]
00007ff9`8aab5f31 48ffe0 jmp rax
00007ff9`8aab5f34 33c0 xor eax,eax
00007ff9`8aab5f36 c3 ret
> !u 00007ff98a69ea50
Normal JIT generated code
Benchmarks.NullCoalesce.M2()
Begin 00007FF98AAE4220, size 17
00007ff9`8aae4220 488b5108 mov rdx,qword ptr [rcx+8]
00007ff9`8aae4224 4885d2 test rdx,rdx
00007ff9`8aae4227 7503 jne 00007ff9`8aae422c
00007ff9`8aae4229 33c0 xor eax,eax
00007ff9`8aae422b c3 ret
00007ff9`8aae422c 488b4a08 mov rcx,qword ptr [rdx+8]
00007ff9`8aae4230 488b4218 mov rax,qword ptr [rdx+18h]
00007ff9`8aae4234 48ffe0 jmp rax
My .NET host (.NET Core 3.1.9) produces assembly code that is slightly different from what Sharplab.io shows, but the conclusion stands: both methods are, practically, equivalent.
There used to be Disqus-powered comments here. They got very little engagement, and I am not a big fan of Disqus. So, comments are gone. If you want to discuss this article, your best bet is to ping me on Mastodon.