C# 8 and nullability: how do we live with this?

Hello colleagues! It's time to mention that we plan to release Ian Griffiths' fundamental book on C# 8.

In the meantime, the author has published two related articles on his blog in which he examines the subtleties of new concepts such as nullability, null-obliviousness, and null-awareness. We have translated both articles under one heading and invite you to discuss them.

The most ambitious new feature in C# 8.0 is called nullable references.

The purpose of this new feature is to limit the damage caused by a dangerous thing that the computer scientist Tony Hoare once called his "billion dollar mistake". C# has the keyword null (an equivalent exists in many other languages), and the roots of this keyword can be traced back to the language Algol W, which Hoare helped develop. In this ancient language (it appeared in 1966), variables referring to instances of a certain type could take a special value indicating that the variable currently does not refer to anything. This feature was borrowed very widely, and today many experts (including Hoare himself) believe that it has become the biggest source of costly software errors of all time.

What is wrong with allowing null? In a world where any reference may be null, you have to take this into account wherever references are used in your code, otherwise you risk a failure at runtime. Sometimes this is not too burdensome: if you initialize a variable with a new expression in the same place where you declare it, then you know that this variable is not null. But even such a simple example carries some cognitive load: before C# 8, the compiler could not tell you whether you were doing anything that might set this value to null. And as soon as you start stitching different fragments of code together, it becomes much harder to reason about such things with certainty: how likely is it that the property I am reading now can return null? Is it permitted to pass null to that method? In what situations can I be sure that the method I am calling will set this out argument to something other than null? Moreover, the problem is not limited to remembering to check such things; it is not always clear what you should do if you do run into null.

With numeric types in C#, there is no such problem: if you write a function that takes some numbers as input and returns a number as a result, you do not have to wonder whether the values passed in are really numbers, or whether something that is not a number might be mixed in among them. When calling such a function, you do not need to think about whether it might return something other than a number. Unless, that is, you want this as an option: in that case you can declare parameters or results of type int?, indicating that in this particular case you really do want to allow a null value to be passed or returned. So for numeric types and, more generally, value types, nullability has always been opt-in.
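As a small illustration of the opt-in nature of nullability for value types, here is a minimal sketch. The class and method names (NumberExamples, Add, ParseOrNull) are hypothetical and do not come from the original article.

```csharp
using System;

public static class NumberExamples
{
    // Both parameters are plain int, so neither can ever be null.
    public static int Add(int a, int b) => a + b;

    // int? (Nullable<int>) opts in to "no value" for this one result:
    // the method returns null when the text is not a valid integer.
    public static int? ParseOrNull(string text) =>
        int.TryParse(text, out int value) ? value : (int?)null;
}
```

A caller of Add never has to think about null, while a caller of ParseOrNull has to deal with the possibility before using the result as an int.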

As for reference types, prior to C# 8.0 nullability was not only the default; it could not be turned off at all.

In fact, for reasons of backward compatibility, nullability remains the default even in C# 8.0, since the new language features in this area stay disabled until you explicitly request them.

However, as soon as you enable this new feature, everything changes. The easiest way to activate it is to add <Nullable>enable</Nullable> inside the <PropertyGroup> element in your .csproj file. (Note that more fine-grained control is also available. If you really need it, you can configure nullability behavior separately for each line. But when we recently decided to enable this feature in all our projects, it turned out that enabling it one project at a time was a manageable task.)
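For orientation, a project file with the setting enabled might look roughly like the sketch below. Only the Nullable element comes from the text; the SDK and the target framework are assumptions chosen for illustration.

```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <!-- Assumed target; any target that compiles with C# 8.0 or later would do. -->
    <TargetFramework>netcoreapp3.1</TargetFramework>
    <!-- Turns on nullable warnings and annotations for the whole project. -->
    <Nullable>enable</Nullable>
  </PropertyGroup>
</Project>
```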

When nullable references are fully enabled in C# 8.0, the situation changes: now, by default, references are assumed not to allow null unless you specify otherwise, exactly as with value types. (Even the syntax is the same: you could always write int? if you really wanted an integer value to be optional. Now you write string? if you mean that you want either a reference to a string or null.)
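A minimal sketch of how the two kinds of declaration look side by side. The Greeter class is hypothetical, and the #nullable enable directive at the top stands in for the project-level setting.

```csharp
#nullable enable
using System;

public static class Greeter
{
    // name is declared as string, so null is not an acceptable argument;
    // title is declared as string?, so callers may pass null for it.
    public static string Greet(string name, string? title)
    {
        return title == null ? $"Hello, {name}" : $"Hello, {title} {name}";
    }
}
```

With this declaration, Greeter.Greet("Ada", null) is fine, whereas passing null as the first argument would be expected to produce a compiler warning.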

This is a very significant change, and it is precisely because of its significance that the new feature is disabled by default. Microsoft could have designed this language feature differently: references could have remained nullable by default, with new syntax introduced to let you state that null is not allowed. Perhaps this would have lowered the barrier to trying out the feature, but in the long run such a solution would have been wrong, since in practice most references in the huge mass of existing C# code are not meant to be null.

Nullability is the exception, not the rule, and that is why, when this new language feature is enabled, disallowing null becomes the new default. This is reflected even in the feature's name: "nullable references". The name is curious, given that references could be null ever since C# 1.0. But the designers chose to emphasize that nullability now becomes something that has to be requested explicitly.

C# 8.0 eases the adoption of nullable references, since it allows you to introduce the feature gradually. You do not have to make a yes-or-no choice. This is quite different from the async/await feature added in C# 5.0, which tended to spread: asynchronous operations effectively oblige the caller to be async, so the code that calls this caller must be async as well, and so on to the very top of the stack. Fortunately, nullable types work differently: they can be adopted selectively and gradually. You can work through the files one by one, or even line by line if necessary.

The most important aspect of nullable types (the one that makes the transition to them easier) is that they are disabled by default. Otherwise, most developers would refuse to use C# 8.0, since such a transition would cause warnings in almost any code base. However, for the same reason, the barrier to entry for this new feature feels rather high: if a new feature makes changes so dramatic that it is disabled by default, you may well not want to touch it, and the problems associated with switching to it will always seem like unnecessary hassle. That would be a shame, because the feature is very valuable. It helps you find bugs in your code before your users find them for you.

So, if you are considering adopting nullable types, be sure to note that you can introduce this feature step by step.

Warnings only

The coarsest level of control over the whole project, after a simple on/off switch, is the ability to enable warnings independently of annotations. For example, if I fully enable nullability for Corvus.ContentHandling.Json in our Corvus.ContentHandling repository by adding <Nullable>enable</Nullable> to the property group in the project file, then in its current state 20 compiler warnings immediately appear. However, if I use <Nullable>warnings</Nullable> instead, I get just one warning.

But wait! Why am I shown fewer warnings? After all, I have just asked for warnings. The answer is not entirely obvious: the point is that some variables and expressions can be null-oblivious.

Null-obliviousness

C# supports two interpretations of nullability. First, any variable of a reference type can be declared as nullable or non-nullable; second, the compiler will, wherever possible, infer whether or not the variable can be null at any particular point in the code. This article deals only with the first kind of nullability, that is, with the static type of a variable (in fact, this applies not only to variables and their close relatives, parameters and fields; both static and inferred nullability are determined for every expression in C#). Nullability in this first sense, the one we are considering, is in effect an extension of the type system.

However, it turns out that if we focus only on the nullability of a type, the picture is not as tidy as one might assume. It is not simply a contrast between "nullable" and "non-nullable". In fact, there are two more possibilities. There is an "unknown" category, which is needed because of generics: if you have an unconstrained type parameter, nothing can be known about its nullability, because code using the generic method or type can supply a type argument that is either nullable or non-nullable. You can add constraints, but such constraints are often undesirable, since they narrow the range of uses of the generic type or method. So for variables or expressions of some unconstrained type parameter T, nullability must be unknown; the question will be settled separately in each case, but we do not know which option the generic code will encounter, since that depends on the type argument.
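To make the "unknown" category more concrete, here is a minimal sketch of a generic method with an unconstrained type parameter. The names GenericExamples and FirstOrFallback are hypothetical and do not come from the original article.

```csharp
#nullable enable
using System.Collections.Generic;

public static class GenericExamples
{
    // T is unconstrained: a caller may substitute string, string?, int, int?
    // and so on, so inside this method the nullability of T is unknown.
    public static T FirstOrFallback<T>(IEnumerable<T> items, T fallback)
    {
        foreach (T item in items)
        {
            return item;
        }
        return fallback;
    }
}
```

The same method body has to serve FirstOrFallback<string> and FirstOrFallback<string?> alike, which is why the compiler cannot classify T as either nullable or non-nullable.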

The last category is called "oblivious". This is how everything worked before the advent of C# 8.0, and how it still works if you do not enable nullable references. (Basically, this is an example of retroactive continuity. Although the idea of null-obliviousness was first introduced in C# 8.0 as the natural state of code before nullable references are enabled, the C# designers insisted that this property was never really foreign to C#.)

Perhaps there is no need to explain what "oblivious" means here, since this is how C# has always worked, so you already understand it... although that is perhaps a little disingenuous. So here goes: in a world where nullability is known, the most important characteristic of null-oblivious expressions is that they do not cause nullability warnings. You can assign a null-oblivious expression to either a nullable or a non-nullable variable. And to null-oblivious variables (as well as properties, fields, and so on) you can assign expressions that the compiler considers either "maybe null" or "not null".

That is why, if you just turn on warnings, you will not see many new warnings. All the code remains in a disabled nullable annotation context, so all variables, parameters, fields, and properties are null-oblivious, which means that you will not receive any warnings when you use them together with null-aware entities.

Why, then, do I get warnings at all? A common reason is an attempt to combine, in an invalid way, two pieces of code that are both null-aware. For example, suppose I have a library in which nullable references are fully enabled, and this library contains the following deeply contrived class:

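using System;

public static class NullableAwareClass
{
    // May return null: the return type is annotated as string?.
    public static string? GetNullable() => DateTime.Now.Hour > 12 ? null : "morning";

    // Expects a non-null argument: the parameter type is plain string.
    public static int RequireNonNull(string s) => s.Length;
}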

Then, in another project, I can write the following code in a context where nullable warnings are enabled but the corresponding annotations are disabled:

static int UseString(string x) => NullableAwareClass.RequireNonNull(x);

Since nullable annotations are disabled, the parameter x here is null-oblivious. This means that the compiler cannot determine whether this code is correct or not. If the compiler issued warnings whenever null-oblivious expressions are mixed with null-aware ones, a significant proportion of those warnings would be spurious; therefore, no warnings are issued.

With this wrapper, I have in effect hidden the fact that the underlying code is null-aware. This means that I can now write this:

int x = UseString(NullableAwareClass.GetNullable());

The compiler knows that GetNullable can return null, but since I called a method with a null-oblivious parameter, the compiler does not know whether this is right or wrong. By using the null-oblivious wrapper, I have disarmed the compiler, which now sees no problem here. However, if I combined these two methods directly, things would be different:

int y = NullableAwareClass.RequireNonNull(NullableAwareClass.GetNullable());

Here I pass the result of GetNullable straight to RequireNonNull. If I tried to do this in a context where nullable warnings are enabled, the compiler would generate a warning, regardless of whether the annotation context is enabled or disabled. In this particular case the annotation context does not matter, since there are no declarations with a reference type. If you enable nullable warnings but disable the corresponding annotations, all declarations become null-oblivious, which does not mean, however, that all expressions do. Here the compiler knows that the result of GetNullable may be null. Therefore, we get a warning.

To summarize: since all declarations in a disabled nullable annotation context are null-oblivious, we will not get many warnings, because most expressions will be null-oblivious. But the compiler will still be able to catch nullability errors in cases where expressions do not pass through some null-oblivious intermediary. Moreover, the greatest benefit in this mode comes from detecting errors caused by attempts to dereference potentially null values using ., for example:

int z = NullableAwareClass.GetNullable().Length;

If your code is well-designed, then there should not be a large number of errors of this kind.
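When such a dereference warning does appear, one common way to deal with it is to handle the null case explicitly. The sketch below is an illustration under that assumption, not the author's own fix; LengthHelpers and its methods are hypothetical names.

```csharp
#nullable enable

public static class LengthHelpers
{
    // Explicit check: after the test for null, the compiler's flow analysis
    // treats s as not null, so s.Length produces no warning.
    public static int LengthOrZero(string? s)
    {
        if (s == null)
        {
            return 0;
        }
        return s.Length;
    }

    // The same result using the null-conditional and null-coalescing operators.
    public static int LengthOrZeroShort(string? s) => s?.Length ?? 0;
}
```

With a helper like this, the earlier example could be written as int z = LengthHelpers.LengthOrZero(NullableAwareClass.GetNullable()); whether 0 is a sensible result for a null string depends on the application.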

Gradual annotation of the entire project

After you have taken the first step and simply enabled warnings, you can move on to gradually enabling annotations, file by file. It is convenient to enable them for the whole project at first, see which files produce warnings, and then pick a file with relatively few warnings. Then turn annotations off again at the project level, and write #nullable enable at the top of the file you picked. This fully enables nullability (both warnings and annotations) for the whole file (unless you turn it off again with another #nullable directive). You can then go through the file and make sure that everything that may be null is annotated as nullable (that is, add ?), and then deal with any warnings that remain in the file.
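A file converted in this way might look roughly like the following sketch. The Customer class is a hypothetical example; the closing #nullable restore directive, which returns to the project-level setting, is optional and shown only to illustrate that the directive can be scoped.

```csharp
// Warnings and annotations are both enabled from this line on.
#nullable enable

public class Customer
{
    // Every customer has a name, so this property is non-nullable.
    public string Name { get; }

    // A nickname is optional, so it is annotated with ?.
    public string? Nickname { get; }

    public Customer(string name, string? nickname)
    {
        Name = name;
        Nickname = nickname;
    }

    // Falls back to Name when there is no nickname.
    public string DisplayName => Nickname ?? Name;
}

// Back to whatever the project file specifies.
#nullable restore
```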

It may turn out that adding all the necessary annotations is all that is required to eliminate every warning. The opposite is also possible: you may find that once you have carefully annotated one file for nullability, additional warnings surface in other files that use it. As a rule, there are not many such warnings, and you can fix them quickly. But if for some reason you find yourself drowning in warnings after this step, you still have a couple of options. First, you can simply back out, leave this file alone, and take on another one. Second, you can selectively turn off annotations for the members that you think are causing the most problems. (You can use the #nullable directive as many times as you like, so you can control nullability settings even line by line if you want.) Perhaps if you return to this file later, once you have enabled nullability in most of the project, you will see fewer warnings than the first time.

There are times when problems cannot be solved in such a straightforward way. So, in certain scenarios related to serialization (for example, when using Json.NET or Entity Framework), the work may be more difficult. I think this problem deserves a separate article.

Nullable references improve the expressiveness of your code and increase the chances that the compiler will catch your errors before your users run into them. Therefore, it is better to enable this feature if possible. And if you enable it selectively, you will start to feel its benefits sooner.
