Single line code or Nethermind validation using PVS-Studio C # for Linux

Picture 1

This article is dedicated to the launch of the PVS-Studio C # beta test for Linux, as well as the plug-in for Rider. For such a wonderful reason, using these tools, we checked the source code of the Nethermind product and in this article we will look at interesting and sometimes funny errors.

Nethermind is a fast client for .NET Core Ethereum for Linux, Windows, MacOs. It can be used in projects when setting up Ethereum or dApps private networks. Nethermind is open source located on GitHub. The project was founded in 2017 and is constantly evolving.

Introduction


Do you like manual labor? For example, such as finding errors in program code. Agree, it is rather tedious to read and analyze the site you wrote or the whole project in search of some tricky bug. Well, if the project is small, let's say 5,000 lines, but if its size has already exceeded one hundred thousand or a million lines? Moreover, it was written by different programmers and sometimes not in a very digestible form. What to do in this case? Do you really have to sleep somewhere, somewhere underestimate and 100% of your time to realize all these endless lines in order to understand where this nasty mistake is? I doubt that you would like to do this. So what to do, maybe there are modern means to somehow automate this?

Figure 3

And then a tool like a static code analyzer comes into play. Static analyzer is a tool for detecting defects in the source code of programs. The advantage of this tool over manual verification is as follows:

  • almost does not spend your time looking for the location of the error (at least it's definitely faster than looking for the failed copy-paste with your eyes)
  • does not get tired, unlike a person who after some time of searching will need rest;
  • knows a lot of error patterns that a person may not even be aware of;
  • uses technologies such as: data flow analysis, symbolic execution, pattern matching, and others;
  • allows you to regularly and at any time to analyze;
  • etc.

Of course, using a static code analyzer does not replace or cancel code reviews. But code reviews are becoming more productive and useful. You can focus on finding high-level errors, on transferring knowledge, rather than tiring to read the code in search of typos.

If you became interested in reading more about it, then I suggest the following article , as well as an article about the technologies used in PVS-Studio.

PVS-Studio C # for Linux / macOS


At the moment, we are porting our C # analyzer to .NET Core, and we are also actively developing a plug-in for the IDE Rider.

If you are interested, you can sign up for a beta test by filling out the form on this page . Installation instructions will be sent to your mail (don’t worry, it’s very simple), as well as a license to use the analyzer.

This is what Rider looks like with the PVS-Studio plugin:

Picture 4

A bit of indignation


I want to say that in some places it was very difficult to read the source code of Nethermind, because in it lines of 300-500 characters in length are quite normal. Yes, it is all the code without formatting in 1 line. And these lines, for example, contain several ternary operators, and logical operators, and there is nothing there. In general, enjoyment as from the last season of the game of thrones.

I will explain a little to understand the scope. I have an UltraWide monitor with a length of about 82 centimeters. If you open the IDE on it in full screen, then it will fit about 340 characters. That is, the lines that I'm talking about do not even fit. If you want to see how it looks, I left links to files on GitHub:

Example 1

private void LogBlockAuthorNicely(Block block, ISyncPeer syncPeer)
{
    string authorString = (block.Author == null ? null : "sealed by " +
(KnownAddresses.GoerliValidators.ContainsKey(block.Author) ?
KnownAddresses.GoerliValidators[block.Author] : block.Author?.ToString())) ??
(block.Beneficiary == null ? string.Empty : "mined by " +
(KnownAddresses.KnownMiners.ContainsKey(block.Beneficiary) ?
KnownAddresses.KnownMiners[block.Beneficiary] : block.Beneficiary?.ToString()));
    if (_logger.IsInfo)
    {
        if (_logger.IsInfo) _logger.Info($"Discovered a new block
{string.Empty.PadLeft(9 - block.Number.ToString().Length, '
')}{block.ToString(Block.Format.HashNumberAndTx)} {authorString}, sent by
{syncPeer:s}");
    }
}

File Link

Example 2

private void BuildTransitions()
{
    ...
    releaseSpec.IsEip1283Enabled = (_chainSpec.Parameters.Eip1283Transition ??
long.MaxValue) <= releaseStartBlock &&
((_chainSpec.Parameters.Eip1283DisableTransition ?? long.MaxValue) 
> releaseStartBlock || (_chainSpec.Parameters.Eip1283ReenableTransition ??
long.MaxValue) <= releaseStartBlock);           
    ...
}

Link to file

public void 
Will_not_reject_block_with_bad_total_diff_but_will_reset_diff_to_null()
{
    ...
    _syncServer = new SyncServer(new StateDb(), new StateDb(), localBlockTree,
NullReceiptStorage.Instance, new BlockValidator(Always.Valid, new
HeaderValidator(localBlockTree, Always.Valid, MainnetSpecProvider.Instance,
LimboLogs.Instance), Always.Valid, MainnetSpecProvider.Instance, 
LimboLogs.Instance), Always.Valid, _peerPool, StaticSelector.Full, 
new SyncConfig(), LimboLogs.Instance);
    ...     
}

Link to the file

Now imagine that an error occurs in such a section, will it be pleasant to look for it? I’m sure that it’s not, and everyone understands that it’s impossible to write like that. And by the way, there is a similar place with an error in this project.

Validation Results


Conditions that do not like 0


Condition 1

public ReceiptsMessage Deserialize(byte[] bytes)
{
    if (bytes.Length == 0 && bytes[0] == Rlp.OfEmptySequence[0])
        return new ReceiptsMessage(null);
    ...
}

PVS-Studio Warning: V3106 Possibly index is out of bound. The '0' index is pointing beyond 'bytes' bound. Nethermind.Network ReceiptsMessageSerializer.cs 50

To notice an error, consider the case when the number of elements in the array is 0. Then the bytes.Length == 0 condition will be true and an indexOutOfRangeException of type 0 will be thrown when accessing the array element .

They wanted to exit this method immediately if the array is empty or the 0 element is equal to a certain value, but it seems that they accidentally mixed up "||" with "&&". I propose to fix this problem as follows:

public ReceiptsMessage Deserialize(byte[] bytes)
{
    if (bytes.Length == 0 || bytes[0] == Rlp.OfEmptySequence[0])
        return new ReceiptsMessage(null);
    ...
}

Condition 2

public void DiscoverAll()
{
    ...
    Type? GetStepType(Type[] typesInGroup)
    {
        Type? GetStepTypeRecursive(Type? contextType)
        {
            ...
        }
        ...
        return typesInGroup.Length == 0 ? typesInGroup[0] :
               GetStepTypeRecursive(_context.GetType());
    }
    ...
}

PVS-Studio Warning: V3106 Possibly index is out of bound. The '0' index is pointing beyond 'typesInGroup' bound. Nethermind.Runner EthereumStepsManager.cs 70

A situation similar to the one described above occurs. If the number of elements in typesInGroup is equal to 0, then when accessing the 0 element, an exception of type IndexOutOfRangeException will be thrown .

Only in this case I do not understand what the developer wanted. Most likely, instead of typesInGroup [0] you need to write null .

Error or unfinished optimization?


private void DeleteBlocks(Keccak deletePointer)
{
   ...
   if (currentLevel.BlockInfos.Length == 1)
   {
      shouldRemoveLevel = true;
   }
   else
   {
      for (int i = 0; i < currentLevel.BlockInfos.Length; i++)
      {
         if (currentLevel.BlockInfos[0].BlockHash == currentHash) // <=
         {
            currentLevel.BlockInfos = currentLevel.BlockInfos
                                      .Where(bi => bi.BlockHash != currentHash)
                                      .ToArray();
            break;
         }
      }
   }
   ...
}

PVS-Studio Warning: V3102 Suspicious access to element of 'currentLevel.BlockInfos' object by a constant index inside a loop. Nethermind.Blockchain BlockTree.cs 895

At first glance, the error is obvious - the loop is aimed at iterating over the currentLevel.BlockInfos elements , but when accessing currentLevel.BlockInfos [i], they wrote currentLevel.BlockInfos [0] . We fix 0 on i and the mission is completed. But do not rush, let's figure it out.

Now, we Length access the BlockHash of the null element. If it is equal to currentHash , then we take from currentLevel.BlockInfos all elements that are not equalcurrentHash , write to it and exit the loop. It turns out that the cycle is superfluous.

I think that earlier there was an algorithm that they decided to change / optimize using linq , but something went wrong. Now, in the case when the condition is false, we get meaningless iterations.

By the way, if the developer who wrote this would use the incremental analysis mode , then he immediately realized that something would have been wrong and would have fixed everything quickly. At the moment, I would rewrite the code like this:

private void DeleteBlocks(Keccak deletePointer)
{
    ...
    if (currentLevel.BlockInfos.Length == 1)
    {
        shouldRemoveLevel = true;
    }
    else
    {
        currentLevel.BlockInfos = currentLevel.BlockInfos
                                  .Where(bi => bi.BlockHash != currentHash)
                                  .ToArray();
    }
    ...
}

Dereferencing null reference


Dereferencing 1

public void Sign(Transaction tx, int chainId)
{
    if (_logger.IsDebug)
        _logger?.Debug($"Signing transaction: {tx.Value} to {tx.To}");
    IBasicWallet.Sign(this, tx, chainId);
}

PVS-Studio Warning: V3095 The '_logger' object was used before it was verified against null. Check lines: 118, 118. Nethermind.Wallet DevKeyStoreWallet.cs 118

Error in the wrong sequence. First there is an appeal to _logger.IsDebug and only after that is _logger checked for null . Accordingly, in the case when _logger is null , we get an exception of type NullReferenceException .

Dereferencing 2

private void BuildNodeInfo()
{
    _nodeInfo = new NodeInfo();
    _nodeInfo.Name = ClientVersion.Description;
    _nodeInfo.Enode = _enode.Info;                           // <=
    byte[] publicKeyBytes = _enode?.PublicKey?.Bytes;        // <=
    _nodeInfo.Id = (publicKeyBytes == null ? Keccak.Zero :
                   Keccak.Compute(publicKeyBytes)).ToString(false);
    _nodeInfo.Ip = _enode?.HostIp?.ToString();
    _nodeInfo.ListenAddress = $"{_enode.HostIp}:{_enode.Port}";
    _nodeInfo.Ports.Discovery = _networkConfig.DiscoveryPort;
    _nodeInfo.Ports.Listener = _networkConfig.P2PPort;
    UpdateEthProtocolInfo();
}

PVS-Studio Warning: V3095 The '_enode' object was used before it was verified against null. Check lines: 55, 56. Nethermind.JsonRpc AdminModule.cs 55 The

error is completely the same as described above, only this time the culprit is _enode .

I want to add that if you forgot to check for null, then remember this only when your program crashes. The analyzer will remind you of this and everything will be fine.

Our favorite Copy-Paste


Case 1

public static bool Equals(ref UInt256 a, ref UInt256 b)
{
    return a.s0 == b.s0 && a.s1 == b.s1 && a.s2 == b.s2 && a.s2 == b.s2;
}

PVS-Studio Warning: V3001 There are identical sub-expressions 'a.s2 == b.s2' to the left and to the right of the '&&' operator. Nethermind.Dirichlet.Numerics UInt256.cs 1154

Here the same condition is checked 2 times:

a.s2 == b.s2

Since the parameters a and b have a field s3 , I assume that when copying we simply forgot to change s2 to s3 .

It turns out that the parameters will be equal more often than expected by the author of the code. At the same time, some developers think that they do not do this and begin to look for a mistake in a completely different place, spending a lot of energy and nerves.

By the way, errors in comparison functions are generally a classic. Apparently, programmers, considering such functions simple, relate to their writing very casually and inattentively. Proof . Knowing this, now be careful :)!

Case 2

public async Task<ApiResponse> 
PublishBlockAsync(SignedBeaconBlock signedBlock,
                  CancellationToken cancellationToken)
{
    bool acceptedLocally = false;
    ...
    if (acceptedLocally)
    {
        return new ApiResponse(StatusCode.Success);
    }
    else
    {
        return new ApiResponse(StatusCode.Success);
    }
    ...
}

PVS-Studio Warning: V3004 The 'then' statement is equivalent to the 'else' statement. Nethermind.BeaconNode BeaconNodeFacade.cs 177

For any value of the acceptedLocally variable, the method returns the same. It is difficult to say whether it is a mistake or not. Suppose a programmer copied a line and forgot to change StatusCode.Success to something else, then this is a real mistake. Moreover, the StatusCode has an InternalError and InvalidRequest . But the code refactoring is probably to blame and we don’t care about the acceptedLocally value anyway, in this case, the condition becomes the place that makes you sit and think whether it is a mistake or not. So in any case, this case is extremely unpleasant.

Case 3

public void TearDown()
{
    ...
    foreach (var testResult in _results)
    {
         string message = $"{testResult.Order}. {testResult.Name} has " 
               + $"{(testResult.Passed ? "passed [+]" : "failed [-]")}";
         if (testResult.Passed)
         {
               TestContext.WriteLine(message);
         }
         else
         {
               TestContext.WriteLine(message);
         }
     }
}

PVS-Studio Warning: V3004 The 'then' statement is equivalent to the 'else' statement. Nethermind.Overseer.Test TestBuilder.cs 46

And again, we do not care for verification, because in the end we get the same thing. And again we sit and suffer, thinking, but what was the developer wanting to write here. A waste of time that could have been avoided by using static analysis and immediately fixing such ambiguous code.

Case 4

public void Setup()
{
    if (_decoderBuffer.ReadableBytes > 0)
    {
        throw new Exception("decoder buffer");
    }

    if (_decoderBuffer.ReadableBytes > 0)
    {
        throw new Exception("decoder buffer");
    }
    ...
}

PVS-Studio Warning: V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless Nethermind.Network.Benchmark InFlowBenchmarks.cs 55

Just hit Ctrl + V again . We remove the excess check and everything is fine. I’m sure that if some other condition were important here, then everyone would write in one if through the logical operator I.

Case 5

private void LogBlockAuthorNicely(Block block, ISyncPeer syncPeer)
{
    if (_logger.IsInfo)
    {
        if (_logger.IsInfo)
        {
            ...
        }
    }
}

PVS-Studio Warning: V3030 Recurring check. The '_logger.IsInfo' condition was already verified in line 242. Nethermind.Synchronization SyncServer.cs 244

In the same way as in the fourth case, an extra check is performed. However, the difference is that _logger has not only one property, but also, for example, ' bool IsError {get; } '. Therefore, the code should probably look like this:

private void LogBlockAuthorNicely(Block block, ISyncPeer syncPeer)
{
    if (_logger.IsInfo)
    {
        if (!_logger.IsError) // <=
        {
            ...
        }
    }
}

Well, or all the fault of the code refactoring and just one check is no longer needed.

Case 6

if (missingParamsCount != 0)
{
    bool incorrectParametersCount = missingParamsCount != 0; // <=
    if (missingParamsCount > 0)
    {
        ...
    }
    ...
}

PVS-Studio warning : V3022 Expression 'missingParamsCount! = 0' is always true. Nethermind.JsonRpc JsonRpcService.cs 127

Check the condition (missingParamsCount! = 0) and if it is true, then we calculate it again and assign the result to a variable. Agree that this is a fairly original way to write true.

Confusing check


public async Task<long> 
DownloadHeaders(PeerInfo bestPeer, 
                BlocksRequest blocksRequest, 
                CancellationToken cancellation)
{
  ...
  for (int i = 1; i < headers.Length; i++)
  {
    ...
    BlockHeader currentHeader = headers[i];
    ...
    bool isValid = i > 1 ? 
        _blockValidator.ValidateHeader(currentHeader, headers[i - 1], false):
        _blockValidator.ValidateHeader(currentHeader, false);
    ...
    if (HandleAddResult(bestPeer, 
                        currentHeader, 
                        i == 0,                              // <=
                        _blockTree.Insert(currentHeader))) 
    {
       headersSynced++;
    }

    ...
  }
  ...
}

PVS-Studio warning : V3022 Expression 'i == 0' is always false. Nethermind.Synchronization BlockDownloader.cs 192

Let's start in order. When initializing, the variable i is assigned the value 1. Next, the variable is only incremented; therefore, the value will always be passed to the function false .

Now let's look at HandleAddResult :

private bool HandleAddResult(PeerInfo peerInfo, 
                             BlockHeader block,
                             bool isFirstInBatch, 
                             AddBlockResult addResult)
{
    ...
    if (isFirstInBatch)
    {
        ...
    }
    else
    {
        ...
    }
    ...
}

Here we are interested in isFirstInBatch . Judging by the name of this parameter, it is responsible for whether something is the first in the party. Hm, first. Let's look again above and see that there are 2 calls using i :

BlockHeader currentHeader = headers[i];
_blockValidator.ValidateHeader(currentHeader, headers[i - 1], false)

Do not forget that in this case the count is at 1. It turns out that we have two options: either a "first" means an element at index 1, or under the symbol 0. In any case, at the same time i will be equal to 1.

It turns out that the function call should look like this:

HandleAddResult(bestPeer, currentHeader, 
                i == 1, _blockTree.Insert(currentHeader))

Or like this:

HandleAddResult(bestPeer, currentHeader, 
                i - 1 == 0, _blockTree.Insert(currentHeader))

And again, if a developer constantly used a static analyzer, then he would write this code and see a warning, would quickly fix it and enjoy life.

A priority ??


Situation 1

public int MemorySize
{
  get
  {
    int unaligned = (Keccak == null ? MemorySizes.RefSize : 
        MemorySizes.RefSize + Keccak.MemorySize) 
        + (MemorySizes.RefSize + FullRlp?.Length 
                                 ?? MemorySizes.ArrayOverhead)   // <=
        + (MemorySizes.RefSize + _rlpStream?.MemorySize 
                                 ?? MemorySizes.RefSize)         // <=
        + MemorySizes.RefSize + (MemorySizes.ArrayOverhead + _data?.Length 
        * MemorySizes.RefSize ?? MemorySizes.ArrayOverhead) 
        + MemorySizes.SmallObjectOverhead + (Key?.MemorySize ?? 0);
    return MemorySizes.Align(unaligned);
  }
}

PVS-Studio warnings:

  • V3123 Perhaps the '??' operator works in a different way than it was expected. Its priority is lower than priority of other operators in its left part. Nethermind.Trie TrieNode.cs 43
  • V3123 Perhaps the '??' operator works in a different way than it was expected. Its priority is lower than priority of other operators in its left part. Nethermind.Trie TrieNode.cs 44

The analyzer advises you to check how we use the "??" operators, and to understand what the problem is, I propose to consider the following situation. We look at this line here:

(MemorySizes.RefSize + FullRlp?.Length ?? MemorySizes.ArrayOverhead)

MemorySizes.RefSize and MemorySizes.ArrayOverhead are constants:

public static class MemorySizes
{
    ...
    public const int RefSize = 8;
    public const int ArrayOverhead = 20;
    ...
}

Therefore, for clarity, I propose to rewrite the string, substituting their values:

(8 + FullRlp?.Length ?? 20)

Now suppose FullRlp is null. Then (8 + null) will be null . Next, we get the expression ( null ?? 20 ), which will return 20.

It turns out that, provided that FullRlp is null , the value from MemorySizes.ArrayOverhead will always be returned, regardless of what is stored in MemorySizes.RefSize. The fragment on the line below is similar.

But the question is, did the developer want this behavior? Let's look at the following line:

MemorySizes.RefSize + (MemorySizes.ArrayOverhead 
    + _data?.Length * MemorySizes.RefSize ?? MemorySizes.ArrayOverhead) 

Here, as in the sections considered above, MemorySizes.RefSize is added to the expression, but

note that after the first “+” operator there is a bracket. It turns out that we should add some expression to MemorySizes.RefSize , and if it is null , then we will add another. So the code should look like this:

public int MemorySize
{
  get
  {
    int unaligned = (Keccak == null ? MemorySizes.RefSize : 
       MemorySizes.RefSize + Keccak.MemorySize) 
       + (MemorySizes.RefSize + (FullRlp?.Length 
                                 ?? MemorySizes.ArrayOverhead))    // <=
       + (MemorySizes.RefSize + (_rlpStream?.MemorySize 
                                 ?? MemorySizes.RefSize))          // <=
       + MemorySizes.RefSize + (MemorySizes.ArrayOverhead + _data?.Length 
       * MemorySizes.RefSize ?? MemorySizes.ArrayOverhead) 
       + MemorySizes.SmallObjectOverhead + (Key?.MemorySize ?? 0);
    return MemorySizes.Align(unaligned);
  }
}

Again, this is only an assumption, however, if the developer wanted a different behavior, then you should explicitly indicate this:

((MemorySizes.RefSize + FullRlp?.Length) ?? MemorySizes.ArrayOverhead)

And then, the one who reads this code would not have to delve into for a long time what is happening here and what the programmer wanted.

Situation 2

private async Task<JsonRpcResponse> 
ExecuteAsync(JsonRpcRequest request, 
             string methodName,
             (MethodInfo Info, bool ReadOnly) method)
{
    var expectedParameters = method.Info.GetParameters();
    var providedParameters = request.Params;
    ...
    int missingParamsCount = expectedParameters.Length 
            - (providedParameters?.Length ?? 0) 
            + providedParameters?.Count(string.IsNullOrWhiteSpace) ?? 0; // <=
    if (missingParamsCount != 0)
    {
        ...
    }
    ...
}

PVS-Studio Warning: V3123 Perhaps the '??' operator works in a different way than it was expected. Its priority is lower than priority of other operators in its left part. Nethermind.JsonRpc JsonRpcService.cs 123

And again, the priority of the operation is "??", therefore, as in the last time, we consider the situation. We look at this line here:

expectedParameters.Length 
            - (providedParameters?.Length ?? 0) 
            + providedParameters?.Count(string.IsNullOrWhiteSpace) ?? 0;

Suppose that providedParameters is null , then for clarity, immediately replace everything related to providedParameters with null, and substitute a random value instead of expectedParameters.Length :

100 - (null ?? 0) + null ?? 0;

Now it’s immediately noticeable that there are two similar checks, only in one case there are brackets, and in the other there. Let's run this example. First we get that ( null ?? 0 ) will return 0, then subtract 0 from 100 and get 100:

100 + null ?? 0;

Now, instead of first executing " null ?? 0 " and finally getting ( 100 + 0 ), we get completely different.

First it will execute ( 100 + null ) and we get null . Then it checks ( null ?? 0 ), which leads to the fact that the value of the missingParamsCount variable will be 0.

Since the next condition is that the missingParamsCount is equal to zero, we can assume that the developer sought this behavior. And I’ll say, why not put brackets and express my thoughts explicitly? In general, it is possible that this check was due to a misunderstanding of why 0 is sometimes returned, and this is nothing more than a crutch.

And again, we are wasting time, although we might not have done this; use the incremental analysis mode when writing code .

Conclusion


In conclusion, I hope that I was able to convey to you that the static analyzer is your friend, and not an evil overseer who is just waiting for you to make a mistake.

It should also be noted that if you use the analyzer once or rarely use it, you will certainly find errors and some of them will even be quickly fixed, but there will also be ones that you need to smash your head at. Therefore, you need to use a static analyzer constantly. Then you will find many more errors and correct them at the moment when the program code is written and you understand exactly what you are trying to do.

The simple truth is that everyone makes mistakes and that’s normal. We all learn from mistakes, but only from those that were able to notice and understand. Therefore, use modern tools to search for these very errors, for example - PVS-Studio. Thank you for the attention.



If you want to share this article with an English-speaking audience, then please use the link to the translation: Nikolay Mironov. Single line code or check of Nethermind using PVS-Studio C # for Linux .

All Articles