And again about embedded: looking for bugs in the Embox project

Figure 2

Embox is a cross-platform, multi-tasking real-time operating system for embedded systems. It is designed to work in low computing resources and allows you to run Linux-based applications on microcontrollers without using Linux itself. Of course, like any other application, Embox bugs are also not spared. This article is devoted to the analysis of errors found in the code of the Embox project.

A few months ago, I already wrote an article about checking FreeRTOS, another OS for embedded systems. I did not find errors in it then, but I found them in libraries added by the guys from Amazon when developing their own version of FreeRTOS.

The article that you see now in front of you, in a sense, continues the theme of the previous one. We often received requests to check FreeRTOS, and we did it. This time, there were no requests to check on a specific project, but I began to receive letters and comments from embedded developers who liked it and who want more.

Well, the new issue of PVS-Studio for Embedded magazine has matured and lies before you. Enjoy watching!

Analysis


The analysis was carried out using PVS-Studio - a static code analyzer for C, C ++, C # and Java. Before analysis, the project needs to be assembled - so we will be sure that the project code is working, and we will also give the analyzer the opportunity to collect the assembly information that can be useful for better code verification.

The instructions in the official Embox repository offer the ability to build under different systems (Arch Linux, macOS, Debian) and using Docker. I decided to add some variety to my life and build and analyze from under Debian, which I recently installed on my virtual machine.

The assembly went smoothly. Now it was necessary to conduct an analysis. Debian is one of the supported PVS-Studio Linux-based systems. A convenient way to check projects from under Linux is to compile trace. This is a special mode in which the analyzer collects all the necessary information about the assembly so that you can then start the analysis with one click. All I needed to do was:

1) Download and install PVS-Studio;

2) Launch assembly tracking by going to the folder with Embox and typing in the terminal

pvs-studio-analyzer analyze -- make

3) After waiting for the assembly to complete, run the command:

pvs-studio-analyzer analyze -o /path/to/output.log

4) Convert the raw report to any format convenient for you. The analyzer comes with a special utility PlogConverter, with which you can do this. For example, the command to convert the report to tasklist (for viewing, for example, in QtCreator) will look like this:

plog-converter -t tasklist -o /path/to/output.tasks /path/to/project

All! It took me no more than 15 minutes to complete these points. The report is ready, now you can view the errors. Well, let's get started :)

Strange cycle


One of the errors found by the analyzer was the strange while loop:

int main(int argc, char **argv) {
  ....

  while (dp.skip != 0 ) {
    n_read = read(ifd, tbuf, dp.bs);
    if (n_read < 0) {
      err = -errno;
      goto out_cmd;
    }
    if (n_read == 0) {
      goto out_cmd;
    }

    dp.skip --;
  } while (dp.skip != 0);       // <=

  do {
    n_read = read(ifd, tbuf, dp.bs);
    if (n_read < 0) {
      err = -errno;
      break;
    }

    if (n_read == 0) {
      break;
    }

    ....

    dp.count --;
  } while (dp.count != 0);
  ....
}

PVS-Studio warning : V715 The 'while' operator has empty body. Suspicious pattern detected: 'while (expr) {...} while (dp.skip! = 0);'. dd.c 225 hm

. Really weird loop. The while expression (dp.skip! = 0) is written twice, once right above the loop, and the second just below it. In fact, now these are two different cycles: one contains expressions in curly braces, and the second is empty. In this case, the second cycle will never be executed.

Below is a do ... while loop with a similar condition, which leads me to think: the strange loop was originally meant as do ... while, but something went wrong. I think that this piece of code with a high probability contains a logical error.

Memory leaks


Yes, not without them.

int krename(const char *oldpath, const char *newpath) {
  
  char *newpatharg, *oldpatharg;

  ....

  oldpatharg =
    calloc(strlen(oldpath) + diritemlen + 2, sizeof(char));
  newpatharg =
    calloc(strlen(newpath) + diritemlen + 2, sizeof(char));
  if (NULL == oldpatharg || NULL == newpatharg) {
    SET_ERRNO(ENOMEM);
    return -1;
  }

  ....
}

PVS-Studio warnings:

  • V773 The function was exited without releasing the 'newpatharg' pointer. A memory leak is possible. kfsop.c 611
  • V773 The function was exited without releasing the 'oldpatharg' pointer. A memory leak is possible. kfsop.c 611

The function creates the local variables newpatharg and oldpatharg inside itself . These pointers are assigned the addresses of new memory locations allocated internally using calloc . If a problem occurs when allocating memory, calloc returns a null pointer.

What if only one memory block is allocated? The function will crash without any memory being freed. The site that happened to be allocated will remain in memory without any opportunity to access it again and free it for further use.

Another example of a memory leak is a bit brighter:

static int block_dev_test(....) {
  int8_t *read_buf, *write_buf;
  
  ....

  read_buf = malloc(blk_sz * m_blocks);
  write_buf = malloc(blk_sz * m_blocks);

  if (read_buf == NULL || write_buf == NULL) {
    printf("Failed to allocate memory for buffer!\n");

    if (read_buf != NULL) {
      free(read_buf);
    }

    if (write_buf != NULL) {
      free(write_buf);
    }

    return -ENOMEM;
  }

  if (s_block >= blocks) {
    printf("Starting block should be less than number of blocks\n");
    return -EINVAL;            // <=
  }

  ....
}

PVS-Studio warnings:

  • V773 The function was exited without releasing the 'read_buf' pointer. A memory leak is possible. block_dev_test.c 195
  • V773 The function was exited without releasing the 'write_buf' pointer. A memory leak is possible. block_dev_test.c 195

Here, the programmer has already shown accuracy and correctly processed the case in which it was possible to allocate only one piece of memory. Processed correctly ... and literally in the next expression made another mistake.

Thanks to a correctly written check, we can be sure that at the time the expression is executed, return -EINVAL; we will definitely have memory allocated for both read_buf and write_buf . Thus, with such a return from the function, we will have two leaks at once.

I think that getting a memory leak on an embedded device can be more painful than on a classic PC. In conditions when resources are severely limited, you need to monitor them especially carefully.

Pointers mishandling


The following error code is concise and simple enough:

static int scsi_write(struct block_dev *bdev, char *buffer,
                      size_t count, blkno_t blkno) {
  struct scsi_dev *sdev;
  int blksize;

  ....

  sdev = bdev->privdata;
  blksize = sdev->blk_size; // <=

  if (!sdev) {              // <=
    return -ENODEV;
  }

  ....
}

PVS-Studio Warning: V595 The 'sdev' pointer was utilized before it was verified against nullptr. Check lines: 116, 118. scsi_disk.c 116

The sdev pointer is dereferenced just before it is checked for NULL. It is logical to assume that if someone wrote such a check, then this pointer may be null. In this case, we have the potential dereferencing of the null pointer in the string blksize = sdev-> blk_size .

The error is that the check is not located where it is needed. It should have come after the line " sdev = bdev-> privdata; ", but before the line " blksize = sdev-> blk_size; ". Then the potential appeal to the zero address could be avoided.

PVS-Studio found two more errors in the following code:

void xdrrec_create(....)
{
  char *buff;

  ....

  buff = (char *)malloc(sendsz + recvsz);
  assert(buff != NULL);

  ....

  xs->extra.rec.in_base = xs->extra.rec.in_curr = buff;
  xs->extra.rec.in_boundry 
    = xs->extra.rec.in_base + recvsz;                    // <=

  ....
  xs->extra.rec.out_base
    = xs->extra.rec.out_hdr = buff + recvsz;             // <= 
  xs->extra.rec.out_curr 
    = xs->extra.rec.out_hdr + sizeof(union xdrrec_hdr);

  ....
}

PVS-Studio warnings:

  • V769 The 'xs-> extra.rec.in_base' pointer in the 'xs-> extra.rec.in_base + recvsz' expression could be nullptr. In such case, resulting value will be senseless and it should not be used. Check lines: 56, 48. xdr_rec.c 56
  • V769 The 'buff' pointer in the 'buff + recvsz' expression could be nullptr. In such case, resulting value will be senseless and it should not be used. Check lines: 61, 48. xdr_rec.c 61

The buf pointer is initialized with malloc , and then its value is used to initialize other pointers. The malloc function can return a null pointer, and this should always be checked. It would seem that here is an assert checking buf for NULL , and everything should work fine.

But no! The fact is that assert 's are used for debugging, and when building the project in the Release configuration, this assert will be deleted. It turns out that when working in Debug, the program will work correctly, and when building in Release, the null pointer "goes" further.

Use nullin arithmetic operations is incorrect, because the result of such an operation will not make any sense, and such a result cannot be used. This is what the analyzer warns us about.

Some might argue that not checking the pointer after malloc / realloc / calloc is fearless. Like, at the first access by a null pointer, a signal / exception will occur and there will be nothing wrong. In practice, everything is much more complicated, and if the lack of verification does not seem dangerous to you, I suggest you take a look at the article β€œ Why is it important to check what the malloc function returned ”.

Incorrect handling of arrays


The following error is very similar to the example before last:


int fat_read_filename(struct fat_file_info *fi,
                      void *p_scratch,
                      char *name) {
  int offt = 1;

  ....

  offt = strlen(name);
  while (name[offt - 1] == ' ' && offt > 0) { // <=
    name[--offt] = '\0';
  }
  log_debug("name(%s)", name);

  return DFS_OK;
}

PVS-Studio Warning: V781 The value of the 'offt' index is checked after it was used. Perhaps there is a mistake in program logic. fat_common.c 1813 The

variable offt is first used inside the indexing operation, and only then it is checked that its value is greater than zero. But what happens if name turns out to be an empty string? The strlen () function will return 0 , then a loud shot in the leg. The program will access at a negative index, which will lead to undefined behavior. Anything can happen, including a program crash. Mess!

Picture 1

Suspicious conditions


And where without them? We find such errors literally in every project that we check.

int index_descriptor_cloexec_set(int fd, int cloexec) {
  struct idesc_table *it;

  it = task_resource_idesc_table(task_self());
  assert(it);

  if (cloexec | FD_CLOEXEC) {
    idesc_cloexec_set(it->idesc_table[fd]);
  } else {
    idesc_cloexec_clear(it->idesc_table[fd]);
  }
  return 0;
}

PVS-Studio Warning: V617 Consider inspecting the condition. The '0x0010' argument of the '|' bitwise operation contains a non-zero value. index_descriptor.c 55

To understand what the error lies in, look at the definition of the FD_CLOEXEC constant :

#define FD_CLOEXEC 0x0010

It turns out that in the expression if (cloexec | FD_CLOEXEC) to the right of the bitwise β€œor” there is always a nonzero constant. The result of such an operation will always be a nonzero number. Thus, this expression will always be equivalent to the if (true) expression , and we will always process only the then-branch of the if-expression.

I suspect that this macro constant is used to pre-configure the Embox operating system, but even then this always true condition looks strange. Perhaps they wanted to use the & operator here , but they made a typo.

Integer division


The following error is related to one feature of the C language:

#define SBSIZE    1024

static int ext2fs_format(struct block_dev *bdev, void *priv) {
  size_t dev_bsize;
  float dev_factor;

  ....

  dev_size = block_dev_size(bdev);
  dev_bsize = block_dev_block_size(bdev);
  dev_factor = SBSIZE / dev_bsize;            // <=

  ext2_dflt_sb(&sb, dev_size, dev_factor);
  ext2_dflt_gd(&sb, &gd);

  ....
}

PVS-Studio warning : V636 The '1024 / dev_bsize' expression was implicitly cast from 'int' type to 'float' type. Consider utilizing an explicit type cast to avoid the loss of a fractional part. An example: double A = (double) (X) / Y ;. ext2.c 777

This feature is as follows: if we divide two integer values, then the result of the division will be integer. Thus, division will occur without a remainder, or, in other words, the fractional part will be discarded from the division result.

Sometimes programmers forget about it, and errors like this are made. SBSIZE constant and variable dev_bsize have integer type (int, and size_t respectively). Therefore, the result of the SBSIZE / dev_bsize expressionwill also be of integer type.

But wait a moment. The variable dev_factor is of type float ! Obviously, the programmer expected to get a fractional division result. This can be further verified if you pay attention to the further use of this variable. For example, the ext2_dflt_sb function , where dev_factor is passed as the third parameter, has the following signature:

static void ext2_dflt_sb(struct ext2sb *sb, size_t dev_size, float dev_factor);

Similarly, in other places where the dev_factor variable is used : everything indicates that a floating-point number is expected.

To correct this error, it is enough just to cast one of the division operands to a real type. For instance:

dev_factor = float(SBSIZE) / dev_bsize;

Then the result of division will be a fractional number.

Unverified input


The following error is associated with the use of unverified data received from outside the program.

int main(int argc, char **argv) {
  int ret;
  char text[SMTP_TEXT_LEN + 1];

  ....

  if (NULL == fgets(&text[0], sizeof text - 2, /* for \r\n */
      stdin)) { ret = -EIO; goto error; }
    text[strlen(&text[0]) - 1] = '\0'; /* remove \n */    // <=

  ....
}

PVS-Studio Warning: V1010 Unchecked tainted data is used in index: 'strlen (& text [0])'. sendmail.c 102

First, consider what the fgets function returns . In case of successful reading of a line, the function returns a pointer to this line. If the read encounters an end-of-file before at least one element is read, or if an input error occurs, the fgets function returns NULL .

So the expression is NULL == fgets (....)checks if the input received is correct. But there is one caveat. If you transfer a null terminal as the first character to be read (this can be done, for example, by pressing Ctrl + 2 in the Legacy mode of the Windows command line), the fgets function considers it without returning NULL . In this case, the line to which the recording is made will have only one element - ' \ 0 '.

What will happen next? The expression strlen (& text [0]) will return 0 . As a result, we get a negative index call:

text[ 0 - 1 ] = '\0';

As a result, we can β€œoverturn” the program by simply passing the line termination character to the input. This is awkward, and could potentially be used to attack systems using Embox.

My colleague, who was developing this diagnostic rule, even made a note with an example of such an attack on the NcFTP project:


I recommend to see if you still do not believe in such an opportunity :)

Also, the analyzer found two more places with the same error:

  • V1010 Unchecked tainted data is used in index: 'strlen (& from [0])'. sendmail.c 55
  • V1010 Unchecked tainted data is used in index: 'strlen (& to [0])'. sendmail.c 65

Misra


MISRA is a set of guidelines and guidelines for writing secure C and C ++ code for highly responsible embedded systems. In a sense, this is a training manual, following which will allow you to get rid of not only the so-called "code smells", but also protect your program from vulnerabilities.

MISRA is used where human lives depend on the quality of your embedded system: in the medical, automotive, aircraft and military industries.

PVS-Studio has an extensive set of diagnostic rules that allow you to check your code for compliance with MISRA C and MISRA C ++ standards. By default, the mode with these diagnostics is turned off, but since we are looking for errors in the project for embedded systems, I simply could not do without MISRA.

Here is what I managed to find:

/* find and read symlink file */
static int ext2_read_symlink(struct nas *nas,
                             uint32_t parent_inumber,
                             const char **cp) {
  char namebuf[MAXPATHLEN + 1];

  ....

  *cp = namebuf;              // <=
  if (*namebuf != '/') {
    inumber = parent_inumber;
  } else {
    inumber = (uint32_t) EXT2_ROOTINO;
  }
  rc = ext2_read_inode(nas, inumber);

  return rc;
} 

PVS-Studio Warning: V2548 [MISRA C 18.6] Address of the local array 'namebuf' should not be stored outside the scope of this array. ext2.c 298

The analyzer detected a suspicious assignment that could potentially lead to undefined behavior.

Let's take a closer look at the code. Here, namebuf is an array created in the local scope of the function, and the cp pointer is passed to the function by pointer.

According to C syntax, the name of the array is a pointer to the first element in the memory area in which the array is stored. It turns out that the expression * cp = namebuf will assign the address of the array namebuf to the variable pointed to by cp. Since cp is passed to the function by pointer, a change in the value that it points to will be reflected in the place where the function was called.

It turns out that after the ext2_read_symlink function finishes its work, its third parameter will indicate the area that the namebuf array once occupied .

There is only one β€œbut”: since namebuf is an array reserved on the stack, it will be deleted when the function exits. Thus, a pointer that exists outside the function will point to the freed memory.

What will be at that address? It is impossible to predict. It is possible that for some time the contents of the array will continue to be in memory, or it is possible that the program will immediately erase this area with something else. In general, accessing such an address will return an undefined value, and using such a value is a gross error.

The analyzer also found another error with the same warning:

  • V2548 [MISRA C 18.6] Address of the local variable 'dst_haddr' should not be stored outside the scope of this variable. net_tx.c 82

Figure 6

Conclusion


I liked working with the Embox project. Despite the fact that I did not write out all the errors found in the article, the total number of warnings was relatively small, and in general, the project code was written with high quality. Therefore, I express my gratitude to the developers, as well as to those who contributed to the project on behalf of the community. You are great!

I take this opportunity to convey greetings to the developers from Tula. I will believe that in St. Petersburg it’s not very cold right now :)

This is where my article comes to an end. I hope you enjoyed reading it, and you found something new for yourself.

If you are interested in PVS-Studio and would like to independently verify some project using it, download and try it . This will take no more than 15 minutes.



If you want to share this article with an English-speaking audience, then please use the link to the translation: George Gribkov. About embedded again: searching for bugs in the Embox project .

All Articles