NSA, Ghidra and Unicorns

NSA, Ghidra and Unicorns

This time, the PVS-Studio team was attracted by Ghidra, a large and evil framework for reverse engineering with which you can analyze various binary files and do all kinds of scary things with them. The most interesting thing about it is not that it is free to use or extends well with plugins, but that it was written in the NSA and posted on GitHub for everyone. On the one hand, the NSA seems to have enough resources to keep the code base clean. And on the other hand, new contributors who were not very familiar with it could have accidentally added undetected bugs lately. Therefore, armed with a static analysis, we decided to look for weaknesses in this project.

Prelude


In total, the PVS-Studio static analyzer issued 651 high, 904 medium, and 909 low warnings in the Java part of the Ghidra project ( release 9.1.2, commit 687ce7f ). Among them, about half of the high and medium responses were triggered by the V6022 diagnostic.Parameter is not used inside method's body ", which usually appears after refactoring, when some parameter was no longer needed or some functionality was temporarily disabled by comments. A quick look at these warnings (there are too many of them to view each as an outside observer ) in this project did not reveal anything obviously suspicious. It is probably permissible for this project to temporarily disable this diagnostics in the analyzer settings so as not to be distracted by it. In practice, you can often see typos in the name of the setter or constructor parameter and, generally speaking, it should I’m sure most readers at least once come across a similar unpleasant pattern:

public class A {
  private String value;
  public A(String val) { // V6022
    this.value = value;
  }
  public int hashCode() {
    return value.hashCode(); // NullPointerException
  }
}

More than half of the low warnings were made by the " V6008 Potential null dereference of 'variable'" diagnostic - for example, the value File.getParentFile () is often used without checking for null . If the file object on which this method was called was constructed without an absolute path, null will be returned and the absence of verification may drop the application.

By tradition, we will analyze only warnings of high and medium levels, since the bulk of real errors are contained in them. When working with analyzer reports, we always recommend analyzing warnings in descending order of their reliability.

Next, we consider a few fragments indicated by the analyzer that seemed suspicious or interesting to me. The code base of the project turned out to be of impressive size, and it is almost impossible to find such places manually.

Fragment 1: broken validation


private boolean parseDataTypeTextEntry()
throws InvalidDataTypeException {
  ...
  try {
    newDataType = parser.parse(selectionField.getText(),
                               getDataTypeRootForCurrentText());
  }
  catch (CancelledException e) {
    return false;
  }
  if (newDataType != null) {
    if (maxSize >= 0
        && newDataType.getLength() > newDataType.getLength()) { // <=
      throw new InvalidDataTypeException("data-type larger than "
                                         + maxSize + " bytes");
    }
    selectionField.setSelectedValue(newDataType);
    return true;
  }
  return false;
}

PVS-Studio Warning: V6001 There are identical sub-expressions 'newDataType.getLength ()' to the left and to the right of the '>' operator. DataTypeSelectionEditor.java data66

This class provides a graphical component for selecting a data type that supports auto-completion. The developer using this component can set the maximum permissible size of the selected data type (via the maxSize field ) or make it unlimited by setting a negative value. It was assumed that when validating the entered data, exceeding the limit throws an exception, which is then caught up the call stack and the user is shown a message.

It seems that the author of the component was distracted right at the time of writing this test, or maybe he thought about the meaning of life, but in the end, validation simply does not work, since the number can never be greater than itself and, accordingly, we get ignoring this condition. This means that this component can provide invalid data.

Another similar error was found in two more classes: GuidUtil and NewGuid .

public class GuidUtil {
  ...
  public static GuidInfo parseLine(...) {
    ...
    long[] data = new long[4];
    ...
    if (isOK(data)) {
      if (!hasVersion) {
        return new GuidInfo(guidString, name, guidType);
      }
      return new VersionedGuidInfo(guidString, version, name, guidType);
    }
    return null;
  }
  ...
  private static boolean isOK(long[] data) {
    for (int i = 0; i < data.length; i++) {
      if ((data[i] != 0) || (data[i] != 0xFFFFFFFFL)) { // <=
        return true;
      }
    }
    return false;
  }
  ...
}

PVS-Studio warning : V6007 Expression 'data [i]! = 0xFFFFFFFFL' is always true. GuidUtil.java:200

The for loop of the isOK method checks that the same value is not equal to two different numbers at the same time. If so, then the GUID is immediately recognized as valid. That is, the GUID will be invalid only if the data array is empty, and this never happens, since the value of the corresponding variable is assigned only once - at the beginning of the parseLine method . IsOK

method bodyin both classes it completely coincides, which suggests the idea of ​​another copy-paste of incorrect code. What exactly the author wanted to check, I'm not sure, but I can assume that this method should be fixed as follows:

private static boolean isOK(long[] data) {
  for (int i = 0; i < data.length; i++) {
    if ((data[i] == 0) || (data[i] == 0xFFFFFFFFL)) {
      return false;
    }
  }
  return true;
}

Fragment 2: hiding exceptions


public void putByte(long offsetInMemBlock, byte b)
throws MemoryAccessException, IOException {
  long offsetInSubBlock = offsetInMemBlock - subBlockOffset;
  try {
    if (ioPending) {
      new MemoryAccessException("Cyclic Access"); // <=
    }
    ioPending = true;
    doPutByte(mappedAddress.addNoWrap(offsetInSubBlock / 8),
              (int) (offsetInSubBlock % 8), b);
  }
  catch (AddressOverflowException e) {
    new MemoryAccessException("No memory at address"); // <=
  }
  finally {
    ioPending = false;
  }
}

PVS-Studio Warning: V6006 The object was created but it is not being used. The 'throw' keyword could be missing: 'new MemoryAccessException ("Cyclic Access")'. BitMappedSubMemoryBlock.java:99

The exception objects themselves, as you know, do nothing (or, at least, should not do anything). Almost always, their new instances are thrown through throw , in some rare cases - transferred somewhere or, perhaps, placed in a collection.

The class containing this method is a wrapper over a memory block that allows you to read and write data. Here, due to the fact that exceptions are not thrown, the imposed restriction of access to the current memory block with the ioPending flag may be violatedand, in addition, an AddressOverflowException is ignored . Thus, the data can be silently corrupted, and instead of explicitly indicating an error at a particular place, the developer will receive strange artifacts that have to be analyzed by the debugger.

There were eight of these lost exceptions:

  • BitMappedSubMemoryBlock.java: lines 77, 99, 106, 122
  • ByteMappedSubMemoryBlock.java: lines 52, 73, 92, 114

It is characteristic that in the same files there are extremely similar methods in which throw is present. Most likely, one method was originally written similarly to the above fragment, after which it was copied several times, somehow found an error and corrected it in those places that they could remember.

Fragment 3: minefield


private void processSelection(OptionsTreeNode selectedNode) {
  if (selectedNode == null) {
    setViewPanel(defaultPanel, selectedNode); // <=
    return;
  }
  ...
}
private void setViewPanel(JComponent component, OptionsTreeNode selectedNode) {
  ...
  setHelpLocation(component, selectedNode);
  ...
}
private void setHelpLocation(JComponent component, OptionsTreeNode node) {
  Options options = node.getOptions();
  ...
}

PVS-Studio warning : V6008 Null dereference of 'selectedNode' in function 'setViewPanel'. OptionsPanel.java:266

The analyzer lied a bit - at the moment, calling the processSelection method does not lead to a NullPointerException , since this method is called only two times, and before it is called, selectedNode is explicitly checked for null . This should not be done, since another developer can see that the method explicitly handles the case selectedNode == null , and decide that this is a valid value, which then will result in the application crashing. Such surprises are especially dangerous just in open projects, since people who do not know the code base thoroughly participate in them.

In general, I must say that the whole processSelection method looks rather strange. It is likely that this is a copy-paste error, since in the same method an if block with the same body is encountered twice more, although with different conditions. However, at this point, the selectedNode will no longer be null , and the setViewPanel-setHelpLocation call chain will not result in a NullPointerException .

Fragment 4: auto-completion for evil


public static final int[] UNSUPPORTED_OPCODES_LIST = { ... };
public static final Set<Integer> UNSUPPORTED_OPCODES = new HashSet<>();

static {
  for (int opcode : UNSUPPORTED_OPCODES) {
    UNSUPPORTED_OPCODES.add(opcode);
  }
}

PVS-Studio Warning: V6053 The collection is modified while the iteration is in progress. ConcurrentModificationException may occur. DWARFExpressionOpCodes.java:205

In this case, the analyzer again lied a bit - the exception will not be thrown, since the UNSUPPORTED_OPCODES collection is always empty and the loop simply will not execute. In addition, it so happened that the collection is a multitude, and adding an already present element will not change it. Most likely, the author entered in for-eachthe name of the collection through auto-completion and did not notice that the wrong field was proposed. Modification of the collection during iteration on it is impossible, but in good circumstances, as in this case, the application may not fall. Here, this typo has an indirect effect: the machine that parses DWARF files relies on this collection to stop the analysis when finding unsupported opcodes.

Starting with Java 9, it is worth using the factory methods of the standard library for constant collections: for example, Set.of (T ... elements) is not only much more convenient, but also immediately makes the created collection immutable, which increases the reliability of the code.

Fragment 5: there is everything


public void setValueAt(Object aValue, int row, int column) {
  ...
  int index = indexOf(newName);
  if (index >= 0) {                  // <=
    Window window = tool.getActiveWindow();
    Msg.showInfo(getClass(), window, "Duplicate Name",
                 "Name already exists: " + newName);
    return;
  }

  ExternalPath path = paths.get(row); // <=
  ...
}
private int indexOf(String name) {
  for (int i = 0; i < paths.size(); i++) {
    ExternalPath path = paths.get(i);
    if (path.getName().equals(name)) {
      return i;
    }
  }
  return 0;
}

PVS-Studio warnings:

  • V6007 Expression 'index> = 0' is always true. ExternalNamesTableModel.java:105
  • V6019 Unreachable code detected. It is possible that an error is present. ExternalNamesTableModel.java:109

The author thought about it, and in the indexOf method , instead of "index" -1 for an unknown value returns 0 - the index of the first element of the paths collection . Even if the collection is empty. Or maybe the method was generated, but forgot to change the default return value. As a result, the setValueAt method will discard any value passed to it and display the user with a "Name already exists" error, even if there is no existing name.

By the way, indexOf is not used anywhere else, and its value is needed only in order to determine whether the element you are looking for exists. Perhaps, instead of a separate method, write for-each directly in setValueAt and make returnon a matching item instead of games with indexes.

Note: I could not reproduce the alleged error. The setValueAt method is probably no longer used or called only under certain conditions.

Fragment 6: keep silence


final static Map<Character, String> DELIMITER_NAME_MAP = new HashMap<>(20);
// Any non-alphanumeric char can be used as a delimiter.
static {
  DELIMITER_NAME_MAP.put(' ', "Space");
  DELIMITER_NAME_MAP.put('~', "Tilde");
  DELIMITER_NAME_MAP.put('`', "Back quote");
  DELIMITER_NAME_MAP.put('@', "Exclamation point");
  DELIMITER_NAME_MAP.put('@', "At sign");
  DELIMITER_NAME_MAP.put('#', "Pound sign");
  DELIMITER_NAME_MAP.put('$', "Dollar sign");
  DELIMITER_NAME_MAP.put('%', "Percent sign");
  ...
}

PVS-Studio Warning: V6033 An item with the same key '@' has already been added. FilterOptions.java:45

Ghidra supports filtering data in various contexts: for example, you can filter the list of project files by name. In addition, filtering by several keywords at once is implemented: '.java, .c' when the 'OR' mode is on, displays all files whose name contains either '.java' or '.c'. It is understood that any special character can be used as a word separator (a specific separator is selected in the filter settings), but in reality the exclamation mark was not available.

In such initialization sheets, it is very easy to seal up, as they are often written using copy-paste, and when you look at such code, your eyes quickly blur. And if the typo was not on two adjacent lines, then by hand it will almost certainly no one see.

Fragment 7: remainder of division is always 0


void setFactorys(FieldFactory[] fieldFactorys,
                 DataFormatModel dataModel, int margin) {
  factorys = new FieldFactory[fieldFactorys.length];

  int x = margin;
  int defaultGroupSizeSpace = 1;
  for (int i = 0; i < factorys.length; i++) {
    factorys[i] = fieldFactorys[i];
    factorys[i].setStartX(x);
    x += factorys[i].getWidth();
    // add in space between groups
    if (((i + 1) % defaultGroupSizeSpace) == 0) { // <=
      x += margin * dataModel.getUnitDelimiterSize();
    }
  }
  width = x - margin * dataModel.getUnitDelimiterSize() + margin;
  layoutChanged();
}

PVS-Studio warnings:

  • V6007 Expression '((i + 1)% defaultGroupSizeSpace) == 0' is always true. ByteViewerLayoutModel.java:66
  • V6048 This expression can be simplified. Operand 'defaultGroupSizeSpace' in the operation equals 1. ByteViewerLayoutModel.java:66

The hex byte viewer supports the choice of the size of the displayed groups: for example, you can configure the output in the format 'ffff ffff' or 'ff ff ff ff'. The setFactorys method is responsible for the location of these groups in the user interface . Despite the fact that customization and display work correctly, the cycle in this method looks extremely suspicious: the remainder of the division by one is always zero, which means that the x coordinate will increase at each iteration. Suspicion adds properties and availability groupSize in setting DataModel .

Remaining garbage after refactoring? Or maybe the defaultGroupSizeSpace variable calculations were lost? In any case, an attempt to replace its value with dataModel.getGroupSize () broke the layout, and perhaps only the author of this code can give an unambiguous answer.

Fragment 8: broken validation, part 2


private String parseArrayDimensions(String datatype,
                                    List<Integer> arrayDimensions) {
  String dataTypeName = datatype;
  boolean zeroLengthArray = false;
  while (dataTypeName.endsWith("]")) {
    if (zeroLengthArray) {                   // <=
      return null; // only last dimension may be 0
    }
    int rBracketPos = dataTypeName.lastIndexOf(']');
    int lBracketPos = dataTypeName.lastIndexOf('[');
    if (lBracketPos < 0) {
      return null;
    }
    int dimension;
    try {
      dimension = Integer.parseInt(dataTypeName.substring(lBracketPos + 1,
                                                          rBracketPos));
      if (dimension < 0) {
        return null; // invalid dimension
      }
    }
    catch (NumberFormatException e) {
      return null;
    }
    dataTypeName = dataTypeName.substring(0, lBracketPos).trim();
    arrayDimensions.add(dimension);
  }
  return dataTypeName;
}

PVS-Studio Warning: V6007 Expression 'zeroLengthArray' is always false. PdbDataTypeParser.java:278

This method parses the dimension of a multidimensional array and returns either the text remaining after parsing or null for invalid data. A comment next to one of the validation checks states that only the last read size can be zero. The analysis goes from right to left, so it is understood that '[0] [1] [2]' is a valid input text, and '[2] [1] [0]' is not.

But, trouble: no one added the check that the next size is zero, and the parser eats invalid data without unnecessary questions. You should probably fix the try block as follows:

try {
  dimension = Integer.parseInt(dataTypeName.substring(lBracketPos + 1,
                                                      rBracketPos));
  if (dimension < 0) {
    return null; // invalid dimension
  } else if (dimension == 0) {
    zeroLengthArray = true;
  }
}
catch (NumberFormatException e) {
  return null;
}

Naturally, there is the possibility that this criterion of validity over time turned out to be unnecessary, or the author’s comment has a different meaning and it is necessary to check the first dimension read. In any case, data validation is a critical part of any application, which must be taken with full responsibility. Errors in it can lead to quite harmless crashes of the application, as well as to security holes, data leakage, data corruption or loss (for example, if you skip SQL injection during query validation).

A bit about the rest of the warnings


The reader may notice that a lot of warnings were issued, but few were considered. Not very neatly tuned cloc in the project counted about 1.25 million lines of Java code (not empty and not comments). The fact is that almost all warnings are extremely similar: here they forgot to check for null , they did not delete the unused legacy code there. I don’t really want to bore the reader by listing the same thing, and I mentioned part of such cases at the beginning of the article.

Another example is the fifty warning “ V6009 Function receives an odd argument” in the context of inaccurate use of the substring method(CParserUtils.java:280, ComplexName.java:48 and others) to get the rest of the string after any separator. Developers often hope that this separator will be present in the string and forget that otherwise indexOf will return -1, which is an incorrect value for substring . Naturally, if the data was validated or received not from outside, then the probability of the application crashing is significantly reduced. However, in general, these are potentially dangerous places that we want to help get rid of.

Conclusion


In general, Ghidra pleased with the quality of the code - no special nightmares are evident. The code is well formatted and has a very consistent style: in most cases, variables, methods, and everything else are given clear names, explanatory comments are found, a huge number of tests are present.

Naturally, there were no problems, among which:

  • Dead code, which, most likely, remained after numerous refactoring;
  • Many javadocs are hopelessly outdated and, for example, indicate non-existent parameters;
  • There is no possibility of convenient development when using IntelliJ IDEA ;
  • A modular system built around reflection makes navigating a project and finding dependencies between components much more difficult.

Please do not neglect developer tools. Static analysis, like seat belts, is not a panacea, but it will help prevent some disasters before release. And nobody likes to use the logged-in software.

You can read about other proven projects in our blog . And we also have a trial license and various options for using the analyzer without the need to pay for it.



If you want to share this article with an English-speaking audience, then please use the link to the translation: Nikita Lazeba. NSA, Ghidra, and Unicorns .

All Articles