Debugging ARM Cortex-M Microcontrollers over UART, Part 2

In the last article, I talked about the DebugMon interrupt and the registers associated with it.

In this article we will implement the debugger itself, working over UART.

Low level part


The structure of GDB server requests and responses is described here and here. Although it looks simple, we will not implement it in the microcontroller, for the following reasons:

  • High data redundancy. Addresses, register values, and variables are encoded as hex strings, which doubles the message size
  • Parsing and assembling messages takes additional resources
  • Detecting the end of a packet requires either a timeout (which ties up a timer) or a complex state machine, which increases the time spent in the UART interrupt
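To make the redundancy concrete, here is a minimal sketch of how a GDB remote protocol packet is built: binary data becomes two hex characters per byte, and the payload is wrapped as `$payload#cc`, where `cc` is the modulo-256 sum of the payload characters. The function names are hypothetical, and escaping and run-length encoding are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static const char hex_digits[] = "0123456789abcdef";

/* Hex-encode len bytes into out: 2 characters per byte, NUL-terminated.
   out must hold at least 2*len + 1 chars. Returns the string length. */
size_t rsp_hex_encode(const uint8_t *data, size_t len, char *out)
{
    assert(out != NULL);
    for (size_t i = 0; i < len; i++) {
        out[2 * i]     = hex_digits[data[i] >> 4];
        out[2 * i + 1] = hex_digits[data[i] & 0x0F];
    }
    out[2 * len] = '\0';
    return 2 * len;
}

/* Wrap a payload as "$payload#cc". out must hold strlen(payload) + 5 chars.
   Returns the packet length without the terminating NUL. */
size_t rsp_make_packet(const char *payload, char *out)
{
    size_t n = strlen(payload);
    uint8_t sum = 0;

    for (size_t i = 0; i < n; i++)
        sum = (uint8_t)(sum + (uint8_t)payload[i]);

    out[0] = '$';
    memcpy(out + 1, payload, n);
    out[n + 1] = '#';
    out[n + 2] = hex_digits[sum >> 4];
    out[n + 3] = hex_digits[sum & 0x0F];
    out[n + 4] = '\0';
    return n + 4;
}
```

A 4-byte register value turns into 8 characters of payload plus 4 characters of framing, and the receiver has to convert every pair of characters back into a byte.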

To keep the debugging module as simple and fast as possible, we will use a binary protocol with control sequences:

  • 0xAA 0xFF - Start of frame
  • 0xAA 0x00 - End of frame
  • 0xAA 0xA5 - Interrupt
  • 0xAA 0xAA - Data byte 0xAA (escaped)
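As an illustration of these sequences, a sender that builds a whole frame in a buffer could look roughly like the sketch below. `frame_encode` is a hypothetical helper and is not part of the module, which sends byte by byte from its state machine instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ESC 0xAAu

/* Writes 0xAA 0xFF, the payload with every 0xAA doubled, then 0xAA 0x00.
   Returns the frame length, or 0 if out (capacity cap) is too small. */
size_t frame_encode(const uint8_t *data, size_t len, uint8_t *out, size_t cap)
{
    size_t n = 0;

    assert(out != NULL);
    if (cap < 4)
        return 0;

    out[n++] = ESC;            /* Start of frame */
    out[n++] = 0xFF;

    for (size_t i = 0; i < len; i++) {
        size_t need = (data[i] == ESC) ? 2 : 1;
        if (n + need + 2 > cap)    /* keep room for End of frame */
            return 0;
        if (data[i] == ESC)
            out[n++] = ESC;        /* escape: 0xAA is sent as 0xAA 0xAA */
        out[n++] = data[i];
    }

    out[n++] = ESC;            /* End of frame */
    out[n++] = 0x00;
    return n;
}
```

The payload 0x01 0xAA 0x02 goes out on the wire as 0xAA 0xFF 0x01 0xAA 0xAA 0x02 0xAA 0x00. The overhead is 4 bytes per frame plus one byte for each 0xAA in the data.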

To process these sequences during reception, a state machine with 4 states is required:

  • Waiting for the Esc character
  • Waiting for the second character of the Start of frame sequence
  • Receiving data
  • An Esc character was received as the previous byte

Transmission, however, requires 7 states:

  • Sending the first byte of Start of frame
  • Sending the second byte of Start of frame
  • Sending data
  • Sending End of frame
  • Sending an escaped Esc character
  • Sending the first byte of Interrupt
  • Sending the second byte of Interrupt

Let's define the structure that holds all of the module's variables:

typedef struct 
{    
  // transmission in progress, reception disabled
  unsigned tx:1;
  // program stopped
  unsigned StopProgramm:1;
  union {
    enum rx_state_e 
    {
      rxWaitS = 0, // wait Esc symbol
      rxWaitC = 1, // wait Start of frame
      rxReceive = 2, // receiving
      rxEsc = 3, // Esc received
    } rx_state;
    enum tx_state_e 
    {
      txSendS = 0, // send first byte of Start of frame
      txSendC = 1, // send second byte
      txSendN = 2, // send byte of data
      txEsc = 3,   // send escaped byte of data
      txEnd = 4,   // send End of frame
      txSendS2 = 5,// send first byte of Interrupt
      txBrk = 6,   // send second byte
    } tx_state;
  };
  uint8_t pos; // receive position / number of bytes to send
  uint8_t buf[128]; // offset = 3
  uint8_t txCnt; // number of bytes already sent
} dbg_t;
#define dbgG ((dbg_t*)DBG_ADDR)

The states of the receive and transmit state machines share one variable, since the link operates in half-duplex mode. Now we can write the state machines themselves, together with the interrupt handler.

UART handler
void rxCb(uint8_t byte);
void txCb(void);
void parseAnswer(void);

void USART6_IRQHandler(void)
{
  if (((USART6->ISR & USART_ISR_RXNE) != 0U)
      && ((USART6->CR1 & USART_CR1_RXNEIE) != 0U))
  {
    rxCb(USART6->RDR);
    return;
  }

  if (((USART6->ISR & USART_ISR_TXE) != 0U)
      && ((USART6->CR1 & USART_CR1_TXEIE) != 0U))
  {
    txCb();
    return;
  }
}

void rxCb(uint8_t byte)
{
  dbg_t* dbg = dbgG; // debug vars pointer
  
  if (dbg->tx) // use half duplex mode
    return;
  
  switch(dbg->rx_state)
  {
  default:
  case rxWaitS:
    if (byte==0xAA)
      dbg->rx_state = rxWaitC;
    break;
  case rxWaitC:
    if (byte == 0xFF)
      dbg->rx_state = rxReceive;
    else
      dbg->rx_state = rxWaitS;
    dbg->pos = 0;
    break;
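  case rxReceive:
    if (byte == 0xAA)
      dbg->rx_state = rxEsc;
    else if (dbg->pos < sizeof(dbg->buf)) // drop bytes that do not fit
      dbg->buf[dbg->pos++] = byte;
    break;
  case rxEsc:
    if (byte == 0xAA) // escaped 0xAA data byte
    {
      if (dbg->pos < sizeof(dbg->buf))
        dbg->buf[dbg->pos++] = byte;
      dbg->rx_state  = rxReceive;
    }
    else if (byte == 0x00) // End of frame
    {
      parseAnswer();
    }
    else
      dbg->rx_state = rxWaitS;
    break;
  }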
}

void txCb(void)
{
  dbg_t* dbg = dbgG;
  switch (dbg->tx_state)
  {
  case txSendS:
    USART6->TDR = 0xAA;
    dbg->tx_state = txSendC;
    break;
  case txSendC:
    USART6->TDR = 0xFF;
    dbg->tx_state = txSendN;
    break;
  case txSendN:
    if (dbg->txCnt>=dbg->pos)
    {
      USART6->TDR = 0xAA;
      dbg->tx_state = txEnd;
      break;
    }
    if (dbg->buf[dbg->txCnt]==0xAA)
    {
      USART6->TDR = 0xAA;
      dbg->tx_state = txEsc;
      break;
    }
    USART6->TDR = dbg->buf[dbg->txCnt++];
    break;
  case txEsc:
    USART6->TDR = 0xAA;
    dbg->txCnt++;
    dbg->tx_state = txSendN;
    break;
  case txEnd:
    USART6->TDR = 0x00;
    dbg->rx_state = rxWaitS;
    dbg->tx = 0;
    CLEAR_BIT(USART6->CR1, USART_CR1_TXEIE);
    break;
  case txSendS2:
    USART6->TDR = 0xAA;
    dbg->tx_state = txBrk;
    break;
  case txBrk:
    USART6->TDR = 0xA5;
    dbg->rx_state = rxWaitS;
    dbg->tx = 0;
    CLEAR_BIT(USART6->CR1, USART_CR1_TXEIE);
    break;
  }
}


Everything is pretty simple here. Depending on the event, the interrupt handler calls either the receive state machine or the transmit state machine. To verify that everything works, let's write a packet handler that replies with a single byte:

void parseAnswer(void)
{
  dbg_t* dbg = dbgG;
  dbg->pos = 1;
  dbg->buf[0] = 0x33;
  dbg->txCnt = 0;
  dbg->tx = 1;
  dbg->tx_state = txSendS;
  SET_BIT(USART6->CR1, USART_CR1_TXEIE);
}

Compile, flash, run. The result is visible on the screen: it works.

Test exchange
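On the host side, the reply to the test packet (0xAA 0xFF 0x33 0xAA 0x00) can be decoded with the same 4-state machine that the microcontroller uses for reception. The sketch below is a hypothetical host-side counterpart with made-up names (`decoder_t`, `decoder_feed`). It ignores the Interrupt sequence, which a real server would also have to recognize.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { WAIT_ESC, WAIT_START, RECEIVE, ESCAPED } rx_state_t;

typedef struct {
    rx_state_t state;
    uint8_t    buf[128];
    size_t     len;
} decoder_t;

/* Feed one received byte. Returns 1 when a complete frame is available
   in d->buf (d->len bytes), 0 otherwise. */
int decoder_feed(decoder_t *d, uint8_t byte)
{
    assert(d != NULL);
    switch (d->state) {
    case WAIT_ESC:
        if (byte == 0xAA)
            d->state = WAIT_START;
        break;
    case WAIT_START:
        d->state = (byte == 0xFF) ? RECEIVE : WAIT_ESC;
        d->len = 0;
        break;
    case RECEIVE:
        if (byte == 0xAA)
            d->state = ESCAPED;
        else if (d->len < sizeof d->buf)
            d->buf[d->len++] = byte;
        break;
    case ESCAPED:
        if (byte == 0xAA) {                 /* escaped data byte */
            if (d->len < sizeof d->buf)
                d->buf[d->len++] = byte;
            d->state = RECEIVE;
        } else {
            d->state = WAIT_ESC;
            if (byte == 0x00)               /* End of frame */
                return 1;
        }
        break;
    }
    return 0;
}
```

Feeding the five bytes of the test reply one at a time returns 1 on the last byte, with a single payload byte 0x33 in the buffer.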


Next, we need to implement counterparts of the GDB server protocol commands:

  • read memory
  • write memory
  • stop the program
  • continue execution
  • read a core register
  • write a core register
  • set a breakpoint
  • delete a breakpoint

The command is encoded in the first byte of the data. Command codes are numbered in the order in which they were implemented:

  • 2 - read memory
  • 3 - write memory
  • 4 - stop
  • 5 - continue
  • 6 - read register
  • 7 - set breakpoint
  • 8 - clear breakpoint
  • 9 - step (could not be implemented)
  • 10 - write register (not implemented)

Parameters are transmitted in the data bytes that follow.

The response does not contain the command number, since we already know which command was sent.
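The numbering above maps naturally onto an enum. The sketch below uses hypothetical names, and the parameter layout is an assumption: a 32-bit address directly after the command byte, least significant byte first, which matches how the breakpoint code later in the article reads `dbg->buf[1]` on a little-endian Cortex-M.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Command codes from the list above; the names are made up for this sketch */
enum dbg_cmd {
    CMD_READ_MEM   = 2,
    CMD_WRITE_MEM  = 3,
    CMD_STOP       = 4,
    CMD_CONTINUE   = 5,
    CMD_READ_REG   = 6,
    CMD_SET_BKPT   = 7,
    CMD_CLEAR_BKPT = 8,
    CMD_STEP       = 9,   /* could not be implemented */
    CMD_WRITE_REG  = 10   /* not implemented */
};

/* Builds the unframed request data for a command that takes one address.
   Assumed layout: command byte, then the address in little-endian order.
   Returns the number of bytes written (always 5). */
size_t build_addr_request(uint8_t cmd, uint32_t addr, uint8_t out[5])
{
    assert(out != NULL);
    out[0] = cmd;
    out[1] = (uint8_t)(addr);
    out[2] = (uint8_t)(addr >> 8);
    out[3] = (uint8_t)(addr >> 16);
    out[4] = (uint8_t)(addr >> 24);
    return 5;
}
```

The resulting bytes still have to be wrapped in Start of frame and End of frame, with 0xAA bytes escaped, before they go out over the UART.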

To keep the module from raising BusFault exceptions during read and write operations, the fault must be masked on Cortex-M3 and higher. On Cortex-M0, a HardFault handler has to be written instead.

Safe memcpy
int memcpySafe(uint8_t* to,uint8_t* from, int len)
{
    /* Cortex-M3, Cortex-M4, Cortex-M4F, Cortex-M7 are supported */
    static const uint32_t BFARVALID_MASK = (0x80 << SCB_CFSR_BUSFAULTSR_Pos);
    int cnt = 0;

    /* Clear BFARVALID flag by writing 1 to it */
    SCB->CFSR |= BFARVALID_MASK;

    /* Ignore BusFault by enabling BFHFNMIGN and disabling interrupts */
    uint32_t mask = __get_FAULTMASK();
    __disable_fault_irq();
    SCB->CCR |= SCB_CCR_BFHFNMIGN_Msk;

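    while ((cnt<len))
    {
      *(to++) = *(from++);
      /* Stop at the first access that raised a (masked) precise BusFault */
      if ((SCB->CFSR & BFARVALID_MASK) != 0)
        break;
      cnt++;
    }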

    /* Reenable BusFault by clearing  BFHFNMIGN */
    SCB->CCR &= ~SCB_CCR_BFHFNMIGN_Msk;
    __set_FAULTMASK(mask);

    return cnt;
}


A breakpoint is set by searching for the first inactive FP_COMP register.

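Breakpoint setting code
    dbg->pos = 0; // the response carries no data: length 0
    addr = ((*(uint32_t*)(&dbg->buf[1])))|1; // value to write to FP_COMP
    for (tmp = 0;tmp<8;tmp++) // is this breakpoint already set?
      if (FP->FP_COMP[tmp] == addr)
        break;
    
    if (tmp!=8) // already set, nothing to do
      break;
    
    for (tmp=0;tmp<NUMOFBKPTS;tmp++) // look for a free comparator
      if (FP->FP_COMP[tmp]==0) // free?
      {
        FP->FP_COMP[tmp] = addr; // set the breakpoint
        break; // stop searching
      }
    break;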


Clearing a breakpoint is done by searching for the comparator that holds it. Stopping execution sets a breakpoint on the current PC. When the UART interrupt exits, the core immediately enters DebugMon_Handler.
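The article does not show the clearing code, so here is a minimal sketch of what such a search could look like. `bkpt_clear` is a hypothetical name, and the comparators are passed in as a pointer so that the function can be tried on the host with a plain array. On the target, the pointer would refer to the FP_COMP registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clears the comparator that holds a breakpoint on addr.
   Returns the comparator index, or -1 if no such breakpoint is set. */
int bkpt_clear(volatile uint32_t *comp, int count, uint32_t addr)
{
    uint32_t value = addr | 1u;   /* same encoding the set command writes */

    assert(comp != NULL);
    for (int i = 0; i < count; i++) {
        if (comp[i] == value) {
            comp[i] = 0;          /* 0 marks the comparator as free */
            return i;
        }
    }
    return -1;
}
```

Because a free comparator is represented by 0, the cleared slot immediately becomes available to the next set command.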

The DebugMon handler itself is very simple:

  • 1. The stop flag is set.
  • 2. All set breakpoints are cleared.
  • 3. The handler waits until the response to the command has been sent over the UART (if it has not finished yet).
  • 4. Sending of the Interrupt sequence begins.
  • 5. In a loop, the transmit and receive state machine handlers are called until the stop flag is cleared.

DebugMon Handler Code
void DebugMon_Handler(void)
{
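  dbgG->StopProgramm = 1; // set the stop flag
  
  for (int i=0;i<NUMOFBKPTS;i++) // clear all breakpoints
    FP->FP_COMP[i] = 0;
  
  while (USART6->CR1 & USART_CR1_TXEIE) // wait until the pending response is sent
    if ((USART6->ISR & USART_ISR_TXE) != 0U)
      txCb();

  
  dbgG->tx_state = txSendS2; // start sending the Interrupt sequence
  dbgG->tx = 1;
  SET_BIT(USART6->CR1, USART_CR1_TXEIE);

  while (dbgG->StopProgramm) // poll until the stop flag is cleared
  {
    // service the UART by polling instead of through its interrupt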
    if (((USART6->ISR & USART_ISR_RXNE) != 0U)
        && ((USART6->CR1 & USART_CR1_RXNEIE) != 0U))
      rxCb(USART6->RDR);

    if (((USART6->ISR & USART_ISR_TXE) != 0U)
        && ((USART6->CR1 & USART_CR1_TXEIE) != 0U))
      txCb(); 
  }
}


Reading the core registers from C code is problematic, so I rewrote part of the code in assembly. As a result, neither DebugMon_Handler, nor the UART interrupt handler, nor the state machines use the stack. This made it simpler to determine the core register values.
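The reason this helps: on exception entry, a Cortex-M core pushes R0-R3, R12, LR, PC and xPSR onto the active stack, and if the handler pushes nothing itself, the stack pointer still points at that frame. The sketch below uses hypothetical names and assumes the basic 8-word frame without floating-point context. It shows how the stacked values could be picked out once the frame address is known.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Basic exception frame, in the order the hardware stacks it */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} exc_frame_t;

/* Copies the 8 stacked words starting at sp into a frame structure */
exc_frame_t read_exc_frame(const uint32_t *sp)
{
    exc_frame_t f;

    assert(sp != NULL);
    memcpy(&f, sp, sizeof f);
    return f;
}

/* Value SP had before the exception: the frame is 32 bytes, plus one
   padding word if bit 9 of the stacked xPSR says the stack was realigned */
uint32_t sp_before_exception(uint32_t frame_addr, uint32_t xpsr)
{
    return frame_addr + 32u + ((xpsr & (1u << 9)) ? 4u : 0u);
}
```

R4-R11 are not part of the frame. They are still in the core registers when the handler starts, which is why a handler that does not touch them can report them directly.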

GDB server


The microcontroller part of the debugger works. Now let's write the link between the IDE and our module.

Writing a debug server from scratch makes no sense, so let's build on an existing one. Since most of my development experience is with .NET, I took this project as a basis and adapted it to the new requirements. It would have been more correct to add support for the new interface to OpenOCD, but that would have taken more time.

At startup, the program asks which COM port to work with, then starts listening on TCP port 3333 and waits for the GDB client to connect.

All GDB protocol commands are translated into the binary protocol.
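As an example of one such translation step, the GDB memory read packet has the form `m addr,length`, with both numbers in hex. The actual server is written in .NET. The C sketch below, with the hypothetical name `parse_read_mem`, only shows how the two values could be extracted before being packed into a binary read memory request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Parses the payload of a GDB 'm' packet, e.g. "m20000000,4".
   Returns 1 and fills addr/len on success, 0 on a malformed packet. */
int parse_read_mem(const char *payload, uint32_t *addr, uint32_t *len)
{
    char *end;
    const char *p;
    unsigned long a, l;

    assert(payload != NULL && addr != NULL && len != NULL);
    if (payload[0] != 'm')
        return 0;

    p = payload + 1;
    a = strtoul(p, &end, 16);          /* address, hex */
    if (end == p || *end != ',')
        return 0;

    p = end + 1;
    l = strtoul(p, &end, 16);          /* length, hex */
    if (end == p || *end != '\0')
        return 0;

    *addr = (uint32_t)a;
    *len  = (uint32_t)l;
    return 1;
}
```

The memory bytes that come back in the binary response then have to be hex-encoded again before they are returned to the GDB client.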

The result is a working implementation of debugging over UART.

Final result


Conclusion


It turns out that debugging with the controller's own resources is not all that complicated.
In theory, if this module is placed in a separate memory section, it can also be used to flash the controller.

The source files are posted on GitHub for anyone to study:

  • microcontroller part
  • GDB server
