Programming a game for an embedded device on ESP32

Part 0: motivation


Introduction


I was looking for a hobby project to work on outside my main job, as a way to escape from the current state of the world. I'm mostly interested in game programming, but I also like embedded systems. I now work at a game company, but before that I mainly worked with microcontrollers. Although I eventually changed paths and went into the game industry, I still like to experiment with them. So why not combine both hobbies?

Odroid Go


I had an Odroid Go lying around that seemed interesting to play with. Its core is the ESP32, a very popular microcontroller with the standard microcontroller peripherals (SPI, I2C, GPIO, timers, etc.) plus WiFi and Bluetooth, which makes it attractive for building IoT devices.

Odroid Go complements the ESP32 with a set of peripherals that turn it into a portable gaming machine reminiscent of the Game Boy Color: an LCD display, a speaker, a D-pad, two primary and four auxiliary buttons, a battery, and an SD card reader.

Most people buy an Odroid Go to run emulators of old 8-bit systems. If this thing can emulate old games, it can also handle a native game designed specifically for it.


Limitations


Resolution 320x240

The display is only 320x240, so we are very limited in how much information fits on the screen at once. We need to think carefully about what game we make and what assets we use.

16-bit color

The display supports 16 bits of color per pixel: 5 bits for red, 6 bits for green, and 5 bits for blue. For obvious reasons, this scheme is usually called RGB565. Green gets one more bit than red and blue because the human eye distinguishes gradations of green better than those of blue or red.

16-bit color means that we have access to only 65,536 colors. Compare this with standard 24-bit color (8 bits per channel), which provides about 16.7 million colors.
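To make the 5-6-5 layout concrete, here is a minimal sketch of packing an ordinary 24-bit color into an RGB565 value. The helper name Rgb565 is hypothetical (it is not part of ESP-IDF), and the sketch simply drops the low bits of each channel without any dithering.

```c
#include <stdint.h>

// Pack an 8-bit-per-channel color into RGB565 by keeping the top
// 5 bits of red, 6 bits of green, and 5 bits of blue.
// Rgb565 is a hypothetical helper name, not an ESP-IDF function.
uint16_t Rgb565(uint8_t red, uint8_t green, uint8_t blue)
{
	return (uint16_t)(((red >> 3) << 11) | ((green >> 2) << 5) | (blue >> 3));
}
```

With this layout, pure red becomes 0xF800, pure green 0x07E0, and pure blue 0x001F.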

Lack of GPU

Without a GPU, we cannot use an API like OpenGL. Today, 2D games are usually rendered on the same GPUs as 3D games: instead of 3D models, they draw quads with bitmap textures mapped onto them. Without a GPU, we have to rasterize every pixel on the CPU, which is slower but simpler.

With a screen resolution of 320x240 and 16-bit color, the full frame buffer is 153,600 bytes. This means that at least thirty times per second we need to send 153,600 bytes to the display. This may eventually cause problems, so we need to be smart about how we render the screen. For example, we can use indexed color, storing one byte per pixel that serves as an index into a 256-color palette.
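The arithmetic behind these numbers, and the indexed-color idea, can be sketched as follows. All names here are hypothetical and chosen only for illustration; the palette expansion is one possible way to turn one-byte indices back into RGB565 pixels before sending them to the display.

```c
#include <stddef.h>
#include <stdint.h>

enum
{
	SCREEN_WIDTH = 320,
	SCREEN_HEIGHT = 240,
	BYTES_PER_PIXEL = 2 // RGB565
};

// Size of one full RGB565 frame in bytes.
size_t FrameBufferBytes(void)
{
	return (size_t)SCREEN_WIDTH * SCREEN_HEIGHT * BYTES_PER_PIXEL;
}

// Bytes per second that must reach the display at a given frame rate.
size_t DisplayBytesPerSecond(unsigned framesPerSecond)
{
	return FrameBufferBytes() * framesPerSecond;
}

// Expand a run of 8-bit palette indices into RGB565 pixels.
// The palette must hold 256 entries so that every index is valid.
void ExpandIndexedRow(const uint8_t* indices, const uint16_t* palette, uint16_t* out, size_t count)
{
	for (size_t i = 0; i < count; ++i)
	{
		out[i] = palette[indices[i]];
	}
}
```

At 30 frames per second a full RGB565 frame amounts to 4,608,000 bytes per second, while an indexed frame buffer takes 76,800 bytes in memory, half the size of the RGB565 one.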

4 MB

ESP32 has 520 KB of internal RAM, and Odroid Go adds another 4 MB of external RAM. But not all of this memory is available to us, because part of it is used by the ESP32 SDK (more on this later). With every nonessential feature disabled, ESP32 reports at the start of my main function that 4,494,848 bytes are available. If we need more memory in the future, we can come back and trim more unneeded features.

80-240 MHz processor

The CPU can be configured to run at one of three speeds: 80 MHz, 160 MHz, or 240 MHz. Even the maximum of 240 MHz is far from the 3+ GHz of the modern computers we are used to working with. We will start at 80 MHz and see how far we can go. If we want the game to run on battery power, power consumption should be low, so a lower frequency is preferable.
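As a rough illustration of what these clock speeds mean for software rendering, the sketch below estimates how many CPU cycles are available per pixel if every pixel is touched every frame. This is a back-of-the-envelope figure under simplifying assumptions: it ignores everything else the CPU has to do (game logic, SPI transfers, FreeRTOS), and the function name is hypothetical.

```c
#include <stdint.h>

// Upper bound on CPU cycles available per pixel per frame,
// assuming the CPU does nothing but draw pixels.
uint32_t CyclesPerPixel(uint32_t cpuHz, uint32_t framesPerSecond, uint32_t width, uint32_t height)
{
	return cpuHz / (framesPerSecond * width * height);
}
```

For a 320x240 screen at 30 frames per second this gives about 34 cycles per pixel at 80 MHz, 69 at 160 MHz, and 104 at 240 MHz.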

Bad debugging

There are ways to use debuggers with embedded devices (JTAG), but unfortunately Odroid Go does not expose the necessary pins, so we cannot step through the code in a debugger as we normally would. This means that debugging can be a difficult process: we will have to rely heavily on on-screen debugging (using colors and text) and on output to the debug console (which, fortunately, is easily accessible over USB UART).

Why all the trouble?


Why even try to create a game for this weak device with all the limitations listed above, instead of just writing something for a desktop PC? There are two reasons:

Limitations stimulate creativity.

When you work with a system that has a fixed set of hardware, each part with its own limitations, it forces you to think about how to make the best of those limitations. This brings us closer to the game developers of old systems such as the Super Nintendo (though we still have it much easier than they did).

Low-level development is fun

To write a game from scratch for a regular desktop system, we have to work with standard low-level engine concepts: rendering, physics, collision detection. But when implementing all this on an embedded device, we also have to deal with low-level computer concepts, such as writing an LCD driver.

How low-level will we go?


When it comes to going low-level and writing your own code, you have to draw the line somewhere. If we were writing a game without libraries for the desktop, the line would probably be the operating system or a cross-platform API like SDL. In my project, I draw the line at writing things like SPI drivers and bootloaders. They are much more pain than fun.

So we will use ESP-IDF, which is essentially an SDK for the ESP32. We can think of it as providing some of the utilities that an operating system usually provides, although no operating system runs on the ESP32. Strictly speaking, this microcontroller runs FreeRTOS, which is a real-time operating system, but it is not a full OS; it is just a scheduler. Most likely we will not interact with it directly, but ESP-IDF uses it at its core.

ESP-IDF provides an API for the ESP32 peripherals such as SPI, I2C, and UART, as well as a C runtime library, so when we call something like printf, it actually sends bytes over UART to be shown on the serial monitor. It also handles all the startup code needed to prepare the machine before it calls the entry point of our game.

In this post I will keep a development journal in which I talk about the points I found interesting and explain the most difficult aspects. I do not have a plan and will most likely make many mistakes. I am doing all of this out of curiosity.

Part 1: build system


Introduction


Before we can start writing code for Odroid Go, we need to configure the ESP32 SDK. It contains the code that starts ESP32 and calls our main function, as well as the peripheral code (for example, SPI) that we will need when we write the LCD driver.

Espressif calls its SDK ESP-IDF; we use the latest stable version, v4.0.

We can either clone the repository according to their instructions (with the recursive flag), or simply download the zip from the releases page.

Our first goal is a minimal Hello World-style application installed on Odroid Go that proves the correct setup of the build environment.

C or C++


ESP-IDF uses C99, so we will use it too. If desired, we could use C++ (the ESP32 toolchain includes a C++ compiler), but for now we will stick to C.

Actually, I like C and its simplicity. No matter how much C++ I write, I have never reached the point of enjoying it.

This person sums up my thoughts pretty well.

In addition, if necessary, we can switch to C++ at any time.

Minimal project


IDF uses CMake to manage the build system. It also supports Makefiles, but they are deprecated in v4.0, so we'll just use CMake.

At a minimum, we need a CMakeLists.txt file describing our project, a main folder with the source file containing the game's entry point, and another CMakeLists.txt file inside main, which lists the source files.

CMake needs to reference environment variables that tell it where to find the IDF and the toolchain. I was annoyed at having to set them again every time I started a new terminal session, so I wrote the export.sh script. It sets IDF_PATH and IDF_TOOLS_PATH, and then sources the IDF's own export script, which sets the other environment variables.

The user of the script only needs to fill in the IDF_PATH and IDF_TOOLS_PATH variables.

IDF_PATH=
IDF_TOOLS_PATH=


if [ -z "$IDF_PATH" ]
then
	echo "IDF_PATH not set"
	return
fi

if [ -z "$IDF_TOOLS_PATH" ]
then
	echo "IDF_TOOLS_PATH not set"
	return
fi


export IDF_PATH
export IDF_TOOLS_PATH

source "$IDF_PATH/export.sh"

CMakeLists.txt in the root:

cmake_minimum_required(VERSION 3.5)

set(COMPONENTS "esptool_py main")

include($ENV{IDF_PATH}/tools/cmake/project.cmake)

project(game)

By default, the build system builds every component inside $IDF_PATH/components, which increases compilation time. We want to compile a minimal set of components to call our main function, and add more components later if necessary. This is what the COMPONENTS variable is for.

CMakeLists.txt inside main:

idf_component_register(
	SRCS "main.c"
	INCLUDE_DIRS "")

The main.c file is very simple. All it does is print "Hello World!" to the serial monitor once a second, forever. vTaskDelay uses FreeRTOS to perform the delay.

#include <stdio.h>
#include <freertos/FreeRTOS.h>
#include <freertos/task.h>
#include <esp_system.h>


void app_main(void)
{
	for (;;)
	{
		printf("Hello World!\n");
		vTaskDelay(1000 / portTICK_PERIOD_MS);
	}

	// Should never get here
	esp_restart();
}

Note that our function is called app_main, not main. The main function is used by the IDF for the necessary setup, after which it creates a task with our app_main function as its entry point.

A task is just an executable unit that FreeRTOS can manage. We don't need to worry about this for now (or maybe ever), but it is worth noting that our game runs on one core (ESP32 has two cores), and that on each iteration of the for loop the task delays its execution for one second. During this delay, the FreeRTOS scheduler can run other code that is waiting its turn (if any).
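The argument to vTaskDelay is a number of scheduler ticks, not milliseconds, which is why main.c divides by portTICK_PERIOD_MS. The sketch below reproduces that conversion on its own. It assumes a tick rate of 100 Hz; the real value comes from the FreeRTOS configuration of the project, and both names in the sketch are hypothetical.

```c
#include <stdint.h>

// Assumed tick rate, for illustration only. In a real project this
// is set by the FreeRTOS configuration.
enum { ASSUMED_TICK_RATE_HZ = 100 };

// Mirrors the `milliseconds / portTICK_PERIOD_MS` expression in main.c.
uint32_t MillisecondsToTicks(uint32_t milliseconds)
{
	const uint32_t tickPeriodMs = 1000u / ASSUMED_TICK_RATE_HZ;
	return milliseconds / tickPeriodMs;
}
```

Under this assumption a one-second delay is 100 ticks. Because the division is an integer division, a requested delay shorter than one tick period turns into zero ticks.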

We can use both cores, but for now, let's limit ourselves to one.

Components


Even if we reduce the component list to the minimum required for the Hello World application (esptool_py and main), the dependency chain still pulls in other components that we do not need. All of these components are built:

app_trace app_update bootloader bootloader_support cxx driver efuse esp32 esp_common esp_eth esp_event esp_ringbuf
esp_rom esp_wifi espcoredump esptool_py freertos heap log lwip main mbedtls newlib nvs_flash partition_table pthread
soc spi_flash tcpip_adapter vfs wpa_supplicant xtensa

Many of them make sense (bootloader, esp32, freertos), but others are unnecessary because we do not use any network functions: esp_eth, esp_wifi, lwip, mbedtls, tcpip_adapter, wpa_supplicant. Unfortunately, we are still forced to build these components.

Fortunately, the linker is smart enough not to put unused components into the final game binary. We can verify this with make size-components.

Total sizes:
 DRAM .data size:    8476 bytes
 DRAM .bss  size:    4144 bytes
Used static DRAM:   12620 bytes ( 168116 available, 7.0% used)
Used static IRAM:   56345 bytes (  74727 available, 43.0% used)
      Flash code:   95710 bytes
    Flash rodata:   40732 bytes
Total image size:~ 201263 bytes (.bin may be padded larger)
Per-archive contributions to ELF file:
            Archive File DRAM .data & .bss   IRAM Flash code & rodata   Total
                  libc.a        364      8   5975      63037     3833   73217
              libesp32.a       2110    151  15236      15415    21485   54397
           libfreertos.a       4148    776  14269          0     1972   21165
                libsoc.a        184      4   7909        875     4144   13116
          libspi_flash.a        714    294   5069       1320     1386    8783
                libvfs.a        308     48      0       5860      973    7189
         libesp_common.a         16   2240    521       1199     3060    7036
             libdriver.a         87     32      0       4335     2200    6654
               libheap.a        317      8   3150       1218      748    5441
             libnewlib.a        152    272    869        908       99    2300
        libesp_ringbuf.a          0      0    906          0      163    1069
                liblog.a          8    268    488         98        0     862
         libapp_update.a          0      4    127        159      486     776
 libbootloader_support.a          0      0      0        634        0     634
                libhal.a          0      0    519          0       32     551
            libpthread.a          8     12      0        288        0     308
             libxtensa.a          0      0    220          0        0     220
                libgcc.a          0      0      0          0      160     160
               libmain.a          0      0      0         22       13      35
                libcxx.a          0      0      0         11        0      11
                   (exe)          0      0      0          0        0       0
              libefuse.a          0      0      0          0        0       0
         libmbedcrypto.a          0      0      0          0        0       0
     libwpa_supplicant.a          0      0      0          0        0       0

libc contributes the most to the binary size, and that's fine.

Project configuration


IDF lets us specify compile-time configuration parameters that it uses during the build to enable or disable various features. We need to set the parameters that let us take advantage of Odroid Go's extra hardware.

First, we need to source export.sh so that CMake has access to the necessary environment variables. Then, as with any CMake project, we create a build folder and run CMake from inside it.

source export.sh
mkdir build
cd build
cmake ..

If you run make menuconfig, a window opens where you can configure project settings.

Expanding flash memory to 16 MB


Odroid Go expands the standard flash capacity to 16 MB. You can enable this by going to Serial flasher config -> Flash size -> 16MB.

Turn on external SPI RAM


We also have access to an additional 4 MB of external RAM connected via SPI. You can enable it by going to Component config -> ESP32-specific -> Support for external, SPI-connected RAM and pressing the space bar. We also want to be able to explicitly allocate memory from SPI RAM; this can be enabled by going to SPI RAM config -> SPI RAM access method -> Make RAM allocatable using heap_caps_malloc.

Lower the frequency


ESP32 runs at 160 MHz by default, but let's lower it to 80 MHz to see how far we can go at the lowest clock frequency. We want the game to run on battery power, and lowering the frequency will save power. You can change it by going to Component config -> ESP32-specific -> CPU frequency -> 80MHz.

If you select Save, the sdkconfig file is saved to the root of the project folder. We could commit this file to git, but it contains many parameters that are not important to us. For now, we are happy with the defaults, except for the ones we just changed.

Instead, we can create an sdkconfig.defaults file that contains only the values changed above. Everything else will use the defaults. During the build, the IDF reads sdkconfig.defaults, applies the values we set, and uses the defaults for all other parameters.

Now sdkconfig.defaults looks like this:

# Set flash size to 16MB
CONFIG_ESPTOOLPY_FLASHSIZE_16MB=y

# Set CPU frequency to 80MHz
CONFIG_ESP32_DEFAULT_CPU_FREQ_80=y

# Enable SPI RAM and allocate with heap_caps_malloc()
CONFIG_ESP32_SPIRAM_SUPPORT=y
CONFIG_SPIRAM_USE_CAPS_ALLOC=y

In general, the original structure of the game looks like this:

game
├── CMakeLists.txt
├── export.sh
├── main
│   ├── CMakeLists.txt
│   └── main.c
└── sdkconfig.defaults

Build and flash


The build and flash process itself is quite simple.

We run make to compile (add -j4 or -j8 for parallel builds), make flash to write the image to Odroid Go, and make monitor to see the output of the printf statements.

make
make flash
make monitor

We can also execute them in one line.

make flash monitor

The result is not particularly impressive, but it will become the basis for the rest of the project.





Part 2: input


Introduction


We need to be able to read the buttons and the D-pad that the player presses on Odroid Go.

Buttons



GPIO


Odroid Go has six buttons: A, B, Select, Start, Menu, and Volume.

Each button is connected to a separate General Purpose IO (GPIO) pin. GPIO pins can be used as inputs (we read from them) or as outputs (we write to them). For buttons, we need inputs.

First we need to configure the pins as inputs, after which we can read their state. Electrically, a pin is at one of two voltages (3.3 V or 0 V), but when read through the IDF function, these are converted to integer values.

Initialization


The elements marked SW in the schematic are the physical buttons themselves. When a button is not pressed, the ESP32 pin (IO13, IO0, etc.) is connected to 3.3 V; that is, 3.3 V means the button is not pressed. The logic is the opposite of what you might expect.

IO0 and IO39 have physical resistors on the board. If the button is not pressed, the resistor pulls the pin up to the high voltage. If the button is pressed, the current flows to ground instead, so 0 V is read from the pin. IO13, IO27, IO32, and IO33 do not have external resistors, because the ESP32 has internal resistors on these pins, which we configure in pull-up mode.

Knowing this, we can configure six buttons using the GPIO API.

const gpio_num_t BUTTON_PIN_A = GPIO_NUM_32;
const gpio_num_t BUTTON_PIN_B = GPIO_NUM_33;
const gpio_num_t BUTTON_PIN_START = GPIO_NUM_39;
const gpio_num_t BUTTON_PIN_SELECT = GPIO_NUM_27;
const gpio_num_t BUTTON_PIN_VOLUME = GPIO_NUM_0;
const gpio_num_t BUTTON_PIN_MENU = GPIO_NUM_13;

gpio_config_t gpioConfig = {0};

gpioConfig.mode = GPIO_MODE_INPUT;
gpioConfig.pull_up_en = GPIO_PULLUP_ENABLE;
gpioConfig.pin_bit_mask =
	  (1ULL << BUTTON_PIN_A)
	| (1ULL << BUTTON_PIN_B)
	| (1ULL << BUTTON_PIN_START)
	| (1ULL << BUTTON_PIN_SELECT)
	| (1ULL << BUTTON_PIN_VOLUME)
	| (1ULL << BUTTON_PIN_MENU);

ESP_ERROR_CHECK(gpio_config(&gpioConfig));

The constants at the beginning of the code correspond to the pins in the schematic. We use the gpio_config_t structure to configure all six buttons as pull-up inputs. For IO13, IO27, IO32, and IO33, we need to ask the IDF to enable the internal pull-up resistors. For IO0 and IO39 this is not necessary because they have physical resistors, but we do it anyway to keep the configuration uniform.

ESP_ERROR_CHECK is a helper macro from the IDF that checks the result of a function returning esp_err_t (most of the IDF) and asserts that the result is ESP_OK. It is convenient for calls whose failure is critical, where it makes no sense to continue execution. A game without input is not a game, so that is the case here. We will use this macro often.
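One detail of the configuration code worth spelling out is the use of 1ULL in the pin mask. Several of the button pins (32, 33, and 39) are numbered above 31, so the mask has to be built from a 64-bit one; shifting an ordinary 32-bit int by that many bits is undefined behavior in C. The helper below is a hypothetical, host-side sketch of the same mask computation using plain integers instead of gpio_num_t.

```c
#include <stdint.h>

// Build a 64-bit pin mask from a list of GPIO numbers.
// PinMask is a hypothetical helper used only to illustrate the shifts.
uint64_t PinMask(const int* pins, int count)
{
	uint64_t mask = 0;

	for (int i = 0; i < count; ++i)
	{
		// 1ULL keeps the shift in 64 bits so pins 32 and above fit.
		mask |= 1ULL << pins[i];
	}

	return mask;
}
```

For the six button pins used here (32, 33, 39, 27, 0, 13) the resulting mask is 0x8308002001, which does not fit in 32 bits.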

Reading buttons


So, we configured all the contacts, and can finally read the values.

The digital buttons are read with the gpio_get_level function, but we need to invert the values, because the pins are pulled up: a high signal actually means “not pressed” and a low one means “pressed”. Inverting restores the usual logic: 1 means “pressed”, 0 means “not pressed”.

int a = !gpio_get_level(BUTTON_PIN_A);
int b = !gpio_get_level(BUTTON_PIN_B);
int select = !gpio_get_level(BUTTON_PIN_SELECT);
int start = !gpio_get_level(BUTTON_PIN_START);
int menu = !gpio_get_level(BUTTON_PIN_MENU);
int volume = !gpio_get_level(BUTTON_PIN_VOLUME);

D-pad



ADC


The D-pad is wired differently from the buttons. The up and down buttons are connected to one analog-to-digital converter (ADC) pin, and the left and right buttons are connected to another ADC pin.

Unlike digital GPIO pins, from which we can read one of two states (high or low), an ADC converts a continuous analog voltage (e.g., from 0 V to 3.3 V) into a discrete numerical value (e.g., from 0 to 4095).

I suppose the Odroid Go designers did this to save GPIO pins (you only need two analog pins instead of four digital pins). Be that as it may, it slightly complicates configuring and reading these pins.

Configuration


Pin IO35 is connected to the Y axis of the D-pad, and pin IO34 is connected to the X axis. We can see that the D-pad wiring is a little more complicated than that of the digital buttons. Each axis has two switches (SW1 and SW2 for the Y axis, SW3 and SW4 for the X axis), each of which is connected to a set of resistors (R2, R3, R4, R5).

If neither “up” nor “down” is pressed, pin IO35 is pulled down to ground through R3, and we read 0 V. If neither “left” nor “right” is pressed, pin IO34 is pulled down to ground through R5, and we read 0 V.

If SW1 (“up”) is pressed, we read 3.3 V from IO35. If SW2 (“down”) is pressed, we read about 1.65 V from IO35, because half the voltage drops across resistor R2.

If SW3 (“left”) is pressed, we read 3.3 V from IO34. If SW4 (“right”) is pressed, we read about 1.65 V from IO34, because half the voltage drops across resistor R4.

Both cases are examples of voltage dividers. When the two resistors in a voltage divider have the same resistance (in our case, 100K), the output voltage is half the input voltage.

Knowing this, we can configure the D-pad:

const adc1_channel_t DPAD_PIN_X_AXIS = ADC1_GPIO34_CHANNEL;
const adc1_channel_t DPAD_PIN_Y_AXIS = ADC1_GPIO35_CHANNEL;

ESP_ERROR_CHECK(adc1_config_width(ADC_WIDTH_BIT_12));
ESP_ERROR_CHECK(adc1_config_channel_atten(DPAD_PIN_X_AXIS,ADC_ATTEN_DB_11));
ESP_ERROR_CHECK(adc1_config_channel_atten(DPAD_PIN_Y_AXIS,ADC_ATTEN_DB_11));

We set the ADC width to 12 bits, so that 0 V reads as 0 and 3.3 V reads as 4095 (2^12 - 1). The 11 dB attenuation setting extends the measurable input range so that it covers roughly the full 0 V to 3.3 V span; without attenuation, the ADC would saturate at a much lower voltage.

At 12 bits, we can expect to read 0 when nothing is pressed, 4095 when up or left is pressed, and approximately 2048 when down or right is pressed (because the resistors halve the voltage).
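The expected readings can be checked with a little arithmetic. The sketch below computes the output of a two-resistor divider and the reading of an idealized 12-bit ADC. It assumes a perfectly linear converter in which 3300 mV maps to 4095, which a real ADC will only approximate, and both function names are hypothetical.

```c
#include <stdint.h>

// Output of a two-resistor voltage divider in millivolts:
// Vout = Vin * Rbottom / (Rtop + Rbottom)
uint32_t DividerMillivolts(uint32_t inputMillivolts, uint32_t topOhms, uint32_t bottomOhms)
{
	return (uint32_t)(((uint64_t)inputMillivolts * bottomOhms) / (topOhms + bottomOhms));
}

// Reading of an idealized, perfectly linear 12-bit ADC,
// assuming 3300 mV corresponds to the maximum count of 4095.
uint32_t IdealAdcCount(uint32_t millivolts)
{
	return (millivolts * 4095u) / 3300u;
}
```

With two 100K resistors, 3300 mV is divided down to 1650 mV, which this idealized ADC reports as 2047, close to the middle of the range.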

Reading the D-pad


Reading the D-pad is more involved than reading the buttons, because we need to read the raw values (from 0 to 4095) and interpret them.

const uint32_t ADC_POSITIVE_LEVEL = 3072;
const uint32_t ADC_NEGATIVE_LEVEL = 1024;

uint32_t dpadX = adc1_get_raw(DPAD_PIN_X_AXIS);

if (dpadX > ADC_POSITIVE_LEVEL)
{
	// Left pressed
}
else if (dpadX > ADC_NEGATIVE_LEVEL)
{
	// Right pressed
}


uint32_t dpadY = adc1_get_raw(DPAD_PIN_Y_AXIS);

if (dpadY > ADC_POSITIVE_LEVEL)
{
	// Up pressed
}
else if (dpadY > ADC_NEGATIVE_LEVEL)
{
	// Down pressed
}

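ADC_POSITIVE_LEVEL and ADC_NEGATIVE_LEVEL are thresholds with a margin: 3072 lies halfway between the expected readings of 2048 and 4095, and 1024 lies halfway between 0 and 2048, so readings that are slightly off are still interpreted correctly.

Polling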
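The same threshold logic can be pulled out into a small pure function, which makes it easy to check on a desktop machine without the hardware. This is only a sketch: the AxisState type and ClassifyAxis function are hypothetical names, and the thresholds are the ones used in the code above.

```c
#include <stdint.h>

typedef enum
{
	AXIS_NONE,     // nothing pressed (about 0 V)
	AXIS_POSITIVE, // up or left (about 3.3 V)
	AXIS_NEGATIVE  // down or right (about 1.65 V)
} AxisState;

// Interpret one raw 12-bit ADC reading for a single D-pad axis.
AxisState ClassifyAxis(uint32_t raw)
{
	if (raw > 3072u)
	{
		return AXIS_POSITIVE;
	}

	if (raw > 1024u)
	{
		return AXIS_NEGATIVE;
	}

	return AXIS_NONE;
}
```

A reading of 2048 falls between the two thresholds and is classified as down or right, while anything at or below 1024 counts as not pressed.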


There are two ways to get button values: polling or interrupts. We can either register input handler functions and ask the IDF to call them when buttons are pressed, or manually poll the button state whenever we need it. Interrupt-driven behavior makes things more complicated and harder to follow, and I always try to keep everything as simple as possible. If necessary, we can add interrupts later.

We will create a structure that stores the state of the six buttons and the four D-pad directions. We could create a structure with 10 bool, 10 int, or 10 unsigned int members. Instead, we will build the structure out of bit fields.

typedef struct
{
	uint16_t a : 1;
	uint16_t b : 1;
	uint16_t volume : 1;
	uint16_t menu : 1;
	uint16_t select : 1;
	uint16_t start : 1;
	uint16_t left : 1;
	uint16_t right : 1;
	uint16_t up : 1;
	uint16_t down : 1;
} Odroid_Input;

In desktop programming, bit fields are usually avoided because they port poorly between machines, but we are programming for one specific machine, so we don't need to worry about that.

Instead of bit fields, we could use a structure with 10 Boolean values, for a total size of 10 bytes. Another option is a single uint16_t with bit-shift and bit-mask macros that set, clear, and check individual bits. That would work, but it would not be very pretty.

A simple bit field gives us the advantages of both approaches: two bytes of data and named fields.
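For comparison, here is roughly what the rejected uint16_t alternative with macros might look like. All names are hypothetical; the bit positions simply follow the order of the fields in Odroid_Input.

```c
#include <stdint.h>

// Bit positions, in the same order as the fields of Odroid_Input.
enum
{
	INPUT_A = 0,
	INPUT_B,
	INPUT_VOLUME,
	INPUT_MENU,
	INPUT_SELECT,
	INPUT_START,
	INPUT_LEFT,
	INPUT_RIGHT,
	INPUT_UP,
	INPUT_DOWN
};

// Hypothetical helper macros operating on a uint16_t state variable.
#define INPUT_BIT(n)            ((uint16_t)(1u << (n)))
#define INPUT_SET(state, bit)   ((state) |= (bit))
#define INPUT_CLEAR(state, bit) ((state) &= (uint16_t)~(bit))
#define INPUT_CHECK(state, bit) (((state) & (bit)) != 0)
```

Checking a button then reads INPUT_CHECK(state, INPUT_BIT(INPUT_A)) instead of input.a, which illustrates why the bit-field version is easier on the eyes.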

Demo


Now we can poll the state of inputs inside the main loop and display the result.

void app_main(void)
{
	Odroid_InitializeInput();

	for (;;)
	{
		Odroid_Input input = Odroid_PollInput();

		printf(
			"\ra: %d  b: %d  start: %d  select: %d  vol: %d  menu: %d  up: %d  down: %d  left: %d  right: %d",
			input.a, input.b, input.start, input.select, input.volume, input.menu,
			input.up, input.down, input.left, input.right);

		fflush(stdout);

		vTaskDelay(250 / portTICK_PERIOD_MS);
	}

	// Should never get here
	esp_restart();
}

The printf call uses \r to overwrite the previous line instead of adding a new one. fflush is needed to make the line appear, because stdout is normally flushed only when a newline character \n is written.
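The demo calls Odroid_PollInput, whose body is not shown here. As a rough sketch of what such a function has to do, the code below separates the decoding step from the hardware reads so that it can be tried on a desktop machine. RawReadings and DecodeInput are hypothetical names; on the device, the raw values would come from gpio_get_level and adc1_get_raw as shown earlier.

```c
#include <stdint.h>

typedef struct
{
	uint16_t a : 1;
	uint16_t b : 1;
	uint16_t volume : 1;
	uint16_t menu : 1;
	uint16_t select : 1;
	uint16_t start : 1;
	uint16_t left : 1;
	uint16_t right : 1;
	uint16_t up : 1;
	uint16_t down : 1;
} Odroid_Input;

// Hypothetical container for one set of raw hardware readings.
typedef struct
{
	int a, b, select, start, menu, volume; // pin levels: 1 = released, 0 = pressed
	uint32_t dpadX, dpadY;                 // raw 12-bit ADC counts
} RawReadings;

// Turn raw readings into the logical input state.
Odroid_Input DecodeInput(RawReadings raw)
{
	Odroid_Input input = {0};

	// Buttons are active-low, so invert the pin levels.
	input.a = !raw.a;
	input.b = !raw.b;
	input.select = !raw.select;
	input.start = !raw.start;
	input.menu = !raw.menu;
	input.volume = !raw.volume;

	// Same thresholds as in the D-pad reading code.
	if (raw.dpadX > 3072u) { input.left = 1; }
	else if (raw.dpadX > 1024u) { input.right = 1; }

	if (raw.dpadY > 3072u) { input.up = 1; }
	else if (raw.dpadY > 1024u) { input.down = 1; }

	return input;
}
```

Keeping the decoding free of hardware calls is a design choice of this sketch, not something the project requires; it just makes the logic easy to test.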





Part 3: display


Introduction


We need to be able to render pixels on the Odroid Go LCD.

Displaying colors on the screen will be more difficult than reading the input state, because the LCD has a brain of its own. The screen is controlled by the ILI9341, a very popular single-chip TFT LCD driver.

In other words, we talk to the ILI9341, which responds to our commands by controlling the pixels on the LCD. When I say “screen” or “display” in this part, I actually mean the ILI9341.

SPI


The LCD is connected to the ESP32 via SPI (Serial Peripheral Interface).

SPI is a standard protocol used to exchange data between devices on a printed circuit board. It has four signals: MOSI (Master Out Slave In), MISO (Master In Slave Out), SCK (Clock), and CS (Chip Select).

A single master device on the bus coordinates data transfer by controlling SCK and CS. There can be several devices on one bus, each with its own CS signal. When a device's CS signal is activated, that device can transmit and receive data.

The ESP32 will be the SPI master, and the LCD will be the SPI slave. We need to configure the SPI bus with the required parameters and add the LCD to the bus by configuring the corresponding pins.



The VSPI.XXXX names are just labels for the pins in the schematic, but we can trace them to the pins themselves by looking at the LCD and ESP32 parts of the schematic.

  • MOSI -> VSPI.MOSI -> IO23
  • MISO -> VSPI.MISO -> IO19
  • SCK -> VSPI.SCK -> IO18
  • CS0 -> VSPI.CS0 -> IO5

We also have IO14, the GPIO pin used to turn on the backlight, and IO21, which is connected to the DC pin of the LCD. This pin controls the type of information that we send to the display.

First, configure the SPI bus.

const gpio_num_t LCD_PIN_MISO = GPIO_NUM_19;
const gpio_num_t LCD_PIN_MOSI = GPIO_NUM_23;
const gpio_num_t LCD_PIN_SCLK = GPIO_NUM_18;
const gpio_num_t LCD_PIN_CS = GPIO_NUM_5;
const gpio_num_t LCD_PIN_DC = GPIO_NUM_21;
const gpio_num_t LCD_PIN_BACKLIGHT = GPIO_NUM_14;
const int LCD_WIDTH = 320;
const int LCD_HEIGHT = 240;
const int LCD_DEPTH = 2;


spi_bus_config_t spiBusConfig = {0};
spiBusConfig.miso_io_num = LCD_PIN_MISO;
spiBusConfig.mosi_io_num = LCD_PIN_MOSI;
spiBusConfig.sclk_io_num = LCD_PIN_SCLK;
spiBusConfig.quadwp_io_num = -1; // Unused
spiBusConfig.quadhd_io_num = -1; // Unused
spiBusConfig.max_transfer_sz = LCD_WIDTH * LCD_HEIGHT * LCD_DEPTH;

ESP_ERROR_CHECK(spi_bus_initialize(VSPI_HOST, &spiBusConfig, 1));

We configure the bus using spi_bus_config_t. We need to specify the pins we use and the maximum size of a single data transfer.

For now, we will perform one SPI transmission for all frame buffer data, which is equal to the width of the LCD (in pixels) times its height (in pixels) times the number of bytes per pixel.

The width is 320, the height is 240, and the color depth is 2 bytes (the display expects pixel colors to be 16 bits deep).

spi_device_handle_t gSpiHandle;

spi_device_interface_config_t spiDeviceConfig = {0};
spiDeviceConfig.clock_speed_hz = SPI_MASTER_FREQ_40M;
spiDeviceConfig.spics_io_num = LCD_PIN_CS;
spiDeviceConfig.queue_size = 1;
spiDeviceConfig.flags = SPI_DEVICE_NO_DUMMY;

ESP_ERROR_CHECK(spi_bus_add_device(VSPI_HOST, &spiDeviceConfig, &gSpiHandle));

After initializing the bus, we need to add an LCD device to the bus so that we can start talking to it.

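  • clock_speed_hz - the SPI clock frequency; we set it to 40 MHz. The ESP32 SPI master can go up to 80 MHz, but we start with 40 MHz.
  • spics_io_num - the CS pin. We tell the IDF which pin it is, and the IDF drives CS for us when talking to the LCD (the SD card also sits on the SPI bus).
  • queue_size - set to 1, because we queue only one transaction at a time.
  • flags - the IDF SPI driver usually inserts dummy bits into the transmission to avoid timing problems when reading from an SPI device, but our transmission is one-way (we will not read from the display). SPI_DEVICE_NO_DUMMY tells the driver that we accept this one-way transmission and that no dummy bits need to be inserted.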


gpio_set_direction(LCD_PIN_DC, GPIO_MODE_OUTPUT);
gpio_set_direction(LCD_PIN_BACKLIGHT, GPIO_MODE_OUTPUT);

We also need to configure the DC and backlight pins as GPIO outputs. We will toggle DC as needed, while the backlight stays on the whole time.

Commands


Communication with the LCD is in the form of commands. First, we pass a byte denoting the command we want to send, and then we pass the command parameters (if any). The display understands that the byte is a command if the DC signal is low. If the DC signal is high, then the received data will be considered the parameters of the previously transmitted command.

In general, the flow looks like this:

  1. We give a low signal to DC
  2. We send one byte of the command
  3. We give a high signal to DC
  4. Send zero or more bytes, depending on the requirements of the command
  5. Repeat steps 1-4
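The steps above can be sketched in code. To keep the example runnable on a desktop machine, the hardware calls are replaced by hypothetical stand-ins: SetDc remembers the level of the DC pin, and SpiWrite records every byte together with the DC level at the time it was sent. On the device these would be a GPIO write and an SPI transaction instead. SendCommand itself is also a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

// One byte as seen on the wire, with the DC level it was sent under.
typedef struct
{
	uint8_t dcLevel;
	uint8_t byte;
} WireEvent;

// Hypothetical stand-ins for the hardware: a log instead of a real bus.
WireEvent gWireLog[64];
size_t gWireLogCount = 0;
int gDcLevel = 1;

void SetDc(int level)
{
	gDcLevel = level;
}

void SpiWrite(const uint8_t* data, size_t length)
{
	for (size_t i = 0; i < length && gWireLogCount < 64; ++i)
	{
		gWireLog[gWireLogCount].dcLevel = (uint8_t)gDcLevel;
		gWireLog[gWireLogCount].byte = data[i];
		++gWireLogCount;
	}
}

// Steps 1-4: DC low, command byte, DC high, parameter bytes (if any).
void SendCommand(uint8_t code, const uint8_t* parameters, size_t length)
{
	SetDc(0);
	SpiWrite(&code, 1);

	SetDc(1);
	if (length > 0)
	{
		SpiWrite(parameters, length);
	}
}
```

Sending a command with four parameters produces five logged bytes: the first with DC low, the remaining four with DC high.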

Here our best friend is the ILI9341 datasheet. It lists all the possible commands, their parameters, and how to use them.


An example of a command without parameters is Display ON. Its command byte is 0x29, and it takes no parameters.


An example of a command with parameters is Column Address Set. Its command byte is 0x2A, and it requires four parameters. To use the command, we drive DC low, send 0x2A, drive DC high, and then send the four parameter bytes.
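According to the ILI9341 datasheet, the four parameters of Column Address Set are the start column and the end column, each sent as two bytes with the high byte first. A minimal sketch of building those bytes could look like this; EncodeAddressRange is a hypothetical helper name.

```c
#include <stdint.h>

// Fill the four parameter bytes of an address-range command:
// start (high byte, low byte), then end (high byte, low byte).
void EncodeAddressRange(uint16_t start, uint16_t end, uint8_t out[4])
{
	out[0] = (uint8_t)(start >> 8);
	out[1] = (uint8_t)(start & 0xFF);
	out[2] = (uint8_t)(end >> 8);
	out[3] = (uint8_t)(end & 0xFF);
}
```

Selecting all 320 columns of the landscape display means a range from 0 to 319, which encodes as the bytes 0x00, 0x00, 0x01, 0x3F.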

We list the command codes themselves in an enum.

typedef enum
{
	SOFTWARE_RESET = 0x01u,
	SLEEP_OUT = 0x11u,
	DISPLAY_ON = 0x29u,
	COLUMN_ADDRESS_SET = 0x2Au,
	PAGE_ADDRESS_SET = 0x2Bu,
	MEMORY_WRITE = 0x2Cu,
	MEMORY_ACCESS_CONTROL = 0x36u,
	PIXEL_FORMAT_SET = 0x3Au,
} CommandCode;

Instead, we could use macros (#define SOFTWARE_RESET (0x01u)), but macros have no symbols in the debugger and no scope. We could also use static integer constants, as we did with the GPIO pins, but with an enum we can tell at a glance what data is passed to a function or stored in a structure member: it is of type CommandCode. Otherwise, it would be a raw uint8_t that tells the programmer reading the code nothing.

Startup


During initialization, we send a series of commands to prepare the display for drawing. Each command has a command byte, which we will call the command code.

We will define a structure for storing a startup command so that we can declare an array of them.

typedef struct
{
	CommandCode code;
	uint8_t parameters[15];
	uint8_t length;
} StartupCommand;

  • code is the command code.
  • parameters is the array of command parameters (if any). It is a fixed-size array of 15 elements because that is the maximum number of parameters we need. Because the array has a fixed size, we don't have to allocate a dynamic array for each command.
  • length is the number of parameters stored in the parameters array.

Using this structure, we can specify the list of startup commands.

StartupCommand gStartupCommands[] =
{
	// Reset to defaults
	{
		SOFTWARE_RESET,
		{},
		0
	},

	// Landscape Mode
	// Top-Left Origin
	// BGR Panel
	{
		MEMORY_ACCESS_CONTROL,
		{0x20 | 0xC0 | 0x08},
		1
	},

	// 16 bits per pixel
	{
		PIXEL_FORMAT_SET,
		{0x55},
		1
	},

	// Exit sleep mode
	{
		SLEEP_OUT,
		{},
		0
	},

	// Turn on the display
	{
		DISPLAY_ON,
		{},
		0
	},
};

Commands without parameters, such as SOFTWARE_RESET, use an empty initializer list for parameters (which sets every element to zero) and set length to 0. Commands with parameters fill in parameters and specify length. It would be nice to set length automatically instead of writing the numbers by hand (in case we make a mistake or the parameters change), but I don't think it's worth the trouble.
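For the curious, here is one way the length could be derived at compile time. This is only a sketch, not what the project does: ExampleCommand, PARAMETER_COUNT, and EXAMPLE_COMMAND are hypothetical names, and the trick relies on variadic macros and on the same empty-brace initializer that the array above already uses (a compiler extension before C23).

```c
#include <stdint.h>

typedef struct
{
	uint8_t code;
	uint8_t parameters[15];
	uint8_t length;
} ExampleCommand;

/* Hypothetical helpers. The leading 0 keeps the counting array non-empty
   for commands without parameters; it is subtracted again afterwards. */
#define PARAMETER_COUNT(...) ( sizeof((uint8_t[]){ 0, __VA_ARGS__ }) - 1u )
#define EXAMPLE_COMMAND(code, ...) { (code), { __VA_ARGS__ }, PARAMETER_COUNT(__VA_ARGS__) }

static const ExampleCommand gExampleCommands[] =
{
	EXAMPLE_COMMAND(0x01),                      /* SOFTWARE_RESET, no parameters */
	EXAMPLE_COMMAND(0x36, 0x20 | 0xC0 | 0x08),  /* MEMORY_ACCESS_CONTROL */
	EXAMPLE_COMMAND(0x3A, 0x55),                /* PIXEL_FORMAT_SET */
};
```

The price is an extra layer of macros around every entry, which is a fair reason to keep writing the lengths by hand for a list this short.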

The purpose of most commands is clear from their names, with the exception of two.

MEMORY_ACCESS_CONTROL

  • Landscape Mode: By default, the display uses portrait orientation (240x320), but we want to use landscape (320x240).
  • Top-Left Origin: places the origin (0,0) in the top-left corner of the display.
  • BGR Panel: tells the controller that the panel uses BGR color order.

PIXEL_FORMAT_SET

  • 16 bits per pixel: we use 16-bit color.

There are many other commands that can be sent at startup to control various aspects, such as gamma. The necessary parameters are described in the specification of the LCD itself (and not the ILI9341 controller), to which we do not have access. If we do not transmit these commands, then the default display settings are used, which suits us perfectly.

Having prepared the array of startup commands, we can begin sending them to the display.

First, we need a function that sends one byte of command to the display. Do not forget that sending commands is different from sending parameters, because we need to send a low signal to DC .

#define BYTES_TO_BITS(value) ( (value) * 8 )

void SendCommandCode(CommandCode code)
{
	spi_transaction_t transaction = {};

	transaction.length = BYTES_TO_BITS(1);
	transaction.tx_data[0] = (uint8_t)code;
	transaction.flags = SPI_TRANS_USE_TXDATA;

	gpio_set_level(LCD_PIN_DC, 0);
	spi_device_transmit(gSpiHandle, &transaction);
}

ESP-IDF provides the spi_transaction_t structure, which we fill in whenever we want to transfer something over the SPI bus. We specify the length of the payload in bits and the payload itself.

We can either pass a pointer to the payload or use the structure's built-in tx_data member, which is only four bytes in size but saves the driver from having to access external memory. If we use tx_data, we must set the SPI_TRANS_USE_TXDATA flag.

Before transmitting the data, we set DC low to indicate that this is a command code.

void SendCommandParameters(uint8_t* data, int length)
{
	spi_transaction_t transaction = {};

	transaction.length = BYTES_TO_BITS(length);
	transaction.tx_buffer = data;
	transaction.flags = 0;

	gpio_set_level(LCD_PIN_DC, 1);
	spi_device_transmit(gSpiHandle, &transaction);
}

Passing parameters is similar to sending a command, only this time we use our own buffer ( data ) and send a high signal to DC to tell the display that the parameters are being transmitted. In addition, we do not set the SPI_TRANS_USE_TXDATA flag because we are passing our own buffer.

Then we can send all the startup commands.

#define ARRAY_COUNT(value) ( sizeof(value) / sizeof(value[0]) )

int commandCount = ARRAY_COUNT(gStartupCommands);

for (int commandIndex = 0; commandIndex < commandCount; ++commandIndex)
{
	StartupCommand* command = &gStartupCommands[commandIndex];

	SendCommandCode(command->code);

	if (command->length > 0)
	{
		SendCommandParameters(command->parameters, command->length);
	}
}

We iterate over the array of startup commands, sending the command code first and then the parameters (if any).

Frame drawing


After initializing the display, you can start drawing on it.

#define UPPER_BYTE_16(value) ( (value) >> 8u )
#define LOWER_BYTE_16(value) ( (value) & 0xFFu )

void Odroid_DrawFrame(uint8_t* buffer)
{
	// Set drawing window columns to (0, LCD_WIDTH - 1); the end address is inclusive
	uint8_t drawWidth[] = { 0, 0, UPPER_BYTE_16(LCD_WIDTH - 1), LOWER_BYTE_16(LCD_WIDTH - 1) };
	SendCommandCode(COLUMN_ADDRESS_SET);
	SendCommandParameters(drawWidth, ARRAY_COUNT(drawWidth));

	// Set drawing window rows to (0, LCD_HEIGHT - 1); the end address is inclusive
	uint8_t drawHeight[] = { 0, 0, UPPER_BYTE_16(LCD_HEIGHT - 1), LOWER_BYTE_16(LCD_HEIGHT - 1) };
	SendCommandCode(PAGE_ADDRESS_SET);
	SendCommandParameters(drawHeight, ARRAY_COUNT(drawHeight));

	// Send the buffer to the display
	SendCommandCode(MEMORY_WRITE);
	SendCommandParameters(buffer, LCD_WIDTH * LCD_HEIGHT * LCD_DEPTH);
}

ILI9341 has the ability to redraw individual parts of the screen. This may come in handy in the future if we notice a drop in frame rate. In this case, it will be possible to update only the changed parts of the screen, but for now we will simply redraw the entire screen again.

Before rendering a frame, we must set the render window. To do this, we send the COLUMN_ADDRESS_SET command with the window's horizontal range and the PAGE_ADDRESS_SET command with its vertical range. Each command takes four parameter bytes: a 16-bit start address followed by a 16-bit end address, which together describe the window we will render into.

UPPER_BYTE_16 and LOWER_BYTE_16 are helper macros that extract the high and low bytes of a 16-bit value. The parameters of these commands require each 16-bit value to be split into two 8-bit values, which is why we need them.
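As a worked example of the split, the sketch below packs a start and end address into the four parameter bytes. EncodeAddressRange is a hypothetical helper, and the example assumes the end address is inclusive, which is how the ILI9341 datasheet describes it.

```c
#include <stdint.h>

#define UPPER_BYTE_16(value) ( (value) >> 8u )
#define LOWER_BYTE_16(value) ( (value) & 0xFFu )

/* Hypothetical helper: writes start and end as two 16-bit values,
   most significant byte first, into the four parameter bytes. */
static void EncodeAddressRange(uint16_t start, uint16_t end, uint8_t parameters[4])
{
	parameters[0] = (uint8_t)UPPER_BYTE_16(start);
	parameters[1] = (uint8_t)LOWER_BYTE_16(start);
	parameters[2] = (uint8_t)UPPER_BYTE_16(end);
	parameters[3] = (uint8_t)LOWER_BYTE_16(end);
}
```

For columns 0 through 319, the end value is 0x013F, so the four bytes are 0x00, 0x00, 0x01, 0x3F. For rows 0 through 239, the end value is 0x00EF, so the bytes are 0x00, 0x00, 0x00, 0xEF.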

Rendering is initiated by sending the MEMORY_WRITE command followed by all 153,600 bytes of the frame buffer at once.

There are other ways to transfer the frame buffer to the display:

  • We can create another FreeRTOS task that is responsible for coordinating SPI transactions.
  • We can transfer a frame in several transactions instead of one.
  • We can use non-blocking transmission, in which we start the transfer and then continue with other work.
  • We can use any combination of the above.
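To illustrate the second option, here is a hardware-independent sketch of splitting a buffer into fixed-size pieces. SendChunkFn, SendInChunks, and CountBytes are hypothetical names; on the device the callback would wrap a real SPI transaction, whereas here it only adds up the bytes it receives.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: receives one piece of the buffer. */
typedef void (*SendChunkFn)(const uint8_t* data, size_t length, void* context);

/* Walks the buffer in steps of chunkLength and hands each piece to send.
   The last piece may be shorter. Returns the number of pieces sent. */
static size_t SendInChunks(const uint8_t* buffer, size_t totalLength, size_t chunkLength,
	SendChunkFn send, void* context)
{
	size_t chunkCount = 0;

	if (chunkLength == 0)
	{
		return 0;
	}

	for (size_t offset = 0; offset < totalLength; offset += chunkLength)
	{
		size_t remaining = totalLength - offset;
		size_t length = (remaining < chunkLength) ? remaining : chunkLength;

		send(buffer + offset, length, context);
		++chunkCount;
	}

	return chunkCount;
}

/* Stand-in for a real transmit function: only adds up the byte counts. */
static void CountBytes(const uint8_t* data, size_t length, void* context)
{
	(void)data;
	*(size_t*)context += length;
}
```

With a chunk size of 4,096 bytes, for instance, the 153,600-byte frame would be sent as 38 pieces: 37 full ones and a final piece of 2,048 bytes.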

For now, we will use the simplest approach: a single blocking transaction. When DrawFrame is called, the transfer to the display starts and our task is paused until the transfer completes. If we later find that we cannot reach a good frame rate this way, we will come back to this problem.

RGB565 and byte order


A typical display (for example, your computer's monitor) has a bit depth of 24 bits (16.7 million colors): 8 bits each for red, green, and blue. The pixel is written to memory as RRRRRRRRGGGGGGGGBBBBBBBB.

The Odroid LCD has a bit depth of 16 bits (65 thousand colors): 5 bits of red, 6 bits of green, and 5 bits of blue. The pixel is written to memory as RRRRRGGGGGGBBBBB. This format is called RGB565.

#define SWAP_ENDIAN_16(value) ( (((value) & 0xFFu) << 8u) | ((value) >> 8u)  )
#define RGB565(red, green, blue) ( SWAP_ENDIAN_16( (((red) >> 3u) << 11u) | (((green) >> 2u) << 5u) | ((blue) >> 3u) ) )

We define a macro that creates a color in RGB565 format. We pass it a byte of red, a byte of green, and a byte of blue. It takes the five most significant bits of red, the six most significant bits of green, and the five most significant bits of blue. We keep the high bits because they carry more information than the low bits.

However, the ESP32 stores data in little-endian order, i.e. the least significant byte is stored at the lowest memory address.

For example, the 32-bit value [0xDE 0xAD 0xBE 0xEF] will be stored in memory as [0xEF 0xBE 0xAD 0xDE]. When transferring data to the display, this becomes a problem because the least significant byte will be sent first, and the LCD expects to receive the most significant byte first.

We define the macro SWAP_ENDIAN_16 to swap the bytes and use it in the RGB565 macro.

Here's how each of the three primary colors is described in RGB565 and how they are stored in ESP32 memory if you don't change the byte order.

Red

11111 | 000000 | 00000 -> 11111000 00000000 -> 00000000 11111000

Green

00000 | 111111 | 00000 -> 00000111 11100000 -> 11100000 00000111

Blue

00000 | 000000 | 11111 -> 00000000 00011111 -> 00011111 00000000
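The same arithmetic can be written as two small functions, which makes the values easy to check. This is a sketch only: PackRgb565 and SwapBytes16 are hypothetical names that mirror what the macros are described as doing, with the packing and the byte swap kept separate.

```c
#include <stdint.h>

/* Keeps the top 5 bits of red, the top 6 bits of green and the top 5 bits
   of blue, and packs them as RRRRRGGGGGGBBBBB. */
static uint16_t PackRgb565(uint8_t red, uint8_t green, uint8_t blue)
{
	return (uint16_t)(((red >> 3u) << 11u) | ((green >> 2u) << 5u) | (blue >> 3u));
}

/* Exchanges the high and low bytes of a 16-bit value. */
static uint16_t SwapBytes16(uint16_t value)
{
	return (uint16_t)((value << 8u) | (value >> 8u));
}
```

Pure red packs to 0xF800, pure green to 0x07E0, and pure blue to 0x001F. After the swap they become 0x00F8, 0xE007, and 0x1F00, so a little-endian CPU stores (and the SPI bus sends) the byte with the red bits first.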

Demo


We can create a simple demo to see the LCD in action. At the start of each frame, it clears the frame buffer to black and draws a 50x50 square. We can move the square with the D-pad and change its color with the A, B, and Start buttons.

void app_main(void)
{
	Odroid_InitializeInput();
	Odroid_InitializeDisplay();

	ESP_LOGI(LOG_TAG, "Odroid initialization complete - entering main loop");

	uint16_t* framebuffer = (uint16_t*)heap_caps_malloc(320 * 240 * 2, MALLOC_CAP_DMA);
	assert(framebuffer);

	int x = 0;
	int y = 0;

	uint16_t color = 0xffff;

	for (;;)
	{
		memset(framebuffer, 0, 320 * 240 * 2);

		Odroid_Input input = Odroid_PollInput();

		if (input.left) { x -= 10; }
		else if (input.right) { x += 10; }

		if (input.up) { y -= 10; }
		else if (input.down) { y += 10; }

		if (input.a) { color = RGB565(0xff, 0, 0); }
		else if (input.b) { color = RGB565(0, 0xff, 0); }
		else if (input.start) { color = RGB565(0, 0, 0xff); }

		for (int row = y; row < y + 50; ++row)
		{
			for (int col = x; col < x + 50; ++col)
			{
				framebuffer[320 * row + col] = color;
			}
		}

		Odroid_DrawFrame((uint8_t*)framebuffer);
	}

	// Should never get here
	esp_restart();
}

We allocate the frame buffer at the full size of the display: 320 x 240, two bytes per pixel (16-bit color). We use heap_caps_malloc with MALLOC_CAP_DMA so that the buffer is allocated in memory that can be used for SPI transactions with Direct Memory Access (DMA). DMA allows the SPI peripheral to access the frame buffer without involving the CPU. Without DMA, SPI transactions take much longer.

We do not check that drawing stays within the bounds of the screen, so moving the square past an edge writes outside the frame buffer.
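If we wanted such a check, one simple approach is to clamp the rectangle to the frame buffer before filling it. The sketch below is not part of the demo; ClampInt and FillRectClamped are hypothetical helpers.

```c
#include <stdint.h>

static int ClampInt(int value, int low, int high)
{
	if (value < low) { return low; }
	if (value > high) { return high; }
	return value;
}

/* Fills only the part of the rectangle that lies inside the frame buffer
   and returns the number of pixels written. */
static int FillRectClamped(uint16_t* framebuffer, int bufferWidth, int bufferHeight,
	int x, int y, int width, int height, uint16_t color)
{
	int left = ClampInt(x, 0, bufferWidth);
	int right = ClampInt(x + width, 0, bufferWidth);
	int top = ClampInt(y, 0, bufferHeight);
	int bottom = ClampInt(y + height, 0, bufferHeight);
	int written = 0;

	for (int row = top; row < bottom; ++row)
	{
		for (int col = left; col < right; ++col)
		{
			framebuffer[bufferWidth * row + col] = color;
			++written;
		}
	}

	return written;
}
```

In the demo, the two nested loops could then be replaced by a call such as FillRectClamped(framebuffer, 320, 240, x, y, 50, 50, color), and a square that is partly or entirely off screen would simply be cut off.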


Strong tearing is noticeable. In desktop applications, the standard way to eliminate tearing is to use multiple buffers. With double buffering, for example, there are two buffers: a front buffer and a back buffer. While the front buffer is displayed, we write to the back buffer. Then they swap places and the process repeats.

ESP32 does not have enough RAM with DMA capabilities to store two frame buffers (4 MB of external SPI RAM, unfortunately, does not have DMA capabilities), so this option is not suitable.

The ILI9341 has a signal (TE) that tells us when VBLANK occurs so that we can write to the display while it is not being drawn. But on the Odroid (or on the display module) this signal is not connected, so we can't access it.

Perhaps we could find a decent value, but for now we will not do it, because now our task is to simply display the pixels on the screen.

Source


All source code can be found here .
