Game console on stm32

A few shooters for stm32: how, why, and what came of it.



Foreword


Being a fan of old-school shooters on the one hand and an embedded developer on the other, I have always wondered how the authors of that era managed to implement a new genre, one that required completely new approaches, on very modest hardware. So I decided to try running something similar on a modern MCU: it is bare metal, the resources are modest, and (IMHO) stm32 comes with rather powerful debugging tools. My choice fell on the stm32f769i discovery development board.

Notes


At the moment, the project can be built only from the Keil MDK environment (bootloader, games) or with arm-gcc + make (bootloader only). Ports are currently available for Quake I (+ mods), Doom (+ mods), Duke Nukem (+ mods), Hexen and Heretic. Counting all the modifications, the list can be expanded significantly.

Let's start


In this article I will try to briefly outline the main ideas behind building a game console, specifically on the stm32f769i discovery, and how they were implemented. I will avoid going deep into technical detail; the goal is rather to show the reader one more way to use a modern MCU. By “game console” I mean a standalone device that can run “user” applications without updating the main firmware.

Architecture


Since the final implementation must be able to run various games without updating the MCU firmware, some kind of “bootloader” was needed. The result is three layers:

  1. Loader: works with memory, “installs” an application (game) and launches it.
  2. Driver: a system-level service on top of the HAL and related functions; it hands its API over to the application.
  3. Application: the final program; it does not return control to the bootloader.

1. Bootloader


It is based on the IAP (In-Application Programming) model, with drivers taken from an STMicroelectronics example.

The peculiarity of this approach is that there is no need to change the MCU boot configuration. The entire “body” of the bootloader resides in main memory, which in turn allows the stm32f769i discovery to be used “out of the box”.

The main job of this level is to read the .bin file, write its contents to MCU memory and transfer control. The key point at this stage is reading the entry point address and the stack pointer address. The second one is not actually required: since the application does not return control, the stack is shared and there is no need to rewrite the pointer. The call itself can simply be made through a function pointer. The result is a “hybrid” loader: its “driver” part keeps serving the application, while the resources of the loader itself are released.
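
The header-reading step can be sketched as follows. This is a minimal illustration, not the project's code: it assumes the image begins with the two words that start a Cortex-M vector table (initial stack pointer, then reset handler address), and the names image_header_t, image_parse_header, image_start and the BOOT_ON_TARGET flag are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First two words of a Cortex-M vector table (assumed image layout). */
typedef struct {
    uint32_t initial_sp; /* word 0: initial stack pointer */
    uint32_t entry;      /* word 1: reset handler address, bit 0 set (Thumb) */
} image_header_t;

/* Read a little-endian 32-bit word one byte at a time. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the image is too short or the entry
 * address does not have the Thumb bit set. */
static int image_parse_header(const uint8_t *image, size_t size,
                              image_header_t *out)
{
    assert(image != NULL && out != NULL);
    if (size < 8u)
        return -1;
    out->initial_sp = read_le32(image);
    out->entry = read_le32(image + 4);
    return (out->entry & 1u) ? 0 : -1;
}

#if defined(__arm__) && defined(BOOT_ON_TARGET) /* hypothetical build flag */
/* Target only: call the entry point through a function pointer.
 * The stack pointer is left untouched because the stack is shared. */
static void image_start(const image_header_t *hdr)
{
    void (*entry)(void) = (void (*)(void))(uintptr_t)hdr->entry;
    entry(); /* the application is not expected to return */
}
#endif
```

The parser runs on a host as well, which makes it easy to check against a few hand-built headers before trying it on the board.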

2. Driver


The driver is a “wrapper” on top of the HAL that provides access to the necessary resources: file system, display / monitor, sound, game controller / touch panel. For further use, the driver API is passed as a structure of pointers through “shared memory”, a piece of memory reserved for both the bootloader and the other side (the application). This costs some memory, and perhaps a better solution would be SWI (software interrupt, SVC call), but that in turn requires the ability to switch context, because not every call can be handled inside an interrupt. The “shared” memory is also used to pass user arguments (for example, from the console). A prerequisite is to give this section the no-init attribute, so that the runtime library does not overwrite it while initializing the user application.
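
A minimal sketch of such a hand-off is shown below. Everything in it is an assumption made for illustration: the field names, the magic value, the 64-byte argument buffer and the stub driver function are not taken from the project. On the target, g_shared would live in a no-init linker section at an address both sides agree on; here it is an ordinary static variable so the example runs on a host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DRV_API_MAGIC 0x44525631u /* hypothetical marker for a valid table */

/* Hypothetical driver API: a table of function pointers. */
typedef struct {
    uint32_t magic;
    void (*video_update)(const uint8_t *frame);
} drv_api_t;

/* Block shared between the bootloader and the application. */
typedef struct {
    const drv_api_t *api; /* published by the bootloader */
    char args[64];        /* user arguments, e.g. typed in the console */
} shared_block_t;

static shared_block_t g_shared; /* stand-in for the no-init section */

/* Bootloader side: publish the API table and the argument string. */
static void boot_publish(const drv_api_t *api, const char *args)
{
    size_t n;

    assert(api != NULL && args != NULL);
    n = strlen(args);
    if (n > sizeof g_shared.args - 1)
        n = sizeof g_shared.args - 1; /* truncate to fit the buffer */
    g_shared.api = api;
    memcpy(g_shared.args, args, n);
    g_shared.args[n] = '\0';
}

/* Application side: fetch the table, or NULL if nothing valid is there. */
static const drv_api_t *app_acquire(void)
{
    const drv_api_t *api = g_shared.api;

    if (api == NULL || api->magic != DRV_API_MAGIC)
        return NULL;
    return api;
}

/* Stub driver function so the hand-off can be exercised on a host. */
static unsigned g_frames;
static void stub_video_update(const uint8_t *frame)
{
    (void)frame;
    g_frames++;
}
static const drv_api_t g_stub_api = { DRV_API_MAGIC, stub_video_update };
```

The application only ever sees the table, so it can be built without any HAL headers.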

3. Application


As a result, the only thing you need to know when building the application is the processor core architecture. There are no dependencies on the HAL, and there is no interrupt vector table either: all interrupts are handled by the loader. The application therefore takes up much less program memory, because part of the functionality is flashed together with the bootloader / driver, and this makes it possible to install the application in SRAM (internal RAM). That in turn significantly reduces the number of Flash write cycles and speeds up both debugging and execution in general. One drawback: when debugging, the application can only be launched from the outside, for example by a command from the console (COM port over ST-Link, VCOM); a very simplified command line is used for this.

Resources:

Loader, hal, Driver

Development Features


Memory


The first thing I had to deal with was managing external memory (SDRAM). Some games need a lot of memory for the .bss section (Duke Nukem: ~5.5 MB), and that much fits only in SDRAM. But the bootloader uses the same memory for temporary data (images, sound, file contents, etc.) that is needed only until the application starts. So it was decided to split the management of this memory: each side has its own malloc / free. After the application starts, the driver uses pointers to the malloc / free functions, which are passed as a parameter of the call when needed. That is, once the game is running, the driver cannot allocate from SDRAM directly.

An interesting fact about the D-Cache and I-Cache: because of the way external memory is handled, both caches must be turned off before the application starts, since SDRAM is re-initialized. That would be fine, but there is one catch: the cache must always be invalidated afterwards. Otherwise its lines still count as valid, even though the memory behind them was overwritten while the cache was off.

Another feature: all bootloader data is placed in the DTCM section, which makes it possible to bypass the cache when accessing that memory (the MPU allows the same) and, as a result, solves the coherence problems that arise with DMA in the chain CPU -> D-Cache -> Memory <- DMA.
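
The split allocator can be sketched like this. It is only an illustration under assumptions: the mem_ops_t structure, the drv_load_blob function and the tiny bump allocator standing in for the application's SDRAM heap are all hypothetical names, not the project's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Allocator callbacks that the application hands to the driver. */
typedef struct {
    void *(*alloc)(size_t size);
    void (*release)(void *ptr);
} mem_ops_t;

/* Application side: a bump allocator over a static pool (SDRAM stand-in). */
#define APP_POOL_SIZE 1024u
static uint64_t app_pool_words[APP_POOL_SIZE / 8u]; /* 8-byte aligned */
static size_t app_pool_used;

static void *app_malloc(size_t size)
{
    size_t aligned = (size + 7u) & ~(size_t)7u; /* round up to 8 bytes */

    if (aligned < size || aligned > APP_POOL_SIZE - app_pool_used)
        return NULL; /* overflow or pool exhausted */
    void *p = (uint8_t *)app_pool_words + app_pool_used;
    app_pool_used += aligned;
    return p;
}

static void app_free(void *ptr)
{
    (void)ptr; /* a bump allocator never reclaims individual blocks */
}

/* Driver side: it owns no heap after launch and allocates only through
 * the callbacks it receives with the call. */
static void *drv_load_blob(const mem_ops_t *mem, const void *src, size_t len)
{
    void *dst;

    assert(mem != NULL && mem->alloc != NULL && src != NULL);
    dst = mem->alloc(len);
    if (dst != NULL)
        memcpy(dst, src, len);
    return dst;
}
```

Passing the callbacks with each call keeps the driver free of any state about who owns SDRAM at the moment.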

Graphics


For the most part, this is a handful of image scaling functions (2x2, 3x3), an initialization function and palette loading. The key point is to set the memory attributes of the frame buffer correctly (done through the MPU configuration). For the bootloader this is “Write-back, no write allocate”, which eliminates flicker, since single-buffer mode is used. The application uses “Write-through, no write allocate”, which gives the highest FPS (Doom: ~28-40).

All existing game ports work with 8-bit graphics, but it is also possible to switch to a 16-bit true-color mode (this requires changes on the game side).
It is possible to scale an 8-bit image using DMA2D, but this approach has not paid off. Producing a final image of 640x480 pixels takes ~1000 interrupts, and it also causes a lot of artifacts in games: individual images (sprites, polygons) are not fully drawn, because the game's rendering then runs in parallel with the screen redraw.
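
A CPU-side 2x2 scaling function of the kind mentioned at the start of this section might look like the sketch below. It assumes an 8-bit indexed source and a 32-bit frame buffer with the palette applied during the copy; the real port may just as well keep 8-bit pixels and load the palette into the display controller. The function name and signature are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand an 8-bit indexed image to twice its size: each source pixel is
 * looked up in the palette and written as a 2x2 block into dst, which
 * must hold (2 * width) * (2 * height) 32-bit pixels. */
static void scale2x_indexed(const uint8_t *src, size_t width, size_t height,
                            const uint32_t palette[256], uint32_t *dst)
{
    const size_t dst_width = width * 2u;

    assert(src != NULL && palette != NULL && dst != NULL);
    for (size_t y = 0; y < height; y++) {
        uint32_t *row0 = dst + (2u * y) * dst_width; /* upper output row */
        uint32_t *row1 = row0 + dst_width;           /* lower output row */
        for (size_t x = 0; x < width; x++) {
            const uint32_t color = palette[src[y * width + x]];
            row0[2u * x] = color;
            row0[2u * x + 1u] = color;
            row1[2u * x] = color;
            row1[2u * x + 1u] = color;
        }
    }
}
```

For a 320x240 game frame this produces the 640x480 output mentioned above; a 3x3 variant follows the same pattern with three output rows.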

Sound


This part is a simple software 16-bit, 16-channel mixer based on examples from STMicroelectronics; the I2S controller is used as well. At the moment there is no conversion between audio formats: that is implemented at the game level, depending on the requirements. Duke Nukem, in my opinion, has the richest set of utilities for working with sound, including reverb.
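
The core of a mixer like this can be sketched in a few lines. This is an assumed design, not the project's code: the mix_channel_t layout, the 0..255 volume scale and the hard clipping are choices made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIX_CHANNELS 16

/* One playing sound: a cursor into 16-bit PCM data. */
typedef struct {
    const int16_t *data; /* next sample to play */
    size_t length;       /* samples remaining, 0 = channel idle */
    uint8_t volume;      /* 0..255 */
} mix_channel_t;

/* Clamp a 32-bit sum to the 16-bit sample range. */
static int16_t clip16(int32_t v)
{
    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}

/* Mix n output samples: sum all active channels in 32 bits, then clip.
 * Channels that run out of data simply fall silent. */
static void mix_block(mix_channel_t ch[MIX_CHANNELS], int16_t *out, size_t n)
{
    assert(ch != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        int32_t acc = 0;
        for (int c = 0; c < MIX_CHANNELS; c++) {
            if (ch[c].length == 0)
                continue;
            acc += ((int32_t)*ch[c].data * ch[c].volume) / 255;
            ch[c].data++;
            ch[c].length--;
        }
        out[i] = clip16(acc);
    }
}
```

Sixteen full-scale 16-bit samples sum to well under the 32-bit limit, so the accumulator cannot overflow before clipping. On the target, mix_block would typically refill one half of the I2S DMA buffer at a time.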

Input


The joystick driver is built on the USB HID class (in fact, the gamepad is detected as a computer mouse). The touch panel, like the gamepad, uses the same channel to deliver events, and it is an extremely inconvenient thing.

Games


Doom


The stm32doom port was taken as a basis. Sound support was added, the code was patched with the latest chocolate doom changes, and some of Killough's fixes from PrBoom were brought in. The game can use all the modifications and maps available for chocolate doom, including the added scenery and sound from the 3DO and PS1 versions of the game. A graphics optimization was added: the resolution of texture rendering depends on the distance. The solution is so-so; depending on the location it gains 3-7 frames per second. The latest version also adds support for “transparent” sprites, based on a generated table of combinations of palette entries; something similar is used in Quake II and in games based on the Build engine. The game, including modifications, can be played through to the end.

Resources:

stm32doom, 3DO doom, chocolate doom, Current Version

Duke Nukem


Based on the chocolate duke port. There was no time to fully understand the game's source code, so everything was left “as is” and only minor defects were fixed. The port can also run official and not-quite-official modifications: Atomic edition, Nuclear winter, etc.
Note: at the moment, none of the game's episodes can be played through to the end because of existing defects.

Resources:

chocolate duke, Current Version

Quake I


Unfortunately, the link to the original repository was not preserved and I cannot find it; the port went by the name sdl quake. Notable features are the client-server architecture and a very “gluttonous” stack (~700 KB), which at first led to a lot of interesting situations (armcc does not really monitor stack usage). There are also widespread alignment problems. Maybe this concerns only the armcc compiler, but almost everywhere a structure member larger than one byte is accessed, a wrapper function that reads / writes it byte by byte is needed; otherwise the result is a hard-fault exception. The game runs quite well, at ~15 fps on average. Several episodes can be completed, though it is more or less comfortable only at the first difficulty level :)
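
The wrappers used in the port are not shown here, but one portable way to write such accessors is through memcpy, which makes no alignment assumption about the address. The function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address that may be unaligned. */
static uint32_t load_u32(const void *p)
{
    uint32_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 32-bit value to an address that may be unaligned. */
static void store_u32(void *p, uint32_t v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}

/* The same for float, which Quake uses heavily. */
static float load_f32(const void *p)
{
    float v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

static void store_f32(void *p, float v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}
```

A usage example: instead of dereferencing a cast pointer such as *(float *)(buf + 3), the code calls load_f32(buf + 3).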

Resources:

Current version

Hexen, Heretic


For the most part, they inherit the Doom engine, so from a porting point of view they are almost identical (IMHO). In Hexen, the ability to start the game at a selected location was added. Neither game can be played through to the end.

Resources:

hexen stm32, heretic stm32

Result


Doom, Duke Nukem, Quake

Thank you for your attention.
