Porting Quake to iPod Classic


Quake running on the iPod Classic (video).

TL;DR: I managed to run Quake on an MP3 player. This article describes how it happened.

I spent part of last summer on a couple of my favorite things: Rockbox and id Software's Quake. I even got the chance to combine these two hobbies by porting Quake to Rockbox! I could not have wished for more!

This post tells the story of how it all worked out. It is a long story, spanning almost two years. It is also my first attempt to document the development process itself, detailed and unvarnished, as opposed to the polished technical documentation of which I have written far too much in my life. There will be technical details too, but above all I will try to describe the thought process that led to the code.

Alas, the time has come to say goodbye to Rockbox and Quake, at least for the near future. Free time will be a very scarce resource for me over the next several months, so before diving into work I want to get my thoughts down.

Rockbox


Rockbox is a curious open source project that I have spent a lot of time hacking on. Its web page describes it best: "Rockbox is free firmware for digital music players." That's right: we have created a complete replacement for the factory software that ships on SanDisk Sansa players, the Apple iPod, and many other supported devices.

We not only strive to replicate the features of the original firmware, but have also implemented support for loadable extensions called plugins: small programs that run on the MP3 player. Rockbox already has many great games and demos, the most impressive of which are probably the first-person shooters Doom and Duke Nukem 3D [1]. But I felt that something was missing.

Quake appears on stage


Quake is a fully three-dimensional first-person shooter. Let's see what that means. The key words here are "fully three-dimensional". Unlike Doom and Duke Nukem 3D, which are commonly described as 2.5D (imagine a 2D map with an added height component), Quake is implemented in full 3D. Every vertex and polygon exists in 3D space, which means the old pseudo-3D tricks no longer work. But I digress. In short, Quake is serious business.

And Quake is no joke. Our research showed that Quake "requires" an x86 processor running at roughly 100 MHz with an FPU, as well as about 32 MB of RAM. Before you start giggling, remember that Rockbox's target platforms are nothing like what John Carmack had in mind when writing the game: Rockbox even runs on devices with an 11 MHz processor and 2 MB of RAM (of course, Quake is not expected to run on those). With this in mind, I looked over my dwindling collection of digital audio players and picked the most powerful survivor: the Apple iPod Classic / 6G, with a 216 MHz ARMv5E processor and 64 MB of RAM (the E suffix indicates the ARM DSP extensions, which will matter later). Serious specs, but barely enough to run Quake.

Port


There is a wonderful version of Quake that runs on SDL, logically named SDLQuake. Fortunately, I had already ported the SDL library to Rockbox (a topic for another article), so getting Quake ready to compile was a fairly simple process: copy the source tree; run make; fix the errors; rinse, repeat. I am glossing over a lot of boring details here, but just imagine my excitement when the Quake executable finally compiled and linked successfully. I was delighted.

“Well, load it!” I thought.

And it booted! I was greeted by the beautiful Quake console background and menu. Everything looked perfect. But not so fast! When I started a game, something was wrong. The "Introduction" level seemed to load normally, but the player's spawn position was completely off the map. Strange, I thought. I tried various tricks, did some debugging and sprinkled splashf calls around, but it was all in vain: the bug was too complicated for me, or so it seemed.

And there things stood for almost two years. A word about the timeline is probably in order. I made my first attempt to run Quake in September 2017, then gave up, and my Frankenstein's monster of Quake and Rockbox sat on the shelf gathering dust until July 2019. Having found the perfect combination of boredom and motivation, I decided to finish what I had started.

I started debugging. I was so deep in a state of flow that I remember almost none of the details of what I did, but I will try to reconstruct the course of events.

I found that Quake is divided into two main parts: the engine code in C, and the high-level game logic in QuakeC, a language compiled to bytecode. I had always tried to stay away from the QuakeC VM out of an irrational fear of debugging someone else's code. But now I was forced to dive into it. I vaguely recall a frantic session spent hunting for the source of the bug. After a lot of grepping, I found the culprit: pr_cmds.c:PF_setorigin. This function receives a three-dimensional vector that sets the player's new coordinates when the map loads, and for some reason that vector was always (0, 0, 0). Hmm...

I traced the data flow back and found where the values came from: a call to Q_atof(), the classic string-to-float conversion function. And then it dawned on me: I had written a set of wrapper functions that overrode Quake's Q_atof(), and my atof() implementation was evidently wrong. The fix was very easy. I replaced my faulty atof with the correct function from the Quake code. And voila! The famous entry level with its three corridors loaded without any problems, as did "E1M1: The Slipgate Complex". The audio output still sounded like a broken lawn mower, but we were running Quake on an MP3 player!
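To make the failure mode concrete, here is a minimal sketch of the kind of decimal string-to-float conversion that spawn coordinates depend on. It is an illustration only: simple_atof is a hypothetical name, and this is neither my broken wrapper nor Quake's actual Q_atof(). It assumes plain decimal input with an optional sign and a single decimal point, and ignores exponents.

```c
/* Hypothetical helper for illustration; not the Quake or Rockbox code.
 * Parses an optional sign, digits, and at most one decimal point. */
static double simple_atof(const char *s)
{
    double sign = 1.0;
    double value = 0.0;   /* all digits accumulated as one integer */
    double scale = 1.0;   /* 10^(number of digits after the point) */
    int seen_point = 0;

    if (*s == '-') {
        sign = -1.0;
        s++;
    } else if (*s == '+') {
        s++;
    }

    for (; *s; s++) {
        if (*s == '.' && !seen_point) {
            seen_point = 1;
            continue;
        }
        if (*s < '0' || *s > '9')
            break;        /* stop at the first non-numeric character */
        value = value * 10.0 + (*s - '0');
        if (seen_point)
            scale *= 10.0;
    }
    return sign * value / scale;
}
```

A conversion that always returns zero would produce exactly the symptom described above: every coordinate parsed from the map data collapses to 0, and the player spawns at the origin.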

Down the rabbit hole


This project finally gave me an excuse to do something I had long been putting off: learning ARM assembly language [2].

The problem was the performance-critical sound mixing loop in snd_mix.c (remember the lawn mower sound?).

The function SND_PaintChannelFrom8 takes an array of 8-bit mono audio samples and mixes them into a 16-bit stereo stream, with the left and right channels scaled separately by two integer parameters. GCC did a lousy job of optimizing the saturating arithmetic, so I decided to do it myself. The result completely satisfied me.

Here is the assembler version of what I got (the C version is presented below):

SND_PaintChannelFrom8:
        ;; r0: int true_lvol
        ;; r1: int true_rvol
        ;; r2: char *sfx
        ;; r3: int count

        stmfd sp!, {r4, r5, r6, r7, r8, sl}

        ldr ip, =paintbuffer
        ldr ip, [ip]

        mov r0, r0, asl #16                 ; prescale by 2^16
        mov r1, r1, asl #16

        sub r3, r3, #1                      ; count backwards

        ldrh sl, =0xffff                    ; halfword mask

1:
        ldrsb r4, [r2, r3]                  ; load input sample
        ldr r8, [ip, r3, lsl #2]                ; load output sample pair from paintbuffer
                                ; (left:right in memory -> right:left in register)
        ;; right channel (high half)
        mul r5, r4, r1                      ; scaledright = sfx[i] * (true_rvol << 16) -- bottom half is zero
        qadd r7, r5, r8                     ; right = scaledright + right (in high half of word)
        bic r7, r7, sl                      ; zero bottom half of r7

        ;; left channel (low half)
        mul r5, r4, r0                      ; scaledleft = sfx[i] * (true_lvol << 16)
        mov r8, r8, lsl #16                 ; extract original left channel from paintbuffer
        qadd r8, r5, r8                     ; left = scaledleft + left

        orr r7, r7, r8, lsr #16                 ; combine right:left in r7
        str r7, [ip, r3, lsl #2]                ; write right:left to output buffer
        subs r3, r3, #1                         ; decrement and loop

        bgt 1b                          ; must use bgt instead of bne in case count=1

        ldmfd sp!, {r4, r5, r6, r7, r8, sl}

        bx lr

There are some tricky hacks here that are worth explaining. I use qadd, an ARM DSP instruction, to get cheap saturating addition, but qadd only works on 32-bit words, while the game uses 16-bit sound samples. The hack is to first shift the samples left by 16 bits, combine them with qadd, and then shift the result back. That way a single instruction does what took GCC seven. (Yes, I could have avoided the hack entirely on ARMv6, which has MMX-like packed saturating arithmetic in the form of qadd16, but alas, life is not that simple. Besides, the hack turned out to be cool!)
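The shift trick can be modelled in portable C. The sketch below is an assumption-laden illustration, not the shipped code: qadd32 is a hypothetical software stand-in for the qadd instruction, and sat_add16_via_qadd is a made-up name for the 16-bit addition built on top of it.

```c
#include <stdint.h>

/* Software model of a signed 32-bit saturating add (what qadd does). */
static int32_t qadd32(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX)
        return INT32_MAX;
    if (sum < INT32_MIN)
        return INT32_MIN;
    return (int32_t)sum;
}

/* 16-bit saturating add built from the 32-bit one: move both operands
 * into the high half of a word so that 16-bit overflow becomes 32-bit
 * overflow, add with saturation, then move the result back down. */
static int16_t sat_add16_via_qadd(int16_t a, int16_t b)
{
    int32_t high_a = (int32_t)a * 65536;   /* same as a << 16 */
    int32_t high_b = (int32_t)b * 65536;
    int32_t high_sum = qadd32(high_a, high_b);
    /* Division truncates toward zero, so 0x7fffffff becomes 32767. */
    return (int16_t)(high_sum / 65536);
}
```

For example, adding 30000 and 10000 overflows the 16-bit range, so the result clamps to 32767 instead of wrapping to a negative value, which is what keeps loud mixed audio from crackling.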

Note also that I read and write both 16-bit halves of a stereo sample pair at once (using the word-sized ldr and str) to save a couple more cycles.

Below is a C version for reference:

void SND_PaintChannelFrom8 (int true_lvol, int true_rvol, signed char *sfx, int count)
{
        int data;
        int i;

        // we have 8-bit sound in sfx[], which we want to scale to
        // 16bit and take the volume into account
        for (i=0 ; i<count ; i++)
        {
            // We could use the QADD16 instruction on ARMv6+
            // or just 32-bit QADD with pre-shifted arguments
            data = sfx[i];
            paintbuffer[2*i+0] = CLAMPADD(paintbuffer[2*i+0], data * true_lvol); // need saturation
            paintbuffer[2*i+1] = CLAMPADD(paintbuffer[2*i+1], data * true_rvol);
        }
}
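The listing above relies on a CLAMPADD macro and a global paintbuffer that are not shown. The following self-contained sketch fills them in under explicit assumptions: CLAMPADD is assumed to add two values and clamp the sum to the signed 16-bit range, and the output buffer is passed as a parameter (paint) instead of being a global. The names clamp_add16 and paint_channel_from8 are hypothetical.

```c
#include <stdint.h>

/* Assumed behaviour of CLAMPADD: add, then clamp to the int16_t range. */
static int16_t clamp_add16(int a, int b)
{
    int sum = a + b;
    if (sum > INT16_MAX)
        return INT16_MAX;
    if (sum < INT16_MIN)
        return INT16_MIN;
    return (int16_t)sum;
}
#define CLAMPADD(a, b) clamp_add16((a), (b))

/* Same loop as the reference version, with the interleaved
 * left/right output buffer passed in explicitly. */
static void paint_channel_from8(int16_t *paint, int true_lvol, int true_rvol,
                                const signed char *sfx, int count)
{
    for (int i = 0; i < count; i++) {
        int data = sfx[i];
        paint[2 * i + 0] = CLAMPADD(paint[2 * i + 0], data * true_lvol);
        paint[2 * i + 1] = CLAMPADD(paint[2 * i + 1], data * true_rvol);
    }
}
```

The two comparisons per channel in clamp_add16 are the work that the assembly version folds into a single saturating add.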

I calculated that, compared to the optimized C version, the number of instructions per sample dropped by 60%. Most of the cycles were saved by using qadd for the saturating arithmetic and by packing the memory operations.

Conspiracy of "prime" numbers


Here is another interesting bug that I found along the way. In the assembly listing, next to the bgt instruction (branch if greater than), there is a comment saying that bne (branch if not equal) cannot be used because of an edge case that hangs the program when the number of samples is 1. In that case the loop counter wraps around to 0xFFFFFFFF, causing an extremely long delay (which does eventually end).

This edge case is triggered by one particular sound with a length of 7325 samples [3]. What is so special about 7325? Let's take its remainder modulo successive powers of two:
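The difference between the two branch conditions can be modelled in C by counting how often the loop body would run. This is a sketch of the counter logic only, with hypothetical function names, and it adds an artificial limit parameter so that the runaway case stops early instead of spinning through roughly four billion iterations.

```c
#include <stdint.h>

/* Loop ending in bne: repeat while the decremented counter is non-zero. */
static uint64_t iterations_bne(uint32_t count, uint64_t limit)
{
    uint32_t r3 = count - 1;          /* sub r3, r3, #1 */
    uint64_t n = 0;
    do {
        n++;                          /* one pass through the loop body */
        r3 = r3 - 1;                  /* subs r3, r3, #1 (wraps at zero) */
    } while (r3 != 0 && n < limit);   /* bne 1b */
    return n;
}

/* Loop ending in bgt: repeat while the counter is greater than zero,
 * treating it as a signed value. */
static uint64_t iterations_bgt(uint32_t count, uint64_t limit)
{
    int32_t r3 = (int32_t)count - 1;
    uint64_t n = 0;
    do {
        n++;
        r3 = r3 - 1;
    } while (r3 > 0 && n < limit);    /* bgt 1b */
    return n;
}
```

With a count of 1, the bne model only stops when it hits the artificial limit, because the counter has wrapped to 0xFFFFFFFF, while the bgt model stops after a single pass.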

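$$
\begin{aligned}
7325 &\equiv 1 \pmod{2} \\
7325 &\equiv 1 \pmod{4} \\
7325 &\equiv 5 \pmod{8} \\
7325 &\equiv 13 \pmod{16} \\
7325 &\equiv 29 \pmod{32} \\
7325 &\equiv 29 \pmod{64} \\
7325 &\equiv 29 \pmod{128} \\
7325 &\equiv 157 \pmod{256} \\
7325 &\equiv 157 \pmod{512} \\
7325 &\equiv 157 \pmod{1024} \\
7325 &\equiv 1181 \pmod{2048} \\
7325 &\equiv 3229 \pmod{4096}
\end{aligned}
$$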


5, 13, 29, 157 ...

Notice anything? By some coincidence, 7325 leaves a "prime" remainder when divided by any power of two. This somehow (I do not understand how) results in a single-sample array being passed to the sound mixing code, which triggers the edge case and hangs.

I spent at least a day tracking down the cause of this bug, only to find that it all came down to a single wrong instruction. Sometimes life is like that, right?
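The remainders are easy to check mechanically. Here is a small sketch with hypothetical helper names: remainder_pow2 takes the remainder modulo 2^k with a bit mask, and is_prime is a naive trial-division test that is perfectly adequate for numbers this small.

```c
#include <stdbool.h>

/* Remainder of n modulo 2^k, computed with a bit mask (requires k < 32). */
static unsigned remainder_pow2(unsigned n, unsigned k)
{
    return n & ((1u << k) - 1u);
}

/* Naive trial-division primality test. */
static bool is_prime(unsigned n)
{
    if (n < 2)
        return false;
    for (unsigned d = 2; d * d <= n; d++) {
        if (n % d == 0)
            return false;
    }
    return true;
}
```

Strictly speaking, the remainder 1 for the two smallest moduli is not prime, which is one reason "prime" stays in quotes; from modulus 8 up to 4096 every remainder passes the test.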

Parting


I eventually packaged this port as a patch and merged it into the main Rockbox branch, where it lives today. In Rockbox 3.15 and later, it ships in builds for most ARM target platforms with color displays [4]. If you do not have a supported platform, you can watch the demo by user890104.

To save space, I have skipped a couple of interesting points. For example, there is a race condition that only occurs when a zombie is blown to bits while the sampling rate is 44.1 kHz. (It was the result of the sound thread trying to load the explosion sound while the model loader tried to load the gib model. Both code paths use the same function, which relies on a single global variable.) There were also plenty of ordering problems (love you, ARM!) and a bunch of rendering micro-optimizations that I made to squeeze a few more frames out of the hardware. But I will leave those for another time. And now it is time to say goodbye to Quake. I enjoyed the experience.

So long, and thanks for all the fish!
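The general shape of such a bug, a helper whose result lives in one global variable shared by two callers, can be sketched without threads by interleaving the calls by hand. Everything here is hypothetical: the post does not name the function or the variable involved, so the helper names, the buffer, and the file names are made up for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical non-reentrant helper: the result lives in one global
 * buffer, so a second call overwrites what the first caller got back. */
static char shared_path[64];

static const char *build_path_shared(const char *name)
{
    snprintf(shared_path, sizeof shared_path, "data/%s", name);
    return shared_path;
}

/* Reentrant variant: each caller supplies its own storage. */
static void build_path_reentrant(const char *name, char *out, size_t out_size)
{
    snprintf(out, out_size, "data/%s", name);
}
```

If one caller asks for a sound path and, before it uses the result, another caller asks for a model path, the first caller ends up holding the model path. Giving each caller its own buffer removes the shared state and with it the race.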



Notes


  1. Duke Nukem 3D , runtime Rockbox SDL, . , user890104.
  2. ARM, Tonc: Whirlwind Tour of ARM Assembly — ( GBA) . , ARM Quick Reference Card.
  3. , 100 .
  4. Honestly, I do not remember exactly which target platforms do and do not support Quake. If you are curious, go to the Rockbox website and try installing the build for your platform. And let me know by email how it works! Newer versions of Rockbox Utility (1.4.1 and later) also support automatic installation of the shareware version of Quake.

Source: https://habr.com/ru/post/undefined/

