The shortest decimal program

In 1984, Steven Levy's cult book, Hackers: Heroes of the Computer Revolution , was released. There is an amateur Russian translation, but it is far from ideal. I undertook to correct inaccuracies in it, putting the English original next to it (by the way, and it is not without sin), but abandoned it after the second chapter. One way or another, I want to draw your attention to a fragment (you can read it in a separate article ) dedicated to the subroutine for printing a number in a decimal system. How much can such a program be reduced? What is the limit?

In August 2018, I wrote a program for measuring the exact execution time of instructions of the Soviet processor 1801BM1 (it has a set of PDP-11 instructions ). Knowing the exact time (in processor cycles) was necessary when working on the “ Good Apple ” demo for the BK 0011M computer. I wanted to see the measurement results in decimal. To do this, I had to write my own subroutine - system functions were not available due to the specifics of the test.

The first thing I did was make up a TEN array with powers of 10. The procedure takes a number in register R0, the output is a text string at NUMBER. Important: there is no division instruction in the processor !

	MOV #NUMBER,R1	; pointer to output text string
	MOV #TEN,R5
1:	CMP (R5)+,R0 	; skip leading zeros
	BHI 1		; branch if higher, for 16-bit not signed
	MOV -(R5),R3
	BEQ 4		; if less then 10

2:	MOV #47.,R4	; 0 symbol in ASCII codepage - 1
3:	INC R4		; count digits
	SUB R3,R0
	BHIS 3		; branch if higher or same, for 16-bit not signed
	MOVB R4,(R1)+	; print R4

	ADD (R5)+,R0
	MOV (R5),R3
	BNE 2

4:	ADD #48.,R0	; 0 symbol in ASCII codepage
	MOVB R0,(R1)+	; print R0
	CLRB (R1) 	; end of text marker
	RET

TEN:	.WORD 10000.,1000.,100.,10.,0

To understand the PDP-11 assembler, it is enough to remember that the arguments are written from left to right (first the source, then the receiver), and conditional branch commands begin with the letter B (from the word branch - branching). I will not describe the algorithm, it is of no interest and is given here only as a starting point. The size of this routine is 22 words (not including data).

After everything worked, I suddenly remembered a story from Stephen Levy's book: hackers struggled to reduce the size of a similar program, and also on the PDP architecture. True, they had PDP-1, but a few years later they got PDP-11.

image

Opening the book, I found an extremely vague description. Hackers from MIT startedjust like me from the list of decimal places. But what happened next is not clear from the text. Obviously, this was not clear to the author of the book either; he simply wrote down common words from the lips of eyewitnesses of that hacking contest.

I had to delve into the Internet archive of programs for PDP-1 . There are a lot of interesting things: Minskytron, Munching squares and other so-called “display hacks” (by the way, it is noteworthy that already then - in the early 60s - MIT hackers used the term “ demo ” in the same sense in which we use it now on the demoscene). There are many system routines in the archive, but, unfortunately, there is no decimal number among them.

Then, armed with a debugger, I decided to see how this procedure was implemented in the MKDOS operating system, which I use on the BK 0010 and BK 0011M. Oh miracle! - Having looked closely, I realized that the subprogram fits very well with the foggy description from the book “Hackers”. Here is her code:

	MOV #10.,R4
	CLR R2
1:	CLR R1
2:	SUB R4,R0 	;    10,    
	BLO 3
	INC R1 		;  -   
	BR 2
3:	ADD R4,R0
	ADD #48.,R0 	; ASCII-  0
	INC R2		;  -   
	MOVB R0,-(SP)
	MOV R1,R0 	;    -   
	BNE 1
	INC R2
	MOVB #32.,-(SP)	; ASCII- 
4:	MOVB (SP)+,R0
	CALL PRINT
	SOB R2,4	;     
	RET

The program forms a text string on the stack, then calls the print procedure for each saved character. Apparently, this is exactly what was meant in the book by Stephen Levy under the phrase "converts in the opposite way, and with the help of a cunning software focus it prints in the right order." Other features of the algorithm should be clear from the comments on the code.

The size of the subroutine is 23 words , but you cannot directly compare it with my subroutine: the conditions are too different. I decided to remake the program from MKDOS to my conditions: the formation of a text string in memory.

In the end, I realized that it’s better to leave only the idea of ​​subtracting the number 10, and write the rest from scratch. After several laps of Sansara optimization, I got the following:

	MOV #NUMBER,R1	; pointer to output text string
	CLRB -(R1)	; end of text marker
	MOV #10.,R4
1:	MOV #-1.,R5
2:	INC R5		; counter of 10s
	SUB R4,R0
	BHIS 2		; branch if higher or same
	ADD #58.,R0	; #10. + '0' ASCII code
	MOVB R0,-(R1)	; store R0 to text string
	MOV R5,R0	; let's count next how many 10s in number of 10s
	BNE 1
	RET		; returns text string pointer in R1

16 words , the limit is reached (I thought), here it is - Nirvana, about which Stephen Levy wrote so emotionally!

What tricks are applied here:

  1. The first command sets the pointer not to the beginning, but to the end of the text line. The text is filled from right to left - this is also convenient because at the output we get the address of the beginning of the line, ready to be sent to the text printing procedure.
  2. , . (INC R5) 1. 0, , ?.. , 10 – , . , . . – . : 1 , , 1 ( ). , , . . -1.
  3. 10, , , . – ( ). , – . , , . ( 10 10 ) . : ASCII- 0, ! , , . ADD #58.,R0 (48+10).

I was so pleased with the program that I decided to share it on the zx-pk.ru forum (not suspecting anything of local traditions to criticize without arguments). The community’s reaction was something like this: “you just had to see how they did it at DEC, it's a classic.”

Well, here is a program from DEC , the company that created PDP-11 and incorporated some of the hackers from MIT into its ranks:

; RETURNS:
; R0 = 0
; R1 -> byte following last digit in converted number
CVBTOD:	MOV	R0,-(SP)	;SAVE THE NUMBER PASSED TO US
	CLR	R0		;SET FOR CRUDE DIVIDE BY 10.
10$:	INC	R0		;BUMP QUOTIENT
	SUB	#10.,(SP)	;REDUCE NUMBER BY 10.
	BHIS	10$		;IF SIGN DIDN'T CHANGE...
	ADD	#10.+48.,(SP)	;MAKE REMAINDER PRINTABLE
	DEC	R0		;REDUCE QUOTIENT
	BEQ	20$		;IF ZERO, TIME TO PRINT
	CALL	CVBTOD		;OTHERWISE, RECURSE !
20$:	MOVB	(SP)+,(R1)+	;STORE A CONVERTED DIGIT
	RETURN			;UNWIND THE RECURSION

14 words , cool! Or not? I was defeated, but let's take a closer look:

  1. Adding the ASCII code of character 0 and number 10 is done in one operation. It turns out that such a trick was used back in the 70s. Cool.
  2. The program calls itself recursively - a beautiful solution!
  3. – , . – .
  4. R1 . , , .
  5. , ! ?.. , . , zx-pk.ru MOV #NUMBER,R1 !

Total, the actual size is 16 words . Exactly like mine. Both programs consist of 12 instructions. So which is better?

Even if you replace stack calls with register calls, the DEC program will be slower due to the DEC R0 and CALL instructions inside the loop.

But that's not all. When I started writing this article, I noticed that in my program there was a rudimentary (from MKDOS) instruction MOV # 10., R4 - it does not make any sense other than accelerating the inner loop. It's time to get rid of her.

	MOV #NUMBER,R1	; pointer to output text string
	CLRB -(R1)	; end of text marker
1:	MOV #-1.,R5
2:	INC R5		; counter of 10s
	SUB #10.,R0
	BHIS 2		; branch if higher or same
	ADD #58.,R0	; #10. + '0' ASCII code
	MOVB R0,-(R1)	; store R0 to text string
	MOV R5,R0	; let's count next how many 10s in number of 10s
	BNE 1 		; loop if R0 is not zero
	RET		; returns text string pointer in R1

15 words . 11 instructions. Now, it seems, that's all.

Well, my optimization ideas are over. It was an inspirational, even a gamble, challenge. It is remarkable that the idea proposed by a hacking student in the early 60s for PDP-1 was used by DEC ten and even twenty years later, and it was applied on the Soviet computer BK 0011M until the early 2000s. Surprisingly, in the year 2018, it was possible to partially reinvent and optimize the algorithm. Characteristically, many considered this impossible.

So, here is the Holy Grail (according to Stephen Levy), which the hackers of the 60s tried to find - the shortest decimal program for PDP. Or ... maybe even shorter?

Update: I knew that this day would come, but I did not think so soon :) It turned out that the program can be reduced by one more word! The idea was suggested in the comments of Mr_Rm :

	MOV #NUMBER,R1	; pointer to output text string
	CLRB -(R1)	; end of text marker
1:	CLR R5
2:	INC R5		; counter of 10s +1
	SUB #10.,R0
	BHIS 2		; branch if higher or same
	ADD #58.,R0	; #10. + '0' ASCII code
	MOVB R0,-(R1)	; store R0 to text string
	MOV R5,R0	; let's count how many 10s in number of 10s
	SOB R0,1	; subtract 1 and loop if not zero
	RET		; returns text string pointer in R1

14 words . 11 instructions.
The trick is this. Remember, I wrote that with the “wrong” organization of the R5 cycle, one more is obtained than necessary? Well, let! We will reduce it at the very end of the program. In the previous version, R5 was copied to R0, after which the BNE (Branch if Not Equal to zero) command checked R0, and if it is not equal to zero, it went to the beginning of the cycle. If only we would first reduce R0 by one, and then (if we got not zero) go to the beginning ... Wait a minute, but this is a normal SOB (Subtract One and Branch) cycle command. True, here it was not used in a canonical way: it fulfills once, and then the cycle counter is frayed. This looks discouraging, but if you look at the previous version of the subroutine, it becomes clear how a new one is shortened out of it.

Useful links:


All Articles