Extracting constants from the die of the 8087 math coprocessor

In 1980, Intel introduced the 8087 chip to speed up floating point arithmetic on 8086 processors, and it was used in the original IBM PC. Since early microprocessors operated only on integers, floating point arithmetic was slow, and transcendental functions such as arctangent or logarithms were even slower. Adding an 8087 coprocessor to the system could speed up floating point operations by up to a hundred times.

I opened up an 8087 chip and photographed it under a microscope. The photo below shows the tiny silicon die. Around its edges, tiny bond wires connect it to the 40 external pins. I labeled the main functional blocks based on my reverse engineering. If you study the chip closely, you can extract various constants from its ROM - numbers such as π that the chip uses in its calculations.

Die of the Intel 8087 floating point chip with the main functional blocks labeled. The constant ROM is outlined in green.

The upper half of the chip holds the control circuitry. Executing a floating point instruction can take up to 1000 steps; the 8087 used microcode to specify these steps. The die photo shows the "engine" that runs the microcode program; it is essentially a simple CPU. Next to it is the large ROM that holds the microcode.

The lower half of the chip holds the circuitry that processes floating point numbers. A floating point number consists of a fractional part (also called the significand or mantissa), an exponent, and a sign bit. For example, in the decimal number 6.02 × 10²³, 6.02 is the mantissa and 23 is the exponent. Separate circuits process the mantissa and the exponent in parallel. The mantissa circuitry handles 67-bit values: a 64-bit mantissa plus three extra bits for accuracy. From left to right, the mantissa circuitry consists of the constant ROM, a shift register, an adder/subtracter, and the register stack. The constant ROM, highlighted in green in the photo, is the subject of this article.

The 8087 operated as a coprocessor to the 8086. When the 8086 encountered a special floating point instruction, it ignored it and let the 8087 execute it in parallel. The interaction between the 8086 and the 8087 was rather tricky. Simplifying, the 8087 watches the 8086's instruction stream and executes any instructions meant for the 8087. The difficulty is that the 8086 has an instruction prefetch buffer, so the instruction being fetched is not the instruction being executed. The 8087 therefore duplicated the 8086's prefetch buffer (or the 8088's smaller prefetch buffer) so that it knew what the 8086 was doing (described in more detail here). Another difficulty is the 8086's complex addressing modes, which use registers inside the 8086. The 8087 cannot carry out these addressing modes because it has no access to the 8086's registers. Instead, when the 8086 sees an 8087 instruction, it reads from the addressed memory location and ignores the result. Meanwhile, the 8087 grabs the address off the bus in case it needs it. You might expect a trap to occur here if no 8087 is installed, but that is not what happens. On a system without an 8087, the linker rewrites the 8087 instructions, replacing them with calls to subroutines in an emulation library.

I won't describe the internals of the 8087 in detail, but in general, floating point operations are implemented with integer additions, subtractions, and shifts. To add or subtract two floating point numbers, the 8087 shifts them until their binary points (the binary equivalent of decimal points) line up, and then adds or subtracts the mantissas. Multiplication, division, and square root are performed by repeated shifts and additions or subtractions. Transcendental functions (tangent, arctangent, logarithm, exponential) use CORDIC algorithms, which rely on shifts and additions of special constants to compute results efficiently.
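
The alignment step for addition can be sketched in Python. This is only an illustration under stated assumptions: a value is a toy pair (mantissa, exponent) meaning mantissa × 2^exponent with an integer mantissa, and the names fp_add and to_float are hypothetical. It is not the 8087's internal format, and rounding and normalization are left out.

```python
def fp_add(a, b):
    """Add two toy floating point numbers given as (mantissa, exponent).

    Each value is mantissa * 2**exponent with an integer mantissa.
    Illustration only: no rounding and no normalization.
    """
    (ma, ea), (mb, eb) = a, b
    # Align the binary points by bringing both numbers to the smaller exponent.
    # Shifting left keeps every bit; real hardware with a fixed-width mantissa
    # would shift the smaller number right instead and round.
    if ea > eb:
        ma <<= ea - eb
        ea = eb
    else:
        mb <<= eb - ea
        eb = ea
    # With the binary points aligned, the addition is a plain integer addition.
    return (ma + mb, ea)


def to_float(x):
    """Convert a toy (mantissa, exponent) pair to a Python float."""
    mantissa, exponent = x
    return mantissa * 2.0 ** exponent


# 1.5 is 3 * 2**-1 and 0.25 is 1 * 2**-2
total = fp_add((3, -1), (1, -2))
print(total, to_float(total))
```

The only arithmetic used is shifting and integer addition, which is the point of the paragraph above.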

ROM implementation

This article describes the ROM that holds the constants. Don't confuse it with the larger, four-level microcode ROM, which is built with an unusual technique that stores two bits per transistor. It does this by using transistors of three different sizes, or no transistor at all, at each position. These four possibilities encode two bits. This sophisticated technique was needed to fit the large ROM onto the 8087 die.

The constant ROM uses standard techniques to store the constants (such as π, ln(2), and √2) that the 8087 needs for its calculations. The photo below shows part of the constant ROM. The metal layer has been removed to expose the silicon underneath. The pinkish regions are doped silicon, which has different electrical properties, and the reddish and greenish lines are polysilicon, a special type of silicon wiring layered on top. Note the regular, grid-like structure of the ROM. The ROM consists of two columns of transistors that store the bits. To explain how it works, I'll start with how a transistor works.

Part of the constant ROM with the metal layer removed. The three columns of larger transistors are used to select rows.

High-density integrated circuits (ICs) in the 1970s were usually built from N-MOS transistors. (Modern computers use CMOS, which combines N-MOS transistors with P-MOS transistors of the opposite polarity.) The diagram below shows the structure of an N-MOS transistor. An IC is built on a silicon substrate, on which the transistors are formed. Regions of the silicon are doped with impurities, creating "diffusion" regions with the desired electrical properties. A transistor can be thought of as a switch that lets current flow between two diffusion regions called the source and the drain. The transistor is controlled by a gate made of a special type of silicon called polysilicon. Applying a voltage to the gate lets current flow between the source and drain; otherwise, no current flows. The 8087 is fairly complex, with about 40,000 transistors.

Sources give different transistor counts for the 8087: Intel says about 40,000, while Wikipedia says about 45,000. The discrepancy may come from different counting methods. Since the number of transistors in a ROM, PLA, or similar structure depends on the data stored, sources often give the number of "potential" transistors rather than actual ones. Pull-up transistors may or may not be counted, and a high-current driver may be counted as one transistor or as several in parallel.

Structure of a MOS transistor as implemented in an IC

Zooming in, the individual transistors of the ROM become visible. The pinkish regions are doped silicon, forming the sources and drains. The vertical polysilicon select lines form the transistor gates. The marked silicon regions are connected to ground, pulling one side of each transistor low. The circles are vias, connections between the silicon and the metal lines above. The metal lines were removed for the photo; the orange line shows where one of them ran.

Part of the constant ROM. Each select line selects a particular constant. Transistors are shown in yellow. An X marks a missing transistor, corresponding to a 0 bit. The orange line shows the position of a metal wire.

The important feature of the ROM is that some transistors are missing: the first one in the upper row, and the two marked with an X in the lower row. Bits are programmed into the ROM by changing the silicon doping pattern, which either creates a transistor or leaves an insulating region. Each present or missing transistor represents one bit. When a select line is activated, all the transistors in that column turn on, pulling the corresponding output lines low. But where there is no transistor at the selected position, the corresponding output line stays high. Thus, a value is read from the ROM by activating a select line, which places that value on the output lines.
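
The read-out described above can be modeled in a few lines of Python. This is a toy sketch, not the 8087's circuitry: the grid contents are made up for illustration, and the names TOY_ROM, read_column, and stored_bits are hypothetical. It assumes that an output line idles high and is pulled low wherever the activated column has a transistor, and it takes a present transistor to mean a 1 bit, as the article reports further on.

```python
# Toy ROM: one list per column; True means a transistor is present.
# These contents are invented for illustration, not read from the die.
TOY_ROM = [
    [False, True, True, True],
    [True, False, True, False],
]


def read_column(rom, column):
    """Return the electrical levels on the output lines (1 = high, 0 = low)."""
    # A present transistor pulls its output line low; a missing one leaves it high.
    return [0 if present else 1 for present in rom[column]]


def stored_bits(rom, column):
    """Return the logical bits, taking a present transistor to mean 1."""
    return [1 if present else 0 for present in rom[column]]


print(read_column(TOY_ROM, 0))  # levels on the output lines
print(stored_bits(TOY_ROM, 0))  # logical bits
```

Under these assumptions the electrical level is the inverse of the logical bit.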

ROM contents

The constant ROM consists of 134 rows and 21 columns. It holds 134 rows of 21 bits, except for a 6 × 6 block of transistors missing from the upper left. The physical size of the constant ROM is therefore 134 × 21 - 36 = 2778 bits.

Based on the ROM layout, the missing block means that the first 12 constants are 64 bits long rather than 67. These constants are unrelated to CORDIC and apparently do not need the extra precision.
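
As a quick consistency check on these figures, the following Python sketch counts the full grid in two ways and compares the missing block with the bits saved by the 12 shorter constants. The count of 42 constants comes from the table of ROM contents later in the article; the variable names are illustrative only.

```python
constants = 42          # entries in the table of ROM contents
bits_per_constant = 67  # 64-bit mantissa plus three extra bits
rows, columns = 134, 21

short_constants = 12    # constants stored with 64 bits instead of 67
missing_bits = short_constants * (bits_per_constant - 64)

# The full grid, counted by constants and by rows and columns.
print(constants * bits_per_constant, rows * columns)
# The bits saved by the short constants, and the size of the 6 x 6 block.
print(missing_bits, 6 * 6)
```

Both pairs agree, which supports reading the grid as 42 constants of up to 67 bits each.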

The bit pattern of the ROM is visible under the microscope, so the bits can be read off. However, it is not at all obvious how to interpret them. The first question is whether a transistor represents a 0 or a 1 (it turned out that a transistor represents a 1). The second is how to turn the 134 × 21 grid of bits into values.

The bit encoding can be determined in two ways. The first is to trace the circuitry that reads data from the ROM and see how the data is used. The second is to look for patterns in the raw data and try to make sense of them. Since the 8087 is very complex, I wanted to avoid a full reverse engineering effort just to study the constants, so I took the second approach.

The chip's datapath consists of 67 horizontal rows, so it seemed clear that the 134 rows of the ROM correspond to two sets of 67-bit constants. I extracted one set of constants from the odd rows and the other from the even rows, but the resulting values made no sense. After more thought, I realized that the rows do not simply alternate but follow a repeating ABBA order.

The rows run in the order ABBAABBAABBA..., where the A rows hold the bits for one set of constants and the B rows hold the bits for the other. This layout may have been used instead of a simple ABAB alternation because one contact can serve two adjacent transistors. That is, a single wire can select each AA or BB pair.
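
The ABBA ordering can be undone with a small helper. This is a minimal Python sketch, assuming the rows are numbered from 0 starting with an A row; the function name split_abba is hypothetical, and the row labels are stand-ins rather than real ROM data.

```python
def split_abba(rows):
    """Split rows laid out in ABBAABBA... order into sets A and B."""
    set_a, set_b = [], []
    for index, row in enumerate(rows):
        # Positions 0 and 3 of every group of four belong to A, 1 and 2 to B.
        if index % 4 in (0, 3):
            set_a.append(row)
        else:
            set_b.append(row)
    return set_a, set_b


rows = list(range(134))  # stand-in row labels, not real ROM data
set_a, set_b = split_abba(rows)
print(len(set_a), len(set_b))
print(set_a[:4], set_b[:4])
```

With 134 rows, each set receives 67 rows, matching the 67-bit width of the datapath.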

Once I took the ABBA order into account, I got a set of familiar constants, including π and 1. The diagram below shows the bits of these two constants. In the photo, a 1 bit is marked with a green stripe and a 0 bit with a red one. In binary, π is 11.001001..., and this is exactly the value visible in the upper row of marked bits. The lower value is the constant 1.

The upper row of bits is the number π; the lower row is 1. This diagram is rotated 90° relative to the others.
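
The binary expansion of π quoted above can be reproduced with a short Python sketch. The helper name binary_fraction is hypothetical; it works on an ordinary Python float, so it is only meant for the first few bits.

```python
import math


def binary_fraction(value, bits):
    """Return a non-negative value in binary with the given number of fraction bits."""
    whole = int(value)
    frac = value - whole
    digits = []
    for _ in range(bits):
        # Doubling moves the next fraction bit into the integer position.
        frac *= 2
        digit = int(frac)
        digits.append(str(digit))
        frac -= digit
    return f"{whole:b}." + "".join(digits)


print(binary_fraction(math.pi, 6))  # 11.001001
print(binary_fraction(1.0, 6))      # 1.000000
```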

The next difficulty in interpretation is that the ROM stores only the mantissas, not the exponents (so far I have not found a separate ROM holding the exponents). I therefore experimented with various exponents until I got meaningful values. Some were immediately clear: for example, the constant 1.204120 gives log₁₀(2) with the exponent 2^-2. Others were harder to figure out, for example 1.734723. Eventually I realized that 1.734723 × 2⁵⁹ = 10¹⁸. Why does the 8087 have such a constant? Probably because the 8087 supports 18-digit packed binary-coded decimal (BCD) numbers.
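
The two examples above are easy to check numerically. The following Python sketch applies the exponent to the six-decimal mantissa quoted in the text; the helper name scaled is hypothetical, and the match is only as good as the rounded mantissa allows.

```python
import math


def scaled(mantissa, exponent):
    """Return mantissa * 2**exponent as a float."""
    return mantissa * 2.0 ** exponent


# (mantissa quoted in the text, exponent, value it should represent)
checks = [
    (1.204120, -2, math.log10(2)),
    (1.734723, 59, 1e18),
]
for mantissa, exponent, expected in checks:
    value = scaled(mantissa, exponent)
    print(f"{mantissa} * 2^{exponent} = {value:.7g} (expected {expected:.7g})")
```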

Some exponents were very hard to find, so I used brute force to check whether the result would be a logarithm or a power of some number. The hardest constant to identify was ln(2)/3. The significance of this number is unclear to me.
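
A brute-force search of this kind might look like the following Python sketch. It is not the script actually used for the article: the candidate list, the exponent range, the tolerance, and the name find_exponent are all assumptions chosen for illustration.

```python
import math

# Candidate "interesting" values; this list is an assumption for illustration.
CANDIDATES = {
    "ln(2)": math.log(2),
    "ln(2)/3": math.log(2) / 3,
    "log2(e)": math.log2(math.e),
    "3*log2(e)": 3 * math.log2(math.e),
    "sqrt(2)": math.sqrt(2),
    "pi": math.pi,
}


def find_exponent(mantissa, exponents=range(-70, 71), rel_tol=1e-6):
    """Try each exponent and list the candidates that mantissa * 2**e matches."""
    matches = []
    for e in exponents:
        value = mantissa * 2.0 ** e
        for name, target in CANDIDATES.items():
            # The tolerance allows for the mantissa being rounded to six decimals.
            if math.isclose(value, target, rel_tol=rel_tol):
                matches.append((name, e))
    return matches


print(find_exponent(1.848392))
print(find_exponent(1.082021))
```

A real search would need a much larger candidate set (logarithms and powers of many numbers), which is what makes constants like ln(2)/3 hard to spot.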

Here is the complete list of constants from the ROM. The "Meaning" column is my interpretation of each constant.

Constant	Decimal value	Meaning
1.204120×2^-2	0.3010300	log₁₀(2)
1.386294×2^-1	0.6931472	ln(2)
1.442695×2⁰	1.4426950	log₂(e)
1.570796×2¹	3.1415927	π
1.000000×2⁰	1.0000000	1
1.660964×2¹	3.3219281	log₂(10)
1.734723×2⁵⁹	1.00e+18	10¹⁸
1.734723×2⁵⁹	1.00e+18	10¹⁸
1.848392×2^-3	0.2310491	ln(2)/3
1.082021×2²	4.3280851	3*log₂(e)
1.442695×2⁰	1.4426950	log₂(e)
1.414214×2⁰	1.4142136	√2
1.570796×2^-1	0.7853982	atan(2⁰)
1.854590×2^-2	0.4636476	atan(2^-1)
2.000000×2^-15	0.0000610	atan(2^-14)
2.000000×2^-16	0.0000305	atan(2^-15)
1.959829×2^-3	0.2449787	atan(2^-2)
1.989680×2^-4	0.1243550	atan(2^-3)
2.000000×2^-13	0.0002441	atan(2^-12)
2.000000×2^-14	0.0001221	atan(2^-13)
1.997402×2^-5	0.0624188	atan(2^-4)
1.999349×2^-6	0.0312398	atan(2^-5)
1.999999×2^-11	0.0009766	atan(2^-10)
2.000000×2^-12	0.0004883	atan(2^-11)
1.999837×2^-7	0.0156237	atan(2^-6)
1.999959×2^-8	0.0078123	atan(2^-7)
1.999990×2^-9	0.0039062	atan(2^-8)
1.999997×2^-10	0.0019531	atan(2^-9)
1.441288×2^-9	0.0028150	log₂(1+2^-9)
1.439885×2^-8	0.0056245	log₂(1+2^-8)
1.437089×2^-7	0.0112273	log₂(1+2^-7)
1.431540×2^-6	0.0223678	log₂(1+2^-6)
1.442343×2^-11	0.0007043	log₂(1+2^-11)
1.441991×2^-10	0.0014082	log₂(1+2^-10)
1.420612×2^-5	0.0443941	log₂(1+2^-5)
1.399405×2^-4	0.0874628	log₂(1+2^-4)
1.442607×2^-13	0.0001761	log₂(1+2^-13)
1.442519×2^-12	0.0003522	log₂(1+2^-12)
1.359400×2^-3	0.1699250	log₂(1+2^-3)
1.287712×2^-2	0.3219281	log₂(1+2^-2)
1.442673×2^-15	0.0000440	log₂(1+2^-15)
1.442651×2^-14	0.0000881	log₂(1+2^-14)

I'm not sure why 10¹⁸ appears twice; perhaps the two entries have different exponents.

Physically, the constants are arranged in three groups. The first group holds the values the user can load (1, π, log₂(10), log₂(e), log₁₀(2), and ln(2)), as well as values for internal use (10¹⁸, ln(2)/3, 3*log₂(e), log₂(e), and √2). The second group consists of the 16 arctangent constants, and the third consists of the 14 log₂ constants.

The 8087 has seven instructions that load constants directly. The instructions FLDZ, FLD1, FLDPI, FLDL2T, FLDL2E, FLDLG2, and FLDLN2 load the constants 0, 1, π, log₂(10), log₂(e), log₁₀(2), and ln(2) onto the stack, respectively. All of these constants except 0 are stored in the ROM.

The last two groups of constants are used to calculate transcendental functions using CORDIC algorithms.

CORDIC Algorithms

The constants in the ROM reveal some details of the 8087's algorithms. The ROM holds 16 arctangent values, the arctangents of 2^-n. It also holds 14 base-2 logarithms of (1 + 2^-n). These values may seem unusual, but they are used in the efficient CORDIC algorithm, invented in 1958.

The idea behind CORDIC is that tangent and arctangent can be computed by breaking an angle into smaller angles and rotating a vector by each of them. The trick is that if the smaller angles are chosen well, each rotation can be computed with efficient shifts and additions instead of trigonometric functions. Suppose we want to find tan(z). We can split z into a sum of small angles: z ≈ {atan(2^-1) or 0} + {atan(2^-2) or 0} + {atan(2^-3) or 0} + ... + {atan(2^-16) or 0}. A vector can be rotated by, say, atan(2^-2) by multiplying by 2^-2 and adding. The point is that multiplying by 2^-2 is just a fast bit shift. Putting this together, tan(z) can be computed by comparing z against the atan constants and then performing 16 cycles of additions and shifts, which is fast in hardware. The atan constants are precomputed and stored in the ROM. Arctangent is computed in a similar way, but in reverse: as the vector is rotated, the angles (from the constant ROM) are summed to give the final angle.
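
The tangent computation can be sketched in Python as follows. This is a simplified illustration of the CORDIC idea, not the 8087's microcode: it uses floats instead of 67-bit integers, it takes the 16 angles atan(2^0) through atan(2^-15) listed in the table of ROM contents, and the names ATAN_TABLE and cordic_tan are hypothetical. It assumes 0 ≤ z ≤ π/4.

```python
import math

# Precomputed angles atan(2**-k), k = 0..15, standing in for the ROM constants.
ATAN_TABLE = [math.atan(2.0 ** -k) for k in range(16)]


def cordic_tan(z):
    """Approximate tan(z) for 0 <= z <= pi/4 with shift-and-add style rotations."""
    x, y = 1.0, 0.0
    for k, angle in enumerate(ATAN_TABLE):
        if z >= angle:
            # Rotate (x, y) by atan(2**-k). In hardware, the multiplications
            # by 2**-k are right shifts.
            x, y = x - y * 2.0 ** -k, y + x * 2.0 ** -k
            z -= angle
    # The rotations also stretch the vector, but the stretch cancels in y / x.
    return y / x


for angle in (0.1, 0.5, 0.7):
    print(angle, cordic_tan(angle), math.tan(angle))
```

With only 16 angles, the leftover angle after the loop is smaller than atan(2^-15), so this sketch is accurate to roughly four decimal places; it makes no attempt to reach the 8087's full precision.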

CORDIC-style algorithms, with their corresponding logarithmic constants, are also used to compute logarithms and exponentials. The key here is that multiplying by (1 + 2^-n) can be done quickly with a shift and an addition. A logarithm or exponential can be computed by multiplying one side of the equation by a sequence of such values while adding the corresponding logarithmic constants to the other side. I could not find documentation of the 8087's algorithms for logarithms and exponentials. I suspect they are similar to the ones described in the following article, except that the 8087 uses base 2 instead of e. I don't understand why the 8087 lacks the constant log₂(1 + 2^-1) that this algorithm needs.
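
The shift-and-add logarithm can be sketched in Python in the same spirit. This is a generic textbook-style illustration, not the 8087's actual algorithm, which is undocumented as noted above. The names LOG_TABLE and shift_add_log2 are hypothetical, floats stand in for fixed-point integers, and the table includes n = 1 as an assumption even though the ROM only holds n = 2 through 15.

```python
import math

# log2(1 + 2**-n) for n = 1..15. The ROM holds n = 2..15; n = 1 is added here
# as an assumption so that the sketch covers the whole range 1 <= x < 2.
LOG_TABLE = {n: math.log2(1 + 2.0 ** -n) for n in range(1, 16)}


def shift_add_log2(x):
    """Approximate log2(x) for 1 <= x < 2 using factors of the form (1 + 2**-n)."""
    product, result = 1.0, 0.0
    for n, log_term in LOG_TABLE.items():
        # product * (1 + 2**-n) is product plus product shifted right by n bits.
        candidate = product + product * 2.0 ** -n
        if candidate <= x:
            # Multiply one side by (1 + 2**-n), add its logarithm to the other.
            product = candidate
            result += log_term
    return result


for value in (1.0, 1.5, 1.7):
    print(value, shift_add_log2(value), math.log2(value))
```

The product climbs toward x from below while the sum of table entries tracks its logarithm; with factors down to 1 + 2^-15 the result is good to about four decimal places.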

The 8087's support for transcendental functions is not as extensive as you might expect. It supports only tangent and arctangent, with no sine or cosine; to compute those, the user must apply trigonometric identities. Logarithms and exponentials are supported only in base 2; for base 10 or e, the user must apply a scale factor. In its day, the 8087 pushed the limits of what could fit on a chip, so the number of instructions was kept to a minimum.

Conclusion

The 8087 is a complex chip, and at first glance it looks like a hopelessly tangled maze. However, much of it can be understood through careful study. Its ROM holds 42 constants, whose values can be extracted with a microscope. Some constants (such as π) were expected, while others (such as ln(2)/3) raise more questions. Many of the constants are used to compute tangents, arctangents, logarithms, and exponentials with CORDIC algorithms.

Die photo of the 8087 with the metal layer removed.

Although the Intel 8087 floating point chip was introduced 40 years ago, its influence is still felt. It gave rise to the IEEE 754 floating point standard, which is used for most arithmetic calculations, and the 8087's instructions remain part of the x86 processors used in most computers.
