Inside AMD's 1970s Am2901 bit-slice microprocessor

You may be familiar with the advanced processors made by Advanced Micro Devices. But AMD started producing processors back in 1975, when it introduced the Am2901. This chip was a so-called bit-slice processor: each chip processed just 4 bits, and several chips were combined to handle a larger word size. This approach was used in the 1970s and 1980s to build 16-, 32- or 64-bit processors (for example) when the whole processor could not fit on a single fast chip. Single-chip processors existed, but their MOS transistors were slower. Over time, CMOS processors became faster than bipolar processors, and once they were fast enough, almost all manufacturers switched to them.

Die photo of the Am2901 chip. The metal layers of the chip are visible; the silicon is underneath. At the edges of the die, tiny bond wires connect the chip to its external pins.

The Am2901 became very popular and was used in a variety of systems, from the Battlezone video game to the VAX-11/730 minicomputer, and from the Xerox Star workstation to the Magic 372 on-board computer of the F-16 fighter. A faster version of this processor, the Am2901C, used emitter-coupled logic (ECL) to improve performance. In this article, I open up an Am2901C, examine its die under a microscope, and explain how its ECL circuits implement the arithmetic logic unit (ALU).

By the way, the Atari Battlezone documentation never names the Am2901; it only gives the part number 137004-001, described as a “transistor array”. Moreover, the pinout diagram in the documentation was deliberately distorted: it shows 20 address pins and 8 data pins to make the chip look like a ROM (unlike, for example, the 7400-series chips, which are documented accurately). Atari may have been trying to prevent cloning of its video games by hiding the identity of some key chips.

A popular alternative to the Am2901 in many minicomputers was the 74181 ALU chip. It provided the same arithmetic and logic functions as the Am2901, but not its registers.

Bit-slice microprocessors

You might be wondering how several processor chips can work together to support an arbitrary word length. The key is that a bit-slice processor is a building block, not a complete processor, and it needs separate circuitry to decode instructions and manage the system. The bit-slice chips held the registers and performed arithmetic and logical operations on the data, while a control chip (such as the Am2909 microprogram sequencer) told the slices what to do. Each machine instruction was broken into smaller steps, micro-instructions stored in a microcode ROM. As a result, the instruction set was defined by the microcode rather than by the Am2901, so almost any instruction set could be supported.

Because the slices in such a processor are not completely independent, a few complications arise. For example, when adding two numbers, the carry out of one slice must be passed to the next. Operations that span several slices, such as checking the sign or testing for a zero result, also need support. The Am2901 has special outputs for these functions.
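As a rough illustration of how slices cooperate, here is a minimal behavioral sketch in Python. It is not a model of the Am2901's pins: the function names `slice_add` and `add_words` are hypothetical, and the sketch assumes a simple ripple carry between slices, a zero flag that requires every slice to report zero, and a sign taken from the top bit of the most significant slice.

```python
def slice_add(r, s, carry_in):
    """One 4-bit slice: add two nibbles plus a carry-in.

    Returns (4-bit result, carry-out).
    """
    total = (r & 0xF) + (s & 0xF) + (carry_in & 1)
    return total & 0xF, total >> 4


def add_words(a, b, num_slices=4):
    """Add two words by chaining 4-bit slices, least significant slice first.

    Returns (result, carry_out, zero_flag, sign_bit).
    """
    carry = 0
    result = 0
    zero = True
    for i in range(num_slices):
        nibble_a = (a >> (4 * i)) & 0xF
        nibble_b = (b >> (4 * i)) & 0xF
        f, carry = slice_add(nibble_a, nibble_b, carry)  # carry ripples upward
        result |= f << (4 * i)
        zero = zero and (f == 0)  # the word is zero only if every slice is zero
    sign = (result >> (4 * num_slices - 1)) & 1  # top bit of the top slice
    return result, carry, zero, sign


if __name__ == "__main__":
    print(add_words(0x1234, 0x0FFF))
    print(add_words(0xFFFF, 0x0001))
```

Changing `num_slices` changes the word size without touching `slice_add`, which is the essence of the bit-slice idea.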

Is the Am2901 a microprocessor? From my point of view, the Am2901 is only part of a processor, but it all depends on how you define “microprocessor” (I described my thoughts on this in detail in a separate article). Interestingly, the USSR was more inclined towards bit-slice microprocessors than the USA. And while in the West the word “microprocessor” usually means a processor on a single chip, in the USSR single-chip and multi-chip processors were usually not distinguished.

Bit-slice processors sat somewhere between single-chip microprocessors and computers built from simple TTL chips. At that time, a computer built from TTL chips was much faster than a microprocessor, but it required many boards full of chips. Using bit-slice processors preserved the speed advantage while reducing the chip count. Bit-slice processors also offered more flexibility than a microprocessor, letting the designer customize the instruction set and other architectural features.

Die overview

The photo below shows the Am2901 die with the key functional blocks highlighted. For this photo, I removed the metal layers so that the silicon and the transistors can be seen. The largest functional block of the chip is the register memory in the center. The chip has 16 4-bit registers (you can see 16 columns and 4 rows in the memory array). To the left and right of the memory block are the memory driver circuits that control reading and writing.

Die photo of the Am2901 with the key functional blocks labeled. The circuitry around the edges consists mainly of buffers that convert signals between external TTL levels and internal ECL levels.

Complete block diagram of the Am2901.

The arithmetic logic unit (ALU) of the chip performs arithmetic (addition and subtraction) and logical operations (AND, OR, exclusive OR). The first section of the ALU is the large block at the bottom left; it consists of four rows, since it is a 4-bit ALU. The ALU also has logic that generates the carries for addition using a fast technique called “carry lookahead”. The ALU then uses the carry values to generate all the sum bits in parallel. Finally, the output circuits process and buffer the sum and send it to the output pins.

Carry lookahead uses Generate and Propagate signals to determine whether each bit position produces a carry of its own or passes along an incoming one. For example, if you add 0 + 0 + C (where C is the carry-in), no carry can come out of this addition, whatever the value of C. On the other hand, if you add 1 + 1 + C, a carry is produced regardless of C. Finally, for 0 + 1 + C (or 1 + 0 + C), a carry comes out only if C is 1; that is, the incoming carry is passed along. Accordingly, simple logic gates create a G (Generate) signal for each bit position if both bits are 1, and a P (Propagate) signal if at least one of the bits is 1.

The carry formula depends on the bit position. For example, consider the carry from bit 0 into bit 1. It occurs if P0 is set (that is, a carry is generated or propagated) and the carry was either generated by bit 0 or came in as C0, the carry-in to bit 0. Thus, C1 = P0 AND (C0 OR G0). For higher-order carries, there are more cases and the formulas grow steadily more complex. For example, consider the carry into bit 2. First, P1 must be set for a carry to come out of bit 1. In addition, the carry was either generated by bit 1 or propagated from bit 0. Finally, the original carry had to come from somewhere: it was either the carry-in to bit 0, a carry generated by bit 0, or a carry generated by bit 1. Putting this together yields the function used in the Am2901: C2 = P1 AND (G1 OR P0) AND (C0 OR G0 OR G1). The formulas for the other carries and for the external P and G signals are given in the datasheet, Fig. 9.
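The two formulas above can be checked exhaustively. The following Python sketch is an illustration only: it assumes P is defined as the OR of the two input bits and G as their AND, as described earlier, and compares the lookahead carries with those of a plain ripple-carry adder. The function names are hypothetical.

```python
from itertools import product


def lookahead_carries(r, s, c0):
    """Carries C1 and C2 from the lookahead formulas.

    r and s are lists of bits (index 0 is the least significant bit),
    c0 is the carry-in to bit 0.
    """
    g = [ri & si for ri, si in zip(r, s)]  # Generate: both bits are 1
    p = [ri | si for ri, si in zip(r, s)]  # Propagate: at least one bit is 1
    c1 = p[0] & (c0 | g[0])
    c2 = p[1] & (g[1] | p[0]) & (c0 | g[0] | g[1])
    return c1, c2


def ripple_carries(r, s, c0):
    """Reference carries computed one bit at a time."""
    c1 = (r[0] + s[0] + c0) >> 1
    c2 = (r[1] + s[1] + c1) >> 1
    return c1, c2


def formulas_agree():
    """True if both methods agree for all 32 input combinations."""
    return all(
        lookahead_carries([r0, r1], [s0, s1], c0)
        == ripple_carries([r0, r1], [s0, s1], c0)
        for r0, r1, s0, s1, c0 in product((0, 1), repeat=5)
    )


if __name__ == "__main__":
    print("lookahead matches ripple carry:", formulas_agree())
```

Note that C2 is computed directly from the inputs and C0, without waiting for C1, which is what makes the lookahead approach fast in hardware.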

The empty rectangles at the edges of the chip are the pads that connect the chip to the outside world. Next to them are the circuits that send and receive signals. In particular, since the chip communicates with external circuitry using TTL signals but uses ECL internally, these circuits convert between TTL and ECL voltage levels.

The chip has two shifters that can shift a word one bit to the right or left. The Q register is a 4-bit register built from flip-flops. Finally, the reference voltage circuit generates the precise reference voltages that the ECL circuitry needs.

Looking at the die

To look inside a chip, you usually have to dissolve its plastic package in hazardous acids. However, I bought an Am2901 in a ceramic package rather than a plastic one. I simply ran a chisel along the seam of the package and split the two halves apart, which exposed the die inside. The silicon die is the small rectangle in the center of the chip. Thin bond wires connect the pads on the die to the lead frame, which leads to the chip's 40 external pins.

The Am2901 after separating the two halves of the ceramic package.

To obtain high-resolution photos of the chip, I used a special metallurgical microscope. In the photo below you can see the AMD logo. Above it is a bond wire attached to a pad. The chip has two layers of metal wiring, visible on the right of the photo.

Close-up photo of the chip: the markings “4301X” (probably the part number) and “1983 AMD” are visible.

I assembled a large high-resolution image from many small microscope photos (the process of creating die photos is described in more detail elsewhere). Then I removed the metal layers and took another set of photos of the silicon.

The close-up photo below shows four transistors and three resistors. Different regions of the silicon are doped with different impurities, giving them different properties, and these regions are visible under the microscope. The chip is built from bipolar NPN transistors, unlike the MOS transistors of modern computers. The base (P-type silicon), emitter (N-type silicon) and collector (N-type silicon) of the left transistor are labeled B, E and C. The light rectangles are contacts between the silicon and the metal layer that used to be on top. The two transistors on the right share one large collector. Transistors with a shared collector are common on this chip.

Below them are three resistors. A resistor is formed by doping the silicon so that it has a higher resistance. Resistors in ICs are usually not very accurate. They are also relatively large: here they are about the same size as the transistors, and others on the chip are much larger. For these reasons, IC designers try to minimize the number of resistors.

Emitter-coupled logic

Logic circuits can be built in many different ways. Almost all modern computers use CMOS (complementary metal-oxide-semiconductor) logic, in which the gates are built from MOS transistors. In the minicomputer era, TTL was very popular. ECL was a faster but less common logic family. The disadvantage of ECL was its higher power consumption (the 1985 Cray-2 supercomputer used ECL gates for speed, but it had to be liquid-cooled).

Most of ECL's speed advantage came from the fact that its transistors were never fully turned on (saturated). This allowed the transistors to switch current paths very quickly. In addition, the difference between the voltages for 0 and 1 was small (about 0.8 V), so signals could swing back and forth quickly. In TTL gates, by comparison, the voltage difference is about 3.2 V (signals can change at roughly 1 V per nanosecond, so a large voltage swing adds several nanoseconds of delay). On the other hand, the small voltage swing made ECL more sensitive to electrical noise.
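The arithmetic behind this comparison is simple enough to spell out. The Python sketch below is a back-of-the-envelope estimate only; it assumes the same constant rate of change of roughly 1 V per nanosecond for both logic families, which is a simplification, and the function name is hypothetical.

```python
def swing_time_ns(swing_volts, slew_volts_per_ns=1.0):
    """Time for a signal to cross a voltage swing at a constant slew rate."""
    return swing_volts / slew_volts_per_ns


if __name__ == "__main__":
    ttl_ns = swing_time_ns(3.2)  # approximate TTL swing from the text
    ecl_ns = swing_time_ns(0.8)  # approximate ECL swing from the text
    print(f"TTL: {ttl_ns:.1f} ns, ECL: {ecl_ns:.1f} ns")
```

Under these assumptions, the TTL swing takes about four times as long as the ECL swing.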

The first versions of the Am2901 used TTL, but in 1979 AMD introduced a faster version, the Am2901C. The Am2901C used ECL internally for speed but kept TTL voltage levels externally, making it easy to use in TTL computers. This post describes the Am2901C variant.

ECL is based on the differential pair, which is also the basis of operational amplifiers. The idea of a differential pair (see below) is that a fixed current flows through the circuit. If the input voltage on the left is higher than the one on the right, the left transistor turns on and most of the current flows through the left branch, and vice versa. (Note that the emitters of the transistors are connected together, hence the name emitter-coupled logic.)

Differential pair. If the voltage at the left input (red) is higher, most of the current will go along the left path, and vice versa.

A few modifications turn a differential pair into an ECL gate. First, the voltage on one branch is fixed at a reference level somewhere between the 0 and 1 levels. An input above the reference voltage is then treated as a 1, and an input below it as a 0. Next, an output transistor (green) is added; it buffers the voltage of the branch to produce the output signal. The circuit shown below is an inverter because when the input voltage is high, the current through the left resistor pulls the output low. To improve performance, the bottom resistor is replaced by a current sink (magenta) consisting of a transistor and a resistor.

The current sink at the bottom of the ECL gate provides an essentially constant current, controlled by the bias voltage V_CS. This is better than a plain resistor, because the current through a resistor would vary with the input voltages. It also saves space, since it uses a smaller resistor.

ECL inverter. The upper right resistor can be omitted because nothing is connected to the output of that branch.

You can build a more complex ECL gate by adding more inputs. In the diagram below, a second input transistor (2) is added in parallel with the first (1). Current flows through resistor R1 if either input A or input B is 1 (that is, its voltage is higher than the reference). In that case the output is pulled low, so the result is a NOR gate. Other configurations yield AND gates, XOR gates or more complex circuits.

ECL NOR gate
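The current-steering behavior of this gate can be mimicked with a very simple behavioral model. The Python sketch below is not a circuit simulation: the voltage constants are illustrative assumptions chosen only to give a swing of about 0.8 V around a reference level, and they are not taken from the Am2901 datasheet. The function name `ecl_nor` is hypothetical.

```python
# Illustrative voltage levels (assumptions, not datasheet values).
V_HIGH = -0.9  # assumed logic 1 level, in volts
V_LOW = -1.7   # assumed logic 0 level, in volts
V_REF = -1.3   # assumed reference, midway between the two levels


def ecl_nor(*input_volts):
    """Behavioral model of a multi-input NOR gate based on current steering.

    If any input is above the reference, the current is steered through the
    input-side branch (through R1), which pulls the output low. Otherwise the
    current flows through the reference branch and the output stays high.
    """
    current_through_r1 = any(v > V_REF for v in input_volts)
    return V_LOW if current_through_r1 else V_HIGH


if __name__ == "__main__":
    for a in (V_LOW, V_HIGH):
        for b in (V_LOW, V_HIGH):
            print(f"A={a:+.1f} V  B={b:+.1f} V  ->  out={ecl_nor(a, b):+.1f} V")
```

The model deliberately ignores the analog details (the transistors never switch fully on or off) and captures only which branch carries most of the current.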

The diagram above shows a NOR gate as it is implemented on the chip. The photo below shows the corresponding physical layout of the gate. On the left is the silicon layer, where the transistors and resistors are visible. On the right are the metal traces over the same part of the chip. At the top are transistors 1 and 2, which receive the input signals. Each has its base at the top and its emitter in the middle. The two transistors share a collector, the white rectangle below them. Resistors R1 and R2 are rectangles of silicon. All the transistors in the middle (including 3 and 4) share a collector that is connected to the positive supply in two places (the unnumbered transistors and resistors belong to other gates).

NOR gate as implemented on the Am2901 die

The metal wiring on the right shows that the top layer carries horizontal conductors for the positive supply, the reference voltage, the V_CS current-sink bias and the negative supply (the positive and negative supply lines are wider to carry higher currents). Beneath them are the conductors that connect the transistors. At the top, inputs A and B are connected to the transistor bases. The rest of the wiring is harder to trace, since it is covered by the top layer. But you can see, for example, the connection between transistor 4, the collector of transistors 1 and 2, and R1. By carefully studying the die photos, you can work out all the wiring and reverse-engineer the chip's logic.

Arithmetic logic unit (ALU)

The Arithmetic Logic Unit (ALU) in the Am2901 chip performs 4-bit arithmetic or logic operations. It supports 8 different operations: addition, subtraction and bitwise logical operations (it does not deal with multiplication and division).

The block diagram below shows the structure of the Am2901's ALU. First, a selector (multiplexer) picks the two inputs from the available sources. The D value comes from the chip's data pins, which are usually connected to the processor's data bus. A is the value of one of the 16 entries in the chip's register file, selected by pins A0-A3; B works similarly. The constant 0 can also be fed to the ALU. Finally, Q is the contents of the Q register (an additional, separate register). Having many data sources gives the chip more flexibility.

Block diagram of the Am2901 ALU from the chip's datasheet. The ALU performs one of eight functions on two 4-bit inputs, R and S. On the right are the chip's various outputs: G, P, carry-out, sign, overflow and zero detect.

The two selected values, R and S, are fed to the ALU, which performs the selected operation and outputs the result on F. The ALU also accepts a carry-in and produces a carry-out (Cn+4); this lets you combine several ALUs to handle longer words. The G and P outputs are used for carry lookahead, while the sign, overflow and zero-detect outputs can be used as processor condition codes.

I'll briefly describe the ALU circuitry, starting with the selector. The first two selector blocks below (D and A) select the first ALU argument, and the last three (A, Q and B) select the second. Each selector implements the function Select • (Value ⊕ Invert), where Value is the candidate input value, Select is 1 to select this value, and Invert is 1 to invert it (since the ALU is 4 bits wide, 4 bits are selected; each selector is implemented with four ECL gates).

The desired value is selected by activating one of the Select lines. If none of them is active, the value fed to the ALU is 0. The selector can also invert the input; the chip performs subtraction by adding the inverted value.
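The selector function can be sketched in a few lines of Python. This is an illustration under stated assumptions: the text gives only the per-source function Select • (Value ⊕ Invert), so combining the sources with OR is an assumption (it is consistent with the output being 0 when nothing is selected), and the names `select_term` and `selector` are hypothetical.

```python
WIDTH = 4
MASK = (1 << WIDTH) - 1  # 0b1111 for a 4-bit slice


def select_term(value, select, invert):
    """Select AND (Value XOR Invert), applied to all 4 bits at once."""
    invert_bits = MASK if invert else 0
    select_bits = MASK if select else 0
    return select_bits & ((value & MASK) ^ invert_bits)


def selector(sources, invert=False):
    """Combine several (value, select) pairs into one ALU argument.

    Assumption: the per-source terms are ORed together, so the result
    is 0 when no Select line is active.
    """
    out = 0
    for value, select in sources:
        out |= select_term(value, select, invert)
    return out


if __name__ == "__main__":
    d, a = 0b1010, 0b0110
    print(bin(selector([(d, True), (a, False)])))               # D selected
    print(bin(selector([(d, True), (a, False)], invert=True)))  # D inverted
    print(bin(selector([(d, False), (a, False)])))              # nothing selected
```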

The first part of the ALU consists of four horizontal rows, one per bit.

The diagram below shows the AND-XOR gate used in the Am2901's ALU, which implements the function A' • (B ⊕ C). I will briefly describe how it works. If input A is high, current flows through the left transistors, pulling the output low. If B and C are both high, the current through the left B and C transistors pulls the output low. If B and C are both low, the current through the V_ref transistors pulls the output low. If B and C differ, the current flows through transistors connected to the positive supply, and the output stays high. The point is that a single ECL gate can implement complex functionality; in most logic families, XOR is harder to implement. To me, ECL resembles the relay logic of the 1920s, because it switches current between two paths rather than simply turning it on and off.
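The four cases described above cover the whole truth table of the gate. As a quick check, this Python sketch (with a hypothetical function name) evaluates A' • (B ⊕ C) for every input combination; it models only the logic function, not the current paths.

```python
from itertools import product


def and_xor_gate(a, b, c):
    """Logic function of the combined gate: (NOT a) AND (b XOR c).

    All inputs and the output are 0 or 1.
    """
    return (a ^ 1) & (b ^ c)


if __name__ == "__main__":
    print("A B C -> out")
    for a, b, c in product((0, 1), repeat=3):
        print(a, b, c, "->", and_xor_gate(a, b, c))
```

The output is 1 in only two of the eight cases: A is low and exactly one of B and C is high.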

After the two ALU inputs are selected, the ALU computes the “Propagate” (P) and “Generate” (G) bits for each pair of input bits. This is part of the carry-lookahead scheme used for fast addition.

The photo below shows the remaining parts of the ALU circuitry (for a change, this photo shows the metal layer, unlike the previous photos, which showed only the silicon). The P and G signals from the previous circuit feed two carry computation blocks. The lower carry block computes the external P, G and carry-out signals used for carry lookahead across multiple chips; this makes it possible to add long words quickly.

Carry lookahead can be extended across several chips to quickly add numbers wider than 4 bits. Each chip outputs Generate and Propagate signals indicating whether it generates a carry or propagates an incoming carry. These signals are combined by a carry-lookahead generator chip such as the Am2902.
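A minimal sketch of multi-chip lookahead, in Python, may make this concrete. It uses the standard textbook group Generate and Propagate equations with active-high signals, which is an assumption here: the actual pin polarities and internal equations of the Am2901 and Am2902 are not modeled. All function names are hypothetical.

```python
def slice_group_pg(r, s):
    """Group Generate and Propagate for one 4-bit slice (active-high)."""
    g = [(r >> i) & (s >> i) & 1 for i in range(4)]
    p = [((r >> i) | (s >> i)) & 1 for i in range(4)]
    group_g = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
               | (p[3] & p[2] & p[1] & g[0]))
    group_p = p[3] & p[2] & p[1] & p[0]
    return group_g, group_p


def lookahead_unit(group_signals, c0):
    """Carry into each slice, derived only from group G/P and the carry-in.

    Written as a loop for brevity; in hardware each carry is a flat
    two-level expression, so all carries are available at the same time.
    """
    carries = [c0]
    for group_g, group_p in group_signals:
        carries.append(group_g | (group_p & carries[-1]))
    return carries


def add16(a, b, c0=0):
    """16-bit addition using four slices and one lookahead unit."""
    nibbles = [((a >> (4 * i)) & 0xF, (b >> (4 * i)) & 0xF) for i in range(4)]
    carries = lookahead_unit([slice_group_pg(r, s) for r, s in nibbles], c0)
    result = 0
    for i, (r, s) in enumerate(nibbles):
        result |= ((r + s + carries[i]) & 0xF) << (4 * i)
    return result, carries[4]


if __name__ == "__main__":
    print(add16(0x1234, 0x0FFF))
    print(add16(0xFFFF, 0x0001))
```

Each slice receives its carry-in from the lookahead unit instead of waiting for the slice below it to finish.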

The upper carry block computes the internal carries. The “sum” circuit computes each bit of the sum from the carries and the P and G values. Importantly, thanks to carry lookahead, all the sum bits can be computed in parallel. Finally, the output circuit converts the internal ECL signals to TTL signals and drives the four output pins.

The rest of the ALU circuit

The chip uses some interesting tricks to make the adder perform eight operations. The selector circuit described earlier can optionally complement its input. This is used for subtraction as well as for some of the logical functions. When computing logical functions (rather than addition or subtraction), the carry computation is disabled, so each bit is unaffected by the other bits. Finally, the adder's XOR circuit is turned into an AND circuit by forcing the P signals high. Thus, instead of using eight different circuits for the eight ALU operations, the chip uses a single circuit with a few carefully chosen tweaks.

The chip uses P and G to generate the sum of inputs R and S with carry C. The sum, R ⊕ S ⊕ C, is computed as ((P' ∨ G) ⊕ C)', where P = R ∨ S and G = R • S. If P is forced to 1, then (P' ∨ G) reduces to G, which equals R • S. So by overriding P, the same circuit can compute the AND of the inputs R and S.
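This identity is easy to verify for a single bit. The Python sketch below evaluates ((P' ∨ G) ⊕ C)' for every input combination and compares it with R ⊕ S ⊕ C. For the AND case it assumes that the carry term is held at 1 while P is forced high; that choice is an assumption made for this illustration. The function name is hypothetical.

```python
from itertools import product


def sum_bit(r, s, c, force_p=False):
    """One bit of the sum circuit: NOT(((NOT P) OR G) XOR C).

    With force_p=True, P is overridden to 1 regardless of the inputs.
    """
    p = 1 if force_p else (r | s)  # Propagate
    g = r & s                      # Generate
    return (((p ^ 1) | g) ^ c) ^ 1


if __name__ == "__main__":
    for r, s, c in product((0, 1), repeat=3):
        print(r, s, c, "-> sum bit", sum_bit(r, s, c))
    for r, s in product((0, 1), repeat=2):
        print(r, s, "-> with P forced high and C = 1:", sum_bit(r, s, 1, force_p=True))
```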

The table below shows the 8 operations performed by the ALU. Three instruction bits fed to the chip select the operation: I5, I4 and I3. The “Function” column shows the function as given in the documentation, and the “Calculation” column shows how each bit is computed. Note that all the operations ultimately reduce to exclusive OR (⊕) or AND (∧). Addition is the bitwise XOR of the two arguments and the carry bits. Subtraction is performed by complementing one argument and then adding; for example, adding the complement of R (R') is the same as subtracting R. Bit I3 complements R, and bit I4 complements S. The exclusive-OR operations (EXOR and EXNOR) use the same circuitry as addition, but with the carry computation blocked. The AND operations are performed by forcing the P signal high. Finally, OR is computed using De Morgan's law, R' ∧ S' = (R ∨ S)'. The point is that the Am2901 does not need separate circuitry for addition, subtraction, AND, OR and EXOR; most of the circuitry is shared by all the operations.

Mnemonic	I5	I4	I3	Function	Calculation
ADD	0	0	0	R plus S	R ⊕ S ⊕ Carry
SUBR	0	0	1	S minus R	R' ⊕ S ⊕ Carry
SUBS	0	1	0	R minus S	R ⊕ S' ⊕ Carry
OR	0	1	1	R OR S	(R' ∧ S') ⊕ 1
AND	1	0	0	R AND S	R ∧ S
NOTRS	1	0	1	R' AND S	R' ∧ S
EXOR	1	1	0	R EX-OR S	R ⊕ S' ⊕ 1
EXNOR	1	1	1	R EX-NOR S	R' ⊕ S' ⊕ 1
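To see that the “Calculation” column really produces the functions in the “Function” column, here is a small Python model of the table. It is a behavioral sketch rather than a description of the circuit: the function name `alu` is hypothetical, arithmetic is modeled with ordinary integer addition (which yields the same bits as the per-bit XOR with carries), and subtraction assumes the usual two's-complement convention of a carry-in of 1.

```python
MASK = 0xF  # 4-bit slice


def alu(op, r, s, carry_in=0):
    """Compute one Am2901-style ALU function for 4-bit inputs.

    op is the 3-bit code I5 I4 I3 from the table.
    """
    i3, i4, i5 = op & 1, (op >> 1) & 1, (op >> 2) & 1
    if i3:
        r ^= MASK  # I3 complements R
    if i4:
        s ^= MASK  # I4 complements S
    r &= MASK
    s &= MASK
    if i5 == 0 and not (i3 and i4):
        return (r + s + carry_in) & MASK  # ADD, SUBR, SUBS
    if op == 0b011:
        return (r & s) ^ MASK             # OR via De Morgan: (R' AND S') XOR 1
    if op in (0b100, 0b101):
        return r & s                      # AND, NOTRS
    return r ^ s ^ MASK                   # EXOR, EXNOR: XOR with the carry term at 1


if __name__ == "__main__":
    names = ["ADD", "SUBR", "SUBS", "OR", "AND", "NOTRS", "EXOR", "EXNOR"]
    r, s = 0b1100, 0b1010
    for op, name in enumerate(names):
        cin = 1 if name in ("SUBR", "SUBS") else 0
        print(f"{name:6s} {alu(op, r, s, cin):04b}")
```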

Conclusion

The Am2901C is interesting as an example of high-speed ECL, a relatively uncommon logic family. The ALU occupies the bottom part of the die, implements eight different functions, and uses carry lookahead for speed. Although the chip is fairly complex, careful examination under a microscope makes it possible to understand how it works.

Bit-slice processors such as the Am2901 were used in minicomputers and many other systems in the 1970s and 1980s. Eventually, however, improvements in CMOS technology made it possible to build a fast processor on a single chip, which made bit-slice processors obsolete. And while the Am2901 probably contains about a thousand transistors and runs at 16 MHz, today AMD makes processors with billions of transistors running at 4 GHz.