# **ED2900A** INTRODUCTION TO DESIGNING WITH THE Am2900 FAMILY OF MICROPROGRAMMABLE BIPOLAR DEVICES

LECTURE II / VOLUME II

3rd Edition, January 1985

Advanced Micro Devices, Inc.
Customer Education Center

#### Table of Contents

- Improving the ALU
  - Improving ALU Performance
  - Additional ALU Improvements
- Bit-slice ALU, Am2901
  - Interconnection of Slices, Am2901
  - Carry-Lookahead, Am2902
  - Sample Microcode, Am2901
- Introducing the Super Slice, Am2903/Am29203
  - Interconnecting Slices, Am2903/Am29203
  - Example Microcode, Am2903/Am29203
  - Special Functions of Am2903/Am29203
  - Multiplication Process
  - Single-Length Normalize Process
  - Am29203 Additional Special Functions
- Expanded Memory for ALUs
- Introduction to Interrupts
  - Implementation of Interrupt Control, the Am2913
  - A Complete Interrupt Controller, Am2914
- Am2900 Family Support Devices
- 16-bit ALU Controller, Am29116
- AMD Families
- Future Microprogrammable AMD Devices
- AMD Support Tools for Microprogram Development

# IMPROVING THE ALU

#### THE BASIC STRUCTURE

- The following page repeats the basic structure, which is the initial configuration for a simple computer system.
- A highly capable computer control unit has evolved using the Am2910 as an example microinstruction sequencer.
- An arithmetic/logic unit (ALU) with more operational and storage capabilities will now be developed, leading to a variety of commercial devices that can be selected for implementation.

#### ALU Development

- The system thus far can support basic machine instructions: add, subtract, OR, AND, exclusive OR, load accumulator, and store.
- The particular way in which the A and B ports of the ALU are connected to the outside world dictates that a single-address machine format be used for this architecture.
- Since only one address is supplied, the second operand address (for two-operand functions) is assumed to be the accumulator.
- This addressing technique is also known as <u>implied addressing</u> and is often used to save program space (instructions) at the expense of instruction generality.

#### Single Address Format

| Format | Field 1 | Field 2 |
|--------|---------|---------|
| Machine instruction format | OPCODE (4) | OPERAND ADDRESS (12) |
| Data format | S (1) | MAGNITUDE OF DATA (15) |

#### **DEFINITIONS**

- **PC** Program counter (register); maintains the memory address of the next machine instruction to be fetched.
- **MAR** Memory address register; contains the address of the item (instruction or operand) which is to be fetched from main memory.
- **MAIN MEMORY** Read/write storage (CORE; RAM); contains the program under execution and the associated data; or contains part of the program and part of the data (that which is actively in use).
- **ACC** Accumulator register (accumulates ALU results).
- **ALU** Arithmetic/logic unit; operates on data according to the instruction operation code.

#### COMPUTER CONTROL UNIT OPERATION

Once the system is initialized and the µPC (microprogram counter) has been given an initial (first) microprogram statement address, the general cycles (phases) of "fetch instruction", "fetch operand" and "execute instruction" occur. The register transfer operations in these cycles are further defined below.

- Fetch the machine instruction (macro level) at the memory address defined by the PC register.
- Decode the opcode portion of the macro instruction (assuming the simplistic machine instruction format shown previously). The instruction decode determines if a data operand is needed.
- During the microinstruction decode (JMAP) step, increment the PC register. Note that the amount of the increment (1, 2, ...) depends on how many memory locations are taken up by the completed instruction (1 here). (This may be done later.)
- Fetch the operand if required.
  Determine if any operations are required before the operand is ready for use, such as complementing (none in the current architecture).
- If none, determine if any other operand is needed (none required in current architecture).
- If all operands are present, execute the instruction.
- When the execution steps are completed (and the result left in the accumulator), then increment PC, if not already done, and fetch the next instruction.

#### SIMPLE COMPUTER

The following page diagrams a simple general purpose computer capable of executing the basic machine instructions previously defined.

- The ALU output is connected to an <u>F BUS</u>, which in turn may be connected to provide input to the ACC, the PC or the MAR.
- The MAR is the only register which may address the main memory.
- The PC register is incremented using the ALU.
- When the MAR supplies an address to the external main memory, the value in the addressed location could be an instruction, data, or the location into which data is to be written. The associated register transfer activity is controlled by the CCU.
- The TEST input to CCU comes from the D BUS (the ACC). The test is for 0. If any bit = 1, TEST = 1 (OR operation). Use for conditional jump.
  - e.g. Jump (CJP) if ACC = 0

#### MORE DETAILED FETCH CYCLE

(Defined in pseudo register transfer language)

#### FIRST:

- The CCU controls the transfer of the <PC> (contents of the PC) to the B port of the ALU.
- The CCU causes the ALU to pass <PC> (no operation).
- The ALU output is written into the MAR register.

#### THEN:

- The MAR contents are enabled on the Address Bus.
- The CCU instructs the memory to perform a READ.
- The CCU instructs the ALU to pass the macroinstruction.
- The CCU instructs the instruction register to latch the upper 4 bits (opcode), with the lower 12 bits latched by the MAR (operand address) in anticipation of fetching an operand.
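The fetch steps above can be traced as a handful of register transfers. The following is a minimal sketch rather than the course's own notation: the `Machine` class, its field names and the example memory word are assumptions made for illustration. Only the 4-bit opcode / 12-bit operand address split and the order of the transfers come from the text; the PC increment is shown as the last transfer, although the text allows it to happen at a different point in the cycle.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Hypothetical model of the simple single-address computer."""
    pc: int = 0    # program counter
    mar: int = 0   # memory address register
    ir: int = 0    # instruction register (holds the 4-bit opcode)
    acc: int = 0   # accumulator
    memory: list = field(default_factory=lambda: [0] * 4096)  # 2**12 words

    def fetch(self) -> None:
        # FIRST: <PC> passes through the ALU unchanged and is written to the MAR.
        self.mar = self.pc
        # THEN: the MAR drives the address bus and the memory performs a READ.
        word = self.memory[self.mar] & 0xFFFF
        # The IR latches the upper 4 bits (opcode); the MAR latches the lower
        # 12 bits (operand address) in anticipation of an operand fetch.
        self.ir = (word >> 12) & 0xF
        self.mar = word & 0xFFF
        # <PC> goes through the ALU and is incremented (one word per instruction).
        self.pc = (self.pc + 1) & 0xFFF


m = Machine(pc=0x010)
m.memory[0x010] = 0x1234   # assumed encoding: opcode 0x1, operand address 0x234
m.fetch()
print(hex(m.ir), hex(m.mar), hex(m.pc))
```

After `fetch()` the MAR already holds the operand address, which is why the operand fetch phase can start by simply enabling the MAR onto the address bus.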
#### FINALLY:

- The CCU transfers <PC> through the ALU and increments it by 1 before storing the incremented value.

#### WHEN DATA IS TO BE FETCHED FROM MEMORY (Fetch Operand Phase):

- The CCU causes the MAR to output to the Address Bus.
- The CCU causes the memory to perform a READ.
- The DATA READ is gated to the A port of the ALU.

#### **EXECUTE INSTRUCTION PHASE**

The CCU causes the data and, if appropriate, ACC data to be manipulated according to the macro-level instruction opcode.

#### WHEN DATA IS TO BE WRITTEN TO MEMORY

- The CCU causes the MAR to output to the Address Bus.
- The CCU causes the ACC to output to the MDI port of the memory.
- The CCU causes the memory to perform a write.

#### Note:

The PC contents must be transferred to the MAR before the instruction can be fetched. The MAR is used to address the memory for instruction fetches as well as operand fetches.

## THE NEW BASIC COMPUTER INSTRUCTION SET

| Instruction | Operation |
|-------------|-----------|
| LDA,ADDR | Load accumulator with contents of address |
| ADD,ADDR | Add accumulator and contents of address |
| SUB,ADDR | Subtract accumulator from contents of address |
| OR,ADDR | OR accumulator with contents of address |
| AND,ADDR | AND accumulator with contents of address |
| XOR,ADDR | Exclusive-OR accumulator with contents of address |
| INA | Input to accumulator |
| OUT | Output from accumulator |
| JMP,ADDR | Jump to <address> |
| JMZ,ADDR | Jump to <address> if accumulator is 0 |
| STO,ADDR | Store contents of accumulator at address |

#### DESIGN PROBLEM: A VERY SIMPLE COMPUTER

#### HOMEWORK - CPU MICROPROGRAM

- Turn to your ED2900A Exercise and Laboratory Manual. The basic hardware and macroinstruction set for the simple computer design problems are presented.
- Your assignment: Write the microprogram to support the entire macroinstruction set (implement direct jump first, indirect jump if you have time).
- Limit: 16 microinstructions
- Do not look at the solution until you have tried the problem: Learning by doing!

# IMPROVING ALU PERFORMANCE

#### ALU SPEED OF EXECUTION

Consider an ADD (assume <ACC> + <MEM> -> <ACC>):

| Macroinstruction | Microcycle | Operation |
|------------------|------------|-----------|
| ADD, MEMADDR2 | 1 | PC -> MAR |
| | 2 | FETCH INSTR |
| | 3 | DECODE, INCR PC |
| | 4 | FETCH DATA, ADD TO <ACC> |
| | | 4 MICROCYCLES |

- This is only valid for sequential operations on the accumulator. If the accumulator is needed to store other data, then the intermediate results must be stored in memory and refetched before each operation. In that case, the time needed becomes:

| Macroinstruction | Time |
|------------------|------|
| LDA, MEMADDR1 | 4 MICROCYCLES |
| ADD, MEMADDR2 | 4 MICROCYCLES |
| STO, MEMADDR2 | 4 MICROCYCLES |
| TOTAL | 12 MICROCYCLES |

This is more realistic, since the ALU accumulator is often needed for current storage and cannot be considered generally available for storage of intermediate results. Additional accumulators or general purpose ALU registers could thus reduce the number of microcycles required per macro level instruction execution.

#### GENERAL REGISTER ARCHITECTURE

To improve speed and flexibility, consider a different machine instruction format:

| OPCODE (8) | REGISTER ADDRESS 1 (4) | REGISTER ADDRESS 2 (4) |
|------------|------------------------|------------------------|
| | SOURCE | SOURCE AND DESTINATION |

Now an ADD could consist of these steps:

| Macroinstruction | Microcycle | Operation |
|------------------|------------|-----------|
| ADD R1, R2 | 1 | PC -> MAR |
| | 2 | FETCH INSTR |
| | 3 | DECODE, INCR PC |
| | 4 | ADD R1 + R2 -> R2 |
| | | TOTAL MICROCYCLES 4 |

This speed improvement for an "ADD" is valid ONLY

- If the data for the instruction is already in the registers.
- If the result is to be used in a following instruction such that it remains in the registers.

If "enough" ALU registers exist, these assumptions are valid because of the tendency of data to cluster (locality of data, locality of reference) within most computer programs. The number of required ALU registers depends upon the specific application. Bit-slice architecture using the Am2900 family offers devices that leave this choice to the designer.
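The comparison above is simple arithmetic, and a short tally makes the ratio explicit. This is only an illustrative sketch: the dictionaries and the `microcycles` helper are hypothetical names, and the per-instruction counts are the ones listed in the text (they assume the register operands are already loaded).

```python
# Microcycle counts taken from the listings above.
ACC_MACHINE = {"LDA": 4, "ADD": 4, "STO": 4}   # single-address (accumulator) machine
REG_MACHINE = {"ADD": 4}                        # general-register machine, operands in registers


def microcycles(program, costs):
    """Sum the microcycles for a list of macroinstruction mnemonics."""
    return sum(costs[op] for op in program)


acc_total = microcycles(["LDA", "ADD", "STO"], ACC_MACHINE)
reg_total = microcycles(["ADD"], REG_MACHINE)
print(acc_total, reg_total, acc_total / reg_total)
```

The factor of three only holds while the locality assumptions stated above are met; every operand that must first be brought in from memory adds its own load cost to the register machine.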
#### FURTHER IMPROVEMENTS

- How many registers? Let's begin with 16 general purpose "scratchpad" registers.
- These registers are multiport registers. Two may be accessed at a time in order to perform:

  $$R_A + R_B \longrightarrow R_B$$

  in one microcycle.
- The accumulator register is now <u>any register</u>, which provides a more general system architecture.
- However, one must always specify two operand addresses per machine instruction; the advantage of implied addressing no longer exists.
- Note that the one word register address machine instruction format is as compact as the one word single address instruction. However, the directly addressable space is smaller: $2^4$ registers in the current configuration versus $2^{12}$ memory locations in the single address format.

### A, B ALU ADDRESSES

- The register addresses can be supplied from two sources:
  - from the instruction register (macroprogramming)
  - from the microword (microprogramming)
- This implies a multiplexer for selection of A, B addresses and a microword field to control the selection.
- A and B addresses would be taken from the macroinstruction register for most instructions.
- A and B addresses could also be supplied by the microword in special cases (long word arithmetic; floating point operations; etc.).
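The A/B address selection described above can be sketched as a field decode followed by a two-way multiplexer. This is an assumption-laden illustration: the function names are hypothetical, the field order (opcode in the high byte, register address 1, then register address 2) is assumed from the format table, and mapping register address 1 to the A address and register address 2 to the B address is also an assumption.

```python
def decode(instruction: int):
    """Split a 16-bit word into opcode (8), register address 1 (4), register address 2 (4)."""
    opcode = (instruction >> 8) & 0xFF
    r1 = (instruction >> 4) & 0xF   # source
    r2 = instruction & 0xF          # source and destination
    return opcode, r1, r2


def select_ab(use_microword: bool, instruction: int, micro_a: int, micro_b: int):
    """Model the A/B address multiplexer controlled by a microword field."""
    if use_microword:
        # Special cases: the microprogram names the registers itself.
        return micro_a & 0xF, micro_b & 0xF
    # Normal case: addresses come from the macroinstruction register.
    _, r1, r2 = decode(instruction)
    return r1, r2


word = 0x2A12   # assumed encoding: opcode 0x2A, R1, R2
print(decode(word))
print(select_ab(False, word, 15, 15))
print(select_ab(True, word, 15, 15))
```

Selecting the microword as the address source is what lets a microroutine reach a register such as R15 regardless of which registers the macroinstruction names.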
# MULTIPORT MEMORY TIMING

#### MULTIPORT READ/WRITE TIMING

- Clock rising edge
  - Load pipeline register
  - Load instruction register (if Fetch done previously)
  - Load RAM register (if Write done previously)
- Time delay into clock high
  - A, B addresses stable
- Clock high
  - READ <A>, <B>
- Clock falling edge
  - Latch the RAM outputs
- Clock low
  - Perform ALU function
  - Set-up time for RAM

#### MACHINE INSTRUCTION SET

#### ONE WORD FORMAT

| Instruction | Operation |
|-------------|-----------|
| ADD R1, R2 | Addition |
| SUB R1, R2 | Subtraction |
| OR R1, R2 | Boolean OR |
| AND R1, R2 | Boolean AND |
| XOR R1, R2 | Boolean Exclusive OR |
| MOV R1, R2 | MOVE <R1> TO <R2> |
| IN R2 | Input <R2> |
| OUT R2 | Output <R2> |
| JMP R2 | JUMP TO <R2> |
| JMZ R1, R2 | JUMP TO <R2> IF <R1> = 0 |

#### TWO WORD FORMAT

| Instruction |
|-------------|
| LDR R1, MEMADDR |
| STO R1, MEMADDR |

#### THE MACRO PROGRAM COUNTER

- There is no need for, nor any real advantage in, maintaining a separate PC register.
- Use one of the scratchpad registers (general purpose) as the PC for increased flexibility (addressing).
- Any one of the registers could be used (R15 is usually selected).
- Sophisticated addressing schemes are possible:

```
ADDRESS = <PC>
ADDRESS = <PC> + <BASE REGISTER>
ADDRESS = <PC> + <OFFSET REGISTER>
ADDRESS = <PC> + <BASE> + <OFFSET>
ADDRESS = <Ri>
```

#### THE "NEW" ARCHITECTURE

- A separate PC register is no longer included.
- The scratchpad registers are shown in a new position for consistency with typical AMD data sheets.
- Carry-in (C<sub>IN</sub>) is available for arithmetic operations.
- TESTS (zero, overflow, negative, carry-out) are now made by the ALU.
- The MAR remains as a separate register. It will not be considered as part of the "ALU architecture", and will not be shown in subsequent drawings since it is an address buffer for main memory.

# ADDITIONAL IMPROVEMENTS

### ADDITIONAL ALU ARCHITECTURAL FEATURES

- To allow for more flexibility in selecting ALU sources, multiplexers are added to both ALU inputs.
- Three sources are available to the A-ALU Input (R-Port):
  - data in (D<sub>IN</sub>)
  - the RAM (Register) A-Port Output
  - the value '0'
- Four sources are available to the B-ALU Input (S-Port):
  - the RAM (Register) A-Port Output
  - the RAM (Register) B-Port Output
  - the value '0'
  - the Q-register (to be defined)
- In order to allow register values to be output quickly without passing through the ALU, an output MUX is added to allow selection of the RAM (Register) A-Port output or the ALU output. An output enable control on the output MUX allows these outputs to be connected to a tri-state bus.
- As indicated earlier, the MAR is no longer shown as part of this architecture, although one would usually be connected to the tri-state output Y<sub>OUT</sub>.
- Control signals for the multiplexers are included with the ALU function control as a set of instruction lines.

### **ADDITIONAL EXPANSION:**

- Add a shifter between the ALU output and the RAM input
  - Allow up/down 1-bit shift
  - With external support can perform 1-bit up/down rotate
  - Can choose not to shift/rotate
- A separate shifter allows a shift and an arithmetic operation (OP) to be performed in one microcycle.
- This capability also allows less complex firmware routines for multiply, divide, and other functions.

#### ALU COMPLETION

- Multiplication of N x N bit numbers produces a 2N bit result. Thus, add an extension register, the Q register, to store the least significant part of the product. Note that the Q register output is connected to the S port selection MUX.
- Add a shifter for the Q register. With external connections this allows double precision up/down shift/rotate as well as supporting the multiply process.
- Expand ALU status lines. Add carry propagate and generate for high speed addition (e.g. carry-look-ahead operation, which requires additional external logic discussed in detail later).

"A BIT-SLICE ALU HAS BEEN DEVELOPED"

# BIT-SLICE ALU

The Am2901

## SOURCE CONTROL

| Mnemonic | I<sub>2</sub> | I<sub>1</sub> | I<sub>0</sub> | Octal Code | ALU Source Operand R | ALU Source Operand S |
|----------|----|----|----|------------|---|---|
| AQ | L | L | L | 0 | A | Q |
| AB | L | L | H | 1 | A | B |
| ZQ | L | H | L | 2 | 0 | Q |
| ZB | L | H | H | 3 | 0 | B |
| ZA | H | L | L | 4 | 0 | A |
| DA | H | L | H | 5 | D | A |
| DQ | H | H | L | 6 | D | Q |
| DZ | H | H | H | 7 | D | 0 |

## FUNCTION CONTROL

| Mnemonic | I<sub>5</sub> | I<sub>4</sub> | I<sub>3</sub> | Octal Code | ALU Function | Symbol |
|----------|----|----|----|------------|--------------|--------|
| ADD | L | L | L | 0 | R Plus S | $R + S$ |
| SUBR | L | L | H | 1 | S Minus R | $S - R$ |
| SUBS | L | H | L | 2 | R Minus S | $R - S$ |
| OR | L | H | H | 3 | R OR S | $R \lor S$ |
| AND | H | L | L | 4 | R AND S | $R \land S$ |
| NOTRS | H | L | H | 5 | $\overline{R}$ AND S | $\overline{R} \land S$ |
| EXOR | H | H | L | 6 | R EX-OR S | $R \veebar S$ |
| EXNOR | H | H | H | 7 | R EX-NOR S | $\overline{R \veebar S}$ |

Consult the AMD Data Book for discussion of tables. Note the effect of C<sub>IN</sub> in Am2901 function control.
## DESTINATION CONTROL

| Mnemonic | I<sub>8</sub> | I<sub>7</sub> | I<sub>6</sub> | Octal Code | RAM Shift | RAM Load | Q-Reg. Shift | Q-Reg. Load | Y Output | RAM Shifter RAM<sub>0</sub> | RAM Shifter RAM<sub>3</sub> | Q Shifter Q<sub>0</sub> | Q Shifter Q<sub>3</sub> |
|----------|----|----|----|---|------|---------|------|---------|---|----|----|----|----|
| QREG | L | L | L | 0 | X | NONE | NONE | F → Q | F | X | X | X | X |
| NOP | L | L | H | 1 | X | NONE | X | NONE | F | X | X | X | X |
| RAMA | L | H | L | 2 | NONE | F → B | X | NONE | A | X | X | X | X |
| RAMF | L | H | H | 3 | NONE | F → B | X | NONE | F | X | X | X | X |
| RAMQD | H | L | L | 4 | DOWN | F/2 → B | DOWN | Q/2 → Q | F | F<sub>0</sub> | IN<sub>3</sub> | Q<sub>0</sub> | IN<sub>3</sub> |
| RAMD | H | L | H | 5 | DOWN | F/2 → B | X | NONE | F | F<sub>0</sub> | IN<sub>3</sub> | Q<sub>0</sub> | X |
| RAMQU | H | H | L | 6 | UP | 2F → B | UP | 2Q → Q | F | IN<sub>0</sub> | F<sub>3</sub> | IN<sub>0</sub> | Q<sub>3</sub> |
| RAMU | H | H | H | 7 | UP | 2F → B | X | NONE | F | IN<sub>0</sub> | F<sub>3</sub> | X | Q<sub>3</sub> |

X = Don't care. Electrically, the shift pin is a TTL input internally connected to a three-state output which is in the high-impedance state.

B = Register Addressed by B inputs.

UP is toward MSB, DOWN is toward LSB.

Am2901 PIPELINE REQUIREMENTS

# INTERCONNECTION OF SLICES Am2901

# 12-bit ALU, Ripple Carry

#### Am2902A

#### CARRY LOOKAHEAD DEVICE

To develop the implemented equations for the carry lookahead device, a single bit addition is considered first, then a single ALU device, and finally a combination of slices.
- A carry, $C_{i+1}$, from the i'th bit location is either
  - generated ($G_i$) at the i'th position, i.e.,

    $$G_i = A_i B_i \quad \text{(Boolean AND)}$$

    where $A_i$ = i'th bit of augend and $B_i$ = i'th bit of addend
  - or propagated ($P_i$) across the i'th bit position if $C_i = 1$, i.e.,

    $$P_i = A_i \overline{B_i} + \overline{A_i} B_i \quad \text{(Boolean Exclusive OR)}$$

Carry-out, $C_{i+1}$, of the i'th bit position is then

$$C_{i+1} = G_i + P_i C_i$$

### CARRY LOOKAHEAD (CONT'D)

- Internal to each 4 bit slice, the carry value is developed using the i'th bit equations for i = 0, 1, 2, 3. Note that each bit can calculate $P_i$ and $G_i$ without the i'th bit carry-in value. The resulting carry equations for each bit are (+ = Boolean OR):

$$
\begin{aligned}
C_0 &= C_{in} \quad \text{(carry-in)} \\
C_1 &= G_0 + P_0 C_{in} \\
C_2 &= G_1 + P_1 G_0 + P_1 P_0 C_{in} \\
C_3 &= G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_{in} \\
C_4 &= G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_{in}
\end{aligned}
$$

- These carry bit equations are calculated, and the "modulus two" sum (Exclusive OR) of each augend bit, addend bit and carry bit generates the 4 bit sum. For example, consider the addition of two 4-bit binary numbers (augend = 0111, addend = 1001):

| | Value | |
|---|---|---|
| carry | 1 1110 | ($C_{in} = 0$) |
| augend | 0111 | |
| addend | 1001 | |
| 4-bit sum | 0000 | with $C_4 = 1$ |

where

| i | $G_i$ | $P_i$ | $C_{i+1}$ |
|---|---|---|---|
| 0 | 1 | 0 | 1 |
| 1 | 0 | 1 | 1 |
| 2 | 0 | 1 | 1 |
| 3 | 0 | 1 | 1 |

## CARRY LOOKAHEAD (CONT'D)

Since it is desired to connect ALU slices together, consider using the carry generate and propagate equations for a combination of slices. To accomplish this, the generated and propagated values must be developed at the slice level. The associated equations for the j'th slice are:

$$
\begin{aligned}
G_j &= G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 \\
P_j &= P_3 P_2 P_1 P_0
\end{aligned}
$$

- Thus, each slice can develop $G_j$ and $P_j$ internally and provide the values on the device pin connections.
  Then each slice can receive its carry-in, $C_j$, as soon as the associated less significant slices produce $P_j$ and $G_j$.
- Therefore, the Carry-out of the j'th slice (which is the Carry-in to the [j + 1]st slice) can be calculated from P and G of the lesser slices plus C<sub>in</sub>. With an external carry-look-ahead device, the Carry-out of each slice can be generated faster than with a ripple carry connection.

## CARRY LOOKAHEAD (CONT'D)

Considering a 16-bit ALU, the external Carry-in equations for the three most significant slices, using equations similar to those internal to the slice, are given by:

$$
\begin{aligned}
C_3 &= G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_0 \\
C_2 &= G_1 + P_1 G_0 + P_1 P_0 C_0 \\
C_1 &= G_0 + P_0 C_0 \\
C_0 &= C_{in} \quad \text{(carry-in)}
\end{aligned}
$$

where $P_j$ and $G_j$ are generated by the j'th ALU slice (j = 0, 1, 2, 3) and $C_j$ is the Carry-in for each ALU slice generated by the Am2902A.

For example, with $G_2 = P_0 = P_1 = P_2 = 0$ and $G_0 = G_1 = 1$:

| | Carry-out | Slice 3 | Slice 2 | Slice 1 | Slice 0 |
|---|---|---|---|---|---|
| carry from Am2902 | 0 | $C_3 = 0$ | $C_2 = 1$ | $C_1 = 1$ | $C_0 = 0$ |
| augend | | 0011 | 0111 | 0111 | 0111 |
| addend | | 0001 | 0101 | 1010 | 1001 |
| sum | | 0100 | 1101 | 0010 | 0000 |

- The Am2902A look-ahead carry generator performs these calculations for four slices (see figures). More Am2902As can be ganged for longer word sizes, giving a hierarchy of carry lookahead operations.
- The result is that the Carry-in time to any slice is equal to the time ($t_{pg}$) to generate $P_j$ and $G_j$ for all slices in parallel plus the delay ($t_{02}$) through the Am2902 (see the AMD Data Book). The total add time is this value plus the Am2901 add time ($t_{add}$).
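The generate/propagate equations above can be checked against both worked examples with a short script. This is a sketch under stated assumptions: the function names are hypothetical, and the slice carries are written as the recurrence $C_{j+1} = G_j + P_j C_j$, which is logically equivalent to the expanded equations. A real lookahead generator evaluates the expanded form in parallel rather than stepping through the slices, so the code models the logic, not the timing.

```python
def bit_pg(a: int, b: int, width: int = 4):
    """Per-bit terms: G_i = A_i AND B_i, P_i = A_i XOR B_i (index 0 = LSB)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    return g, p


def slice_pg(a: int, b: int):
    """Slice-level G_j and P_j for one 4-bit slice."""
    g, p = bit_pg(a, b)
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    P = p[3] & p[2] & p[1] & p[0]
    return G, P


def lookahead_carries(G, P, c_in):
    """Carry-in of every slice (plus the final carry-out) from slice G, P and C_in."""
    carries = [c_in]
    for gj, pj in zip(G, P):
        carries.append(gj | (pj & carries[-1]))
    return carries


def add16(a: int, b: int, c_in: int = 0):
    """16-bit add built from four 4-bit slices and lookahead carries."""
    slices = [((a >> (4 * j)) & 0xF, (b >> (4 * j)) & 0xF) for j in range(4)]
    G, P = zip(*(slice_pg(x, y) for x, y in slices))
    c = lookahead_carries(G, P, c_in)
    total = 0
    for j, (x, y) in enumerate(slices):
        total |= ((x + y + c[j]) & 0xF) << (4 * j)
    return total, c


print(bit_pg(0b0111, 0b1001))          # 4-bit example: per-bit G and P
total, carries = add16(0x3777, 0x15A9)  # 16-bit example from the text
print(hex(total), carries)
```

The second call reproduces the 16-bit example: slices 0 and 1 generate a carry, slice 2 neither generates nor propagates, and the sum is 0100 1101 0010 0000.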
## 16-bit ALU, Lookahead Carry

### 32-bit ALU

#### CONNECTING THE SHIFTERS

- For rotates
  - connect RAM0 to RAMn
  - connect Q0 to Qn
- For shifts
  - connect '0' or '1' to be shifted in as required
  - could shift into MSB or LSB
  - needed for both Q and RAM
- For double-length arithmetic shift
  - connect RAM0 to Qn
  - connect output Fn to RAMn for down shift (sign)
- To connect the various Am2901 pins use SSI logic or use the Am2904, Status and Shift Control Logic.

# RECOMMENDED MICROPROGRAMMED SYSTEM ARCHITECTURE

- This is the architecture that has been developed
- Single pipeline register allows two parallel operations:
  - fetch next microinstruction (sequencer operations)
  - execute current microinstruction (ALU operations)

## Recommended Architecture

# IMPROVING SPEED

Am2901C: Worst-Case 16-bit Register-to-Register Add Time with Am2901C

## SAMPLE MICROCODE - AM2901

- Various ALU processes are described on the following pages in terms of Am2901 micro-operations:
  - Increment a register and output original value
  - Byte swap

## Am2901: INCREMENT A REGISTER AND OUTPUT ITS ORIGINAL VALUE

- This operation is required in the macroinstruction fetch cycle for the PC (macro program counter).
- Assume the register is loaded with the macroinstruction address in main memory.
- Assume that Reg 15 is the PC.

## Am2901 SOLUTION

- Address A and Address B are both set to 15 from pipeline
- I<sub>210</sub> (source) is set to 3 to select source operands 0 and B (ZB)
- I<sub>543</sub> (function) is set to 0 (ADD)
- I<sub>876</sub> (destination) is set to 2 to select F -> B and A -> Y (RAMA)
- C<sub>in</sub> is set to 1 (ONE)

## Am2901 BYTE SWAP: to exchange two halves of a 16 bit word

Assume again register 15. (Number chosen for no particular reason.)
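Both sample microinstructions can be traced with a small behavioural model. This is a hedged sketch, not AMD's specification of the part: the `Am2901Model` class and its parameter names are hypothetical, four slices are lumped into one 16-bit ALU, and only the source, function and destination codes needed here are implemented. The byte-swap settings (AB, ADD, RAMU with C<sub>in</sub> tied to C<sub>out</sub> and the RAM shifter wired for rotate) are the ones used in the byte-swap solution that follows.

```python
MASK = 0xFFFF  # assumption: four 4-bit slices modelled as one 16-bit ALU


class Am2901Model:
    """Hypothetical behavioural model; only the codes used below are implemented."""

    def __init__(self):
        self.ram = [0] * 16   # 16 scratchpad registers

    def step(self, a, b, source, function, dest,
             c_in=0, carry_feedback=False, rotate=False):
        ra, rb = self.ram[a], self.ram[b]
        # I2-I0: select the ALU source operands (R, S).
        r, s = {"AB": (ra, rb), "ZB": (0, rb)}[source]
        # I5-I3: ALU function (only ADD is needed here).
        if function != "ADD":
            raise ValueError(f"unsupported function {function}")
        if carry_feedback:
            # C_in wired to C_out. For A + A the carry-out equals the MSB
            # of A whatever C_in is, so it can be computed first.
            c_in = ((r + s) >> 16) & 1
        f = (r + s + c_in) & MASK
        # I8-I6: destination control.
        if dest == "RAMA":        # F -> B, Y = A
            self.ram[b] = f
            return ra
        if dest == "RAMU":        # 2F -> B, Y = F
            shift_in = (f >> 15) & 1 if rotate else 0  # RAM0 tied to MSB shift output
            self.ram[b] = ((f << 1) | shift_in) & MASK
            return f
        raise ValueError(f"unsupported destination {dest}")


alu = Am2901Model()

# Increment R15 and output its original value: ZB, ADD, RAMA, C_in = 1.
alu.ram[15] = 0x0100
y = alu.step(15, 15, "ZB", "ADD", "RAMA", c_in=1)
incremented = alu.ram[15]
print(hex(y), hex(incremented))

# Byte swap: AB, ADD, RAMU with C_in = C_out, repeated 4 times.
alu.ram[15] = 0xBCF1
for _ in range(4):
    alu.step(15, 15, "AB", "ADD", "RAMU", carry_feedback=True, rotate=True)
swapped = alu.ram[15]
print(hex(swapped))
```

In the increment case the Y output carries the old PC value to the MAR while the register file receives the incremented value in the same microcycle. In the byte-swap case each pass rotates the word two places, so four passes exchange the two bytes.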
## **CONNECTIONS:**

# SHIFTING SEQUENCE:

| Step | REG 15 |
|-------|---------------------|
| START | 1011 1100 1111 0001 |
| 2A | 0111 1001 1110 0011 |
| SHIFT | 1111 0011 1100 0110 |
| 2A | 1110 0111 1000 1101 |
| SHIFT | 1100 1111 0001 1011 |
| 2A | 1001 1110 0011 0111 |
| SHIFT | 0011 1100 0110 1111 |
| 2A | 0111 1000 1101 1110 |
| SHIFT | 1111 0001 1011 1100 |

- 2A: ADD REGISTER TO ITSELF
- SHIFT: SHIFT LEFT (ROTATE)
- ROTATES 2 BITS IN ONE MICROCYCLE

### Am2901 SOLUTION

- Address A and address B are both set to 15 (F)
- I<sub>210</sub> is set to 1 to select source operands A and B (AB)
- I<sub>543</sub> is set to 0 (ADD)
- I<sub>876</sub> is set to 7 to select 2F -> B, RAM shift up (toward MSB) (RAMU)
- C<sub>in</sub> is set to C<sub>out</sub>
- Repeat 4 times (2 bit rotate per cycle)

| I<sub>876</sub> | I<sub>543</sub> | I<sub>210</sub> | C<sub>IN</sub> |
|------|-----|----|------------------|
| 7 | 0 | 1 | C<sub>out</sub> |
| RAMU | ADD | AB | |

### HARDWARE BYTE SWAP IMPLEMENTATION

- <u>Trade-off</u>: added logic hardware for less micromemory (ROM) and faster execution time.
- Output register to be manipulated.
- Use Am25LS240/244 tri-state buffers with permuted outputs to input ports.

## BYTE SWAP HARDWARE WITH THE Am2901

| SOURCE | FUNC | DEST |
|--------|------|------|
| DZ | ADD or OR | RAMA |

"OR" is faster than "ADD"

## CLASS EXERCISE - Am2901

Turn to the Am2901 exercises in the ED2900A Exercise and Laboratory Manual and work numbers 1 through 14. A coding sheet is provided. Use mnemonics!

# INTRODUCING THE SUPER SLICE™ Am2903/Am29203

### IMPROVEMENTS BEYOND THE Am2901

- Add ability to expand the multiport "scratchpad" register memory (unlimited expansion).
- Simplify multiplication
- Simplify division
- Add normalization for floating point numbers for use in arithmetic operations.
  - 1.XXXX E+XX (mantissa, exponent)
- Add fault tolerance/fault detection features: parity of ALU output.
- Add two's complement sign extend for byte manipulation.
- Add three address instruction for faster operation, i.e. C = A + B

# **COMPARISONS**

| The Am2903/29203 | The Am2901 |
|---------------------------------------------|-----------------------------------------------------|
| 48 pins | 40 pins |
| Higher throughput | Faster clock speed |
| Q register can shift on its own | Q register shifted only when RAM also shifted |
| Arithmetic & logical shifts | Logical shifts only |
| Expandable RAM | RAM not designed for expansion |
| Two- or three-address operations | Two-address operation only (not designed for three) |
| DA<sub>0-3</sub> input | D<sub>0-3</sub> input |
| DA<sub>0-3</sub> output (29203) | - |
| DB<sub>0-3</sub> input/output | - |
| Y<sub>0-3</sub> input/output | Y<sub>0-3</sub> output only |
| Arithmetic operation plus shift and output | Requires two microcycles (shift before RAM load) |
| Parity bit generation | - |
| Special functions, internal logic support | - (requires external <u>assist</u> logic) |

### Am2903/29203 DISTINCTIVE CHARACTERISTICS

- Three port RAM same as Am2901
- 16 ALU functions; Am2901 functions are a proper subset
- 9 (Am2903) or 16 (Am29203) special functions
- Expandable registers
- Microprogrammable
  - 9 bits of instruction
  - 4 enables
  - 2 position selects: Least Significant Slice (LSS), Most Significant Slice (MSS)
- Four status flags similar to 2901
- Q register capable of independent operation
- Two shifts
  - arithmetic
  - logical
- Uses shared pins; some lines multifunctional

# Am2903 / Am29203

### Am2903/29203 DIFFERENCES

- DA<sub>0-3</sub> <u>bidirectional</u> on Am29203
- External $\overline{I}_{\text{EN}}$ internally connected to write enable on Am29203 for use with ALU RAM write operation
- Zero detect
  - on ALU shifter output on Am29203
  - on output of buffer on Am2903
- $\overline{\text{OE}}_{Y}$ connected to Z pin on Am29203

#### Am29203

- Faster than Am2903, but not Am2903A
- Can handle byte operations better
- Has 16 special functions

# Am2903/29203 Microinstruction format (ALU only)

- Under normal operation
  - Destination is controlled by I<sub>8</sub> - I<sub>5</sub>
  - Function is controlled by I<sub>4</sub> - I<sub>1</sub>
- Under special function operation (when I<sub>4</sub> - I<sub>0</sub> are all low)
  - Destination <u>and</u> function are controlled by I<sub>8</sub> - I<sub>5</sub>

### Am2903

# Am29203

# Am2903 Operand Sources

| $\overline{EA}$ | A-REGISTER | $\overline{OE}_B$ | B-REGISTER |
|----|------------|----|------------|
| L | INTERNAL | L | INTERNAL |
| H | EXTERNAL | H | EXTERNAL |

#### **ALU OPERAND SOURCES**

| $\overline{EA}$ | I<sub>0</sub> | $\overline{OE}_B$ | ALU Operand R | ALU Operand S |
|----|----|-----|-------------------|-------------------|
| L | L | L | RAM Output A | RAM Output B |
| L | L | H | RAM Output A | DB<sub>0-3</sub> |
| L | H | X | RAM Output A | Q Register |
| H | L | L | DA<sub>0-3</sub> | RAM Output B |
| H | L | H | DA<sub>0-3</sub> | DB<sub>0-3</sub> |
| H | H | X | DA<sub>0-3</sub> | Q Register |

L = LOW, H = HIGH, X = Don't Care

Note: All 8 input codes are valid, but only 6 combinations are possible. (Note Don't Cares)

# Am2903 / Am29203 ALU Functions

| Arithmetic | Logic |
|-----------------------|----------------------|
| $S + R + C_N$ | R AND S |
| $S - R + C_N - 1$ | R OR S |
| $R - S + C_N - 1$ | R NAND S |
| $S + C_N$ \* | R NOR S |
| $\overline{S} + C_N$ \* | R EXOR S |
| $R + C_N$ | R EXNOR S |
| $\overline{R} + C_N$ | $\overline{R}$ AND S |

HIGH'S\*, LOW'S\*

\* For Am29203, I<sub>0</sub> must be high, hence source must be RAMAQ or DAQ.
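The operand source table above, including its don't-care entries, reduces to two independent selections. The sketch below is illustrative only: the function name and the string labels are assumptions, with 'L' and 'H' standing for the logic levels on $\overline{EA}$, I<sub>0</sub> and $\overline{OE}_B$.

```python
def operand_sources(ea: str, i0: str, oeb: str):
    """Return the (R, S) operand sources for the given EA, I0 and OEB levels ('L' or 'H')."""
    # EA alone chooses operand R.
    r = "RAM Output A" if ea == "L" else "DA0-3"
    # I0 high selects the Q register for S, making OEB a don't-care.
    if i0 == "H":
        s = "Q Register"
    elif oeb == "L":
        s = "RAM Output B"
    else:
        s = "DB0-3"
    return r, s


# Enumerate all 8 input codes and collect the distinct (R, S) pairs.
combos = {operand_sources(ea, i0, oeb)
          for ea in "LH" for i0 in "LH" for oeb in "LH"}
print(len(combos))
print(operand_sources("H", "L", "H"))
```

Enumerating every input code confirms the note in the table: eight codes collapse to six distinct operand pairs because $\overline{OE}_B$ is ignored whenever I<sub>0</sub> is high.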
# Am2903 ALU Functions

| I<sub>4</sub> | I<sub>3</sub> | I<sub>2</sub> | I<sub>1</sub> | Hex Code | ALU Functions |
|----|----|----|----|----------|----------------|
| L | L | L | L | 0 | I<sub>0</sub> = L: Special Functions; I<sub>0</sub> = H: F<sub>i</sub> = HIGH |
| L | L | L | H | 1 | F = S Minus R Minus 1 Plus C<sub>n</sub> |
| L | L | H | L | 2 | F = R Minus S Minus 1 Plus C<sub>n</sub> |
| L | L | H | H | 3 | F = R Plus S Plus C<sub>n</sub> |
| L | H | L | L | 4 | F = S Plus C<sub>n</sub> |
| L | H | L | H | 5 | F = $\overline{S}$ Plus C<sub>n</sub> |
| L | H | H | L | 6 | F = R Plus C<sub>n</sub> |
| L | H | H | H | 7 | F = $\overline{R}$ Plus C<sub>n</sub> |
| H | L | L | L | 8 | F<sub>i</sub> = LOW |
| H | L | L | H | 9 | F<sub>i</sub> = $\overline{R}_i$ AND S<sub>i</sub> |
| H | L | H | L | A | F<sub>i</sub> = R<sub>i</sub> EXCLUSIVE NOR S<sub>i</sub> |
| H | L | H | H | B | F<sub>i</sub> = R<sub>i</sub> EXCLUSIVE OR S<sub>i</sub> |
| H | H | L | L | C | F<sub>i</sub> = R<sub>i</sub> AND S<sub>i</sub> |
| H | H | L | H | D | F<sub>i</sub> = R<sub>i</sub> NOR S<sub>i</sub> |
| H | H | H | L | E | F<sub>i</sub> = R<sub>i</sub> NAND S<sub>i</sub> |
| H | H | H | H | F | F<sub>i</sub> = R<sub>i</sub> OR S<sub>i</sub> |

L = LOW, H = HIGH, i = 0 to 3

# Am29203 ALU Functions

| I<sub>4</sub> | I<sub>3</sub> | I<sub>2</sub> | I<sub>1</sub> | I<sub>0</sub> | ALU Functions |
|----|----|----|----|----|----------------|
| L | L | L | L | L | Special Functions |
| L | L | L | L | H | F<sub>i</sub> = HIGH |
| L | L | L | H | X | F = S Minus R Minus 1 Plus C<sub>n</sub> |
| L | L | H | L | X | F = R Minus S Minus 1 Plus C<sub>n</sub> |
| L | L | H | H | X | F = R Plus S Plus C<sub>n</sub> |
| L | H | L | L | X | F = S Plus C<sub>n</sub> |
| L | H | L | H | X | F = $\overline{S}$ Plus C<sub>n</sub> |
| L | H | H | L | L | Reserved Special Functions |
| L | H | H | L | H | F = R Plus C<sub>n</sub> |
| L | H | H | H | L | Reserved Special Functions |
| L | H | H | H | H | F = $\overline{R}$ Plus C<sub>n</sub> |
| H | L | L | L | L | Reserved Special Functions |
| H | L | L | L | H | F<sub>i</sub> = LOW |
| H | L | L | H | X | F<sub>i</sub> = $\overline{R}_i$ AND S<sub>i</sub> |
| H | L | H | L | X | F<sub>i</sub> = R<sub>i</sub> EXCLUSIVE NOR S<sub>i</sub> |
| H | L | H | H | X | F<sub>i</sub> = R<sub>i</sub> EXCLUSIVE OR S<sub>i</sub> |
| H | H | L | L | X | F<sub>i</sub> = R<sub>i</sub> AND S<sub>i</sub> |
| H | H | L | H | X | F<sub>i</sub> = R<sub>i</sub> NOR S<sub>i</sub> |
| H | H | H | L | X | F<sub>i</sub> = R<sub>i</sub> NAND S<sub>i</sub> |
| H | H | H | H | X | F<sub>i</sub> = R<sub>i</sub> OR S<sub>i</sub> |

L = LOW, H = HIGH, i = 0 to 3, X = LOW or HIGH

### ALU DESTINATION CONTROL

- Destination is controlled by I<sub>8</sub>-I<sub>7</sub>-I<sub>6</sub>-I<sub>5</sub>
- Includes choice of down-, up-, or no-shift
- Allows choice of logical or arithmetic shift (RAM only)
- Controls whether data is written to RAM registers

ALU DESTINATION CONTROL FOR I<sub>0</sub> OR I<sub>1</sub> OR I<sub>2</sub> OR I<sub>3</sub> = HIGH, $\overline{IEN}$ = LOW

| I<sub>8</sub> | I<sub>7</sub> | I<sub>6</sub> | I<sub>5</sub> | Hex Code | ALU Shifter Function | SIO<sub>3</sub> (Most Sig. Slice) | SIO<sub>3</sub> (Other Slices) | Y<sub>3</sub> (Most Sig. Slice) | Y<sub>3</sub> (Other Slices) | Y<sub>2</sub> (Most Sig. Slice) | Y<sub>2</sub> (Other Slices) | Y<sub>1</sub> | Y<sub>0</sub> | SIO<sub>0</sub> | WRITE | Q Reg & Shifter Function | QIO<sub>3</sub> | QIO<sub>0</sub> |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| L | L | L | L | 0 | Arith. F/2 → Y | Input | Input | F<sub>3</sub> | SIO<sub>3</sub> | SIO<sub>3</sub> | F<sub>3</sub> | F<sub>2</sub> | F<sub>1</sub> | F<sub>0</sub> | L | Hold | Hi-Z | Hi-Z |
| L | L | L | H | 1 | Log. F/2 → Y | Input | Input | SIO<sub>3</sub> | SIO<sub>3</sub> | F<sub>3</sub> | F<sub>3</sub> | F<sub>2</sub> | F<sub>1</sub> | F<sub>0</sub> | L | Hold | Hi-Z | Hi-Z |
| L | L | H | L | 2 | Arith. F/2 → Y | Input | Input | F<sub>3</sub> | SIO<sub>3</sub> | SIO<sub>3</sub> | F<sub>3</sub> | F<sub>2</sub> | F<sub>1</sub> | F<sub>0</sub> | L | Log. Q/2 → Q | Input | Q<sub>0</sub> |
| L | L | H | H | 3 | Log. F/2 → Y | Input | Input | SIO<sub>3</sub> | SIO<sub>3</sub> | F<sub>3</sub> | F<sub>3</sub> | F<sub>2</sub> | F<sub>1</sub> | F<sub>0</sub> | L | Log. Q/2 → Q | Input | Q<sub>0</sub> |
| L | H | L | L | 4 | F → Y | Input | Input | F<sub>3</sub> | F<sub>3</sub> | F<sub>2</sub> | F<sub>2</sub> | F<sub>1</sub> | F<sub>0</sub> | Parity | L | Hold | Hi-Z | Hi-Z |
| L | H | L | H | 5 | F → Y | Input | Input | F<sub>3</sub> | F<sub>3</sub> | F<sub>2</sub> | F<sub>2</sub> | F<sub>1</sub> | F<sub>0</sub> | Parity | H | Log. Q/2 → Q | Input | Q<sub>0</sub> |
| L | H | H | L | 6 | F → Y | Input | Input | F<sub>3</sub> | F<sub>3</sub> | F<sub>2</sub> | F<sub>2</sub> | F<sub>1</sub> | F<sub>0</sub> | Parity | H | F → Q | Hi-Z | Hi-Z |
| L | H | H | H | 7 | F → Y | Input | Input | F<sub>3</sub> | F<sub>3</sub> | F<sub>2</sub> | F<sub>2</sub> | F<sub>1</sub> | F<sub>0</sub> | Parity | L | F → Q | Hi-Z | Hi-Z |
| H | L | L | L | 8 | Arith. 2F → Y | F<sub>2</sub> | F<sub>3</sub> | F<sub>3</sub> | F<sub>2</sub> | F<sub>1</sub> | F<sub>1</sub> | F<sub>0</sub> | SIO<sub>0</sub> | Input | L | Hold | Hi-Z | Hi-Z |
| H | L | L | H | 9 | Log. 2F → Y | F<sub>3</sub> | F<sub>3</sub> | F<sub>2</sub> | F<sub>2</sub> | F<sub>1</sub> | F<sub>1</sub> | F<sub>0</sub> | SIO<sub>0</sub> | Input | L | Hold | Hi-Z | Hi-Z |
| H | L | H | L | A | Arith. 2F → Y | F<sub>2</sub> | F<sub>3</sub> | F<sub>3</sub> | F<sub>2</sub> | F<sub>1</sub> | F<sub>1</sub> | F<sub>0</sub> | SIO<sub>0</sub> | Input | L | Log. 2Q → Q | Q<sub>3</sub> | Input |
| H | L | H | H | B | Log. 2F → Y | F<sub>3</sub> | F<sub>3</sub> | F<sub>2</sub> | F<sub>2</sub> | F<sub>1</sub> | F<sub>1</sub> | F<sub>0</sub> | SIO<sub>0</sub> | Input | L | Log. 2Q → Q | Q<sub>3</sub> | Input |
| H | H | L | L | C | F → Y | F<sub>3</sub> | F<sub>3</sub> | F<sub>3</sub> | F<sub>3</sub> | F<sub>2</sub> | F<sub>2</sub> | F<sub>1</sub> | F<sub>0</sub> | Hi-Z | H | Hold | Hi-Z | Hi-Z |
| H | H | L | H | D | F → Y | F<sub>3</sub> | F<sub>3</sub> | F<sub>3</sub> | F<sub>3</sub> | F<sub>2</sub> | F<sub>2</sub> | F<sub>1</sub> | F<sub>0</sub> | Hi-Z | H | Log. 2Q → Q | Q<sub>3</sub> | Input |
| H | H | H | L | E | SIO<sub>0</sub> → Y<sub>0</sub>, Y<sub>1</sub>, Y<sub>2</sub>, Y<sub>3</sub> | SIO<sub>0</sub> | SIO<sub>0</sub> | SIO<sub>0</sub> | SIO<sub>0</sub> | SIO<sub>0</sub> | SIO<sub>0</sub> | SIO<sub>0</sub> | SIO<sub>0</sub> | Input | L | Hold | Hi-Z | Hi-Z |
| H | H | H | H | F | F → Y | F<sub>3</sub> | F<sub>3</sub> | F<sub>3</sub> | F<sub>3</sub> | F<sub>2</sub> | F<sub>2</sub> | F<sub>1</sub> | F<sub>0</sub> | Hi-Z | L | Hold | Hi-Z | Hi-Z |

Hi-Z = High Impedance, L = LOW, H = HIGH

Parity = $F_3 \veebar F_2 \veebar F_1 \veebar F_0 \veebar SIO_3$, where $\veebar$ = Exclusive OR

# LOGICAL VERSUS ARITHMETIC SHIFT:

### Am2903 Arithmetic Shift Path

# Am2903 Logical Shift Path

### Am2903/29203 SPECIAL FUNCTIONS

#### Am2903 and Am29203:

- Parity
- Sign extension
- Sign magnitude/two's complement conversion
- Unsigned multiply
- Two's complement multiply
- Increment by 1 or 2
- Single length normalize
- Double length normalize
- Two's complement divide

### Am29203 only:

- BCD/binary conversion
- Decrement by 1 or 2
- BCD divide by 2
- BCD add and subtract

### Am29203 Special Functions

| (Hex) I<sub>8</sub>I<sub>7</sub>I<sub>6</sub>I<sub>5</sub> | I<sub>4</sub> | (Hex) I<sub>3</sub>I<sub>2</sub>I<sub>1</sub>I<sub>0</sub> | Special Function | ALU Function | ALU Shifter Function | SIO<sub>3</sub> (Most Sig. Slice) | SIO<sub>3</sub> (Other Slices) | SIO<sub>0</sub> | Q Reg & Shifter Function | QIO<sub>3</sub> | QIO<sub>0</sub> | WRITE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | L | 0 | Unsigned Multiply | F = S + C<sub>n</sub> if Z = L; F = R + S + C<sub>n</sub> if Z = H | Log F/2 → Y (Note 1) | Z | Input | F<sub>0</sub> | Log Q/2 → Q | Input | Q<sub>0</sub> | L |
| 1 | L | 0 | BCD to Binary Conversion | (Note 4) | Log F/2 → Y | Input | Input | F<sub>0</sub> | Log Q/2 → Q | Input | Q<sub>0</sub> | L |
| 1 | H | 0 | Multiprecision BCD to Binary | (Note 4) | Log F/2 → Y | Input | Input | F<sub>0</sub> | Hold | Z | Q<sub>0</sub> | L |
| 2 | L | 0 | Two's Complement Multiply | F = S + C<sub>n</sub> if Z = L; F = R + S + C<sub>n</sub> if Z = H | Log F/2 → Y (Note
2) | z | Input | F <sub>0</sub> | Log Q/2 → Q | Input | <b>Q</b> 0 | L | | 3 | L | 0 | Decrement by<br>One or Two | F = S - 2 + C <sub>n</sub> | F-→ Y | z | z | Parity | Hold | z | Z | L | | 4 | L | 0 | Increment by<br>One or Two | F = S + 1 + C <sub>n</sub> | F→Y | Input | Input | Parity | Hold | z | z | L | | 5 | L | 0 | Sign/Magnitude<br>Two's Complement | $F = S + C_n \text{ if } Z = L$ $F = S + C_n \text{ if } Z = H$ | F → Y<br>(Note 3) | Input | Input | Parity | Hold | z | z | L | | 6 | L | 0 | Two's Complement<br>Multiply, Last Cycle | $F = S + C_n \text{ if } Z = L$<br>$F = S - R - 1 + C_n \text{ if } Z = H$ | Log F/2 → Y<br>(Note 2) | z | Input | F <sub>0</sub> | Log Q/2 → Q | Input | <b>Q</b> 0 | L | | 7 | L | 0 | BCD Divide<br>by Two | (Note 4) | F→Y | z | Z | Parity | Hold | z | z | L | | 8 | L | 0 | Single Length<br>Normalize | F = S + C <sub>n</sub> | F→ Y | F <sub>3</sub> | F3 | z | Log 2Q → Q | Q <sub>3</sub> | Input | L | | 9 | L | 0 | Binary to BCD<br>Conversion | (Note 5) | Log 2F → Y | F <sub>3</sub> | F <sub>3</sub> | Input | Log 2Q → Q | Q <sub>3</sub> | Input | L | | 9 | н | 0 | Multiprecision<br>Binary to BCD | (Note 5) | Log 2F → Y | F <sub>3</sub> | F <sub>3</sub> | Input | Hold | z | Input | L | | A | L | 0 | Double Length<br>Normalize and First<br>Divide Op | F = S + C <sub>n</sub> | Log 2F → Y | R <sub>3</sub> ∀ F <sub>3</sub> | F3 | Input | Log 2Q → Q | O <sub>3</sub> | Input | L | | В | L | 0 | BCD Add | F = R + S + C <sub>n</sub> BCD<br>(Note 6) | F→ Y | 0 | 0 | z | Hold | z | z | L | | С | L | 0 | Two's Complement<br>Divide | F = S + R + C <sub>n</sub> if Z = L<br>F = S - R - 1 + C <sub>n</sub> if Z = H | Log 2F → Y | R <sub>3</sub> ∀ F | F <sub>3</sub> | Input | Log 2Q → Q | Q <sub>3</sub> | Input | L | | D | L | 0, | BCD Subtract | F = R - S - 1 + C <sub>n</sub> BCD<br>(Note 6) | F-→ Y | o | 0 | z | Hold | z | Z | L | | E | L | 0 | Two's Complement Divide Correction and Remainder | F = S + R + C <sub>n</sub> itZ - L<br>F = S - R - 
1 + C <sub>n</sub> itZ - H | F⊶Y | F <sub>3</sub> | F <sub>3</sub> | z | Log 2Q → Q | O <sub>3</sub> | Input | L | | F | L | 0 | BCD Subtract | F = S - R - 1 + C <sub>n</sub> BCD .<br>(Note 6) | FY | 0 | 0 | z | Hold | z | z | L | Notes: 1. At the most significant slice only, the $C_{n+4}$ signal is internally gated to the $Y_3$ output. 2. At the most significant slice only, $F_3 \ \forall \ \text{OVR}$ is internally gated to the $Y_3$ output. - 3. At the most significant slice only, $S_3 \ \forall \ F_3$ is generated at the $Y_3$ output. - 4. On each slice, F = S if magnitude of S<sub>0-3</sub> is less than 8 and F = S minus 3 if magnitude of S<sub>0-3</sub> is 8 or greater. 5. On each slice, F = S if magnitude of S<sub>0-3</sub> is less than 5 and F = S plus 3 if magnitude of S<sub>0-3</sub> is 5 or greater. Addition is module 16. - 6. Additions and subtractions are BCD adds and subtracts. Results are undefined if R or S are not in valid BCD format. - 7. The Q Register cannot be used explicitly as an operand for any Special Functions. It is defined implicitly within the functions. L = LOW Hi-Z = High Impedance H = HIGH = Exclusive OR X = Don't Care Parity = $SIO_3 \forall F_3 \forall F_2 \forall F_1 \forall F_0$ # Am2903 TWO ADDRESS OPERATION # **SOURCES:** Registers selected by A-address and B-address # DESTINATION: Register selected by B-address ### Am2903 THREE ADDRESS OPERATION ### **SOURCES:** Register selected by A-address and B-address ### **DESTINATION:** Register selected by C-address through use of RAM B-address MUX which is controlled by the clock ### Am2903 Three Address Control # Am2903 Three-Address Operation INTERCONNECTING THE SLICES Am2903/Am29203 16-Bit CPU with Ripple Carry. NOTE ISOLATING RESISTORS NOTE ISOLATING RESISTORS Connections for Word/Byte Operations (Am29203 Only). 
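The two- and three-address operations and the ripple-carry interconnection above can be sketched in a few lines of Python. This is an illustrative model only, not AMD code: the names `slice_add`, `ripple_add`, `two_address`, and `three_address` are hypothetical, the register file is a plain list, and only the function F = R Plus S Plus $C_n$ is modelled.

```python
# Illustrative model; all names here are assumptions, not data-sheet terms.
SLICE_BITS = 4
SLICES = 4  # four 4-bit slices form a 16-bit ALU


def slice_add(r, s, carry_in):
    """One 4-bit slice computing F = R plus S plus Cn; returns (F, Cn+4)."""
    total = r + s + carry_in
    return total & 0xF, total >> SLICE_BITS


def ripple_add(r, s, carry_in=0):
    """Chain the slices, LSS first; each Cn+4 feeds the next slice's Cn."""
    result, carry = 0, carry_in
    for k in range(SLICES):
        shift = k * SLICE_BITS
        f, carry = slice_add((r >> shift) & 0xF, (s >> shift) & 0xF, carry)
        result |= f << shift
    return result, carry


def two_address(regs, a, b):
    """Sources A and B; the result overwrites the register at the B address."""
    regs[b], _ = ripple_add(regs[a], regs[b])


def three_address(regs, a, b, c):
    """Sources A and B; the B-address mux presents C for the write."""
    regs[c], _ = ripple_add(regs[a], regs[b])


regs = [0] * 16
regs[1], regs[2] = 0x0FFF, 0x0001
three_address(regs, 1, 2, 3)  # R3 = R1 + R2, both sources preserved
two_address(regs, 1, 2)       # R2 = R1 + R2, the B source is overwritten
```

The only difference between the two modes in this sketch is the write address, which is the point of the B-address MUX: the extra destination costs a multiplexer, not a second pass through the ALU.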
# **EXAMPLE**

# Am2903/29203 MICROCODE

- Increment a register and output original value
- Byte swap

### Am2903

### **INCREMENT A REGISTER AND OUTPUT ITS ORIGINAL VALUE**

- This operation can be used in the macroinstruction fetch cycle for the PC (macro program counter).
- Assume the register is loaded with the macroinstruction address in main memory.
- Assume that Reg 15 is the PC.

#### Am2903 Solution

- Address A and Address B are both set to 15
- $\overline{E}_A$, $I_0$, $\overline{OE}_B$ (Source) are set to 000 (RAMAB) to select RAM outputs A and B as source operands
- $I_{4321}$ (Function) is set to 6 (increment) (INCRR)
- $C_{IN}$ is set to 1
- $\overline{OE}_B = 0$ places the original value of $R_{15}$ at $DB_{I/O}$
- $I_{8765}$ (Destination) is set to F and $\overline{OE}_Y$ to LOW to select F → Y and to write the new value F into the RAM (RAM or RAMEXT)

Notes:

1. $DA_{0:3}$ is input only on the Am2903, but is an I/O port on the Am29203.
2. On the Am29203, zero logic is connected to Y, after the $\overline{OE}_Y$ buffer.

### Am2903

### BYTE SWAP

To exchange the two halves of a 16-bit word. Assume again Register 15.
(Number chosen for no particular reason)

#### Am2903

- Address A and Address B are both set to 15
- $\overline{E}_A$, $I_0$, $\overline{OE}_B$ are set to 0 to select the A and B ports as operands
- $C_{IN} = C_{OUT}$
- $I_{4321}$ is set to 3 (Add)
- $I_{8765}$ is set to 9 for 2F → Y (Shift) and write to RAM
- $\overline{WE}$ LOW
- $\overline{OE}_Y$ LOW

Repeat for a total of four times, same as for the Am2901.

| SOURCE | FUNC | DEST | $C_{in}$ | 2904 |
|---|---|---|---|---|
| RAMAB | ADD | RAMUPL | $C_{out}$ | $SIO_{15}$ to $SIO_0$ |

#### BYTE SWAP - HARDWARE ASSIST

- There are several ways to handle a one-microcycle, hardware-assisted byte swap with the Am2903/Am29203.

Basically,

- Bring the data (already in a register) out RAMB to DB
- Pass it through buffers (inverting or noninverting)
- Bring it back in on either DA or Y
- The interconnections permute the data
- DA passes data through the ALU
  - medium speed version
  - use Am2958/59 (invert/true)
    - octal buffer/driver/receiver
    - tri-state output
- Y passes data directly to the RAM registers
  - high speed version
  - use Am25LS244

### SWAP

Byte swap requires an enable for the tri-state buffer drivers.

The code is:

| addr | 2910 INST | COND MUX | BRCH CNTR | SRCE | FUNC | DEST | RA ADDR | RB ADDR | $C_{in}$ | $\overline{OE}_Y$ | $\overline{IEN}$ | $\overline{E}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| n | CONT | # | # | DARAMB | INCRR | RAM | # | Rb | LOW | EN | EN | EN |
| OR | | | | | | | | | | | | |
| n | CONT | # | # | RAMAB | # | RAM | # | Rb | # | DIS | EN | EN |

# THE SPECIAL FUNCTIONS OF THE Am2903/Am29203

# Am2903/Am29203 SPECIAL FUNCTIONS AND FEATURES

- Parity
- Sign extension
- Sign magnitude <--> two's complement conversion
- Increment by 1 or 2
- Unsigned multiply
- Two's complement multiply
- Two's complement divide
- Single length normalize
- Double length
normalize

# SPECIAL FUNCTIONS: $I_0 = I_1 = I_2 = I_3 = I_4 = LOW$, $\overline{IEN} = LOW$

| $I_8$ | $I_7$ | $I_6$ | $I_5$ | Hex Code | Available On | Special Function | ALU Function | ALU Shifter Function | $SIO_3$ (Most Sig. Slice) | $SIO_3$ (Other Slices) | $SIO_0$ | Q Reg & Shifter Function | $QIO_3$ | $QIO_0$ | WRITE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| L | L | L | L | 0 | Am2903, Am29203 | Unsigned Multiply | F = S + $C_n$ if Z = L; F = R + S + $C_n$ if Z = H | Log. F/2→Y (Note 1) | Hi-Z | Input | F0 | Log. Q/2→Q | Input | Q0 | L |
| L | L | L | H | 1 | Am29203 | | | | | | | | | | |
| L | L | H | L | 2 | Am2903, Am29203 | Two's Complement Multiply | F = S + $C_n$ if Z = L; F = R + S + $C_n$ if Z = H | Log. F/2→Y (Note 2) | Hi-Z | Input | F0 | Log. Q/2→Q | Input | Q0 | L |
| L | L | H | H | 3 | Am29203 | | | | | | | | | | |
| L | H | L | L | 4 | Am2903, Am29203 | Increment by One or Two | F = S + 1 + $C_n$ | F→Y | Input | Input | Parity | Hold | Hi-Z | Hi-Z | L |
| L | H | L | H | 5 | Am2903, Am29203 | Sign/Magnitude-Two's Complement | F = S + $C_n$ if Z = L; F = $\overline{S}$ + $C_n$ if Z = H | F→Y (Note 3) | Input | Input | Parity | Hold | Hi-Z | Hi-Z | L |
| L | H | H | L | 6 | Am2903, Am29203 | Two's Complement Multiply, Last Cycle | F = S + $C_n$ if Z = L; F = S − R − 1 + $C_n$ if Z = H | Log. F/2→Y (Note 2) | Hi-Z | Input | F0 | Log. Q/2→Q | Input | Q0 | L |
| L | H | H | H | 7 | Am29203 | | | | | | | | | | |
| H | L | L | L | 8 | Am2903, Am29203 | Single Length Normalize | F = S + $C_n$ | F→Y | F3 | F3 | Hi-Z | Log. 2Q→Q | Q3 | Input | L |
| H | L | L | H | 9 | Am29203 | | | | | | | | | | |
| H | L | H | L | A | Am2903, Am29203 | Double Length Normalize and First Divide Op. | F = S + $C_n$ | Log. 2F→Y | R3 ∀ F3 | F3 | Input | Log. 2Q→Q | Q3 | Input | L |
| H | L | H | H | B | Am29203 | | | | | | | | | | |
| H | H | L | L | C | Am2903, Am29203 | Two's Complement Divide | F = S + R + $C_n$ if Z = L; F = S − R − 1 + $C_n$ if Z = H | Log. 2F→Y | R3 ∀ F3 | F3 | Input | Log. 2Q→Q | Q3 | Input | L |
| H | H | L | H | D | Am29203 | | | | | | | | | | |
| H | H | H | L | E | Am2903, Am29203 | Two's Complement Divide, Correction and Remainder | F = S + R + $C_n$ if Z = L; F = S − R − 1 + $C_n$ if Z = H | F→Y | F3 | F3 | Hi-Z | Log. 2Q→Q | Q3 | Input | L |
| H | H | H | H | F | Am29203 | | | | | | | | | | |

NOTES:

1. At the most significant slice only, the $C_{n+4}$ signal is internally gated to the $Y_3$ output.
2. At the most significant slice only, $F_3 \forall$ OVR is internally gated to the $Y_3$ output.
3. At the most significant slice only, $S_3 \forall F_3$ is generated at the $Y_3$ output.

L = LOW, H = HIGH, X = Don't Care, Hi-Z = High Impedance, $\forall$ = Exclusive OR, Parity = $SIO_3 \forall F_3 \forall F_2 \forall F_1 \forall F_0$

#### **PARITY**

- Parity is computed and available at $SIO_0$ when the destination field $I_{8-5}$ is one of {4, 5, 6, 7}.
- This corresponds to:

| Hex | Mnemonic | Device actions |
|---|---|---|
| 4 | RAM | F → Y, F → RAM, no Q activity |
| 5 | QD | F → Y, Q/2 → Q, no write to RAM |
| 6 | LOADQ | F → Y, F → Q, no write to RAM |
| 7 | RAMQ | F → Y, F → Q, Y → RAM |

- The computed equation is:

$$SIO_0 = F_0 \forall F_1 \forall F_2 \forall F_3 \forall F_4 \forall \dots \forall F_n \forall SIO_n$$

# SIGN EXTEND

- By varying the destination control field into different ALU slices, specifically by varying $I_5$, sign extension can be done across any number of devices in one microcycle.
- This corresponds to:

| $I_8$-$I_6$ | $I_5$ | HEX | MNEMONIC | DEVICE ACTIONS |
|---|---|---|---|---|
| 111 | 0 | E | SGNEXT | SIO0 → Y, Y → RAM, SIO0 → SIO3 |
| 111 | 1 | F | RAMEXT | F → Y, Y → RAM, F3 → SIO3 |

- Thus, by controlling $I_5$ separately to each slice when $I_8$-$I_6$ = 111:
  - If $I_5$ = 1, the slice behaves "normally"
  - If $I_5$ = 0, whatever is input on SIO0 will appear on all $Y_i$ and on SIO3
- Add a second $I_5$ bit position to the microword
- Change the .DEF file to a 5-bit destination field
- Example:

| 2910 INST | ALU SRCE | ALU FUNCT | ALU DEST | RA | RB | $C_{in}$ | $\overline{OE}_Y$ | $\overline{IEN}$ |
|---|---|---|---|---|---|---|---|---|
| CONT | RAMAQ | INCRR | SIGNEXT | Ra | # | LOW | EN | EN |

Where in the .DEF file SIGNEXT EQU B#11101

For all other destination codes, the last two bits are identical.
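The parity chain and the per-slice $I_5$ trick can be modelled as follows. This is a hedged sketch: the function names are hypothetical, $SIO_3$ of the most significant slice is assumed tied LOW for parity, and each slice's $SIO_3$ is assumed wired to $SIO_0$ of the next higher slice for sign extension.

```python
# Illustrative model; function names and the wiring choices are assumptions.

def slice_parity(f_nibble, sio3_in):
    """Per-slice parity: SIO0 = SIO3 xor F3 xor F2 xor F1 xor F0."""
    return (bin(f_nibble & 0xF).count("1") + sio3_in) & 1


def word_parity(word, slices=4):
    """Cascade MSS -> LSS: each slice's SIO0 drives the next lower SIO3."""
    sio = 0  # SIO3 of the most significant slice tied LOW (assumed)
    for k in reversed(range(slices)):
        sio = slice_parity((word >> (4 * k)) & 0xF, sio)
    return sio  # appears at SIO0 of the least significant slice


def sign_extend(f_word, i5_per_slice):
    """Destination I8-I6 = 111; i5_per_slice gives I5 per slice, LSS first.

    I5 = 1 (RAMEXT): Y = F and SIO3 = F3.
    I5 = 0 (SGNEXT): all Y bits and SIO3 copy SIO0.
    """
    y_word, sio = 0, 0
    for k, i5 in enumerate(i5_per_slice):
        f = (f_word >> (4 * k)) & 0xF
        if i5:
            y, sio = f, (f >> 3) & 1
        else:
            y = 0xF if sio else 0x0  # sio passes through unchanged
        y_word |= y << (4 * k)
    return y_word


# Extend an 8-bit value held in the two low slices to 16 bits.
extended_negative = sign_extend(0x0080, [1, 1, 0, 0])  # 0xFF80
extended_positive = sign_extend(0x007F, [1, 1, 0, 0])  # 0x007F
```

With `[1, 1, 0, 0]` the two low slices run RAMEXT and the two high slices run SGNEXT, which is the split that the 5-bit destination code B#11101 expresses in the microword.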
#### **BINARY NUMBER REPRESENTATION**

- There are three ways to represent signed binary numbers in a computer:
  - sign plus magnitude
  - sign plus two's complement
  - sign plus one's complement
- Sign plus magnitude is the way humans generally represent numbers in any base. The sign (or sign bit) is treated as a separate piece of information and must be handled by its own algorithm, much as a person handles the sign separately in base-ten addition, subtraction, multiplication, and division.
- With sign plus two's complement or sign plus one's complement, the same binary arithmetic operations that are used for the other bits in the number can be applied to the sign bit. This increases operational speed and minimizes specialized hardware.
- Sign plus magnitude and sign plus two's complement representations are briefly introduced here for use in ALU bit-slice manipulations.

# NUMBER REPRESENTATIONS (CONT'D)

- Sign bit coding for all representations is:
  - sign bit is 0 if the number is positive
  - sign bit is 1 if the number is negative
- Sign plus magnitude representation
  - The value to the right of the sign bit is the absolute magnitude of the number

Examples using an eight-bit register:

$$+6_{10} = 00000110_2$$

$$-6_{10} = 10000110_2$$

$$+13_{10} = 00001101_2$$

$$-11_{10} = 10001011_2$$

- The range of sign plus magnitude represented numbers is:
  - largest positive number: 011...111
  - largest negative number: 111...111
  - zero (double representation): 0000...000 and 1000...000

#### Sign plus two's complement representation

For positive numbers, this representation is identical to that for sign plus magnitude, i.e. $+7_{10} = 00000111_2$

For negative numbers, the bits following the sign bit are the two's complement of the magnitude of the number. The two's complement of a number is found by reversing (toggling) the 1's and 0's of the magnitude (giving the one's complement) and adding 1 in the LSB location.
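As a small illustration of the two encodings, the following Python sketch builds eight-bit sign plus magnitude and two's complement patterns and applies the "toggle the magnitude bits, then add 1" rule literally. The function names are hypothetical and the eight-bit width is only the example width used in these notes.

```python
# Illustrative helpers; names are assumptions made for this sketch.

def to_sign_magnitude(value, bits=8):
    """Sign bit followed by the absolute magnitude."""
    magnitude_bits = bits - 1
    if abs(value) >= 1 << magnitude_bits:
        raise ValueError("magnitude does not fit")
    sign = 1 if value < 0 else 0
    return (sign << magnitude_bits) | abs(value)


def to_twos_complement(value, bits=8):
    """Two's complement pattern obtained by wrapping modulo 2**bits."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value does not fit")
    return value & ((1 << bits) - 1)


def negative_by_rule(magnitude, bits=8):
    """Toggle the magnitude bits, add 1 at the LSB, then attach sign = 1."""
    if magnitude < 1:
        raise ValueError("rule applies to non-zero magnitudes")
    field = (1 << (bits - 1)) - 1       # mask for the bits after the sign
    ones_complement = ~magnitude & field
    twos_complement = (ones_complement + 1) & field
    return (1 << (bits - 1)) | twos_complement


minus_six_sm = format(to_sign_magnitude(-6), "08b")
minus_six_tc = format(to_twos_complement(-6), "08b")
minus_six_rule = format(negative_by_rule(6), "08b")
```

Both routes to the two's complement pattern agree, which is the property that lets the ALU treat the sign bit like any other bit.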
For example, using an eight-bit register, for $-6_{10}$:

- magnitude: $6_{10} = 0000110_2$
- one's complement $= 1111001_2$
- two's complement $= 1111010_2$
- $-6_{10} = 11111010_2$

The range of (sign plus) two's complement represented numbers is:

- largest positive number: 0111...111
- largest negative number: 1000...000
- zero (single representation): 0000...000

All AMD ALUs currently use sign plus two's complement notation as their primary representation.

# SIGN MAGNITUDE TO/FROM TWO'S COMPLEMENT CONVERSION

- Word to be converted is placed on the S-Port of the ALU
  - RAM B
  - DB input
- Carry-in = Z, by connecting the Z pin to carry-in
- Z is the sign bit and indicates a positive or negative number
- Overflow if the number is the largest <u>negative</u> number
- $F = \langle B \rangle + C_{in}$ if Z = 0
- $F = \langle \overline{B} \rangle + C_{in}$ if Z = 1
- $Y_3$ (MSS) $= (S_3 \forall F_3)$ (MSS)

#### **EXAMPLES**

1. B = 10001010, sign magnitude for −10

| Step | Value |
|---|---|
| Z = 1, therefore $\overline{B}$ | 01110101 |
| $F_3 \forall S_3 \rightarrow Y_3$ of MSS ($0 \forall 1 \rightarrow 1$) | 11110101 |
| $+C_n = 1$ (add 1 since Z = 1) | 11110110 |

Check: $-128 + 64 + 32 + 16 + 0 + 4 + 2 + 0 = -10$

2. B = 11110110, two's complement for −10

| Step | Value |
|---|---|
| Z = 1, therefore $\overline{B}$ | 00001001 |
| $F_3 \forall S_3 \rightarrow Y_3$ of MSS | 10001001 |
| $+C_n = 1$ (add 1) | 10001010 |

Voila!

# **EXAMPLE MICROCODE**

| 2910 | SOURCE | FUNCT | DEST | RA | RB | $C_{in}$ |
|---|---|---|---|---|---|---|
| CONT | RAMAB | SPECL | sign magnitude/two's complement (special function 5) | # | R0 | Z |

#### **INCREMENT BY 1 OR 2**

- Although it is possible to increment by 1 without going to a special function, it is not possible to increment by 2.
- In a byte-addressable memory, both byte addressing and word addressing capability may be desirable.
  - byte addressing: $\langle R0 \rangle \leftarrow \langle R0 \rangle + 1$
  - word addressing: $\langle PC \rangle \leftarrow \langle PC \rangle + 2$
- The special function "INCRMNT" provides this capability.
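The two special functions just described can be checked with a short Python model. It is a sketch under stated assumptions: the function names are hypothetical, $C_n$ is taken to be wired to Z for the conversion, the overflow flag is not modelled, and the default 16-bit width is an arbitrary choice.

```python
# Illustrative model; names and default widths are assumptions.

def convert_sign_magnitude_twos_complement(word, bits=16):
    """Special function 5: F = S + Cn (Z = 0) or F = not S + Cn (Z = 1).

    Cn is assumed tied to Z, and Y3 of the MSS is replaced by S3 xor F3.
    The same operation converts in either direction.
    """
    mask = (1 << bits) - 1
    msb = 1 << (bits - 1)
    s = word & mask
    z = 1 if s & msb else 0                  # sign of the operand
    f = ((~s & mask if z else s) + z) & mask
    y_msb = (s ^ f) & msb                    # S3 xor F3 at the MSS
    return (f & ~msb & mask) | y_msb


def increment(s, carry_in, bits=16):
    """Special function 4: F = S + 1 + Cn (Cn = 0 adds 1, Cn = 1 adds 2)."""
    return (s + 1 + carry_in) & ((1 << bits) - 1)


to_twos = convert_sign_magnitude_twos_complement(0b10001010, bits=8)
to_sign_mag = convert_sign_magnitude_twos_complement(0b11110110, bits=8)
next_byte = increment(0x0100, 0)   # byte addressing: +1
next_word = increment(0x0100, 1)   # word addressing: +2
```

The two eight-bit calls reproduce Examples 1 and 2 above; the increment calls show how the carry-in alone selects the step size.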
| Hex | Mnemonic | ALU function |
|---|---|---|
| 4 | INCRMNT | F = S + 1 + $C_{in}$ |

| 2910 | SOURCE | FUNCT | DEST | RA | RB | $C_{in}$ |
|---|---|---|---|---|---|---|
| CONT | RAMAB | SPECL | INCRMNT | # | R1 | ONE |

$C_{in}$ = 0 increments by 1; $C_{in}$ = 1 increments by 2.

# MULTIPLICATION

#### B × A, UNSIGNED BINARY

| Item | Binary | Comment |
|---|---|---|
| Multiplicand A | 1001101 | 77 |
| Multiplier B | 0001011 | 11 |
| Partial product 1 | 1001101 | 1 in first position (A × 1) |
| Partial product 2 | 1001101 | 1 in second position (A × 2) |
| Partial product 3 | 0000000 | 0 in third position |
| Partial product 4 | 1001101 | 1 in fourth position (A × 8) |
| Partial products 5 to 7 | 0000000 | 0 in remaining positions |
| Sum | 0001101001111 | 847 |

Weights of the 1 bits in the sum: 512 + 256 + 64 + 8 + 4 + 2 + 1 = 847

#### RULES (MANUAL OPERATION):

- For each 1 in multiplier B, add the multiplicand <u>A</u>, shifted to align its LSB with that 1 of <u>B</u>.
- For each 0 in multiplier B, add zeros, also aligned.
- The result of an N × N multiply is 2N bits long.

# MULTIPLICATION: TWO'S COMPLEMENT - "METHOD OF FLORES"

#### Case #1: B × A, B positive, A positive

The same algorithm as unsigned multiply.

#### Case #2: B × A, B positive, A negative

| Item | Sign.Bits |
|---|---|
| A (−13) | 1.10011 |
| B (+11) | 0.01011 |
| A × 1 | 1.1111110011 |
| A × 2 | 1.1111100110 |
| 0 | 0.0000000000 |
| A × 8 | 1.1110011000 |
| 0 | 0.0000000000 |
| Sum (ignore $C_{out}$) | 1.1101110001 |

- Rules:
  - Expand A to 2N in length: A = 1.1111110011
  - Proceed as for unsigned multiply
  - Use the sign bit of A as MSB for the first partial product

#### Case #3: B × A, B negative, A positive

| Item | Sign.Bits |
|---|---|
| A (+13) | 0.01101 |
| B (−11) | 1.10101 |
| A × 1 | 0.0000001101 |
| 0 | 0.0000000000 |
| A × 4 | 0.0000110100 |
| 0 | 0.0000000000 |
| A × 16 | 0.0011010000 |
| Sum | 0.0100010001 |
| Correction (two's complement of A, aligned at the sign position) | 1.10011 |
| Result | 1.1101110001 |

- Rules:
  - Multiply directly as for unsigned.
  - Form the correction at the end by adding the two's complement of A to the result.

#### Case #4: B × A, B negative, A negative

| Item | Sign.Bits |
|---|---|
| A (−13) | 1.10011 |
| B (−11) | 1.10101 |
| A × 1 | 1.1111110011 |
| 0 | 0.0000000000 |
| A × 4 | 1.1111001100 |
| 0 | 0.0000000000 |
| A × 16 | 1.1100110000 |
| Sum (ignore carries out) | 1.1011101111 |
| Correction (two's complement of A, aligned at the sign position) | 0.01101 |
| Result | 0.0010001111 |

- Rules:
  - Expand A to 2N in length.
  - Proceed as for B negative and A positive.

#### EXAMINATION OF THE FOUR CASES SHOWS THE COMMON PATTERN FOR THE MULTIPLY ALGORITHM:

- Expand A; the left bits depend upon the sign bit of A.
- Multiply by adding A or 0 to the running sum (partial product), depending upon the LSB in B.
- "Correct" the result using the two's complement of A, depending upon the sign bit of B.

Therefore:

- The bits of B must be accessible.
- A method of conditional insert for the left-most bits is needed.
- A source for the conditional operand (A or 0) is needed.

# Am2901 MULTIPLY ALGORITHM

- Clear $R_B$
- Load multiplicand into $R_A$
- Load multiplier into Q
- LSB shifted out of Q ($Q_{out}$) selects $R_B \leftarrow R_B$ or $R_B \leftarrow R_B + R_A$
- Shift $R_B$, Q down
  - LSB of $R_B$ input to MSB of Q
  - $F_3 \forall$ OVR into MSB of $R_B$ to generate the proper conditional insert for the left-most bits

# Am2901 MULTIPLY ALGORITHM (CONT'D)

- The last $Q_{out}$ is the sign of the multiplier
- $Q_{out}$ selects $R_B \leftarrow R_B$ or $R_B \leftarrow R_B - R_A$
- Shift $R_B$ down
- The result is in $R_B$ and Q

Am2901 Multiply Functional Diagram

16 BIT MULTIPLY - RIPPLE CARRY

#### Am2903/Am29203 MULTIPLY ALGORITHM

- The Am2903/Am29203 improves this operation by providing internally what had to be added externally to the Am2901.
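The common pattern above (conditional add, down shift with $F_3 \forall$ OVR as the fill bit, and a conditional subtract on the last step) can be exercised with a bit-level Python model. This is a sketch, not device microcode: the function and variable names are hypothetical, and the word length `n` is a parameter so that the six-bit examples in these notes can be checked alongside 16-bit operands.

```python
# Illustrative shift-and-add model; names are assumptions for this sketch.

def add_with_flags(x, y, carry_in, n):
    """n-bit add returning (F, carry-out, OVR) as ALU flags would."""
    mask = (1 << n) - 1
    total = (x & mask) + (y & mask) + carry_in
    f = total & mask
    sx, sy, sf = (x >> (n - 1)) & 1, (y >> (n - 1)) & 1, (f >> (n - 1)) & 1
    ovr = int(sx == sy and sf != sx)
    return f, total >> n, ovr


def twos_complement_multiply(multiplicand, multiplier, n=16):
    """Signed multiply returning the 2n-bit product as a Python int."""
    mask = (1 << n) - 1
    r_a = multiplicand & mask          # multiplicand register
    q = multiplier & mask              # multiplier held in Q
    r_b = 0                            # running partial product, cleared
    for step in range(n):
        last = step == n - 1
        if q & 1:
            if last:                   # multiplier sign bit set: R_B - R_A
                f, _, ovr = add_with_flags(r_b, ~r_a & mask, 1, n)
            else:                      # ordinary step: R_B + R_A
                f, _, ovr = add_with_flags(r_b, r_a, 0, n)
        else:
            f, ovr = r_b, 0            # pass R_B unchanged
        fill = ((f >> (n - 1)) & 1) ^ ovr        # F3 xor OVR of the MSS
        q = (q >> 1) | ((f & 1) << (n - 1))      # LSB of F enters MSB of Q
        r_b = (f >> 1) | (fill << (n - 1))       # down shift with true sign
    product = (r_b << n) | q
    if product >> (2 * n - 1):                   # reinterpret as signed
        product -= 1 << (2 * n)
    return product


case_2 = twos_complement_multiply(-13, 11, n=6)    # B positive, A negative
case_4 = twos_complement_multiply(-13, -11, n=6)   # both negative
```

The fill bit is the true sign of the sum even when the n-bit add overflows, which is why the partial product never needs an extra guard bit.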
#### **UNSIGNED MULTIPLY:**

- Initially $R_0 = 0$ (B ADR, any $R_b$)
- Multiplicand in $R_1$ (A ADR, any $R_a$)
- Multiplier in $R_2$ (any $R_i$)
- Transfer $R_2 \rightarrow Q$
- Execute unsigned multiply 16 times (counter = 15)
- <u>17 microcycles</u> as shown (assuming all registers except Q are initialized and the result is left in $R_b$ and Q)

# Am29203/2903 UNSIGNED MULTIPLY (16x16)

| Z | ALU | DESTINATION | $S_3$ in MSS |
|---|---|---|---|
| L | PASS | F/2 → RAM; Q/2 → Q | Internal operation: $C_{n+4}$ |
| H | A + B | F/2 → RAM; Q/2 → Q | Internal operation: $C_{n+4}$ |

# Am2903/Am29203 UNSIGNED MULTIPLY

- The flow chart for unsigned multiply is given on the facing page. The code (using the Am2910 and Am2903/Am29203) would be:

| addr | 2910 INST | COND MUX | BRCH CNTR | ALU SRCE | ALU FUNC | ALU DEST | RA ADDR | RB ADDR | $C_{in}$ | $\overline{OE}_Y$ | $\overline{IEN}$ | ROTATE CONNECTIONS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| n | LDCT | # | 15 | RAMAB | INCRR | LOADQ | Ri | # | LOW | # | EN | # |
| n+1 | RPCT | # | n+1 | LOW | SPECL | MULT | Ra | Rb | LOW | EN | EN | SIO0 to QIO3 |

This equates to:

- n: Load the Q register with the multiplier. Load the 2910 counter with 15 (one less than the actual count).
- n+1: Perform the special function 16 times. The result is in Rb and Q.

# Am2903/29203 TWO'S COMPLEMENT MULTIPLY ALGORITHM:

The flowchart for a 16-bit two's complement multiply is on the following page.

- The multiplier must be loaded into the Q register.
  - The flowchart shows register R2 → Q
  - The code shows any register Ri
  - Your code would match your application!
- The multiplicand must be in another register.
  - The flowchart shows register R1
  - The code shows any register Ra
- The result ends up in a RAM register (MSH) and the Q register (LSH).
  - The flowchart shows register R0
  - The code shows any register Rb

TWO'S COMPLEMENT MULTIPLY

# TWO'S COMPLEMENT MULTIPLY (CONT'D)

- The interconnection for two's complement multiply is the same as for unsigned multiply, except for the last step (the "correction" step).
- For unsigned multiply, $C_{n+4}$ is internally shifted into Y3 of the MSS.
- For two's complement multiply, N $\forall$ OVR (that is, $F_3 \forall$ OVR of the MSS) is internally shifted into Y3 of the MSS.
- The code is:

| addr | 2910 INST | COND MUX | BRCH CNTR | ALU SRCE | ALU FUNC | ALU DEST | RA ADDR | RB ADDR | $C_{in}$ | $\overline{OE}_Y$ | $\overline{IEN}$ | ROTATE CONNECTIONS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| n | LDCT | # | 14 | RAMAB | INCRR | LOADQ | Ri | # | LOW | # | EN | # |
| n+1 | RPCT | # | n+1 | LOW | SPECL | TWOMULT | Ra | Rb | LOW | EN | EN | SIO0 to QIO3 |
| n+2 | CONT | # | # | LOW | SPECL | TWOLAST | Ra | Rb | Z | EN | EN | SIO0 to QIO3 |

# TWO'S COMPLEMENT MULTIPLY (Cont'd)

#### Algorithm

| FUNCTION | $C_N$ | Z | ALU | DESTINATION | $S_3$ in MSS |
|---|---|---|---|---|---|
| MULTIPLY | L | L | PASS | Down shifts: F/2 → RAM; Q/2 → Q | Internal operation: SIGN $\forall$ OVR |
| MULTIPLY | L | H | A + B | Down shifts: F/2 → RAM; Q/2 → Q | Internal operation: SIGN $\forall$ OVR |
| LAST STEP | Z | L | PASS | Down shifts: F/2 → RAM; Q/2 → Q | Internal operation: SIGN $\forall$ OVR |
| LAST STEP | Z | H | B − A | Down shifts: F/2 → RAM; Q/2 → Q | Internal operation: SIGN $\forall$ OVR |

## Am2903 CLASS EXERCISE

- Multiply R1 by R2
- Put the result in R3 and R4
- Use two's complement multiply with the Am2903

# SOLUTION

| ADDR | 2910 INST | COND MUX | BRCH CNTR | ALU SRCE | ALU FUNCT | ALU DEST | RA ADDR | RB ADDR | $C_{IN}$ | $\overline{OE}_Y$ | $\overline{IEN}$ | ROT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | CONT | # | # | RAMAB | LOW | RAM | # | R3 | LOW | EN | EN | # |
| 2 | LDCT | # | 14 | RAMAQ | INCRR | LOADQ | R2 | # | LOW | # | EN | # |
| 3 | RPCT | # | 3 | LOW | SPECL | TWOMULT | R1 | R3 | LOW | EN | EN | SIO0-QIO3 |
| 4 | CONT | # | # | LOW | SPECL | TWOLAST | R1 | R3 | Z | EN | EN | SIO0-QIO3 |
| 5 | CONT | # | # | RAMAQ | INCRS | RAM | # | R4 | LOW | EN | EN | # |

# SINGLE LENGTH NORMALIZE

#### SINGLE LENGTH/DOUBLE LENGTH NORMALIZE

- Normalization is a technique for referencing a floating point number to a fixed radix (binary) point.
- Used in fixed to floating point conversion.
- Double length normalize requires an extra microcycle per loop or an external counter to shift the exponent count.
- Example:

$$0.0031 \times 10^5 = 0.3100 \times 10^3 \quad \text{(base 10)}$$

$$0.0011 \times 2^7 = 0.1100 \times 2^5 \quad \text{(base 2)}$$

For binary numbers, the normalization technique consists of shifting the mantissa left until the two bits immediately adjacent to the binary point are of opposite polarity.

## NORMALIZE ALGORITHM

- Shift left (zero fill) until $Q_n$ (MSB) $\neq Q_{n-1}$.
- Increment the exponent register for each shift. This approach counts shifts. This value must be subtracted from the actual exponent value for the proper representation.

In the figures below, the radix point lies between bits 15 and 14. Bits 15-12 are in Device 4 (MSS), bits 11-8 in Device 3, bits 7-4 in Device 2, and bits 3-0 in Device 1 (LSS).

| Bit | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Q REGISTER | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 |

a) Unnormalized Positive Number.

b) Normalized Positive Number.

| Bit | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Q REGISTER | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 |

a) Unnormalized Negative Single Length Number.

b) Normalized Negative Single Length Number.

# Am2903 SINGLE LENGTH NORMALIZE (SLN)

#### Set up:

- Define exponent register.
- Put unnormalized number into Q.
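The normalize algorithm stated above (shift left with zero fill until the two bits next to the radix point differ, counting shifts) can be sketched in Python and applied to the two unnormalized Q register patterns shown. The function name `normalize` is hypothetical, and the sketch models only the arithmetic, not the condition-code tests or the correction step used in the microcode.

```python
# Illustrative model of single length normalization; names are assumptions.

def normalize(q, n=16):
    """Return (normalized word, number of left shifts) for an n-bit word."""
    mask = (1 << n) - 1
    q &= mask
    if q == 0:
        raise ValueError("zero cannot be normalized")
    shifts = 0
    # Stop when Q(n-1) differs from Q(n-2), the bits around the radix point.
    while ((q >> (n - 1)) & 1) == ((q >> (n - 2)) & 1):
        q = (q << 1) & mask        # zero enters at Q0
        shifts += 1
    return q, shifts


positive = normalize(0b0000111110010011)   # unnormalized positive number
negative = normalize(0b1111000001101101)   # unnormalized negative number
```

Both example words need three shifts, so in each case 3 would be subtracted from the exponent.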
## Operation:

- When SLN is executed:
  - Q is shifted left one bit
  - Zero is loaded into $Q_0$
  - $C_{n+4} = Q_3 \oplus Q_2$ (MSS)
  - $\mathrm{OVR} = Q_2 \oplus Q_1$ (MSS)
  - $Z = 1$ if $Q = 0$ (cannot normalize)
  - $\langle B \rangle + C_n \rightarrow \langle B \rangle$ (use for exponent)

#### ALGORITHM EXPLANATION

1. Number must be in Q for Am2903/29203.
2. Cannot normalize a zero. With IEN = high, execute SLN and store condition codes on this step only.
3. If result of step 2 is zero, abort.
4. If already normalized, quit. Test result of step 2. Don't latch codes while clearing exponent register.
5. Again test result of step 2 to see if one shift will normalize the number. Simultaneously shift with SLN. If TEST = TRUE, we are done.
6. Continue to test and shift until done. Because we execute the shift and test simultaneously, we in fact execute one more shift than we require.
7. Correction step. Downshift and restore sign bit from condition code register. Decrement exponent. This requires two microcycles.

# SINGLE LENGTH NORMALIZE MICROCODE

| Step | 2910 INST | CND MUX | BRCH/CNTR | ALU SRCE | ALU FUNCT | ALU DEST | RA | RB | Cin | OEY | IEN | | STORE STATUS | COMMENT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | CONT | # | # | RAMAQ | INCRR | LOADQ | Ra | # | LOW | OFF | EN | # | NO | LOAD Q |
| 2 | CONT | | # | RAMAB | SPECL | | | | LOW | | OFF | | | DO SLN |
| 3 | CJP | ZERO | ABORT | # | # | # | # | # | # | OFF | OFF | # | NO | ZERO ? |
| 4 | CJP | CARY | DONE | RAMAQ | LOW | RAM | # | Rb | LOW | EN | EN | # | NO | CLR REG/DONE? |
| 5 | CJP | OVR | DONE | RAMAB | SPECL | SLN | # | Rb | ONE | EN | EN | 0-Q0 | YES | DO SLN |
| 6 | CJP | OVR | 6 | RAMAB | SPECL | SLN | # | Rb | ONE | EN | EN | 0-Q0 | YES | DO SLN |
| 7 | CONT | # | # | # | # | QD | # | # | LOW | OFF | EN | SIGN- | | DOWN SHIFT |
| 8 | CONT | # | # | RAMAB | SPECL | DECRMNT | # | Rb | ONE | EN | EN | # | | DECR EXPONENT USING SPECIAL FUNCTION |

# Am2903 DOUBLE LENGTH NORMALIZE (DLN)

## Set-up:

- MSH in RAM for B data out
- LSH in Q
- Define exponent register

#### Am29203 SPECIAL FUNCTIONS

The Am29203 will include all of the special functions on the Am2903 plus:

- Decrement by 1 or 2
- Single cycle BCD add
- Single cycle BCD subtracts (R - S and S - R)
- BCD --> binary conversion
- Binary --> BCD conversion
- BCD divide by two adjust (performed after a downshift)

#### Am29203

#### **SPECIAL FUNCTIONS (Note 7)**

| I<sub>8</sub> I<sub>7</sub> I<sub>6</sub> I<sub>5</sub> (Hex) | I<sub>4</sub> | I<sub>3</sub> I<sub>2</sub> I<sub>1</sub> I<sub>0</sub> (Hex) | Special Function | ALU Function | ALU Shifter Function | SIO<sub>3</sub>, Most Sig Slice | SIO<sub>3</sub>, Other Slices | SIO<sub>0</sub> | Q Reg & Shifter Function | QIO<sub>3</sub> | QIO<sub>0</sub> | WRITE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | L | 0 | Unsigned Multiply | F = S + C<sub>n</sub> if Z = L<br>F = R + S + C<sub>n</sub> if Z = H | Log F/2 → Y (Note 1) | Hi-Z | Input | F<sub>0</sub> | Log Q/2 → Q | Input | Q<sub>0</sub> | L |
| 1 | L | 0 | BCD to Binary Conversion | (Note 4) | Log F/2 → Y | Input | Input | F<sub>0</sub> | Log Q/2 → Q | Input | Q<sub>0</sub> | L |
| 1 | H | 0 | Multiprecision BCD to Binary | (Note 4) | Log F/2 → Y | Input | Input | F<sub>0</sub> | Hold | Hi-Z | Q<sub>0</sub> | L |
| 2 | L | 0 | Two's Complement Multiply | F = S + C<sub>n</sub> if Z = L<br>F = R + S + C<sub>n</sub> if Z = H | Log F/2 → Y (Note 2) | Hi-Z | Input | F<sub>0</sub> | Log Q/2 → Q | Input | Q<sub>0</sub> | L |
| 3 | L | 0 | Decrement by One or Two | F = S - 2 + C<sub>n</sub> | F → Y | Hi-Z | Hi-Z | Parity | Hold | Hi-Z | Hi-Z | L |
| 4 | L | 0 | Increment by One or Two | F = S + 1 + C<sub>n</sub> | F → Y | Input | Input | Parity | Hold | Hi-Z | Hi-Z | L |
| 5 | L | 0 | Sign/Magnitude Two's Complement | F = S + C<sub>n</sub> if Z = L<br>F = S̄ + C<sub>n</sub> if Z = H | F → Y (Note 3) | Input | Input | Parity | Hold | Hi-Z | Hi-Z | L |
| 6 | L | 0 | Two's Complement Multiply, Last Cycle | F = S + C<sub>n</sub> if Z = L<br>F = S - R - 1 + C<sub>n</sub> if Z = H | Log F/2 → Y (Note 2) | Hi-Z | Input | F<sub>0</sub> | Log Q/2 → Q | Input | Q<sub>0</sub> | L |
| 7 | L | 0 | BCD Divide by Two | (Note 4) | F → Y | Hi-Z | Hi-Z | Parity | Hold | Hi-Z | Hi-Z | L |
| 8 | L | 0 | Single Length Normalize | F = S + C<sub>n</sub> | F → Y | F<sub>3</sub> | F<sub>3</sub> | Hi-Z | Log 2Q → Q | Q<sub>3</sub> | Input | L |
| 9 | L | 0 | Binary to BCD Conversion | (Note 5) | Log 2F → Y | F<sub>3</sub> | F<sub>3</sub> | Input | Log 2Q → Q | Q<sub>3</sub> | Input | L |
| 9 | H | 0 | Multiprecision Binary to BCD | (Note 5) | Log 2F → Y | F<sub>3</sub> | F<sub>3</sub> | Input | Hold | Hi-Z | Input | L |
| A | L | 0 | Double Length Normalize and First Divide Op | F = S + C<sub>n</sub> | Log 2F → Y | R<sub>3</sub> ∀ F<sub>3</sub> | F<sub>3</sub> | Input | Log 2Q → Q | Q<sub>3</sub> | Input | L |
| B | L | 0 | BCD Add | F = R + S + C<sub>n</sub> BCD (Note 6) | F → Y | 0 | 0 | Hi-Z | Hold | Hi-Z | Hi-Z | L |
| C | L | 0 | Two's Complement Divide | F = S + R + C<sub>n</sub> if Z = L<br>F = S - R - 1 + C<sub>n</sub> if Z = H | Log 2F → Y | R<sub>3</sub> ∀ F<sub>3</sub> | F<sub>3</sub> | Input | Log 2Q → Q | Q<sub>3</sub> | Input | L |
| D | L | 0 | BCD Subtract | F = R - S - 1 + C<sub>n</sub> BCD (Note 6) | F → Y | 0 | 0 | Hi-Z | Hold | Hi-Z | Hi-Z | L |
| E | L | 0 | Two's Complement Divide Correction and Remainder | F = S + R + C<sub>n</sub> if Z = L<br>F = S - R - 1 + C<sub>n</sub> if Z = H | F → Y | F<sub>3</sub> | F<sub>3</sub> | Hi-Z | Log 2Q → Q | Q<sub>3</sub> | Input | L |
| F | L | 0 | BCD Subtract | F = S - R - 1 + C<sub>n</sub> BCD (Note 6) | F → Y | 0 | 0 | Hi-Z | Hold | Hi-Z | Hi-Z | L |

Notes:

1. At the most significant slice only, the C<sub>n+4</sub> signal is internally gated to the Y<sub>3</sub> output.
2. At the most significant slice only, F<sub>3</sub> ∀ OVR is internally gated to the Y<sub>3</sub> output.
3. At the most significant slice only, S<sub>3</sub> ∀ F<sub>3</sub> is generated at the Y<sub>3</sub> output.
4. On each slice, F = S if magnitude of S<sub>0-3</sub> is less than 8 and F = S minus 3 if magnitude of S<sub>0-3</sub> is 8 or greater.
5. On each slice, F = S if magnitude of S<sub>0-3</sub> is less than 5 and F = S plus 3 if magnitude of S<sub>0-3</sub> is 5 or greater. Addition is modulo 16.
6. Additions and subtractions are BCD adds and subtracts. Results are undefined if R or S are not in valid BCD format.
7. The Q Register cannot be used explicitly as an operand for any Special Functions. It is defined implicitly within the functions.

L = LOW, H = HIGH, X = Don't Care, Hi-Z = High Impedance, ∀ = Exclusive OR

Parity = SIO<sub>3</sub> ∀ F<sub>3</sub> ∀ F<sub>2</sub> ∀ F<sub>1</sub> ∀ F<sub>0</sub>

## DECREMENT by 1 or 2

- It is not possible to decrement by 1 or 2 without going to a special function except by storing "1" or "2" in a register.
- In a byte-addressable memory both byte addressing and word addressing capability are desirable for address decrement (e.g. stack operations).
  - Byte addressing: $\langle R0 \rangle \leftarrow \langle R0 \rangle - 1$
  - Word addressing: $\langle SP \rangle \leftarrow \langle SP \rangle - 2$
- The special function "DECRMNT" provides this capability.
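The decrement special function is listed in the special-function table as F = S - 2 + Cn, so the carry input chooses between a decrement by two (Cn = 0) and a decrement by one (Cn = 1). The following is a minimal Python sketch of that arithmetic only, assuming a 16-bit word built from four slices; the names `MASK16`, `decrmnt` and `step_down` are illustrative and do not come from the course material or any AMD tool.

```python
MASK16 = 0xFFFF  # assumption: 16-bit data path (four 4-bit slices)


def decrmnt(s: int, cin: int) -> int:
    """Model F = S - 2 + Cin, wrapped to the assumed 16-bit word."""
    if cin not in (0, 1):
        raise ValueError("cin must be 0 or 1")
    return (s - 2 + cin) & MASK16


def step_down(address: int, word_access: bool) -> int:
    """Hypothetical helper: step an address down by 2 for word
    accesses (Cin = 0) or by 1 for byte accesses (Cin = 1)."""
    return decrmnt(address, 0 if word_access else 1)


if __name__ == "__main__":
    print(hex(step_down(0x1000, word_access=False)))
    print(hex(step_down(0x1000, word_access=True)))
```

The same register can therefore serve as a byte pointer or a word pointer, with only the carry-in field of the microword changing between the two cases.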
- DECRMNT: $F = S - 2 + C_{in}$

| 2910 | SOURCE | FUNCT | DEST | RA | RB | Cin |
|------|--------|-------|---------|----|----|-----|
| CONT | RAMAB | SPECL | DECRMNT | | R2 | ONE |

#### BINARY/BCD CONVERSION

- BCD number is always in a specified RAM register
- Binary number is always in the Q-register
- SIO<sub>0</sub> is connected to QIO<sub>n</sub>
- For binary to BCD:
  - Binary number must not exceed the largest value that can be represented in BCD
  - Binary number is loaded into Q and Ra is cleared
  - BIN-to-BCD is executed N times for an N-bit number

```
   2910  CND  BRCH  A L U
   INST  MUX  CNTR  SRCE   FUNCT  DEST     RA  RB  Cin  ROT    COMMENT
1  LDCT  #    15    RAMAQ  INCRR  LOADQ    RA  #   LOW  #      PUT BINARY IN Q
2  CONT  #    #     RAMAQ  LOW    RAM      #   Rb  LOW  #      CLEAR RAM Rb
3  RPCT  #    3     RAMAB  SPECL  BIN.BCD  #   Rb  LOW  Q3-SO  PERFORM CONVERSION
```

## BINARY/BCD CONVERSION (cont)

## For BCD to binary:

- BCD number is loaded into Ra, Q is cleared
- Ra and Q are downshifted one bit (a precorrection step)
- BCD-to-BIN is executed N-1 times for an N-bit number

```
   2910  CND  BRCH  A L U
   INST  MUX  CNTR  SRCE   FUNCT  DEST     RA  RB  Cin  ROT    COMMENT
1  CONT  #    #     RAMAQ  LOW    LOADQ    #   #   LOW  #      CLEAR Q
2  LDCT  #    14    RAMAQ  INCRR  RAMQDL   Ra  #   LOW  #      INITIAL DOWNSHIFT
3  RPCT  #    3     RAMAB  SPECL  BCD.BIN  #   Ra  LOW  SO-Q3  PERFORM CONVERSION
```

# Microcode Flow Chart of BCD Conversion

[Figure: flow charts for BINARY TO BCD and BCD TO BINARY conversion. In the BCD-to-binary chart the ALU performs the BCD TO BIN CONVERSION and the destination is F/2 → RAM; Q/2 → Q.]

# SIGNIFICANT SPEED IMPROVEMENT ON Am2903

| | Am2903 (MIL MAX) | Am2903A (PROJECTED MIL MAX) | % FASTER |
|------------------|-----|-----|-----|
| ADDRESS TO G,P | 84 | 57 | 32% |
| CN TO Z | 65 | 40 | 38% |
| ADDRESS TO Z | 126 | 75 | 41% |
| ADD CYCLE TIME | 181 | 129 | 29% |
| LOGIC CYCLE TIME | 152 | 100 | 34% |

EXPANDED MEMORY FOR **ALU REGISTER EXPANSION**

# **EXPANDED MEMORY**

- The Am2901, Am2903 and Am29203 each contain only 16 scratchpad registers plus the Q register.
- Some applications require more than 17 registers.
- The Am2903 and the Am29203 register set can easily be expanded.
  - Use the Am29705 RAM with the Am2903
  - Use the Am29707 RAM with the Am29203

#### Am29705

## 16-WORD BY 4-BIT, 2-PORT RAM

- Distinctive characteristics
  - Two output ports with latches (buffers)
  - Separate data input port
  - Non-inverting data
  - Independent three-state outputs
  - Configurable in either "transparent" or "edge triggered" mode
  - Designed for Am2903 register expansion

# Am29705

[Figure: Am2903 - Data Bus Cascading]

[Figure: Am2903 - RAM Address Cascading, TWO ADDRESS OPERATION]

#### Am2903 SCRATCHPAD EXPANSION

- Am29705 functionally identical to Am2903 registers
- Am29715A PROM stores constants, masks
- Five data busses shown
- Three-address architecture shown

# **Expanded Memory**

# Am29705A 16-WORD BY 4-BIT TWO PORT RAM

| Commercial maximum: | Am29705 | Am29705A |
|---------------------------------------|----|----|
| access time | 53 | 30 |
| LE to YA/YB | 32 | 20 |
| A-latch reset | 35 | 20 |
| address set up before latch closes | 45 | 15 |

# Am29707 -- 28 Pin

[Figure: Am29203 - Am29707]

## CLASS EXERCISE

Turn to the Am2903/Am29203 exercises in the ED2900A Exercise and Laboratory Manual and do numbers 1 through 24.

# **Evaluation Board Experiments**

Do Am29203 laboratory exercises in Manual.
# Selecting the #1 Am2900 Microprocessor Slice | TIMING COMPARISON<br>(GUARANTEED COMMERCIAL) | | | | | | |----------------------------------------------|-------|-------|------|-------------|--| | ADDRESS TO: | 2901B | 2901C | 2903 | 2903A/29203 | | | Y | 60 | 40 | 99 | 68 | | | GP | 50 | 37 | 81 | 52 | | | F=Ø | 70 | 40 | 123 | 72 | | | | | | | | | | D TO: | | | | | | | Y | 38 | 30 | 87 | 59 | | | Cn+4 | 40 | 30 | 60 | 49 | | | F=Ø | 48 | 38 | 111 | 65 | | | | | | | | | | I TO: | | | | | | | Y | 51 | 35 | 71 | 64 | | | F=Ø | 60 | 38 | 95 | 72 | | | | | | | | | SIMPLE COMPUTER SOLUTION INTRODUCTION TO INTERRUPTS #### **INTERRUPTS** - An interrupt is a request for service by some device or process, external to the CPU. - An interrupt usually occurs asynchronously with respect to the processor fetch-execute clock cycle (even though it may be checked on a synchronous basis). - An interrupt request is often identified by a one-bit signal, similar to the ALU status lines. We will be primarily concerned with this type of interrupt. - An interrupt may also be caused by particular processor instructions (such as invalid opcodes, privileged instructions, or system service calls) which decode to special microroutines that behave like interrupt routines. - For interrupt support two specific AMD products are available: - Am2913 Priority Interrupt Expander - Am2914 Vectored Priority Interrupt Controller - Only the Am2913 will be considered in this course. 
## TYPES OF INTERRUPTS

- <u>Intraprocessor</u> - within the processor - asynchronous
  - zero divide
  - ALU overflow
  - invalid memory access
  - invalid instruction
  - privileged instruction
  - other status testing
- <u>Intrasystem</u> - within the system (outside the processor)
  - I/O request (CRT, printer, tape, disk)
  - memory parity error
  - DMA request
  - peripheral failure
  - power failure

## INTERRUPT TYPES (CONT'D)

- <u>Executive</u> (traps)
  - task request
  - hardware allocation
  - interprogram communication
  - supervisory program call
- <u>Interprocessor</u> - between two processors
  - data transfer
  - status transfer

#### INTERRUPT LEVELS

- In any computer or controller there are three levels at which interrupts may be handled:
  - Software level (also machine level or macro level): visible and handled at the machine language level
  - Firmware level (also microprogram level): handled by microcode routines
  - Hardware level: handled by special purpose hardware
- The same general interrupt-handling algorithm (process) is used at all three levels.
  - Acknowledge interrupt
  - Save current "state" of system
  - Service interrupt
  - Reinstate "state" of system prior to interrupt

#### Software Level

- Interrupts are handled at a fixed place in the machine level instruction cycle.
- Interrupts are usually handled during the main memory fetch portion of the cycle (to minimize storage of "state").
- On detection of an interrupt a machine level interrupt routine is activated and executed.
- On completion of the interrupt routine the original machine program continues.
- Where nested interrupts are allowed, an interrupt routine itself may be interrupted.

#### Advantages

- Can be altered as needed via programming.
- Does not require space in the microprogram memory.

#### Disadvantages

- Slow
- Does require space in the program memory

#### Firmware Level

- Interrupts are handled at fixed places in the microprogram, generally at the end of a microroutine or at a <u>quiescent point</u> in the microroutine if it is very long.
- On detection of an interrupt, a microroutine is activated with no alteration of the macroinstruction register.
- Nested interrupts may or may not be allowed depending upon the microroutine.
- Where interrupts are handled and whether or not they are allowed to be nested is under the control of the microprogrammer.

#### Advantages

- Faster than software routine
- Does not interrupt or alter machine program flow

#### Disadvantages

- Requires space in the microprogram memory

#### Hardware Level

- Hardware-level interrupts are necessary when the response must be fast relative to the other implementation levels.
- The system can be forced to handle the unit causing the interrupt request immediately.

#### Advantages

- Speed - a key feature of this approach

#### Disadvantages

- More complex hardware
- More complex firmware

#### INTERRUPT ARCHITECTURES

- There are many ways in which the interrupt request lines from peripheral devices may be connected to the CPU. In order to present the basic concepts of interrupts and to examine the Am2913 and Am2914, only three basic forms need to be considered.
  1. A request line from each device is connected to the CPU.
  2. A request line from each device is connected to the CPU, and all lines are "ORed" together to form an "any device" signal to indicate that one or more of the devices requires service.
  3. A request line from each device is connected to an interrupt priority encoder. From the priority encoder a single interrupt request line (indicating that one or more devices need service) and an identifying code, called a vector (indicating the highest priority device requesting service), are connected to the CPU.
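The three connection forms can be compared with a small software model. The sketch below is hypothetical Python rather than anything from the course material: request lines are modelled as a list of booleans (form 1), `any_device` forms the ORed signal (form 2), and `priority_encode` returns the request flag together with a vector (form 3). The choice that the highest-numbered device has the highest priority is an assumption made for the example.

```python
from typing import List, Optional, Tuple


def any_device(requests: List[bool]) -> bool:
    """Form 2: OR of all request lines into one 'any device' signal."""
    return any(requests)


def priority_encode(requests: List[bool]) -> Tuple[bool, Optional[int]]:
    """Form 3: return (request pending, vector).

    Assumption: the highest index is the highest priority device.
    The vector is None when no device is requesting service.
    """
    for device in range(len(requests) - 1, -1, -1):
        if requests[device]:
            return True, device
    return False, None


if __name__ == "__main__":
    lines = [False, True, False, True, False, False, False, False]
    print(any_device(lines))
    print(priority_encode(lines))
```

With form 1 the processor must examine every entry of the list itself; form 2 lets it skip that scan when nothing is pending; form 3 also removes the scan when something is pending.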
[Figure: INTERRUPT ARCHITECTURES]

## INTERRUPT HANDLING STEPS (Algorithm)

- Recognize interrupt
  - Determine that an interrupt is pending
  - Halt the currently running process
  - Determine which device or process needs service
- Save status
  - Save the state of the system
  - Usually includes registers that will be used (overwritten) by the service routine software
- Service the interrupt
  - Perform the service needed by the requesting device
- Restore and return
  - Restore the system state saved earlier
  - Continue the running process from where it was interrupted

### GENERAL INTERRUPT HANDLING TECHNIQUES

Periodically (usually just prior to instruction fetch) the microprogram tests for interrupts by one of the following techniques.

#### Polled interrupts

- Check each device interrupt line to see if it needs service.
- The order in which devices are checked determines the devices' priorities.
- If available, check the ORed "any device" input first, poll only if it is active.
- Wastes time and micromemory.

#### Vectored interrupts

- Check the "any device" signal first.
- If any request is pending, read the coded vector to determine which device is waiting.
- Faster, but requires priority encoder.

IMPLEMENTATION OF INTERRUPT CONTROL

# SENSING THE REQUEST

- Two types of interrupts need to be considered
  - Level signals
  - Pulse signals
- Level signals (buffered/hold):
  - Device request
  - Status register output
- Pulse signals:
  - CRT retrace

# PULSE LATCHES AND SYNCHRONIZATION

- Signals are synchronized with the system clock by using them to set a clocked D flip-flop.
- Pulse signals might be gone before the next clock pulse. Thus pulse latching circuits must be used to hold the pulse.
- The circuit on the next page shows a pulse latch with a latch bypass for use with level signals.
- The following page illustrates the fact that such a circuit is required for each interrupt signal. Note that a different pulse catching circuit is used.

[Figure: pulse latch with latch bypass. LATCH BYPASS = 0: PULSE CATCHER MODE; LATCH BYPASS = 1: LEVEL FOLLOWER MODE]

[Figure: Multiple (4) Interrupt Storage]

## POLLED INTERRUPT IMPLEMENTATION

- Interrupt lines feed the condition code multiplexer.
- The microroutine selects each interrupt line via the test select field of the microword.
- This approach is slow, since only one interrupt can be tested per microword.
- The following microsubroutine, called at some regular point in the machine cycle, illustrates this approach.

| Address | Next Address | Test Select | Branch Address | CCEN | Other |
|---------|------|------|----------|---------|-------|
| INTER: | CJS | INT1 | Routine1 | Enable | |
| | CJS | INT2 | Routine2 | Enable | |
| | CJS | INT3 | Routine3 | Enable | |
| | CJS | INT4 | Routine4 | Enable | |
| | CRTN | ## | ## | Disable | |

# POLLED INTERRUPT WITH "ANY" REQUEST LINE IMPLEMENTATION

- Interrupt lines feed the condition code multiplexer.
- The microroutine can select any interrupt line via the test select field of the microword.
- An additional MUX is used to select the "any" signal or the individually selected signal. (An 8-input MUX could have handled both in this case.)
- The following microsubroutine, called at some regular point in the machine cycle, illustrates this approach.
- Speed is improved only when no interrupts are pending.
| Address | Next Address | Test Select | Any Select | Branch Address | CCEN | Other |
|---------|------|------|-----|----------|---------|-------|
| INTER: | CJS | ## | Any | INTER1 | Enable | |
| | (continue with normal cycle) | | | | | |
| INTER1: | CJS | INT1 | All | Routine1 | Enable | |
| | CJS | INT2 | All | Routine2 | Enable | |
| | CJS | INT3 | All | Routine3 | Enable | |
| | CJS | INT4 | All | Routine4 | Enable | |
| | CRTN | ## | ## | ## | Disable | |

#### **VECTORED INTERRUPT IMPLEMENTATION**

- Add a priority encoder to identify which interrupt caused the request.
- The highest priority interrupt is encoded in binary.
- For example, the Am2913 provides:
  - Active low input for 8 device request lines
  - Three-bit encoded output identifying highest priority request received
  - "Any" request line output

# Am2913 Logic Diagram

Inputs: ĒĪ, Ī<sub>0</sub> - Ī<sub>7</sub>. Outputs: A<sub>0</sub> - A<sub>2</sub>, ĒŌ.

| ĒĪ | Ī<sub>0</sub> | Ī<sub>1</sub> | Ī<sub>2</sub> | Ī<sub>3</sub> | Ī<sub>4</sub> | Ī<sub>5</sub> | Ī<sub>6</sub> | Ī<sub>7</sub> | A<sub>0</sub> | A<sub>1</sub> | A<sub>2</sub> | ĒŌ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| H | X | X | X | X | X | X | X | X | L | L | L | H |
| L | H | H | H | H | H | H | H | H | L | L | L | L |
| L | X | X | X | X | X | X | X | L | H | H | H | H |
| L | X | X | X | X | X | X | L | H | L | H | H | H |
| L | X | X | X | X | X | L | H | H | H | L | H | H |
| L | X | X | X | X | L | H | H | H | L | L | H | H |
| L | X | X | X | L | H | H | H | H | H | H | L | H |
| L | X | X | L | H | H | H | H | H | L | H | L | H |
| L | X | L | H | H | H | H | H | H | H | L | L | H |
| L | L | H | H | H | H | H | H | H | L | L | L | H |

H = HIGH Voltage Level, L = LOW Voltage Level, X = Don't Care

For G<sub>1</sub> = H, G<sub>2</sub> = H, G<sub>3</sub> = L, G<sub>4</sub> = L, G<sub>5</sub> = L

| G1 | G2 | G3 | G4 | G5 | A<sub>0</sub> | A<sub>1</sub> | A<sub>2</sub> |
|----|----|----|----|----|---|---|---|
| H | H | L | L | L | Enabled | Enabled | Enabled |
| L | X | X | X | X | Z | Z | Z |
| X | L | X | X | X | Z | Z | Z |
| X | X | H | X | X | Z | Z | Z |
| X | X | X | H | X | Z | Z | Z |
| X | X | X | X | H | Z | Z | Z |

Z = HIGH Impedance

[Figure: POSITIONING THE PRIORITY ENCODER]

### **VECTOR MAPPING PROM**

- On the detection of the "any" signal, branch to the microroutine designated by the vector.
- The same problem exists as existed with opcode decoding: there are fewer bits in the vector than in the microword address.
  - Could use the lower micromemory addresses (000-111) for an interrupt jump table.
  - Could use the vector as high order bits and scatter the interrupt routines throughout micromemory.
  - However, a vector mapping PROM is used with the opcode mapping PROM.
- Now on the detection of the "any" signal the microroutine can branch to any address in micromemory via the vector map.

#### **USING PRIORITY INTERRUPTS**

- Several design alternatives are available at this point.
- The tri-state output enable signal for the vector mapping PROM can come from either of two places.
  1. From the OE-VECT output on the Am2910 (or from an Am29811 augmented by a decoder).
  2. From a bit in the microword. (May be faster, but requires a wider microword.)
- The instruction for testing can be either CJS or CJV.
  1. CJS (Conditional Jump Subroutine) could perform the actual jump to the particular routine only if the vector map enable comes from the pipeline and also disables the OE-pipeline from the Am2910.
  2. CJV (Conditional Jump Vector) could perform the actual jump to the particular routine for either source of the vector map enable signal.
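As a rough model of the vectored approach, the sketch below encodes the Am2913 truth table given in this section and then uses the 3-bit vector to index a vector map, in the way a vector mapping PROM would supply a microprogram address. It is a Python illustration under stated assumptions: the three-state outputs are taken to be enabled (G1 = G2 = H, G3 = G4 = G5 = L), signal levels are written as 1 for HIGH and 0 for LOW, and the names `am2913_encode`, `next_microaddress` and the contents of `VECTOR_MAP` are hypothetical.

```python
from typing import Dict, List, Tuple


def am2913_encode(ei_n: int, requests_n: List[int]) -> Tuple[int, int]:
    """Return (vector, eo_n) following the truth table above.

    ei_n       -- level on the active-low enable input (0 = enabled)
    requests_n -- levels on active-low inputs I0..I7 (0 = requesting)
    vector     -- A2 A1 A0 packed as an integer 0..7
    eo_n       -- level on the EO output as listed in the table
    """
    if len(requests_n) != 8:
        raise ValueError("expected eight request lines I0..I7")
    if ei_n == 1:                      # encoder disabled
        return 0, 1
    for n in range(7, -1, -1):         # I7 is the highest priority input
        if requests_n[n] == 0:
            return n, 1
    return 0, 0                        # enabled, but nothing is pending


# Hypothetical vector map: one microroutine entry address per vector.
VECTOR_MAP: Dict[int, int] = {n: 0x100 + 0x10 * n for n in range(8)}


def next_microaddress(requests_n: List[int],
                      vector_map: Dict[int, int],
                      fallthrough: int) -> int:
    """Jump through the vector map if a request is pending,
    otherwise continue at the fall-through address."""
    vector, eo_n = am2913_encode(0, requests_n)
    pending = eo_n == 1                # per the table, with the encoder enabled
    return vector_map[vector] if pending else fallthrough


if __name__ == "__main__":
    print(hex(next_microaddress([1, 1, 0, 1, 1, 1, 1, 1], VECTOR_MAP, 0x020)))
```

Because the map is a lookup table, the service routines can sit anywhere in micromemory, which is the point made above about the vector mapping PROM.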
**CJV** #### **VECTORED INTERRUPT IMPLEMENTATION** - Interrupt lines connected to priority encoder. - The "any" output of the priority encoder connected to one input of the normal condition code MUX. - Let the Am2910 provide the vector map enable signal. - The following microsubroutines, called at some regular point in the machine cycle, illustrate this approach. - 1. In the first routine, CJS is used to test for "any" and jump to a common handling routine first. - 2. In the second routine, CJV is used to directly jump to the particular routine (which must save state). #### **EXAMPLE #1** | Address | Next<br>Address | Test<br>Select | Branch<br>Address | 0ther | |---------|-----------------|--------------------------------------------------|-------------------|------------------| | INTER: | CJS | ANY P | _ | | | n+1 | | R4 <r4-1< td=""><td>R5-1</td><td>PL</td></r4-1<> | R5-1 | PL | | n+2 | | R5 <r5+1< td=""><td>l</td><td>PL</td></r5+1<> | l | PL | | n+3 | CJV | R5 <r5+(<br>PASS</r5+(<br> | <b>)</b><br>## | State | | RETURN: | CONT | | | Restore<br>State | | | CRTN | PASS | ## | | ## EXAMPLE #2 Next Test Branch Address Address Select Address Other INTER: CJV ANY RETURN: (continue with normal cycle) SAMPLE SYSTEM FOR FIRMWARE LEVEL INTERRUPT HANDLING ## The Am2914 A COMPLETE INTERRUPT CONTROLLER # REQUIREMENTS FOR INTERRUPT HANDLING - Local storage save (save state) - Clear interrupt Clear one - last one read Clear some - related to program, device Clear all - warmstart Dynamic masking of interrupts Block selected interrupts Nesting interrupts A higher priority interrupt can be acknowledged when a lower priority interrupt routine is executing. Status fence To keep lower priority interrupt from interrupting a higher priority routine #### The Am2914 The Am2914 vectored priority interrupt controller performs the previous functions. 
- Up to 8 interrupt inputs
- All 8 may be pulse or level inputs
- Produces a 3-bit vector output to address a vector map
- Contains a 3-bit fence register (status register)
- Contains an 8-bit mask register
- Both mask and status can be read from or written to
- Expandable
- Microprogrammable
- Four instruction lines plus enable

### INTERCONNECTING Am2914s and Am2913s

- Multiple Am2914s can be interconnected to provide 16-, 32-, or 64-level architectures.
- Additional bits (A3, A4, ...) of vector address beyond A0, A1, and A2 must be provided using the ripple disable or the parallel disable signals to indicate the active Am2914.
- Group control pins must be connected appropriately to enable only the highest priority device.
- See AMD Data Book for detailed information on interconnecting Am2914s for multi-level systems.
- The ED2900B course develops the Am2914 in detail.

Am2900 FAMILY SUPPORT CHIPS

# Am2900 SUPPORT CHIPS

- Am27S26/27 registered PROMs
- Am2904 status and shift control
- Am2925 clock generator

REGISTERED PROMS Am27S26/27

# Am27S27 -- Am29774/Am29775

## REGISTERED PROMS

- 512 x 8
  - 9 address lines, 8 data lines
  - Two enable controls
  - Am27S26 open collector output
  - Am27S27 tri-state output
- Am27S35/Am27S35A/Am27S37/Am27S37A 1K x 8
- Am27S45/Am27S45A/Am27S47/Am27S47A 2K x 8

[Figure: REGISTERED PROM 2K x 8. ADDRESS AND $\overline{E}_s$ LOW - FETCH; CLK AND E LOW - DATA OUTPUT]

# Optional Am27S27 Exercises (See ED2900A Exercise and Laboratory Manual)

STATUS AND SHIFT CONTROL UNIT Am2904 OVERVIEW

### Am2904 STATUS AND SHIFT CONTROL

- The Am2904 was designed to replace much of the SSI/MSI which is used around the Am2901 or Am2903/29203.
- The Am2904 provides the following support on one chip:
- Micro and macro status registers (carry, zero, sign, overflow) with ability to read and load registers.
- Shift linkage for the RAM and Q registers with 32 shift/rotate functions.
- Carry-in select for the ALU with 7 possible sources.
- Conditional test multiplexer to drive CC on Am2910 with 16 possible tests from 3 sources.

## Am2904

[Figure: Typical Application of Am2904 with Am2903]

#### MICROPROGRAM CONTROL OF Am2904

- Controlled by thirteen instruction lines, shift enable, six status enable pins, and two output enable pins (22 bits total).
- To save pins, some pins perform multiple functions.
  - Status register functions use I5-I0 plus six enable pins.
  - Condition code select uses I5-I0 plus one enable pin.
  - Status output select uses I5-I4 plus one enable pin.
  - Carry-in uses I12-I11 and I5-I1.
  - Shift linkage uses I10-I6 and one enable.
- Depending on the state of the various enable pins, I5-I0 can cause up to four different functions to occur simultaneously.
- Programming is not for the faint-hearted. More detail is provided in the ED2900B course.

Am2904 TABLE 7. SHIFT LINKAGE MULTIPLEXER INSTRUCTION CODES.

| I<sub>10</sub> | I<sub>9</sub> | I<sub>8</sub> | I<sub>7</sub> | I<sub>6</sub> | M<sub>C</sub> RAM | Q | SIOo | SIOn | QIO<sub>0</sub> | QIOn | Loaded into M<sub>C</sub> |
|-----|----|----------------|----------------|----|---------------------|----------------|------------------|------------------|------------------|------------------|----------------------------|
| 0 | 0 | 0 | 0 | 0 | MSB LSB | MSB LSB | z | 0 | z | 0 | |
| 0 | 0 | 0 | 0 | 1 | | · <del></del> | z | 1 | z | 1 | |
| 0 | 0 | 0 | 1 | 0 | | M<sub>N</sub> | z | 0 | z | MN | SIOo |
| 0 | 0 | 0 | 1 | 1 | _ ·-(=)- | | z | 1 | z | SIO | |
| 0 | 0 | 1 | 0 | 0 | | <del></del> | z | Mc | z | SIO | |
| 0 | 0 | 1 | 0 | 1 | u<sub>N</sub> + | | z | MN | z | SIO | |
| 0 | 0 | 1 | 1 | 0 | | | z | 0 | z | SIOo | |
| 0 | 0 | 1 | 1 | 1 | 5 | | z | 0 | z | SIO | QIO<sub>o</sub> |
| 0 | 1 | 0 | 0 | 0 | | | z | SIO<sub>0</sub> | z | 0100 | SIO<sub>o</sub> |
| 0 | 1 | 0 | 0 | 1 | | | z | Mc | z | Q10<sub>0</sub> | SIOo |
| 0 | 1 | 0 | 1 | 0 | | | z | SIO<sub>o</sub> | z | QIO<sub>o</sub> | |
| 0 | 1 | 0 | 1 | 1 | _ ' | | z | lc | z | SIO | |
| 0 | 1 | 1 | 0 | 0 | 6 | | z | 
Mc | z | SIO | QIOo | | 0 | 1 | 1 | 0 | 1 | IN ÷ IOVA | | z | QIO <sub>o</sub> | z | SIO | QIO <sub>o</sub> | | 0 | 1 | 1 | 1 | 0 | | -=- | z | IN # IOVR | z | SIO | | | 0 | 1 | 1 | 1 | 1 | | | z | Q10 <sub>0</sub> | z | SIO | | | 1 | 0 | ·<br>0 | 0 | 0 | MSB LSB | MSB LSB | 0 | z | 0 | z | SIOn | | 1 | 0 | 0 | 0 | 1 | | | 1 | z | 1 | z | SIOn | | 1 | 0 | 0 | 1 | 0 | _ <del>-</del> | | 0 | z | 0 | z | | | 1 | 0 | 0 | 1 | 1 | - <del>-</del> | | 1 | z | 1 | z | | | 1 | 0 | 1 | 0 | 0 | | | QIOn | z | 0 | z | SIOn | | 1 | 0 | 1 | 0 | 1 | | | QIOn | z | 1 | z | SIOn | | 1 | 0 | 1 | 1 | 0 | □ <b>-</b> □- | | QIOn | z | 0 | z | | | 1 | 0 | 1 | 1 | 1 | O -Œ)- | | QIOn | z | 1 | z | | | 1 | 1 | 0 | 0 | 0 | | | SIOn | z | QIOn | z | SIOn | | 1 | 1 | 0 | 0 | 1 | | | Mc | Z | QIOn | z | SIOn | | 1 | 1 | 0 | 1 | 0 | | | SIOn | z | QIOn | z | | | 1 | 1 | 0 | 1 | 1 | | -[ | Mc | z | 0 | z | | | 1 | 1 | 1 | 0 | 0 | | | QIO <sub>n</sub> | z | Mc | z | SIOn | | 1 | 1 | 1 | 0 | 1 | | | QIOn | z | SIOn | z | SIOn | | 1 | 1 | 1 | 1 | 0 | | | QIOn | z | Мс | z | | | 1 | 1 | 1 | 1 | 1 | nutrouts off) state | -=7 | QIOn | z | SIOn | z | | Notes: 1. Z = High impedance (outputs off) state. 2. Outputs enabled and M<sub>C</sub> loaded only if SE is LOW. 3. Loading of M<sub>C</sub> from I<sub>10-6</sub> overrides control from I<sub>5-0</sub>, $\overline{\text{CE}}_{\text{M}}$ , $\overline{\text{E}}_{\text{C}}$ . TABLE 4. CONDITION CODE OUTPUT (CT) INSTRUCTION CODES. 
| I <sub>3</sub> 0 | l <sub>3</sub> | l <sub>2</sub> | 11 | lo | l <sub>5</sub> = l <sub>4</sub> = 0 | l <sub>5</sub> = 0, l <sub>4</sub> = 1 | l <sub>5</sub> = 1, l <sub>4</sub> = 0 | l <sub>5</sub> = l <sub>4</sub> = 1 | |------------------|----------------|----------------|----|----|-------------------------------------|-------------------------------------------------------------------|-------------------------------------------|-------------------------------------| | 0 | 0 | 0 | 0 | 0 | (µn⊕µovr) + µz | (μ <sub>N</sub> ⊕ μ <sub>OVR</sub> ) + μ <sub>Z</sub> | (MN ⊕ MOVR) + MZ | (In + lova) + Iz | | 1 | 0 | 0 | 0 | 1 | (µn⊙µova)• Az | (µN⊕µOVR)• ₽Z | (M <sub>N</sub> ⊙ M <sub>OVR</sub> ) • MZ | (INO IOVA) · IZ | | 2 | 0 | 0 | 1 | 0 | #N⊕#OVR | μn⊕ μovr | MN + MOVR | IN + IOVR | | 3 | 0 | 0 | 1 | 1 | <b>MOMONE</b> | <b>4NO4OVR</b> | MNO MOVR | INO IOVA | | 4 | 0 | 1 | 0 | 0 | μZ | μZ | Mz | Iz | | 5 | 0 | 1 | 0 | 1 | μZ | π <sub>Z</sub> | Mz | | | 6 | 0 | 1 | 1 | σ | #OVR | ⊬ovr | MOVR | IOVR | | 7 | 0 | 1 | 1 | 1 | <b>#</b> OVR | <b> </b> | MOVR | Tova | | 8 | 1 | 0 | 0 | 0 | μC + μZ | $\mu_{C} + \mu_{Z}$ | Mc + Mz | T <sub>C</sub> - I <sub>Z</sub> (2) | | 9 | 1 | 0 | 0 | 1 | μ <sub>C</sub> ·μ <sub>Z</sub> | $ \overline{\mu}_{\mathbf{C}} \cdot \overline{\mu}_{\mathbf{Z}} $ | $\overline{M}_{C} \cdot \overline{M}_{Z}$ | Ic • Iz (2) | | A | 1 | 0 | 1 | 0 | μC | μc | Mc | | | 8 | 1 | 0 | 1 | 1 | ΨC | ΨC | Мc | <u>5</u> 10 | | C | 1 | 1 | 0 | 0 | μ <sub>C</sub> + μ <sub>Z</sub> | $\overline{\mu}_{C} + \mu_{Z}$ | M <sub>C</sub> + M <sub>Z</sub> | Tc + Iz | | ه ا | 1 | 1 | o. | 1 | μC・ΨZ | μC·μZ | M <sub>C</sub> ·M <sub>Z</sub> | lc • Īz | | E | 1 | 1 | 1 | 0 | In⊕ Mn | μN | MN | I <sub>N</sub> | | F | 1 | 1 | 1 | 1 | In Mn | ΔN | MN | TN | Notes: 1. ⊕ Represents EXCLUSIVE-OR 2. Correct code as stated. 
- µ means micro status register
- M means macro status register
- C means carry bit
- Z means zero bit
- N means sign bit
- **OVR** means overflow bit
- ⊙ represents EXCLUSIVE-NOR or coincidence.

**CLOCK GENERATOR** Am2925

### Am2925 CLOCK GENERATOR & MICROCYCLE LENGTH CONTROLLER

- Single chip clock generator and driver
- Crystal controlled to maximum of 31 MHz
- Fundamental oscillator output available
- Four different clock output waveforms available on separate pins
- One of eight different cycle lengths may be selected by microprogram control
- Clock halt, single-step, and wait controls

#### MICROPROGRAM CONTROL OF Am2925

- Cycle length controlled by three instruction lines.
- Other inputs would normally be provided from hardware connections instead of the microword since they concern stopping the clock and wait states.
- For three-address instructions (Am2903/29203) using output C3 as the clock and C2 for IEN, an additional field or steering bit would be needed to provide the third register address.
- The cycle-length control bits (L3, L2, L1) are latched internal to the Am2925 at the end of the microcycle. Thus no pipeline register is needed for this field.
- The cycle length value is specified in the same microword as the instruction with which it is associated. That is, the cycle length as specified stretches the current microcycle.
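The effect of the three cycle-length bits can be worked out numerically. The sketch below is an illustrative Python calculation, not AMD software: the code-to-pattern mapping (LLL selects F3, and so on up to HLL for F10) is taken from the waveform and cycle-length tables in this section, the pattern number is read as the count of oscillator periods per microcycle, and the names `CYCLE_PERIODS` and `microcycle_ns` are assumptions made for the example.

```python
# L3 L2 L1 input code -> oscillator periods per microcycle (patterns F3..F10)
CYCLE_PERIODS = {
    "LLL": 3, "LLH": 4, "HLH": 5, "HHH": 6,
    "LHH": 7, "LHL": 8, "HHL": 9, "HLL": 10,
}


def microcycle_ns(code: str, crystal_mhz: float) -> float:
    """Microcycle length in nanoseconds for a given L3 L2 L1 code.

    One oscillator period is 1000 / f(MHz) nanoseconds; the selected
    pattern stretches the microcycle to a whole number of periods.
    """
    if crystal_mhz <= 0:
        raise ValueError("crystal frequency must be positive")
    return CYCLE_PERIODS[code] * 1000.0 / crystal_mhz


if __name__ == "__main__":
    for code in ("LLL", "HHH", "HLL"):
        print(code, microcycle_ns(code, 20.0))
```

A microword that needs a slow operation, such as a long carry chain, can therefore request a longer pattern for that cycle only, while ordinary microwords keep the shortest pattern.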
## Am2925 Clock Waveforms

[Figure: waveforms and timing of clock outputs C<sub>1</sub>, C<sub>2</sub>, C<sub>3</sub>, C<sub>4</sub> and oscillator output F<sub>0</sub> for each pattern. The pattern number is the microcycle length in oscillator periods.]

| Pattern | Input code L<sub>3</sub> L<sub>2</sub> L<sub>1</sub> |
|-----|-----|
| F3 | LLL |
| F4 | LLH |
| F5 | HLH |
| F6 | HHH |
| F7 | LHH |
| F8 | LHL |
| F9 | HHL |
| F10 | HLL |

Cycle length (ns) for each Am2925 input code L<sub>3</sub> L<sub>2</sub> L<sub>1</sub>:

| CRYSTAL FREQ "f" | 1/f (ns) | 000 | 001 | 101 | 111 | 011 | 010 | 110 | 100 |
|----------|------|-----|-------|-------|-----|-------|-------|-----|-------|
| 16.6 MHz | 60 | 180 | 240 | 300 | 360 | 420 | 480 | 540 | 600 |
| 20 MHz | 50 | 150 | 200 | 250 | 300 | 350 | 400 | 450 | 500 |
| 25 MHz | 40 | 120 | 160 | 200 | 240 | 280 | 320 | 360 | 400 |
| 31 MHz | 32.3 | 97 | 129 | 161 | 194 | 226 | 258 | 290 | 322 |
| 30 MHz | 33.3 | 100 | 133.3 | 166.7 | 200 | 233.3 | 266.7 | 300 | 333.3 |

## Am29116 A High-Performance 16-bit Bipolar Microprocessor

[Figure: Am29116 Block Diagram]

#### Am29116, 16-bit ALU

#### **Outstanding Features**

- 16-bit data path
- 16-bit ALU
  - Full carry look-ahead
  - Can operate in 16-bit word mode
  - Can operate in 8-bit byte mode
- 32x16-bit RAM scratchpad on-board
  - Single port
  - With an external multiplexer added, a different source and destination address may be selected for the same instruction (requires timing adjust)
- 16-bit ACC
- 16-bit data latch
- 16-bit barrel shifter
  - Byte or word mode
  - Rotates 1 to 15 bits up in one cycle

#### Am29116 - Features (Cont'd)

- 8-bit status register
- Condition code generator/multiplexer
  - 12 different test conditions
- Immediate instruction capability
  - First microcycle instruction latched
  - Second microcycle immediate data available
  - Both the instruction and the immediate data are fetched via the 16 instruction lines.
- Cyclic Redundancy Check generation
  - Any CRC polynomial of 16 bits or less
  - 80% of CRC applications use 16-bit polynomials
- Powerful instruction set (see Data Book)
- Not expandable
  - Fixed data width of 16 bits
  - Fixed set of 32 16-bit registers

## Am29116 MICROWORD FORMAT

If it is assumed that the Am2910 is used as the sequencer for the Am29116, then the microword might look as follows:

| 2910 INST (4) | [illegible] | BRCH ADDR / COUNTER (12) | ... | TEST INST (4) | ... |
|---------------|-------------|--------------------------|-----|---------------|-----|

- The conditional multiplexer may very well be replaced by an Am2904.
- The test instruction field is optional (only used when a test is to be performed during another instruction's execution).
- The Am29116 instruction field can be overlaid by the immediate data for the Am29116. (The instruction is latched on-board by the Am29116.)
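The overlay of instruction and immediate data in one field can be sketched as follows. This is a hedged illustration, not the actual microword layout: the field order, the bit positions, the 36-bit total, and the names `Microword`, `pack`, and `unpack` are all assumptions. Only the widths of the Am2910 instruction (4), branch address/counter (12), and test instruction (4) come from the format above, and the 16-bit width of the shared field comes from the 16 instruction lines of the Am29116.

```python
from dataclasses import dataclass


@dataclass
class Microword:
    """Hypothetical 36-bit microword for an Am2910 + Am29116 design.

    Assumed layout, most significant field first:
    seq_inst(4) | branch_addr(12) | alu_field(16) | test_inst(4)
    """
    seq_inst: int       # Am2910 instruction, 4 bits
    branch_addr: int    # branch address or counter value, 12 bits
    alu_field: int      # Am29116 instruction OR immediate data, 16 bits
    test_inst: int = 0  # optional test instruction, 4 bits

    def pack(self) -> int:
        """Pack the fields into one integer, checking each field's width."""
        fields = (
            ("seq_inst", self.seq_inst, 4),
            ("branch_addr", self.branch_addr, 12),
            ("alu_field", self.alu_field, 16),
            ("test_inst", self.test_inst, 4),
        )
        word = 0
        for name, value, width in fields:
            if not 0 <= value < (1 << width):
                raise ValueError(f"{name} does not fit in {width} bits")
            word = (word << width) | value
        return word

    @classmethod
    def unpack(cls, word: int) -> "Microword":
        """Split a packed integer back into its four fields."""
        return cls(
            seq_inst=(word >> 32) & 0xF,
            branch_addr=(word >> 20) & 0xFFF,
            alu_field=(word >> 4) & 0xFFFF,
            test_inst=word & 0xF,
        )


# An immediate instruction occupies two consecutive microwords: the first
# carries the instruction (latched on-board by the Am29116), the second
# carries the immediate data in the same 16-bit field.
# 0x1234 is a placeholder value, not a real Am29116 opcode.
first = Microword(seq_inst=0xE, branch_addr=0, alu_field=0x1234)
second = Microword(seq_inst=0xE, branch_addr=0, alu_field=0x00FF)
```

Sharing one field between instruction and immediate data keeps the microword narrow, at the cost of a second microcycle for immediate operations.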
# PERIPHERAL CONTROLLER, MINIMUM PARTS CONFIGURATION

## PERIPHERAL CONTROLLER, MAXIMUM PERFORMANCE CONFIGURATION

#### AMD FAMILIES (see associated Data Books)

- Analog and communications products
- Bipolar microprocessor logic and interface devices
- Bipolar/MOS memories data book
- MOS microprocessors and peripherals
- Programmable Array Logic

## THE FUTURE

## In Bipolar Microprogrammable Microprocessor Logic

- ALUs 32 bit
- Sequencers 16 bit
- Support Devices
- **VLSI**

AMD SUPPORT TOOLS FOR Am2900 SYSTEM DEVELOPMENT

#### AMD SUPPORT

- META Assemblers
- AMD Customer Education courses
- Am29203 Evaluation Board

#### **META ASSEMBLERS**

#### MICROTEC:

- Meta Assembler
- Macro facility on Macro Meta Assembler
- PROM formatting
- Organization
  - Definition program
  - Assembler program
  - PROM formatter program
- Assembler
  - Two pass
  - Conditional assembly
- Written in ANSI FORTRAN
  - FORTRAN IV
  - 16 bits minimum
  - Disk or magnetic tape required
  - 20K macro memory minimum

## AMD: (for SYS29 users)

- M29 Meta Retargetable Microcode Assembler
- Extensive microprogramming tools
- Organization
  - M29DEF, Microinstruction Definition
  - M29ASM, Microprogram Assembler
  - M29LINK, Relocating Linker
  - M29LIB, Microcode Library Manager

#### **CUSTOMER EDUCATION COURSES**

- ED2900A, Introduction to Designing with the Am2900 Family
  A three-day seminar on the design of microprogrammed systems using AMD's 2900 series devices, including outlines of newer, related parts, such as the Am29112 and Am29300. The Am29203 evaluation board is used for laboratory exercises.
- ED2900B, Advanced Design with the Am2900 Family
  A four-day seminar that completes ED2900A's training on the 2900 family. A variety of detailed laboratory work with the Am29203 board provides the student with a thorough background in microcoded system design and debugging. ED2900A is a prerequisite.
- ED29116, Designing with the Am29116, 16-bit Microprocessor
  A two-day seminar on the design of microprogrammed systems using AMD's 16-bit Am29116 bipolar microprocessor. Various types of design examples are discussed. Completion of ED2900A is suggested.
- ED29500, Designing Digital Signal/Array Processors with the Am29500 Family
  A three-day seminar on the theory and design of hardware and software for microprogrammed digital signal processors using the Am29500 family. Digital filtering, array processing, and FFT processing are discussed. The relationships among various commercially available device families are described. Completion of ED2900A is suggested.

#### Am29203 EVALUATION BOARD

- 16-bit computer using Am29203, Am2910, Am2904, Am2925
- Extensive monitor to allow
  - Loading microcode
  - Examining microcode
  - Executing microcode
  - Setting breakpoints
  - Single-stepping microcode
  - Loading and examining ALU registers
  - Loading and examining the pipeline
  - Loading and examining macro memory