From WikiChip
Editing acorn/microarchitectures/arm2
Warning: You are not logged in. Your IP address will be publicly visible if you make any edits. If you log in or create an account, your edits will be attributed to your username, along with other benefits.
The edit can be undone.
Please check the comparison below to verify that this is what you want to do, and then save the changes below to finish undoing the edit.
This page supports semantic in-text annotations (e.g. "[[Is specified as::World Heritage Site]]") to build structured and queryable content provided by Semantic MediaWiki. For a comprehensive description on how to use annotations or the #ask parser function, please have a look at the getting started, in-text annotation, or inline queries help pages.
Latest revision | Your text | ||
Line 110: | Line 110: | ||
<div style="float: left; margin: 10px;">'''Register-Immediate:'''<br>[[File:arm1 reg imm.svg|300px]]</div></div> | <div style="float: left; margin: 10px;">'''Register-Immediate:'''<br>[[File:arm1 reg imm.svg|300px]]</div></div> | ||
− | |||
===== Multiplication ===== | ===== Multiplication ===== | ||
− | + | The ARM1 major performance issue was with multiplication. The ARM1 lacked hardware multiplication which meant software had to resort to a software-based solution (e.g., classic [[Shift-and-Add Multiplication]]). For example to perform <code>var = x * 5;</code> one could rewrite it as <code>var = x + (x << 2);</code> to achieve the same result without a multiplication operation. While originally was not thought to be a big program, software multiplication proved to be a rather serious bottleneck. | |
− | The ARM1 major performance issue was with multiplication. The ARM1 lacked hardware multiplication which meant software had to resort to a software-based solution (e.g., classic [[Shift-and-Add Multiplication]]). For example to perform <code>var = x * 5;</code> one could rewrite it as <code>var = x + (x << 2);</code> to achieve the same result without a multiplication operation. While originally was not thought to be a big | ||
This was addressed with the ARM2 which introduced a [[Booth's Multiplier]]. Conceptually, the multiplier sits on the "B" operand of the ALU in a similar way to how the barrel shifter sits on the "A" operand of the ALU, however there are some major differences in how they are implemented and operate. | This was addressed with the ARM2 which introduced a [[Booth's Multiplier]]. Conceptually, the multiplier sits on the "B" operand of the ALU in a similar way to how the barrel shifter sits on the "A" operand of the ALU, however there are some major differences in how they are implemented and operate. | ||
− | With a number of hard constraints (i.e., [[die size]] and power dissipation), the ARM2 solution is a very conservative 2-bit multiplier. Unlike the shifter which can shift the first ALU operand by some amount of bits in almost every instruction, the multiplier is typically inoperative. Only the <code>MUL</code> and <code>MLA</code> make use of it. When that happens the multiplier is invoked. Each cycle the 2-bit Booth's algorithm multiplication is performed. The result is fed through the ALU to a destination register. The destination register is also used to hold the intermediate value which can last up to 16 cycles for all 32 bits. For this reason using the same destination register as the source has been prohibited as it would invoke [[undefined behavior]] | + | With a number of hard constraints (i.e., [[die size]] and power dissipation), the ARM2 solution is a very conservative 2-bit multiplier. Unlike the shifter which can shift the first ALU operand by some amount of bits in almost every instruction, the multiplier is typically inoperative. Only the <code>MUL</code> and <code>MLA</code> make use of it. When that happens the multiplier is invoked. Each cycle the 2-bit Booth's algorithm multiplication is performed. The result is fed through the ALU to a destination register. The destination register is also used to hold the intermediate value which can last up to 16 cycles for all 32 bits. For this reason using the same destination register as the source has been prohibited as it would invoke [[undefined behavior]]. |
The second instruction implemented, the <code>MLA</code>, supports multiply and accumulate. This instruction takes advantage of the fact that the ALU is situated after the multiplier, allowing a final addition operation to be performed on the result of the multiplication prior to saving the value back into the destination register. | The second instruction implemented, the <code>MLA</code>, supports multiply and accumulate. This instruction takes advantage of the fact that the ALU is situated after the multiplier, allowing a final addition operation to be performed on the result of the multiplication prior to saving the value back into the destination register. |
Facts about "ARM2 - Microarchitectures - Acorn"
codename | ARM2 + |
core count | 1 + |
designer | Acorn Computers + |
first launched | 1986 + |
full page name | acorn/microarchitectures/arm2 + |
instance of | microarchitecture + |
instruction set architecture | ARMv2 + |
manufacturer | VLSI Technology + and Sanyo + |
microarchitecture type | CPU + |
name | ARM2 + |
pipeline stages | 3 + |
process | 2,000 nm (2 μm, 0.002 mm) + |