From WikiChip
Difference between revisions of "ambric/am2000"
< ambric

(Architecture)
(Documents)
 
(9 intermediate revisions by one other user not shown)
Line 9: Line 9:
 
| first announced  = October 10, 2006
 
| first announced  = October 10, 2006
 
| first launched    = December, 2006
 
| first launched    = December, 2006
| production start  =  
+
| production start  = 2007
| production end    =  
+
| production end    = 2013
 
| arch              = Many-core 32-bit microprocessor
 
| arch              = Many-core 32-bit microprocessor
 
| isa              =  
 
| isa              =  
Line 18: Line 18:
 
| tech              = CMOS
 
| tech              = CMOS
 
| clock min        = 1 MHz
 
| clock min        = 1 MHz
| clock max        = 333 Mhz
+
| clock max        = 350 Mhz
 
| package          = FCBGA-868
 
| package          = FCBGA-868
 
| package 2        = FCBGA-896
 
| package 2        = FCBGA-896
Line 32: Line 32:
  
 
== Members ==
 
== Members ==
 
 
<!-- NOTE:  
 
<!-- NOTE:  
 
           This table is generated automatically from the data in the actual articles.
 
           This table is generated automatically from the data in the actual articles.
Line 58: Line 57:
 
{{table count|col=5|ask=[[Category:microprocessor models by ambric]][[instance of::microprocessor]][[microprocessor family::Am2000]]}}
 
{{table count|col=5|ask=[[Category:microprocessor models by ambric]][[instance of::microprocessor]][[microprocessor family::Am2000]]}}
 
</table>
 
</table>
 +
 +
== Die Shot ==
 +
 +
=== {{\|Am2045}} ===
 +
{|
 +
| [[File:Am2045 die shot.png]]  || [[File:Am2045 die shot (annotated).png]]
 +
|}
  
 
== Architecture ==
 
== Architecture ==
 
Ambric's Am2000 models are all made of small homogeneous units called '''Brics'''. The exact number of brics depends on the model.
 
Ambric's Am2000 models are all made of small homogeneous units called '''Brics'''. The exact number of brics depends on the model.
  
=== communication ===
+
=== Communication ===
 
[[File:ambric neighbor channels.png|thumb|right|350px|'''Neighbor Channels''']]
 
[[File:ambric neighbor channels.png|thumb|right|350px|'''Neighbor Channels''']]
Ambric's architecture makes heavy use of channels - synchronized interconnects that carry both data and instructions in a [[FIFO]]. Channels are a strong point of this architecture as all data goes through channels including [[memory]] and [[registers]]. Channel interconnects can be loosely divided into three categories:  
+
Ambric's architecture makes heavy use of '''Ambric Channels''' - self-synchronized and asynchronous interconnects that carry both data and instructions across the chip. Channels are a strong point of this architecture as all data goes through channels including [[memory]] and the pipeline itself. Channel interconnects can be loosely divided into three categories:  
 +
 
 +
 
 +
* '''Intra-Bric Channels''': internal channels that span no farther than a single bric. All basic communication uses these channels. They are dynamically configured by the instructions themselves. The datapath itself is also an Ambric channel.
  
* '''Intra-Bric Channels''': internal channels that span no farther than a single bric. All basic communication uses these channels. They are dynamically configured by the instructions themselves. Typical ALU [[register]]/[[memory]] utilizes these channels.
 
  
 
* '''Neighbor Channels''': channels running between two compute units (CUs). These channels only go to an adjacent compute unit (i.e. directly north, south, east, or west). Only one channel is available in each direction. Channels are {{arch|32}} wide and operate at up to 9.6 GT/s.
 
* '''Neighbor Channels''': channels running between two compute units (CUs). These channels only go to an adjacent compute unit (i.e. directly north, south, east, or west). Only one channel is available in each direction. Channels are {{arch|32}} wide and operate at up to 9.6 GT/s.
  
* '''Inter-Bric Channels''': also known as distanced bric channels, are communication channels that operate globally between any two brics.
+
 
 +
* '''Inter-Bric Channels''': also known as distanced bric channels, these are communication channels that operate globally between any two brics. Switches are located at each of the compute units (CUs). Routes are configured statically.
 +
 
 +
 
 +
An '''Ambric channel''' can be more concisely defined as a chain of '''Ambric registers'''. Ambric registers are simple {{arch|32}} storage units with a data-in port and a data-out port, as well as two control signals: ''valid'' and ''accept''. The two control signals allow registers to be self-synchronized and to operate asynchronously. The accept signal is asserted when the input can be written. The valid signal is asserted when output is available. Assertion of both signals indicates a transfer. This setup also allows the input and output sides to operate at different clock rates to accommodate different workloads.
 +
 
 +
[[File:ambric channel.png|650px]]
 +
{{clear}}
 +
 
 +
=== Brics ===
 +
[[File:ambric bric.png|left]]
 +
'''Bric''' is the fundamental building block. Each block contains:
 +
 
 +
* 2x '''Compute Units''' (CU)
 +
::Each Compute Unit contains 2x '''SRD''' {{arch|32}} CPUs and 2x '''SR''' {{arch|32}} CPUs. Channel interconnects are also handled in this area.
 +
 
 +
 
 +
* 2x '''RAM Units''' (RU)
 +
::Each RAM Unit contains 4 2 kB RAM banks, each independently accessed via a dynamically programmed channel operating in FIFO and random access modes via the RU engines. RUs are used for temporary data and buffering largely used for local operations. On-die storage is kept simple, fast, and efficient.
 +
 
 +
{{clear}}
 +
 
 +
==== SR Processor ====
 +
[[File:ambric sr core instruction.png|right]]
 +
The '''SR''' ('''Streaming [[RISC]]''') Processor is a {{arch|32}} processor for fast simple operations. The datapath itself is a self-synchronizing Ambric Channel with 3-stages. This processor can handle complex [[addressing]], [[serialization]] and [[deserialization]].  Each processor has 2 input channels and 1 output channel - all of which are controlled by the instructions themselves. Additionally, the processor includes:
 +
 
 +
* 1x ALU - 1x {{arch|32}} OR 2x {{arch|16}} operations
 +
* 8x [[general-purpose registers|General Purpose]] [[register|Registers]]
 +
* {{arch|16}} instruction word size
 +
* 64 word local code/data RAM
 +
 
 +
[[File:ambric sr core.png]]
 +
{{clear}}
 +
==== SRD Processor ====
 +
[[File:ambric srd core instruction.png|right]]
 +
The '''SRD''' ('''Streaming [[RISC]] with [[DSP]] extensions''') Processor is a {{arch|32}} processor for more complex operations that may benefit from [[instruction-level parallelism]] and iterative algorithms. The datapath itself is a self-synchronizing Ambric Channel with 3-stages. Each processor has 2 input channels and 1 output channel - all of which are controlled by the instructions themselves. Additionally, the processor includes:
 +
 
 +
* 3x ALU
 +
** 2x ALU in serial
 +
*** 1x {{arch|32}} OR 2x {{arch|16}} OR 4x {{arch|8}} operations
 +
** 1x ALU in parallel
 +
*** 1x {{arch|32}} * {{arch|8}} OR 2x {{arch|16}} * {{arch|8}}, 1x {{arch|64}} [[accumulator]], etc..
 +
 
 +
[[File:ambric srd core.png]]
 +
{{clear}}
  
 
== Programming ==  
 
== Programming ==  
{{empty section}}
+
[[File:Am2045 Software Development Board.png|thumb|right|Software Development Board]]
 +
Programming may be done in [[assembly]] or in {{\\|aJava}}. {{\\|aJava}} is a [[strict subset]] of [[Java]]. While it excludes the Java standard library, Ambric offered libraries for video and image processing (e.g. [[AVC-Intra]], [[MPEG-2]], [[H.264]], [[JPEG 2000]], [[DVCPRO HD]]).
 +
 
 +
Ambric employed a Structural Object Programming Model. Every object is strictly encapsulated. Because of the large number of cores each chip offers, objects are treated as independent programs running concurrently. Objects exchange data and control only through structures called '''Ambric channels''', which are self-synchronizing and operate asynchronously.
 +
 
 +
=== Tools ===
 +
[[File:Am2045 Integrated Development Board.png|thumb|right|Integrated Development Board]]
 +
* Am2045 Software Development Board
 +
** 1x production {{\|Am2045}} + [[SDRAM]]
 +
** [[PCI Express]] interface to host
 +
** For rapid software development
 +
 
 +
 
 +
* Am2045 Integrated Development Board
 +
** 1x production {{\|Am2045}} + [[SDRAM]]
 +
** 4x {{arch|32}} [[GPIO]] connectors, [[USB]]
 +
** Stand-alone capable  / PCIe slot
 +
** Serial Flash
 +
** For embedded development
  
 
== Applications ==
 
== Applications ==
 
The Am2000 has been used for high-definition video processing, medical imaging devices, high performance network processing, image recognition, and various military applications such as drones.
 
The Am2000 has been used for high-definition video processing, medical imaging devices, high performance network processing, image recognition, and various military applications such as drones.
 +
 +
== Documents ==
 +
* [[:File:The Ambric Processor Array.pdf|The Ambric Processor Array]], 2006
 +
* [[:File:TeraOPS Hardware A New Massively-Parallel MIMD Computing Fabric IC.pdf|TeraOPS Hardware A New Massively-Parallel MIMD Computing Fabric IC]], 2006
 +
* [[:File:A Structural Object Programming Model, Architecture, Chip and Tools for Reconfigurable Computing.pdf|A Structural Object Programming Model, Architecture, Chip and Tools for Reconfigurable Computing]], 2007
 +
* [[:File:Ambric University Intro.pdf|Ambric University Intro]], 2008
 +
 +
=== Papers ===
 +
* Butts, Michael, Anthony Mark Jones, and Paul Wasson. "[http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4297243&isnumber=4297232 A structural object programming model, architecture, chip and tools for reconfigurable computing.]" Proceedings of IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM). 2007.
 +
* Qasim, Muhammad, and Chaudhry Majid Ali. "[http://hh.diva-portal.org/smash/get/diva2:238878/FULLTEXT01.pdf Signal Processing on Ambric Processor Array: Baseband processing in radio base stations.]" (2008). Halmstad University.
 +
* Biedermann, Alexander. "[http://www.vlsi.informatik.tu-darmstadt.de/staff/biedermann/Diplomarbeit-Ausarbeitung.pdf Theoretische und algorithmische Evaluation eines massiv-parallelen Multiprozessorarrays der Ambric Am2000-Familie unter Verwendung von Videoalgorithmen]. Diss. 2008.
  
 
== See also ==
 
== See also ==
 
* [[massively parallel processor array]]
 
* [[massively parallel processor array]]
 
* {{rapport|Kilocore}}
 
* {{rapport|Kilocore}}

Latest revision as of 14:45, 9 December 2018

Ambric Am2000
Image: ambric 2045.gif (Am2045)
Developer: Ambric
Manufacturer: TSMC
Type: Microprocessors
Introduction: October 10, 2006 (announced); December 2006 (launch)
Production: 2007-2013
Architecture: Many-core 32-bit microprocessor
Word size: 32 bit (4 octets, 8 nibbles)
Process: 130 nm (0.13 μm, 1.3e-4 mm)
Technology: CMOS
Clock: 1 MHz-350 MHz
Package: FCBGA-868, FCBGA-896

Am2000 was a family of 32-bit massively parallel processor arrays (MPPAs) designed by Ambric. The series was introduced at the 2006 Fall Microprocessor Forum. The two flagship models, the Am2045 (A) and the later Am2045B, each had over 300 cores, with a maximum theoretical performance of over one trillion operations per second. Due to the economic downturn of 2008, Ambric failed to secure additional funding and was forced to sell its assets to Nethra Imaging, which continued to manufacture the chips until 2013. Prior to the acquisition, Ambric also announced a 600-core, 600 MHz model; it is unknown whether that model ever made it to market. Designs, software, and related patents are now held by Imagination Technologies.

Ambric, unlike many of its competitors, managed to develop a sound programming model that proved simple and intuitive enough to allow easy programming. Am2000 chips found their way into various military applications, medical instruments, and high-end multimedia hardware.

Overview

Introduced in the fall of 2006, the Am2000 is a series of many-core processors implemented as a massively parallel processor array, designed to replace high-end embedded processors, DSPs, and FPGAs in applications that require fast general-purpose integer arithmetic and digital signal processing. Such tasks usually lend themselves well to highly parallel environments.

Ambric's Am2000 series is one of the few massively parallel processors whose designers succeeded in independently developing a solid programming model and tools that worked well with the underlying processor and made it relatively easy to program. A complete set of development tools was also offered with the product, including extensions to the Eclipse IDE. Code for the Am2000 was written in a language called aJava, a strict subset of Java that compiled directly into machine code.

Members

Am2000 Models

Model     Launched           Process   Frequency   Core Count
Am2012    January 2007       130 nm    333 MHz     96
Am2016    15 November 2007   130 nm    350 MHz     120
Am2024    January 2007       130 nm    333 MHz     192
Am2029    15 November 2007   130 nm    350 MHz     216
Am2035    January 2007       130 nm    333 MHz     280
Am2045    January 2007       130 nm    333 MHz     344
Am2045B   15 November 2007   130 nm    350 MHz     344
Am2070    (not listed)       90 nm     600 MHz     560

Count: 8

Die Shot

Am2045

Am2045 die shot.png Am2045 die shot (annotated).png

Architecture

Ambric's Am2000 models are all made of small homogeneous units called Brics. The exact number of brics depends on the model.

Communication

Neighbor Channels

Ambric's architecture makes heavy use of Ambric Channels - self-synchronized and asynchronous interconnects that carry both data and instructions across the chip. Channels are a strong point of this architecture as all data goes through channels including memory and the pipeline itself. Channel interconnects can be loosely divided into three categories:


  • Intra-Bric Channels: internal channels that span no farther than a single bric. All basic communication uses these channels. They are dynamically configured by the instructions themselves. The datapath itself is also an Ambric channel.


  • Neighbor Channels: channels running between two compute units (CUs). These channels only go to an adjacent compute unit (i.e. directly north, south, east, or west). Only one channel is available in each direction. Channels are 32 bits wide and operate at up to 9.6 GT/s.


  • Inter-Bric Channels: also known as distanced bric channels, these are communication channels that operate globally between any two brics. Switches are located at each of the compute units (CUs). Routes are configured statically.


An Ambric channel can be more concisely defined as a chain of Ambric registers. Ambric registers are simple 32-bit storage units with a data-in port and a data-out port, as well as two control signals: valid and accept. The two control signals allow registers to be self-synchronized and to operate asynchronously. The accept signal is asserted when the input can be written. The valid signal is asserted when output is available. Assertion of both signals indicates a transfer. This setup also allows the input and output sides to operate at different clock rates to accommodate different workloads.

ambric channel.png
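The valid/accept handshake described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not Ambric's circuit: the names `AmbricRegister`, `step`, and `run` are hypothetical, each register is assumed to hold at most one word, and a register is assumed to assert accept only when it was empty at the start of a tick (so this model moves one word every two ticks).

```python
from typing import List, Optional


class AmbricRegister:
    """Hypothetical model of one channel stage: a 32-bit slot plus two signals."""

    def __init__(self) -> None:
        self.data: Optional[int] = None

    @property
    def valid(self) -> bool:
        # Asserted when the register has output available.
        return self.data is not None

    @property
    def accept(self) -> bool:
        # Asserted when the register's input can be written (assumed: when empty).
        return self.data is None


def step(chain: List[AmbricRegister], source: List[int],
         sink: List[int], sink_ready: bool) -> None:
    """Advance the chain by one tick; a word moves where valid and accept meet."""
    # Sample the signals once so a word advances at most one stage per tick.
    valid = [r.valid for r in chain]
    accept = [r.accept for r in chain]

    # Output end: the consumer plays the role of the next stage's accept.
    if valid[-1] and sink_ready:
        sink.append(chain[-1].data)
        chain[-1].data = None

    # Internal stages, walked from the output end toward the input end.
    for i in range(len(chain) - 2, -1, -1):
        if valid[i] and accept[i + 1]:
            chain[i + 1].data = chain[i].data
            chain[i].data = None

    # Input end: the producer is valid whenever it still has words to send.
    if source and accept[0]:
        chain[0].data = source.pop(0) & 0xFFFFFFFF  # keep words 32 bits wide


def run(words, stages: int = 3, ticks: int = 20, stall_until: int = 0) -> List[int]:
    """Push words through a chain; the consumer refuses data before stall_until."""
    chain = [AmbricRegister() for _ in range(stages)]
    source, sink = list(words), []
    for t in range(ticks):
        step(chain, source, sink, sink_ready=(t >= stall_until))
    return sink
```

Calling `run([1, 2, 3, 4, 5], ticks=40, stall_until=10)` shows the self-synchronizing behavior in this model: while the consumer stalls, the chain fills and the producer simply waits, and once the consumer resumes the words arrive in order with none lost.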

Brics

ambric bric.png

The bric is the fundamental building block. Each bric contains:

  • 2x Compute Units (CU)
Each Compute Unit contains 2x SRD 32-bit CPUs and 2x SR 32-bit CPUs. Channel interconnects are also handled in this area.


  • 2x RAM Units (RU)
Each RAM Unit contains four 2 kB RAM banks, each independently accessed through a dynamically programmed channel that operates in FIFO and random-access modes via the RU engines. RUs are used for temporary data and buffering, largely for local operations. On-die storage is kept simple, fast, and efficient.
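The per-bric totals implied by the composition above can be worked out directly. The following is a minimal arithmetic sketch; the constant and function names are hypothetical, and the 10-bric figure in the usage note is an illustrative value, not the size of any particular Am2000 model.

```python
# Figures taken from the bric description above.
CUS_PER_BRIC = 2      # Compute Units per bric
SRD_PER_CU = 2        # SRD processors per Compute Unit
SR_PER_CU = 2         # SR processors per Compute Unit
RUS_PER_BRIC = 2      # RAM Units per bric
BANKS_PER_RU = 4      # RAM banks per RAM Unit
KB_PER_BANK = 2       # kB per RAM bank


def processors_per_bric() -> int:
    """Total SR + SRD processors in one bric."""
    return CUS_PER_BRIC * (SRD_PER_CU + SR_PER_CU)


def ru_memory_per_bric_kb() -> int:
    """RAM Unit memory in one bric, in kB (processor-local RAM not counted)."""
    return RUS_PER_BRIC * BANKS_PER_RU * KB_PER_BANK


def totals(brics: int) -> dict:
    """Scale the per-bric figures to a hypothetical array of `brics` brics."""
    return {
        "processors": brics * processors_per_bric(),
        "ru_memory_kb": brics * ru_memory_per_bric_kb(),
    }
```

Under these figures one bric holds 8 processors and 16 kB of RAM Unit memory, so a hypothetical 10-bric array would hold 80 processors and 160 kB.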

SR Processor

ambric sr core instruction.png

The SR (Streaming RISC) Processor is a 32-bit processor for fast, simple operations. The datapath itself is a self-synchronizing Ambric channel with 3 stages. This processor can handle complex addressing, serialization, and deserialization. Each processor has 2 input channels and 1 output channel, all of which are controlled by the instructions themselves. Additionally, the processor includes:

  • 1x ALU - 1x 32-bit OR 2x 16-bit operations
  • 8x General Purpose Registers
  • 16-bit instruction word size
  • 64 word local code/data RAM

ambric sr core.png
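The SR ALU is described as performing either one 32-bit operation or two 16-bit operations. The difference can be sketched with an addition example. This is an illustration only: the function names are hypothetical, and wraparound (modular) arithmetic in each lane is an assumption, since the text does not say how overflow is handled.

```python
MASK32 = 0xFFFFFFFF
MASK16 = 0xFFFF


def add32(a: int, b: int) -> int:
    """One 32-bit addition: a carry out of bit 15 propagates into bit 16."""
    return (a + b) & MASK32


def add2x16(a: int, b: int) -> int:
    """Two independent 16-bit additions packed in one 32-bit word.

    Assumption: each lane wraps around on overflow, so no carry crosses
    from the low half into the high half.
    """
    lo = ((a & MASK16) + (b & MASK16)) & MASK16
    hi = (((a >> 16) & MASK16) + ((b >> 16) & MASK16)) & MASK16
    return (hi << 16) | lo
```

With the operands `0x0001FFFF` and `0x00010001`, `add32` carries from the low half into the high half and gives `0x00030000`, while `add2x16` treats the halves as separate lanes and gives `0x00020000`.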

SRD Processor

ambric srd core instruction.png

The SRD (Streaming RISC with DSP extensions) Processor is a 32-bit processor for more complex operations that may benefit from instruction-level parallelism and iterative algorithms. The datapath itself is a self-synchronizing Ambric channel with 3 stages. Each processor has 2 input channels and 1 output channel, all of which are controlled by the instructions themselves. Additionally, the processor includes:

  • 3x ALU
    • 2x ALU in serial
      • 1x 32-bit OR 2x 16-bit OR 4x 8-bit operations
    • 1x ALU in parallel
      • 1x 32-bit * 8-bit OR 2x 16-bit * 8-bit, 1x 64-bit accumulator, etc.

ambric srd core.png
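The SRD's 32-bit by 8-bit multiply and 64-bit accumulator suggest a multiply-accumulate pattern of the kind used in filters and dot products. The sketch below shows why a wide accumulator matters; the names `mac` and `dot` are hypothetical, and unsigned operands with wraparound at 64 bits are assumptions, since the text does not specify signedness or overflow behavior.

```python
MASK64 = (1 << 64) - 1


def mac(acc: int, sample32: int, coeff8: int) -> int:
    """One multiply-accumulate step: acc + sample32 * coeff8, kept to 64 bits.

    Assumption: operands are unsigned and the accumulator wraps at 64 bits.
    """
    if not 0 <= sample32 <= 0xFFFFFFFF:
        raise ValueError("sample must fit in 32 bits")
    if not 0 <= coeff8 <= 0xFF:
        raise ValueError("coefficient must fit in 8 bits")
    return (acc + sample32 * coeff8) & MASK64


def dot(samples, coeffs) -> int:
    """Dot product of 32-bit samples and 8-bit coefficients via repeated MAC."""
    acc = 0
    for s, c in zip(samples, coeffs):
        acc = mac(acc, s, c)
    return acc
```

A single product of a 32-bit value and an 8-bit value can need up to 40 bits, so a sum of many such products quickly exceeds 32 bits; a 64-bit accumulator leaves headroom for long iterative sums.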

Programming

Software Development Board

Programming may be done in assembly or in aJava, a strict subset of Java. While aJava excludes the Java standard library, Ambric offered libraries for video and image processing (e.g. AVC-Intra, MPEG-2, H.264, JPEG 2000, DVCPRO HD).

Ambric employed a Structural Object Programming Model. Every object is strictly encapsulated. Because of the large number of cores each chip offers, objects are treated as independent programs running concurrently. Objects exchange data and control only through structures called Ambric channels, which are self-synchronizing and operate asynchronously.
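The idea of encapsulated objects that run concurrently and interact only through channels can be approximated in ordinary software. The sketch below is an analogy, not aJava and not Ambric's tooling: it uses Python threads as stand-ins for objects and bounded `queue.Queue` instances as stand-ins for channels, and the names `source`, `scale`, and `run_pipeline` are hypothetical.

```python
import queue
import threading

_END = object()  # sentinel marking the end of a stream


def source(out_ch: queue.Queue, values) -> None:
    """Object 1: emits values; put() blocks while the channel is full."""
    for v in values:
        out_ch.put(v)
    out_ch.put(_END)


def scale(in_ch: queue.Queue, out_ch: queue.Queue, factor: int) -> None:
    """Object 2: multiplies each word; get() blocks while the channel is empty."""
    while True:
        v = in_ch.get()
        if v is _END:
            out_ch.put(_END)
            return
        out_ch.put(v * factor)


def run_pipeline(values, factor: int = 2, depth: int = 1) -> list:
    """Wire source -> scale -> collector using bounded channels of size `depth`."""
    a = queue.Queue(maxsize=depth)
    b = queue.Queue(maxsize=depth)
    workers = [
        threading.Thread(target=source, args=(a, values)),
        threading.Thread(target=scale, args=(a, b, factor)),
    ]
    for w in workers:
        w.start()
    results = []
    while True:  # the caller acts as the final object, draining channel b
        v = b.get()
        if v is _END:
            break
        results.append(v)
    for w in workers:
        w.join()
    return results
```

No object here shares state with another; the blocking `put` and `get` calls are the only synchronization, which mirrors how the channels in the text let each object run at its own pace.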

Tools

Integrated Development Board
  • Am2045 Software Development Board
    • 1x production Am2045 + SDRAM
    • PCI Express interface to host
    • For rapid software development


  • Am2045 Integrated Development Board
    • 1x production Am2045 + SDRAM
    • 4x 32-bit GPIO connectors, USB
    • Stand-alone capable / PCIe slot
    • Serial Flash
    • For embedded development

Applications

The Am2000 has been used for high-definition video processing, medical imaging devices, high performance network processing, image recognition, and various military applications such as drones.

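Documents

  • The Ambric Processor Array, 2006
  • TeraOPS Hardware A New Massively-Parallel MIMD Computing Fabric IC, 2006
  • A Structural Object Programming Model, Architecture, Chip and Tools for Reconfigurable Computing, 2007
  • Ambric University Intro, 2008

Papers

  • Butts, Michael, Anthony Mark Jones, and Paul Wasson. "A structural object programming model, architecture, chip and tools for reconfigurable computing." Proceedings of IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM). 2007. http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4297243&isnumber=4297232
  • Qasim, Muhammad, and Chaudhry Majid Ali. "Signal Processing on Ambric Processor Array: Baseband processing in radio base stations." Halmstad University. 2008. http://hh.diva-portal.org/smash/get/diva2:238878/FULLTEXT01.pdf
  • Biedermann, Alexander. "Theoretische und algorithmische Evaluation eines massiv-parallelen Multiprozessorarrays der Ambric Am2000-Familie unter Verwendung von Videoalgorithmen." Diss. 2008. http://www.vlsi.informatik.tu-darmstadt.de/staff/biedermann/Diplomarbeit-Ausarbeitung.pdf

See also

  • massively parallel processor array
  • Kilocore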
