From WikiChip
Xiaomi - Microarchitectures - Phytium
Revision as of 15:47, 10 February 2017 by David (talk | contribs) (Block Diagram)

Xiaomi µarch
General Info
Arch Type: CPU
Designer: Phytium
Manufacturer: TSMC
Introduction: 2017
Process: 28 nm
Pipeline
Type: Superscalar
Speculative: Yes
Reg Renaming: Yes
Instructions
ISA: ARMv8
Cores
Core Names: FTC660, FTC661

Xiaomi is an ARM microarchitecture designed in-house by Phytium for its consumer-market and server microprocessors.

Brands

Codename   Brand      Description
Mars       FT-2000    High performance; high bandwidth, large memory; high-bandwidth I/O; large-scale cache coherency
Earth      FT-1500A   Moderate performance; high power efficiency; high-density computing; low cost

Architecture

Overview

  • Fully ARMv8 compatible
    • Support AArch32 and AArch64 modes
    • EL0-EL3 supported
    • ASIMD-128
  • 28 nm process
  • Scalable design
    • 4 to 64 cores
  • Mesh topology network-on-chip
  • Panel-based (grid) architecture
  • Global cache coherency
  • 2x DDR3-1600 channels per panel
  • 2x 16-lane PCIe 3.0

Panel Architecture

[Figure: Xiaomi panel-based data affinity architecture]

Phytium organizes its processors as a grid of units called panels, a layout it calls the panel-based data affinity architecture. Each panel consists of 8 independent ARMv8-compatible cores. Phytium's "Mars" processor consists of 8 such panels for a total of 64 cores. Panels are interconnected by a 2-dimensional mesh network-on-chip. Each panel has 4 MiB of level 2 cache, for a total of 32 MiB.
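The totals above, and the way distance grows across a 2-dimensional mesh, can be sketched in a few lines of Python. This is only an illustration: the per-panel figures come from the text, but the 2 x 4 grid shape, the row-major panel numbering, and the use of dimension-ordered (XY) routing are assumptions made for the example and are not stated by the source.

```python
from __future__ import annotations

# Figures from the text: 8 cores and 4 MiB of L2 per panel, 8 panels on "Mars".
CORES_PER_PANEL = 8
L2_MIB_PER_PANEL = 4
MARS_PANELS = 8

# ASSUMPTION: the 8 panels form a 2 x 4 grid, numbered row by row.
# The text only says the mesh is 2-dimensional.
GRID_ROWS, GRID_COLS = 2, 4


def totals(panels: int = MARS_PANELS) -> tuple[int, int]:
    """Return (core count, total L2 in MiB) for a chip with `panels` panels."""
    return panels * CORES_PER_PANEL, panels * L2_MIB_PER_PANEL


def panel_coords(panel: int) -> tuple[int, int]:
    """Map a panel index to (row, col) in the assumed row-major grid."""
    if not 0 <= panel < GRID_ROWS * GRID_COLS:
        raise ValueError(f"panel index out of range: {panel}")
    return divmod(panel, GRID_COLS)


def mesh_hops(src: int, dst: int) -> int:
    """Hop count between two panels, assuming dimension-ordered (XY) routing."""
    (r0, c0), (r1, c1) = panel_coords(src), panel_coords(dst)
    return abs(r0 - r1) + abs(c0 - c1)


if __name__ == "__main__":
    cores, l2_mib = totals()
    print(f"{cores} cores, {l2_mib} MiB of L2")
    print("hops between opposite corners:", mesh_hops(0, 7))
```

Under these assumptions the farthest pair of panels is four hops apart, which is one reason a design of this kind would try to keep data close to the panel that uses it.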

In addition to the main die, Mars uses auxiliary Cache & Memory Chips (CMC). "Mars" uses 8 such chips connected to the main die, each providing 16 MiB of level 3 cache for a total of 128 MiB, as well as 8 dual-channel DDR3-1600 memory controllers for a total maximum bandwidth of 204 GB/s. Mars also provides two 16-lane PCIe 3.0 interfaces. The chips incorporate ECC and parity protection on all caches, tags, and TLBs.
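The cache and bandwidth totals in this paragraph can be checked with simple arithmetic. The sketch below assumes a standard 64-bit (8-byte) DDR3 channel, which the text does not state explicitly; with that assumption the peak comes out at 204.8 GB/s, in line with the 204 GB/s figure quoted above.

```python
# Figures from the text.
CMC_CHIPS = 8
L3_MIB_PER_CMC = 16
MEMORY_CONTROLLERS = 8
CHANNELS_PER_CONTROLLER = 2          # "dual-channel"
TRANSFERS_PER_SECOND = 1_600_000_000 # DDR3-1600: 1600 MT/s

# ASSUMPTION: each DDR3 channel is the standard 64 bits (8 bytes) wide.
BYTES_PER_TRANSFER = 8


def total_l3_mib() -> int:
    """Total level 3 cache across all CMC chips, in MiB."""
    return CMC_CHIPS * L3_MIB_PER_CMC


def peak_bandwidth_gb_per_s() -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    channels = MEMORY_CONTROLLERS * CHANNELS_PER_CONTROLLER
    bytes_per_second = channels * TRANSFERS_PER_SECOND * BYTES_PER_TRANSFER
    return bytes_per_second / 1_000_000_000


if __name__ == "__main__":
    print(f"L3 total: {total_l3_mib()} MiB")
    print(f"Peak DRAM bandwidth: {peak_bandwidth_gb_per_s():.1f} GB/s")
```

This is a theoretical peak; sustained bandwidth on real workloads would be lower.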

Panel

Each panel consists of 8 cores, each ARMv8-compatible and supporting AArch32 and AArch64 modes, Exception Levels EL0-EL3, and ASIMD-128 operations. Each core has its own inclusive L1 cache, and the cores of a panel share an L2 cache (4 MiB per panel). Each panel contains two Directory Control Units (DCUs), which maintain directory-based cache coherency, and one routing cell, which manages inter-panel communication.
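To illustrate what "directory-based cache coherency" means in general, here is a toy directory in Python. It is a generic textbook-style sketch with a single-writer, multiple-reader rule, not Phytium's actual protocol; the class names, the state model, and the return values are all hypothetical and chosen only for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DirectoryEntry:
    """Who holds one cache line (hypothetical, simplified state)."""
    owner: int | None = None                      # core holding a modified copy
    sharers: set[int] = field(default_factory=set)  # cores holding clean copies


class Directory:
    """Toy directory: at most one writer, or any number of readers, per line."""

    def __init__(self) -> None:
        self.entries: dict[int, DirectoryEntry] = {}

    def read(self, core: int, line: int) -> list[int]:
        """Record a read by `core`; return cores that must write back first."""
        entry = self.entries.setdefault(line, DirectoryEntry())
        if entry.owner == core:
            return []                 # the reader already owns the line
        downgraded = []
        if entry.owner is not None:
            downgraded.append(entry.owner)
            entry.sharers.add(entry.owner)  # previous owner keeps a clean copy
            entry.owner = None
        entry.sharers.add(core)
        return downgraded

    def write(self, core: int, line: int) -> list[int]:
        """Record a write by `core`; return cores whose copies are invalidated."""
        entry = self.entries.setdefault(line, DirectoryEntry())
        holders = set(entry.sharers)
        if entry.owner is not None:
            holders.add(entry.owner)
        invalidated = sorted(holders - {core})
        entry.sharers.clear()
        entry.owner = core
        return invalidated


if __name__ == "__main__":
    directory = Directory()
    directory.read(0, 0x40)
    directory.read(1, 0x40)
    print("invalidated on write:", directory.write(2, 0x40))
    print("downgraded on read:", directory.read(3, 0x40))
```

The point of a directory is that a write only needs to contact the cores recorded as holding the line, rather than broadcasting to every core on the chip.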

On TSMC's 28 nm process, a panel is 6,000 µm x 10,600 µm (63.6 mm²).
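The quoted area follows directly from the panel dimensions, as the short Python check below shows. The eight-panel figure it also computes is only the combined area of the panels themselves; the text does not give a total die size, so it should not be read as one.

```python
# Panel dimensions from the text, in micrometres.
PANEL_WIDTH_UM = 6_000
PANEL_HEIGHT_UM = 10_600
UM2_PER_MM2 = 1_000_000  # 1 mm^2 = 1000 um x 1000 um


def panel_area_mm2(width_um: int = PANEL_WIDTH_UM,
                   height_um: int = PANEL_HEIGHT_UM) -> float:
    """Area of one panel in square millimetres."""
    return (width_um * height_um) / UM2_PER_MM2


def combined_panel_area_mm2(panels: int = 8) -> float:
    """Summed area of `panels` panels (not a die size; excludes everything else)."""
    return (panels * PANEL_WIDTH_UM * PANEL_HEIGHT_UM) / UM2_PER_MM2


if __name__ == "__main__":
    print(f"one panel: {panel_area_mm2():.1f} mm^2")
    print(f"eight panels: {combined_panel_area_mm2():.1f} mm^2")
```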

[Figures: Xiaomi panel; Xiaomi panel die (28 nm)]

Block Diagram

[Figure: Xiaomi block diagram]

Memory Hierarchy

This section is empty.

Pipeline

This section is empty.

References

  • Zhang, C. (2015, August). Mars: A 64-core ARMv8 processor. In Hot Chips 27 Symposium (HCS), 2015 IEEE (pp. 1-23). IEEE.