From WikiChip
TaiShan v110 - Microarchitectures - HiSilicon
< hisilicon
Revision as of 18:45, 5 May 2019 by David (talk | contribs) (Overview)

Edit Values
TaiShan v110 µarch
General Info
Arch TypeCPU
DesignerHiSilicon
ManufacturerTSMC
Introduction2019
Process7 nm
Core Configs32, 48, 64
Pipeline
TypeSuperscalar, Superpipeline
OoOEYes
SpeculativeYes
Reg RenamingYes
Decode4-way
Instructions
ISAARMv8.2-A
ExtensionsNEON
Cache
L1I Cache64 KiB/core
L1D Cache64 KiB/core
L2 Cache512 KiB/core
L3 Cache1 MiB/core
Succession

TaiShan v110 is the successor to the TaiShan v100, a high-performance ARM server microarchitecture designed by HiSilicon for Huawei's own TaiShan servers.

Brands

TaiShan-based CPUs are branded as the Kunpeng 920 series.

Release Dates

Kunpeng 920 CPUs were officially launched in early 2019.

Architecture

Key changes from TaiShan v100

This list is incomplete; you can help by expanding it.

Block Diagram

Entire Chip

taishan v110 soc block diagram.svg

Memory Hierarchy

  • Cache
    • L1I Cache
      • 64 KiB/core, private
    • L1D Cache
      • 64 KiB/core, private
    • L2 Cache
      • 512 KiB/core, private
    • L3 Cache
      • 1 MiB/core
      • Shared by all cores
    • System DRAM
      • 1 TiB Max Memory / socket
      • 8 Channels
      • DDR4, up to 2933 MT/s
        • 1 DPC and 2 DPC support
      • 8 B/cycle/channel (@ memory clock)
      • ECC, SDDC, DDDC

Overview

Overview

Though HiSilicon has a history of designing Arm processors. The TaiShan v110 core is HiSilicons' first custom homegrown high-performance ARM core and SoC design. The chip, which incorporates multiple compute dies and an I/O is a multi-chip package, is fabricated on TSMC's 7 nm process and integrates up to 64 cores and up to 64 MiB of last level cache.

Marketed as the Kunpeng 920, this SoC supports up to 4-way multiprocessing support through HiSilicon's Hydra interface. In order to keep the cores fed, eight DDR4 memory channels are incorporated per socket. Additionally, designed to facilitate an easy accelerator platform, there are 40 PCIe Gen 4 lanes provided per socket with CCIX support, enabling cache coherency.

Core

New text document.svg This section is empty; you can help add the missing info by editing this page.

MCP physical design

The SoC itself comprises 3 dies - two Super CPU Cluster (SCCL) compute dies and a Super IO Cluster (SICL). The SCCL compute dies contains 8 CPU Clusters (CCLs), memory controllers, and the L3 cache block. There are eight CCLs on each of the SICL dies for a total of 64 cores. The CCLs are TaiShan V110 quadplex along with the L3 cache tags partition. The Super IO Clusters include the various I/O peripherals including PCIe Gen 4, SAS, the network interface controllers, and the Hydra links.

taishan v110 soc details.svg

Scalability

Each chip incorporates three Hydra interface ports. The Hydra interface facilitates the cache coherency between the dies on the chip. Every link supports 240 Gb/s (30 GB/s) of peak bandwidth for a total aggregated bandwidth of 720 Gb/s (90 GB/s) in a 2-way symmetric multiprocessing configuration.

Kunpeng 920 2smp.svg

With all three links, there is also support for 4-way SMP. In this configuration, one link from each socket is connected to another socket for an all-for-all connection.


Kunpeng 920 4smp.svg

Die

  • TSMC 7 nm HPC
  • 20,000,000,000 transistors
    • 3-4 dies

All TaiShan Chips

New text document.svg This section is empty; you can help add the missing info by editing this page.

Bibliography

  • Huawei. Personal Communication. 2019
  • Huawei Connect 2018. October 2018
  • HiSilicon Event. January 7, 2019
codenameTaiShan v110 +
core count32 +, 48 + and 64 +
designerHiSilicon +
first launched2019 +
full page namehisilicon/microarchitectures/taishan v110 +
instance ofmicroarchitecture +
instruction set architectureARMv8.2-A +
manufacturerTSMC +
microarchitecture typeCPU +
nameTaiShan v110 +
process7 nm (0.007 μm, 7.0e-6 mm) +