From WikiChip
Scorpio Engine - Microsoft
Revision as of 16:31, 13 December 2017 by ChippyBot (talk | contribs) (Bot: moving all {{mpu}} to {{chip}})

Scorpio Engine
scorpio engine (front).png
General Info
Designer: AMD, Microsoft
Manufacturer: TSMC
Model Number: Scorpio Engine
Market: Console
Introduction: June 11, 2017 (announced); November 7, 2017 (launched)
General Specs
Frequency: 2,300 MHz
Microarchitecture
ISA: x86-64 (x86)
Microarchitecture: Enhanced Jaguar
Core Name: Enhanced Jaguar
Process: 16 nm
Transistors: 7,000,000,000
Technology: CMOS
Die: 359 mm²
Word Size: 64 bit
Cores: 8
Threads: 8
Max Memory: 12 GiB
Multiprocessing
Max SMP: 1-Way (Uniprocessor)

Scorpio Engine is a 64-bit octa-core x86 SoC designed by AMD and Microsoft for the Xbox One X. The chip features eight Enhanced Jaguar cores operating at 2.3 GHz and a custom Arctic Islands-based GPU operating at 1.172 GHz. Fabricated on TSMC's 16FF+ process, the chip supports 12 GiB (24 GiB in development systems) of 12-channel GDDR5-6800 memory.

Overview

Like the PlayStation 4 and the Xbox One, the Xbox One X is powered by a chip based on AMD's architectures. According to John Sell, a Distinguished Engineer at Microsoft who presented the chip at Hot Chips 29, the Scorpio Engine's most important goal was achieving true 4K gaming performance. Fabricated on TSMC's 16 nm process, the chip contains 7 billion transistors on a 359 mm² die, similar in transistor count to Nvidia's GTX 1080 and almost the same size as the original Xbox One SoC (which was 363 mm² on a 28 nm process).

Like the original Xbox One SoC, the chip features eight Jaguar cores. The cores have been lightly enhanced and operate at a higher frequency of 2.3 GHz, but are otherwise mostly identical. AMD has not commercialized a 16 nm Jaguar processor outside of this chip.


scorpio engine block diagram.png

Cache

Main article: Enhanced Jaguar § Cache


Cache Organization
Cache is a hardware component containing a relatively small and extremely fast memory designed to speed up the performance of a CPU by preparing ahead of time the data it needs to read from a relatively slower medium such as main memory.

The organization and amount of cache can have a large impact on the performance, power consumption, die size, and consequently cost of the IC.

Cache is specified by its size, number of sets, associativity, block size, sub-block size, and fetch and write-back policies.

Note: All units are in kibibytes and mebibytes.
L1$: 512 KiB (524,288 B; 0.5 MiB)
  L1I$: 256 KiB (262,144 B; 0.25 MiB), 8x32 KiB, 2-way set associative
  L1D$: 256 KiB (262,144 B; 0.25 MiB), 8x32 KiB, 8-way set associative, write-back

L2$: 4 MiB (4,096 KiB; 4,194,304 B; 0.00391 GiB), 2x2 MiB, 16-way set associative, write-back
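
The totals in the cache table follow directly from the per-instance sizes, and the associativity determines how many sets each cache has. The sketch below reproduces that arithmetic. The 64-byte line size is an assumption (the table does not list a block size), and the names cache_sets and summarize are illustrative only.

```python
KIB = 1024
MIB = 1024 * KIB


def cache_sets(size_bytes: int, ways: int, line_bytes: int = 64) -> int:
    """Number of sets = size / (associativity * line size).

    line_bytes=64 is an assumption; the table does not state the block size.
    """
    return size_bytes // (ways * line_bytes)


# (instances, per-instance size in bytes, ways), taken from the cache table
caches = {
    "L1I$": (8, 32 * KIB, 2),
    "L1D$": (8, 32 * KIB, 8),
    "L2$": (2, 2 * MIB, 16),
}


def summarize(table: dict) -> dict:
    """Return {name: (total_bytes, sets_per_instance)} for each cache level."""
    return {
        name: (count * size, cache_sets(size, ways))
        for name, (count, size, ways) in table.items()
    }


summary = summarize(caches)
for name, (total, sets) in summary.items():
    print(f"{name}: total {total // KIB} KiB, {sets} sets per instance")
```

Under the assumed line size, the higher associativity of the L1D$ gives it a quarter as many sets as the L1I$ at the same capacity.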

Memory controller

The memory architecture of the Scorpio Engine is very different from that of the previous-generation chip. The entire memory subsystem was redesigned to address the major deficiencies of prior chips. In the previous chip, Microsoft chose 8 GiB of on-board quad-channel DDR3 operating at 2133 MT/s for a total system memory bandwidth of 63.58 GiB/s. That was low compared with its direct competitor: at the time, the PS4 SoC had a peak memory bandwidth of 163.9 GiB/s. To mitigate the lack of bandwidth, the chip incorporated 32 MiB of eSRAM with a 1,024-bit bus in each direction for a total bandwidth of 190 GiB/s. While this reduced the cost of memory considerably, it increased the cost in die area (the eSRAM accounted for roughly 1.6B transistors). This architecture also made the programming model more complicated to work with.

The Scorpio Engine reverses course architecturally. The entire memory subsystem has been redesigned around GDDR5 and high bandwidth. The 32 MiB of eSRAM has been eliminated. Instead, Scorpio has a maximum bandwidth of 304 GiB/s through a 384-bit wide DRAM interface with an effective transfer rate of 6800 MT/s. This puts its bandwidth at the level of a discrete graphics card and at almost 1.5x that of the PS4 Pro (203 GiB/s).
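
The bandwidth figures quoted in this section can be checked with the usual peak-bandwidth formula (bus width in bytes times transfer rate). The sketch below is a rough check, not a description of the actual controllers. It assumes a 64-bit width per DDR3 channel (so quad-channel is 256 bits) and the nominal DDR3-2133 rate of 6400/3 MT/s; the function name is illustrative.

```python
GIB = 2**30


def peak_bandwidth_bytes(bus_width_bits: int, transfers_per_sec: float) -> float:
    """Peak bandwidth in bytes/s = (bus width in bytes) * (transfers per second)."""
    return bus_width_bits / 8 * transfers_per_sec


# Original Xbox One SoC: quad-channel DDR3-2133.
# Assumptions: 64 bits per channel, nominal rate 6400/3 MT/s (about 2133.33 MT/s).
xbox_one = peak_bandwidth_bytes(4 * 64, 6400e6 / 3)

# Scorpio Engine: 384-bit GDDR5 interface at 6800 MT/s (from the text).
scorpio = peak_bandwidth_bytes(384, 6800e6)

# PS4 Pro figure quoted in the text, in GiB/s.
PS4_PRO_GIB_S = 203

print(f"Xbox One: {xbox_one / GIB:.2f} GiB/s")
print(f"Scorpio:  {scorpio / GIB:.2f} GiB/s ({scorpio / 1e9:.1f} GB/s)")
print(f"Scorpio vs PS4 Pro: {scorpio / GIB / PS4_PRO_GIB_S:.2f}x")
```

The small gap between the computed Scorpio value and the rounded 304 GiB/s comes from rounding in the quoted figure.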

Console systems come with 12 x32 GDDR5 chips for a total of 12 GiB, while the development system uses x16 chips for a total capacity of 24 GiB. Scorpio has 12 DRAM controllers, one for each of the 12 channels. To maintain backwards compatibility with Xbox One software, the CPU and GPU access the memory separately. The exact details of how the CPU accesses the memory have not been disclosed. As with the previous generation, a bidirectional bus sits between the memory controllers and the CPU cores and GPU. The bus maintains coherency between the CPU cores and the GPU. For the GPU, each of the 12 channels is a 256-bit directional bus for a total bus width of 3,072 bits. The 12 channels allow the system to lower the granularity of memory accesses while the wide bus allows the system to take full advantage of the available bandwidth.
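
The chip counts and capacities above are consistent with filling a 384-bit interface with either x32 or x16 devices. The sketch below works through that arithmetic. It assumes every chip holds 1 GiB (implied by 12 chips giving 12 GiB) and that the development system uses the same density in x16 mode; neither detail is stated explicitly in the text, and memory_config is an illustrative name.

```python
BUS_WIDTH_BITS = 384      # DRAM interface width (from the text)
CHANNELS = 12             # one DRAM controller per channel (from the text)
GIB_PER_CHIP = 1          # assumption: implied by 12 chips -> 12 GiB
GPU_CHANNEL_BUS_BITS = 256  # GPU-side bus width per channel (from the text)


def memory_config(chip_io_width_bits: int) -> dict:
    """Chips needed to fill the DRAM interface, and the resulting capacity."""
    chips = BUS_WIDTH_BITS // chip_io_width_bits
    return {
        "chips": chips,
        "chips_per_channel": chips // CHANNELS,
        "capacity_gib": chips * GIB_PER_CHIP,
    }


retail = memory_config(32)   # console: x32 chips
devkit = memory_config(16)   # development system: x16 chips
dram_bits_per_channel = BUS_WIDTH_BITS // CHANNELS
gpu_bus_bits = CHANNELS * GPU_CHANNEL_BUS_BITS

print("retail:", retail)
print("devkit:", devkit)
print("DRAM bits per channel:", dram_bits_per_channel)
print("GPU-side bus width:", gpu_bus_bits, "bits")
```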


Integrated Memory Controller
Max Type: GDDR5-6800
Max Mem: 12 GiB
Controllers: 12
Channels: 12
Width: 384 bit
Max Bandwidth: 304 GiB/s (311,296 MiB/s; 326.418 GB/s; 326,417.514 MB/s; 0.297 TiB/s; 0.326 TB/s)
Bandwidth (single channel): 25.33 GiB/s

Graphics

Further information: Arctic Islands microarchitecture
Hardware Accelerated Video Capabilities
Codec Encode Decode
MPEG-4 AVC (H.264) 4K @ 60 Hz
HEVC (H.265) 4K @ 60 Hz 4K @ 60 Hz
VP9 4K @ 60 Hz

Scorpio Engine features 40 Compute Units (CUs) based on Arctic Islands (similar to Polaris, Radeon RX 4xx). This is roughly triple the number in the original Xbox One SoC. The Compute Units operate at 1,172 MHz, each with 64 32-bit floating point multiply-accumulate units. At 1.172 GHz with 128 FLOP/cycle per CU, the chip can deliver 6.00064 TFLOPS of raw peak performance, over four times that of the previous chips (1.3 and 1.4 TFLOPS). Texture processing performance has also been increased to 187.5 G bilinear texels/second. Additionally, each of the four shader arrays has one geometry engine for fixed-function geometry processing, twice as many as in the previous chip, for a peak of 4.688 giga primitives/second.
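
The peak figures in this paragraph all derive from the 1.172 GHz clock. The sketch below reproduces them as a back-of-the-envelope check. It assumes a multiply-accumulate counts as 2 FLOP, one bilinear texel per TMU per clock (using the 160 TMUs listed for the GPU), and one primitive per geometry engine per clock; these per-clock rates are assumptions chosen because they match the quoted numbers.

```python
CLOCK_HZ = 1.172e9       # GPU clock (from the text)
CUS = 40                 # enabled compute units (from the text)
LANES_PER_CU = 64        # FP32 multiply-accumulate units per CU (from the text)
FLOP_PER_MAC = 2         # assumption: one multiply plus one add per MAC
TMUS = 160               # texture units (from the GPU table)
GEOMETRY_ENGINES = 4     # one per shader array (from the text)

# 64 lanes * 2 FLOP = 128 FLOP per cycle per CU
flop_per_cycle_per_cu = LANES_PER_CU * FLOP_PER_MAC

peak_flops = CUS * flop_per_cycle_per_cu * CLOCK_HZ
texel_rate = TMUS * CLOCK_HZ                  # assumes 1 texel/TMU/clock
primitive_rate = GEOMETRY_ENGINES * CLOCK_HZ  # assumes 1 primitive/engine/clock

print(f"Peak compute:   {peak_flops / 1e12:.5f} TFLOPS")
print(f"Texel rate:     {texel_rate / 1e9:.2f} Gtexels/s")
print(f"Primitive rate: {primitive_rate / 1e9:.3f} Gprimitives/s")
```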

Similar to previous chips, Scorpio has 2 command processors (i.e., microcontrollers) that handle graphics and compute tasks; however, they are said to handle more parallel compute tasks than those in the previous chip in order to increase performance. The number of ACEs (Asynchronous Compute Engines) on Scorpio has doubled to 4 to increase the available parallelism. Since the Scorpio Engine is based on Arctic Islands, there are also two additional schedulers.

scorpio engine rb-mem subsys.svg
Scorpio Engine GPU
Unified shaders: 2560 (64 × 40 CUs)
ROPs: 32
TMUs: 160
Peak Performance: ~6 TFLOPS (6,000,640,000,000 FLOPS)

RBs and Memory Subsystem

The render back-end has also been enhanced. It consists of eight Render Box (RB) color/depth engines, each with 256 KiB of graphics L2 cache for a total of 2 MiB. Each pair of Render Boxes is wired to one Memory Controller Cluster (MCC). Each MCC has two dedicated memory controllers and access to two more that are shared with a neighboring MCC. In total there are four clusters, each with two dedicated controllers (8 controllers), plus two pairs of controllers (4 controllers), each pair shared between a pair of MCCs, for a total of 12 channels.
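
One way to read the controller arrangement above is as a small mapping from memory controllers to the clusters that can reach them. The sketch below builds that mapping and checks the totals. The controller numbering and the choice of which MCCs are paired (0 with 1, 2 with 3) are assumptions made for illustration, not details from the source.

```python
RB_COUNT = 8
L2_PER_RB_KIB = 256
MCC_COUNT = 4
DEDICATED_PER_MCC = 2
SHARED_PER_MCC_PAIR = 2


def build_topology() -> dict:
    """Map controller id -> tuple of MCC ids that can use it.

    Numbering and pairing are hypothetical; only the counts come from the text.
    """
    controllers = {}
    next_id = 0
    # Two dedicated controllers per cluster.
    for mcc in range(MCC_COUNT):
        for _ in range(DEDICATED_PER_MCC):
            controllers[next_id] = (mcc,)
            next_id += 1
    # Two controllers shared by each pair of adjacent clusters.
    for first in range(0, MCC_COUNT, 2):
        for _ in range(SHARED_PER_MCC_PAIR):
            controllers[next_id] = (first, first + 1)
            next_id += 1
    return controllers


controllers = build_topology()
reachable = {
    mcc: [cid for cid, owners in controllers.items() if mcc in owners]
    for mcc in range(MCC_COUNT)
}
rb_to_mcc = {rb: rb // 2 for rb in range(RB_COUNT)}  # each RB pair -> one MCC
total_l2_kib = RB_COUNT * L2_PER_RB_KIB

print("controllers:", len(controllers))
print("reachable per MCC:", reachable)
print("graphics L2 total:", total_l2_kib, "KiB")
```

Under this reading, each cluster can reach four controllers, two of its own and two shared.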

Display

Scorpio Engine supports DP 1.2a, HDMI 2.0b, HDCP 2.2, and two-stream MST. 4K, 64-bit, 3-surface resize and blending is supported.

Audio

The audio subsystem uses 8 custom processors.

Utilizing devices

The Scorpio Engine is used solely in the Xbox One X.

Die


scorpio engine die shot.png


scorpio engine die shot (annotated).png

Yield & redundancy

In typical CPUs and GPUs, when a CPU core or a GPU shader unit has a defect, it is common for manufacturers to disable the affected cores or shaders (typically in a symmetrical way) and sell those chips as lower-end models. Since the Scorpio Engine is only found in the single-specification Xbox One X, binning is not possible. To improve yield, the Scorpio Engine incorporates 11 Compute Units (CUs) in each shader array: 10 are operational and the 11th serves as a spare. With 4 shader arrays, there are 4 spares and 40 enabled CUs. If one or a few compute units are faulty but the rest of the chip is fully functional, the spare CUs can be enabled to compensate.
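
The benefit of one spare CU per shader array can be illustrated with a toy binomial model. This is only a sketch: it assumes each CU is defective independently with the same probability, ignores defects elsewhere on the die, and uses a hypothetical defect probability of 2% that does not come from the source.

```python
from math import comb


def p_array_usable(p_defect: float, total: int = 11, needed: int = 10) -> float:
    """Probability that at least `needed` of `total` CUs in one array are good."""
    max_bad = total - needed
    return sum(
        comb(total, k) * p_defect**k * (1 - p_defect) ** (total - k)
        for k in range(max_bad + 1)
    )


def p_gpu_usable(p_defect: float, arrays: int = 4,
                 total: int = 11, needed: int = 10) -> float:
    """All shader arrays must independently provide `needed` good CUs."""
    return p_array_usable(p_defect, total, needed) ** arrays


p = 0.02  # hypothetical per-CU defect probability, for illustration only

with_spares = p_gpu_usable(p)                             # 11 CUs per array, 10 needed
without_spares = p_gpu_usable(p, total=10, needed=10)     # no spare: all 10 must work

print(f"usable with spares:    {with_spares:.3f}")
print(f"usable without spares: {without_spares:.3f}")
```

In this model a single spare per array tolerates at most one faulty CU in each array, so a chip with two defects in the same array would still be rejected.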

References

  • Sell, John. "Scorpio Engine." IEEE Hot Chips 29 (2017).
  • Sell, John, and Patrick O'Connor. "XBOX One Silicon." IEEE Hot Chips 25 (2013).