From WikiChip
Editing intel/microarchitectures/gen9.5

Latest revision Your text
Line 15: Line 15:
 
| successor link  = intel/microarchitectures/gen10
 
| successor link  = intel/microarchitectures/gen10
 
}}
 
}}
'''Gen9.5''' (''Generation 9.5'') is the [[microarchitecture]] for [[Intel]]'s [[graphics processing unit]] utilized by {{\\|Kaby Lake}}-based, {{\\|Coffee Lake}}-based, {{\\|Comet Lake}}-based, and {{\\|Goldmont Plus}}-based microprocessors. Gen9.5 is the successor to {{\\|Gen9}} used by {{\\|Skylake}} and introduces a number of light enhancements.
+
'''Gen9.5''' (''Generation 9.5'') is the [[microarchitecture]] for [[Intel]]'s [[graphics processing unit]] utilized by {{\\|Kaby Lake}}-based microprocessors. Gen9.5 is the successor to {{\\|Gen9}} used by {{\\|Skylake}} and introduces a number of light enhancements.
  
 
== Codenames ==
 
== Codenames ==
Line 44: Line 44:
 
| Windows || Linux || Windows || Linux || [[High Level Shading Language|HLSL]] || Windows || Linux || Windows || Linux
 
| Windows || Linux || Windows || Linux || [[High Level Shading Language|HLSL]] || Windows || Linux || Windows || Linux
 
|-
 
|-
| {{intel|UHD Graphics 600}} || 12 || GT1 || {{intel|Gemini Lake|l=core}} || - || rowspan="11" colspan="2" style="text-align: center;" | '''1.1''' || rowspan="11" style="text-align: center;" | '''12''' || rowspan="11" style="text-align: center;" | '''N/A''' || rowspan="11" style="text-align: center;" | '''5.1''' || rowspan="11" style="text-align: center;" | '''4.5''' || rowspan="11" style="text-align: center;" | '''4.5''' || rowspan="11" style="text-align: center;"  colspan="1" | '''2.1''' || style="text-align: center;" rowspan="11" | '''2.0'''
+
| {{intel|HD Graphics 610}} || 12 || GT1 || {{intel|Kaby Lake S|S}}, {{intel|Kaby Lake U|U}} || - || rowspan="7" colspan="2" style="text-align: center;" | '''1.0''' || rowspan="7" style="text-align: center;" | '''12''' || rowspan="7" style="text-align: center;" | '''N/A''' || rowspan="7" style="text-align: center;" | '''5.1''' || rowspan="7" style="text-align: center;" | '''4.4''' || rowspan="7" style="text-align: center;" | '''4.5''' || rowspan="7" style="text-align: center;"  colspan="2" | '''2.0'''
|-
 
| {{intel|UHD Graphics 605}} || 18 || GT1.5 || {{intel|Gemini Lake|l=core}} || -
 
|-
 
| {{intel|HD Graphics 610}} || 12 || GT1 || {{intel|Kaby Lake S|S}}, {{intel|Kaby Lake U|U}} || -
 
 
|-
 
|-
 
| {{intel|HD Graphics 615}} || 24 || GT2|| {{intel|Kaby Lake Y|Y}} || -
 
| {{intel|HD Graphics 615}} || 24 || GT2|| {{intel|Kaby Lake Y|Y}} || -
 
|-
 
|-
 
| {{intel|HD Graphics 620}} || 24 || GT2 || {{intel|Kaby Lake U|U}} || -
 
| {{intel|HD Graphics 620}} || 24 || GT2 || {{intel|Kaby Lake U|U}} || -
|-
 
| {{intel|UHD Graphics 620}} || 24 || GT2 || {{intel|Kaby Lake U|U}} || -
 
 
|-
 
|-
 
| {{intel|HD Graphics 630}} || 24 || GT2 || {{intel|Kaby Lake S|S}}, {{intel|Kaby Lake H|H}} || -
 
| {{intel|HD Graphics 630}} || 24 || GT2 || {{intel|Kaby Lake S|S}}, {{intel|Kaby Lake H|H}} || -
|-
 
| {{intel|UHD Graphics 630}} || 23/24 || GT2 || {{intel|Coffee Lake S|S}} || -
 
 
|-
 
|-
 
| {{intel|HD Graphics P630}} || 24 || GT2 || {{intel|Kaby Lake H|H}} || -
 
| {{intel|HD Graphics P630}} || 24 || GT2 || {{intel|Kaby Lake H|H}} || -
Line 79: Line 71:
 
|-
 
|-
 
| {{intel|HD Graphics 620}} || KBL-U 2+2 || H0 || C0/B0 || 0x5916 || 0x2
 
| {{intel|HD Graphics 620}} || KBL-U 2+2 || H0 || C0/B0 || 0x5916 || 0x2
|-
 
| {{intel|UHD Graphics 620}} || || || || 0x5917 ||
 
 
|-
 
|-
 
| rowspan="2" | {{intel|HD Graphics 630}} || KBL-S 4+2 || B0 || F0/C0  || 0x5912 || 0x4
 
| rowspan="2" | {{intel|HD Graphics 630}} || KBL-S 4+2 || B0 || F0/C0  || 0x5912 || 0x4
Line 86: Line 76:
 
| KBL Halo 4+2 ||  ||  || 0x591B ||  
 
| KBL Halo 4+2 ||  ||  || 0x591B ||  
 
|-
 
|-
| rowspan="2" | {{intel|UHD Graphics 630}} || CFL-S 4+2 || rowspan="2" | 23/24 || U0 ||  || 0x3E91 ||
+
| {{intel|HD Graphics P630}} || KBL WKS 4+2 ||  ||  || 0x591D ||   
|-
 
| CFL-S 6+2 || U0 ||  || 0x3E92 ||
 
|-
 
| {{intel|HD Graphics P630}} || KBL WKS 4+2 || 24 ||  ||  || 0x591D ||   
 
 
|-
 
|-
 
| {{intel|Iris Plus Graphics 640}} || KBL-U 2+3 || rowspan="2" | 48 ||  J1 || D1/B1 || 0x5926 || 0x6
 
| {{intel|Iris Plus Graphics 640}} || KBL-U 2+3 || rowspan="2" | 48 ||  J1 || D1/B1 || 0x5926 || 0x6
Line 98: Line 84:
  
 
<references group=devID />
 
<references group=devID />
 
== Performance ==
 
<div style="overflow-x: auto;">
 
{| class="wikitable" style="text-align: center; white-space: nowrap;"
 
! rowspan="2" | Frequency !! colspan="12" | Peak Performance
 
|-
 
! rowspan="13" | &nbsp; || colspan="3" | Half Precision || rowspan="13" | &nbsp; || colspan="3" | Single Precision || rowspan="13" | &nbsp; || colspan="3" | Double Precision
 
|-
 
| Models || {{intel|HD Graphics 610|610}} || {{intel|HD Graphics 615|615}}, {{intel|HD Graphics 620|620}}, {{intel|HD Graphics 630|630}}, {{intel|HD Graphics P630|P630}} || {{intel|Iris Plus Graphics 640|640}}, {{intel|Iris Plus Graphics 650|650}} || {{intel|HD Graphics 610|610}} || {{intel|HD Graphics 615|615}}, {{intel|HD Graphics 620|620}}, {{intel|HD Graphics 630|630}}, {{intel|HD Graphics P630|P630}} || {{intel|Iris Plus Graphics 640|640}}, {{intel|Iris Plus Graphics 650|650}} || {{intel|HD Graphics 610|610}} || {{intel|HD Graphics 615|615}}, {{intel|HD Graphics 620|620}}, {{intel|HD Graphics 630|630}}, {{intel|HD Graphics P630|P630}} || {{intel|Iris Plus Graphics 640|640}}, {{intel|Iris Plus Graphics 650|650}}
 
|-
 
| Tiers || GT1 || GT2 || GT3e || GT1 || GT2 || GT3e ||  GT1 || GT2 || GT3e
 
|-
 
| Ref (FLOP/clk) || 384/cycle || 768/cycle || 1536/cycle || 192/cycle || 384/cycle || 768/cycle || 48/cycle || 96/cycle || 192/cycle
 
|-
 
| Base (300 MHz) || {{#expr: 384*.3}} [[GFLOPS]] || {{#expr: 768*.3}} GFLOPS || {{#expr: 1536*.3}} GFLOPS || {{#expr: 192*.3}} GFLOPS || {{#expr: 384*.3}} GFLOPS || {{#expr: 768*.3}} GFLOPS || {{#expr: 48*.3}} GFLOPS || {{#expr: 96*.3}} GFLOPS || {{#expr: 192*.3}} GFLOPS
 
|-
 
| Base (350 MHz) || {{#expr: 384*.35}} GFLOPS || {{#expr: 768*.35}} GFLOPS || {{#expr: 1536*.35}} GFLOPS || {{#expr: 192*.35}} GFLOPS || {{#expr: 384*.35}} GFLOPS || {{#expr: 768*.35}} GFLOPS || {{#expr: 48*.35}} GFLOPS || {{#expr: 96*.35}} GFLOPS || {{#expr: 192*.35}} GFLOPS
 
|-
 
| Boost (850 MHz) || {{#expr: 384*.850}} GFLOPS || {{#expr: 768*.850}} GFLOPS || {{formatnum:{{#expr: 1536*.850}}}} GFLOPS || {{#expr: 192*.850}} GFLOPS || {{#expr: 384*.850}} GFLOPS || {{#expr: 768*.850}} GFLOPS || {{#expr: 48*.850}} GFLOPS || {{#expr: 96*.850}} GFLOPS || {{#expr: 192*.850}} GFLOPS
 
|-
 
| Boost (900 MHz) || {{#expr: 384*.9}} GFLOPS || {{#expr: 768*.9}} GFLOPS || {{formatnum:{{#expr: 1536*.9}}}} GFLOPS || {{#expr: 192*.9}} GFLOPS || {{#expr: 384*.9}} GFLOPS || {{#expr: 768*.9}} GFLOPS || {{#expr: 48*.9}} GFLOPS || {{#expr: 96*.9}} GFLOPS || {{#expr: 192*.9}} GFLOPS
 
|-
 
| Boost (950 MHz) || {{#expr: 384*.95}} GFLOPS || {{#expr: 768*.95}} GFLOPS || {{formatnum:{{#expr: 1536*.95}}}} GFLOPS || {{#expr: 192*.95}} GFLOPS || {{#expr: 384*.95}} GFLOPS || {{#expr: 768*.95}} GFLOPS || {{#expr: 48*.95}} GFLOPS || {{#expr: 96*.95}} GFLOPS || {{#expr: 192*.95}} GFLOPS
 
|-
 
| Boost (1,000 MHz) || {{#expr: 384*1}} GFLOPS || {{#expr: 768*1}} GFLOPS || {{formatnum:{{#expr: 1536*1}}}} GFLOPS || {{#expr: 192*1}} GFLOPS || {{#expr: 384*1}} GFLOPS || {{#expr: 768*1}} GFLOPS || {{#expr: 48*1}} GFLOPS || {{#expr: 96*1}} GFLOPS || {{#expr: 192*1}} GFLOPS
 
|-
 
| Boost (1,050 MHz) || {{#expr: 384*1.05}} GFLOPS || {{#expr: 768*1.05}} GFLOPS || {{formatnum:{{#expr: 1536*1.05}}}} GFLOPS || {{#expr: 192*1.05}} GFLOPS || {{#expr: 384*1.05}} GFLOPS || {{#expr: 768*1.05}} GFLOPS || {{#expr: 48*1.05}} GFLOPS || {{#expr: 96*1.05}} GFLOPS || {{#expr: 192*1.05}} GFLOPS
 
|-
 
| Boost (1,100 MHz) || {{#expr: 384*1.1}} GFLOPS || {{#expr: 768*1.1}} GFLOPS || {{formatnum:{{#expr: 1536*1.1}}}} GFLOPS || {{#expr: 192*1.1}} GFLOPS || {{#expr: 384*1.1}} GFLOPS || {{#expr: 768*1.1}} GFLOPS || {{#expr: 48*1.1}} GFLOPS || {{#expr: 96*1.1}} GFLOPS || {{#expr: 192*1.1}} GFLOPS
 
|-
 
| Boost (1,150 MHz) || {{#expr: 384*1.15}} GFLOPS || {{#expr: 768*1.15}} GFLOPS || {{formatnum:{{#expr: 1536*1.15}}}} GFLOPS || {{#expr: 192*1.15}} GFLOPS || {{#expr: 384*1.15}} GFLOPS || {{#expr: 768*1.15}} GFLOPS || {{#expr: 48*1.15}} GFLOPS || {{#expr: 96*1.15}} GFLOPS || {{#expr: 192*1.15}} GFLOPS
 
|}
 
</div>
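
Each cell in the performance table is the reference FLOP/clk figure for the tier multiplied by the clock frequency in GHz. The following is a minimal Python sketch of that arithmetic, not part of the page's own markup; the names <code>FLOP_PER_CLK</code> and <code>peak_gflops</code> are illustrative assumptions, and the per-clock figures are copied from the "Ref (FLOP/clk)" row.

```python
# Reference FLOP per clock by precision and tier, copied from the
# "Ref (FLOP/clk)" row of the table (names here are illustrative only).
FLOP_PER_CLK = {
    "half":   {"GT1": 384, "GT2": 768, "GT3e": 1536},
    "single": {"GT1": 192, "GT2": 384, "GT3e": 768},
    "double": {"GT1": 48,  "GT2": 96,  "GT3e": 192},
}


def peak_gflops(precision: str, tier: str, freq_mhz: float) -> float:
    """Return theoretical peak GFLOPS: FLOP/clk times frequency in GHz."""
    return FLOP_PER_CLK[precision][tier] * freq_mhz / 1000


if __name__ == "__main__":
    # Print the single-precision GT2 column at the base and top boost clocks.
    for mhz in (300, 1150):
        print(f"GT2 single precision @ {mhz} MHz: "
              f"{peak_gflops('single', 'GT2', mhz):.1f} GFLOPS")
```

These are theoretical peaks only; sustained throughput depends on the workload, memory bandwidth, and how long the boost clock can be held.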
 
  
 
== Hardware Accelerated Video ==
 
== Hardware Accelerated Video ==
Line 143: Line 96:
  
 
=== Key changes from {{\\|Gen9}} ===
 
=== Key changes from {{\\|Gen9}} ===
* Enhanced "14nm+" process (while CPU cores base frequency was increased, GPU speed remains unchanged)
+
* Enhanced "14nm+" process (while CPU cores base frequency was increase, GPU speed remains unchanged)
** Power consumption is reduced
 
 
* Display block
 
* Display block
 
** [[Embedded DisplayPort]] ([[eDP]]) now supports eDP Standard 1.4 (From 1.3)
 
** [[Embedded DisplayPort]] ([[eDP]]) now supports eDP Standard 1.4 (From 1.3)
 
* Unslice
 
* Unslice
 
** New native hardware support for 4K HEVC/VP9
 
** New native hardware support for 4K HEVC/VP9
** WiDi (Miracast) support has been enhanced
 
** VQE wider color gamut
 
 
* DRM
 
* DRM
 
** Support for Microsoft's PlayReady 3.0
 
** Support for Microsoft's PlayReady 3.0
Line 232: Line 182:
  
 
==== Preemption Granularity ====
 
==== Preemption Granularity ====
Preemption in Gen9 ({{\\|Skylake}}) was improved over Gen8 in a number of ways. Preemption is important for multi-tasking systems and especially important for improving the responsiveness of operations (i.e. the ability to stop and start operations quickly with minimal latency interruption for the end user). In {{\\|Broadwell}} ({{\\|Gen8}}) Intel added support for stopping operations at the object level for 3D workloads, such as on a triangle boundary (i.e. beginning of a triangle, between two triangles, between two lines or points), and for preempting and restoring back to those operations. In Gen9 Intel added the ability to stop execution units on an instruction boundary and restore them (previously such preemption was only possible at the boundary of a kernel, i.e. the entire kernel execution had to take place before preemption was possible). Gen9 added support for preemption granularities ranging from thread-group (complete kernel execution) to mid-thread (instruction boundary) for compute workloads:
+
Preemption in Gen9 ({{\\|Skylake}}) was improved over Gen8 in a number of way. Preemption is important for multi-tasking systems and especially important for improving the responsiveness of operations (i.e. the ability to stop and start operations quickly with minimal latency interruption for the end user). In {{\\|Broadwell}} ({{\\|Gen8}}) Intel added support for stopping operations at the object level for 3D workloads, such as on a triangle boundary (i.e. beginning of a triangle, between two triangles, between two lines or points), and for preempting and restoring back to those operations. In Gen9 Intel added the ability to stop execution units on an instruction boundary and restore them (previously such preemption was only possible at the boundary of a kernel, i.e. the entire kernel execution had to take place before preemption was possible). Gen9 added support for preemption granularities ranging from thread-group (complete kernel execution) to mid-thread (instruction boundary) for compute workloads:
  
 
Example of responsiveness (Source: IDF15)
 
Example of responsiveness (Source: IDF15)
Line 248: Line 198:
  
 
=== Display ===
 
=== Display ===
The display has a memory interface (supporting high memory bandwidth coming directly to the display sub-system), a front-end that is responsible for sorting and sequencing the requests (as well as handling things such as rotated displays), and display pipes. The display pipes perform input format conversion, multi-plane composition, color conversion, and scaling of the result. The final part of the display block is the port encoders, which convert the input from the display pipes to the appropriate standard (DP/HDMI/eDP). In Gen9, a number of improvements were made to the display pipes, specifically the ability to consume losslessly compressed data directly without any extra conversion operations. Additionally, the pipes now support render-compressed surfaces, Y-tiled surfaces, and on-the-fly 90/270 rotations.
+
The display has a memory interface (supporting high memory bandwidth coming directly to the display sub-system), a front-end that is responsible for sorting and sequencing the requests (as well as handling things such as rotated displays), and display pipes. The display pipes perform input format conversion, multi-plane composition, color conversion, and scaling of the result. The final part of the display block is the port encoders, which convert the input from the display pipes to the appropriate standard (DP/HDMI/eDP). In Gen9, a number of improvements were made to the display pipes, specifically the ability to consume losslessly compressed data directly without any extra conversion operations. Additionally, the pipes now support render-compressed surfaces, Y-tiled surfaces, and on-the-fly 90/207 rotations.
 
   
 
   
 
[[File:gen9 display block.svg|650px]]
 
[[File:gen9 display block.svg|650px]]
Line 475: Line 425:
 
* [[:File:intel-gfx-prm-osrc-kbl-vol02b-commandreference-enumerations.pdf|Volume 2b: Command Reference: Enumerations]]
 
* [[:File:intel-gfx-prm-osrc-kbl-vol02b-commandreference-enumerations.pdf|Volume 2b: Command Reference: Enumerations]]
 
* [[:File:intel-gfx-prm-osrc-kbl-vol02c-commandreference-registers-part1.pdf|Volume 2c: Command Reference: Registers Part 1 – Registers A through L]]
 
* [[:File:intel-gfx-prm-osrc-kbl-vol02c-commandreference-registers-part1.pdf|Volume 2c: Command Reference: Registers Part 1 – Registers A through L]]
* [[:File:intel-gfx-prm-osrc-kbl-vol02c-commandreference-registers-part2.pdf|Volume 2c: Command Reference: Registers Part 2 – Registers M through Z]]
+
* Volume 2c: Command Reference: Registers Part 2 – Registers M through Z (Manual Missing from Intel)
 
* [[:File:intel-gfx-prm-osrc-kbl-vol02d-commandreference-structures.pdf|Volume 2d: Command Reference: Structures]]
 
* [[:File:intel-gfx-prm-osrc-kbl-vol02d-commandreference-structures.pdf|Volume 2d: Command Reference: Structures]]
 
* [[:File:intel-gfx-prm-osrc-kbl-vol03-gpu overview.pdf|Volume 3: GPU Overview]]
 
* [[:File:intel-gfx-prm-osrc-kbl-vol03-gpu overview.pdf|Volume 3: GPU Overview]]

codename: Gen9.5
designer: Intel
first launched: August 30, 2016
full page name: intel/microarchitectures/gen9.5
instance of: microarchitecture
manufacturer: Intel
microarchitecture type: GPU
name: Gen9.5
process: 14 nm (0.014 μm, 1.4e-5 mm)