From WikiChip

Latest revision as of 10:03, 19 April 2019

Pixel Visual Core

General Info
Designer: Google
Manufacturer: TSMC
Part Number: X726C502
S-Spec: SR3HX
Market: Mobile, Embedded
Introduction: October 17, 2017 (announced); October 17, 2017 (launched)

General Specs
Frequency: 800 MHz

Microarchitecture
ISA: vISA, pISA
Process: 28 nm
Technology: CMOS

Electrical
TDP: 8 W

Pixel Visual Core (PVC) is an image processing unit (IPU) custom-designed by Google and introduced in late 2017 for the Pixel 2 smartphone and future IoT applications. Fabricated by TSMC on its 28HPM process, the IPU is a fully programmable domain-specific processor designed from the ground up to deliver high performance at low power.

Overview

The Pixel Visual Core is designed as a co-processor for various consumer products. Although it is currently used only in the Pixel 2 and Pixel 3 smartphones, Google plans to use it in other IoT products in the future. The chip incorporates a dedicated ARM Cortex-A53 core, which handles application-level resource requests and configures the processing units for the specific workload. For example, if an application requests an image capture using HDR+, the management core reconfigures the processing units so that the image captured by the camera is processed into HDR+ format. The PVC is optimized for high performance by racing to sleep: it has a power budget of 6-8 W for very short bursts of around 10-20 seconds and then drops back down to milliwatt levels when idle. The chip relies equally on hardware and software to achieve its performance and efficiency, using TensorFlow for machine learning and Halide for image processing.

Architecture

The chip incorporates eight custom image processing unit (IPU) cores. Each core comprises 512 arithmetic logic units (ALUs) organized as 256 processing elements (PEs) arranged in a 16 x 16 two-dimensional array. The cores execute a custom VLIW ISA designed to expose as much instruction-level and data-level parallelism as possible. Though the chip supports 32-bit integers, the native operations are done on much simpler logic that operates on 8-bit and 16-bit integers, so larger data sizes operate at half throughput. The basic primitive of the stencil operations is the multiply-accumulate (MAC), which multiplies 16-bit operands and accumulates into 32 bits.
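
As a rough illustration of the multiply-accumulate primitive described above, the following Python sketch applies a 3 x 3 stencil at one position of a 16 x 16 tile, multiplying 16-bit values and keeping the running sum in a 32-bit accumulator. The function names, the signed wraparound behavior, the tile contents, and the kernel are all assumptions made for illustration; none of them come from PVC documentation.

```python
def wrap_signed(value, bits):
    """Wrap an integer into the signed two's-complement range of `bits` bits."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value


def mac(acc, a, b):
    """Multiply two 16-bit operands and add the product to a 32-bit accumulator."""
    a16 = wrap_signed(a, 16)
    b16 = wrap_signed(b, 16)
    return wrap_signed(acc + a16 * b16, 32)


def stencil_at(tile, kernel, row, col):
    """Apply a 3x3 stencil centered on (row, col); the caller keeps it in bounds."""
    acc = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            acc = mac(acc, tile[row + dr][col + dc], kernel[dr + 1][dc + 1])
    return acc


# A 16 x 16 tile, one value per processing element (values are made up).
tile = [[r * 16 + c for c in range(16)] for r in range(16)]
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]  # unnormalized box filter (hypothetical)

result = stencil_at(tile, box, 1, 1)
```

On the real hardware each PE would evaluate its own output position in parallel; the sketch evaluates a single position sequentially only to show the arithmetic widths.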

There are two 16-bit ALUs per processing element, and they can operate in three distinct modes: independent, joined, and fused. In the most common case, independent mode, the two ALUs operate separately on two different pairs of operands (i.e., A1 op B1 and A2 op B2). In joined mode, the two ALUs act as a single larger ALU producing 32-bit values. In fused mode, the two ALUs are chained to form a fused 16-bit operation (i.e., A op [B op C]).
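
The three ALU modes can be modeled in a few lines of Python. This is only a behavioral sketch: the function names are hypothetical, and unsigned wraparound at 16 or 32 bits is an assumption, since the text does not say how overflow is handled.

```python
import operator

MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def independent(op, a1, b1, a2, b2):
    """Two separate 16-bit results: (A1 op B1, A2 op B2)."""
    return op(a1, b1) & MASK16, op(a2, b2) & MASK16


def joined(op, a, b):
    """Both ALUs act as one wide ALU producing a single 32-bit result."""
    return op(a, b) & MASK32


def fused(outer, inner, a, b, c):
    """Chained 16-bit operation: A outer (B inner C)."""
    return outer(a, inner(b, c) & MASK16) & MASK16


pair = independent(operator.add, 1, 2, 0xFFFF, 1)   # second lane wraps at 16 bits
wide = joined(operator.add, 0xFFFF, 1)              # carry is kept in 32 bits
chain = fused(operator.add, operator.mul, 10, 3, 4)  # 10 + (3 * 4)
```

Comparing `pair` and `wide` shows the practical difference between the independent and joined modes: the same addition overflows in a 16-bit lane but not in the combined 32-bit ALU.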

Because the MACs are not pipelined, they set the clock cycle time. At 800 MHz, the chip is capable of 4,096 FLOPs/cycle (2 ALUs x 16 x 16 PEs x 8 cores), or about 3.28 TeraFLOPS of raw compute power.
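
The peak-throughput figure can be reproduced with a short calculation. The constant names below are illustrative, and reading the factors as ALUs per PE, PE rows, PE columns, and cores is an interpretation of the "2*16*16*8" expression based on the architecture description.

```python
ALUS_PER_PE = 2            # two 16-bit ALUs per processing element
PE_ROWS = 16               # 16 x 16 PE array per core
PE_COLS = 16
CORES = 8                  # eight IPU cores
CLOCK_HZ = 800_000_000     # 800 MHz

ops_per_cycle = ALUS_PER_PE * PE_ROWS * PE_COLS * CORES
ops_per_second = ops_per_cycle * CLOCK_HZ

print(ops_per_cycle)                    # 4096
print(f"{ops_per_second / 1e12:.2f}")   # 3.28
```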

ISA

The exposed vISA is deployed as pISA to the individual cores. The pISA is a 119-bit VLIW with the following fields:

Field   Scalar   Math   Memory   Imm   MemImm
Bits    43       38     12       16    10
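
The field widths listed above add up to exactly 119 bits, which the following Python sketch checks while packing and unpacking a word. Only the field names and widths come from the table; the bit ordering (first field in the most significant bits), the identifiers, and the sample values are assumptions for illustration, not the documented pISA encoding.

```python
# Field names and widths from the table; the ordering within the word is assumed.
PISA_FIELDS = [
    ("scalar", 43),
    ("math", 38),
    ("memory", 12),
    ("imm", 16),
    ("mem_imm", 10),
]

WORD_BITS = sum(width for _, width in PISA_FIELDS)  # 119


def pack(values):
    """Pack a dict of field values into one integer, first field in the top bits."""
    word = 0
    for name, width in PISA_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word):
    """Split a packed word back into its fields."""
    values = {}
    for name, width in reversed(PISA_FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


sample = {"scalar": 1, "math": 2, "memory": 3, "imm": 4, "mem_imm": 5}  # made-up values
word = pack(sample)
```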

Die

Floorplan

google pvc floorplan.png

Die

  • TSMC 28 nm process (28HPM)
google pvc die.png
