Latest revision |
Your text |
Line 1: |
Line 1: |
− | {{nervana title|Neural Network Processors (NNP)}} | + | {{nervana title|NNP}} |
| {{ic family | | {{ic family |
| | title = NNP | | | title = NNP |
Line 29: |
Line 29: |
| | predecessor = | | | predecessor = |
| | predecessor link = | | | predecessor link = |
− | | successor = Habana HL-Series | + | | successor = |
− | | successor link = habana/hl | + | | successor link = |
| }} | | }} |
− | '''Neural Network Processors''' ('''NNP''') is a family of [[neural processors]] designed by [[Intel Nervana]] for both [[inference]] and [[training]]. | + | '''Neural Network Processors''' ('''NNP''') are a family of [[neural processors]] designed by [[Intel Nervana]] for both [[inference]] and [[training]]. |
− | | |
− | The NNP family was discontinued on January 31, 2020, in favor of the [[Habana]] {{habana|HL}} series.
| |
| | | |
| == Overview == | | == Overview == |
− | Neural network processors (NNP) is a family of [[neural processors]] designed by [[Intel]] for the [[acceleration]] of [[artificial intelligence]] workloads. The name and original architecture originated with the [[Nervana]] startup prior to its acquisition by [[Intel]] in [[2016]]. Although the first product was announced in 2017, it never made it past customer sampling and ultimately served as a learning product. Intel productized those chips starting with its second-generation designs in late 2019. | + | Neural network processors (NNP) are a family of [[neural processors]] designed by [[Intel]] for the [[acceleration]] of [[artificial intelligence]] workloads. The design originated with [[Nervana]] prior to its acquisition by [[Intel]]. Intel eventually productized those chips starting with its second-generation designs. |
− | | |
− | The NNP family comprises two separate series - '''NNP-I''' for [[inference]] and '''NNP-T''' for [[training]]. The two series use entirely different architectures. The training chip is a direct descendant of Nervana's original ASIC design. Those chips use the PCIe and [[OCP OAM|OAM]] form factors, which have high TDPs and are designed for maximum performance in the data center and in workstations. Unlike the NNP-T, the NNP-I inference chips are the product of Intel IDC and are architecturally very different from the training chips. They use Intel's low-power client SoC as the base SoC and build the AI architecture from there. The inference chips use low-power PCIe, M.2, and ruler form factors designed for servers, workstations, and embedded applications.
| |
− | | |
− | On January 31, 2020, Intel announced that it had discontinued the Nervana NNP product line in favor of the unified architecture it had acquired from [[Habana Labs]] a month earlier.
| |
− | | |
− | === Codenames ===
| |
− | {| class="wikitable"
| |
− | |-
| |
− | ! Introduction || Type || Microarchitecture || Process
| |
− | |-
| |
− | | 2017<sup>1</sup> || [[Training]] || {{nervana|Lake Crest|l=arch}} || [[TSMC 28 nm|28 nm]]
| |
− | |-
| |
− | | 2019 || Training || {{nervana|Spring Crest|l=arch}} || [[TSMC 16 nm|16 nm]]
| |
− | |-
| |
− | | 2019 || [[Inference]] || {{nervana|Spring Hill|l=arch}} || [[Intel 10 nm|10 nm]]
| |
− | |- style="text-decoration:line-through"
| |
− | | 2020 || Training+CPU || {{nervana|Knights Crest|l=arch}} || ?
| |
− | |}
| |
| | | |
− | 1 - Only sampled
| + | The NNP family comprises two separate series - '''NNP-I''' for [[inference]] and '''NNP-L''' for [[training]]. |
| | | |
− | == Training (NNP-T) == | + | == Learning (NNP-L) == |
| === Lake Crest === | | === Lake Crest === |
| {{main|nervana/microarchitectures/lake_crest|l1=Lake Crest µarch}} | | {{main|nervana/microarchitectures/lake_crest|l1=Lake Crest µarch}} |
| The first generation of NNPs were based on the {{nervana|Lake Crest|Lake Crest microarchitecture|l=arch}}. Manufactured on [[TSMC]]'s [[28 nm process]], those chips were never productized. Samples were used for customer feedback and the design mostly served as a software development vehicle for their follow-up design. | | The first generation of NNPs were based on the {{nervana|Lake Crest|Lake Crest microarchitecture|l=arch}}. Manufactured on [[TSMC]]'s [[28 nm process]], those chips were never productized. Samples were used for customer feedback and the design mostly served as a software development vehicle for their follow-up design. |
− | === T-1000 Series (Spring Crest) === | + | === NNP T-1000 (Spring Crest) === |
| [[File:nnp-l-1000 announcement.png|thumb|right|NNP T-1000]] | | [[File:nnp-l-1000 announcement.png|thumb|right|NNP T-1000]] |
| {{main|nervana/microarchitectures/spring_crest|l1=Spring Crest µarch}} | | {{main|nervana/microarchitectures/spring_crest|l1=Spring Crest µarch}} |
− | Launched in late 2019, second-generation NNP-Ts are branded as the NNP T-1000 series and are the first chips to be productized. Fabricated on [[TSMC]]'s [[16 nm process]] and based on the {{nervana|Spring Crest|Spring Crest microarchitecture|l=arch}}, those chips feature a number of enhancements and refinements over the prior generation, including a shift from [[Flexpoint]] to [[Bfloat16]] and a considerable performance uplift. Intel claims that these chips have about 3-4x the training performance of the first generation. All NNP-T 1000 chips come with 32 GiB of [[HBM2]] memory in four stacks in a [[CoWoS]] package and come in two form factors: a [[PCIe Gen 3]] card and an [[OCP OAM]] [[accelerator card]].
| + | Second-generation NNP-Ls are branded as the NNP T-1000 series and are the first chips to be productized. Fabricated on [[TSMC]]'s [[16 nm process]] and based on the {{nervana|Spring Crest|Spring Crest microarchitecture|l=arch}}, those chips feature a number of enhancements and refinements over the prior generation, including a shift from [[Flexpoint]] to [[Bfloat16]]. Intel claims that these chips have about 3-4x the training performance of the first generation. Those chips come with 32 GiB of [[HBM2]] memory in four stacks and are [[packaged]] in two forms - a [[PCIe x16 Gen 4 Card]] and an [[OCP OAM]]. |
− | [[File:spring_crest_ocp_board_(front).png|right|thumb|NNP-T 1400 [[OAM Module]].]]
| |
− | | |
− | * '''Proc''' [[16 nm process]]
| |
− | * '''Mem''' 32 GiB, HBM2-2400
| |
− | * '''TDP''' 300-400 W (150-250 W typical power)
| |
− | * '''Perf''' 108 TOPS ([[bfloat16]])
| |
− | | |
− | <!-- NOTE:
| |
− | This table is generated automatically from the data in the actual articles.
| |
− | If a microprocessor is missing from the list, an appropriate article for it needs to be
| |
− | created and tagged accordingly.
| |
− | | |
− | Missing a chip? please dump its name here: https://en.wikichip.org/wiki/WikiChip:wanted_chips
| |
− | -->
| |
− | {{comp table start}}
| |
− | <table class="comptable sortable tc4">
| |
− | {{comp table header|main|7:List of NNP-T 1000-based Processors}}
| |
− | {{comp table header|main|5:Main processor|1:Performance}}
| |
− | {{comp table header|cols|Launched|TDP|EUs|Frequency|[[HBM2]]|Peak Perf ([[bfloat16]])}}
| |
− | {{#ask: [[Category:microprocessor models by intel]] [[microarchitecture::Spring Crest]]
| |
− | |?full page name
| |
− | |?model number
| |
− | |?first launched
| |
− | |?tdp
| |
− | |?core count
| |
− | |?base frequency#MHz
| |
− | |?max memory#GiB
| |
− | |?peak flops (half-precision)#TFLOPS
| |
− | |format=template
| |
− | |template=proc table 3
| |
− | |userparam=8
| |
− | |mainlabel=-
| |
− | }}
| |
− | {{comp table count|ask=[[Category:microprocessor models by intel]] [[microarchitecture::Spring Crest]]}}
| |
− | </table>
| |
− | {{comp table end}}
| |
− | | |
− | ==== POD Reference Design ====
| |
− | [[File:ai hw summit supermicro ref pod rack.jpeg|right|thumb|POD Rack]]
| |
− | Along with the launch of the NNP-T 1000 series, Intel also introduced the POD reference design. Those systems were intended for large scale-out deployments that process very large neural networks. The POD reference design featured 10 racks with 6 nodes per rack. Each of the nodes features eight interconnected OAM cards, producing a system with a total of 480 NNP-Ts.
| |
| | | |
| | | |
− | :[[File:ai hw summit supermicro ref pod.jpeg|500px]]
| + | {{future information}} |
| | | |
| == Inference (NNP-I) == | | == Inference (NNP-I) == |
− | === I-1000 Series (Spring Hill) === | + | === NNP I-1000 Series === |
| [[File:nnp-i-1000.png|right|thumb|NNP I-1000]] | | [[File:nnp-i-1000.png|right|thumb|NNP I-1000]] |
− | {{main|intel/microarchitectures/spring_hill|l1=Spring Hill µarch}} | + | {{main|nervana/microarchitectures/spring_hill|l1=Spring Hill µarch}} |
− | The NNP I-1000 series is Intel's first series of devices designed specifically for the [[acceleration]] of inference workloads. Fabricated on [[Intel's 10 nm process]], these chips are based on {{nervana|Spring Hill|l=arch}} and incorporate a {{intel|Sunny Cove|Sunny Cove core|l=arch}} along with twelve specialized inference acceleration engines. The overall SoC design borrows a considerable amount of IP from {{intel|Ice Lake (Client)|Ice Lake|l=arch}}. Those devices come in [[M.2]] and PCIe form factors. | + | The NNP I-1000 series is Intel's first series of chips designed specifically for the acceleration of inference workloads. Fabricated on [[Intel]]'s [[10 nm process]], those chips are based on {{nervana|Spring Hill|l=arch}} and incorporate a {{intel|Sunny Cove|Sunny Cove core|l=arch}}. Those devices come in the [[M.2]] form factor. |
− | [[File:nnp-i ruler.jpg|right|thumb|NNP-I Ruler]]
| |
− | [[File:supermicro nnp-i chassis.jpg|thumb|right|NNP-I Ruler Chassis.]]
| |
− | | |
− | * '''Proc''' [[10 nm process]]
| |
− | * '''Mem''' 4x32b LPDDR4x-4200
| |
− | * '''TDP''' 10-50 W
| |
− | * '''Eff''' 2.0-4.8 TOPS/W
| |
− | * '''Perf''' 48-92 TOPS (Int8)
| |
| | | |
− | <!-- NOTE:
| |
− | This table is generated automatically from the data in the actual articles.
| |
− | If a microprocessor is missing from the list, an appropriate article for it needs to be
| |
− | created and tagged accordingly.
| |
− |
| |
− | Missing a chip? please dump its name here: https://en.wikichip.org/wiki/WikiChip:wanted_chips
| |
− | -->
| |
− | {{comp table start}}
| |
− | <table class="comptable sortable tc4">
| |
− | {{comp table header|main|5:List of NNP-I-1000-based Processors}}
| |
− | {{comp table header|main|3:Main processor|1:Performance}}
| |
− | {{comp table header|cols|Launched|TDP|EUs|Peak Perf (Int8)}}
| |
− | {{#ask: [[Category:microprocessor models by intel]] [[microarchitecture::Spring Hill]]
| |
− | |?full page name
| |
− | |?model number
| |
− | |?first launched
| |
− | |?tdp
| |
− | |?core count
| |
− | |?peak integer ops (8-bit)#TOPS
| |
− | |format=template
| |
− | |template=proc table 3
| |
− | |userparam=6
| |
− | |mainlabel=-
| |
− | }}
| |
− | {{comp table count|ask=[[Category:microprocessor models by intel]] [[microarchitecture::Spring Hill]]}}
| |
− | </table>
| |
− | {{comp table end}}
| |
| | | |
− | Intel also announced the NNP-I in an [[EDSFF]] (ruler) form factor, which was designed to provide the highest possible compute density for inference. Intel hasn't announced specific models. The rulers were planned to come with a 10-35 W TDP range. 32 NNP-Is in a ruler form factor can be packed in a single 1U rack.
| + | {{future information}} |
| | | |
| == See also == | | == See also == |
− | * [[neural processor]]
| |
| * {{intel|DL Boost}} | | * {{intel|DL Boost}} |
− | * {{habana|HL Series}}
| |