See also
* Dhrystone DMIPS/MHz
* Whetstone WMIPS/MHz
Revision as of 14:15, 16 February 2020
CoreMark/MHz is a measurement of single thread performance per clock frequency based on the CoreMark benchmark.
Overview
CoreMark/MHz is a measurement of single thread performance per clock frequency. The number is based on the CoreMark benchmark score. It is obtained by taking the single-core CoreMark number and dividing it by the clock speed used when the benchmark is performed.
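The calculation described above can be sketched as follows. This is a minimal illustration only: the function name `coremark_per_mhz` and the sample numbers are hypothetical and are not measurements of any real core.

```python
def coremark_per_mhz(coremark_score: float, clock_mhz: float) -> float:
    """Divide a single-core CoreMark score by the clock speed (in MHz) of the run."""
    if clock_mhz <= 0:
        raise ValueError("clock speed must be a positive number of MHz")
    return coremark_score / clock_mhz


# Hypothetical run: a core that scores 1000 CoreMark while clocked at 200 MHz.
print(coremark_per_mhz(1000.0, 200.0))  # 5.0
```

The clock speed passed in should be the one actually used during the benchmark run, not the part's rated maximum.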
While CoreMark is a relatively simple benchmark that addresses some of the deficiencies of Dhrystone, it was designed around embedded applications. It therefore shows highly favorable numbers for relatively simple designs (e.g., dual-issue in-order) while scaling more weakly in complex designs (e.g., out-of-order superscalar). As a result, it may sometimes show a very well-designed in-order core achieving more than 80% of the performance of very complex high-performance OoO cores, while real-world applications will demonstrably show significantly bigger gaps and discrepancies. Additionally, since the score is normalized by clock frequency, it cannot be used to derive absolute performance. Furthermore, since it is possible to achieve a higher CoreMark/MHz score at a considerably lower frequency through well-known techniques such as shortening the pipeline, which also saves a significant amount of silicon, using CoreMark/MHz per unit area to derive area efficiency is problematic.
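The normalization caveat can be illustrated with a short sketch. Both cores below are hypothetical, as is the helper name `absolute_coremark`; the numbers are invented purely to show that a higher CoreMark/MHz value does not imply a higher absolute score once clock frequency is taken into account.

```python
def absolute_coremark(per_mhz: float, clock_mhz: float) -> float:
    """Recover an absolute CoreMark score from a per-MHz score and a clock speed."""
    return per_mhz * clock_mhz


# Hypothetical short-pipeline core: better per-MHz score, lower achievable clock.
short_pipeline = absolute_coremark(5.0, 400.0)
# Hypothetical long-pipeline core: worse per-MHz score, much higher clock.
long_pipeline = absolute_coremark(4.0, 1500.0)

print(short_pipeline)                  # 2000.0
print(long_pipeline)                   # 6000.0
print(short_pipeline < long_pipeline)  # True
```

Under these assumed numbers, the core with the better CoreMark/MHz value delivers only a third of the absolute score, which is why the per-MHz figure alone says nothing about absolute performance.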
Scores
Note that some values were taken as reported by the chip designers while other values were calculated by WikiChip.
Designer | Microarchitecture | Intro | ISA | CoreMark/MHz |
---|---|---|---|---|
Arm | Cortex-M55 | February 10, 2020 | ARMv8.1-M Mainline, FPU, Helium | 4.2 |
Arm | Cortex-M7 | | ARMv7-M, FPU | 5.01 |
Arm | Cortex-M35P | | ARMv8-M Mainline, FPU | 4.02 |
Arm | Cortex-M33 | | ARMv8-M Mainline, FPU | 4.02 |
Arm | Cortex-M4 | | ARMv7-M, FPU | 3.42 |
Arm | Cortex-M3 | | ARMv7-M | 3.34 |
Arm | Cortex-M23 | | ARMv8-M Baseline | 2.64 |
Arm | Cortex-M1 | | ARMv6-M | 1.85 |
Arm | Cortex-M0+ | | ARMv6-M | 2.46 |
Arm | Cortex-M0 | | ARMv6-M | 2.33 |
Arm | Cortex-R4 | | ARMv7-R | 3.47 |
Arm | Cortex-R5 | | ARMv7-R | 3.47 |
Arm | Cortex-R7 | | ARMv7-R | 4.62 |
Arm | Cortex-R8 | | ARMv7-R | 4.62 |
Arm | Cortex-R52 | | ARMv8-R | 4.2 |
SiFive | E20 | | RV32IMC | 2.51 |
SiFive | E21 | | RV32IMAC | 3.1 |
SiFive | E24 | | RV32IMAFC | 3.1 |
SiFive | U74 | | RV64GC | 5.1 |
SiFive | U54 | | RV64GC | 3.01 |
SiFive | E31 | | RV32IMAC | 3.01 |
SiFive | E76 | | RV32IMAFC | 5.1 |
SiFive | S76 | | RV64GC | 5.1 |
SiFive | E34 | | RV32IMAFC | 3.01 |
SiFive | S54 | | RV64IMAFDC | 3.01 |
UC Berkeley | BOOMv2 (2-way) | | RV64IMAFD | 3.92 |
UC Berkeley | BOOMv2 (4-way) | | RV64IMAFD | 4.7 |
Western Digital | SweRV | | RV32IMC | 4.9 |