From WikiChip
Difference between revisions of "coremark-mhz"

(corrected swerv score according to official WD data https://riscv.org/wp-content/uploads/2019/12/12.11-14.20a3-Bandic-WD_SweRV_Cores_Roadmap_v4SCR.pdf)
 

Latest revision as of 10:32, 16 June 2022

CoreMark/MHz is a measurement of single thread performance per clock frequency based on the CoreMark benchmark.

Overview

CoreMark/MHz is a measurement of single-thread performance per unit of clock frequency. The number is based on the CoreMark benchmark score. It is obtained by taking the single-core CoreMark number and dividing it by the clock speed (in MHz) at which the benchmark was run.
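The division described above can be sketched in a few lines of Python. The function name `coremark_per_mhz` and the sample run (a score of 1002 at 200 MHz) are hypothetical and chosen only for illustration; they are not taken from any vendor report.

```python
def coremark_per_mhz(coremark_score: float, clock_mhz: float) -> float:
    """Normalize a single-core CoreMark score by the clock used for the run.

    coremark_score: single-core CoreMark result (iterations per second)
    clock_mhz:      core clock during the benchmark run, in MHz
    """
    if clock_mhz <= 0:
        raise ValueError("clock_mhz must be positive")
    return coremark_score / clock_mhz


# Hypothetical run: a CoreMark score of 1002 measured at 200 MHz.
example = coremark_per_mhz(1002.0, 200.0)
print(round(example, 2))  # 5.01
```

The clock passed in should be the one actually used during the benchmark run, not the part's rated maximum, since the metric is defined relative to the measured run.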

While CoreMark is a relatively simple benchmark that addresses some of the deficiencies of Dhrystone, it was designed around embedded applications. It therefore produces highly favorable numbers for relatively simple designs (e.g., dual-issue in-order) while scaling more weakly on complex designs (e.g., out-of-order superscalar). As a result, it may sometimes show a very well-designed in-order core achieving >80% of the performance of very complex high-performance OoO cores, while real-world applications will demonstrably show significantly bigger gaps and discrepancies. Additionally, since the score is normalized by clock frequency, it cannot be used to derive absolute performance. Furthermore, since it's possible to achieve a higher CoreMark/MHz at a considerably lower frequency through well-known techniques such as shortening the pipeline, which also saves a significant amount of silicon, using CoreMark/MHz per unit area to derive area efficiency is problematic.
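The normalization caveat can be illustrated with a small sketch. The two cores below and all of their numbers are hypothetical, made up only to show how a core with the better CoreMark/MHz can still deliver the lower absolute CoreMark score once clock frequency is taken into account.

```python
def absolute_coremark(score_per_mhz: float, clock_mhz: float) -> float:
    """Recover the absolute CoreMark score from CoreMark/MHz and the clock."""
    return score_per_mhz * clock_mhz


# Hypothetical short-pipeline core: high CoreMark/MHz, low achievable clock.
short_pipeline = absolute_coremark(score_per_mhz=5.0, clock_mhz=400.0)

# Hypothetical deep-pipeline core: lower CoreMark/MHz, much higher clock.
deep_pipeline = absolute_coremark(score_per_mhz=4.0, clock_mhz=1600.0)

print(short_pipeline)  # 2000.0
print(deep_pipeline)   # 6400.0

# The core that "wins" on CoreMark/MHz loses on absolute throughput.
print(short_pipeline < deep_pipeline)  # True
```

For the same reason, ranking cores by CoreMark/MHz alone says nothing about which one finishes a workload sooner unless the clock frequency each can sustain is also known.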

Scores

Note that some values were taken as reported by the chip designers while other values were calculated by WikiChip.

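{|
! Designer !! Microarchitecture !! Intro !! ISA !! CoreMark/MHz
|-
| Arm || Cortex-M55 || February 10, 2020 || ARMv8.1-M Mainline, FPU, Helium || 4.2
|-
| Arm || Cortex-M7 || || ARMv7-M, FPU || 5.01
|-
| Arm || Cortex-M35P || || ARMv8-M Mainline, FPU || 4.02
|-
| Arm || Cortex-M33 || || ARMv8-M Mainline, FPU || 4.02
|-
| Arm || Cortex-M4 || || ARMv7-M, FPU || 3.42
|-
| Arm || Cortex-M3 || || ARMv7-M || 3.34
|-
| Arm || Cortex-M23 || || ARMv8-M Mainline || 2.64
|-
| OnChip || Apicalis || || RV32IM || 2.7
|-
| Arm || Cortex-M1 || || ARMv6-M || 1.85
|-
| Arm || Cortex-M0+ || || ARMv6-M || 2.46
|-
| Arm || Cortex-M0 || || ARMv6-M || 2.33
|-
| Arm || Cortex-R4 || || ARMv7-R || 3.47
|-
| Arm || Cortex-R5 || || ARMv7-R || 3.47
|-
| Arm || Cortex-R7 || || ARMv7-R || 4.62
|-
| Arm || Cortex-R8 || || ARMv7-R || 4.62
|-
| Arm || Cortex-R52 || || ARMv8-R || 4.2
|-
| Arm || Cortex-R82 || || ARMv8-R || 5.82
|-
| SiFive || E20 || || RV32IMC || 2.51
|-
| SiFive || E21 || || RV32IMAC || 3.1
|-
| SiFive || E24 || || RV32IMAFC || 3.1
|-
| SiFive || U74 || || RV64GC || 5.1
|-
| SiFive || U54 || || RV64GC || 3.01
|-
| SiFive || E31 || || RV32IMAC || 3.01
|-
| SiFive || E76 || || RV32IMAFC || 5.1
|-
| SiFive || S76 || || RV64GC || 5.1
|-
| SiFive || E34 || || RV32IMAFC || 3.01
|-
| SiFive || S54 || || RV64IMAFDC || 3.01
|-
| UC Berkeley || BOOMv2 (2-way) || || RV64IMAFD || 3.92
|-
| UC Berkeley || BOOMv2 (4-way) || || RV64IMAFD || 4.7
|-
| Western Digital || SweRV EH1 || || RV32IMC || 5.78
|-
| Western Digital || SweRV EH2 single thread || || RV32IMC || 6.03
|-
| Western Digital || SweRV EH2 two threads || || RV32IMAC || 7.83
|-
| Western Digital || SweRV EL2 || || RV32IMC || 4.33
|-
| Alibaba || XT910 || July 25, 2019 || RV64GCV || 7.1
|}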

See also