From WikiChip
Cache Coherent Interconnect for Accelerators (CCIX)

Cache Coherent Interconnect for Accelerators (CCIX), pronounced "see-six", is an open cache coherent interconnect architecture developed by the CCIX Consortium. CCIX is designed to simplify the communication between the central processor and the various accelerators in the system through a cache-coherent extension to standard PCIe.

Overview

CCIX is a high-performance, chip-to-chip interconnect architecture that provides a cache coherent framework for heterogeneous system architectures. Cache coherency is maintained automatically at all times between the central processing unit and the accelerators in the system. Operating over standard PCIe, CCIX supports signaling rates between 16 GT/s and 25 GT/s per lane, with support for port aggregation for higher performance.
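
As a rough illustration of what these signaling rates mean in practice, the following sketch estimates one-direction bandwidth for a given lane count. It treats the quoted rates as per-lane figures and assumes 128b/130b line encoding (the encoding PCIe uses at 16 GT/s); the function name, the lane counts, and the `ports` parameter used to model port aggregation are illustrative choices, and protocol overhead above the physical layer is ignored.

```python
# Assumption: 128b/130b line encoding, as used by PCIe at 16 GT/s.
# Whether every CCIX rate uses this encoding is not stated in the article.
ENCODING_EFFICIENCY = 128 / 130


def raw_bandwidth_gbytes(rate_gt_s: float, lanes: int, ports: int = 1) -> float:
    """Approximate one-direction bandwidth in GB/s, before protocol overhead.

    rate_gt_s: per-lane signaling rate in GT/s (assumed per-lane here).
    lanes:     number of lanes in one port (hypothetical parameter).
    ports:     number of aggregated ports (hypothetical parameter).
    """
    gbits_per_second = rate_gt_s * lanes * ports * ENCODING_EFFICIENCY
    return gbits_per_second / 8  # bits -> bytes


if __name__ == "__main__":
    for rate in (16, 25):
        print(f"{rate} GT/s, x16, 1 port: {raw_bandwidth_gbytes(rate, 16):.1f} GB/s")
    # Two aggregated x16 ports at 25 GT/s simply double the estimate.
    print(f"25 GT/s, x16, 2 ports: {raw_bandwidth_gbytes(25, 16, ports=2):.1f} GB/s")
```

Under these assumptions a x16 port comes out at roughly 31.5 GB/s at 16 GT/s and 49.2 GB/s at 25 GT/s per direction; real throughput is lower once packet headers and flow control are accounted for.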

Every CCIX-enabled device incorporates at least one CCIX port, which is pin-compatible with any other CCIX-enabled device. CCIX supports a wide range of topologies, including chip-to-chip, chip-switch-chip, mesh, daisy chain, and ring.

Figure: CCIX topologies.

Versions

  • CCIX base specification 1.0 was released on June 18, 2018.

Architecture

CCIX is largely an extension of the PCI Express architecture and consists of a number of stacked protocol layers.

  • CCIX Protocol Layer - coherency protocol that manages memory reads and writes and provides a mapping for the on-chip, architecture-dependent coherency protocols (e.g., AMBA CHI/ACE). This layer also defines the cache state (e.g., shared/unique, clean/dirty).
  • CCIX Link Layer - formatting protocol for transport (e.g., PCIe). This layer also manages link aggregation.
  • CCIX Transport Specification
    • PCIe Transaction Layer - Handles PCIe transactions. This layer also supports virtual channels, permitting transmission of different data streams across a single shared link.
    • CCIX Transaction Layer - Handles CCIX transactions. This layer optimizes out superfluous PCIe packet fields, reducing overhead.
  • PCIe Data Link Layer - Handles standard data link layer functions (e.g., CRC, ACK).
  • CCIX/PCIe Physical Layer - Handles standard PCIe physical layer functions. This layer also supports the 25 GT/s Extended Speed Mode (ESM).
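
The layering above can be sketched as a small lookup table showing which layers a packet passes through on transmit. This is only an illustrative reading of the list: the layer names come from the article, while the top-to-bottom ordering, the `LAYER_PATHS` table, and the function names are assumptions made for the example rather than anything defined by the specification.

```python
# Layers shared by CCIX and plain PCIe traffic (names taken from the list above).
SHARED_LOWER_LAYERS = ["PCIe Data Link Layer", "CCIX/PCIe Physical Layer"]

# Hypothetical table: the ordering is an illustrative reading of the article.
LAYER_PATHS = {
    "ccix": [
        "CCIX Protocol Layer",
        "CCIX Link Layer",
        "CCIX Transaction Layer",
    ] + SHARED_LOWER_LAYERS,
    "pcie": ["PCIe Transaction Layer"] + SHARED_LOWER_LAYERS,
}


def transmit_path(traffic: str) -> list:
    """Return the layers a packet passes through on transmit, top to bottom."""
    return list(LAYER_PATHS[traffic])


def receive_path(traffic: str) -> list:
    """Return the same layers in receive order, bottom to top."""
    return list(reversed(LAYER_PATHS[traffic]))


if __name__ == "__main__":
    for kind in ("ccix", "pcie"):
        print(f"{kind}: " + " -> ".join(transmit_path(kind)))
```

Both kinds of traffic converge on the same data link and physical layers, which is what lets CCIX and ordinary PCIe transactions share one link.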

Agents

CCIX defines a number of possible agents which are identified by their Agent ID.

  • Request Agent (RA) - performs non-local read/write transactions on behalf of local functional blocks. The RA can also perform caching and maintain coherency.
  • Home Agent (HA) - manages coherency and memory access to a specific address range. A change in cache state will generate a snoop transaction to the request agent.
    • Slave Agent (SA) - an extension of the home agent that manages coherency and memory access to a specific address range found on an external device.
  • Error Agent (EA) - processes received error messages.

In the common case, host memory is shared with an attached accelerator. In this scenario, the host contains both a home agent and a request agent, while the accelerator has only a request agent. If the accelerator also has a large pool of memory that extends the virtual address space, a second home agent resides on the accelerator itself. All memory requests are initiated by a request agent (local or foreign) and sent to the home agent, which manages the memory access and caching. In more complex systems where the memory resides on a separate chip, the home agent forwards the request to the slave agent, which performs the access and returns the result to the home agent to be passed back to the request agent.
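
The read path described above can be modeled with a minimal sketch. It is a toy model, not the CCIX protocol: the class names mirror the agent names, but every field, method, and address is hypothetical, memory is a plain dictionary, and caching, snooping, and error handling are left out entirely.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SlaveAgent:
    """Toy model of memory that resides on a separate chip."""
    memory: dict = field(default_factory=dict)

    def read(self, addr: int) -> int:
        return self.memory.get(addr, 0)


@dataclass
class HomeAgent:
    """Toy model: owns the address range [base, limit)."""
    base: int
    limit: int
    memory: dict = field(default_factory=dict)
    slave: Optional[SlaveAgent] = None  # set when memory sits behind a slave agent

    def owns(self, addr: int) -> bool:
        return self.base <= addr < self.limit

    def read(self, addr: int) -> int:
        # Forward to the slave agent if the memory is external, else serve locally.
        if self.slave is not None:
            return self.slave.read(addr)
        return self.memory.get(addr, 0)


@dataclass
class RequestAgent:
    """Toy model: routes each request to the home agent owning the address."""
    homes: list

    def read(self, addr: int) -> int:
        for home in self.homes:
            if home.owns(addr):
                return home.read(addr)
        raise ValueError(f"no home agent owns address {addr:#x}")


# Hypothetical system: one home agent with local memory, and a second
# home agent whose memory is reached through a slave agent.
host_home = HomeAgent(base=0x0000, limit=0x1000, memory={0x10: 42})
remote_home = HomeAgent(base=0x1000, limit=0x2000,
                        slave=SlaveAgent(memory={0x1800: 7}))
accel_ra = RequestAgent(homes=[host_home, remote_home])

if __name__ == "__main__":
    print(accel_ra.read(0x10))    # served directly by host_home
    print(accel_ra.read(0x1800))  # remote_home forwards to its slave agent
```

The request agent never touches memory directly; it only locates the owning home agent, which keeps all access decisions for an address range in one place.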
