The graphics engine architecture
AMD has moved away from the VLIW5 and VLIW4 architectures we have seen in past product generations. If anything, VLIW4 showed certain inefficiencies in the Radeon HD 6900 series, and while VLIW designs are fine for graphics, they are less well suited to compute workloads.
The R9 series is also based on the very same GCN architecture as the 7900 series. GCN is short for Graphics Core Next, and its basic building block has changed significantly to remove certain inefficiencies seen in the VLIW architecture. GCN is in essence the basis of a GPU that performs well at both graphics and compute tasks. For the compute side of things, the new GCN Compute Unit model has been introduced; it is designed for better utilization, high throughput and multitasking, in other words: performance, performance, performance.
So the basic new shader cluster is called a (GCN) Compute Unit:
- Non-VLIW Design
- 16-wide SIMD units
- 64 KB registers / SIMD unit
Take four of these SIMD units and you have the basis of one Compute Unit (CU). Each SIMD unit is 16 wide, and with four SIMD units per CU, each CU has 64 shader processors.
The Pitcairn PRO GPU has 16 Compute Units, meaning 64 shader processors x 16 CUs = 1024 shader processors (for the Radeon HD 7850).
The Pitcairn XT GPU has 20 Compute Units, meaning 64 shader processors x 20 CUs = 1280 shader processors (for the Radeon HD 7870 / R9 270 / 270X).
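The arithmetic above can be written out as a small Python sketch. The constant and function names are our own, purely for illustration; only the figures (16-wide SIMDs, four SIMDs per CU, 16 or 20 CUs) come from the text.

```python
# Figures taken from the text; names are illustrative only.
SIMD_WIDTH = 16    # lanes (shader processors) per SIMD unit
SIMDS_PER_CU = 4   # SIMD units per Compute Unit


def shader_processors(compute_units):
    """Return the total shader processor count for a given number of CUs."""
    per_cu = SIMD_WIDTH * SIMDS_PER_CU  # 16 x 4 = 64 per CU
    return per_cu * compute_units


pitcairn_pro = shader_processors(16)
pitcairn_xt = shader_processors(20)

print("Pitcairn PRO:", pitcairn_pro)  # Pitcairn PRO: 1024
print("Pitcairn XT:", pitcairn_xt)    # Pitcairn XT: 1280
```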
- Engine has Dual Geometry engines / Asynchronous Compute engines
- 32 color ROPs per clock cycle
- Engine ties to 512 KB R/W L2 cache
- Pitcairn GPU has up to 20 Compute Units
The Graphics Core Next Compute Unit (CU) has about the same floating-point power per clock as its predecessor (i.e. Cayman). It also has the same amount of register space for the vector units. Each CU has its own registers and local data share.
Older 7800 slide; the architecture is the same, aside from the GPU core clock and memory bandwidth.
GCN is more efficient since it does not rely on instruction-level parallelism (we assume it costs some more area/transistors as well). As a result, compiling also becomes much less complicated, which means more efficiency and thus, there it is again, better performance. GCN is all about creating a GPU that is good for both graphics and compute purposes.
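To give a feel for why dependence on instruction-level parallelism hurts a VLIW design, here is a deliberately simplified toy model in Python. It is not how AMD's shader compiler works. It is a greedy packer that we made up for illustration, assuming a 4-slot bundle (as in VLIW4) and a hypothetical `deps` table that lists which earlier operations each operation needs.

```python
def pack_vliw(deps, width=4):
    """Greedily pack ops into VLIW bundles (toy model, hypothetical helper).

    deps maps each op index to the set of op indices it depends on.
    An op may only issue once all its dependencies sit in earlier bundles.
    """
    done = set()
    remaining = sorted(deps)
    bundles = []
    while remaining:
        # Pick up to `width` ops whose dependencies are already satisfied.
        bundle = [op for op in remaining if deps[op] <= done][:width]
        if not bundle:
            raise ValueError("cyclic dependency")
        bundles.append(bundle)
        done.update(bundle)
        remaining = [op for op in remaining if op not in bundle]
    return bundles


def slot_utilization(deps, width=4):
    """Fraction of issue slots that hold useful work."""
    bundles = pack_vliw(deps, width)
    return len(deps) / (len(bundles) * width)


# Four independent ops fill one bundle completely.
independent = {0: set(), 1: set(), 2: set(), 3: set()}
# Four ops in a dependent chain need one bundle each.
chain = {0: set(), 1: {0}, 2: {1}, 3: {2}}

print(slot_utilization(independent))  # 1.0
print(slot_utilization(chain))        # 0.25
```

In this toy model the dependent chain leaves three of every four slots idle. A GCN SIMD, by contrast, issues one operation at a time across many work-items, so its lanes stay busy regardless of how much parallelism a single work-item's instruction stream contains.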