Graphics engine architecture - PCIe Gen 3 and Eyefinity 2.0
The Graphics Engine Architecture
With the initial release of the 7000 series, AMD moved away from the VLIW5 and VLIW4 architectures we saw in the last generation of products. If anything, VLIW4 showed certain inefficiencies in the Radeon HD 6900 series, and while VLIW designs are fine for graphics, they are not so grand for computing.
Both the Radeon HD 7700 and 7800 series are based on the very same GCN architecture as the 7900 series.
GCN is short for Graphics Core Next, and the architectural building block has changed significantly to remove certain inefficiencies seen in the VLIW architecture. GCN in essence is the basis of a GPU that performs well at both graphics and compute tasks. For the compute side of things the new GCN Compute Unit model has been introduced. It is designed for better utilization, high throughput and multitasking, i.e. performance, performance, performance.
So the basic new shader cluster is called a (GCN) Compute Unit:
- Non-VLIW Design
- 16 wide SIMD units
- 64 KB registers / SIMD unit
Now if we take 4 of these SIMD units, that will be the basis of one Compute Unit (CU). Each SIMD unit is 16 wide, and four of them per Compute Unit means that each CU has 64 shader processors.
The Pitcairn PRO GPU has 16 Compute Units, meaning 64 shader processors x 16 CUs = 1024 Shader Processors (for the R7850).
The Pitcairn XT GPU has 20 Compute Units, meaning 64 shader processors x 20 CUs = 1280 Shader Processors (for the R7870).
- Engine has Dual Geometry Engines / Asynchronous Compute Engines
- 32 color ROPs per clock cycle
- Engine ties to 512KB R/W L2 cache
- Pitcairn GPU has up to 20 Compute Units
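The shader processor counts quoted above are simple multiplication, and a minimal sketch makes that explicit. The SIMD width, SIMD count per CU and CU counts are taken from the text; the function and dictionary names are just illustrative and not anything AMD ships.

```python
# Sketch: shader processor count for a GCN GPU, using the figures in the text.
SIMD_WIDTH = 16    # lanes per SIMD unit
SIMDS_PER_CU = 4   # SIMD units per Compute Unit


def shader_processors(compute_units: int) -> int:
    """Return the total shader processor count for a given number of CUs."""
    return SIMD_WIDTH * SIMDS_PER_CU * compute_units


# Hypothetical lookup table; the CU counts are the ones quoted in the article.
PITCAIRN = {"Pitcairn PRO (R7850)": 16, "Pitcairn XT (R7870)": 20}

for name, cus in PITCAIRN.items():
    print(f"{name}: {cus} CUs -> {shader_processors(cus)} shader processors")
```

The same function works for any other GCN part as long as you know its Compute Unit count.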
The Graphics Core Next Compute Unit (CU) has about the same floating point power per clock as the previous one (i.e. Cayman). It also has the same amount of register space (for the vector units). Each CU also has its own registers and local data share.
GCN is more efficient since it does not require instruction level parallelism (we assume it costs some more area/transistors as well). The outcome is that compiling also becomes much less complicated, which means more efficiency and thus, there it is again, better performance. GCN is all about creating a GPU good for both graphics and computing purposes.
PCIe Gen 3
In Q3 and Q4 of 2011 we saw a lot of PCIe Gen 3 motherboard announcements. What's that all about, you ask? In a nutshell, PCI Express Gen 3 provides twice the transfer rate of the previous generation, which delivers capabilities for next generation extreme gaming solutions.
So as opposed to the current PCI Express slots, which are Gen 2, a PCI Express Gen 3 x16 slot will have twice the available bandwidth at 32GB/s bi-directional, plus improved efficiency and compatibility, and as such it will offer better performance for current and next gen PCI Express cards.
To make it even more understandable, going from PCIe Gen 2 to Gen 3 doubles the bandwidth available to the add-on cards installed, from 500MB/s per lane to roughly 1GB/s per lane in each direction.
So a Gen 3 PCI Express x16 slot is capable of offering roughly 16GB/s (or 128Gbit/s) of bandwidth in each direction. That results in 32GB/s of bi-directional bandwidth.
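For those who like to check the numbers, here is a small sketch of the bandwidth arithmetic. The transfer rates and line encodings (5 GT/s with 8b/10b for Gen 2, 8 GT/s with 128b/130b for Gen 3) come from the PCI Express specifications rather than from this article, and the function names are purely illustrative. It shows that the 1GB/s per lane and 16GB/s per slot figures are rounded up slightly.

```python
# Sketch: usable PCIe bandwidth per lane and per slot.
# Each entry: (transfer rate in GT/s, payload bits, total bits on the wire).
GENERATIONS = {
    2: (5.0, 8, 10),     # Gen 2: 5 GT/s, 8b/10b encoding
    3: (8.0, 128, 130),  # Gen 3: 8 GT/s, 128b/130b encoding
}


def lane_bandwidth_mb_s(gen: int) -> float:
    """Usable bandwidth of a single lane in one direction, in MB/s."""
    gt_per_s, payload_bits, total_bits = GENERATIONS[gen]
    usable_gbit_s = gt_per_s * payload_bits / total_bits
    return usable_gbit_s * 1000 / 8  # Gbit/s -> MB/s


def slot_bandwidth_gb_s(gen: int, lanes: int = 16, bidirectional: bool = False) -> float:
    """Usable bandwidth of a slot in GB/s, per direction unless bidirectional."""
    per_direction = lane_bandwidth_mb_s(gen) * lanes / 1000
    return per_direction * 2 if bidirectional else per_direction


for gen in (2, 3):
    print(
        f"Gen {gen}: {lane_bandwidth_mb_s(gen):.0f} MB/s per lane, "
        f"x16 = {slot_bandwidth_gb_s(gen):.2f} GB/s per direction, "
        f"{slot_bandwidth_gb_s(gen, bidirectional=True):.2f} GB/s bi-directional"
    )
```

These are theoretical link rates only; protocol overhead means real transfers land a bit lower still.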
The big problem is that you need a symbiosis of properly compatible hardware: a Gen 3 supporting motherboard, a Gen 3 capable processor and a graphics card supporting the new standard. A lot of Z68 and all X79 motherboards are PCIe Gen 3 certified. Processor-wise, the upcoming Ivy Bridge CPUs from Intel will support Gen 3. It is still pending whether or not Sandy Bridge-E will get Gen 3 support.
Eyefinity 2.0
One of the biggest success stories of the Radeon series was the introduction of Eyefinity. Eyefinity allows you to use multiple monitors in desktop and gaming mode. Typically you needed the very same monitors and resolutions, but Eyefinity 2.0 changes that. You are now actually able to create a custom resolution. So if you have three differently sized monitors, you can actually get that working (not that I'd recommend it).
More monitor signal bandwidth is available with the 7000 series cards as well, so you may now create resolutions of up to 16k x 16k. This allows you to set up, say, five screens in 5x1 landscape mode with 1920x1200 or even 2560x1600 monitors.
You guys will slowly start to understand now why the R7000 has SO MUCH graphics memory, right? Huge resolutions require huge framebuffers. For the above mentioned setup with 2560x1600 monitors that would boil down to 12800 x 1600 pixels, and that's a 20 Megapixel resolution.
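To put a number on that, here is a rough sketch of the resolution and framebuffer arithmetic for a 5x1 group. The 4 bytes per pixel (32-bit color) is my assumption, and the helper names are made up for illustration. Keep in mind that a single color buffer is only a lower bound: a real game adds more buffers, depth, anti-aliasing and textures on top.

```python
# Sketch: combined resolution and size of one color buffer for a display group.
def combined_resolution(width: int, height: int, columns: int, rows: int = 1) -> tuple[int, int]:
    """Total resolution of a grid of identical monitors (bezels ignored)."""
    return width * columns, height * rows


def megapixels(width: int, height: int) -> float:
    """Pixel count in millions of pixels."""
    return width * height / 1_000_000


def framebuffer_mib(width: int, height: int, bytes_per_pixel: int = 4) -> float:
    """Size of one color buffer in MiB; 4 bytes per pixel is an assumption."""
    return width * height * bytes_per_pixel / (1024 * 1024)


w, h = combined_resolution(2560, 1600, columns=5)
print(f"{w} x {h} = {megapixels(w, h):.2f} Megapixels")
print(f"One 32-bit color buffer: {framebuffer_mib(w, h):.1f} MiB")
```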
In AMD Catalyst drivers from Feb/March 2012 onwards you will see support for the aforementioned custom resolutions as well. So 3072x768 can be made manually, as well as 5040x1050 or 5760x1200. You are in control of the resolution you would like to apply to your monitors.
The Catalyst 12.2 and newer drivers also bring a new feature called Taskbar Positioning. Say you set up 3 or even 5 screens in landscape mode: it's always a total bitch that the start menu and icons are located all the way over on the far left screen. The new feature will allow you to configure the position of the taskbar, so if you want it positioned on the middle monitor, that will become an option. That's progress folks...