Kepler GK110 Revision B Graphics Architecture
As you can understand, the wide memory partitions, the bus width and the use of GDDR5 memory (quad data rate) together allow the GPU to work with a very high effective framebuffer bandwidth. Let's again put most of the data in a chart to get a better overview of the changes:
Graphics card | GeForce GTX 480 | GeForce GTX 580 | GeForce GTX 680 | GeForce GTX 780 | GeForce GTX Titan | GeForce GTX 780 Ti |
Fabrication node | 40nm | 40nm | 28nm | 28nm | 28nm | 28nm |
Shader processors | 480 | 512 | 1536 | 2304 | 2688 | 2880 |
Streaming Multiprocessors (SM / SMX) | 15 | 16 | 8 | 12 | 14 | 15 |
Texture Units | 60 | 64 | 128 | 192 | 224 | 240 |
ROP units | 48 | 48 | 32 | 48 | 48 | 48 |
Graphics Clock (Core) | 700 MHz | 772 MHz | 1006/1058 MHz | 863/900 MHz | 836/876 MHz | 875/928 MHz |
Shader Processor Clock | 1401 MHz | 1544 MHz | 1006/1058 MHz | 863/900 MHz | 836/876 MHz | 875/928 MHz |
Memory Clock / Data rate | 924 MHz / 3696 MHz | 1000 MHz / 4000 MHz | 1502 MHz / 6008 MHz | 1502 MHz / 6008 MHz | 1502 MHz / 6008 MHz | 1750 MHz / 7000 MHz |
Graphics memory | 1536 MB | 1536 MB | 2048 MB | 3072 MB | 6144 MB | 3072 MB |
Memory interface | 384-bit | 384-bit | 256-bit | 384-bit | 384-bit | 384-bit |
Memory bandwidth | 177 GB/s | 192 GB/s | 192 GB/s | 288 GB/s | 288 GB/s | 336 GB/s |
Power connectors | 1x6-pin PEG, 1x8-pin PEG | 1x6-pin PEG, 1x8-pin PEG | 2x6-pin PEG | 1x6-pin PEG, 1x8-pin PEG | 1x6-pin PEG, 1x8-pin PEG | 1x6-pin PEG, 1x8-pin PEG |
Max board power (TDP) | 250 Watts | 244 Watts | 170 Watts | 250 Watts | 250 Watts | 250 Watts |
Recommended Power supply | 600 Watts | 600 Watts | 550 Watts | 600 Watts | 600 Watts | 600 Watts |
GPU Thermal Threshold | 105 degrees C | 97 degrees C | 98 degrees C | 95 degrees C | 95 degrees C | 95 degrees C |
So we talked about the core clocks, specifications and memory partitions. Obviously there's a lot more to talk through. We feel that to really understand a graphics processor, you simply need to break it down into small pieces. Let's first look at the raw data that most of you can grasp. This bit will be about the Kepler GK110B architecture. If you're not interested in g33k talk, by all means please browse to the next page.
Right, so have a close look at the GK110 die as shown above. You'll notice the five green clusters. These are the GPCs (Graphics Processing Clusters), each containing 3 SMX (streaming multiprocessor) clusters, so 5 x 3 = 15 SMX clusters in total. You'll spot six 64-bit memory interfaces, adding up to a 384-bit path towards the graphics memory. That wide bus means plenty of memory bandwidth by the way: combined with a 7 Gbps effective data rate, the GTX 780 Ti can reach 336 GB/sec.
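For those who like to check the math, here is a minimal Python sketch of how the bandwidth figures in the chart can be derived from bus width and data rate. The function names are our own and purely illustrative; the factor of four reflects how the chart lists the effective GDDR5 data rate relative to the memory clock, and 1 GB is taken as 10^9 bytes.

```python
def effective_data_rate(memory_clock_mhz: float) -> float:
    """Effective data rate in MT/s, using the chart's 4x (quad data rate) convention."""
    return memory_clock_mhz * 4


def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_mt_s: float) -> float:
    """Peak theoretical bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8  # a 384-bit bus moves 48 bytes per transfer
    return bytes_per_transfer * data_rate_mt_s * 1e6 / 1e9


# (bus width in bits, memory clock in MHz), values taken from the chart
cards = {
    "GeForce GTX 480": (384, 924),
    "GeForce GTX 580": (384, 1000),
    "GeForce GTX 680": (256, 1502),
    "GeForce GTX 780": (384, 1502),
    "GeForce GTX Titan": (384, 1502),
    "GeForce GTX 780 Ti": (384, 1750),
}

for name, (bus_width, clock) in cards.items():
    rate = effective_data_rate(clock)
    print(f"{name}: {memory_bandwidth_gb_s(bus_width, rate):.0f} GB/s")
```

Rounded to whole numbers, the results line up with the memory bandwidth row of the chart, for example 384 bits at 7000 MT/s gives 336 GB/s for the GTX 780 Ti.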
So above we see the GK110 block diagram, which lays out the Kepler architecture. Let's break it down into bits and pieces. The GK110B has:
- 2880 (GTX 780 Ti) or 2688 (Titan) or 2304 (GTX 780) CUDA processors (Shader cores)
- There are 192 CUDA cores (shader processors) per cluster (SMX).
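The shader counts above follow directly from the number of enabled SMX clusters listed in the chart. A small Python sketch makes that explicit; the constant and function names are made up for illustration, and it only covers the Kepler cards, since the 192-per-SMX figure does not apply to the older Fermi parts.

```python
CORES_PER_SMX = 192  # CUDA cores per Kepler SMX, as stated above


def cuda_cores(enabled_smx: int) -> int:
    """Total CUDA cores for a Kepler GPU with the given number of enabled SMX clusters."""
    return enabled_smx * CORES_PER_SMX


# Enabled SMX counts taken from the chart
kepler_cards = {
    "GeForce GTX 680": 8,
    "GeForce GTX 780": 12,
    "GeForce GTX Titan": 14,
    "GeForce GTX 780 Ti": 15,
}

for name, smx in kepler_cards.items():
    print(f"{name}: {smx} SMX x {CORES_PER_SMX} = {cuda_cores(smx)} CUDA cores")
```

So the GTX 780 Ti, with all 15 SMX clusters active, lands at 2880 shader processors, while the Titan and GTX 780 have one and three clusters disabled respectively.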
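When we zoom in even further on one SMX cluster (192 shader processors), we see a change from the GK104 (GTX 680), as there are 64 double-precision math units. Each Kepler SMX also carries 16 texture units, so the texture unit totals per card work out as follows: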
- GeForce GTX 580 has 16 SM x 4 Texture units = 64
- GeForce GTX 680 has 8 SMX x 16 Texture units = 128
- GeForce GTX 780 has 12 SMX x 16 Texture units = 192
- GeForce GTX Titan has 14 SMX x 16 Texture units = 224
- GeForce GTX 780 Ti has 15 SMX x 16 Texture units = 240
So there's a total of 15 SMX x 16 TU = 240 texture filtering units available on the GK110 silicon itself (if all SMXes are enabled, as they are on the GTX 780 Ti). Still with me?
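If you'd rather see the texture unit totals as a quick calculation, here is a short Python sketch. The helper name is hypothetical, and the per-multiprocessor figures are simply the ones from the list above.

```python
def texture_units(multiprocessors: int, units_per_multiprocessor: int) -> int:
    """Total texture units = enabled multiprocessors x texture units in each."""
    return multiprocessors * units_per_multiprocessor


# (multiprocessor count, texture units per multiprocessor), from the list above
cards = {
    "GeForce GTX 580": (16, 4),
    "GeForce GTX 680": (8, 16),
    "GeForce GTX 780": (12, 16),
    "GeForce GTX Titan": (14, 16),
    "GeForce GTX 780 Ti": (15, 16),
}

for name, (count, per_unit) in cards.items():
    print(f"{name}: {count} x {per_unit} = {texture_units(count, per_unit)} texture units")
```

The totals match the texture unit row in the chart, topping out at 240 for the fully enabled GK110.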