Maxwell Graphics Architecture
Technology & Specifications (Reference)
The GeForce GTX 960 series is based on Maxwell, the latest iteration of Nvidia's GPU architecture, and the cards use revision A1 of the GM206 GPU. As explained, the 20 nm node is not yet ready (if it ever will be), so these products are based on the good ol' 28 nm fab node. That makes the chips relatively large in size. Maxwell is an advanced design; the product has almost 3 billion transistors tucked away in an S-FCBGA chip. The GeForce GTX 960 comes with 1024 CUDA (shader) cores, while its big brother, the GeForce GTX 980, has 2048 shader processors. The shader count is among the biggest differentiators, together with the ROP and TMU counts.
- GeForce GTX 960 has 1024 shader processors and 2 GB of GDDR5 graphics memory.
- GeForce GTX 970 has 1664 shader processors and 4 GB of GDDR5 graphics memory.
- GeForce GTX 980 has 2048 shader processors and 4 GB of GDDR5 graphics memory.
The product is obviously PCI-Express 3.0 ready. It has a maximum TDP of around 120 Watts, with a typical idle power draw of 10 Watts. That TDP is an overall maximum; on average your GPU will not consume that amount of power.
The GM206 is based on the Maxwell architecture, so you get the pre-modelled SMM clusters (SMX on Kepler) with what is now 128 shader processors per cluster, down from 192 on Kepler. The GTX 960 has 8 active clusters of 128 shader processors each, which gives you 1024 shader processors. The reference GeForce GTX 960 has a core clock frequency of 1127 MHz with a Boost frequency that can run up to 1178 MHz.
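The cluster arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function and constant names are made up for this example, and the cluster counts for the GTX 970 and GTX 980 are derived here by dividing their shader counts by 128, not quoted from a spec sheet.

```python
# Shader processors per cluster, as stated in the text:
# 128 on Maxwell, 192 on Kepler.
SHADERS_PER_CLUSTER = {"maxwell": 128, "kepler": 192}


def shader_count(active_clusters: int, architecture: str = "maxwell") -> int:
    """Total shader processors = active clusters x shaders per cluster."""
    return active_clusters * SHADERS_PER_CLUSTER[architecture]


def cluster_count(shaders: int, architecture: str = "maxwell") -> int:
    """Invert the relation; reject counts that are not a whole number of clusters."""
    per_cluster = SHADERS_PER_CLUSTER[architecture]
    if shaders % per_cluster != 0:
        raise ValueError(f"{shaders} is not a multiple of {per_cluster}")
    return shaders // per_cluster


if __name__ == "__main__":
    # GTX 960: 8 active clusters of 128 shader processors each.
    print("GTX 960 shaders:", shader_count(8))
    # Derive the implied cluster counts from the shader counts in the list above.
    for name, shaders in [("GTX 960", 1024), ("GTX 970", 1664), ("GTX 980", 2048)]:
        print(name, "clusters:", cluster_count(shaders))
```

Running it for the three cards listed earlier gives 8, 13 and 16 clusters respectively, which is simply each shader count divided by 128.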
As far as the memory specs of the GM206 Maxwell GPU are concerned, these boards feature a narrow 128-bit memory bus connected to 2 GB of GDDR5 video buffer memory, AKA VRAM AKA framebuffer AKA graphics memory. On the memory controller side of things, the reference memory clock (effective data-rate) is now set at 7 GHz / Gbps.
The GeForce GTX 900 series is DirectX 11.3 and 12 ready, with Windows 8.1, 7 and Vista also being compatible to take advantage of DirectCompute, multi-threading, hardware tessellation and the latest shader 5.0 extensions. DX12, the latest revision, is a Windows 10 feature only, yet will bring in significant optimizations. DirectX 12 (Direct3D 12) is pitched as low overhead, cross-platform and ready now:
- Features: Rasterizer Ordered Views, Typed UAV Load, Volume Tiled Resources, Conservative Raster
- Low overhead: reduces CPU overhead and increases scalability across platforms
- Superset of DirectX 11 rendering functionality
- Cross Platform
For your reference here's a quick overview of some past generation high-end GeForce cards. Yes, the Maxwell products might seem slower if you look at the specs, but they are heavily optimized and are running at relatively high clock frequencies.
GeForce GTX | 780 | 780 Ti | Titan Black | Titan Z | 960 | 970 | 980 |
Stream (Shader) Processors | 2304 | 2880 | 2880 | 5760 | 1024 | 1664 | 2048 |
Core Clock (MHz) | 863 | 875 | 889 | 705 | 1127 | 1050 | 1126 |
Boost Clock (MHz) | 900 | 928 | 980 | 876 | 1178 | 1178 | 1216 |
Memory Clock (effective MHz) | 6000 | 7000 | 7000 | 7000 | 7000 | 7000 | 7000 |
Memory amount (MB) | 3072 | 3072 | 6144 | 12288 | 2048 | 4096 | 4096 |
Memory Interface | 384-bit | 384-bit | 384-bit | 384-bit | 128-bit | 256-bit | 256-bit |
Memory Type | GDDR5 | GDDR5 | GDDR5 | GDDR5 | GDDR5 | GDDR5 | GDDR5 |
HDCP | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Display Port | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
HDMI | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
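To put the memory rows of the table in perspective, here is a rough sketch that turns bus width and effective memory clock into a theoretical peak bandwidth. It assumes the usual rule of thumb (bus width in bytes times effective transfer rate), takes each table entry at face value as a per-GPU figure, and ignores color compression and caching entirely; the names used are illustrative only.

```python
# (memory interface in bits, effective memory clock in MHz), copied from the table.
CARDS = {
    "GTX 780": (384, 6000),
    "GTX 780 Ti": (384, 7000),
    "Titan Black": (384, 7000),
    "Titan Z": (384, 7000),
    "GTX 960": (128, 7000),
    "GTX 970": (256, 7000),
    "GTX 980": (256, 7000),
}


def peak_bandwidth_gb_s(bus_width_bits: int, effective_mhz: int) -> float:
    """Bytes per transfer (bus width / 8) times million transfers per second, in GB/s."""
    return bus_width_bits / 8 * effective_mhz / 1000


if __name__ == "__main__":
    for name, (bus_bits, clock_mhz) in CARDS.items():
        print(f"{name:<12} {peak_bandwidth_gb_s(bus_bits, clock_mhz):6.1f} GB/s")
```

Under those assumptions the GTX 960 lands at 112 GB/s against 224 GB/s for the GTX 970 and GTX 980, which shows how much the 128-bit bus costs on paper.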
With 2 GB per GPU, the GTX 960 is not an appealing product for modern games; lots of them use more than 2 GB these days, even at 1080p. The 128-bit bus doesn't help there either. However, Nvidia reworked the memory subsystem quite a bit, enabling much higher memory clock frequencies than on previous-generation GeForce GPUs. The result: memory speeds of up to 7 Gbps. Combined with some clever advancements in color compression, Nvidia can claim even more bandwidth, as Maxwell cards now use 3rd generation delta color compression. For example, 7 Gbps × (1 / 0.75) ≈ 9.3 Gbps effective bandwidth, thanks to enhanced color compression and enhanced caching techniques. That's a theoretical number though.
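The compression example works out as follows. This sketch assumes the 75% figure means that compression and caching cut memory traffic to 75% of the uncompressed amount, so the same bus behaves like a faster one; that reading, and the function name, are assumptions made for illustration.

```python
def effective_data_rate_gbps(raw_gbps: float, traffic_fraction: float) -> float:
    """Equivalent data rate when only `traffic_fraction` of the traffic hits memory.

    traffic_fraction = 0.75 means 25% of the traffic is saved (assumed reading
    of the 75% figure in the text).
    """
    if not 0 < traffic_fraction <= 1:
        raise ValueError("traffic_fraction must be in (0, 1]")
    return raw_gbps / traffic_fraction


if __name__ == "__main__":
    rate = effective_data_rate_gbps(7.0, 0.75)
    print(f"{rate:.1f} Gbps")  # prints "9.3 Gbps"
```

As the text says, this is a theoretical figure: the real saving depends on how compressible the frame data is.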
The MSI GeForce GTX 960 Gaming OC runs at faster clocks (1216 MHz core / 1279 MHz Boost) than the reference specs mentioned above.