Pascal GPU Architecture
The New Pascal-Based GPUs
The GeForce GTX 1000 series graphics cards are based on the latest iteration of Nvidia's GPU architecture, called Pascal (named after the famous mathematician Blaise Pascal). The GeForce GTX 1080 Ti uses revision A1 of the GP102-350 GPU.
- Pascal Architecture - The Nvidia Pascal architecture is the most powerful GPU design Nvidia has ever built. Comprising 12 billion transistors and 3,584 single-precision CUDA cores, the card is the world's fastest consumer GPU.
- 16 nm FinFET - The GP102 GPU is fabricated using a new 16nm FinFET manufacturing process that allows the chip to be built with more transistors, ultimately enabling new GPU features, higher performance, and improved power efficiency.
- GDDR5X Memory - GDDR5X provides a significant memory bandwidth improvement over the GDDR5 memory used previously in NVIDIA's flagship GeForce GTX GPUs. Running at a data rate of 11 Gbps, the 1080 Ti's 352-bit memory interface provides considerably more memory bandwidth than NVIDIA's prior GeForce GTX 980 GPU. Combined with architectural improvements in memory compression, the total effective memory bandwidth increase compared to the GTX 980 is 1.8x.
The rectangular GP102 die measures close to 471 mm², sits in a BGA package, and houses a transistor count of roughly 12 billion. Pascal GPUs are fabbed by the Taiwan Semiconductor Manufacturing Company (TSMC) on a 16 nm node.
NVIDIA GeForce GTX 1080 Ti
Alright, we are stepping back to reference material for a second here. The GeForce GTX 1080 Ti gets a count of 3,584 shader processors. This product is pretty slick, as it can manage really high clock frequencies whilst sticking to a 250 Watt TDP. The GeForce GTX 1080 Ti comes fitted with fast 11 Gbps GDDR5X memory - and sure, an unusual 11 GB of it. The reference cards have a base clock of 1.48 GHz with a boost clock of 1.58 GHz.
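As a back-of-the-envelope check, peak single-precision throughput follows directly from the shader count and the boost clock, since each CUDA core can retire one fused multiply-add (two FLOPs) per cycle. A minimal sketch of that arithmetic (the clock figures are the reference-board specifications):

```python
# Peak FP32 throughput: each CUDA core does one fused multiply-add
# (2 FLOPs) per clock, so peak TFLOPS = cores * clock (GHz) * 2 / 1000.
def peak_tflops(cuda_cores: int, boost_clock_ghz: float) -> float:
    return cuda_cores * boost_clock_ghz * 2 / 1000.0

print(round(peak_tflops(3584, 1.582), 2))  # GTX 1080 Ti -> 11.34
print(round(peak_tflops(2048, 1.126), 2))  # GTX 980 at base clock -> 4.61
```

The 1080 Ti result lands close to the quoted ~11.5 TFLOPS; the small gap comes down to how the boost clock is rounded and how high the card actually boosts.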
The reference 1080 Ti is capable of 11.5 TFLOP/sec of single-precision performance. For comparison, a reference GeForce GTX 980 pushes 4.6 TFLOPS, a 980 Ti pushes 6.4 TFLOPS, and a GTX 1060 does 4.6 TFLOP/sec. The change in shader count is among the biggest differentiators, together with the ROP and TMU counts and the memory tied to them. The product is obviously PCI-Express 3.0 compatible and has a maximum TDP of around 250 Watts with a typical idle power draw of 5 to 10 Watts. That TDP is an overall maximum; on average the GPU will not consume that amount of power, so during gaming the average draw will be lower. The Founders Edition cards run cool and silent enough.

You will have noticed that two memory types are used across the 1050/1060/1070/1080/1080 Ti and Titan X range, which can be a bit confusing. An interesting development here is that, slowly but steadily, graphics card manufacturers want to move to HBM, stacked High Bandwidth Memory that they can place on-package (close to the GPU die). HBM revision 1, however, is limited to four stacks of 1 GB, so if it were used you'd only see 4 GB graphics cards. HBM2 can go to 8 GB and 16 GB, but that production process is just not yet ready and/or affordable enough for volume production. With HBM2 being expensive and supply-limited, it's simply not the right time to make the move; Big Pascal, whenever it releases to the consumer in, say, some sort of Titan or Ti edition, will get HBM2 memory, 16 GB of it spread over four stacks. But we do not see Big Pascal with HBM2 launching any sooner than Q3 2017. So with HBM/HBM2 out of the running, basically two solutions remain: go with traditional GDDR5 memory, or make use of GDDR5X - let's call that turbo GDDR5.
- Nvidia in fact opted for both: the GeForce GTX 1060 and 1070 are fitted with your "regular" GDDR5 memory.
- The GeForce GTX 1080, 1080 Ti and Titan X have a little extra bite in bandwidth as they are fitted with Micron's all-new GDDR5X memory.
So, the GeForce GTX 1080 Ti is tied to 11 GB of GDDR5X DRAM. You can look at GDDR5X memory chips as normal GDDR5 memory; however, as opposed to delivering 32 bytes per access to the memory cells, this is doubled up to 64 bytes per access. That in theory can double up graphics card memory bandwidth, and Pascal certainly likes large quantities of memory bandwidth to do its thing. Nvidia states it as 352-bit GDDR5X @ 11 Gbps (an effective data rate).
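The headline bandwidth numbers follow directly from the per-pin data rate and the bus width. A small sketch of that arithmetic, using the GTX 980's reference specs (7 Gbps GDDR5 on a 256-bit bus) for comparison:

```python
# Peak memory bandwidth in GB/s = effective data rate per pin (Gbps)
# * bus width (bits) / 8 bits-per-byte.
def mem_bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    return data_rate_gbps * bus_width_bits / 8

print(mem_bandwidth_gbs(11, 352))  # GTX 1080 Ti (GDDR5X) -> 484.0
print(mem_bandwidth_gbs(7, 256))   # GTX 980 (GDDR5)      -> 224.0
```

That raw 484 GB/s is what the 352-bit, 11 Gbps figure works out to; memory compression adds effective throughput on top of it.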
Display Connectivity
Nvidia's Pascal-generation products receive a nice upgrade in terms of monitor connectivity. First off, the cards get three DisplayPort connectors, one HDMI connector and a DVI connector. The days of ultra-high-resolution displays are here, and Nvidia is adapting to them. The HDMI connector is HDMI 2.0 revision b, which enables:
- Transmission of High Dynamic Range (HDR) video
- Bandwidth up to 18 Gbps
- 4K@50/60 (2160p), which is 4 times the clarity of 1080p/60 video resolution
- Up to 32 audio channels for a multi-dimensional immersive audio experience
DisplayPort-wise, compatibility has shifted upwards to DP 1.4, which provides 8.1 Gbps of bandwidth per lane and offers better color support using Display Stream Compression (DSC), a "visually lossless" form of compression that VESA says "enables up to 3:1 compression ratio." DisplayPort 1.4 can drive 8K displays at 60 Hz and 4K displays at 120 Hz with HDR "deep color." DP 1.4 also supports:
- Forward Error Correction: FEC, which overlays the DSC 1.2 transport, addresses the transport error resiliency needed for compressed video transport to external displays.
- HDR meta transport: HDR meta transport uses the “secondary data packet” transport inherent in the DisplayPort standard to provide support for the current CTA 861.3 standard, which is useful for DP to HDMI 2.0a protocol conversion, among other examples. It also offers a flexible metadata packet transport to support future dynamic HDR standards.
- Expanded audio transport: This spec extension covers capabilities such as 32 audio channels, 1536 kHz sample rate, and inclusion of all known audio formats.
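To see why DSC makes 8K at 60 Hz workable over DP 1.4, compare the link's usable payload against the raw video data rate. A rough sketch, ignoring blanking intervals and audio (so an approximation, not the full VESA link-budget calculation):

```python
# DP 1.4 HBR3 link: 4 lanes at 8.1 Gbps each, minus 8b/10b encoding overhead.
def dp_payload_gbps(lanes: int = 4, lane_rate_gbps: float = 8.1) -> float:
    return lanes * lane_rate_gbps * 8 / 10

# Raw (uncompressed) video data rate for a given mode, active pixels only.
def video_gbps(width: int, height: int, fps: int, bits_per_pixel: int) -> float:
    return width * height * fps * bits_per_pixel / 1e9

payload = dp_payload_gbps()            # ~25.92 Gbps usable
raw = video_gbps(7680, 4320, 60, 30)   # 8K60, 10-bit RGB: ~59.7 Gbps
with_dsc = raw / 3                     # DSC at 3:1: ~19.9 Gbps, fits
print(round(payload, 2), round(raw, 1), round(with_dsc, 1))
```

Uncompressed 8K60 overshoots the link by more than 2x, while the 3:1 DSC ratio brings it comfortably under the ~26 Gbps payload - which is exactly why DP 1.4 leans on DSC for 8K.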
High Dynamic Range (HDR) Display Compatibility
Nvidia can now fully support HDR and deep color all the way through the pipeline. HDR is becoming a big thing, especially for movie aficionados. Think better pixels, a wider color space, more contrast and more interesting content on that screen of yours. We've seen some demos on HDR screens, and it is pretty darn impressive, to be honest. This year you will see the first HDR-compatible Ultra HD TVs, and next year likely monitors and games supporting it properly; HDR is the buzzword for 2017. With Ultra HD Blu-ray having just been released in Q1 2016, HDR will be a much-welcomed feature.

High-dynamic-range rendering (HDRR or HDR rendering), also known as high-dynamic-range lighting, is the rendering of computer graphics scenes using lighting calculations done in a larger dynamic range. This preserves details that would otherwise be lost to limited contrast ratios. Video games, computer-generated movies and special effects benefit from this, as it creates more realistic scenes than the simpler lighting models used before. With HDR you should remember three things: bright things can be really bright, dark things can be really dark, and details can be seen in both. An HDR display reproduces a greater dynamic range of luminosity than is possible with standard digital imaging. We measure this in nits, and the number of nits for UHD screens and monitors is going up.
What's a nit? A nit is one candela per square meter: a candle's brightness over one square meter is roughly 1 nit, the sun is around 1.6 billion nits, typical real-world objects span 1 to 250 nits, current PC displays manage 1 to 250 nits, and excellent HDTVs hit 350 to 400 nits. An HDR OLED screen is capable of 500 nits, and here it'll get more important: new screens in 2016 will go to 1,000 nits. HDR allows these high nit values to actually be used. We think HDR will be properly implemented for PC gaming in 2017; Hollywood already has end-to-end content ready, of course. As consumers start to demand higher-quality monitors, HDR technology is emerging to set an excitingly high bar for overall display quality. HDR panels are characterized by:
- Brightness between 600 and 1,200 cd/m² of luminance, with an industry goal to reach 2,000
- Contrast ratios that closely mirror human visual sensitivity to contrast (SMPTE 2084)
- The Rec. 2020 color gamut, which can produce over 1 billion colors at 10 bits per color
HDR can represent a greater range of luminance levels than more "traditional" methods can - think real-world scenes spanning everything from bright, direct sunlight to extreme shade, or very faint nebulae.
HDR displays can be designed with the deep black depth of OLED (black is zero, the pixel is simply disabled), or the vivid brightness of a local-dimming LCD. Meanwhile, if you cannot wait to play games in HDR and did purchase an HDR HDTV this year, you can stream them: an HDR game rendered on your PC with a Pascal GPU can be streamed to your Nvidia Shield Android TV and then over HDMI to that HDR telly, as Pascal supports 10-bit HEVC HDR encoding and the Shield Android TV can decode it. Hey, just sayin'. A selection of Ultra HDTVs is already available, and consumer monitors are expected to reach the market in late 2016 and 2017. Such displays will offer unrivaled color accuracy, saturation, brightness and black depth - in short, they will come very close to simulating the real world.