AMD: Asynchronous shaders in GCN handy with DirectX 12

OnnA

2015-03-31 13:09

Yep! That's GCN :banana: And of course devs can (must) handle it because of the PS4 and Xbox One, so porting will be much easier, and we PC gamers get a better gaming experience overall. So DX12 (with GCN 12_3) will benefit all MS platforms. Fingers crossed 🤓

#5041374

Noisiv

2015-03-31 13:22

HD 7000 & Rx 240/250/270/280: command processor x1 queue + 2 ACE x1 queue + 2 DMA engines -> Graphics/Compute/Copy with limitations
HD 7790 & R7 260: command processor x1 queue + 2 ACE x8 queues + 2 DMA engines -> Graphics/Compute/Copy
R9 285/290: command processor x1 queue + 8 ACE x8 queues + 2 DMA engines -> Graphics/Compute/Copy
GTX 400/500/600/700: command processor x1 queue + 1 DMA engine -> No support
GTX 750/780/Titan: command processor x32 queues (limited) + 1 DMA engine -> Compute/Compute
GTX 900/Titan X: command processor x32 queues + 2 DMA engines -> Graphics/Compute/Copy

Among the latest graphics cards, the GeForce GTX 900 series will take full advantage of the optimizations associated with concurrent tasks, as will the Radeon R9 290, for example. On the other hand, it remains to be seen what developers will do with it. The gains will not be automatic and will require the different stages of 3D rendering to be suited to it.

http://www.hardware.fr/news/14133/gdc-d3d12-amd-parle-gains-gpu.html

#5041396

fantaskarsef

2015-03-31 14:10

http://www.hardware.fr/news/14133/gdc-d3d12-amd-parle-gains-gpu.html

This reads as if the Maxwells and the middle to top 300 AMD cards will treat this pretty much the same, or am I wrong?

#5041479

Denial

2015-03-31 16:41

This reads as if the Maxwells and the middle to top 300 AMD cards will treat this pretty much the same, or am I wrong?

Maxwell can do 32 Queues, 290 can do 64.
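For anyone wondering where the 64 comes from: it follows from the hardware.fr list quoted earlier in the thread (8 ACEs x 8 queues each). A minimal sketch of that arithmetic, assuming the figures in that list are accurate; the `QueueConfig` class and its field names are made up here purely for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueConfig:
    """Hypothetical record for one row of the hardware.fr list."""
    name: str
    aces: int            # number of Asynchronous Compute Engines
    queues_per_ace: int  # queues each ACE can manage

    @property
    def compute_queues(self) -> int:
        # Total compute queues = ACE count times queues per ACE.
        return self.aces * self.queues_per_ace


# Figures copied from the list quoted earlier in the thread (not verified here).
GCN_CONFIGS = [
    QueueConfig("HD 7000 & Rx 240/250/270/280", aces=2, queues_per_ace=1),
    QueueConfig("HD 7790 & R7 260", aces=2, queues_per_ace=8),
    QueueConfig("R9 285/290", aces=8, queues_per_ace=8),
]

for cfg in GCN_CONFIGS:
    print(f"{cfg.name}: {cfg.compute_queues} compute queues")
```

This only counts the ACE-managed queues; the single graphics queue on the command processor comes on top of that.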

#5041581

sykozis

2015-03-31 18:42

This sounds like a GPU version of hyperthreading to me. Happy to see the efficiency improving, and not just for team red or green, so everyone who has a compatible card, regardless of vendor, will be able to enjoy these improvements. Can't wait to see the real-life results 🙂

No. Each shader processor will still only be able to execute 1 thread at a time, whereas hyperthreading allows a single processor core to execute 2 threads. They're just finally implementing true, simultaneous multi-threading for GPUs, and doing so through software. There are enough shader processors within a GPU that HyperThreading really isn't needed. My GTX 970, for example, has 1664 shader processors (or CUDA cores, as NVidia calls them). GPUs, under DX11 and OpenGL, are essentially "in-order" processors where data is processed in the exact order it's received. With DX12 and Vulkan, the GPU will function more like an "out-of-order" processor where instructions are prioritized and executed in order of importance.
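To make the "in-order" versus "by importance" analogy above concrete, here is a toy sketch. It is only an illustration of the two orderings, not a model of how any real GPU or driver schedules work; the task names and priority numbers are invented:

```python
import heapq
from collections import deque

# Hypothetical work items as (priority, name); a lower number means more urgent.
SUBMITTED = [
    (2, "shadow pass"),
    (0, "frame-critical compute"),
    (1, "post-processing"),
]


def in_order(tasks):
    """Process tasks strictly in the order they were submitted (FIFO)."""
    queue = deque(tasks)
    order = []
    while queue:
        _, name = queue.popleft()
        order.append(name)
    return order


def by_priority(tasks):
    """Process the most urgent task first, using a binary heap."""
    heap = list(tasks)
    heapq.heapify(heap)
    order = []
    while heap:
        _, name = heapq.heappop(heap)
        order.append(name)
    return order


print(in_order(SUBMITTED))
print(by_priority(SUBMITTED))
```

The first call keeps the submission order, while the second pulls the frame-critical item to the front, which is the distinction the analogy is getting at.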

#5041686

anxious_f0x

2015-03-31 21:13

It's an interesting way of doing things. Let's hope it's actually utilised by developers on both PC and console. It certainly puts the PS4 in a good position with its 8 ACEs.

#5041944

fantaskarsef

2015-04-01 07:13

Maxwell can do 32 Queues, 290 can do 64.

So we checked with NVIDIA on queues. Fermi/Kepler/Maxwell 1 can only use a single graphics queue or their complement of compute queues, but not both at once – early implementations of HyperQ cannot be used in conjunction with graphics. Meanwhile Maxwell 2 has 32 queues, composed of 1 graphics queue and 31 compute queues (or 32 compute queues total in pure compute mode). So pre-Maxwell 2 GPUs have to either execute in serial or pre-empt to move tasks ahead of each other, which would indeed give AMD an advantage.
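A rough way to picture why "execute in serial" matters: if graphics and compute cannot run at once, their times add up, whereas in an idealized overlap the frame only takes as long as the longer of the two. The sketch below assumes that best case (compute fits entirely into units the graphics work leaves idle), and the millisecond figures are invented, not measurements:

```python
def serial_time(graphics_ms: float, compute_ms: float) -> float:
    """Graphics and compute run one after the other, so the times add up."""
    return graphics_ms + compute_ms


def overlapped_time(graphics_ms: float, compute_ms: float) -> float:
    """Idealized overlap: assumes the two workloads never compete for the
    same resources, so the total is just the longer of the two."""
    return max(graphics_ms, compute_ms)


# Hypothetical per-frame workloads, in milliseconds.
graphics, compute = 12.0, 4.0
print(f"serial:     {serial_time(graphics, compute)} ms")
print(f"overlapped: {overlapped_time(graphics, compute)} ms")
```

Real gains would sit somewhere between the two numbers, since concurrent workloads do share shader units, caches and bandwidth.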

I'm still not entirely sure I get it... doesn't that mean 64 for AMD vs a single one with Maxwell 2 cards? That would indeed look like an advantage for AMD...

#5042561

Dazz

2015-04-02 06:15

Doesn't Maxwell do this anyway, but in hardware? It tries to prioritise traffic; this can clearly be seen in the Maxwell-based 970, since it puts frequently used data on the fast memory partition and stores less-used cached data on the reserved part. In essence, nVidia should get a nice increase if it's done in software first; the hardware can then either change it to suit its requirements or ignore it as being already efficient enough. AMD's solution doesn't do this, so it may benefit immensely. Time will tell, though.

#5042704

xg-ei8ht

2015-04-02 14:06

PS4 has 8 ACEs and 64 queues.

#5042725

Spets

2015-04-02 14:46

The two are not related. Here we are talking about asynchronous compute at the shader processor level (not about working around a bad memory access design as best they can). I'd encourage you to read the article from AnandTech: http://www.anandtech.com/show/9124/amd-dives-deep-on-asynchronous-shading It's an architectural advantage for AMD, as GCN has basically been designed for it since the 1.0 iteration (HD 7970). Nvidia now has Maxwell, which supports it, but it will indeed not be as good at it as GCN. But I think that by the time developers really take advantage of it, the 2016 GPUs will certainly be out (so Pascal). I'd bet that on many fronts, Pascal will look really similar to GCN.

Going off the chart from the article you linked, it looks like Maxwell 2 has better support than GCN. Everything up to it though does lack in comparison. Would be nice to see developers taking advantage of this.