Wednesday, June 1, 2016

MSI GeForce GTX 1080 GAMING X 8G review - Introduction

MSI GeForce GTX 1080 GAMING X 8G - X gonna give it to ya!
When the GeForce GTX 1080 launched two weeks ago, it caught us a bit by surprise; the actual reference review took down this site for a couple of minutes as our load-balanced front-end servers could not handle the near 2,500% increase in traffic. Crazy stuff, and testimony to the fact that you guys have been waiting a very long time for new graphics cards from both AMD and Nvidia. It's for good reason, too: the graphics card industry, or the GPU industry, has been on hold, waiting for a smaller GPU fabrication process to become viable. Last generation GPUs were based on a 28 nm fabrication process; an intermediate move to 20 nm was supposed to be the answer for today's GPUs, but it proved to be a problematic technology. Aside from some smaller ASICs, the 20 nm node has been a failure. The industry therefore had to wait until an even newer and smaller fabrication process was available in order to shrink the die, which allows for lower voltages in the chips, less transistor gate leakage and, obviously, more transistors in a GPU. The answer was found in the recent 14/15/16 nm fabrication processes with the now all too familiar FinFET technology (basically fins on a transistor). Intel has been using it for a while, and now both Nvidia and AMD are moving towards such nodes as well. Nvidia is the first to announce new products based on a TSMC 16 nm process with the introduction of the Pascal GPU architecture, named after the mathematician, much like Kepler, Maxwell and Fermi. That stage has now passed: the GeForce GTX 1070 and 1080 have been announced, with the 1080 slowly becoming available in stores as we speak and the 1070 cards starting to sell by next week (June 10th, 2016). Both cards are impressive in their product positioning, though I do feel the 1070 will be the more attractive product due to its price level; the 1080 really is what everybody wants (but perhaps can't afford). The good news is that the board partner cards will sell for less compared to the Nvidia reference / Founders Edition cards. Obviously the higher-end, fully customized SKUs will likely level with that Founders Edition price again, but I am pretty certain you'd rather spend your money on a fully customized AIB card that has already been factory tweaked a bit rather than on the reference one.
In this AIB review we look at the MSI GeForce GTX 1080 GAMING X 8G, fitted with a Pascal GP104 based GPU; a product series that is to replace the GeForce GTX 980. It's all custom with 10 power phases, has a nice dark aesthetic feel and comes with the all new TwinFrozr VI cooler, which is marketed as a cooler with Balls of Steel. Seriously, I am not making that up; it literally was in the press release and actually refers to the double ball bearings the fans use -- made of steel.

Regular Gaming - X or Z?

You will have noticed that MSI is to release a regular Gaming model and then Gaming X and Gaming Z models. The regular Gaming model (no X or Z) will not have the backplate or the configurable RGB LED light system and comes with basic clock frequencies. Then there are the X models, released at the initial launch, which come with a backplate and the RGB system and are clocked a notch higher. The Z models will be the most high-end SKUs: even further overclocked, with all the benefits the X model has as well. So we test the X model, but there will be even faster clocked revisions; the Gaming Z gives you all the features of the Gaming X, but with higher clock speeds. Right, with that explained: cooling performance has been improved and, combined with a new generation of fans, airflow is improved whilst the card remains silent. Up to 60 Degrees C the card will even stay in passive mode, i.e. the fans will not spin. The TWIN FROZR cooler is now intensified by a red GAMING glow piercing through the cover, while the MSI GAMING dragon RGB LED on the side can be set to any of 16.8 million colors to match the LED lights in the color-tone of your PC. The GTX 1080 GAMING X 8G comes with MSI's traditional Military Class 4 components and holds both an 8-pin and a 6-pin power connector. At the backside you'll spot a nice matte black solid backplate. Both the X and Z versions have the TwinFrozr VI cooler as well as a memory cooling plate and a PWM heatsink. The MSI GeForce GTX 1080 GAMING X 8G has default clock frequencies of 1,848 MHz (boost) / 1,709 MHz (base) with reference clocked 8,192 MB of GDDR5X at a 10,108 MHz effective data rate on the memory.
Right, we have enough to talk about and to show, so let's head onwards into the review. We'll start with a product overview in the photo-shoot.
 


MSI GeForce GTX 1080 GAMING X 8G

Product Showcase

Let's start with our photo-shoot: a few pages that show the ins and outs with photos, all taken in-house of course.
  

So, the MSI GeForce GTX 1080 GAMING X is quite something. MSI moved away from the reference design and re-used just the Pascal GPU. You will spot a nice matte black PCB with a 10-phase power design and two power headers (one 8-pin and one 6-pin) for a little more overclocking headroom, and of course the new revision TwinFrozr VI cooler is being used. These cards will look just terrific in a dark themed PC.


As board partners are allowed to release the 1080 model cards in their own configurations you will see many versions, mostly based on customized PCBs/components and the obligatory different cooling solutions. This is the X edition of the Gaming series, meaning it has high (but not the highest) clocks and a back-plate; all quite impressive as well. The MSI GeForce GTX 1080 GAMING X 8G has default clock frequencies of 1,848 MHz (boost) / 1,709 MHz (base) with reference clocked 8,192 MB of GDDR5X at a 10,108 MHz effective data rate on the memory.


The card itself is a dual-slot solution and uses composite heat pipes; the GPU is cooled by a nickel-plated copper base plate connected to Super Pipes (8 mm heat pipes) on this MSI GAMING series graphics card. The SU-form heat pipe layout increases efficiency by reducing the length of unused heat pipe. Zero Frozr technology eliminates fan noise in low-load situations by stopping the fans when they are not needed; up to roughly 60 Degrees C, the fans won't even spin. The LEDs embedded in this graphics card can be controlled with the MSI Gaming App; though we haven't tried it (due to lack of time), these are RGB configurable with a few animations as well. Check out the backside, where a thick, sturdy metal back-plate with plenty of venting spaces has been applied.
The card has a power design of roughly 180 Watts, but due to the high clocks and extensive tweaking design please add 10, maybe 15, extra Watts. Two power headers in combination with a component selection of Hi-c CAPs, Super Ferrite Chokes and solid capacitors should be plenty for a nice tweak as well. The GeForce GTX 1080 is DisplayPort 1.2 certified and DP 1.3/1.4 ready, enabling support for 4K displays at 120 Hz, 5K displays at 60 Hz, and 8K displays at 60 Hz (using two cables). This model includes three DisplayPort connectors, one HDMI 2.0b connector, and one dual-link DVI connector. Up to four display heads can be driven simultaneously from one card. The GTX 1080 display pipeline supports HDR gaming, as well as video encoding and decoding. New to Pascal is HDR video (4K@60 10/12-bit HEVC decode), HDR record/stream (4K@60 10-bit HEVC encode), and HDR interface support (DP 1.4).

The MSI GeForce GTX 1080 GAMING X 8G takes advantage of Nvidia's new Pascal GPU built on a 16 nm FinFET process, and with 7.2 billion transistors, 2,560 shader/stream cores, and 8 GB of GDDR5X, it is a rather fast product. In Ultra HD it can advance up to 25 to 40% in performance over the GeForce GTX 980, as we learned. It is a good amount faster compared to the 980 Ti and Titan X as well.

The GPU empowering the product is called the GP104-400, which is based on the Pascal architecture. It has 2,560 CUDA cores, while texture filtering is performed by 160 texture units. The reference/Founders Edition cards have a base clock frequency of 1,607 MHz, where MSI clocks the base frequency 102 MHz higher at 1,709 MHz.


The card has a 180 Watt rated TDP; up to 75 Watts is delivered through the PCIe slot, 150 Watts through the 8-pin PEG (PCI Express graphics) power connector and another 75 Watts through the extra 6-pin connector. So yes, you'll have spare for a nice overclock. The GeForce GTX 1080 display engine is capable of supporting the latest high resolution displays, including 4K and 5K screens. And with HDMI 2.0 support, the GeForce GTX 1080 can be used by gamers who want to game on the newest state-of-the-art big screen TVs.
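To put that power budget into perspective, here is a minimal sketch adding up the delivery limits mentioned above against the estimated draw from this page (the figures are this review's estimates, not an official spec sheet):

```python
# Rough power budget sketch for this card (figures taken from this page).
PCIE_SLOT_W = 75      # PCI Express x16 slot limit
PEG_8PIN_W = 150      # 8-pin PEG connector limit
PEG_6PIN_W = 75       # 6-pin PEG connector limit

available = PCIE_SLOT_W + PEG_8PIN_W + PEG_6PIN_W   # 300 W total delivery capacity
estimated_draw = 180 + 15                           # ~180 W TDP plus the extra margin we estimate

print(f"Deliverable: {available} W, estimated draw: {estimated_draw} W, "
      f"headroom for overclocking: {available - estimated_draw} W")   # ~105 W of headroom
```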



 

Overall you are looking at a card that consumes roughly 180 Watts under full load. The new TwinFrozr VI cooler is, as always, impressive in both cooling performance and low noise levels. The card remains passive up to a GPU temperature of 60 Degrees C; the fans do not even spin. The fans (Torx) used are a new revision and should also offer a bit more airflow. MSI will try to force as much performance as possible out of the card at a maximum cooling threshold of roughly 70 Degrees C for this 1080 model with the TwinFrozr VI cooler.



The GeForce GTX 1080 is roughly 11 inches (28 cm) in length, so it should fit comfortably in pretty much any decent chassis. Precise measurements, by the way, are 279 x 140 x 42 mm.

 

Above, the TwinFrozr VI cooler stripped away; you'll notice that MSI placed extra shielding heatsinks over the GDDR5X memory as well as a heatsink directly on top of the VRM area. Good stuff alright. The PCB design is custom from MSI, built upon their Military Class 4 standard. It is a bit of marketing, yes, but the component selection does follow MIL-STD-810G certification; these components have proven to be able to withstand the torturous circumstances of both gaming and overclocking. The GPU empowering the product is the A1 revision of the GP104-400 GPU.




For those that wonder, the board is equipped with Micron GDDR5X memory ICs. An actual 10-phase power supply is responsible for feeding the GP104-400 GPU. The card (with cooler) weighs 1,100 g.



RGB LEDs are a bit of a trend and, like everybody else, MSI is following; they did it right and subtly though. The LED system can be controlled by choosing any of the animation effects available in the MSI Gaming App, ranging from responding to your game sounds or music to steady light, breathing and flashing. Of course, you can also turn them off.


And yes, I do think this is a terrific looking product with the red accents and all that matte black. And this isn't even the top-tier Z model just yet.


The New Pascal Based GPUs

The GeForce GTX 1070 and 1080 graphics cards are based on the latest iteration of Nvidia's GPU architecture, called Pascal (named after the famous mathematician); the cards use revision A1 of the GP104. Both cards have slightly different configurations though.

  • Pascal Architecture - The GeForce GTX 1080's Pascal architecture is the most efficient GPU design ever built. Comprised of 7.2 billion transistors and including 2,560 single-precision CUDA Cores, the GeForce GTX 1080 is the world's fastest GPU. With an intense focus on craftsmanship in chip and board design, NVIDIA's engineering team achieved unprecedented results in frequency of operation and energy efficiency.
  • 16 nm FinFET - The GeForce GTX 1080's GP104 GPU is fabricated using a new 16 nm FinFET manufacturing process that allows the chip to be built with more transistors, ultimately enabling new GPU features, higher performance, and improved power efficiency.
  • GDDR5X Memory - GDDR5X provides a significant memory bandwidth improvement over the GDDR5 memory that was used previously in NVIDIA's flagship GeForce GTX GPUs. Running at a data rate of 10 Gbps, the GeForce GTX 1080's 256-bit memory interface provides 43% more memory bandwidth than NVIDIA's prior GeForce GTX 980 GPU. Combined with architectural improvements in memory compression, the total effective memory bandwidth increase compared to the GTX 980 is 1.7x (a quick calculation follows below).
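As a quick sanity check on those bandwidth numbers, here is a minimal sketch that derives them from the bus width and data rate; the 1.2x compression factor is the figure Nvidia quotes, discussed further down in this article:

```python
# Memory bandwidth = (bus width in bits / 8) * data rate in Gbps, result in GB/s.
def bandwidth_gbs(bus_width_bits, data_rate_gbps):
    return bus_width_bits / 8 * data_rate_gbps

gtx_1080 = bandwidth_gbs(256, 10)   # GDDR5X at 10 Gbps -> 320 GB/s
gtx_980  = bandwidth_gbs(256, 7)    # GDDR5 at 7 Gbps   -> 224 GB/s

print(f"Raw uplift: {gtx_1080 / gtx_980:.2f}x")                          # ~1.43x, i.e. 43% more
print(f"With ~1.2x color compression: {gtx_1080 / gtx_980 * 1.2:.2f}x")  # ~1.7x effective
```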
The rectangular GP104 die measures close to 15.35 mm x 19.18 mm (roughly 314 mm²) and sits on a 37.5 x 37.5 mm BGA package; it houses a transistor count of well over 7 billion. Pascal GPUs are fabbed by the Taiwan Semiconductor Manufacturing Company (TSMC).

GeForce | GTX 1080 | GTX 1070 | GTX Titan X | GTX 980 Ti | GTX 980
GPU | GP104-400-A1 | GP104-200-A1 | GM200 | GM200 | GM204
Architecture | Pascal | Pascal | Maxwell | Maxwell | Maxwell
Transistor count | 7.2 Billion | 7.2 Billion | 8 Billion | 8 Billion | 5.2 Billion
Fabrication Node | TSMC 16 nm | TSMC 16 nm | TSMC 28 nm | TSMC 28 nm | TSMC 28 nm
CUDA Cores | 2,560 | 1,920 | 3,072 | 2,816 | 2,048
SMMs / SMXs | 20 | 15 | 24 | 22 | 16
ROPs | 64 | 64 | 96 | 96 | 64
GPU Clock Core | 1,607 MHz | 1,506 MHz | 1,002 MHz | 1,002 MHz | 1,127 MHz
GPU Boost clock | 1,733 MHz | 1,683 MHz | 1,076 MHz | 1,076 MHz | 1,216 MHz
Memory Clock | 1,250 MHz | 2,000 MHz | 1,753 MHz | 1,753 MHz | 1,753 MHz
Memory Size | 8 GB | 8 GB | 12 GB | 6 GB | 4 GB
Memory Bus | 256-bit | 256-bit | 384-bit | 384-bit | 256-bit
Memory Bandwidth | 320 GB/s | 256 GB/s | 337 GB/s | 337 GB/s | 224 GB/s
FP Performance | 9.0 TFLOPS | 6.45 TFLOPS | 7.0 TFLOPS | 6.4 TFLOPS | 4.61 TFLOPS
GPU Thermal Threshold | 94 Degrees C | 94 Degrees C | 91 Degrees C | 91 Degrees C | 95 Degrees C
TDP | 180 Watts | 150 Watts | 250 Watts | 250 Watts | 165 Watts
Launch MSRP (ref) | $599/$699 | $379/$449 | $999 | $699 | $549


GeForce GTX 1080

The GeForce GTX 1080; the rumors have been correct, including the shader processor count of 2,560 shader processors for the 1080. This product is pretty wicked as it can manage really high clock frequencies whilst sticking to a 180 Watt TDP. The Nvidia GeForce GTX 1080 is based on a GP104-A1 GPU which holds 7.2 billion transistors (FinFET). The GeForce GTX 1080 is the card that comes fitted with Micron's new GDDR5X memory, a proper 8 GB of it. The card has a base clock of 1.61 GHz with a boost clock of 1.73 GHz, and that's just the Founders/reference design. This edition is capable of achieving 9 TFLOPS of single precision performance. To compare it a little, a reference design GeForce GTX 980 pushes 4.6 TFLOPS and a 980 Ti can push 6.4 TFLOPS. Overclocks over 2 GHz on the boost frequency are possible; in fact, just at default clocks (depending on load and title) we've seen ~1,850 MHz clock frequencies. The memory base frequency is 2,500 MHz on a 256-bit wide memory bus, but being GDDR5X that reads as 5,000 MHz, and doubled up (double data rate) that means an effective data rate of 10 Gbps, and thus an effective speed of 10 GHz. Nvidia sells its reference card (named 'Founders Edition') at 699 USD; the board partner cards will start at 599 USD for the most simple models.
The GeForce GTX 1070 has a very similar GP104 GPU based on the Pascal architecture and is the more affordable product. This card comes with "regular" GDDR5 memory, so the effective memory bandwidth is lower (256 GB/s versus 320 GB/s for the 1080). The product also uses a GP104 Pascal GPU, but has a smaller number of active shader processors. In the Founders Edition configuration (the reference design) it still offers 6.5 TFLOPS of single precision performance.
The GeForce GTX 1080 comes with 2,560 shader (CUDA) cores while its little brother, the GeForce GTX 1070, gets 1,920 shader processors at its disposal. The change in shader count is among the biggest differentiators, together with the ROP and TMU counts and the memory tied to them.
  • GeForce GTX 960 has 1,024 shader processors and 2 GB of GDDR5 graphics memory.
  • GeForce GTX 970 has 1,664 shader processors and 4 GB of GDDR5 graphics memory.
  • GeForce GTX 980 has 2,048 shader processors and 4 GB of GDDR5 graphics memory.
  • GeForce GTX Titan X has 3,072 shader processors and 12 GB of GDDR5 graphics memory.
  • GeForce GTX 1070 has 1,920 shader processors and 8 GB of GDDR5 graphics memory.
  • GeForce GTX 1080 has 2,560 shader processors and 8 GB of GDDR5X graphics memory.
The product is obviously PCI-Express 3.0 compatible and has a max TDP of around 180 Watts with a typical idle power draw of 5 to 10 Watts. That TDP is an overall maximum; on average your GPU will not consume that amount of power, so during gaming the average will be lower. Both Founders Edition cards run cool and silent enough.

Using Both GDDR5 & GDDR5X

You will have noticed the two memory types used already. What was interesting to see was another development: slowly but steadily graphics card manufacturers want to move to HBM memory, stacked High Bandwidth Memory that they can place on-package (close to the GPU die). HBM revision 1, however, is limited to four stacks of 1 GB, thus if used you'd only see 4 GB graphics cards. HBM2 can go to 8 GB and 16 GB, however that production process is just not yet ready and/or affordable enough for volume production. With HBM2 being expensive and in limited supply it's simply not the right time to make the move; Big Pascal, whenever it releases to the consumer in, say, some sort of Titan or Ti edition, will get HBM2 memory, 16 GB of it separated over 4 stacks. But we do not see Big Pascal (the Ti or Titan equivalent for Pascal) launching anytime sooner than Christmas or even Q1 2017. So with HBM/HBM2 out of the running, there are basically two solutions left: go with traditional GDDR5 memory or make use of GDDR5X, let's call that turbo GDDR5. Nvidia in fact opted for both; the GeForce GTX 1070 is fitted with "regular" GDDR5 memory, but to give the GTX 1080 a little extra bite in bandwidth they fit it with Micron's all new GDDR5X memory. So yes, the GP104 GPU can be tied to both memory types. The 1080 tied to GDDR5X DRAM is rather interesting. You can look at GDDR5X memory chips as your normal GDDR5 memory, however, as opposed to delivering 32 bytes per access to the memory cells, this is doubled up to 64 bytes per access. And that in theory can double up graphics card memory bandwidth; Pascal certainly likes large quantities of memory bandwidth to do its thing. Nvidia states it to be 256-bit GDDR5X @ 10 Gbps (which is an effective data rate).


Display Connectivity

Nvidia's Pascal generation products receive a nice upgrade in terms of monitor connectivity. First off, the cards get three DisplayPort connectors, one HDMI connector and a DVI connector. The days of ultra high resolution displays are here and Nvidia is adapting to them. The HDMI connector is HDMI revision 2.0b, which enables:
  • Transmission of High Dynamic Range (HDR) video
  • Bandwidth up to 18 Gbps
  • 4K@50/60 (2160p), which is 4 times the clarity of 1080p/60 video resolution
  • Up to 32 audio channels for a multi-dimensional immersive audio experience
DisplayPort-wise, compatibility has shifted upwards to DP 1.4, which provides 8.1 Gbps of bandwidth per lane and offers better color support using Display Stream Compression (DSC), a "visually lossless" form of compression that VESA says "enables up to 3:1 compression ratio." DisplayPort 1.4 can drive 60 Hz 8K displays and 120 Hz 4K displays with HDR "deep color" (a rough bandwidth sketch follows after the list below). DP 1.4 also supports:
  • Forward Error Correction: FEC, which overlays the DSC 1.2 transport, addresses the transport error resiliency needed for compressed video transport to external displays.
  • HDR meta transport: HDR meta transport uses the “secondary data packet” transport inherent in the DisplayPort standard to provide support for the current CTA 861.3 standard, which is useful for DP to HDMI 2.0a protocol conversion, among other examples. It also offers a flexible metadata packet transport to support future dynamic HDR standards.
  • Expanded audio transport: This spec extension covers capabilities such as 32 audio channels, a 1,536 kHz sample rate, and inclusion of all known audio formats.
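To get a feel for why DSC is needed for those 8K figures, here is a rough back-of-the-envelope sketch; the numbers are approximations that ignore blanking intervals and protocol overhead beyond 8b/10b encoding:

```python
# Rough DisplayPort 1.4 bandwidth estimate (ignores blanking and packet overhead).
LANES = 4
LANE_RATE_GBPS = 8.1                              # HBR3 per-lane rate
link_payload = LANES * LANE_RATE_GBPS * 8 / 10    # 8b/10b encoding -> ~25.9 Gbps usable

def video_gbps(width, height, refresh_hz, bits_per_pixel):
    return width * height * refresh_hz * bits_per_pixel / 1e9

uhd8k_60_hdr = video_gbps(7680, 4320, 60, 30)     # ~59.7 Gbps uncompressed at 10 bits per channel
print(f"Payload: {link_payload:.1f} Gbps, 8K60 10-bit: {uhd8k_60_hdr:.1f} Gbps, "
      f"with 3:1 DSC: {uhd8k_60_hdr / 3:.1f} Gbps")   # ~19.9 Gbps -> fits within the link
```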
      


High Dynamic Range (HDR) Display Compatibility

Nvidia obviously can now fully support HDR and deep color all the way. HDR is becoming a big thing, especially for the movie aficionados. Think better pixels, a wider color space, more contrast and more interesting content on that screen of yours. We've seen some demos on HDR screens, and it is pretty darn impressive to be honest. This year you will see the first HDR compatible Ultra HD TVs, and then next year likely monitors and games supporting it properly. HDR is the buzzword for 2016; with Ultra HD Blu-ray just released in Q1 2016, it will be a much welcomed feature. HDR increases the strength of light in terms of brightness. High-dynamic-range rendering (HDRR or HDR rendering), also known as high-dynamic-range lighting, is the rendering of computer graphics scenes using lighting calculations done in a larger dynamic range. This allows preservation of details that may be lost due to limiting contrast ratios. Video games and computer-generated movies and special effects benefit from this as it creates more realistic scenes than the more simplistic lighting models used before. With HDR you should remember three things: bright things can be really bright, dark things can be really dark, and details can be seen in both. High dynamic range reproduces a greater dynamic range of luminosity than is otherwise possible with digital imaging. We measure this in nits, and the number of nits for UHD screens and monitors is going up. What's a nit? A candle's brightness measured over one square meter is roughly 1 nit, the sun is about 1,600,000,000 nits, typical objects are 1 to 250 nits, current PC displays push 1 to 250 nits, and excellent HDTVs reach 350 to 400 nits. An HDR OLED screen is capable of 500 nits, and here is where it'll get more important: new screens in 2016 will go to 1,000 nits. HDR allows these high nit values to actually be used.
We think HDR will be implemented in 2016 for PC gaming; Hollywood already has end-to-end content ready, of course. As consumers start to demand higher-quality monitors, HDR technology is emerging to set an excitingly high bar for overall display quality. HDR panels are characterized by brightness between 600 and 1,200 cd/m² of luminance (with an industry goal to reach 2,000), contrast ratios that closely mirror human visual sensitivity to contrast (SMPTE 2084), and the Rec. 2020 color gamut that can produce over 1 billion colors at 10 bits per color. HDR can represent a greater range of luminance levels than can be achieved using more "traditional" methods, covering real-world scenes that contain anything from very bright, direct sunlight to extreme shade, or very faint nebulae. HDR displays can be designed with the deep black depth of OLED (black is zero, the pixel is disabled), or the vivid brightness of local dimming LCD. Meanwhile, if you cannot wait to play games in HDR and did purchase an HDR HDTV this year, you could stream it: an HDR game rendered on your PC with a Pascal GPU can be streamed towards your Nvidia Shield Android TV and then over HDMI to that HDR telly, as Pascal has support for 10-bit HEVC HDR encoding and the Shield Android TV can decode it. Hey, just sayin'. A selection of Ultra HD TVs is already available, and consumer monitors are expected to reach the market late 2016 and in 2017. Such displays will offer unrivaled color accuracy, saturation, brightness, and black depth; in short, they will come very close to simulating the real world.

The Pascal GPU

The GP104 is based on the DX12 compatible architecture called Pascal. Much like in past designs you will see pre-modelled SM clusters that hold 128 shader processors per cluster. Pascal GPUs are composed of different configurations of Graphics Processing Clusters (GPCs), Streaming Multiprocessors (SMs), and memory controllers. Each SM is paired with a PolyMorph Engine that handles vertex fetch, tessellation, viewport transformation, vertex attribute setup, and perspective correction. The GP104 PolyMorph Engine also includes a new Simultaneous Multi-Projection unit.
There are 20 active SM clusters in a fully enabled Pascal GP104 GPU; 20 x 128 shader processors makes a total of 2,560 shader processors. Each SM is, strictly speaking, a cluster of 64 shader / stream / CUDA processors doubled up, but don't let that confuse you: it is 128 shader units per SM. Each GPC ships with a dedicated raster engine and five SMs. Each SM contains 128 CUDA cores, 256 KB of register file capacity, a 96 KB shared memory unit, 48 KB of total L1 cache storage, and eight texture units. The reference (Founders Edition) card is released with a core clock frequency of 1.61 GHz and a boost frequency that can run up to 1.73 GHz (and even higher depending on load and thermals). As far as the memory specs of the GP104 GPU are concerned, these boards feature a 256-bit memory bus connected to a nice 8 GB of GDDR5 / GDDR5X video buffer memory, AKA VRAM, AKA framebuffer, AKA graphics memory. The GeForce GTX 1000 series is DirectX 12 ready; in our testing we'll address some async compute tests as well, as Pascal now has enhanced async compute. The latest revision of DX12 is a Windows 10 only feature, yet brings in significant optimizations. For your reference, here's a quick overview of some past generation high-end GeForce cards.
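Since the per-SM resources listed above scale linearly with SM count, a minimal sketch can reconstruct the chip totals from them (the per-SM figures are the ones quoted in this paragraph):

```python
# Derive GP104 (GTX 1080) totals from the per-SM figures quoted above.
SM_COUNT = 20
per_sm = {
    "CUDA cores": 128,
    "texture units": 8,
    "register file (KB)": 256,
    "shared memory (KB)": 96,
    "L1 cache (KB)": 48,
}

totals = {name: value * SM_COUNT for name, value in per_sm.items()}
for name, total in totals.items():
    print(f"{name}: {total}")   # 2,560 cores, 160 texture units, 5 MB of register file, etc.
```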


GeForce GTX780 Ti970980TitanTitan X107010801080 Gaming X
Stream (Shader) Processors2,8801,6642,0482,6883,0721,9202,5602,560
Core Clock (MHz)8751,0501,1268361,0021,5061,6071,709
Boost Clock9281,1781,2168761,0761,6831,7331,848
Memory Clock (effective MHz)7,0007,0007,0006,0007,0008,00010,00010,000
Memory amount3,0724,0964,0966,14412,2888,1928,1928,192
Memory Interface384-bit256-bit256-bit384-bit384-bit256-bit256-bit256-bit
Memory TypeGDDR5GDDR5GDDR5GDDR5GDDR5GDDR5GDDR5XGDDR5X

With 8 GB graphics memory available for one GPU, the GTX 1070 and 1080 are very attractive for both modern and future games no matter what resolution you game at.

Improved Color Compression

You will have noticed the GDDR5X memory on the 1080; it increases bandwidth, and you can never have too much of it. There are also other tricks to save on memory bandwidth, color compression being one of them. The GPU's compression pipeline has a number of different algorithms that intelligently determine the most efficient way to compress the data. One of the most important algorithms is delta color compression. With delta color compression, the GPU calculates the differences between pixels in a block and stores the block as a set of reference pixels plus the delta values from the reference. If the deltas are small then only a few bits per pixel are needed. If the packed together result of reference values plus delta values is less than half the uncompressed storage size, then delta color compression succeeds and the data is stored at half size (2:1 compression). The GeForce GTX 1080 includes a significantly enhanced delta color compression capability:
  • 2:1 compression has been enhanced to be effective more often
  • A new 4:1 delta color compression mode has been added to cover cases where the per pixel deltas are very small and are possible to pack into ¼ of the original storage
  • A new 8:1 delta color compression mode combines 4:1 constant color compression of 2x2 pixel blocks with 2:1 compression of the deltas between those blocks.
With that additional memory bandwidth combined with new advancements in color compression, Nvidia can claim even more effective bandwidth, as Pascal cards now use 4th generation delta color compression with enhanced caching techniques. Up to Maxwell the GPU could handle 2:1 color compression ratios; newly added are 4:1 and 8:1 delta color compression. So on one hand the raw memory bandwidth increases 1.4x (for the GeForce GTX 1080 with GDDR5X), and then there's a compression benefit of roughly 1.2x, which is a nice step up in this generation technology-wise. Overall there is an increase of roughly 1.6x to 1.7x in effective memory bandwidth thanks to the faster memory and the new color compression technologies. The effectiveness of delta color compression depends on which pixel ordering is chosen for the delta color calculation, but overall the GPU is able to significantly reduce the number of bytes that have to be fetched from memory per frame.
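To illustrate the idea behind the delta color compression described above, here is a much simplified toy sketch, not Nvidia's actual hardware algorithm: it stores one reference value per block plus per-pixel deltas, and only claims 2:1 compression when the deltas are small enough to fit in half the original storage.

```python
# Toy model of 2:1 delta color compression on a block of 8-bit pixel channel values.
# This illustrates the concept only, not the real hardware implementation.
def try_delta_compress(block, bits_per_pixel=8):
    reference = block[0]
    deltas = [p - reference for p in block]
    # Bits needed to store the largest delta (one extra bit for the sign).
    delta_bits = max(abs(d) for d in deltas).bit_length() + 1
    uncompressed_bits = len(block) * bits_per_pixel
    compressed_bits = bits_per_pixel + len(block) * delta_bits  # reference + packed deltas
    if compressed_bits <= uncompressed_bits // 2:
        return ("2:1 compressed", compressed_bits)
    return ("stored uncompressed", uncompressed_bits)

smooth_sky   = [120, 121, 121, 122] * 16     # neighbouring pixels, tiny deltas
noisy_detail = list(range(0, 256, 4))        # large deltas, compression fails

print(try_delta_compress(smooth_sky))        # ('2:1 compressed', 200)
print(try_delta_compress(noisy_detail))      # ('stored uncompressed', 512)
```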

Pascal Graphics Architecture

Let's place the more important GPU data into a chart to get a better overview of the changes in terms of architecture, like shaders and ROPs, and where we are frequency-wise:

GeForce | GTX 1080 | GTX 1070 | GTX Titan X | GTX 980 Ti | GTX 980
GPU | GP104 | GP104 | GM200 | GM200 | GM204
Architecture | Pascal | Pascal | Maxwell | Maxwell | Maxwell
Transistor count | 7.2 Billion | 7.2 Billion | 8 Billion | 8 Billion | 5.2 Billion
Fabrication Node | TSMC 16 nm FF | TSMC 16 nm FF | TSMC 28 nm | TSMC 28 nm | TSMC 28 nm
CUDA Cores | 2,560 | 1,920 | 3,072 | 2,816 | 2,048
SMMs / SMXs | 20 | 15 | 24 | 22 | 16
ROPs | 64 | 64 | 96 | 96 | 64
GPU Clock Core | 1,607 MHz | 1,506 MHz | 1,002 MHz | 1,002 MHz | 1,127 MHz
GPU Boost clock | 1,733 MHz | 1,683 MHz | 1,076 MHz | 1,076 MHz | 1,216 MHz
Memory Clock | 1,250 MHz | 2,000 MHz | 1,753 MHz | 1,753 MHz | 1,753 MHz
Memory Size | 8 GB | 8 GB | 12 GB | 6 GB | 4 GB
Memory Bus | 256-bit | 256-bit | 384-bit | 384-bit | 256-bit
Memory Bandwidth | 320 GB/s | 256 GB/s | 337 GB/s | 337 GB/s | 224 GB/s
FP Performance | 9 TFLOPS | 6.5 TFLOPS | 7.0 TFLOPS | 6.4 TFLOPS | 4.61 TFLOPS
Thermal Threshold | 94 Degrees C | 94 Degrees C | 91 Degrees C | 91 Degrees C | 95 Degrees C
TDP | 180 Watts | 150 Watts | 250 Watts | 250 Watts | 165 Watts
Launch MSRP | $599/$699 | $379/$449 | $999 | $699 | $549

So we talked about the core clocks, specifications and memory partitions. However, to be able to better understand a graphics processor you simply need to break it down into tiny pieces. Let's first look at the raw data that most of you can understand and grasp; this bit is about the architecture. NVIDIA's Pascal GPU architecture implements a number of architectural enhancements designed to extract even more performance and more power efficiency per watt consumed. Above, in the block diagram, we see the GP104 architecture visualized; Nvidia started developing the Pascal architecture around 2013/2014 already. The GPU holds four GPCs with five SM (streaming multiprocessor) clusters each, 20 in total. You'll spot eight 32-bit memory interfaces, bringing in a 256-bit path to the GDDR5 or GDDR5X graphics memory. Tied to each 32-bit memory controller are eight ROP units and 256 KB of L2 cache. The full GP104 chip used in the GTX 1080 ships with a total of 64 ROPs and 2,048 KB of L2 cache.
A fully enabled GP104 GPU will have:
  • 2,560 CUDA/Shader/Stream processors
  • There are 128 CUDA cores (shader processors) per cluster (SM)
  • 7.2 Billion transistors (16 nm FinFET)
  • 160 Texture units
  • 64 ROP units
  • 2 MB L2 cache
  • 256-bit GDDR5 / GDDR5X
What about double precision? It's dumbed down so as to not interfere with Quadro sales; double-precision instruction throughput is 1/32 the rate of single-precision instruction throughput. An important thing to focus on is the SM (block of shader processors) cluster, which has 128 shader processors. One SM holds 128 single-precision shader cores, double-precision units, special function units (SFU), and load/store units. Based on a full 20 SM (2,560 shader processor) chip, the design looks fairly familiar. In the pipeline we run into the ROP (Raster Operation) engine, and the GP104 has 64 ROP units for features like pixel blending and AA. Each SM further has its L1 cache, which can be utilized as a read-only texture cache; the GPU's texture units are a valuable resource for compute programs with a need to sample or filter image data. As for texture throughput, each SM contains 8 texture filtering units.
  • GeForce GTX 960 has 8 SMX x 8 Texture units = 64
  • GeForce GTX 970 has 13 SMX x 8 Texture units = 104
  • GeForce GTX 980 has 16 SMX x 8 Texture units = 128
  • GeForce GTX Titan X has 24 SMX x 8 Texture units = 192
  • GeForce GTX 1070 has 15 SMX x 8 Texture units = 120
  • GeForce GTX 1080 has 20 SMX x 8 Texture units = 160
So there's a total of up to 20 SM x 8 TU = 160 texture filtering units available in the silicon itself (if all SMs are enabled for the SKU).
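Those TFLOPS figures from the spec tables can be derived from the shader count and clock speed as well; a minimal sketch under the usual assumption of two FMA operations per core per clock, with FP64 at the 1/32 rate mentioned above:

```python
# Single and double precision throughput estimates from core count and boost clock.
def tflops(cuda_cores, clock_mhz, ops_per_core_per_clock=2):
    return cuda_cores * clock_mhz * 1e6 * ops_per_core_per_clock / 1e12

fp32 = tflops(2560, 1733)        # reference GTX 1080 boost clock -> ~8.9 TFLOPS
fp64 = fp32 / 32                 # GP104 runs FP64 at 1/32 the FP32 rate
print(f"FP32: {fp32:.2f} TFLOPS, FP64: {fp64:.2f} TFLOPS")
```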

Asynchronous Compute

Modern gaming workloads are increasingly complex, with multiple independent, or “asynchronous,” workloads that ultimately work together to contribute to the final rendered image. Some examples of asynchronous compute workloads include:
  • GPU-based physics and audio processing
  • Postprocessing of rendered frames 
  • Asynchronous timewarp, a technique used in VR to regenerate a final frame based on head position just before display scanout, interrupting the rendering of the next frame to do so 
These asynchronous workloads create two new scenarios for the GPU architecture to consider. The first scenario involves overlapping workloads. Certain types of workloads do not fill the GPU completely by themselves. In these cases there is a performance opportunity to run two workloads at the same time, sharing the GPU and running more efficiently; for example a PhysX workload running concurrently with graphics rendering. For overlapping workloads, Pascal introduces support for "dynamic load balancing." In Maxwell generation GPUs, overlapping workloads were implemented with static partitioning of the GPU into a subset that runs graphics and a subset that runs compute. This is efficient provided that the balance of work between the two loads roughly matches the partitioning ratio. However, if the compute workload takes longer than the graphics workload, and both need to complete before new work can be done, then the portion of the GPU configured to run graphics will go idle. This can cause reduced performance that may exceed any performance benefit that would have been provided by running the workloads overlapped. Hardware dynamic load balancing addresses this issue by allowing either workload to fill the rest of the machine if idle resources are available.
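A toy model, purely illustrative and not how the hardware actually schedules work, shows why static partitioning hurts when the split does not match the workload:

```python
# Toy model: graphics and compute work overlapping on one GPU (arbitrary work units).
def static_partition(graphics_work, compute_work, graphics_fraction):
    # Each partition runs at a fixed fraction of the GPU for the whole duration.
    return max(graphics_work / graphics_fraction, compute_work / (1 - graphics_fraction))

def dynamic_balance(graphics_work, compute_work):
    # Whichever workload finishes first frees its resources for the other;
    # in the ideal case the GPU is never idle, so total time ~ total work.
    return graphics_work + compute_work

g, c = 60, 80                       # the compute job takes longer than the 50/50 split assumed
print(static_partition(g, c, 0.5))  # 160 time units: half the GPU sits idle at the end
print(dynamic_balance(g, c))        # 140 time units: the idle portion is reclaimed
```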


Time critical workloads are the second important asynchronous compute scenario. For example, an asynchronous timewarp operation must complete before scanout starts or a frame will be dropped. In this scenario, the GPU needs to support very fast and low latency preemption to move the less critical workload off of the GPU so that the more critical workload can run as soon as possible. A single rendering command from a game engine can potentially contain hundreds of draw calls, with each draw call containing hundreds of triangles, and each triangle containing hundreds of pixels that have to be shaded and rendered. A traditional GPU implementation that implements preemption at a high level in the graphics pipeline would have to complete all of this work before switching tasks, resulting in a potentially very long delay. To address this issue, Pascal is the first GPU architecture to implement Pixel Level Preemption. The graphics units of Pascal have been enhanced to keep track of their intermediate progress on rendering work, so that when preemption is requested, they can stop where they are, save off context information about where to start up again later, and preempt quickly. The illustration below shows a preemption request being executed.



In the command pushbuffer, three draw calls have been executed, one is in process and two are waiting. The current draw call has six triangles, three have been processed, one is being rasterized and two are waiting. The triangle being rasterized is about halfway through. When a preemption request is received, the rasterizer, triangle shading and command pushbuffer processor will all stop and save off their current position. Pixels that have already been rasterized will finish pixel shading and then the GPU is ready to take on the new high priority workload. The entire process of switching to a new workload can complete in less than 100 microseconds (µs) after the pixel shading work is finished. Pascal also has enhanced preemption support for compute workloads. Thread Level Preemption for compute operates similarly to Pixel Level Preemption for graphics. Compute workloads are composed of multiple grids of thread blocks, each grid containing many threads. When a preemption request is received, the threads that are currently running on the SMs are completed. Other units save their current position to be ready to pick up where they left off later, and then the GPU is ready to switch tasks. The entire process of switching tasks can complete in less than 100 µs after the currently running threads finish. For gaming workloads, the combination of pixel level graphics preemption and thread level compute preemption gives Pascal the ability to switch workloads extremely quickly with minimal preemption overhead.
Technologies introduced with Pascal: with the announcement of the GeForce GTX 1070 & 1080, Nvidia also launched a few new technologies, including a new way of making screenshots, VR audio, improved multi-monitor support and SLI changes. These are items I wanted to run through.

Nvidia Ansel: 360 Degree Screenshots

Named after a famous photographer, Nvidia intros Ansel, a new way of making in-game screenshots. Capturing stills from games we pretty much do on a daily basis here at Guru3D. With the Ansel announcement Nvidia steps it up a notch, Nvidia even called it an Artform (I wouldn't go that far though). Screenshots typically are based on a 2D image taken from a 3D rendered scene. Nvidia figures (with VR in mind) why not grab a 360 screenshot in-game (if the game supports Ansel technology) so that you can grab a still, save it and then later on use your VR headset to look at the screenshot in 3D. It can also be used to create super-resolution screenshots or just "regular" screenshots to which you can then apply EXR effects and filters.



Ansel offers the ability to grab screenshots in 3D at incredible resolutions, up to 61,440 x 34,560 pixels, with silly sized screengrabs that can be 1.5 GB for one grab. Funky, however, is that Nvidia "borrowed" some SweetFX ideas. After you've captured the screenshot you can alter the RAW data, make an image darker or lighter, set the color tone and thus apply filters to that screenshot (think Instagram effects). While not strictly necessary, Ansel was designed with VR in mind so that you can grab a still, alter it and then watch it in 3D with your Oculus Rift or HTC Vive. Ansel will also become available for previous generation products and is not a Pascal specific thing.
Ansel does need game support; some titles that do and will support it: Tom Clancy's The Division, The Witness, LawBreakers, The Witcher 3: Wild Hunt, Paragon, No Man's Sky and Unreal Tournament will be the first adopter games to offer support for this new way of grabbing screenshots.

Nvidia FAST SYNC

Nvidia is adding a new sync mode. This mode works especially well with high FPS games, for the folks that like to play Counter-Strike at 100 FPS. With this feature, Nvidia is basically decoupling the render engine and the display by using a third buffer. This method was specifically designed for that high FPS demographic; the new sync mode will eliminate stuttering and screen tearing in high FPS games and thus offers low latency across the board. It can be combined with G-SYNC (which works great in the lower spectrum of refresh rates). Fast Sync is a latency-conscious alternative to traditional Vertical Sync (V-SYNC) that eliminates tearing while allowing the GPU to render unrestrained by the refresh rate to reduce input latency. If you use V-SYNC ON, the pipeline gets back-pressured all the way to the game engine, and the entire pipeline slows down to the refresh rate of the display. With V-SYNC ON, the display is essentially telling the game engine to slow down, because only one frame can be effectively generated for every display refresh interval. The upside of V-SYNC ON is the elimination of frame tearing, but the downside is high input latency. When using V-SYNC OFF, the pipeline is told to ignore the display refresh rate and to deliver game frames as fast as possible. The upside of V-SYNC OFF is low input latency (as there is no back-pressure), but the downside is frame tearing. These are the choices that gamers face today, and the vast majority of eSports gamers are playing with V-SYNC OFF to leverage its lower input latencies, lending them a competitive edge. Unfortunately, tearing at high FPS causes a vast amount of jittering, which can hamper their gameplay.


NVIDIA has decoupled the front end of the render pipeline from the backend display hardware. This allows different ways to manipulate the display that can deliver new benefits to gamers. Fast Sync is one of the first applications of this new approach. With Fast Sync, there is no flow control. The game engine works as if V-SYNC is OFF. And because there is no back-pressure, input latency is almost as low as with V-SYNC OFF. Best of all, there is no tearing because FAST SYNC chooses which of the rendered frames to scan to the display. FAST SYNC allows the front of the pipeline to run as fast as it can, and it determines which frames to scan out to the display, while simultaneously preserving entire frames so they are displayed without tearing. The experience that FAST SYNC delivers, depending on frame rate, is roughly equal to the clarity of V-SYNC ON combined with the low latency of V-SYNC OFF.
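Conceptually, Fast Sync keeps the renderer running flat out and simply scans out the most recently completed frame at each refresh. A rough sketch of that selection logic follows; this is an illustration of the idea only, not Nvidia's implementation:

```python
# Toy model of Fast Sync frame selection: render at 200 FPS, display at 60 Hz.
RENDER_FPS, REFRESH_HZ = 200, 60

def scanned_frames(duration_s=0.1):
    shown = []
    for refresh in range(int(duration_s * REFRESH_HZ)):
        scanout_time = refresh / REFRESH_HZ
        # Pick the last frame the renderer completed before this refresh (no back-pressure).
        last_complete = int(scanout_time * RENDER_FPS)
        shown.append(last_complete)
    return shown

print(scanned_frames())   # [0, 3, 6, 10, 13, 16]: whole frames, no tearing, some simply skipped
```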

VR Audio

Another technology that was introduced is again VR related. Nvidia offers VRWorks, a set of developer tools that allows devs to improve their games for VR. One of the new additions is VRWorks Audio; this technology makes it possible to simulate, on the GPU, the reflection and absorption of audio waves within a virtual 3D space. Basically the GPU will calculate and predict how certain audio would sound when it bounces off a hard floor or a soft one, combined with the other objects it bounces off. For example, if you talk in a room with concrete walls it would sound different compared to that same room with carpets hanging on the walls. The last time a GPU manufacturer added 3D audio processing on the GPU it failed to become a success; that would be AMD TrueAudio.



The question arises, will VRWorks Audio create enough interest and momentum so that developers will actually implement it? To demo all the possibilities of VRWorks, Nvidia will soon release a new demo application called Nvidia VR Funhouse. The application will not just demo VRWorks Audio but also physics simulations in a VR environment.

SMP - Simultaneous Multi-Projection

One of the more interesting and bigger new technologies demoed at the GeForce GTX 1000 series Pascal launch was Simultaneous Multi-Projection, which is pretty brilliant for people using three monitors. You know the drill: when you place your right and left monitors at an angle, the game image gets warped and looks as if the angles do not match. See the screengrab below:

Look at the angle related bends when you place your surround monitors at an angle.
New with Pascal is Simultaneous Multi-Projection, a technology that allows the GPU to calculate a maximum of sixteen camera viewpoints at nearly no performance loss. Previous generation GPU architectures can only do the math on one camera viewing angle / point of view. This feature is not software based, it is located in hardware in the GPU pipeline. So why would you want to be able to do this, you might wonder? Well, I spilled the beans already a bit in the opening paragraph. Currently when you game on multiple screens you are looking at a "projection" of one 2D image. Now, if you have one 2D image on three monitors, then it's only going to look good if the monitors are standing in a straight line next to each other. When you angle the monitors in a "curve" around you, the angles will distort the image, e.g. a straight line will have a bend in it. Some games have fixes for this, but nothing solid. Well, with Pascal this is a thing of the past, as this is exactly where Simultaneous Multi-Projection comes in. With the help of your driver control panel you can set the angle of your monitors so that it matches how you have physically set them up. The 3D imagery will then be calculated for each screen on the GPU, based on the angle of your monitors. So if you surround yourself with three monitors, the rendered images displayed will not be warped, but will be displayed correctly.


Have a look at the screengrab again, now with Simultaneous Multi-Projection enabled.
  
The beauty here is that, due to the added core logic on the GPU, this angle correction does not come with a performance loss, or at least only a very small one. SMP obviously also helps out in VR environments, where typically you need to do all kinds of tricks for the two rendered images versus the lenses and their warping. Being able to do this in one pass, in hardware on the GPU, will create big performance increases for the upcoming GeForce GTX 1070 and 1080 in VR. Again, this is hardware based and thus cannot be added to Fermi and/or Maxwell models with a driver update. The Simultaneous Multi-Projection Engine is capable of processing geometry through up to 16 preconfigured projections sharing the center of projection (the viewpoint), with up to 2 different projection centers, offset along the X axis. Projections can be independently tilted or rotated around an axis. Since each primitive may show up in multiple projections simultaneously, the SMP engine provides multi-cast functionality, allowing the application to instruct the GPU to replicate geometry up to 32 times (16 projections x 2 projection centers) without additional application overhead as the geometry flows through the pipe.
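The core idea is simply that each monitor gets its own projection rotated to match its physical angle. Here is a hypothetical sketch of how per-screen view directions could be derived from a monitor angle; illustrative only, not Nvidia's driver code, and the 30-degree angle is an assumption:

```python
import math

# Illustrative only: rotate the view direction per monitor to match its physical angle.
def per_monitor_yaw(monitor_count=3, angle_between_deg=30.0):
    center = (monitor_count - 1) / 2
    return [(i - center) * angle_between_deg for i in range(monitor_count)]

def view_direction(yaw_deg):
    yaw = math.radians(yaw_deg)
    return (math.sin(yaw), 0.0, math.cos(yaw))   # unit vector in the horizontal plane

for yaw in per_monitor_yaw():
    print(yaw, view_direction(yaw))   # left, center and right screens each get their own projection
```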

GPU Boost 3.0

A few years ago Nvidia introduced boost modes for their graphics cards. Typically a GPU clock frequency was fixed at a certain MHz; they altered that to a base frequency and a boost frequency. That boost frequency allows the GPU to reach higher clocks if, say, the temperature of the GPU is low enough, or the GPU load allows for it. Ever since it was introduced, dynamic clock frequencies and voltages have become a popular thing; Nvidia calls this GPU Boost, and it has now reached revision three. A fundamental change has been made, as the GPU is now even more adaptive and allows for per voltage point frequency offsets, meaning that at each voltage point there is a certain frequency offset that point can take. The advantage here is that each point can get an optimal voltage for your boost and thus overclocking frequency. It is highly complex, but it does offer a new way to make these cards run at even higher clock frequencies.
So basically the new additions are:
  1. Adjustable per-point clock frequency offset (controlled by the end user).
  2. Overvoltage setting mode changes (from an exact voltage to a set range of 0%~100%); voltages are now based on a percentage.
  3. A new limit option: no load limit (= utilization limit).
  4. The new NVAPI only supports Pascal GPUs, meaning the new features that we discuss and show today will only work on the GeForce GTX 1070 and 1080 (and other TBA products).
Basically, with future updates of overclocking software you will see multiple stages of control:
  • Regular voltage control in percentages (fixed/exact voltage offsets can no longer be used). The maximum voltage will vary based on temperature.
Then on the GPU core frequency:
  • Basic mode - a single clock frequency offset applied to all V/F points.
  • Linear mode - you can specify a frequency offset for the maximum and minimum clocks; in Afterburner this is linearly interpolated to fill the curve.
  • Manual mode - per-point frequency offset control through the V/F editor in Afterburner.


So, for tweaking, a new option has arrived. Previously there was a single frequency offset, e.g. +50 MHz on the boost (which can still be applied), but now you also have per-point frequency tweaking. We are not sure yet if this is something the end-user will like, as it seems a bit complex, and voltage is now set as an offset percentage. And if we have learned one thing over the years, it is that overclocking and voltage control need to be as simple as possible for everybody.
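To make the three offset modes a bit more concrete, here is a hypothetical sketch of how a tweaking tool could apply them to a voltage/frequency curve; the curve values themselves are made up for illustration:

```python
# Hypothetical V/F curve: (voltage in mV, stock frequency in MHz); the values are made up.
vf_curve = [(800, 1500), (900, 1650), (1000, 1780), (1050, 1850), (1093, 1900)]

def basic_mode(curve, offset_mhz):
    # One flat offset applied to every point, like the classic +50 MHz boost offset.
    return [(v, f + offset_mhz) for v, f in curve]

def linear_mode(curve, offset_min_mhz, offset_max_mhz):
    # Offset interpolated linearly from the lowest to the highest point of the curve.
    n = len(curve) - 1
    return [(v, f + offset_min_mhz + (offset_max_mhz - offset_min_mhz) * i / n)
            for i, (v, f) in enumerate(curve)]

def manual_mode(curve, per_point_offsets_mhz):
    # A separate offset for every V/F point, as exposed by a curve editor.
    return [(v, f + off) for (v, f), off in zip(curve, per_point_offsets_mhz)]

print(basic_mode(vf_curve, 50))
print(linear_mode(vf_curve, 0, 100))
print(manual_mode(vf_curve, [0, 25, 50, 75, 100]))
```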

An Update To SLI

With Pascal there is a change invoked for SLI. One critical ingredient to NVIDIA’s SLI technology is the SLI Bridge, which is a digital interface that transfers display data between GeForce graphics cards in a system. Two of these interfaces have historically been used to enable communications between three or more GPUs (i.e., 3-Way and 4-Way SLI configurations). The second SLI interface is required for these scenarios because all other GPUs need to transfer their rendered frames to the display connected to the master GPU, and up to this point each interface has been independent.



Beginning with NVIDIA Pascal GPUs, the two interfaces are now linked together to improve bandwidth between GPUs. This new dual-link SLI mode allows both SLI interfaces to be used in tandem to feed one Hi-res display or multiple displays for NVIDIA Surround. Dual-link SLI mode is supported with a new SLI Bridge called SLI HB. The bridge facilitates high-speed data transfer between GPUs, connecting both SLI interfaces, and is the best way to achieve full SLI clock speeds with GeForce GTX 1080 GPUs running in SLI. The GeForce GTX 1080 is also compatible with legacy SLI bridges; however, the GPU will be limited to the maximum speed of the bridge being used. Using this new SLI HB Bridge, GeForce GTX 1080’s new SLI interface runs at 650 MHz, compared to 400 MHz in previous GeForce GPUs using legacy SLI bridges. Where possible though, older SLI Bridges will also get a speed boost when used with Pascal. Specifically, custom bridges that include LED lighting will now operate at up to 650 MHz when used with GTX 1080, taking advantage of Pascal’s higher speed IO.

New Multi GPU Modes

Compared to prior DirectX APIs, Microsoft has made a number of changes that impact multi-GPU functionality in DirectX 12. At the highest level, there are two basic options for developers to use multi-GPU on NVIDIA hardware in DX12: Multi Display Adapter (MDA) Mode, and Linked Display Adapter (LDA) mode. LDA Mode has two forms: Implicit LDA Mode which NVIDIA uses for SLI, and Explicit LDA Mode where game developers handle much of the responsibility needed for multi-GPU operation to work successfully. MDA and LDA Explicit Mode were developed to give game developers more control. The following table summarizes the three modes supported on NVIDIA GPUs:



In LDA Mode, each GPU’s memory can be linked together to appear as one large pool of memory to the developer (although there are certain corner case exceptions regarding peer-to-peer memory); however, there is a performance penalty if the data needed resides in the other GPU’s memory, since the memory is accessed through inter-GPU peer-to-peer communication (like PCIe). In MDA Mode, each GPU’s memory is allocated independently of the other GPU: each GPU cannot directly access the other’s memory. LDA is intended for multi-GPU systems that have GPUs that are similar to each other, while MDA Mode has fewer restrictions—discrete GPUs can be paired with integrated GPUs, or with discrete GPUs from another manufacturer—but MDA Mode requires the developer to more carefully manage all of the operations that are needed for the GPUs to communicate with each other. By default, GeForce GTX 1070/1080 SLI supports up to two GPUs. 3-Way and 4-Way SLI modes are no longer recommended. As games have evolved, it is becoming increasingly difficult for these SLI modes to provide beneficial performance scaling for end users. For instance, many games become bottlenecked by the CPU when running 3-Way and 4-Way SLI, and games are increasingly using techniques that make it very difficult to extract frame-to-frame parallelism. Of course, systems will still be built targeting other Multi-GPU software models including:
  • MDA or LDA Explicit targeted
  • 2 Way SLI + dedicated PhysX GPU

Wait Just A Minute... So No More 3 & 4-Way SLI?

Correct, NVIDIA no longer recommends 3- or 4-way SLI setups and places its focus on 2-way SLI only. However, for those of you that do want more than two GPUs, there is a way (albeit a complex one): meet the Enthusiast Key. While NVIDIA no longer recommends 3- or 4-way SLI, they know that true enthusiasts will not be swayed, and in fact some games will continue to deliver great scaling beyond two GPUs. For this class of user they have developed an Enthusiast Key that can be downloaded from NVIDIA's website and loaded into an individual's GPU. This process involves:
  1. Run an app locally to generate a signature for your GPU
  2. Request an Enthusiast Key from an upcoming NVIDIA Enthusiast Key website
  3. Download your key
  4. Install your key to unlock the 3 and 4-way function
Full details on the process are available on the NVIDIA Enthusiast Key website, which will be available at the time GeForce GTX 1080/1070 GPUs are available in users’ hands.

Hardware Installation

Installation of any of the Nvidia GeForce cards is really easy. Once the card is seated into the PC make sure you hook up the monitor and of course any external power connectors like 6 and/or 8-pin PEG power connectors. Preferably get yourself a power supply that has these PCIe PEG connectors natively as converting them from a Molex Peripheral connector anno 2016 is sooo 2008. Purchase a proper power supply, ok?



Once done, we boot into Windows, install the latest drivers and after a reboot all should be working. No further configuration is required or needed unless you like to tweak the settings, for which you can open the NVIDIA control panel. 

Power Consumption

Let's have a look at how much power draw we measure with this graphics card installed. The methodology: We have a device constantly monitoring the power draw from the PC. We stress the GPU to the max, and the processor as little as possible. The before and after wattage will tell us roughly how much power a graphics card is consuming under load. Our test system is based on an eight-core Intel Core i7-5960X Extreme Edition setup on the X99 chipset platform. This setup is overclocked to 4.40 GHz on all cores. Next to that we have energy saving functions disabled for this motherboard and processor (to ensure consistent benchmark results). We'll be calculating the GPU power consumption here, not the total PC power consumption.
Measured Power Consumption
Mind you, the system wattage is measured at the wall socket side and there are other variables like PSU power efficiency. So this is an estimated value, albeit a very good one. Below, a chart of relative power consumption. Again, the Wattage shown is the card with the GPU(s) stressed 100%, showing only the peak GPU power draw, not the power consumption of the entire PC and not the average gaming power consumption.

Power consumption | Measured power draw (kW) | kWh price (EUR) | Cost 2 hrs/day (EUR) | Cost 4 hrs/day (EUR)
Graphics card measured TDP | 0.198 | 0.23 | 0.09 | 0.18
Cost 5 days per week / 4 hrs per day | EUR 0.91
Cost per month | EUR 3.95
Cost per year (5 days a week / 4 hrs a day) | EUR 47.36
Cost per year (5 days a week / 4 hrs a day) | $ 62.52
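The cost figures in the table above can be reproduced with a quick calculation; a minimal sketch using the same assumptions (198 W measured draw, EUR 0.23 per kWh, 4 hours a day, 5 days a week):

```python
# Reproduce the running-cost table from the measured board power and an assumed energy price.
POWER_KW = 0.198          # measured card power under load, in kW
PRICE_PER_KWH = 0.23      # assumed energy price in EUR

def cost(hours_per_day, days=1):
    return POWER_KW * hours_per_day * days * PRICE_PER_KWH

print(f"Per day (4 hrs): EUR {cost(4):.2f}")             # ~0.18
print(f"Per week (5 days): EUR {cost(4, 5):.2f}")        # ~0.91
print(f"Per year (52 weeks): EUR {cost(4, 5 * 52):.2f}") # ~47.36
```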
Here is Guru3D's power supply recommendation:
  • GeForce GTX 1080 - for an average system with this card we recommend a 600 Watt power supply unit.
  • GeForce GTX 1080 SLI - for an average system with these cards we recommend an 800 Watt power supply unit.
If you are going to overclock your GPU or processor, then we do recommend you purchase something with some more stamina. Also, at half the PSU load (50% usage) your PSU is the most energy efficient. There are many good PSUs out there, please do have a look at our many PSU reviews as we have loads of recommended PSUs for you to check out in there. What could happen if your PSU can't cope with the load is:
  • Bad 3D performance
  • Crashing games
  • Spontaneous reset or imminent shutdown of the PC
  • Freezing during gameplay
  • PSU overload can cause it to break down

Let's move to the next page where we'll look into GPU heat levels and noise levels coming from this graphics card.

Graphics Card Temperatures

So here we'll have a look at GPU temperatures. First up, IDLE (desktop) temperatures, as reported through software by the thermal sensors of the GPU. Overall, anything below 50 Degrees C is considered okay, anything below 40 Degrees C is nice. We threw in some cards at random that we have recently tested in the above chart. But what happens when we are gaming? We fire off an intense game-like application at the graphics card and measure the highest temperature of the GPU.
So with the cards fully stressed we kept monitoring temperatures and noted down the GPU temperature as reported by the thermal sensor.
  • The card's temperature under heavy game stress stabilized at roughly 70 Degrees C. We note down the hottest GPU reading, not the average.
These tests have been performed with a 20~21 Degrees C room temperature, this is a peak temperature based on a FireStrike loop. Please note that under 60 Degrees C the fans remain inactive, this will result in slightly higher IDLE temperatures.



Thermal Imaging Temperature Measurements

A new addition to our reviews is the inclusion of Forward Looking Infra Red (FLIR) thermal images of hardware. Over the past years we have been trying to figure out what the best possible way is to measure temperatures on hardware. Multiple options are available, but the best thing to do is to visualize the heat coming from the product or component being tested. The downside of thermal imaging hardware is simple: FLIR cameras with a bit of decent resolution cost up to 10,000 EUR. Hence we passed on it for a long time. With a thermal imaging camera a special lens focuses the infrared light emitted by all of the objects in view. This focused light is scanned by a phased array of infrared-detector elements. The detector elements create a very detailed temperature pattern called a thermogram. It only takes about one-thirtieth of a second for the detector array to obtain the temperature information to make the thermogram. This information is obtained from several thousand points in the field of view of the detector array. The thermogram created by the detector elements is translated into electric impulses. The impulses are sent to a signal-processing unit, a circuit board with a dedicated chip that translates the information from the elements into data for the display. The signal-processing unit sends the information to the display, where it appears as various colors depending on the intensity of the infrared emission. The combination of all the impulses from all of the elements creates the image. We can seek hotspots on the PCB indicating, for example, GPU but also VRM temperatures, as well as how heat is distributed throughout a product. We do hope you will enjoy this new technology as it did cost us an arm and a leg to implement.


We can measure temperatures pretty accurately at the GPU and VRM areas. Once we start to stress the GPU, the thermals quickly change. We can measure thermals down to a tenth of a degree; our thermal camera was calibrated in 2016.
  • We reach almost 70 degrees C on M1, the GPU marker at the back-plate, which is spot on with what the thermal sensor reports back.
  • At M3 (measure point 3) we can measure the PCB area / backplate temperature; it runs close to 44 Degrees C on that spot, which is considered a very normal temperature. Make sure you have plenty of airflow inside your chassis as that will always help. 
  • At M2 we are spot on the VRM area, at 80 Degrees C. That's OK, but a notch higher than I had hoped for.
The thermal image clearly shows that the back-plate is not trapping heat. We've seen closed back-plate designs on the Founders Edition cards, and those are not our favorites; we like to see lots of gaps to vent. So this is pretty good.
When we position the thermal camera outwards we can see that the overall cooler design really works well. The hottest point is the top side of the card where there is some residual heat detected. M1 is the hottest spot at 70 Degrees C, the VRM area. Remember, this is the graphics card 100% stressed in a FireStrike scene 1 loop.


A lot of heat is exhausted at the top side; at the M1 position we do see quite a bit of heat, again the VRM area. That heat ends up in your PC. The card produces quite a bit of heat under full stress, so ventilation inside your PC is a must.




Graphics Card Noise Levels

When graphics cards produce a lot of heat, usually that heat needs to be transported away from the hot core as fast as possible. Often you'll see massive active fan solutions that can indeed get rid of the heat, yet all these fans make the PC a noisy son of a gun. Do remember that the test we do is extremely subjective. We bought a certified dBA meter and will start measuring how many dBA originate from the PC. Why is this subjective you ask? Well, there is always noise in the background, from the streets, from the HDD, the PSU fan, etc, so this is by a mile or two an imprecise measurement. You could only achieve an objective measurement in a sound test chamber.
The human hearing system has different sensitivities at different frequencies. This means that the perception of noise is not at all equal at every frequency. Noise with significant measured levels (in dB) at high or low frequencies will not be as annoying as it would be when its energy is concentrated in the middle frequencies. In other words, the measured noise levels in dB will not reflect the actual human perception of the loudness of the noise. That's why we measure the dBA level. A specific circuit is added to the sound level meter to correct its reading in regard to this concept. This reading is the noise level in dBA. The letter A is added to indicate the correction that was made in the measurement. Frequencies below 1 kHz and above 6 kHz are attenuated, whereas frequencies between 1 kHz and 6 kHz are amplified by the A weighting.
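For those curious, the A-weighting correction the meter applies is a standardized curve (IEC 61672). The snippet below is purely illustrative of that curve and plays no role in our actual measurement procedure.

    # A-weighting gain in dB at a given frequency (0 dB at 1 kHz), per the
    # standard IEC 61672 definition. Illustrative only.
    import math

    def a_weighting_db(freq_hz):
        f2 = float(freq_hz) ** 2
        ra = (12194.0 ** 2 * f2 ** 2) / (
            (f2 + 20.6 ** 2)
            * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
            * (f2 + 12194.0 ** 2))
        return 20.0 * math.log10(ra) + 2.00

    for f in (100, 1000, 2500, 8000):
        print(f, round(a_weighting_db(f), 1))
    # roughly -19.1 dB at 100 Hz, 0.0 dB at 1 kHz, +1.3 dB at 2.5 kHz, -1.1 dB at 8 kHz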

Examples of Sound Levels

Jet takeoff (200 feet)          120 dBA
Construction site               110 dBA   Intolerable
Shout (5 feet)                  100 dBA
Heavy truck (50 feet)            90 dBA   Very noisy
Urban street                     80 dBA
Automobile interior              70 dBA   Noisy
Normal conversation (3 feet)     60 dBA
Office, classroom                50 dBA   Moderate
Living room                      40 dBA
Bedroom at night                 30 dBA   Quiet
Broadcast studio                 20 dBA
Rustling leaves                  10 dBA   Barely audible
There are a lot of differences in measurements among websites. Some even place the dBA meter 10 cm away from the card. Considering that's not where your ear is located, we do it our way, at 75 cm distance.

For each dBA test we close the PC/chassis and move the dBA gun 75 cm away from the PC. Roughly the same proximity you'll have from a PC in a real-world situation. Above, the IDLE (desktop mode) results where the GPU hardly has to do anything. The system idle results are really good.
Please note: under 60 Degrees C (idle) the fans do not spin and as such the card is inaudible. 33 dBA is the lowest we can measure (which is incredibly silent).
Once the card is in a fully stressed status (in-game) it touches only 38~39 dBA. The card under stress remains very silent. We also did not hear any coil whine/noise.




Test Environment & Equipment

Here is where we begin the benchmark portion of this article, but first let me show you our test system plus the software we used.
Mainboard
MSI X99A GODLIKE Gaming - Review
Processor
Core i7 5960X (Haswell-E) @ 4.4 GHz on all eight cores - Review
Graphics Cards
GeForce GTX 1080 - 8 GB GDDR5X graphics memory MSI Gaming X edition
Memory
16 GB (4x 4096 MB) 2,133 MHz DDR4
Power Supply Unit
1,200 Watts Platinum Certified Corsair AX1200i - Review
Monitor
Dell 3007WFP - QHD up to 2560x1600
ASUS PQ321 native 4K UHD Monitor  at 3840 x 2160 - Review
OS related software
Windows 10 64-bit
DirectX 9/10/11/12 End User Runtime (Download)
AMD Radeon Software Crimson Driver 16.5.2.1 (Download)
NVIDIA GeForce Driver 368.25 (Download)
Software benchmark suite
  • (DX12) Hitman (2016)
  • (DX12) Rise of the Tomb Raider (2016) 
  • (DX12) Ashes of Singularity
  • (DX11) The Division
  • (DX11) Far Cry Primal
  • (DX11) Fallout 4
  • (DX11) Anno 2205 (2016)
  • (DX11) Battlefield Hardline
  • (DX11) Grand Theft Auto V
  • (DX11) The Witcher III
  • (DX11) 3DMark 11
  • (DX11) 3DMark 2013 FireStrike
  • (DX11) Thief
  • (DX11) Alien: Isolation
  • (DX11) LOTR Middle Earth: Shadow of Mordor
A Word About "FPS"
What are we looking for in gaming, performance wise? First off, obviously Guru3D tends to think that all games should be played at the best image quality (IQ) possible. There's a dilemma though: IQ often interferes with the performance of a graphics card. We measure this in FPS, the number of frames a graphics card can render per second; the higher it is, the more fluidly your game will be displayed.
A game's frames per second (FPS) is a measured average of a series of tests. That test is often a time demo, a recorded part of the game which is a 1:1 representation of the actual game and its gameplay experience. After forcing the same image quality settings, this time-demo is then used for all graphics cards so that the actual measuring is as objective as possible.
Frames per second    Gameplay
<30 FPS              Very limited gameplay
30-40 FPS            Average yet very playable
40-60 FPS            Good gameplay
>60 FPS              Best possible gameplay
  • So if a graphics card barely manages less than 30 FPS, then the game is not very playable, we want to avoid that at all cost.
  • With 30 FPS up to roughly 40 FPS you'll be able to play the game quite well, with perhaps a tiny stutter at certain graphically intensive parts. Overall a very enjoyable experience. Match the best possible resolution to this result and you'll have the best possible rendering quality versus resolution; you want both of them to be as high as possible.
  • When a graphics card is doing 60 FPS on average or higher then you can rest assured that the game will likely play extremely smoothly at every point in the game, turn on every possible in-game IQ setting.
  • Over 100 FPS? You either have a MONSTER graphics card or a very old game.
Monitor Setup
Before playing games, setting up your monitor's contrast & brightness levels is a very important thing to do. I realized recently that a lot of you guys have set up your monitor improperly. How do we know this? Because we receive a couple of emails every now and then telling us that a reader can't distinguish between the benchmark charts (colors) in our reviews. If that happens, your monitor is not properly set up.
What Are You Looking For?
  • Top bar - This simple test pattern is evenly spaced from 0 to 255 brightness levels, with no profile embedded. If your monitor is correctly set up, you should be able to distinguish each step, and each step should be visually distinct from its neighbours by the same amount. Also, the dark-end step differences should be about the same as the light-end step differences. Finally, the first step should be completely black.
  • The three lower blocks - The far left box is black with, in the middle, a smaller box a tint lighter than black. The middle box is a lined square with a central grey square. The far right white box has a smaller "grey" box that should barely be visible.
You should be able to distinguish all the small differences; only then is your monitor set up properly, contrast and saturation wise.

DX12: Rise of the Tomb Raider (2016)

The game begins with a small camp at Siberia, where Lara is gazing at the huge mountains that await them. Jonah tells her that the others won't be accompanying them anymore except him. The climb soon becomes challenging, with Lara nearly falling down to her death. As Lara reaches the top, she sees the ruins of Kitezh from afar, but the storm strikes the mountain and separates the two.
 This particular test has the following enabled:
  • DX12
  • Very high Quality mode
  • FXAA/HBAO+ enabled
  • 16x AF enabled
  • Pure Hair Normal (on)
  • Tessellation On
Our scores are average frame rates so you need to keep a margin in mind for lower FPS at all times. As such, we say 40 FPS in your preferred monitor resolution for this game should be your minimum, while 60 FPS (frames per second) can be considered optimal.  




DX12: Hitman (2016)

Hitman is a stealth video game series developed by the Danish company IO Interactive. The story revolves around Agent 47 (usually simply referred to as "47" or "Mr. 47"), a cloned assassin-for-hire, whose flawless record places him in high demand among the wealthy and elite.
  • DirectX 12
  • Ultra Quality settings
  • MSAA
  • 16x AF
  • Internal benchmark
Here is where we return to single card performance again, with a wide scope of products. We think WQHD at 2560x1440 is going to be a sweet spot for gamers this year, resolution wise. Hitman's system requirements will vary greatly depending on the graphics settings, but a modest PC will be required to run the game at medium settings or better. Hitman is built on the Glacier game engine and will have variable system requirements.


DX12: Ashes Of The Singularity Benchmark

Oxide Games released its RTS game Ashes of the Singularity, a title powered by their Nitrous game engine. The game's look and feel somewhat resembles Total Annihilation, with large numbers of units on-screen simultaneously, and heavy action between ground and flying units. The game has been in development for several years, and it's the debut title for the new Nitrous engine.
Truth be told, we remain a little hesitant in deciding whether or not to use this benchmark in our test suite. But it does show improvements in async compute, hence we have added this DX12 class title. We use the internal benchmark with high quality settings. The test run executes an identical flyby pass and tests various missions and unit match-ups. This benchmark doesn't pre-compute the results. Every segment of the game engine, including its AI, audio, physics, and firing solutions, is executed in real-time, every single time, making it a valid test. Also, not using it would raise even more questions I guess, so from this point onwards we'll start to include results.
We measure at high detail settings at a monitor resolution of 2560x1440 - You will notice that the benchmark results designate batches, CPU framerate and the average of all frames. Batches can be seen as predefined draw calls. "Normal" batches contain a relatively light number of draw calls, heavy batches are those frames that include a huge number of draw calls, increasing scene complexity with loads of extra objects to render and move around. The idea here is that with a proper GPU you can fire off heavy batches with massive draw calls. And as we all know, one of the major benefits of DirectX 12 is the ability to increase draw calls to the system with lower API overhead, resulting in more complex scenes and better use of lower spec hardware. As with all benchmarks, higher = better. 



Additional chart with different workloads (batches)

OpenGL: DOOM (2016)

Doom (stylized as DOOM) is a first-person shooter video game developed by id Software and published by Bethesda Softworks. The game is a reboot of the Doom series and is the first major installment in the series since the release of Doom 3 in 2004. Doom's single-player mode has "badass demons, big effing guns, and moving really fast" as key principles, according to id Software executive producer Marty Stratton. The game features a large arsenal of weapons, which can be collected and freely switched by players throughout the game. Weapons such as the super shotgun and BFG 9000 return.
Optimized for DOOM. We test this OpenGL title at Ultra quality settings with SMAA 1TX enabled.



DX11: Far Cry Primal

Set during the savage Stone Age, Far Cry Primal is a single player experience that takes place 10,000 BCE, in a time when massive beasts like the woolly mammoth and sabretooth tiger ruled the Earth. Players take on the role of Takkar, a seasoned hunter and the last surviving member of his tribe. Arriving in the majestic and savage land of Oros, Takkar has only one goal: survive in a world where humans are the prey. Players will journey as the first human to tame the wilderness and rise above extinction. Along the way, they will have to hunt for food, master fire, fend off fierce predators, craft weapons and tools partly from the bones of slain beasts and face off against other tribes to conquer Oros.
We use Very High quality settings as this offers an acceptable level of performance versus quality and gives the mid-range cards some breathing room as well (for objective comparison). Games typically should be able to run in the 40 FPS range combined with your monitor resolution. From there onwards you can enable/disable things if you need more performance or demand even better game rendering quality.

DX11: Anno 2205

Anno 2205 is a city-building and economic simulation game, with real-time strategy elements, developed by Ubisoft Blue Byte and published by Ubisoft. 2205 is the sixth game of the Anno series, and was released in November 2015. As with Anno 2070, the game is set in the future, with players having the opportunity to set up colonies on the moon.
We use near Ultra High quality settings  on everything (hey this is PC gaming right?) versus an acceptable level of performance. Games typically should be able to run in the 40 FPS range combined with your monitor resolution. Take a good look at our settings which we can recommend for the best PC experience. A land based RTS already feels good at 30 FPS (true that). From there onwards you can enable/disable things if you need more performance or demand even better game rendering quality.



Fallout 4 Benchmarks

After something happened in the vault (I won't share spoilers here) and as survivor of Vault 111, you enter a world destroyed by nuclear war. Every second is a fight for survival, and every choice is yours. Only you can rebuild and determine the fate of the Wasteland. First or third person combat. Combat slow motion, using VATS (Vault Assisted Targeting system). Cinematic combat carnage. Highly advanced crafting system. Collect, upgrade and build thousands of items. Build and manage entire settlements. For this title we will cover benchmarks in the sense of average framerates, we'll look at all popular resolutions scaling from Full HD (1920x1080/1200), WQHD (2560x1440) and of course that big-whopper of a resolution Ultra HD, UHDTV (2160p) is 3840 pixels wide by 2160 pixels tall (8.29 megapixels), which is four times as many pixels as 1920x1080 (2.07 megapixels). We use near Ultra quality settings on everything (hey this is PC gaming, right?) versus an acceptable level of performance. Games should be able to run in the 40 FPS range combined with your monitor resolution. From there onwards you can enable/disable things if you need more performance or demand even better game rendering quality.
Take a good look at our settings, which we can recommend for the best PC experience. Gameworks we already mentioned, and it has an effect on AMD Radeon cards; we used Shadow Quality at HIGH and Godrays quality at HIGH as well to balance things out between the two brands. The rest of the settings are maxed.




Grand Theft Auto V 

Grand Theft Auto V is an open world, action-adventure video game developed by Rockstar North and published by Rockstar Games. It was released on 17 September 2013 for the PlayStation 3 and Xbox 360. An enhanced version of the game was released on 18 November 2014 for the PlayStation 4 and Xbox One, and on 14 April 2015 for PC (Microsoft Windows). Our settings are as follows: very high quality, 16xAF, 2xMSAA and FXAA enabled. The game is the first main entry in the Grand Theft Auto series since 2008's Grand Theft Auto IV. Set within the fictional state of San Andreas (based on Southern California), the single-player story follows three criminals and their efforts to commit heists while under pressure from a government agency. The open world design lets players freely roam San Andreas, which includes open countryside and the fictional city of Los Santos (based on Los Angeles). 
* Entries with a 0 FPS could not be measured or have not been measured just yet.





DX11: Tom Clancy's The Division

Tom Clancy's The Division is an online open world third-person shooter role-playing video game developed and published by Ubisoft, with assistance from Red Storm Entertainment, for Microsoft Windows, PlayStation 4 and Xbox One. The Division takes place in mid-crisis Manhattan, an open world with destructive environments that are free for players to explore. The player's mission is to restore order by investigating the source of a virus. We use a time based measurement based upon the internal benchmark this title offers. These are some of the in-game quality settings options. We simply flick on the best (ULTRA) quality settings and then disable VSYNC. Games typically should be able to run in the 40 FPS range combined with your monitor resolution.

DX11: Thief 2014

Thief is a series of stealth video games in which the player takes the role of Garrett, a master thief in a fantasy/steampunk world resembling a cross between the Late Middle Ages and the Victorian era, with more advanced technologies interspersed. Thief is the fourth title in the Thief series, developed by Eidos Montreal and published by Square Enix. The story is set several hundred years after the events of the original series in the same universe (clues to the backstory are hidden among documents, plaques, and letters). The original master thief Garrett's (known as the legendary master Sneak Thief) iconic Mechanical Eye is one of the hidden Unique Loots in the game (and can be found inside a prison complex he apparently failed to escape). Other iconic factions such as the Keepers and Hammerites and other old gods have been outlawed, and now lie in ruins throughout the city and beneath it.
Image Quality Settings:
  • DX11
  • Very High Image Quality
  • 8x Anisotropic Filtering
  • Screenspace reflection on
  • Parallax Occlusion mapping on
  • FXAA on
  • Contact Hardening Shadows on
  • Tessellation on


The Witcher III: Wild Hunt Benchmarks

Wild Hunt has improved on several aspects from past games. Combat revolves around an action role-playing game system combined with the use of magic. The fighting system has been completely revamped. Wild Hunt introduces some new mechanics, such as witcher-sense, combat on horseback and at sea, swimming underwater and using a crossbow. Additionally, Geralt can now jump, climb, and vault over smaller obstacles. Our measurements are taken in game. Our settings are Ultra quality with AA enabled. Please find an overview of the exact settings here. Hairworks is DISABLED to objectively compare between AMD and Nvidia cards. 
Our test run has enabled:
  • DX11
  • Ultra mode
  • AA enabled
  • 16x AF enabled
  • SSAO enabled
  • Nvidia hairworks OFF
  • Other settings ON



DX11: Battlefield Hardline

Unlike the previous games in the series, Hardline focuses on crime/heist/policing elements instead of military warfare. Miami is embroiled in a drug war and Officer Nick Mendoza (Nicholas Gonzalez) has just made detective. Alongside his partner, veteran detective Khai Minh Dao (Kelly Hu), he follows the drug supply chain from the streets to the source. In a series of increasingly off-the-books cases the two detectives come to realize that power and corruption can affect both sides of the law. Before we begin with the graphics performance tests, a little explanation. We use a time based measurement based upon framerate recording. The test is representative for any modern age GPU.  
Our rather nice settings have enabled:
  • DX11
  • Ultra mode
  • 4x MSAA enabled
  • 16x AF enabled
  • HBAO enabled
  • Gator Bait level (close to the dock)
* Entries with a 0 FPS could not be measured or have not been measured just yet.




DX11: Alien Isolation

The game follows Amanda Ripley, who is investigating the disappearance of her mother. Amanda is transferred to the space station Sevastopol to find the flight recorder of the Nostromo only to discover an Alien has terrorized the station and killed the vast majority of the crew. The player has the ability to crouch to hide behind objects to break line of sight with the Alien, and the player can then covertly peek over or lean around to gain view. The player can also run and possesses both a flashlight and a motion tracker to detect the Alien's movements. However, using any of these creates noise or light, which increases the chance of the Alien finding the player. The player can go under tables or inside lockers to hide from the Alien, and will sometimes have to press a button to make Amanda hold her breath to avoid making noise. Levels are designed to be non-linear, with multiple entry and exit points for each room providing alternative routes for the Alien to attack or the player to escape.
For Alien Isolation we enabled:
  • Ultra Quality settings
  • Depth of Field
  • SSAO standard
  • Anisotropic filtering 16x
  • Volumetric Lighting On
  • SMAA T2x
  • Rest of settings maxed out
* Entries with a 0 FPS could not be measured or have not been measured just yet.

DX11 5K Ultra HD - Middle Earth: Shadow of Mordor Benchmarks

Middle-earth: Shadow of Mordor is an action role-playing video game set in The Lord of the Rings universe, developed by Monolith Productions and released by Warner Bros. The game takes place during the gap between the events of The Hobbit and The Lord of the Rings saga. We set image quality to in-game "Very High". Middle Earth: Shadow of Mordor has a built-in benchmark. It is not exactly the most precise tool in the shed, therefore you should always keep a variance of roughly 5% in mind. You can run this benchmark three times and each time the results can differ slightly. Also, switching in-between quality modes and restarting the game often results in the game forcing the old image quality settings again. We'll see how this benchmark is going to develop in the future though.


DX11: 3DMark 11

3DMark 11 is the latest version of what is probably the most popular graphics card benchmark series. Designed to measure your PC's gaming performance, 3DMark 11 makes extensive use of all the new features in DirectX 11, including tessellation, compute shaders and multi-threading. Trusted by gamers worldwide to give accurate and unbiased results, 3DMark 11 is the best way to consistently and reliably test DirectX 11 under game-like loads. We test 3DMark 11 in performance mode, which will give us a good indication of graphics card performance in the low, mid-range and high end graphics card segments.  

DX11: 3DMark FireStrike (2013)

3DMark includes everything you need to benchmark your hardware. With three all new tests you can bench everything from smartphones and tablets, to notebooks and home PCs, to the latest high-end, multi-GPU gaming desktops. And it's not just for Windows. With 3DMark you can compare your scores with Android and iOS devices too. Here (below) are 3DMark FireStrike results. Fire Strike is the new showcase DirectX 11 benchmark designed for high-performance gaming PCs.


It is FutureMark's most ambitious and technical benchmark ever, featuring real-time graphics rendered with detail and complexity far beyond what is found in other benchmarks and games today. 
We always have system info and validation disabled to prevent accidentally leaking results online prior to the embargo date. 



With a benchmark technology called FCAT on the following few pages, we will look into Frame Experience Analysis. With the charts shown we are trying to show you graphics anomalies like stutters and glitches in a plotted chart. Lately a new measurement has been introduced: latency measurement. Basically it is the inverse of FPS.
  • FPS mostly measures performance, the number of frames rendered per passing second.
  • Frametime AKA Frame Experience recordings mostly measures and exposes anomalies - here we look at how long it takes to render one frame. Measure that chronologically and you can see anomalies like peaks and dips in a plotted chart, indicating something could be off. 
Frame time (in milliseconds)    FPS
8.3                             120
15                               66
20                               50
25                               40
30                               33
50                               20
70                               14
We have a detailed article (read here) on the new FCAT methodology used, and it also explains why we do not use FRAPS anymore.
Frametime - Basically the time it takes to render one frame can be monitored and tagged with a number; this is latency. One frame can take, say, 17 ms. Higher latency can indicate a slow framerate, and weird latency spikes indicate stutters, jitter or twitches; basically anomalies that are visible on your monitor.
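The relation in the table above is simply frametime (ms) = 1000 / FPS. Below is a hedged sketch (not the actual FCAT toolchain) that takes a plain list of recorded frame times and derives the average FPS plus any spikes above the 40 ms mark we treat as potential stutter.

    # Illustrative only: derive average FPS and stutter spikes from a list of
    # frame times in milliseconds (the kind of data an FCAT recording yields).
    STUTTER_THRESHOLD_MS = 40.0

    def analyze(frametimes_ms):
        total_s = sum(frametimes_ms) / 1000.0
        avg_fps = len(frametimes_ms) / total_s
        spikes = [(i, t) for i, t in enumerate(frametimes_ms) if t > STUTTER_THRESHOLD_MS]
        return avg_fps, spikes

    # Example: a smooth ~120 FPS run (8.3 ms frames) with one 55 ms hitch.
    fps, spikes = analyze([8.3] * 118 + [55.0] + [8.3])
    print(round(fps, 1), spikes)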

What Do These Measurements Show?

Basically, what these measurements show are anomalies like small glitches and stutters that you can sometimes (and please do read that well, sometimes) see on-screen. Below, I'd like to run through a couple of titles with you. Keep in mind that average FPS matters more than frametime measurements.

Rise of the Tomb Raider Frame Experience Analysis



Above, a percentile chart of the first 31 seconds @ 2560x1440 of the benchmark recorded. In this particular chart we plot FPS and place it in relation to percentiles.
  • If you look at the X axis at 50%: for 50% of the measured time the framerate is close to 125 FPS in the first segment of the benchmark. This you can consider the average framerate (this is the intro scene where Lara walks over the snowy mountain); the sketch below shows how such a percentile is derived.
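As a hedged illustration of how such a percentile curve is built (the sample numbers below are fabricated, not taken from our recording): sort the per-frame FPS values and read off each percentile; the 50th percentile is what we quote in these bullets.

    # Simple percentile lookup over per-frame FPS values (rounded rank).
    # Sample data is made up for illustration; the real charts use FCAT data.
    def percentile(values, pct):
        ordered = sorted(values)
        rank = int(round(pct / 100.0 * (len(ordered) - 1)))
        return ordered[rank]

    fps_per_frame = [130, 128, 124, 126, 125, 119, 122, 127, 80, 131]
    for pct in (1, 25, 50, 75, 99):
        print(pct, percentile(fps_per_frame, pct))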

Now we move to latency measurements (frame-times). Above, the card at 2560x1440. On this 31 second run the graphics card manages extremely well; as you can see, there are no stutters recorded. This is perfect rendering (frame-time wise lower is better). At the end what you see is a scene change, it is not an anomaly. Now let's add a Radeon R9 Fury to the graph.

So on each benchmark page I will add one extra FCAT result, in here you can see the GeForce GTX 1080 and Radeon R9 Fury, just a little extra for comparison's sake. As you can see, in the intro scene the Fury produces stutters, I can visibly see these on screen, they are for real.

Hitman 2016  Frame Experience Analysis


Above, the percentile chart of the 31 seconds gaming @ 2560x1440, this is the intro run of the integrated benchmark. Here we plot FPS and place it in relation to percentiles.
  • GTX 1080 - For 50% of the measured time the framerate is roughly 95 FPS, so that is the average framerate (higher = better).

Once we plot the frame-time analysis, we cannot detect a glitch or stutter in the frametime scaling (chart wise) that is significant or long-lasting. Each square step above is a scene change indicating a higher/lower framerate.
For those that do not understand what you are looking at, the above is a game-time scene recorded for 30 seconds. With this chart, lower = better. Huge spikes above 40 ms to 50 ms can be considered a problem like a stutter or indicate a low framerate. These are impressive results, BTW this is DirectX 12 enabled.


And combined, the GTX 1080 and the R9 Fury from AMD in the 31 seconds run. Both cards perform excellently really, albeit we did see two stutters with the Fury, which you can clearly see in the plot. Hitman is continuously loading stuff in the background though, even during the benchmark run.



Ashes Of Singularity Frame Experience Analysis


Above, the percentile chart of a 31 second recording @ 2560x1440. Here we plot FPS and place it in relation to percentiles. We do have DirectX 12 enabled again.
  • GTX 1080 - For 50% of the measured time the framerate is roughly 100 FPS, so that is the average framerate (higher = better).
      

We expected a bit of trouble in DX12 with Nvidia cards, but that is not the case. We cannot see a glitch or stutter in the frametime scaling (chart wise) that is significant or long-lasting. With this chart, lower = better. Huge spikes above 40 ms to 50 ms can be considered a problem like a stutter, or indicate a low framerate.

And combined, the GTX 1080 and the R9 Fury from AMD. Though the cards perform well we do see more inconsistency with the Fury, especially in the final 5 seconds of our recording there's a lot of stuff going on. Though in their defense, none of that is really visible on screen. But yeah, it's a difference alright.




Tom Clancy's The Division Frame Experience Analysis


Above, the percentile chart of the 31 seconds @ 2560x1440. Here we plot FPS and place it in relation to percentiles.
  • GTX 1080 - For 50% of the measured time the framerate is roughly 80 FPS, so that is the average framerate (higher = better).

The above is a gametime scene recorded for 31 seconds. With this chart, lower = better. Huge spikes above 40 ms to 50 ms can be considered a problem like a stutter or indicate a low framerate. All cards show minor glitches so it is game engine rendering related. Again, these are impressive results.

And combined, the GTX 1080 and the R9 Fury from AMD in the 31 second run. Both cards perform equally in consistency really.



Alien: Isolation Frame Experience Analysis


Above, the percentile chart of the first 31 seconds of the internal benchmark @ 2560x1440. Here we plot FPS and place it in relation to percentiles.
  • GTX 1080 - For 50% of the measured time the framerate is roughly 180 FPS.
Please let me remind you of the fact that this is measured at a monitor resolution of 2560x1440. The higher the line, the better.

Again, with this chart, lower = better as the faster one frame is rendered, the lower latency will be. Huge spikes above 40 ms to 50 ms can be considered a problem like a stutter or indicate a low framerate. Well... that just nears perfection really...

And combined, the GTX 1080 and R9 Fury from AMD. Both cards perform excellently.



Middle Earth Mordor Frame Experience Analysis


Above, the percentile chart of 31 seconds gaming @ 2560x1440. Here we plot FPS and place it in relation to percentiles.
  • GTX 1080 - For 50% of the measured time the framerate is roughly 120 FPS, so that is the average framerate (higher = better).

We cannot detect a significant glitch or stutter in frametime scaling (chart wise) whatsoever, these are such impressive results. Basically this is what you want to see in any scenario or game.

And combined, the GTX 1080 and R9 Fury from AMD. Both cards perform excellently really, the single frame drop for the Fury means nothing really, that just might as well have happened with the GTX 1080.



Battlefield: Hardline Frame Experience Analysis


Above, the percentile chart of 31 seconds gaming @ 2560x1440. Here we plot FPS and place it in relation to percentiles.
  • GTX 1080 - For 50% of the measured time the framerate is roughly 100 FPS, so that is the average framerate (higher = better).

Again, we cannot detect a significant glitch or stutter in frametime scaling (chart wise). Above, a good example of rendering perfection.

And combined, the GTX 1080 and the R9 Fury from AMD in the same 31 second run. Both cards show excellent and stable frametimes really.

Far Cry Primal Frame Experience Analysis


Above, the percentile chart of 31 seconds gaming @ 2560x1440. Here we plot FPS and place it in relation to percentiles.

  • GTX 1080 - For 50% of the measured time the framerate is roughly 78 FPS, so that is the average framerate (higher = better).

Again, rendering perfection. We cannot detect a glitch or stutter in the frametime scaling (chart wise) that is significant or long-lasting. For those that do not understand what you are looking at, the above is a gametime scene recorded for 30 seconds. With this chart, lower = better. Huge spikes above 40 ms to 50 ms can be considered a problem like a stutter, or indicate a low framerate. All cards show minor glitches here, so it is game engine rendering related.

And combined, the GTX 1080 and the R9 Fury from AMD. Both cards perform excellently really, there's one small frame drop measured for team red and a very tiny stutter that you cannot even see on screen. Remember, you are looking at a plot of 1,860 rendered and analyzed frames there.

Grand Theft Auto V Frame Experience Analysis


Above, the percentile chart of 31 seconds gaming @ 2560x1440. Here we plot FPS and place it in relation to percentiles.

  • GTX 1080 - For 50% of the measured time the framerate is roughly 112 FPS, so that is the average framerate (higher = better).

Again we see sheer rendering perfection, as we cannot detect a glitch or stutter in the frametime scaling (chart wise) that is significant or long-lasting. For those that do not understand what you are looking at, the above is a gametime scene recorded for 30 seconds. With this chart, lower = better. Huge spikes above 40 ms to 50 ms can be considered a problem like a stutter, or indicate a low framerate. All cards show minor glitches, so it is game engine rendering related. Again, very impressive results.

And combined, the GTX 1080 and the R9 Fury from AMD. Both cards perform excellently really. Basically, in this scene we start up the game and walk among other pedestrians in the city.

Overclocking The Graphics Card

As most of you know, with most video cards you can apply a simple series of tricks to boost the overall performance a little. Typically you can tweak core clock frequencies and voltages. By increasing the frequency of the videocard's memory and GPU, we can make the videocard increase its calculation clock cycles per second. It sounds hard, but it can really be done in less than a few minutes. I always tend to recommend novice users and beginners not to increase the frequency any higher than 5% on the core and memory clock. Example: if your card runs at 600 MHz (which was pretty common a few generations ago), then I suggest that you don't increase the frequency any higher than 30 to 50 MHz.
More advanced users often push the frequency way higher. Usually when your 3D graphics start to show artifacts such as white dots ("snow"), you should back down 25 MHz and leave it at that. When you are overclocking too hard, it'll start to show artifacts, empty polygons, or it will even freeze. Carefully find that limit and then back down at least 20 MHz from the moment you notice an artifact. Look carefully and observe well. I really wouldn't know why you'd need to overclock today's tested card anyway, but we'll still show it. All in all... do it at your own risk! A quick sketch of that 5% rule follows below.
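Putting that 5% guideline into numbers (a quick, hedged sketch; the clock values are this card's and the example from the text, the rule itself is just our conservative starting point):

    # The conservative 5% starting point for a first manual overclock.
    # Guideline only; every individual card, cooler and chassis differs.
    SAFE_FRACTION = 0.05

    def safe_offset_mhz(stock_clock_mhz):
        return int(stock_clock_mhz * SAFE_FRACTION)

    print(safe_offset_mhz(600))    # the 600 MHz example above -> ~30 MHz
    print(safe_offset_mhz(1709))   # this card's 1709 MHz base clock -> ~85 MHz to start with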

 
                Original             This sample          Overclocked
Core Clock      1607 MHz             1709 MHz             1789 MHz
Boost Clock     1733 MHz             1848 MHz             1955~2067 MHz
Memory Clock    5005 / 10010 MHz     5005 / 10010 MHz     5622 / 11244 MHz
If anything, tweaking and overclocking has become more complicated starting with Pascal. You'll see that most cards out there all will tweak to roughly the same levels due to all kinds of hardware protection kicking in.
We applied the following settings:
  • Temp Target 95 Degrees C
  • Core (GPU) clock +80 MHz
  • Mem clock +575 MHz
  • Voltage +100%
  • Fan RPM 60% (remains really silent with TwinFrozr VI)
The Boost clock will now run at roughly 1950~2067 MHz, depending on the power and temperature signature. The GPU will continuously and dynamically alter voltage and clock frequency to match the power and temperature targets versus the increased core clock. In FireStrike we are now hovering at the 2 GHz marker on the Boost frequency, for example, but some games jumped to roughly 2.1 GHz one second and dipped below 2 GHz the next.


For all overclocked games above we have used the very same image quality settings as shown before. Overall, the generic rule of thumb here is that a decent tweak and overclock gains anywhere from 5 to 10% performance. The end result depends on a lot of variables though, including power limiters, temperature limiters, fill-rate and so on; the performance increment can differ per card, brand, heck... even cooling solution and your chassis airflow.



Conclusion

Obviously Nvidia unleashed Fury (pun intended) with their 16 nm Pascal processors. They nailed it performance, TDP and temperature wise, but unfortunately priced the 1080 series very steeply. I spotted the card online this morning for € 750, but due to low volume availability retailers will easily inflate that to € 800 in most shops for the time being. This is why I've been waiting anxiously for board partner cards, which often are better cooled (less throttling), more silent, run faster and often look damn terrific. The MSI GeForce GTX 1080 GAMING X 8G, as far as I am concerned, ticks the right boxes.
You may expect that dynamic boost clock to hover at the 1.9 GHz marker with the default clocks, and yes, it is a notch faster compared to the Founders Edition. It does seem that the Pascal generation scales a little less compared to Maxwell in terms of relative tweaking/overclocking performance. That's mostly due to the high clocks really: if you go from 1800 towards 2000 MHz it's just ten percent extra performance. Despite that observation, it remains a truckload of performance, plus you get 8 GB of the fastest graphics memory your money can get you. AIB/AIC prices will be more competitive compared to the reference Founders Edition cards, meaning that for roughly the same amount of money you can purchase that muscular, beefy board partner product. The Pascal GP104 architecture is interesting, however aside from a few changes in the pipeline, it looks VERY similar to Maxwell. Make no mistake, there have been changes, but it shares a very similar structure. So the biggest benefit for Nvidia was the move to 16 nm, as it allows them to drive their products to incredible clock frequencies whilst using less voltage, and that results in power consumption under the 200 Watt marker. The next ticked box for the GeForce GTX 1080 is GDDR5X; effectively it brought them a data-rate of 10 Gbps, which brings in a very nice memory performance boost. Combined with the new color compression technologies they can effectively achieve 43% more bandwidth compared to the GeForce GTX 980. And I do bring up the GeForce GTX 980 with good reason, as the GTX 1080 is to replace that product. However, the 1080 is faster than the 980 Ti and faster than the Titan X. You really are looking at 20% to 40% performance increases depending on game title and resolution. So that is a huge step forward, and again... this is just the reference clocked product. The board partners are going to go crazy with this chip. I also need to applaud the step of moving towards 8 GB of graphics memory. Whilst the highest usage we have ever measured is in the 5 to 6 GB VRAM domain, that 8 GB of 256-bit GDDR5X memory is an excellent and well-balanced amount of graphics memory. Keep in mind:
  1. Enthusiast level gaming requires a lot of graphics memory.
  2. People simply like gigantic graphics memory buffers. 
So yes, the raw rendering performance versus that massive VRAM partition is going to attract a certain kind of people; the demographics would match a large chunk of the Guru3D.com reader-base. Honestly, I can talk and explain why you should or shouldn't purchase a card like this, but the fact remains that people who can afford it will purchase it, and people who can't or refuse to, won't. This is the Porsche that nobody can afford but really likes (the Ferrari still needs to be released with Big Pascal). But hey, with titles like The Division / GTA5 and technologies like Ultra HD and/or DSR versus performance and VRAM, who knows what you find valid or not. High up there in the enthusiast space there certainly is a market for cards like these. That makes these 8 GB models relevant for gaming. And it really is a partition of 8 GB and not 7.5 GB of graphics memory :P



Aesthetics

MSI tweaked the design a bit for the new Gaming X and upcoming Z series; the more stylish TwinFrozr VI cooler looks serious and now comes with RGB LED lighting control. Switch it on/off or to any color and animation you prefer, the choice is yours. A cool touch is that backplate, with openings at the proper areas (GPU/VRM) for venting. As you can see, I remain skeptical about backplates, they potentially can trap heat and thus warm up the PCB. But the flip-side is that they do look better and can protect your PCB and components from damage. Consumer demand is always decisive, and you guys clearly like graphics cards with backplates. Both the front IO plate and backplate are dark matte black, which certainly gives the card that premium feel. All that combined with a nicely designed 10-phase PCB, again in matte black, and the end result is a lovely looking product.

Cooling & Noise Levels

The reference design (Founders Edition) of the GTX 1080 is set at an offset threshold of 80 Degrees C. Once the GPU gets warmer, the card will clock down / lower its voltage etc. to try and keep the card cooler; that's throttling, and it's part of the design. MSI however throws in a cooler that manages roughly 500 to 600W of cooling performance. It is a really good one, so good that up to roughly 60 Degrees C on the GPU this card remains passive and thus inaudible. Once the fans kick in, you can expect to hover at the 70 Degrees C marker with seriously demanding games. Please do note that you will need proper ventilation inside your chassis to achieve that number. So MSI shaved off a good 10 Degrees C over reference. Noise wise, we can't complain about the cooling whatsoever. Expect sound pressure values in the 38~39 dBA range at most under load and warm circumstances. That's measured 75 cm away from the PC. This means you can barely hear the card while using it. Once overclocked with added voltage we always do recommend a little more fan RPM; this does increase noise a tiny bit, but it's nothing dramatic by any standard. Overall this is a very sound and solid cooling solution.

Power Consumption

Any GP104 Pascal GPU, and thus any GP104 based graphics card, is rated as having a 180 Watt TDP under full stress; our measurements back that up, albeit a notch higher due to the faster clocks and thus voltage usage. Anyhow, at this performance level you are looking at a card that consumes roughly 400~450 Watts for a stressed PC in total, and that is okay. We think a 500~600 Watt PSU would be sufficient, and if you go with 2-way SLI an 800 Watt power supply is recommended. It's definitely more than needed, but remember: when purchasing a PSU, aim to double up in Wattage, as your PSU is most efficient when it is at around 50% load. Here again, keep in mind we measure peak power consumption; the average power consumption is a good notch lower depending on GPU utilization. Also, if you plan to overclock the CPU/memory and/or GPU with added voltage, please do purchase a power supply with enough reserve. People often underestimate it, but if you tweak all three aforementioned variables, you can easily add 200 Watts to your peak power consumption budget, as increasing voltages and clocks increases your power consumption.



Overall gaming performance

Do you really need a card as beefy as the GeForce GTX 1080 really is, though? Well, that depends on a rather abstract external factor: your monitor(s), and specifically the resolution you play your games at. If you game at a resolution of 1920x1080 (Full HD), then no, not really. However, more is better, and with technologies like DSR (super-sampling) and Ultra HD the raw horsepower this card offers certainly isn't distasteful. Also, with surround gaming (three monitors) the GeForce GTX 1080 will just make a lot of sense, especially with the new simultaneous multi-projection feature built into the rendering pipeline; that probably is one of the most innovative features Nvidia has added that I have seen in a long time. From 1080p to Ultra HD the GeForce GTX 1080 hauls the proverbial toosh compared to whatever other single GPU based graphics card you can name in existence. Obviously it is the fastest kid on the block. This much performance and graphics memory helps you in Ultra HD, hefty complex anti-aliasing modes, DSR and of course the latest gaming titles. I consider this to be among the first viable single GPU solutions that allow you to game properly in Ultra HD with some very nice eye candy enabled. However, I was kind of hoping to be closer to 60 FPS on average with the GTX 1080 in Ultra HD. But that will probably take the future Big Pascal (Ti / Titan). As always, driver wise we can't complain at all, we did not stumble into any issues. And with a single GPU there's no micro-stuttering and no multi-GPU driver issues to fight off. Performance wise, there's really not one game that won't run seriously well at the very best image quality settings. Gaming you must do with a nice 30" monitor of course, at 2560x1440/1600 or Ultra HD. Now, we can discuss the advantages of an 8 GB framebuffer, but hey, you can draw your own conclusions there. At least you won't run out of graphics memory for years to come, right? So in that respect the card is rather future proof. SLI then, we have to mention this. Starting with Pascal, the primary focus for Nvidia in terms of multi-GPU setups is 2-way SLI, but really that's it and all. For those of you that want to run 3 and 4-way configurations, it's going to be difficult but remains possible, as the game needs to support it and you will need to obtain a driver key from the Nvidia website. Do not expect Nvidia to enhance drivers for it; they'll just open up the floodgate and have you deal with the rest. Personally I am okay with this choice. I have been complaining for years now that once you pass two cards, you will run into some sort of driver issue and thus irritation almost exponentially faster. So yes, some of you might be disappointed about this news. Me personally, I am fine with the choice to focus on proper 2-way SLI as opposed to all the arbitrary configurations that less than 0,01% of the end-users use.
One last remark on performance. You will have noticed that in some games this higher clocked product is a good 10% faster, where in others it's just a few percent. That's Nvidia's limiters at work for you. All cards under very hefty load will be limited within a much narrower bracket, whereas games that leave enough breathing room let the GPU advance and score better compared to some other games.

Overclocking

Due to the many limiters and hardware protections Nvidia has built in, any and all cards will hover roughly at the 2 GHz Boost marker. Now, that frequency will differ per game/application. In 3DMark FireStrike, for example, it may hover at 1950~2000 MHz, while in Rise of the Tomb Raider (2016) you will be close to 2.1 GHz. The reality is that Nvidia monitors and adapts to application specific loads, e.g. an application that absolutely hammers the GPU will have the effect of the GPU protecting itself by lowering clocks and voltages. The opposite applies here as well: if a game does not try and fry that GPU, it'll clock a bit faster within the tweaked thresholds at your disposal. Tweaking is fun, but definitely more complicated anno 2016. The memory can reach 11 Gbps effectively; I have seen some cards even reach 11.2 Gbps. Pascal GPUs do like their memory bandwidth though, so if you can find a high enough stable tweak, definitely go for it if you are seeking that last bit of extra performance.

Concluding

Despite being positioned in a high price bracket, it's hard not to like the MSI GeForce GTX 1080 GAMING X 8G. It does give you tremendous gaming performance, and combined with the extra TLC from MSI it has been improved to an even better level. The new Pascal architecture proves its agility, the die shrink to 16 nm FinFET brings low power consumption due to lower voltages, and obviously the high clock speeds and that GDDR5X memory complete the package that the GTX 1080 is. If you stick to the WQHD 2560x1440 domain, this is the card that will last you years to come, combined with that lovely 8 GB of graphics memory. For long-term Ultra HD usage (high FPS), however, the answer still needs to be found in two cards. But hey, if WQHD is your domain then the GeForce GTX 1080 is a rather future proof product with that proper and fast 8 GB of GDDR5X graphics memory. The MSI GeForce GTX 1080 GAMING X 8G is overclocked, yet relatively mildly. That tweak by itself will not be the decisive factor for the purchase, as the performance increase is not that relevant. However, the overall design, cooling, looks, RGB LED system and, sure, that x-factor do make the MSI GeForce GTX 1080 GAMING X 8G, combined with its raw, sheer and brutal game performance, a very enticing product. I still say that the 1070 cards are going to be the biggest hit of the Summer of 2016, but for those with a bigger wallet the MSI GeForce GTX 1080 GAMING X 8G certainly is a product that we can wholeheartedly recommend.