In early February NVIDIA released OptiX 6.0, the first version of OptiX with support for Turing GPUs and RT Cores. With the new SDK, developers can finally take advantage of hardware-accelerated ray tracing on RTX graphics cards. In this article we measure the performance difference between the RTX-enabled and RTX-disabled modes in a simple ray casting application. The Fermat library also gets re-tested with the trio of GTX 1080 Ti, Titan V and RTX 2080 Ti. OTOY has also updated its OctaneBench with experimental support for putting the RT Cores inside Turing to good use and speeding up rendering. Let's cast some rays!
OptiX 6.0 introduces API changes as well as a new "RTX execution strategy". With the new execution model OptiX can use the RT Cores inside the Turing RTX GPUs for BVH traversal and triangle intersection. Previous-generation Maxwell and Pascal GPUs might also benefit from the new "RTX execution strategy". To test the new features we run the optixMeshViewer application found in the OptiX SDK in RTX-disabled and RTX-enabled configurations at 8K resolution. The optixMeshViewer sample provides a command line argument (--no-triangle-api) to choose between the new "RTX execution strategy" and the old "mega-kernel" model. The app casts primary rays and uses a simple Phong material with a single shadow-casting light.
For Turing, running in "RTX off" mode means there is no hardware-accelerated ray tracing in the background; instead, CUDA cores handle ray-triangle intersection and BVH traversal, and the old mega-kernel approach is used. Enabling RT Cores on the flagship RTX 2080 Ti translates into a speedup of around 1.5x on average. The hairball scene is the heaviest workload, and there the difference is more than 2.5x. For the previous-gen cards, "RTX off" selects the older "mega-kernel execution strategy", just as it does on Turing. "RTX on" enables the new "RTX execution strategy" implemented in OptiX 6.0, which accelerates rendering on the older GPUs as well.
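To make the "ray-triangle intersection" work mentioned above more concrete, here is a minimal CPU-side sketch of the widely used Möller-Trumbore test in plain Python. It is only an illustration of the kind of computation involved; it is not OptiX code, and we are not claiming this is the exact algorithm the RT Cores or the mega-kernel path implement. The function name and the `EPSILON` tolerance are our own choices.

```python
# Illustrative sketch only: a textbook Moller-Trumbore ray-triangle test.
# This is NOT OptiX code; names and the EPSILON value are assumptions.

EPSILON = 1e-9


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle_intersect(origin, direction, v0, v1, v2):
    """Return the distance t along the ray to the hit point, or None on a miss."""
    edge1 = sub(v1, v0)
    edge2 = sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if abs(det) < EPSILON:
        return None  # ray is parallel to the triangle plane
    inv_det = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, pvec) * inv_det  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, qvec) * inv_det
    return t if t > EPSILON else None  # ignore hits behind the ray origin


# A ray pointing straight down at a triangle lying in the z = 0 plane.
triangle = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
hit = ray_triangle_intersect((0.25, 0.25, 1.0), (0.0, 0.0, -1.0), *triangle)
miss = ray_triangle_intersect((2.0, 2.0, 1.0), (0.0, 0.0, -1.0), *triangle)
```

A renderer runs a test like this a very large number of times per frame, which is why moving it from general-purpose CUDA cores to dedicated hardware can pay off in triangle-heavy scenes.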
Fermat 2.0, a physically based research renderer, also received a huge update to support OptiX 6.0 and CUDA SDK 10. We extended our benchmark set with two Cornell box inspired test scenes. Fermat 2.0 introduced support for more complex materials and better samplers, so these results can't be compared with our previous Fermat 1.0 results.
Ray tracing workloads depend on scene complexity, triangle density and material setup. As RT Cores accelerate ray-triangle intersection and BVH traversal, the RTX 2080 Ti hammers the TITAN V in the triangle-heavy, complex scenes.
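For readers unfamiliar with BVH traversal, the following is a rough Python sketch of the idea: a ray is tested against nested bounding boxes (using the standard "slab" test), and only the triangles in the boxes it actually hits become candidates for the more expensive triangle test. The `BVHNode` layout, the function names and the tiny two-leaf tree are hypothetical and chosen purely for illustration; real BVHs built by OptiX are far more elaborate and their internal layout is not described here.

```python
# Illustrative sketch only: slab-based ray/box test plus a stack-based BVH walk.
# The node layout and names are hypothetical, not OptiX data structures.
import math
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class BVHNode:
    box_min: Vec3
    box_max: Vec3
    children: List["BVHNode"] = field(default_factory=list)  # empty for leaves
    triangle_ids: List[int] = field(default_factory=list)    # used by leaves


def ray_hits_box(origin: Vec3, direction: Vec3, box_min: Vec3, box_max: Vec3) -> bool:
    """Slab test: intersect the ray's parameter interval with each axis slab."""
    t_near, t_far = 0.0, math.inf
    for axis in range(3):
        if direction[axis] == 0.0:
            # Ray is parallel to this slab: it must already start inside it.
            if origin[axis] < box_min[axis] or origin[axis] > box_max[axis]:
                return False
            continue
        t0 = (box_min[axis] - origin[axis]) / direction[axis]
        t1 = (box_max[axis] - origin[axis]) / direction[axis]
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return False
    return True


def traverse(root: BVHNode, origin: Vec3, direction: Vec3) -> List[int]:
    """Return ids of triangles whose leaf boxes the ray passes through."""
    candidates: List[int] = []
    stack = [root]
    while stack:
        node = stack.pop()
        if not ray_hits_box(origin, direction, node.box_min, node.box_max):
            continue  # the whole subtree is skipped
        if node.children:
            stack.extend(node.children)
        else:
            candidates.extend(node.triangle_ids)
    return candidates


# Tiny two-leaf tree: triangles 0-1 on the left, 2-3 on the right.
left = BVHNode((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), triangle_ids=[0, 1])
right = BVHNode((2.0, 0.0, 0.0), (3.0, 1.0, 1.0), triangle_ids=[2, 3])
root = BVHNode((0.0, 0.0, 0.0), (3.0, 1.0, 1.0), children=[left, right])

candidates = traverse(root, (0.5, 0.5, 5.0), (0.0, 0.0, -1.0))
```

In this toy example the ray only passes through the left box, so the triangles in the right leaf are never tested. The denser the scene, the more work this pruning saves, and the more box tests there are to accelerate.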
As a last-minute addition, RTX OctaneBench 2019 Preview also found its way into this "RTX on" GPU rendering benchmark session. OTOY picked a real-world scene that is geometrically dense and composed of hundreds of thousands of scattered instanced meshes, because the NVIDIA RTX platform provides larger speed boosts in such scenarios.
OptiX 6.0 and the new RTX GPUs offer a true generational leap when it comes to ray tracing applications and GPU rendering. When comparing RTX to GTX, the performance is just on another level. Be prepared for more RTX tests, as we hope to benchmark other render engines once they get updated with RTX magic!
PSU: Cooler Master 1000W VANGUARD
MOTHERBOARD: ASRock X470 Taichi
CPU: AMD Ryzen 7 2700X
GPU: NVIDIA GTX 1080 Ti FE
GPU: NVIDIA RTX 2080 Ti FE
GPU: NVIDIA TITAN V
OS: Microsoft Windows 10 (10.0) Home 64-bit - Version 1809/RS5 (17763.316)
DRIVER: NV 418.91 for OptiX 6.0 (Fermat & optixMeshViewer) - NV 417.71 for OctaneBench (419.17 should work)
RAM: G.Skill FlareX 16GB (2X8GB) DDR4 3200MHz
STORAGE: Samsung 960 EVO NVMe M.2 500GB
COOLER: Wraith Prism with RGB LED
Update 01: Fermat 1.0 vs Fermat 2.0 score difference explanation