3DMark Wild Life Extreme uses a cross-platform graphics engine optimized for mobile devices and notebooks. The engine was developed in-house with input from members of the UL Benchmark Development Program.
The rendering work, including scene update, visibility evaluation, and command recording, is done with multiple CPU threads, one thread per available logical CPU core. This reduces the per-core CPU load by spreading the work across all available cores.
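As an illustration, the per-core split could look like the following minimal C++ sketch. The RenderItem type, the recordCommands step, and the chunking scheme are assumptions for the sketch, not the engine's actual code:

    #include <algorithm>
    #include <cstddef>
    #include <thread>
    #include <vector>

    struct RenderItem { /* mesh, material, transform, ... */ };

    // Hypothetical per-thread step; a real engine would record into
    // per-thread command buffers here, and split scene update and
    // visibility evaluation the same way.
    void recordCommands(const RenderItem* items, std::size_t count) {
        (void)items; (void)count;  // record draw calls for this slice of the scene
    }

    void recordFrame(const std::vector<RenderItem>& items) {
        const unsigned threads = std::max(1u, std::thread::hardware_concurrency());
        const std::size_t chunk = (items.size() + threads - 1) / threads;
        std::vector<std::thread> workers;
        for (unsigned t = 0; t < threads; ++t) {
            const std::size_t begin = t * chunk;
            if (begin >= items.size()) break;
            workers.emplace_back(recordCommands, items.data() + begin,
                                 std::min(chunk, items.size() - begin));
        }
        for (auto& w : workers) w.join();  // all per-thread work complete
    }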
Graphics features
Clustered Light Culling
The scene lights are culled and stored in a three-dimensional screen-space grid. The light culling is done on the CPU before the rendering passes.
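As a CPU-side illustration, a brute-force variant of the culling could look like the sketch below. The types, the sphere-versus-box test, and the list layout are assumptions for the sketch; production code would first narrow the candidate cluster range per light instead of testing every cluster:

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    struct Vec3 { float x, y, z; };
    struct PointLight { Vec3 posView; float radius; };  // position in view space
    struct Aabb { Vec3 mn, mx; };

    // Squared distance from a point to an AABB (zero when inside the box).
    float distSq(const Vec3& p, const Aabb& b) {
        auto axis = [](float v, float lo, float hi) {
            float d = (v < lo) ? lo - v : (v > hi ? v - hi : 0.0f);
            return d * d;
        };
        return axis(p.x, b.mn.x, b.mx.x) + axis(p.y, b.mn.y, b.mx.y) +
               axis(p.z, b.mn.z, b.mx.z);
    }

    // Test every light sphere against every cluster box and append matching
    // light indices to that cluster's list.
    std::vector<std::vector<std::uint16_t>> cullLights(
            const std::vector<PointLight>& lights,
            const std::vector<Aabb>& clusterBounds) {
        std::vector<std::vector<std::uint16_t>> lists(clusterBounds.size());
        for (std::size_t li = 0; li < lights.size(); ++li)
            for (std::size_t c = 0; c < clusterBounds.size(); ++c)
                if (distSq(lights[li].posView, clusterBounds[c]) <=
                    lights[li].radius * lights[li].radius)
                    lists[c].push_back(static_cast<std::uint16_t>(li));
        return lists;
    }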
Geometry Rendering
Opaque objects are rendered with a deferred rendering method in the graphics pipeline, using temporary G-buffer targets for the PBR material parameters. Shading is done in linear HDR color space using the clustered light information together with the temporary G-buffer data. In addition to the final lighting result, the deferred rendering pass outputs depth information for subsequent rendering effects.
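A sketch of the per-pixel deferred resolve under these assumptions; the G-buffer layout and the stand-in BRDF below are illustrative, as the engine's actual PBR model and packing are not published:

    #include <algorithm>
    #include <cmath>
    #include <cstdint>
    #include <vector>

    struct Vec3 { float x, y, z; };

    // Illustrative G-buffer layout; the actual packing is not documented.
    struct GBufferSample {
        Vec3  albedo, normal;
        float roughness, metalness;
    };

    struct PointLight { Vec3 posView, color; float radius; };

    // Stand-in for the engine's PBR model: Lambert with a smooth falloff.
    Vec3 shadeLight(const GBufferSample& g, const PointLight& l, Vec3 p) {
        Vec3 d{ l.posView.x - p.x, l.posView.y - p.y, l.posView.z - p.z };
        float dist  = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
        float ndotl = std::max(0.0f, (g.normal.x * d.x + g.normal.y * d.y +
                                      g.normal.z * d.z) / std::max(dist, 1e-5f));
        float atten = std::max(0.0f, 1.0f - dist / l.radius);
        float k = ndotl * atten * atten;
        return { g.albedo.x * l.color.x * k, g.albedo.y * l.color.y * k,
                 g.albedo.z * l.color.z * k };
    }

    // Deferred resolve for one pixel: only the lights stored in the pixel's
    // cluster are evaluated, accumulating in linear HDR color space.
    Vec3 shadePixel(const GBufferSample& g, Vec3 posView,
                    const std::vector<std::uint16_t>& clusterLights,
                    const std::vector<PointLight>& lights) {
        Vec3 hdr{ 0, 0, 0 };
        for (std::uint16_t li : clusterLights) {
            Vec3 c = shadeLight(g, lights[li], posView);
            hdr.x += c.x; hdr.y += c.y; hdr.z += c.z;
        }
        return hdr;
    }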
Transparent objects are rendered in an order-independent way using two passes. The first pass uses temporary targets to store the total transparency and an accumulated color contribution weighted by linear depth and transparency. The second pass calculates the final color from the accumulated color and the total transparency. The result of the transparent objects pass is blended on top of the final surface illumination.
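This description matches the weighted blended order-independent transparency technique of McGuire and Bavoil (2013). A minimal resolve sketch in that style, using one of the depth-based weight functions suggested in the paper; the engine's exact weight function and constants are not published:

    #include <algorithm>
    #include <cmath>

    struct Rgba { float r, g, b, a; };

    // One of the weight functions proposed by McGuire and Bavoil.
    float oitWeight(float linearDepth, float alpha) {
        return alpha * std::clamp(0.03f / (1e-5f + std::pow(linearDepth / 200.0f, 4.0f)),
                                  1e-2f, 3e3f);
    }

    // Pass 1 (GPU, additive blend): accum += (color * alpha * w, alpha * w),
    // while a separate revealage target accumulates the product of (1 - alpha).
    // Pass 2 resolves the accumulated values into the final color:
    Rgba resolveOit(const Rgba& accum, float revealage) {
        float invW = 1.0f / std::max(accum.a, 1e-5f);
        return { accum.r * invW, accum.g * invW, accum.b * invW, 1.0f - revealage };
    }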
Environment reflections are based on a single cube map. Geometry shaders and tessellation are not supported.
Particles
Particles are simulated on the GPU using compute shaders. The particles are self-illuminated and are rendered at the same time as the transparent geometry, using the same order-independent technique.
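A CPU reference of the kind of per-particle update the compute shader performs, with one GPU invocation corresponding to one loop iteration; the fields and the single force term are assumptions for the sketch:

    #include <vector>

    struct Vec3 { float x, y, z; };
    struct Particle { Vec3 pos, vel; float age, lifetime; };

    void simulateParticles(std::vector<Particle>& particles, Vec3 force, float dt) {
        for (auto& p : particles) {
            p.vel.x += force.x * dt; p.vel.y += force.y * dt; p.vel.z += force.z * dt;
            p.pos.x += p.vel.x * dt; p.pos.y += p.vel.y * dt; p.pos.z += p.vel.z * dt;
            p.age += dt;  // particles past their lifetime are respawned by the emitter
        }
    }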
Asynchronous compute
Asynchronous compute is used to overlap multiple rendering passes whenever possible for maximum utilization of the GPU.
Post-Processing
Heat Distortion
The heat distortion effect is generated using particles. For the particles that generate the effect, a distortion field is rendered to a texture using a 3D noise texture as input. This field is then used to distort the input image in the post-processing phase.
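A CPU reference of the final distortion step under these assumptions; nearest-neighbor sampling and the strength constant are simplifications, and the field itself would come from the particle and noise rendering described above:

    #include <algorithm>
    #include <vector>

    struct Vec2 { float x, y; };
    struct Vec3 { float x, y, z; };

    // Each pixel reads the distortion field and offsets its lookup into the
    // input image. dst must be sized w * h.
    void applyDistortion(const std::vector<Vec3>& src, std::vector<Vec3>& dst,
                         const std::vector<Vec2>& field, int w, int h,
                         float strength) {
        for (int y = 0; y < h; ++y)
            for (int x = 0; x < w; ++x) {
                const Vec2& d = field[y * w + x];
                int sx = std::clamp(x + static_cast<int>(d.x * strength), 0, w - 1);
                int sy = std::clamp(y + static_cast<int>(d.y * strength), 0, h - 1);
                dst[y * w + x] = src[sy * w + sx];
            }
    }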
TAA
Motion vectors are computed in a fragment shader during G-buffer rendering. Two history textures store data from previous frames, depth and illumination, and an exponential moving average with variance clipping is used to blend in the data of the current frame. The depth texture is linearized in a separate pass for the blending to work correctly. Motion vectors from the current frame are used as an offset for sampling the history textures in the resolve pass. The resolve is applied to the final illumination texture and the linearized depth as the first post-processing step, before bloom and depth of field.
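A sketch of the per-pixel blend, using the simpler per-channel clamp form of variance clipping; the gamma and blend constants below are typical choices, not the engine's published values:

    #include <algorithm>

    struct Vec3 { float x, y, z; };

    // Clamp the history sample toward the current neighborhood mean and
    // standard deviation, then blend with an exponential moving average.
    Vec3 taaResolve(Vec3 current, Vec3 history, Vec3 nMean, Vec3 nSigma,
                    float gamma = 1.0f, float alpha = 0.1f) {
        auto clip = [&](float h, float m, float s) {
            return std::clamp(h, m - gamma * s, m + gamma * s);
        };
        Vec3 c{ clip(history.x, nMean.x, nSigma.x),
                clip(history.y, nMean.y, nSigma.y),
                clip(history.z, nMean.z, nSigma.z) };
        // new history = lerp(clipped history, current, alpha)
        return { c.x + (current.x - c.x) * alpha,
                 c.y + (current.y - c.y) * alpha,
                 c.z + (current.z - c.z) * alpha };
    }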
Bloom
Bloom is based on a compute-shader FFT that evaluates several effects with a single filter kernel: blur, streaks, anamorphic flare, and lenticular halo. The bloom resolution in Wild Life Extreme is double the resolution used in the Wild Life test. The Wild Life Extreme implementation also uses shared memory.
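The key property here is the convolution theorem: filtering with one large combined kernel becomes a pointwise multiply in frequency space. A minimal 1D CPU illustration follows; the actual bloom pass does this in 2D in a compute shader, with the four effects summed into a single kernel:

    #include <cmath>
    #include <complex>
    #include <cstddef>
    #include <vector>

    using cplx = std::complex<float>;

    // Iterative radix-2 Cooley-Tukey FFT (size must be a power of two).
    void fft(std::vector<cplx>& a, bool inverse) {
        const std::size_t n = a.size();
        for (std::size_t i = 1, j = 0; i < n; ++i) {  // bit-reversal permutation
            std::size_t bit = n >> 1;
            for (; j & bit; bit >>= 1) j ^= bit;
            j ^= bit;
            if (i < j) std::swap(a[i], a[j]);
        }
        for (std::size_t len = 2; len <= n; len <<= 1) {
            float ang = 2.0f * 3.14159265f / len * (inverse ? 1.0f : -1.0f);
            cplx wl(std::cos(ang), std::sin(ang));
            for (std::size_t i = 0; i < n; i += len) {
                cplx w(1.0f, 0.0f);
                for (std::size_t k = 0; k < len / 2; ++k, w *= wl) {
                    cplx u = a[i + k], v = a[i + k + len / 2] * w;
                    a[i + k] = u + v;
                    a[i + k + len / 2] = u - v;
                }
            }
        }
        if (inverse)
            for (auto& x : a) x /= float(n);
    }

    // Circular convolution via the convolution theorem: transform both
    // signals, multiply pointwise, transform back. Sizes must match.
    std::vector<cplx> convolve(std::vector<cplx> img, std::vector<cplx> kernel) {
        fft(img, false);
        fft(kernel, false);
        for (std::size_t i = 0; i < img.size(); ++i) img[i] *= kernel[i];
        fft(img, true);
        return img;
    }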
Adaptive Screen Space Ambient Occlusion
Wild Life Extreme uses an ambient occlusion technique developed by Intel for low-power devices. The implementation uses the lowest quality settings.
Volume illumination
Volume illumination is computed by approximating the light scattered towards the viewer by the medium between the eye and the visible surface at each lit pixel. The approximation is based on volume ray casting with a simple scattering and attenuation model. One ray is cast at each lit pixel for each light, and the ray is sampled at several depth levels. The result is blurred before the volume illumination is combined with the surface illumination. Wild Life Extreme increases the sample count by 50% compared with Wild Life.
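A scalar sketch of the per-pixel, per-light ray march; the scattering model, the constants, and the distToLight callback are illustrative stand-ins, not the engine's formulation:

    #include <cmath>

    // March the eye ray up to the visible surface, accumulating light
    // scattered toward the viewer and attenuating it on the way to the eye.
    float volumeLight(float surfaceDist, float lightIntensity,
                      float scattering, float extinction, int samples,
                      float (*distToLight)(float t)) {
        const float step = surfaceDist / samples;
        float result = 0.0f;
        for (int i = 0; i < samples; ++i) {
            float t = (i + 0.5f) * step;                       // depth along the ray
            float d = distToLight(t);                          // distance to the light
            float incoming = lightIntensity / (d * d + 1.0f);  // light reaching t
            float toEye = std::exp(-extinction * t);           // attenuation to eye
            result += scattering * incoming * toEye * step;
        }
        return result;  // blurred, then combined with the surface illumination
    }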
Depth of Field
The depth of field effect is computed by filtering the rendered illumination at half resolution with three separable skewed box filters that form a hexagonal bokeh pattern when combined. The filtering is performed in two passes that exploit similarities between the three filters to avoid duplicate work.
The first pass renders to two render targets. The second pass renders to one target, combining the results of the three filters. Before filtering, a circle of confusion radius is evaluated for each pixel and the illumination is premultiplied with this radius.
After filtering, the illumination is reconstructed by dividing the result by the filtered radius. This makes the filter gather out-of-focus illumination while preventing in-focus illumination from bleeding into neighboring pixels.
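The premultiply-and-divide step can be illustrated in 1D, with a plain box filter standing in for the three skewed filters; the thin-lens style circle of confusion model and all constants are assumptions for the sketch:

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Illustrative circle of confusion: the radius grows with distance from
    // the focus plane, capped at a maximum.
    float cocRadius(float depth, float focusDist, float focusRange, float maxRadius) {
        return maxRadius * std::clamp(std::fabs(depth - focusDist) / focusRange,
                                      0.0f, 1.0f);
    }

    // Gather color premultiplied by the radius, then reconstruct by dividing
    // with the filtered radius. In-focus pixels (radius near zero) then
    // contribute almost nothing to their neighbors.
    std::vector<float> filterRow(const std::vector<float>& color,
                                 const std::vector<float>& radius, int halfWidth) {
        const int n = static_cast<int>(color.size());
        std::vector<float> out(n);
        for (int i = 0; i < n; ++i) {
            float sumC = 0.0f, sumW = 0.0f;
            for (int k = -halfWidth; k <= halfWidth; ++k) {
                int j = std::clamp(i + k, 0, n - 1);
                sumC += color[j] * radius[j];   // premultiplied gather
                sumW += radius[j];
            }
            out[i] = sumC / std::max(sumW, 1e-5f);  // divide to reconstruct
        }
        return out;
    }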