GPU Path Tracing in Unity – Part 2

“There is nothing worse than sharp image of a fuzzy concept.” – Ansel Adams

In the first part of this series, we created a Whitted ray tracer capable of tracing perfect reflections and hard shadows. What’s missing are the fuzzy effects: diffuse interreflection, glossy reflections and soft shadows.

Building upon the code we already have, we will iteratively solve the rendering equation formulated my James Kajiya in 1986 and transform our renderer into a path tracer able to capture the mentioned effects. Again, we will be using C# for scripts and HLSL for shaders. Code is hosted on Bitbucket.

This article is substantially more mathematical than the previous one, but don’t be afraid. I will try to explain every formula as best as I can. The formulas are there to help you see what’s going on and why our renderer works, so I advise you to try to understand them and ask away in the comment section should anything be unclear.

The following image is rendered using HDRI Haven’s Graffiti Shelter. The other images in this article are rendered using Kiara 9 Dusk.

The Rendering Equation

Formally, the task of a photorealistic renderer is solving the rendering equation, which can be written as follows:

$$L(x, \, \vec \omega_{o}) = L_e(x, \, \vec \omega_{o}) + \int_{\Omega}{f_r(x, \, \vec \omega_{i}, \, \vec \omega_{o}) \, (\vec \omega_{i} \cdot \vec n) \, L(x, \, \vec \omega_{i}) \, d\vec \omega_{i}}$$

Let’s break it down. Ultimately we want to determine the brightness of a pixel on the screen. The rendering equation gives us the amount of light \(L(x, \, \vec \omega_{o})\) going from point \(x\) (a ray’s hit point) in direction \(\vec \omega_{o}\) (the direction the ray is coming from). The surface might be a light source itself, emitting light \(L_e(x, \, \vec \omega_{o})\) in our direction. Most surfaces don’t, so they only reflect light coming from outside. This is what the integral is for. Intuitively, it accumulates the light coming from every possible direction in the hemisphere \(\Omega\) around the normal (so for now we’re only considering light reaching the surface from above, not from below which would be required for translucent materials).

The first part \(f_r\) is called the bidirectional reflectance distribution function (BRDF). This function visually describes the kind of material we are dealing with – metal or dielectric, bright or dark, glossy or dull. The BRDF defines which proportion of light coming from \(\vec \omega_{i}\) is reflected in direction \(\vec \omega_{o}\). In practice, this is handled using a 3-component vector for the amount of red, green and blue light, each in range \([0,1]\).

The second part \((\vec \omega_{i} \cdot \vec n)\) is equivalent1 to \(cos \theta\) where \(\theta\) is the angle between incoming light and surface normal \(\vec n\). Imagine a beam of parallel light hitting a surface head-on. Now imagine the same beam hitting the surface at a flat angle. The light will spread over a larger area, but that also means that each point in this area appears darker than before. The cosine accounts for this.

Finally, the actual light coming from \(\vec \omega_{i}\) is determined recursively using the same equation. So the light at point \(x\) depends on incoming light from all possible directions in the upper hemisphere. In each of those directions from point \(x\) lies another point \(x\prime\), for which the brightness again depends on the incoming light from all possible directions in that point’s upper hemisphere. Rinse and repeat.

So here it is. An infinitely recursive integral equation over infinitely many hemispherical integration domains. We won’t be able to solve this equation directly, but there is a fairly simple solution.


1Remember this! We will be talking about the cosine quite a lot, and when we do, we mean the dot product. Since \(\vec{a}\cdot\vec{b}=\|\vec{a}\|\ \|\vec{b}\|\cos(\theta)\) and we are dealing with directions (unit-length vectors), the dot product is the cosine for most purposes in Computer Graphics.

Monte Carlo to the rescue

Monte Carlo integration is a numerical integration technique that allows us to estimate any integral using a finite number of random samples. Moreover, Monte Carlo guarantees convergence towards the correct solution – the more samples you take, the better. Here is the general form:

$$F_N \approx \frac{1}{N} \sum_{n=0}^{N}{\frac{f(x_n)}{p(x_n)}}$$

The integral of a function \(f(x_n)\) can thus be estimated by averaging random samples over the integration domain. Each sample is divided by the probability \(p(x_n)\) of it being chosen. This way, a sample that is frequently selected will be weighted less than a sample that is chosen only rarely.

With uniform sampling over the hemisphere (each direction has the same probability of being selected), the probability of samples is constant: \(p(\omega) = \frac{1}{2 \pi}\) (because \(2 \pi\) is the surface area of the unit hemisphere). If you bring it all together, that’s what you get:

$$L(x, \, \vec \omega_{o}) \approx L_e(x, \, \vec \omega_{o}) + \frac{1}{N} \sum_{n=0}^{N}{\color{Green}{2 \pi \, f_r(x, \, \vec \omega_{i}, \, \vec \omega_{o}) \, (\vec \omega_{i} \cdot \vec n)} \, L(x, \, \vec \omega_{i})}$$

The emission \(L_e(x, \, \vec \omega_{o})\) is simply the return value of our Shade function. The \(\frac{1}{N}\) is what happens already in our AddShader. The multiplication with \(L(x, \, \vec \omega_{i})\) happens when we reflect the ray and trace it further. Our mission is to fill the green part of the equation with some life.

Prerequisites

Before we can embark on our adventures, let’s take care of some things: sample accumulation, deterministic scenes and shader randomness.

Accumulation

For some reason or another, Unity wouldn’t give me an HDR texture as destination in OnRenderImage. The format for me was R8G8B8A8_Typeless, so the precision would quickly be too low for adding up more than a few samples. To overcome this, let’s add private RenderTexture _converged to our C# script. This will be our buffer to accumulate the results with high precision, before displaying it on screen. Initialize / release this texture exactly the same as _target in the InitRenderTexture function. In the Render function, double the blit:

Graphics.Blit(_target, _converged, _addMaterial);
Graphics.Blit(_converged, destination);

Deterministic Scenes

When you make a change to your rendering, it helps to compare it to previous results to judge the effect. Currently, we will be presented with a new random scene every time we restart play mode or recompile scripts. To overcome this, add a public int SphereSeed to your C# script and add the following line at the beginning of SetUpScene:

Random.InitState(SphereSeed);

You can now manually set the seed of the scene. Enter any number and disable / reenable the RayTracingMaster component until you have found a scene that you like.

The settings used for the example images are: Sphere Seed 1223832719, Sphere Radius [5, 30], Spheres Max 10000, Sphere Placement Radius 100.

Shader Randomness

Before we can start any stochastic sampling, we need randomness in our shader.  I’m using the canonical one-liner I found on the web somewhere, modified for more convenience:

float2 _Pixel;
float _Seed;

float rand()
{
    float result = frac(sin(_Seed / 100.0f * dot(_Pixel, float2(12.9898f, 78.233f))) * 43758.5453f);
    _Seed += 1.0f;
    return result;
}

Initialize _Pixel directly in CSMain as _Pixel = id.xy, so each pixel will use different random values. _Seed is initialized from C# in the SetShaderParameters function.

RayTracingShader.SetFloat("_Seed", Random.value);

The quality of the random numbers we generate here is uncertain. It would be worth investigating and testing this function, analyzing the effect of the parameters and comparing it to other approaches. For the time being, we’ll just use it and hope for the best.

Hemisphere Sampling

First things first: We’ll need random directions uniformly distributed on the hemisphere. For the full sphere, this non-trivial challenge is described in detail in this article by Cory Simon. It is easily adapted to the hemisphere. Here’s the shader code:

float3 SampleHemisphere(float3 normal)
{
    // Uniformly sample hemisphere direction
    float cosTheta = rand();
    float sinTheta = sqrt(max(0.0f, 1.0f - cosTheta * cosTheta));
    float phi = 2 * PI * rand();
    float3 tangentSpaceDir = float3(cos(phi) * sinTheta, sin(phi) * sinTheta, cosTheta);

    // Transform direction to world space
    return mul(tangentSpaceDir, GetTangentSpace(normal));
}

The directions are generated for a hemisphere centered around positive Z, so we need to transform it to be centered around whatever normal we need. To this end, we generate a tangent and binormal (two vectors orthogonal to the normal and orthogonal to each other). We first choose a helper vector to generate the tangent. We take positive X for this, and only fall back to positive Z if the normal is (nearly) coaligned with the X axis. Then we can use the cross product to generate a tangent and subsequently a binormal.

float3x3 GetTangentSpace(float3 normal)
{
    // Choose a helper vector for the cross product
    float3 helper = float3(1, 0, 0);
    if (abs(normal.x) > 0.99f)
        helper = float3(0, 0, 1);

    // Generate vectors
    float3 tangent = normalize(cross(normal, helper));
    float3 binormal = normalize(cross(normal, tangent));
    return float3x3(tangent, binormal, normal);
}

Lambert Diffuse

Now that we have our uniform random directions, we can start implementing the first BRDF. The Lambert BRDF is the most commonly used one for diffuse reflection, and it’s strikingly simple: \(f_r(x, \, \vec \omega_{i}, \, \vec \omega_{o}) = \frac{k_d}{\pi}\), where \(k_d\) is the albedo of the surface. Let’s insert it into our Monte Carlo rendering equation (I’ll drop the emission term for now) and see what happens:

$$L(x, \, \vec \omega_{o}) \approx \frac{1}{N} \sum_{n=0}^{N}{\color{BlueViolet}{2 k_d} \, (\vec \omega_{i} \cdot \vec n) \, L(x, \, \vec \omega_{i})}$$

Let’s put this in our shader rightaway. In the Shade function, replace the code inside the if (hit.distance < 1.#INF) clause with the following lines:

// Diffuse shading
ray.origin = hit.position + hit.normal * 0.001f;
ray.direction = SampleHemisphere(hit.normal);
ray.energy *= 2 * hit.albedo * sdot(hit.normal, ray.direction);
return 0.0f;

The new direction of the reflected ray is determined using our uniform hemisphere sampling function. The ray’s energy is multiplied with the relevant part of the equation above. Since the surface does not emit any light (but only reflects the light it receives directly or indirectly from the sky), we return 0 here. Remember that our AddShader averages the samples for us, so we don’t need to care about \(\frac{1}{N} \sum\). The CSMain function already contains the multiplication with \(L(x, \, \vec \omega_{i})\) (the next reflected ray), so there’s not much for us to do.

The sdot function is a simple utility that I have defined for myself. It simply returns the result of the dot product, with an optional factor and then clamped to \([0,1]\):

float sdot(float3 x, float3 y, float f = 1.0f)
{
    return saturate(dot(x, y) * f);
}

Let’s recap what the code does so far. CSMain generates our primary camera rays and calls Shade. If a surface is hit, this function will in turn generate a new ray (uniform random in the hemisphere around the normal) and factor the material’s BRDF and the cosine into the ray’s energy. If the sky is hit, we’ll sample the HDRI – our only light source – and return the light, which is multiplied with the ray’s energy (i.e. the product of all prior hits starting from the camera). This is a single sample that is blended with the converged result. In the end, each sample has a contribution of \(\frac{1}{N}\).

Time to try it out. Since metals don’t have any diffuse reflection, let’s disable them for now in our C# script’s SetUpScene function (still calling Random.value here to preserve scene determinism):

bool metal = Random.value < 0.0f;

Enter play mode and see how the initially noisy image clears up and converges to a nice rendering like this:

Phong Specular

Not so bad for a few lines of code (and some careful math – I see you’re slowly becoming friends). Let’s spice things up with specular reflections by adding a Phong BRDF. The original Phong formulation had its share of problems (not reciprocal, not energy-conserving), but thankfully, other people took care of that. The modified Phong BRDF looks like this, where \(\vec \omega_{r}\) is the perfectly reflected light direction and \(\alpha\) is the Phong exponent controlling the roughness:

$$f_r(x, \, \vec \omega_{i}, \, \vec \omega_{o}) = k_s \, \frac{\alpha + 2}{2 \pi} \, (\vec \omega_{r} \cdot \vec \omega_{o})^\alpha$$
Here is a little 2D graph displaying what the Phong BRDF for \(\alpha=15\) looks like for an incident ray at 45° angle. Click in the bottom right corner to change the \(\alpha\) value yourself.

Plug it into our Monte Carlo rendering equation:

$$L(x, \, \vec \omega_{o}) \approx \frac{1}{N} \sum_{n=0}^{N}{\color{brown}{k_s \, (\alpha + 2) \, (\vec \omega_{r} \cdot \vec \omega_{o})^\alpha} \, (\vec \omega_{i} \cdot \vec n) \, L(x, \, \vec \omega_{i})}$$

And finally add this to the Lambert BRDF we already have:

$$L(x, \, \vec \omega_{o}) \approx \frac{1}{N} \sum_{n=0}^{N}{[\color{BlueViolet}{2 k_d} + \color{brown}{k_s \, (\alpha + 2) \, (\vec \omega_{r} \cdot \vec \omega_{o})^\alpha}] \, (\vec \omega_{i} \cdot \vec n) \, L(x, \, \vec \omega_{i})}$$

And here it is in code together with the Lambert diffuse:

// Phong shading
ray.origin = hit.position + hit.normal * 0.001f;
float3 reflected = reflect(ray.direction, hit.normal);
ray.direction = SampleHemisphere(hit.normal);
float3 diffuse = 2 * min(1.0f - hit.specular, hit.albedo);
float alpha = 15.0f;
float3 specular = hit.specular * (alpha + 2) * pow(sdot(ray.direction, reflected), alpha);
ray.energy *= (diffuse + specular) * sdot(hit.normal, ray.direction);
return 0.0f;

Note that we substituted the dot product with a slightly different, but equivalent one (reflected \(\omega_o\) instead of \(\omega_i\)). Now reenable metallic materials in the SetUpScene function and give it a shot.

Playing around with different \(\alpha\) values, you will notice a problem: Lower exponents already take a long time to converge, and for higher exponents the noise is particularly stubborn. Even after several minutes of waiting, the result is far from pretty, which is unacceptable for such a simple scene. \(\alpha = 15\) and \(\alpha = 300\) with 8192 samples look like this:

“Why is that? We had such nice perfect reflections (\(\alpha = \infty\)) before!”, you might ask. The problem is that we are generating uniform samples, and weighting them according to the BRDF. For high Phong exponents, the value of the BRDF is tiny for all but those directions that are very close to the perfect reflection, and it is very unlikely that we will randomly select them using our uniform samples. On the other hand, if we actually hit one of those directions, the BRDF is huge to compensate for all the other tiny samples. A very high variance is the result. Paths with multiple specular reflections are even worse, resulting in the noise you see in the images above.

Better Sampling

To make our path tracer practical, we need a change of paradigm. Instead of wasting precious samples on areas where they won’t matter in the end (because they will get a very low BRDF and / or cosine factor), let’s generate samples that matter.

As a first step, we’ll get our perfect reflections back, and then see how we can generalize this idea. To do this, we will split our shading logic up in diffuse and specular. For each sample, we will randomly choose one or the other (depending on the ratio of \(k_d\) and \(k_s\)). For diffuse, we’ll stick to the uniform sampling, but for specular, we will explicitly reflect the ray in the single direction that matters. Since less samples will now be spent on each reflection type, we need to increase the contribution of the samples accordingly to end up with the same net amount, like so:

// Calculate chances of diffuse and specular reflection
hit.albedo = min(1.0f - hit.specular, hit.albedo);
float specChance = energy(hit.specular);
float diffChance = energy(hit.albedo);
float sum = specChance + diffChance;
specChance /= sum;
diffChance /= sum;

// Roulette-select the ray's path
float roulette = rand();
if (roulette < specChance)
{
    // Specular reflection
    ray.origin = hit.position + hit.normal * 0.001f;
    ray.direction = reflect(ray.direction, hit.normal);
    ray.energy *= (1.0f / specChance) * hit.specular * sdot(hit.normal, ray.direction);
}
else
{
    // Diffuse reflection
    ray.origin = hit.position + hit.normal * 0.001f;
    ray.direction = SampleHemisphere(hit.normal);
    ray.energy *= (1.0f / diffChance) * 2 * hit.albedo * sdot(hit.normal, ray.direction);
}

return 0.0f;

The energy function is a little helper that averages the color channels:

float energy(float3 color)
{
    return dot(color, 1.0f / 3.0f);
}

Here it is, the prettified variant of the Whitted ray tracer we built last time, but now with real diffuse shading (read ‘soft shadows, ambient occlusion, diffuse global illumination’):

Importance Sampling

Let’s look at the basic Monte Carlo formula again:

$$F_N \approx \frac{1}{N} \sum_{n=0}^{N}{\frac{f(x_n)}{p(x_n)}}$$

As you can see, we divide each sample’s contribution by the probability that this particular sample is chosen. So far, we used uniform sampling over the hemisphere and therefore had a constant \(p(\omega)=\frac{1}{2 \pi}\). As we saw earlier, this is far from optimal e.g. in case of the Phong BRDF which is large in a very narrow set of directions.

Imagine we could find a probability distribution that exactly matches the integrand: \(p(x) = f(x)\). This is what would happen:

$$F_N \approx \frac{1}{N} \sum_{n=0}^{N}{1}$$

Now there are no samples that get very little contribution. Instead, those samples will inherently be selected with a lower probability. This will drastically reduce the variance of the result and make rendering converge faster.

In practice, it is unrealistic to find such a perfect distribution, because some factors of the integrand (in our case BRDF × cosine × incoming light) are not known (most prominently the incoming light), but already distributing samples according to BRDF × cosine or even only BRDF will do us a lot of good. This is known as Importance Sampling.

Cosine Sampling

For the following steps, we need to replace our uniform sample distribution by a cosine (power) distribution. Remember, instead of multiplying uniform samples with a cosine, lowering their contribution, we want to generate proportionally fewer samples.

This article by Thomas Poulet describes how this is done. We’ll add an alpha parameter to our SampleHemisphere function that determines the power of the cosine sampling: 0 for uniform, 1 for cosine, or above for higher Phong exponents. In code:

float3 SampleHemisphere(float3 normal, float alpha)
{
    // Sample the hemisphere, where alpha determines the kind of the sampling
    float cosTheta = pow(rand(), 1.0f / (alpha + 1.0f));
    float sinTheta = sqrt(1.0f - cosTheta * cosTheta);
    float phi = 2 * PI * rand();
    float3 tangentSpaceDir = float3(cos(phi) * sinTheta, sin(phi) * sinTheta, cosTheta);

    // Transform direction to world space
    return mul(tangentSpaceDir, GetTangentSpace(normal));
}

Now the probability of each sample is \(p(\omega) = \frac{\alpha + 1}{2 \pi} \, (\vec \omega \cdot \vec n)^\alpha\). The beauty of this might not be immediately obvious, but it will unfold in a minute.

Importance-Sampling Lambert

First, we’ll improve our diffuse rendering. Our uniform distribution already fits the constant Lambert BRDF quite well, but we can do better by including the cosine factor. The probability distribution of the cosine sampling (where \(\alpha = 1\)) is \(\frac{(\vec \omega_{i} \cdot \vec n)}{\pi}\), which simplifies our diffuse Monte Carlo formula to:

$$L(x, \, \vec \omega_{o}) \approx \frac{1}{N} \sum_{n=0}^{N}{\color{BlueViolet}{k_d} \, L(x, \, \vec \omega_{i})}$$

// Diffuse reflection
ray.origin = hit.position + hit.normal * 0.001f;
ray.direction = SampleHemisphere(hit.normal, 1.0f);
ray.energy *= (1.0f / diffChance) * hit.albedo;

This will give our diffuse shading a slight speedup. Let’s take care of the real culprit now.

Importance-Sampling Phong

For the Phong BRDF, the procedure is similar. This time, we have a product of two cosines: the regular cosine from the rendering equation (like in the diffuse case) times the BRDF’s own powered cosine. We will only take care of the latter.

Let’s insert the probability distribution from above into our Phong equation. A detailed derivation can be found in Lafortune and Willems: Using the Modified Phong Reflectance Model for Physically Based Rendering (1994):

$$L(x, \, \vec \omega_{o}) \approx \frac{1}{N} \sum_{n=0}^{N}{\color{brown}{k_s \, \frac{\alpha + 2}{\alpha + 1}} \, (\vec \omega_{i} \cdot \vec n) \, L(x, \, \vec \omega_{i})}$$

// Specular reflection
float alpha = 15.0f;
ray.origin = hit.position + hit.normal * 0.001f;
ray.direction = SampleHemisphere(reflect(ray.direction, hit.normal), alpha);
float f = (alpha + 2) / (alpha + 1);
ray.energy *= (1.0f / specChance) * hit.specular * sdot(hit.normal, ray.direction, f);

These changes are enough to fix any problems with high Phong exponents and make our rendering converge in a way more reasonable time.

Materials

Finally, let’s extend our scene generation so we get varying values for smoothness and emission of our spheres! In C#, add a public float smoothness and public Vector3 emission to struct Sphere. Since we changed the size of the struct, we need to adjust the stride when creating the Compute Buffer (4 × number of floats, remember?). Make the SetUpScene function put in some values for smoothness and emission.

Back in the shader, add both variables to struct Sphere and struct RayHit and initialize them in CreateRayHit. Last but not least, set both values in IntersectGroundPlane (hard-coded, put in anything you want) and IntersectSphere (taking the values from the Sphere).

I’d like to use smoothness values the way I’m used to from the Unity Standard shader, which is different than the kinda arbitrary Phong exponent. Here’s a conversion that works reasonably well, to be used in the Shade function:

float SmoothnessToPhongAlpha(float s)
{
    return pow(1000.0f, s * s);
}
float alpha = SmoothnessToPhongAlpha(hit.smoothness);

Using emission is simply done by returning the value in Shade:

return hit.emission;

Results

Take a deep breath, relax, and wait for your image to clear into a soothing sight like this:

Congratulations! You have made it through this math-ridden forest. You implemented a path tracer capable of diffuse and specular shading, and you learned about Importance Sampling, applying the concept right away to make rendering converge in a matter of minutes rather than hours or days.

This article has been quite a leap from the last one in terms of complexity, but also in terms of quality of the results. Working your way through the math takes time, but is worth it, since it will drastically deepen your understanding of what’s going on and will allow you to extend the algorithm without breaking physical plausibility.

Thank you for following along! In the third part, we will leave the thicket of sampling and shading behind us (for now…), and go back to civilization for a rendezvous with Gentlemen Möller and Trumbore. They have a word or two to say about triangles.

Game Developer for Three Eyed Games, programmer for Volkswagen’s Virtual Engineering Lab, computer graphics researcher and Heavy Metal musician.