Inspiration

I played Alan Wake recently and thought it would be fun to implement a flashlight utilising the indirect lighting from reflective shadow maps.

The key elements I wanted were:

  • Indirect lighting
  • A flashlight texture (a cookie)
  • Particles lit by the flashlight
  • Some kind of fog effect

Implementation

First I had to adapt the reflective shadow map to work with spotlights. The only difference between a directional light and a spotlight is the flux: for a spotlight you have to take its attenuation into account.
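To make that difference concrete, here is a minimal C++ sketch of a spotlight flux term. The falloff model (a squared linear distance falloff multiplied by a linear cone falloff) and every name in it are assumptions for illustration, not necessarily the formula used in this project.

```cpp
#include <algorithm>
#include <cassert>

struct Vec3 { float x, y, z; };

// Clamps a value to the [0, 1] range, like HLSL's saturate().
inline float Saturate(float v) { return std::max(0.0f, std::min(1.0f, v)); }

// Hypothetical spotlight attenuation: 1 at the light, fading to 0 at `range`,
// and fading to 0 between the inner and outer cone (given as cosines).
float SpotAttenuation(float distance, float range,
                      float cosAngle, float cosInner, float cosOuter)
{
    assert(range > 0.0f && cosInner > cosOuter);
    const float distFalloff = Saturate(1.0f - distance / range);
    const float coneFalloff = Saturate((cosAngle - cosOuter) / (cosInner - cosOuter));
    return distFalloff * distFalloff * coneFalloff;
}

// Flux written to the reflective shadow map texel. For a directional light
// the attenuation would simply be 1; for a spotlight it scales the result.
Vec3 SpotlightFlux(const Vec3& albedo, const Vec3& lightColour, float attenuation)
{
    return { albedo.x * lightColour.x * attenuation,
             albedo.y * lightColour.y * attenuation,
             albedo.z * lightColour.z * attenuation };
}
```

Passing an attenuation of 1 reproduces the directional case, so one flux function can serve both light types.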

A lot of the magic comes from a function that checks whether a world position is inside a cone with a certain angle.

I use this to decide the alpha of particles and whether or not to calculate volumetric light:

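// Returns true if aWorldPosition lies within aAngle (in radians, measured
// from the cone axis) of the spotlight and closer than lightRange.
bool IsInSpotlightCone(float3 aWorldPosition, float aAngle)
{
    // Vector from the position back to the light.
    float3 toLight = lightPosition - aWorldPosition;
    float distToLight = length(toLight);
    float3 lightDir = normalize(toLight);

    // Cosine of the angle to the cone axis. Because toLight points towards
    // the light, lightDirection must be normalised and use the same convention.
    float cosAngle = dot(lightDir, lightDirection);

    return (cosAngle > cos(aAngle)) && (distToLight < lightRange);
}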

Because the particles and the volumetric light share this test, changing the cone angle of the volumetric light automatically hides or reveals particles.
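The same cone test can be mirrored on the CPU, which is handy for checking the maths outside a shader. This is a minimal C++ sketch under stated assumptions: the Spotlight struct and its fields are hypothetical, and forward is taken to be the unit direction the light shines in, so the vector is measured from the light to the position.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 Sub(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
inline float Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline float Length(const Vec3& v) { return std::sqrt(Dot(v, v)); }

// Hypothetical CPU-side description of the flashlight.
struct Spotlight
{
    Vec3 position;
    Vec3 forward;  // unit vector, the direction the light shines in (assumed)
    float range;
};

// True if worldPosition is inside the cone with the given half angle
// (radians) and closer to the light than its range.
bool IsInSpotlightCone(const Spotlight& light, const Vec3& worldPosition, float halfAngle)
{
    assert(light.range > 0.0f);
    const Vec3 fromLight = Sub(worldPosition, light.position);
    const float dist = Length(fromLight);
    if (dist <= 0.0f) return true;          // exactly at the apex
    if (dist >= light.range) return false;  // beyond the light's reach

    // Cosine of the angle between the cone axis and the direction to the point.
    const float cosAngle = Dot(fromLight, light.forward) / dist;
    return cosAngle > std::cos(halfAngle);
}
```

Comparing cosines avoids calling acos per particle: a larger cosine means a smaller angle, so a point is inside when its cosine exceeds the cosine of the half angle.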


GPU Particles

The very first thing I did to add particles was to write a Sprite class that renders a textured square which always faces the camera. At first I filled the whole scene with particles, updated their positions on the CPU, and rendered them one by one. Unsurprisingly, the more particles I added, the slower it got. Drawing them instanced helped a lot with performance, but updating each particle’s position was still taking a large chunk of CPU time.

There are a couple of solutions for this. First of all, don’t update and render particles over the whole scene, and second, utilise the graphics card for heavy computation. So I decided to familiarise myself with compute shaders.

At first I had some trouble understanding the relationship between the dispatch size and the number of threads declared in the compute shader. I initially had a dispatch call like this:

pContext->Dispatch(1024, 1, 1);

And a number of threads in the compute shader like this:

[numthreads(1024, 1, 1)]

I put the same number because I thought they had to match.

After doing some more research I finally understood how they relate. The dispatch arguments specify how many thread groups are dispatched to the GPU, and numthreads in the compute shader specifies how many threads run in each of those groups. So 1024 thread groups of 1024 threads each result in 1 048 576 threads running the shader code.
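The arithmetic can be sketched on the CPU for the one-dimensional case. The function names below are hypothetical; the index formula mirrors how SV_DispatchThreadID is derived from the group ID, the group size, and the thread's index within its group.

```cpp
#include <cassert>
#include <cstdint>

// Total number of shader invocations for Dispatch(groupCountX, 1, 1)
// combined with [numthreads(threadsPerGroupX, 1, 1)].
std::uint64_t TotalThreads(std::uint32_t groupCountX, std::uint32_t threadsPerGroupX)
{
    return static_cast<std::uint64_t>(groupCountX) * threadsPerGroupX;
}

// SV_DispatchThreadID.x = SV_GroupID.x * numthreads.x + SV_GroupThreadID.x
std::uint32_t DispatchThreadIdX(std::uint32_t groupId,
                                std::uint32_t threadsPerGroupX,
                                std::uint32_t groupThreadId)
{
    assert(groupThreadId < threadsPerGroupX);
    return groupId * threadsPerGroupX + groupThreadId;
}
```

For example, TotalThreads(1024, 1024) gives 1 048 576, and the last thread of the last group gets DispatchThreadIdX(1023, 1024, 1023), which is 1 048 575.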

I wanted to fill the whole Sponza scene with 2 million particles. So I thought I was clever when I did this in the compute shader:

[numthreads(1024, 1, 1)]
void main(uint3 DTid : SV_DispatchThreadID)
{
    [unroll(2)]
    for (int i = 0; i < 2; ++i)
    {
        const uint updateIndex = DTid.x + 1048576 * i;
...

The only problem is that jumping that far in GPU memory is not very efficient. But it worked!

My next idea was to limit the positions of each particle to wrap around the camera’s position so that the whole scene doesn’t have to be riddled with particles no one is looking at. That also meant I could go from having 2 million particles to 131 072 to save performance, and still make it appear that there are particles everywhere.
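One way to wrap positions around the camera is to fold each coordinate back into a box centred on the camera. The C++ sketch below is a guess at the idea rather than the actual ClampPosition from this project; the function names and the cubic box shape are assumptions.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Wraps `value` into the interval [centre - halfExtent, centre + halfExtent).
float WrapAroundCentre(float value, float centre, float halfExtent)
{
    assert(halfExtent > 0.0f);
    const float size = 2.0f * halfExtent;
    const float minEdge = centre - halfExtent;
    const float offset = value - minEdge;
    // Floor-based modulo so negative offsets wrap correctly too.
    const float wrapped = offset - size * std::floor(offset / size);
    return minEdge + wrapped;
}

// Wraps a particle position into a cube of side 2 * halfExtent around the camera.
Vec3 WrapPositionAroundCamera(const Vec3& position, const Vec3& camera, float halfExtent)
{
    return { WrapAroundCentre(position.x, camera.x, halfExtent),
             WrapAroundCentre(position.y, camera.y, halfExtent),
             WrapAroundCentre(position.z, camera.z, halfExtent) };
}
```

A particle that drifts out of one side of the box reappears on the opposite side, so the same fixed pool of particles always surrounds the camera.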

I chose a dispatch of 2048:

pContext->Dispatch(2048, 1, 1);

And a thread group size of 64, with no more for-loop:

[numthreads(64, 1, 1)]
void main(uint3 DTid : SV_DispatchThreadID)
{
    const uint updateIndex = DTid.x;

    instanceData[updateIndex].instanceTransform = UpdateTransform(updateIndex);
    instanceData[updateIndex].colour.a = UpdateAlpha(updateIndex);

    particleData[updateIndex].travelAngles = UpdateAngles(updateIndex);
    // Uses the transform that was just written for this particle.
    particleData[updateIndex].startPosition = ClampPosition(instanceData[updateIndex].instanceTransform);
}

This way each of the 2048 thread groups runs 64 threads, so all 2048 × 64 = 131 072 particles are updated, one per thread.
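Rather than hard-coding 2048, the group count can be derived from the particle count. This is a minimal C++ sketch with a hypothetical helper name; it rounds up so that a particle count which is not a multiple of the group size is still fully covered.

```cpp
#include <cassert>
#include <cstdint>

// Number of thread groups needed so that every particle gets one thread.
// threadsPerGroup must match the value in [numthreads(...)] in the shader.
std::uint32_t ThreadGroupCount(std::uint32_t particleCount, std::uint32_t threadsPerGroup)
{
    assert(threadsPerGroup > 0);
    // Ceiling division, done in 64 bits to avoid overflow near the uint32 limit.
    const std::uint64_t groups =
        (static_cast<std::uint64_t>(particleCount) + threadsPerGroup - 1) / threadsPerGroup;
    return static_cast<std::uint32_t>(groups);
}
```

With 131 072 particles and 64 threads per group this gives exactly 2048. If the count does not divide evenly, the last group has spare threads, so the shader would then need a guard such as returning early when DTid.x is not below the particle count (an assumption about how to handle it, since the shader above has no such check).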


The Result

Putting everything together I have a flashlight that:

  • Casts shadows
  • Reflects indirect light on objects around it
  • Uses a texture for the direct light
  • Illuminates GPU particles
  • Utilises volumetric lighting for a fog effect
