Reverse engineering the rendering of The Witcher 3: various sky effects

[Previous parts of the analysis: first, second and third.]

Part 1. Cirrus Clouds

When a game takes place in open spaces, the sky is one of the factors that determine how believable the world is. Think about it: most of the time the sky takes up about 40-50% of the screen. The sky is much more than a pretty gradient. It has stars, the sun, the moon, and finally clouds.

Although the current trend is volumetric cloud rendering with raymarching (see this article), the clouds in The Witcher 3 are completely texture-based. I have examined them before, but they turned out to be more complicated than I originally expected. If you have followed this series, you know that the Blood and Wine DLC differs from the rest of the game. And, as you might guess, the DLC changes how clouds are handled as well.

The Witcher 3 has several cloud layers. Depending on the weather, there may be only cirrus clouds, altocumulus clouds, or perhaps a few clouds from the stratus family (for example, during a storm). There may also be no clouds at all.

Some layers differ in the textures and shaders used to render them. Obviously, this affects the complexity and length of the pixel shader assembly code.

Despite all this diversity, some common patterns can be observed in how The Witcher 3 renders clouds. First, they are all rendered in the forward pass, which is the right choice. All of them use blending (see below). This makes it much easier to control how much a given layer covers the sky: coverage is driven by the alpha value output by the pixel shader.

More interestingly, some layers are rendered twice with the same parameters.

After looking at the code, I picked the shortest shader so that I would (1) most likely be able to reverse engineer it completely and (2) figure out all of its aspects.

I took a closer look at the cirrus clouds from Witcher 3: Blood and Wine.

Here is an example frame:

Before rendering

After the first render pass

After the second render pass

In this particular frame, the cirrus clouds are the first layer rendered. As you can see, the layer is rendered twice, which increases its brightness.

Geometry and vertex shader

Before the pixel shader, let's briefly talk about the geometry and the vertex shader used. The mesh for the clouds looks a bit like a regular sky dome:

All vertices lie in the interval [0-1], so to center the mesh on the point (0,0,0), a scale and bias are applied before the worldViewProj transform (we already know this pattern from the previous parts of the series). In the case of clouds, the mesh is stretched strongly along the XY plane (the Z axis points up) to cover more space than the view frustum. The result is as follows:

In addition, the mesh has normal and tangent vectors. The vertex shader also computes the bitangent vector with a cross product; all three are output in normalized form. Fog is also calculated per vertex (its color and amount).

Pixel shader

The pixel shader assembly code looks like this:

 ps_5_0  
    dcl_globalFlags refactoringAllowed  
    dcl_constantbuffer cb0[10], immediateIndexed  
    dcl_constantbuffer cb1[9], immediateIndexed  
    dcl_constantbuffer cb12[238], immediateIndexed  
    dcl_constantbuffer cb4[13], immediateIndexed  
    dcl_sampler s0, mode_default  
    dcl_resource_texture2d (float,float,float,float) t0  
    dcl_resource_texture2d (float,float,float,float) t1  
    dcl_input_ps linear v0.xyzw  
    dcl_input_ps linear v1.xyzw  
    dcl_input_ps linear v2.w  
    dcl_input_ps linear v3.xyzw  
    dcl_input_ps linear v4.xyz  
    dcl_input_ps linear v5.xyz  
    dcl_output o0.xyzw  
    dcl_temps 4  
   0: mul r0.xyz, cb0[9].xyzx, l(1.000000, 1.000000, -1.000000, 0.000000)  
   1: dp3 r0.w, r0.xyzx, r0.xyzx  
   2: rsq r0.w, r0.w  
   3: mul r0.xyz, r0.wwww, r0.xyzx  
   4: mul r1.xy, cb0[0].xxxx, cb4[5].xyxx  
   5: mad r1.xy, v1.xyxx, cb4[4].xyxx, r1.xyxx  
   6: sample_indexable(texture2d)(float,float,float,float) r1.xyzw, r1.xyxx, t0.xyzw, s0  
   7: add r1.xyz, r1.xyzx, l(-0.500000, -0.500000, -0.500000, 0.000000)  
   8: add r1.xyz, r1.xyzx, r1.xyzx  
   9: dp3 r0.w, r1.xyzx, r1.xyzx  
  10: rsq r0.w, r0.w  
  11: mul r1.xyz, r0.wwww, r1.xyzx  
  12: mul r2.xyz, r1.yyyy, v3.xyzx  
  13: mad r2.xyz, v5.xyzx, r1.xxxx, r2.xyzx  
  14: mov r3.xy, v1.zwzz  
  15: mov r3.z, v3.w  
  16: mad r1.xyz, r3.xyzx, r1.zzzz, r2.xyzx  
  17: dp3_sat r0.x, r0.xyzx, r1.xyzx  
  18: add r0.y, -cb4[2].x, cb4[3].x  
  19: mad r0.x, r0.x, r0.y, cb4[2].x  
  20: dp2 r0.y, -cb0[9].xyxx, -cb0[9].xyxx  
  21: rsq r0.y, r0.y  
  22: mul r0.yz, r0.yyyy, -cb0[9].xxyx  
  23: add r1.xyz, -v4.xyzx, cb1[8].xyzx  
  24: dp3 r0.w, r1.xyzx, r1.xyzx  
  25: rsq r1.z, r0.w  
  26: sqrt r0.w, r0.w  
  27: add r0.w, r0.w, -cb4[7].x  
  28: mul r1.xy, r1.zzzz, r1.xyxx  
  29: dp2_sat r0.y, r0.yzyy, r1.xyxx  
  30: add r0.y, r0.y, r0.y  
  31: min r0.y, r0.y, l(1.000000)  
  32: add r0.z, -cb4[0].x, cb4[1].x  
  33: mad r0.z, r0.y, r0.z, cb4[0].x  
  34: mul r0.x, r0.x, r0.z  
  35: log r0.x, r0.x  
  36: mul r0.x, r0.x, l(2.200000)  
  37: exp r0.x, r0.x  
  38: add r1.xyz, cb12[236].xyzx, -cb12[237].xyzx  
  39: mad r1.xyz, r0.yyyy, r1.xyzx, cb12[237].xyzx  
  40: mul r2.xyz, r0.xxxx, r1.xyzx  
  41: mad r0.xyz, -r1.xyzx, r0.xxxx, v0.xyzx  
  42: mad r0.xyz, v0.wwww, r0.xyzx, r2.xyzx  
  43: add r1.x, -cb4[7].x, cb4[8].x  
  44: div_sat r0.w, r0.w, r1.x  
  45: mul r1.x, r1.w, cb4[9].x  
  46: mad r1.y, -cb4[9].x, r1.w, r1.w  
  47: mad r0.w, r0.w, r1.y, r1.x  
  48: mul r1.xy, cb0[0].xxxx, cb4[11].xyxx  
  49: mad r1.xy, v1.xyxx, cb4[10].xyxx, r1.xyxx  
  50: sample_indexable(texture2d)(float,float,float,float) r1.x, r1.xyxx, t1.xyzw, s0  
  51: mad r1.x, r1.x, cb4[12].x, -cb4[12].x  
  52: mad_sat r1.x, cb4[12].x, v2.w, r1.x  
  53: mul r0.w, r0.w, r1.x  
  54: mul_sat r0.w, r0.w, cb4[6].x  
  55: mul o0.xyz, r0.wwww, r0.xyzx  
  56: mov o0.w, r0.w  
  57: ret

Two seamless textures are used as input. The first contains a normal map (xyz channels) and the cloud shape (a channel). The second is noise used to distort the shape.

Normal map, property of CD Projekt Red

Cloud shape, property of CD Projekt Red

Noise texture, property of CD Projekt Red

The main constant buffer with cloud parameters is cb4. For this frame, it holds the following values:

Values from other cbuffers are used as well. Do not worry, we will cover them too.

Z-direction inverted sunlight

The first thing that happens in the shader is the calculation of the normalized direction of sunlight inverted along the Z axis:

   0: mul r0.xyz, cb0[9].xyzx, l(1.000000, 1.000000, -1.000000, 0.000000)  
   1: dp3 r0.w, r0.xyzx, r0.xyzx  
   2: rsq r0.w, r0.w  
   3: mul r0.xyz, r0.wwww, r0.xyzx  

   float3 invertedSunlightDir = normalize(lightDir * float3(1, 1, -1) );

As mentioned earlier, the Z axis points up, and cb0[9] is the sunlight direction. This vector points toward the sun, which is important! You can verify this by writing a simple compute shader that performs a plain NdotL and injecting it into the deferred shading pass.

Cloud Texture Sampling

The next step is to compute texcoords to sample the cloud texture, unpack the normal vector and normalize it.

   4: mul r1.xy, cb0[0].xxxx, cb4[5].xyxx   
   5: mad r1.xy, v1.xyxx, cb4[4].xyxx, r1.xyxx   
   6: sample_indexable(texture2d)(float,float,float,float) r1.xyzw, r1.xyxx, t0.xyzw, s0   
   7: add r1.xyz, r1.xyzx, l(-0.500000, -0.500000, -0.500000, 0.000000)   
   8: add r1.xyz, r1.xyzx, r1.xyzx   
   9: dp3 r0.w, r1.xyzx, r1.xyzx   
  10: rsq r0.w, r0.w   
   
   
   // Calc sampling coords  
   float2 cloudTextureUV = Texcoords * textureScale + elapsedTime * speedFactors;  
   
   // Sample texture and get data from it  
   float4 cloudTextureValue = texture0.Sample( sampler0, cloudTextureUV ).rgba;  
   float3 normalMap = cloudTextureValue.xyz;  
   float cloudShape = cloudTextureValue.a;  
   
   // Unpack normal and normalize it  
   float3 unpackedNormal = (normalMap - 0.5) * 2.0;  
   unpackedNormal = normalize(unpackedNormal);

Let's go through it step by step.

To move the clouds, we need the elapsed time in seconds (cb0[0].x) multiplied by a speed factor that controls how fast the clouds drift across the sky (cb4[5].xy).

As I said earlier, the UVs are stretched over the sky dome geometry, and we also need texture scale factors that control the size of the clouds (cb4[4].xy).

The final formula is:

samplingUV = Input.TextureUV * textureScale + time * speedMultiplier;

After sampling all 4 channels, we have a normal map (rgb channels) and a cloud shape (channel a).

To unpack the normal map from the interval [0; 1] to the interval [-1; 1], we use the following formula:

unpackedNormal = (packedNormal - 0.5) * 2.0;

You can also use this:

unpackedNormal = packedNormal * 2.0 - 1.0;

Finally, we normalize the unpacked normal vector.
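To check the two unpacking formulas against each other outside the shader, here is a minimal CPU-side sketch in Python. The function name unpack_normal and the sample texel are made up for illustration; they are not taken from the game.

```python
import math

def unpack_normal(packed):
    """Map each channel from [0, 1] to [-1, 1], then normalize the vector."""
    unpacked = [(c - 0.5) * 2.0 for c in packed]
    length = math.sqrt(sum(c * c for c in unpacked))
    # A texel of exactly (0.5, 0.5, 0.5) would give length 0; not handled here.
    return [c / length for c in unpacked]

# Hypothetical texel value, not read from the actual texture.
packed = [0.6, 0.5, 1.0]
normal = unpack_normal(packed)

# The alternative formula gives the same vector before normalization.
alternative = [c * 2.0 - 1.0 for c in packed]
```

Both formulas are algebraically identical, so which one to use is purely a matter of taste.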

Normal mapping

Having the normal, tangent and bitangent vectors from the vertex shader and the normal vector from the normal map, we perform normal mapping in the usual way.

  11: mul r1.xyz, r0.wwww, r1.xyzx  
  12: mul r2.xyz, r1.yyyy, v3.xyzx  
  13: mad r2.xyz, v5.xyzx, r1.xxxx, r2.xyzx  
  14: mov r3.xy, v1.zwzz  
  15: mov r3.z, v3.w  
  16: mad r1.xyz, r3.xyzx, r1.zzzz, r2.xyzx  
    
   // Perform bump mapping  
   float3 SkyTangent = Input.Tangent;  
   float3 SkyNormal = (float3( Input.Texcoords.zw, Input.param3.w ));  
   float3 SkyBitangent = Input.param3.xyz;  
        
   float3x3 TBN = float3x3(SkyTangent, SkyBitangent, SkyNormal);  
   float3 finalNormal = (float3)mul( unpackedNormal, (TBN) );
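The multiplication by the TBN matrix in lines 12-16 is just a weighted sum of the three basis vectors. A minimal Python sketch of that reading follows; the helper name and the sample basis are assumptions chosen for illustration.

```python
def tangent_to_world(n, tangent, bitangent, normal):
    """Equivalent of mul(n, float3x3(T, B, N)) for a row vector n."""
    return [n[0] * tangent[i] + n[1] * bitangent[i] + n[2] * normal[i]
            for i in range(3)]

# Hypothetical basis for a dome vertex straight above the camera (Z is up).
T = [1.0, 0.0, 0.0]
B = [0.0, 1.0, 0.0]
N = [0.0, 0.0, 1.0]

flat = tangent_to_world([0.0, 0.0, 1.0], T, B, N)    # unperturbed texel
tilted = tangent_to_world([0.6, 0.0, 0.8], T, B, N)  # tilted along the tangent
```

A texel that encodes (0, 0, 1) leaves the dome normal unchanged, while the x and y channels tilt it toward the tangent and bitangent.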

Brightness (1)

In the next step, NdotL is calculated; it affects how strongly a given pixel is lit.

Consider the following assembly code:

  17: dp3_sat r0.x, r0.xyzx, r1.xyzx  
  18: add r0.y, -cb4[2].x, cb4[3].x  
  19: mad r0.x, r0.x, r0.y, cb4[2].x

Here is the visualization of NdotL on the frame in question:

This dot product (with saturation) is used to interpolate between a minimum and a maximum intensity (cb4[2].x and cb4[3].x, called param1Min and param1Max in the code below). Thanks to this, the parts of the clouds lit by the sun are brighter.

   // Calculate cosine between normal and up-inv lightdir  
   float NdotL = saturate( dot(invertedSunlightDir, finalNormal) );  
   
   // Param 1, line 19, r0.x  
   float intensity1 = lerp( param1Min, param1Max, NdotL );

Brightness (2)

There is another factor affecting the brightness of the clouds.

Clouds located in the part of the sky where the sun is should be highlighted more. To do this, we calculate a gradient based on the XY plane.

This gradient is used for a linear interpolation between min / max values, similar to what happens in part (1).

In theory, then, we could darken the clouds on the side of the sky opposite the sun, but this does not happen in this particular frame, because param2Min and param2Max (cb4[0].x and cb4[1].x) are both set to 1.0f.

  20: dp2 r0.y, -cb0[9].xyxx, -cb0[9].xyxx  
  21: rsq r0.y, r0.y  
  22: mul r0.yz, r0.yyyy, -cb0[9].xxyx  
  23: add r1.xyz, -v4.xyzx, cb1[8].xyzx  
  24: dp3 r0.w, r1.xyzx, r1.xyzx  
  25: rsq r1.z, r0.w  
  26: sqrt r0.w, r0.w  
  27: add r0.w, r0.w, -cb4[7].x  
  28: mul r1.xy, r1.zzzz, r1.xyxx  
  29: dp2_sat r0.y, r0.yzyy, r1.xyxx  
  30: add r0.y, r0.y, r0.y  
  31: min r0.y, r0.y, l(1.000000)  
  32: add r0.z, -cb4[0].x, cb4[1].x  
  33: mad r0.z, r0.y, r0.z, cb4[0].x  
  34: mul r0.x, r0.x, r0.z  
  35: log r0.x, r0.x  
  36: mul r0.x, r0.x, l(2.200000)  
  37: exp r0.x, r0.x   
   
   
   // Calculate normalized -lightDir.xy (20-22)  
   float2 lightDirXY = normalize( -lightDir.xy );  
   
   // Calculate world to camera  
   float3 vWorldToCamera = ( CameraPos - WorldPos );  
   float worldToCamera_distance = length(vWorldToCamera);  
        
   // normalize vector  
   vWorldToCamera = normalize( vWorldToCamera );  
        
   
   float LdotV = saturate( dot(lightDirXY, vWorldToCamera.xy) );  
   float highlightedSkySection = saturate( 2*LdotV );  
   float intensity2 = lerp( param2Min, param2Max, highlightedSkySection );  
   
   float finalIntensity = pow( intensity2 *intensity1, 2.2);

At the very end, we multiply both brightness values and raise the result to the power of 2.2.
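The two brightness terms can be sketched on the CPU as follows. This is a minimal Python sketch rather than the game's code: the function and parameter names are assumptions, and the inputs are made-up values. Note that the log and exp instructions in lines 35-37 work in base 2, so together with the multiplication by 2.2 they implement pow(x, 2.2).

```python
import math

def saturate(x):
    return min(max(x, 0.0), 1.0)

def lerp(a, b, t):
    return a + (b - a) * t

def cloud_intensity(n_dot_l, l_dot_v, p1_min, p1_max, p2_min, p2_max):
    """Combine both brightness terms the way lines 17-37 do."""
    intensity1 = lerp(p1_min, p1_max, saturate(n_dot_l))
    highlighted = saturate(2.0 * saturate(l_dot_v))
    intensity2 = lerp(p2_min, p2_max, highlighted)
    product = intensity1 * intensity2
    if product <= 0.0:
        return 0.0  # log2 is undefined here; pow(0, 2.2) is 0 anyway
    # log, mul, exp in the assembly: 2^(2.2 * log2(x)) equals x^2.2
    return 2.0 ** (2.2 * math.log2(product))

# Hypothetical parameters: param1 in [0.5, 1.0], param2 fixed at 1.0.
lit = cloud_intensity(1.0, 1.0, 0.5, 1.0, 1.0, 1.0)
unlit = cloud_intensity(0.0, 0.0, 0.5, 1.0, 1.0, 1.0)
```

With param2Min and param2Max both equal to 1.0, as in the analyzed frame, the second term drops out and only NdotL changes the result.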

Cloud color

The cloud color calculation begins by fetching two values from the constant buffer: the color of clouds near the sun and the color of clouds on the opposite side of the sky. A linear interpolation between them is performed based on highlightedSkySection.

Then the result is multiplied by finalIntensity.

Finally, the result is blended with fog (for performance reasons, the fog was calculated in the vertex shader).

  38: add r1.xyz, cb12[236].xyzx, -cb12[237].xyzx  
  39: mad r1.xyz, r0.yyyy, r1.xyzx, cb12[237].xyzx  
  40: mul r2.xyz, r0.xxxx, r1.xyzx  
  41: mad r0.xyz, -r1.xyzx, r0.xxxx, v0.xyzx  
  42: mad r0.xyz, v0.wwww, r0.xyzx, r2.xyzx  
   
  float3 cloudsColor = lerp( cloudsColorBack, cloudsColorFront, highlightedSkySection );  
  cloudsColor *= finalIntensity;  
  cloudsColor = lerp( cloudsColor, FogColor, FogAmount );
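Lines 40-42 look unusual at first, but they expand to FogAmount * (FogColor - cloudsColor) + cloudsColor, which is exactly a lerp. The Python sketch below shows the whole color step under that reading; the function name and the sample colors are hypothetical.

```python
def lerp(a, b, t):
    return a + (b - a) * t

def shade_cloud(color_back, color_front, highlighted, intensity,
                fog_color, fog_amount):
    """Pick the cloud color, scale it by intensity, then blend toward fog."""
    lit = [lerp(b, f, highlighted) * intensity
           for b, f in zip(color_back, color_front)]
    # Lines 40-42: fog_amount * (fog_color - lit) + lit, i.e. a lerp.
    return [lerp(c, fc, fog_amount) for c, fc in zip(lit, fog_color)]

# Hypothetical colors; the real ones come from cb12[237] (back)
# and cb12[236] (front).
back = [0.25, 0.25, 0.5]
front = [1.0, 0.75, 0.5]
fog = [0.5, 0.5, 0.5]

no_fog = shade_cloud(back, front, 1.0, 1.0, fog, 0.0)
full_fog = shade_cloud(back, front, 1.0, 1.0, fog, 1.0)
```

With a fog amount of 0 the cloud color passes through unchanged, and with 1 it is fully replaced by the fog color.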

Make cirrus clouds more visible on the horizon

This is not very noticeable in the frame, but this layer is actually more visible near the horizon than above Geralt's head. Here is how it is done.

You may have noticed that when calculating the second brightness, we also computed the length of the worldToCamera vector:

  23: add r1.xyz, -v4.xyzx, cb1[8].xyzx  
  24: dp3 r0.w, r1.xyzx, r1.xyzx  
  25: rsq r1.z, r0.w  
  26: sqrt r0.w, r0.w

Let's look for further uses of this length in the code:

  26: sqrt r0.w, r0.w  
  27: add r0.w, r0.w, -cb4[7].x  
  ...  
  43: add r1.x, -cb4[7].x, cb4[8].x  
  44: div_sat r0.w, r0.w, r1.x

Whoa, what do we have here?

cb4[7].x and cb4[8].x have the values 2000.0 and 7000.0.

It turns out that this is the result of using the linstep function.

It takes three parameters: min and max, which define the interval, and v, the value.

This works as follows: if v is in the interval [ min - max ], then the function returns linear interpolation in the interval [0.0 - 1.0]. On the other hand, if v is out of range, then linstep returns 0.0 or 1.0.

A simple example:

linstep( 1000.0, 2000.0, 999.0) = 0.0
linstep( 1000.0, 2000.0, 1500.0) = 0.5
linstep( 1000.0, 2000.0, 2000.0) = 1.0

That is, it is quite similar to smoothstep from HLSL, except that it performs linear interpolation instead of Hermite interpolation.

linstep is not a built-in HLSL function, but it is very useful. It is worth having in your toolkit.

 // linstep:  
 //  
 // Returns a linear interpolation between 0 and 1 if "v" is in the range [min, max]   
 // if "v" is <= min, the output is 0  
 // if "v" is >= max, the output is 1  
   
 float linstep( float min, float max, float v )  
 {  
   return saturate( (v - min) / (max - min) );  
 }

Let's go back to The Witcher 3: having calculated this factor, which tells us how far a given part of the sky is from Geralt, we use it to fade the clouds:

  45: mul r1.x, r1.w, cb4[9].x  
  46: mad r1.y, -cb4[9].x, r1.w, r1.w  
  47: mad r0.w, r0.w, r1.y, r1.x  
   
   float distanceAttenuation = linstep( fadeDistanceStart, fadeDistanceEnd, worldToCamera_distance );  
    
   float fadedCloudShape = closeCloudsHidingFactor * cloudShape;  
   cloudShape = lerp( fadedCloudShape, cloudShape, distanceAttenuation );

cloudShape is the .a channel of the first texture, and closeCloudsHidingFactor is a constant buffer value that controls the visibility of clouds above Geralt's head. In all the frames I tested, it was equal to 0.0, which means no clouds directly overhead. As distanceAttenuation approaches 1.0 (the distance from the camera to the sky dome increases), the clouds become more visible.
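To get a feel for the numbers, here is a small Python sketch of the distance fade using the values from this frame (2000.0 and 7000.0 for the fade range, 0.0 for the hiding factor). The function names and the sample cloud shape of 0.8 are assumptions made for illustration.

```python
def saturate(x):
    return min(max(x, 0.0), 1.0)

def lerp(a, b, t):
    return a + (b - a) * t

def linstep(lo, hi, v):
    """Linear ramp from 0 at lo to 1 at hi, clamped outside that range."""
    return saturate((v - lo) / (hi - lo))

def faded_cloud_shape(cloud_shape, distance, fade_start=2000.0,
                      fade_end=7000.0, close_hiding_factor=0.0):
    """Mirror lines 45-47: fade the cloud shape in with distance."""
    attenuation = linstep(fade_start, fade_end, distance)
    return lerp(close_hiding_factor * cloud_shape, cloud_shape, attenuation)

overhead = faded_cloud_shape(0.8, 1000.0)  # closer than fade_start
midway = faded_cloud_shape(0.8, 4500.0)    # halfway through the fade range
horizon = faded_cloud_shape(0.8, 9000.0)   # beyond fade_end
```

With these inputs the cloud is invisible overhead, at half strength midway and at full strength toward the horizon.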

Noise texture sampling

The sampling coordinates for the noise texture are calculated in the same way as for the cloud texture, except that a different set of textureScale and speedMultiplier values is used.

Of course, all of these textures are sampled with a sampler that uses the wrap addressing mode.

  48: mul r1.xy, cb0[0].xxxx, cb4[11].xyxx  
  49: mad r1.xy, v1.xyxx, cb4[10].xyxx, r1.xyxx  
  50: sample_indexable(texture2d)(float,float,float,float) r1.x, r1.xyxx, t1.xyzw, s0  
   
   // Calc sampling coords for noise  
   float2 noiseTextureUV = Texcoords * textureScaleNoise + elapsedTime * speedFactorsNoise;  
   
   // Sample texture and get data from it  
   float noiseTextureValue = texture1.Sample( sampler0, noiseTextureUV ).x;

Putting It All Together

Having received the noise value, we must combine it with cloudShape.

I had some trouble understanding these lines, which involve param2.w (always 1.0) and noiseMult (5.0, taken from the constant buffer).

Be that as it may, the most important thing here is the final value of generalCloudsVisibility, which affects the visibility of the clouds.

Also take a look at the final noise value. The output cloudsColor is multiplied by the final noise, which is also written to the alpha channel.

  51: mad r1.x, r1.x, cb4[12].x, -cb4[12].x
  52: mad_sat r1.x, cb4[12].x, v2.w, r1.x
  53: mul r0.w, r0.w, r1.x
  54: mul_sat r0.w, r0.w, cb4[6].x
  55: mul o0.xyz, r0.wwww, r0.xyzx
  56: mov o0.w, r0.w
  57: ret   

   // Sample noise texture and get data from it  
   float noiseTextureValue = texture1.Sample( sampler0, noiseTextureUV ).x;  
   noiseTextureValue = noiseTextureValue * noiseMult - noiseMult;  
     
   float noiseValue = saturate( noiseMult * Input.param2.w + noiseTextureValue);  
   noiseValue *= cloudShape;  
     
   float finalNoise = saturate( noiseValue * generalCloudsVisibility);  
   
   return float4( cloudsColor*finalNoise, finalNoise );
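One way to read lines 51-52 is that they add up to saturate(noiseMult * (noise + param2.w - 1)). With param2.w equal to 1.0 and noiseMult equal to 5.0, this is simply saturate(5 * noise), so every noise texel above 0.2 leaves the cloud shape untouched and only darker texels eat into it. Here is a small Python sketch of that reading; the function name, the default visibility of 1.0 and the sample inputs are assumptions, not values captured from the game.

```python
def saturate(x):
    return min(max(x, 0.0), 1.0)

def cloud_alpha(noise, cloud_shape, noise_mult=5.0, param2_w=1.0,
                visibility=1.0):
    """Mirror lines 51-54: remap the noise, mask the shape, scale, clamp."""
    remapped = noise * noise_mult - noise_mult
    noise_value = saturate(noise_mult * param2_w + remapped)
    return saturate(noise_value * cloud_shape * visibility)

solid = cloud_alpha(0.5, 0.6)   # bright noise texel: shape passes through
eroded = cloud_alpha(0.1, 1.0)  # dark noise texel: shape is cut in half
hole = cloud_alpha(0.0, 0.6)    # black noise texel: shape is removed
```

Under this reading, a lower param2.w would shift the threshold up and let the noise erode more of the cloud, which may be why it is passed in from the vertex shader.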

Summary

The finished result looks very believable.

You can compare. The first picture is my shader, the second is the game shader:

If you're curious, the shader is available here.

Part 2. Fog

Fog can be implemented in various ways. However, the days when we could apply a simple distance-based fog and be done with it are (most likely) gone for good. Living in the world of programmable shaders has opened the door to new, crazy, but more importantly, physically accurate and visually realistic solutions.

Current trends in fog rendering are based on compute shaders (for details, see this presentation by Bart Wronski).

Even though that presentation appeared in 2014 and The Witcher 3 was released in 2015/2016, the fog in the final chapter of Geralt's adventures is entirely screen-space and implemented as a typical post-process.

Before we begin our next reverse engineering session, I must say that over the past year I tried to figure out the fog in The Witcher 3 at least five times, and failed each time. The assembly code, as you will soon see, is quite complicated, which makes producing a readable fog shader in HLSL almost impossible.

However, I managed to find a fog shader on the Internet that immediately caught my attention because of its similarity to The Witcher 3 fog in terms of variable names and the general order of instructions. This shader was not exactly the same as the one in the game, so I had to rework it a bit. I want to make it clear that most of the HLSL code you see here was, with two exceptions, not created / analyzed by me. Keep that in mind.

Here is the assembly code of the fog pixel shader. It is worth noting that it is the same for the entire game (the 2015 base game and both DLCs):

 ps_5_0  
    dcl_globalFlags refactoringAllowed  
    dcl_constantbuffer cb3[2], immediateIndexed  
    dcl_constantbuffer cb12[214], immediateIndexed  
    dcl_resource_texture2d (float,float,float,float) t0  
    dcl_resource_texture2d (float,float,float,float) t1  
    dcl_resource_texture2d (float,float,float,float) t2  
    dcl_input_ps_siv v0.xy, position  
    dcl_output o0.xyzw  
    dcl_temps 7  
   0: ftou r0.xy, v0.xyxx  
   1: mov r0.zw, l(0, 0, 0, 0)  
   2: ld_indexable(texture2d)(float,float,float,float) r1.x, r0.xyww, t0.xyzw  
   3: mad r1.y, r1.x, cb12[22].x, cb12[22].y  
   4: lt r1.y, r1.y, l(1.000000)  
   5: if_nz r1.y  
   6:  utof r1.yz, r0.xxyx  
   7:  mul r2.xyzw, r1.zzzz, cb12[211].xyzw  
   8:  mad r2.xyzw, cb12[210].xyzw, r1.yyyy, r2.xyzw  
   9:  mad r1.xyzw, cb12[212].xyzw, r1.xxxx, r2.xyzw  
  10:  add r1.xyzw, r1.xyzw, cb12[213].xyzw  
  11:  div r1.xyz, r1.xyzx, r1.wwww  
  12:  ld_indexable(texture2d)(float,float,float,float) r2.xyz, r0.xyww, t1.xyzw  
  13:  ld_indexable(texture2d)(float,float,float,float) r0.x, r0.xyzw, t2.xyzw  
  14:  max r0.x, r0.x, cb3[1].x  
  15:  add r0.yzw, r1.xxyz, -cb12[0].xxyz  
  16:  dp3 r1.x, r0.yzwy, r0.yzwy  
  17:  sqrt r1.x, r1.x  
  18:  add r1.y, r1.x, -cb3[0].x  
  19:  add r1.zw, -cb3[0].xxxz, cb3[0].yyyw  
  20:  div_sat r1.y, r1.y, r1.z  
  21:  mad r1.y, r1.y, r1.w, cb3[0].z  
  22:  add r0.x, r0.x, l(-1.000000)  
  23:  mad r0.x, r1.y, r0.x, l(1.000000)  
  24:  div r0.yzw, r0.yyzw, r1.xxxx  
  25:  mad r1.y, r0.w, cb12[22].z, cb12[0].z  
  26:  add r1.x, r1.x, -cb12[22].z  
  27:  max r1.x, r1.x, l(0)  
  28:  min r1.x, r1.x, cb12[42].z  
  29:  mul r1.z, r0.w, r1.x  
  30:  mul r1.w, r1.x, cb12[43].x  
  31:  mul r1.zw, r1.zzzw, l(0.000000, 0.000000, 0.062500, 0.062500)  
  32:  dp3 r0.y, cb12[38].xyzx, r0.yzwy  
  33:  add r0.z, r0.y, cb12[42].x  
  34:  add r0.w, cb12[42].x, l(1.000000)  
  35:  div_sat r0.z, r0.z, r0.w  
  36:  add r0.w, -cb12[43].z, cb12[43].y  
  37:  mad r0.z, r0.z, r0.w, cb12[43].z  
  38:  mul r0.w, abs(r0.y), abs(r0.y)  
  39:  mad_sat r2.w, r1.x, l(0.002000), l(-0.300000)  
  40:  mul r0.w, r0.w, r2.w  
  41:  lt r0.y, l(0), r0.y  
  42:  movc r3.xyz, r0.yyyy, cb12[39].xyzx, cb12[41].xyzx  
  43:  add r3.xyz, r3.xyzx, -cb12[40].xyzx  
  44:  mad r3.xyz, r0.wwww, r3.xyzx, cb12[40].xyzx  
  45:  movc r4.xyz, r0.yyyy, cb12[45].xyzx, cb12[47].xyzx  
  46:  add r4.xyz, r4.xyzx, -cb12[46].xyzx  
  47:  mad r4.xyz, r0.wwww, r4.xyzx, cb12[46].xyzx  
  48:  ge r0.y, r1.x, cb12[48].y  
  49:  if_nz r0.y  
  50:   add r0.y, r1.y, cb12[42].y  
  51:   mul r0.w, r0.z, r0.y  
  52:   mul r1.y, r0.z, r1.z  
  53:   mad r5.xyzw, r1.yyyy, l(16.000000, 15.000000, 14.000000, 13.000000), r0.wwww  
  54:   max r5.xyzw, r5.xyzw, l(0, 0, 0, 0)  
  55:   add r5.xyzw, r5.xyzw, l(1.000000, 1.000000, 1.000000, 1.000000)  
  56:   div_sat r5.xyzw, r1.wwww, r5.xyzw  
  57:   add r5.xyzw, -r5.xyzw, l(1.000000, 1.000000, 1.000000, 1.000000)  
  58:   mul r1.z, r5.y, r5.x  
  59:   mul r1.z, r5.z, r1.z  
  60:   mul r1.z, r5.w, r1.z  
  61:   mad r5.xyzw, r1.yyyy, l(12.000000, 11.000000, 10.000000, 9.000000), r0.wwww  
  62:   max r5.xyzw, r5.xyzw, l(0, 0, 0, 0)  
  63:   add r5.xyzw, r5.xyzw, l(1.000000, 1.000000, 1.000000, 1.000000)  
  64:   div_sat r5.xyzw, r1.wwww, r5.xyzw  
  65:   add r5.xyzw, -r5.xyzw, l(1.000000, 1.000000, 1.000000, 1.000000)  
  66:   mul r1.z, r1.z, r5.x  
  67:   mul r1.z, r5.y, r1.z  
  68:   mul r1.z, r5.z, r1.z  
  69:   mul r1.z, r5.w, r1.z  
  70:   mad r5.xyzw, r1.yyyy, l(8.000000, 7.000000, 6.000000, 5.000000), r0.wwww  
  71:   max r5.xyzw, r5.xyzw, l(0, 0, 0, 0)  
  72:   add r5.xyzw, r5.xyzw, l(1.000000, 1.000000, 1.000000, 1.000000)  
  73:   div_sat r5.xyzw, r1.wwww, r5.xyzw  
  74:   add r5.xyzw, -r5.xyzw, l(1.000000, 1.000000, 1.000000, 1.000000)  
  75:   mul r1.z, r1.z, r5.x  
  76:   mul r1.z, r5.y, r1.z  
  77:   mul r1.z, r5.z, r1.z  
  78:   mul r1.z, r5.w, r1.z  
  79:   mad r5.xy, r1.yyyy, l(4.000000, 3.000000, 0.000000, 0.000000), r0.wwww  
  80:   max r5.xy, r5.xyxx, l(0, 0, 0, 0)  
  81:   add r5.xy, r5.xyxx, l(1.000000, 1.000000, 0.000000, 0.000000)  
  82:   div_sat r5.xy, r1.wwww, r5.xyxx  
  83:   add r5.xy, -r5.xyxx, l(1.000000, 1.000000, 0.000000, 0.000000)  
  84:   mul r1.z, r1.z, r5.x  
  85:   mul r1.z, r5.y, r1.z  
  86:   mad r0.w, r1.y, l(2.000000), r0.w  
  87:   max r0.w, r0.w, l(0)  
  88:   add r0.w, r0.w, l(1.000000)  
  89:   div_sat r0.w, r1.w, r0.w  
  90:   add r0.w, -r0.w, l(1.000000)  
  91:   mul r0.w, r0.w, r1.z  
  92:   mad r0.y, r0.y, r0.z, r1.y  
  93:   max r0.y, r0.y, l(0)  
  94:   add r0.y, r0.y, l(1.000000)  
  95:   div_sat r0.y, r1.w, r0.y  
  96:   add r0.y, -r0.y, l(1.000000)  
  97:   mad r0.y, -r0.w, r0.y, l(1.000000)  
  98:   add r0.z, r1.x, -cb12[48].y  
  99:   mul_sat r0.z, r0.z, cb12[48].z  
  100:  else  
  101:   mov r0.yz, l(0.000000, 1.000000, 0.000000, 0.000000)  
  102:  endif  
  103:  log r0.y, r0.y  
  104:  mul r0.w, r0.y, cb12[42].w  
  105:  exp r0.w, r0.w  
  106:  mul r0.y, r0.y, cb12[48].x  
  107:  exp r0.y, r0.y  
  108:  mul r0.yw, r0.yyyw, r0.zzzz  
  109:  mad_sat r1.xy, r0.wwww, cb12[189].xzxx, cb12[189].ywyy  
  110:  add r5.xyz, -r3.xyzx, cb12[188].xyzx  
  111:  mad r5.xyz, r1.xxxx, r5.xyzx, r3.xyzx  
  112:  add r0.z, cb12[188].w, l(-1.000000)  
  113:  mad r0.z, r1.y, r0.z, l(1.000000)  
  114:  mul_sat r5.w, r0.z, r0.w  
  115:  lt r0.z, l(0), cb12[192].x  
  116:  if_nz r0.z  
  117:   mad_sat r1.xy, r0.wwww, cb12[191].xzxx, cb12[191].ywyy  
  118:   add r6.xyz, -r3.xyzx, cb12[190].xyzx  
  119:   mad r3.xyz, r1.xxxx, r6.xyzx, r3.xyzx  
  120:   add r0.z, cb12[190].w, l(-1.000000)  
  121:   mad r0.z, r1.y, r0.z, l(1.000000)  
  122:   mul_sat r3.w, r0.z, r0.w  
  123:   add r1.xyzw, -r5.xyzw, r3.xyzw  
  124:   mad r5.xyzw, cb12[192].xxxx, r1.xyzw, r5.xyzw  
  125:  endif  
  126:  mul r0.z, r0.x, r5.w  
  127:  mul r0.x, r0.x, r0.y  
  128:  dp3 r0.y, l(0.333000, 0.555000, 0.222000, 0.000000), r2.xyzx  
  129:  mad r1.xyz, r0.yyyy, r4.xyzx, -r2.xyzx  
  130:  mad r0.xyw, r0.xxxx, r1.xyxz, r2.xyxz  
  131:  add r1.xyz, -r0.xywx, r5.xyzx  
  132:  mad r0.xyz, r0.zzzz, r1.xyzx, r0.xywx  
  133: else  
  134:  mov r0.xyz, l(0, 0, 0, 0)  
  135: endif  
  136: mov o0.xyz, r0.xyzx  
  137: mov o0.w, l(1.000000)  
  138: ret

Honestly, the shader is quite long. Probably too long for an effective reverse engineering process.

Here is an example of a sunset scene with fog:

Let's take a look at the input:

As for textures, we have a depth buffer, ambient occlusion, and an HDR color buffer.

Input depth buffer

Input ambient occlusion

Input HDR color buffer

... and the result of applying the fog shader in this scene looks like this:

HDR texture after applying fog.

The depth buffer is used to reconstruct the world position. This is a standard pattern in The Witcher 3 shaders.

Having ambient occlusion data (if enabled) allows us to obscure the fog. A very clever idea, perhaps an obvious one, but I never thought of it that way. I will return to this aspect later.

The shader starts by determining whether the pixel belongs to the sky. If it does (depth == 1.0), the shader returns black. If the pixel belongs to the scene (depth < 1.0), we reconstruct the world position from the depth buffer (lines 7-11) and continue with the fog calculation.

The fog pass runs shortly after deferred shading. You may notice that some elements rendered in the forward pass are not there yet. In this particular scene, deferred light volumes were applied, and after that Geralt's hair, face and eyes were rendered.
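The position reconstruction amounts to multiplying (x, y, depth, 1) by a 4x4 matrix stored in cb12[210..213], followed by a perspective divide. A minimal Python sketch of that reading is shown below; the function name is an assumption, and the identity matrix is only a stand-in, because the actual matrix contents were not captured here.

```python
def reconstruct_world_pos(px, py, depth, rows):
    """rows stands in for cb12[210..213]; each entry is one float4."""
    p = [rows[0][i] * px + rows[1][i] * py + rows[2][i] * depth + rows[3][i]
         for i in range(4)]
    # Perspective divide (line 11 of the assembly).
    return [p[0] / p[3], p[1] / p[3], p[2] / p[3]]

# Stand-in matrix: identity, so the input comes back unchanged.
identity = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

pos = reconstruct_world_pos(3.0, 4.0, 0.5, identity)
```

Because the pixel coordinates are used directly, the matrix has to map screen pixels (not normalized device coordinates) to world space.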

The first thing you need to know about fog in “The Witcher 3”: it consists of two parts - “color of the fog” and “color of the atmosphere”.

 struct FogResult  
 {  
    float4 paramsFog;     // RGB: color, A: influence  
    float4 paramsAerial;  // RGB: color, A: influence  
 };

For each part there are three colors: front, middle and back. That is, the constant buffer contains data such as “FogColorFront”, “FogColorMiddle”, “AerialColorBack”, etc. Let's look at the input data:

   // *** Inputs *** //  
   float3 FogSunDir = cb12_v38.xyz;  
   float3 FogColorFront = cb12_v39.xyz;  
   float3 FogColorMiddle = cb12_v40.xyz;  
   float3 FogColorBack = cb12_v41.xyz;  
     
   float4 FogBaseParams = cb12_v42;  
   float4 FogDensityParamsScene = cb12_v43;  
   float4 FogDensityParamsSky = cb12_v44;  
     
   float3 AerialColorFront = cb12_v45.xyz;  
   float3 AerialColorMiddle = cb12_v46.xyz;  
   float3 AerialColorBack = cb12_v47.xyz;  
   float4 AerialParams = cb12_v48;

Before calculating the final colors, we need to compute some vectors and dot products. The shader has access to the pixel's world position, the camera position (cb12[0].xyz) and the fog / light direction (cb12[38].xyz). This lets us calculate the dot product of the view vector and the fog direction.

   float3 frag_vec = fragPosWorldSpace.xyz - customCameraPos.xyz;  
   float frag_dist = length(frag_vec);  
     
   float3 frag_dir = frag_vec / frag_dist;  
   
   float dot_fragDirSunDir = dot(FogSunDir.xyz, frag_dir);

To calculate the blending gradient, we take the square of the absolute value of the dot product and then multiply the result by a distance-dependent factor:

   float3 curr_col_fog;  
   float3 curr_col_aerial;  
   {  
     float _dot = dot_fragDirSunDir;  
   
     float _dd = _dot;  
     {  
       const float _distOffset = -150;  
       const float _distRange = 500;  
       const float _mul = 1.0 / _distRange;  
       const float _bias = _distOffset * _mul;  
   
       _dd = abs(_dd);  
       _dd *= _dd;  
       _dd *= saturate( frag_dist * _mul + _bias );  
     }  
   
     curr_col_fog = lerp( FogColorMiddle.xyz, (_dot>0.0f ? FogColorFront.xyz : FogColorBack.xyz), _dd );  
     curr_col_aerial = lerp( AerialColorMiddle.xyz, (_dot>0.0f ? AerialColorFront.xyz : AerialColorBack.xyz), _dd );  
   }

This code block makes it clear where the 0.002 and -0.300 in line 39 of the assembly came from. As we can see, the dot product between the view and light vectors decides between the “front” and “back” colors. Clever!
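The arithmetic behind those two constants is easy to verify: 1 / 500 is 0.002, and -150 * 0.002 is -0.3. The short Python sketch below reproduces the gradient with the same constants; the function name and the test distances are assumptions.

```python
def saturate(x):
    return min(max(x, 0.0), 1.0)

DIST_OFFSET = -150.0
DIST_RANGE = 500.0
MUL = 1.0 / DIST_RANGE     # the 0.002 literal in the assembly
BIAS = DIST_OFFSET * MUL   # the -0.3 literal in the assembly

def blend_gradient(dot_frag_sun, frag_dist):
    """Squared |dot|, faded in between 150 and 650 units from the camera."""
    dd = abs(dot_frag_sun)
    dd *= dd
    return dd * saturate(frag_dist * MUL + BIAS)

near = blend_gradient(1.0, 100.0)   # closer than 150 units: no gradient
mid = blend_gradient(-1.0, 400.0)   # halfway through the 500-unit range
far = blend_gradient(0.5, 10000.0)  # far away: only the squared dot remains
```

In other words, everything closer than 150 units uses the middle color only, and the front / back colors reach full strength at 650 units.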

Here is a visualization of the resulting final gradient (_dd).

However, calculating the influence of the aerial / fog part is much more complicated. As you can see, we have many more parameters than just RGB colors. They include, for example, scene density. We use raymarching (16 steps, which is why the loop can be unrolled) to determine the fog amount and the scale factor.

Having the [camera ---> world] vector, we can divide all of its components by 16 to get a single raymarching step. As we see below, only the .z (height) component (curr_pos_z_step) takes part in the calculations.

You can read more about raymarched fog here, for example.

   float fog_amount = 1;  
   float fog_amount_scale = 0;  
   [branch]  
   if ( frag_dist >= AerialParams.y )  
   {  
     float curr_pos_z_base = (customCameraPos.z + FogBaseParams.y) * density_factor;  
     float curr_pos_z_step = frag_step.z * density_factor;  
   
     [unroll]  
     for ( int i=16; i>0; --i )  
     {  
       fog_amount *= 1 - saturate( density_sample_scale / (1 + max( 0.0, curr_pos_z_base + (i) * curr_pos_z_step ) ) );  
     }  
   
     fog_amount = 1 - fog_amount;  
     fog_amount_scale = saturate( (frag_dist - AerialParams.y) * AerialParams.z );  
   }  
   
   FogResult ret;  
   
   ret.paramsFog = float4 ( curr_col_fog, fog_amount_scale * pow( abs(fog_amount), final_exp_fog ) );  
   ret.paramsAerial = float4 ( curr_col_aerial, fog_amount_scale * pow( abs(fog_amount), final_exp_aerial ) );

The fog amount obviously depends on height (the .z components); at the end, the fog amount is raised to the fog / aerial exponent.

final_exp_fog and final_exp_aerial come from the constant buffer; they control how the fog and aerial colors affect the world as height increases.
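To see how the 16-step loop behaves, here is a minimal Python sketch of it. The function name and all input values are hypothetical; in the shader, density_factor and density_sample_scale are derived from the constant buffer and the fragment distance.

```python
def saturate(x):
    return min(max(x, 0.0), 1.0)

def fog_amount(base_z, step_z, density_factor, density_sample_scale,
               steps=16):
    """Accumulate fog along the ray; only the height (z) is sampled."""
    curr_pos_z_base = base_z * density_factor
    curr_pos_z_step = step_z * density_factor
    transmittance = 1.0
    for i in range(steps, 0, -1):
        height = max(0.0, curr_pos_z_base + i * curr_pos_z_step)
        transmittance *= 1.0 - saturate(density_sample_scale / (1.0 + height))
    return 1.0 - transmittance

# Hypothetical inputs: a horizontal ray at ground level and one 100 units up.
low = fog_amount(0.0, 0.0, 1.0, 0.1)
high = fog_amount(100.0, 0.0, 1.0, 0.1)
```

Each step removes a fraction of the remaining visibility, and that fraction shrinks as the sampled height grows, so rays passing high above the ground collect much less fog.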

Fog override

The shader I found did not have the following assembly code fragment:

  109:  mad_sat r1.xy, r0.wwww, cb12[189].xzxx, cb12[189].ywyy  
  110:  add r5.xyz, -r3.xyzx, cb12[188].xyzx  
  111:  mad r5.xyz, r1.xxxx, r5.xyzx, r3.xyzx  
  112:  add r0.z, l(-1.000000), cb12[188].w  
  113:  mad r0.z, r1.y, r0.z, l(1.000000)  
  114:  mul_sat r5.w, r0.w, r0.z  
  115:  lt r0.z, l(0.000000), cb12[192].x  
  116:  if_nz r0.z  
  117:   mad_sat r1.xy, r0.wwww, cb12[191].xzxx, cb12[191].ywyy  
  118:   add r6.xyz, -r3.xyzx, cb12[190].xyzx  
  119:   mad r3.xyz, r1.xxxx, r6.xyzx, r3.xyzx  
  120:   add r0.z, l(-1.000000), cb12[190].w  
  121:   mad r0.z, r1.y, r0.z, l(1.000000)  
  122:   mul_sat r3.w, r0.w, r0.z  
  123:   add r1.xyzw, -r5.xyzw, r3.xyzw  
  124:   mad r5.xyzw, cb12[192].xxxx, r1.xyzw, r5.xyzw  
  125:  endif

As far as I can tell, this overrides the fog color and influence:

Most of the time only one override is performed (cb12_v192.x is 0.0), but in this particular case its value is ~0.22, so the second override is applied as well.

 #ifdef OVERRIDE_FOG  
     
   // Override  
   float fog_influence = ret.paramsFog.w; // r0.w  
   
   float override1ColorScale = cb12_v189.x;  
   float override1ColorBias = cb12_v189.y;  
   float3 override1Color = cb12_v188.rgb;  
     
   float override1InfluenceScale = cb12_v189.z;  
   float override1InfluenceBias = cb12_v189.w;  
   float override1Influence = cb12_v188.w;  
     
   float override1ColorAmount = saturate(fog_influence * override1ColorScale + override1ColorBias);  
   float override1InfluenceAmount = saturate(fog_influence * override1InfluenceScale + override1InfluenceBias);    
     

   float4 paramsFogOverride;  
   paramsFogOverride.rgb = lerp(curr_col_fog, override1Color, override1ColorAmount ); // ***r5.xyz   
     
   float param1 = lerp(1.0, override1Influence, override1InfluenceAmount); // r0.x  
   paramsFogOverride.w = saturate(param1 * fog_influence ); // ** r5.w  
   
     
   const float extraFogOverride = cb12_v192.x;  
     
   [branch]   
   if (extraFogOverride > 0.0)  
   {  
     float override2ColorScale = cb12_v191.x;  
     float override2ColorBias = cb12_v191.y;  
     float3 override2Color = cb12_v190.rgb;  
     
     float override2InfluenceScale = cb12_v191.z;  
     float override2InfluenceBias = cb12_v191.w;  
     float override2Influence = cb12_v190.w;  
       
     float override2ColorAmount = saturate(fog_influence * override2ColorScale + override2ColorBias);  
     float override2InfluenceAmount = saturate(fog_influence * override2InfluenceScale + override2InfluenceBias);  
      

     float4 paramsFogOverride2;  
     paramsFogOverride2.rgb = lerp(curr_col_fog, override2Color, override2ColorAmount); // r3.xyz   
           
     float ov_param1 = lerp(1.0, override2Influence, override2InfluenceAmount); // r0.z  
     paramsFogOverride2.w = saturate(ov_param1 * fog_influence); // r3.w  
   
     paramsFogOverride = lerp(paramsFogOverride, paramsFogOverride2, extraFogOverride);  
   
   }  
   ret.paramsFog = paramsFogOverride;  
     
 #endif

Here is our finished scene without the fog override (first image), with one override (second image) and with the double override (third image, final result):

Regulation of ambient occlusion

The shader I found also did not use ambient occlusion at all. Let's take another look at the AO texture and the code that interests us:

  13:  ld_indexable(texture2d)(float,float,float,float) r0.x, r0.xyzw, t2.xyzw  
  14:  max r0.x, r0.x, cb3[1].x  
  15:  add r0.yzw, r1.xxyz, -cb12[0].xxyz  
  16:  dp3 r1.x, r0.yzwy, r0.yzwy  
  17:  sqrt r1.x, r1.x  
  18:  add r1.y, r1.x, -cb3[0].x  
  19:  add r1.zw, -cb3[0].xxxz, cb3[0].yyyw  
  20:  div_sat r1.y, r1.y, r1.z  
  21:  mad r1.y, r1.y, r1.w, cb3[0].z  
  22:  add r0.x, r0.x, l(-1.000000)  
  23:  mad r0.x, r1.y, r0.x, l(1.000000)

Perhaps this scene is not the best example, because we do not see the details on a distant island. However, let's take a look at the constant buffer, which is used to set the ambient occlusion value:

We start by loading AO from the texture, then execute the max instruction. In this scene, cb3_v1.x is very high (0.96888), which makes the AO very weak.

The next part of the code calculates the distance between the camera position and the pixel's position in world space.

I believe that the code sometimes speaks for itself, so let's look at HLSL, which does the bulk of this setup:

 float AdjustAmbientOcclusion(in float inputAO, in float worldToCameraDistance)  
 {  
   // *** Inputs *** //  
   const float aoDistanceStart = cb3_v0.x;  
   const float aoDistanceEnd = cb3_v0.y;  
   const float aoStrengthStart = cb3_v0.z;  
   const float aoStrengthEnd = cb3_v0.w;  
      
   // * Adjust AO  
   float aoDistanceIntensity = linstep( aoDistanceStart, aoDistanceEnd, worldToCameraDistance );  
   float aoStrength = lerp(aoStrengthStart, aoStrengthEnd, aoDistanceIntensity);   
   float adjustedAO = lerp(1.0, inputAO, aoStrength);  
     
   return adjustedAO;   
 }

The calculated distance between the camera and the world position is fed to the linstep function. We already know this function; it appeared in the cirrus clouds shader.

As you can see, the constant buffer holds the AO start / end distance values. The output of linstep determines the AO strength (whose start / end values also come from the cbuffer), and the strength determines the resulting AO.

A brief example: the pixel is far, for example, the distance is 500.

linstep returns 1.0;
aoStrength is equal to aoStrengthEnd;

As a result, the function returns lerp(1.0, inputAO, aoStrengthEnd): the input AO is pulled toward 1.0 and keeps approximately 77% (the end strength) of its original effect.

The incoming AO for this function was previously subjected to the max operation.
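The far-pixel example above can be checked numerically with a small Python sketch. The distance and strength values below are hypothetical stand-ins for the constant buffer contents (only the roughly 0.77 end strength is taken from the text), and linstep is written with a clamp because the assembly uses div_sat on line 20.

```python
def saturate(x):
    """Clamp x to [0, 1], like the HLSL intrinsic."""
    return min(max(x, 0.0), 1.0)


def linstep(lo, hi, v):
    """Saturated linear step, mirroring the div_sat in the assembly."""
    return saturate((v - lo) / (hi - lo))


def lerp(a, b, t):
    return a + (b - a) * t


def adjust_ambient_occlusion(input_ao, distance, dist_start, dist_end,
                             strength_start, strength_end):
    t = linstep(dist_start, dist_end, distance)
    strength = lerp(strength_start, strength_end, t)
    return lerp(1.0, input_ao, strength)


# Hypothetical constant-buffer values (not captured from the game).
far_ao = adjust_ambient_occlusion(0.5, 500.0, 20.0, 100.0, 1.0, 0.77)
near_ao = adjust_ambient_occlusion(0.5, 5.0, 20.0, 100.0, 1.0, 0.77)
print(far_ao, near_ao)
```

For the far pixel the input AO of 0.5 is weakened (moved closer to 1.0), while the near pixel keeps its input value unchanged.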

Putting it all together

Having obtained the color and influence for both the fog and the atmosphere, we can finally combine them.

We start by attenuating the influence with the resulting AO:

   ...
   FogResult fog = CalculateFog( worldPos, CameraPosition, fogStart, ao, false );  
      
   // Apply AO to influence  
   fog.paramsFog.w *= ao;  
   fog.paramsAerial.w *= ao; 
      
   // Mix fog with scene color  
   outColor = ApplyFog(fog, colorHDR);

All the magic happens in the ApplyFog function :

 float3 ApplyFog(FogResult fog, float3 color)  
 {  
   const float3 LuminanceFactors = float3(0.333f, 0.555f, 0.222f);  
   
   float3 aerialColor = dot(LuminanceFactors, color) * fog.paramsAerial.xyz;  
   color = lerp(color, aerialColor, fog.paramsAerial.w);  
   color = lerp(color, fog.paramsFog.xyz, fog.paramsFog.w);  
    
   return color.xyz;  
 }

First, we calculate the luminance of the pixels:

Then we multiply it by the color of the atmosphere:

Then we combine the HDR color with the color of the atmosphere:

The last step is to combine the intermediate result with the color of the fog:

That's all!
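As a quick numerical sanity check of the four steps above, here is a Python sketch of ApplyFog working on plain RGB tuples. The scene, atmosphere and fog colors are arbitrary illustrative values, not data from the game.

```python
LUMINANCE_FACTORS = (0.333, 0.555, 0.222)


def lerp3(a, b, t):
    """Component-wise lerp of two RGB tuples."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))


def apply_fog(color, aerial_rgb, aerial_w, fog_rgb, fog_w):
    # Step 1: luminance of the HDR pixel.
    luminance = sum(f * c for f, c in zip(LUMINANCE_FACTORS, color))
    # Step 2: tint the luminance with the atmosphere color.
    aerial_color = tuple(luminance * c for c in aerial_rgb)
    # Step 3: blend the scene toward the tinted luminance.
    color = lerp3(color, aerial_color, aerial_w)
    # Step 4: blend the intermediate result toward the fog color.
    return lerp3(color, fog_rgb, fog_w)


scene = (1.0, 0.5, 0.25)
no_fog = apply_fog(scene, (0.4, 0.5, 0.6), 0.0, (0.7, 0.7, 0.8), 0.0)
full_fog = apply_fog(scene, (0.4, 0.5, 0.6), 0.3, (0.7, 0.7, 0.8), 1.0)
print(no_fog, full_fog)
```

With both influences at zero the scene color passes through untouched, and with a fog influence of 1.0 the result is the fog color regardless of what the atmosphere step produced.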

Some debugging screenshots

Atmosphere influence

Atmosphere color

Fog influence

Fog color

Final scene without any fog

Final scene with atmospheric (aerial) fog only

Final scene with the main fog only

Final scene again with all fog, for ease of comparison

Summary

Much of the above is easier to follow with the shader in front of you; it is available here.

I can say with pleasure that this shader is exactly the same as the original one - it makes me very happy.

In general, the final result is highly dependent on the values passed to the shader. This is not a “magical” solution that gives perfect colors for the output, it requires many iterations and artists to make the final result look decent. I think it can be a long process, but after you complete it, the result will be very convincing, just like this sunset scene.

The Witcher 3 Sky Shader also uses fog calculations to create a smooth transition of colors near the horizon. However, a different set of density coefficients is passed to the sky shader.

Let me remind you - most of this shader was not created / analyzed by me. All acknowledgments should be sent to CD PROJEKT RED. Support them, they do an excellent job.

Part 3. Shooting Stars

In The Witcher 3 there is a small but curious detail - shooting stars. Interestingly, they do not seem to be in the Blood and Wine DLC.

In the video you can see how they look:

Let's see how this effect is achieved.

As you can see, the body of a shooting star is much brighter than the tail. This is an important property that we will use later.

Our agenda is quite familiar: first I will describe the general properties, then I will talk about topics related to geometry, and in the end we will move on to the pixel shader, where the most interesting things are happening.

1. General overview

Let's briefly describe what is happening.

Shooting stars are drawn in the forward pass, immediately after the sky dome, the sun and the moon:

DrawIndexed (720) - the sky dome,
DrawIndexed (2160) - the sphere for the sun / moon,
DrawIndexed (36) - irrelevant, looks like a sun occlusion box (?)
DrawIndexed (12) - the shooting stars
DrawIndexedInstanced (1116, 1) - cirrus clouds

Like cirrus clouds, each shooting star is drawn twice in a row.

Before the first draw call

Result of the first draw call

Result of the second draw call

In addition, as with many elements of this game's forward pass, the following blending state is used:

2. Geometry

From the point of view of geometry, the first thing to mention is that each shooting star is represented by a thin quadrilateral with texcoords: 4 vertices, 6 indices. This is the simplest quad possible.

Close-up of a shooting star quad.

An even closer view of the shooting star quad. The wireframe shows the line separating the two triangles.

Wait a minute, but there is DrawIndexed (12)! Does this mean that we draw two shooting stars at the same time?

Yes.

In this frame, one of the shooting stars is completely outside the view frustum.

Let's look at the assembler code for the vertex shader:

 vs_5_0  
    dcl_globalFlags refactoringAllowed  
    dcl_constantbuffer cb1[9], immediateIndexed  
    dcl_constantbuffer cb2[3], immediateIndexed  
    dcl_constantbuffer cb12[193], immediateIndexed  
    dcl_input v0.xyz  
    dcl_input v1.xyzw  
    dcl_input v2.xy  
    dcl_input v3.xy  
    dcl_input v4.xy  
    dcl_input v5.xyz  
    dcl_input v6.x  
    dcl_input v7.x  
    dcl_output o0.xyzw  
    dcl_output o1.xyzw  
    dcl_output o2.xy  
    dcl_output o3.xyzw  
    dcl_output_siv o4.xyzw, position  
    dcl_temps 5  
   0: mov r0.xyz, v0.xyzx  
   1: mov r0.w, l(1.000000)  
   2: dp4 r1.x, r0.xyzw, cb2[0].xyzw  
   3: dp4 r1.y, r0.xyzw, cb2[1].xyzw  
   4: dp4 r1.z, r0.xyzw, cb2[2].xyzw  
   5: add r0.x, v2.x, v2.y  
   6: add r0.y, -v2.y, v2.x  
   7: add r2.xyz, -r1.zxyz, cb1[8].zxyz  
   8: dp3 r0.z, r2.xyzx, r2.xyzx  
   9: rsq r0.z, r0.z  
  10: mul r2.xyz, r0.zzzz, r2.xyzx  
  11: dp3 r0.z, v5.xyzx, v5.xyzx  
  12: rsq r0.z, r0.z  
  13: mul r3.xyz, r0.zzzz, v5.xyzx  
  14: mul r4.xyz, r2.xyzx, r3.yzxy  
  15: mad r2.xyz, r2.zxyz, r3.zxyz, -r4.xyzx  
  16: dp3 r0.z, r2.xyzx, r2.xyzx  
  17: rsq r0.z, r0.z  
  18: mul r2.xyz, r0.zzzz, r2.xyzx  
  19: mad r0.z, v7.x, v6.x, l(1.000000)  
  20: mul r3.xyz, r0.zzzz, r3.xyzx  
  21: mul r3.xyz, r3.xyzx, v3.xxxx  
  22: mul r2.xyz, r2.xyzx, v3.yyyy  
  23: mad r0.xzw, r3.xxyz, r0.xxxx, r1.xxyz  
  24: mad r0.xyz, r2.xyzx, r0.yyyy, r0.xzwx  
  25: mov r0.w, l(1.000000)  
  26: dp4 o4.x, r0.xyzw, cb1[0].xyzw  
  27: dp4 o4.y, r0.xyzw, cb1[1].xyzw  
  28: dp4 o4.z, r0.xyzw, cb1[2].xyzw  
  29: dp4 o4.w, r0.xyzw, cb1[3].xyzw  
  30: add r0.xyz, r0.xyzx, -cb12[0].xyzx  
  31: dp3 r0.w, r0.xyzx, r0.xyzx  
  32: sqrt r0.w, r0.w  
  33: div r0.xyz, r0.xyzx, r0.wwww  
  34: add r0.w, r0.w, -cb12[22].z  
  35: max r0.w, r0.w, l(0)  
  36: min r0.w, r0.w, cb12[42].z  
  37: dp3 r0.x, cb12[38].xyzx, r0.xyzx  
  38: mul r0.y, abs(r0.x), abs(r0.x)  
  39: mad_sat r1.x, r0.w, l(0.002000), l(-0.300000)  
  40: mul r0.y, r0.y, r1.x  
  41: lt r1.x, l(0), r0.x  
  42: movc r1.yzw, r1.xxxx, cb12[39].xxyz, cb12[41].xxyz  
  43: add r1.yzw, r1.yyzw, -cb12[40].xxyz  
  44: mad r1.yzw, r0.yyyy, r1.yyzw, cb12[40].xxyz  
  45: movc r2.xyz, r1.xxxx, cb12[45].xyzx, cb12[47].xyzx  
  46: add r2.xyz, r2.xyzx, -cb12[46].xyzx  
  47: mad o0.xyz, r0.yyyy, r2.xyzx, cb12[46].xyzx  
  48: ge r0.y, r0.w, cb12[48].y  
  49: if_nz r0.y  
  50:  mad r0.y, r0.z, cb12[22].z, cb12[0].z  
  51:  mul r0.z, r0.w, r0.z  
  52:  mul r0.z, r0.z, l(0.062500)  
  53:  mul r1.x, r0.w, cb12[43].x  
  54:  mul r1.x, r1.x, l(0.062500)  
  55:  add r0.x, r0.x, cb12[42].x  
  56:  add r2.x, cb12[42].x, l(1.000000)  
  57:  div_sat r0.x, r0.x, r2.x  
  58:  add r2.x, -cb12[43].z, cb12[43].y  
  59:  mad r0.x, r0.x, r2.x, cb12[43].z  
  60:  add r0.y, r0.y, cb12[42].y  
  61:  mul r2.x, r0.x, r0.y  
  62:  mul r0.z, r0.x, r0.z  
  63:  mad r3.xyzw, r0.zzzz, l(16.000000, 15.000000, 14.000000, 13.000000), r2.xxxx  
  64:  max r3.xyzw, r3.xyzw, l(0, 0, 0, 0)  
  65:  add r3.xyzw, r3.xyzw, l(1.000000, 1.000000, 1.000000, 1.000000)  
  66:  div_sat r3.xyzw, r1.xxxx, r3.xyzw  
  67:  add r3.xyzw, -r3.xyzw, l(1.000000, 1.000000, 1.000000, 1.000000)  
  68:  mul r2.y, r3.y, r3.x  
  69:  mul r2.y, r3.z, r2.y  
  70:  mul r2.y, r3.w, r2.y  
  71:  mad r3.xyzw, r0.zzzz, l(12.000000, 11.000000, 10.000000, 9.000000), r2.xxxx  
  72:  max r3.xyzw, r3.xyzw, l(0, 0, 0, 0)  
  73:  add r3.xyzw, r3.xyzw, l(1.000000, 1.000000, 1.000000, 1.000000)  
  74:  div_sat r3.xyzw, r1.xxxx, r3.xyzw  
  75:  add r3.xyzw, -r3.xyzw, l(1.000000, 1.000000, 1.000000, 1.000000)  
  76:  mul r2.y, r2.y, r3.x  
  77:  mul r2.y, r3.y, r2.y  
  78:  mul r2.y, r3.z, r2.y  
  79:  mul r2.y, r3.w, r2.y  
  80:  mad r3.xyzw, r0.zzzz, l(8.000000, 7.000000, 6.000000, 5.000000), r2.xxxx  
  81:  max r3.xyzw, r3.xyzw, l(0, 0, 0, 0)  
  82:  add r3.xyzw, r3.xyzw, l(1.000000, 1.000000, 1.000000, 1.000000)  
  83:  div_sat r3.xyzw, r1.xxxx, r3.xyzw  
  84:  add r3.xyzw, -r3.xyzw, l(1.000000, 1.000000, 1.000000, 1.000000)  
  85:  mul r2.y, r2.y, r3.x  
  86:  mul r2.y, r3.y, r2.y  
  87:  mul r2.y, r3.z, r2.y  
  88:  mul r2.y, r3.w, r2.y  
  89:  mad r2.zw, r0.zzzz, l(0.000000, 0.000000, 4.000000, 3.000000), r2.xxxx  
  90:  max r2.zw, r2.zzzw, l(0, 0, 0, 0)  
  91:  add r2.zw, r2.zzzw, l(0.000000, 0.000000, 1.000000, 1.000000)  
  92:  div_sat r2.zw, r1.xxxx, r2.zzzw  
  93:  add r2.zw, -r2.zzzw, l(0.000000, 0.000000, 1.000000, 1.000000)  
  94:  mul r2.y, r2.z, r2.y  
  95:  mul r2.y, r2.w, r2.y  
  96:  mad r2.x, r0.z, l(2.000000), r2.x  
  97:  max r2.x, r2.x, l(0)  
  98:  add r2.x, r2.x, l(1.000000)  
  99:  div_sat r2.x, r1.x, r2.x  
  100:  add r2.x, -r2.x, l(1.000000)  
  101:  mul r2.x, r2.x, r2.y  
  102:  mad r0.x, r0.y, r0.x, r0.z  
  103:  max r0.x, r0.x, l(0)  
  104:  add r0.x, r0.x, l(1.000000)  
  105:  div_sat r0.x, r1.x, r0.x  
  106:  add r0.x, -r0.x, l(1.000000)  
  107:  mad r0.x, -r2.x, r0.x, l(1.000000)  
  108:  add r0.y, r0.w, -cb12[48].y  
  109:  mul_sat r0.y, r0.y, cb12[48].z  
  110: else  
  111:  mov r0.xy, l(1.000000, 0.000000, 0.000000, 0.000000)  
  112: endif  
  113: log r0.x, r0.x  
  114: mul r0.z, r0.x, cb12[42].w  
  115: exp r0.z, r0.z  
  116: mul r0.z, r0.z, r0.y  
  117: mul r0.x, r0.x, cb12[48].x  
  118: exp r0.x, r0.x  
  119: mul o0.w, r0.x, r0.y  
  120: mad_sat r0.xy, r0.zzzz, cb12[189].xzxx, cb12[189].ywyy  
  121: add r2.xyz, -r1.yzwy, cb12[188].xyzx  
  122: mad r2.xyz, r0.xxxx, r2.xyzx, r1.yzwy  
  123: add r0.x, cb12[188].w, l(-1.000000)  
  124: mad r0.x, r0.y, r0.x, l(1.000000)  
  125: mul_sat r2.w, r0.x, r0.z  
  126: lt r0.x, l(0), cb12[192].x  
  127: if_nz r0.x  
  128:  mad_sat r0.xy, r0.zzzz, cb12[191].xzxx, cb12[191].ywyy  
  129:  add r3.xyz, -r1.yzwy, cb12[190].xyzx  
  130:  mad r1.xyz, r0.xxxx, r3.xyzx, r1.yzwy  
  131:  add r0.x, cb12[190].w, l(-1.000000)  
  132:  mad r0.x, r0.y, r0.x, l(1.000000)  
  133:  mul_sat r1.w, r0.x, r0.z  
  134:  add r0.xyzw, -r2.xyzw, r1.xyzw  
  135:  mad o1.xyzw, cb12[192].xxxx, r0.xyzw, r2.xyzw  
  136: else  
  137:  mov o1.xyzw, r2.xyzw  
  138: endif  
  139: mov o3.xyzw, v1.xyzw  
  140: mov o2.xy, v4.yxyy  
  141: ret

Here, the fog calculation immediately attracts attention (lines 30-138). Calculating fog per vertex makes sense for performance reasons. In addition, we do not need that much fog accuracy here - meteoroids usually fly over Geralt's head and do not reach the horizon.

The atmospheric parameters (rgb = color, a = influence) are stored in o0.xyzw, and the fog parameters in o1.xyzw.

o2.xy (line 140) is just texcoords.
o3.xyzw (line 139) is irrelevant.

Now let's say a few words about calculating the position in world space. The vertex shader performs billboarding. First of all, the input data for the billboards comes from the vertex buffer - let's take a look at it.

The first piece of data is Position:

As mentioned above, here we have two quads: 8 vertices, 12 indices.

But why is the position the same for every vertex of a quad? Quite simple - this is the position of the center of the quad.

Further, each vertex has an offset from the center to the edge of the quad:

This means that every shooting star has a size of (400, 3) units in world space (on the XY plane; in The Witcher 3 the Z axis points up).

The last element that each vertex has is a unit direction vector in world space that controls the motion of a shooting star:

Since the data comes from the CPU, it is difficult to understand how it is calculated.

Now let's move on to the billboarding code. The idea is quite simple - first you get a unit vector from the center of the quad to the camera:

   7: add r2.xyz, -r1.zxyz, cb1[8].zxyz  
   8: dp3 r0.z, r2.xyzx, r2.xyzx  
   9: rsq r0.z, r0.z  
  10: mul r2.xyz, r0.zzzz, r2.xyzx

Then we get the unit tangent vector that controls the motion of the shooting star.

Given that this vector is already normalized on the CPU side, the normalization here is redundant.

  11: dp3 r0.z, v5.xyzx, v5.xyzx  
  12: rsq r0.z, r0.z  
  13: mul r3.xyz, r0.zzzz, v5.xyzx

With these two vectors available, a cross product gives the bitangent vector, which is perpendicular to both of them.

  14: mul r4.xyz, r2.xyzx, r3.yzxy  
  15: mad r2.xyz, r2.zxyz, r3.zxyz, -r4.xyzx  
  16: dp3 r0.z, r2.xyzx, r2.xyzx  
  17: rsq r0.z, r0.z  
  18: mul r2.xyz, r0.zzzz, r2.xyzx

Now we have normalized vectors tangent (r3.xyz) and bitangent (r2.xyz).

Let's introduce Xsize and Ysize, corresponding to the TEXCOORD1 input element, for example (-200, 1.50).

The final calculation of the position in world space is performed as follows:

  19: mad r0.z, v7.x, v6.x, l(1.000000)  
  20: mul r3.xyz, r0.zzzz, r3.xyzx  
  21: mul r3.xyz, r3.xyzx, v3.xxxx  
  22: mul r2.xyz, r2.xyzx, v3.yyyy  
  23: mad r0.xzw, r3.xxyz, r0.xxxx, r1.xxyz  
  24: mad r0.xyz, r2.xyzx, r0.yyyy, r0.xzwx  
  25: mov r0.w, l(1.000000)

Given that r0.x, r0.y and r0.z are equal to 1.0, the final calculation is simplified:

worldSpacePosition = quadCenter + tangent * Xsize + bitangent * Ysize
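The billboarding steps above can be sketched in Python as follows. This is a simplified model under the same assumption as the formula (the extra scale factors equal 1.0); the function names and the sample positions are illustrative, and the comments map each step to the assembly lines it mirrors.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def billboard_vertex(quad_center, camera_pos, direction, x_size, y_size):
    to_camera = normalize(sub(camera_pos, quad_center))   # lines 7-10
    tangent = normalize(direction)                        # lines 11-13
    bitangent = normalize(cross(to_camera, tangent))      # lines 14-18
    # worldSpacePosition = quadCenter + tangent * Xsize + bitangent * Ysize
    position = tuple(c + t * x_size + b * y_size
                     for c, t, b in zip(quad_center, tangent, bitangent))
    return position, tangent, bitangent, to_camera


# Hypothetical quad 1000 units above the camera, moving along +X.
position, tangent, bitangent, to_camera = billboard_vertex(
    (0.0, 0.0, 1000.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), -200.0, 1.5)
print(position)
```

Because the bitangent is perpendicular to both the motion direction and the direction to the camera, the thin quad always shows its wide side to the viewer. The sketch does not handle the degenerate case where the star moves straight toward the camera and the cross product vanishes.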

The last part is a simple multiplication of the world space position by the view-projection matrix to obtain SV_Position:

  26: dp4 o4.x, r0.xyzw, cb1[0].xyzw  
  27: dp4 o4.y, r0.xyzw, cb1[1].xyzw  
  28: dp4 o4.z, r0.xyzw, cb1[2].xyzw  
  29: dp4 o4.w, r0.xyzw, cb1[3].xyzw

3. Pixel Shader

As stated in the General overview section, the following blending state is used:

FinalColor = SrcColor * One + DestColor * (1.0 - SrcAlpha) =
FinalColor = SrcColor + DestColor * (1.0 - SrcAlpha)

where SrcColor and SrcAlpha are the .rgb and .a components output by the pixel shader, respectively, and DestColor is the .rgb color currently in the render target. The main value that controls transparency is SrcAlpha. Many forward shaders in the game calculate it as opacity and apply it at the end as follows:

return float4( color * opacity, opacity )

The shooting star shader is no exception. Following this pattern, we consider three cases in which the opacity is 1.0, 0.1, and 0.0.

a) opacity = 1.0

FinalColor = color * opacity + DestColor * (1.0 - opacity) =
FinalColor = color = SrcColor

b) opacity = 0.1

FinalColor = color * opacity + DestColor * (1.0 - opacity) =
FinalColor = 0.1 * color + 0.9 * DestColor

c) opacity = 0.0

FinalColor = color * opacity + DestColor * (1.0 - opacity) =
FinalColor = DestColor
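The three cases can be reproduced with a few lines of Python that model the blend equation on the CPU. The star and sky colors are arbitrary illustrative values.

```python
def blend(color, opacity, dest):
    """Model of the blend state: Src * One + Dest * (1 - SrcAlpha)."""
    src_rgb = tuple(c * opacity for c in color)  # shader output .rgb
    src_alpha = opacity                          # shader output .a
    return tuple(s + d * (1.0 - src_alpha) for s, d in zip(src_rgb, dest))


star = (10.0, 10.0, 10.0)  # illustrative bright star color
sky = (0.2, 0.3, 0.5)      # illustrative color already in the render target

for opacity in (1.0, 0.1, 0.0):
    print(opacity, blend(star, opacity, sky))
```

Since the shader already multiplies the color by opacity, the blend state can use One for the source factor instead of SrcAlpha.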

The fundamental idea of this shader is to model and use an opacity function, opacity(x), which controls the opacity of a pixel along the shooting star. The main requirement is that the opacity should reach its maximum at the end of the star (its “body”) and smoothly fade to 0.0 (toward its “tail”).

When we begin to understand the assembler code of the pixel shader, this becomes obvious:

 ps_5_0  
    dcl_globalFlags refactoringAllowed  
    dcl_constantbuffer cb0[10], immediateIndexed  
    dcl_constantbuffer cb2[3], immediateIndexed  
    dcl_constantbuffer cb4[2], immediateIndexed  
    dcl_input_ps linear v0.xyzw  
    dcl_input_ps linear v1.xyzw  
    dcl_input_ps linear v2.y  
    dcl_input_ps linear v3.w  
    dcl_output o0.xyzw  
    dcl_temps 4  
   0: mov_sat r0.x, v2.y  
   1: ge r0.y, r0.x, l(0.052579)  
   2: ge r0.z, l(0.965679), r0.x  
   3: and r0.y, r0.z, r0.y  
   4: if_nz r0.y  
   5:  ge r0.y, l(0.878136), r0.x  
   6:  add r0.z, r0.x, l(-0.052579)  
   7:  mul r1.w, r0.z, l(1.211303)  
   8:  mov_sat r0.z, r1.w  
   9:  mad r0.w, r0.z, l(-2.000000), l(3.000000)  
  10:  mul r0.z, r0.z, r0.z  
  11:  mul r0.z, r0.z, r0.w  
  12:  mul r2.x, r0.z, l(0.084642)  
  13:  mov r1.yz, l(0.000000, 0.000000, 0.084642, 0.000000)  
  14:  movc r2.yzw, r0.yyyy, r1.yyzw, l(0.000000, 0.000000, 0.000000, 0.500000)  
  15:  not r0.z, r0.y  
  16:  if_z r0.y  
  17:   ge r0.y, l(0.924339), r0.x  
  18:   add r0.w, r0.x, l(-0.878136)  
  19:   mul r1.w, r0.w, l(21.643608)  
  20:   mov_sat r0.w, r1.w  
  21:   mad r3.x, r0.w, l(-2.000000), l(3.000000)  
  22:   mul r0.w, r0.w, r0.w  
  23:   mul r0.w, r0.w, r3.x  
  24:   mad r1.x, r0.w, l(0.889658), l(0.084642)  
  25:   mov r1.yz, l(0.000000, 0.084642, 0.974300, 0.000000)  
  26:   movc r2.xyzw, r0.yyyy, r1.xyzw, r2.xyzw  
  27:  else  
  28:   mov r2.y, l(0)  
  29:   mov r0.y, l(-1)  
  30:  endif  
  31:  not r0.w, r0.y  
  32:  and r0.z, r0.w, r0.z  
  33:  if_nz r0.z  
  34:   ge r0.y, r0.x, l(0.924339)  
  35:   add r0.x, r0.x, l(-0.924339)  
  36:   mul r1.w, r0.x, l(24.189651)  
  37:   mov_sat r0.x, r1.w  
  38:   mad r0.z, r0.x, l(-2.000000), l(3.000000)  
  39:   mul r0.x, r0.x, r0.x  
  40:   mul r0.x, r0.x, r0.z  
  41:   mad r1.x, r0.x, l(-0.974300), l(0.974300)  
  42:   mov r1.yz, l(0.000000, 0.974300, 0.000000, 0.000000)  
  43:   movc r2.xyzw, r0.yyyy, r1.xyzw, r2.xyzw  
  44:  endif  
  45: else  
  46:  mov r2.yzw, l(0.000000, 0.000000, 0.000000, 0.500000)  
  47:  mov r0.y, l(0)  
  48: endif  
  49: mov_sat r2.w, r2.w  
  50: mad r0.x, r2.w, l(-2.000000), l(3.000000)  
  51: mul r0.z, r2.w, r2.w  
  52: mul r0.x, r0.z, r0.x  
  53: add r0.z, -r2.y, r2.z  
  54: mad r0.x, r0.x, r0.z, r2.y  
  55: movc r0.x, r0.y, r2.x, r0.x  
  56: mad r0.y, cb4[1].x, -cb0[9].w, l(1.000000)  
  57: mul_sat r0.y, r0.y, v3.w  
  58: mul r0.x, r0.y, r0.x  
  59: mul r0.yzw, cb2[2].xxyz, cb4[0].xxxx  
  60: mul r0.x, r0.x, cb2[2].w  
  61: dp3 r1.x, l(0.333000, 0.555000, 0.222000, 0.000000), r0.yzwy  
  62: mad r1.xyz, r1.xxxx, v0.xyzx, -r0.yzwy  
  63: mad r0.yzw, v0.wwww, r1.xxyz, r0.yyzw  
  64: add r1.xyz, -r0.yzwy, v1.xyzx  
  65: mad r0.yzw, v1.wwww, r1.xxyz, r0.yyzw  
  66: mul o0.xyz, r0.xxxx, r0.yzwy  
  67: mov o0.w, r0.x  
  68: ret

In general, the shader is a bit overcomplicated and it was hard for me to figure out what was going on in it. For example, where did all the values like 1.211303, 21.643608 and 24.189651 come from?

An opacity function needs one input value. This part is quite simple - the texcoord in the range [0, 1] (line 0) is useful here, because it lets us apply the function along the entire length of the meteoroid.

The opacity function has three segments / intervals defined by four control points:

   // current status: no idea how these are generated  
   const float controlPoint0 = 0.052579;  
   const float controlPoint1 = 0.878136;  
   const float controlPoint2 = 0.924339;  
   const float controlPoint3 = 0.965679;

I have no idea how they were selected / calculated.

As we can see from the assembler code, the first condition is just checking if the input value is in the range [controlPoint0 - controlPoint3]. If not, then the opacity is just 0.0.

   // Input for the opacity function
   float y = saturate(Input.Texcoords.y);  // r0.x
     
   // Value of opacity function.  
   // 0 - no change  
   // 1 - full color  
   float opacity = 0.0;  
     
   [branch]   
   if (y >= controlPoint0 && y <= controlPoint3)  
   {  
      ...

Deciphering the assembly below is necessary if we want to understand how the opacity function works:

   6: add r0.z, r0.x, l(-0.052579)   
   7: mul r1.w, r0.z, l(1.211303)   
   8: mov_sat r0.z, r1.w   
   9: mad r0.w, r0.z, l(-2.000000), l(3.000000)   
  10: mul r0.z, r0.z, r0.z   
  11: mul r0.z, r0.z, r0.w   
  12: mul r2.x, r0.z, l(0.084642)

Line 9 has the coefficients '-2.0' and '3.0', which hints at the use of the smoothstep function. Yes, this is a good guess.

The HLSL smoothstep function, with the prototype ret smoothstep(min, max, x), always clamps x to [min, max]. In assembly terms this means subtracting min from the input value (that is, from r0.z on line 9), but there is nothing like that in the code. For max, it means multiplying the input value, but there is no 'mul_sat' in the code either. Instead, there is 'mov_sat'. This tells us that the min and max arguments of smoothstep are 0 and 1.

Now we know that x must be in the interval [0, 1]. As stated above, the opacity function has three segments. This strongly suggests that the code determines where we are within the interval [segmentStart - segmentEnd].

The answer is the linstep function!

 float linstep(float min, float max, float v)  
 {  
   return ( (v-min) / (max-min) );  
 }

For example, take the first segment: [0.052579 - 0.878136]. The subtraction is on line 6. The division is replaced with a multiplication by the reciprocal: 1.0 / (0.878136 - 0.052579) = 1.0 / 0.825557 = ~1.211303.
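The same reasoning can be checked for all three segments at once. The short Python sketch below compares the reciprocal of each segment length with the multipliers that appear on lines 7, 19 and 36 of the assembly; the variable names are mine.

```python
control_points = [0.052579, 0.878136, 0.924339, 0.965679]
shader_constants = [1.211303, 21.643608, 24.189651]  # from lines 7, 19, 36

# 1 / (segmentEnd - segmentStart) for each pair of neighbouring control points.
reciprocals = [1.0 / (right - left)
               for left, right in zip(control_points, control_points[1:])]

for value, expected in zip(reciprocals, shader_constants):
    print(f"{value:.6f} vs {expected:.6f}")
```

The values agree up to the rounding of the printed control points, which supports the idea that the compiler folded each linstep division into a constant multiplier.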

The result of smoothstep is in the range [0, 1]. The multiplication on line 12 is the weight of the segment. Each segment has its own weight, allowing you to control the maximum opacity of this particular segment.

This means that for the first segment [0.052579 - 0.878136], the opacity is in the range [0 - 0.084642].

An HLSL function that calculates opacity for an arbitrary segment can be written as follows:

 float getOpacityFunctionValue(float x, float cpLeft, float cpRight, float weight)  
 {  
   float val = smoothstep( 0, 1, linstep(cpLeft, cpRight, x) );  
   return val * weight;  
 }

So, the whole point is simply to call this function for the corresponding segment.

Take a look at the weights:

   const float weight0 = 0.084642;  
   const float weight1 = 0.889658;  
   const float weight2 = 0.974300; // note: weight0+weight1 = weight2

According to the assembler code, the opacity (x) function is calculated as follows:

   float opacity = 0.0;

   [branch]   
   if (y >= controlPoint0 && y <= controlPoint3)  
   {  
     // Range of v: [0, weight0]  
     float v = getOpacityFunctionValue(y, controlPoint0, controlPoint1, weight0);  
     opacity = v;  
     
     [branch]  
     if ( y >= controlPoint1 )  
     {  
       // Range of v: [0, weight1]  
       float v = getOpacityFunctionValue(y, controlPoint1, controlPoint2, weight1);  
       opacity = weight0 + v;  
   
       [branch]  
       if (y >= controlPoint2)  
       {  
         // Range of v: [0, weight2]  
         float v = getOpacityFunctionValue(y, controlPoint2, controlPoint3, weight2);
         opacity = weight2 - v;          
       }  
     }  
   }

Here is a graph of the opacity function. You can easily see a sharp increase in opacity, indicating the beginning of the body of a shooting star:

Graph of the opacity function.

Red channel - opacity value.
Green channel - control points.
Blue channel - weights.
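For readers who want to plot the curve themselves, here is a Python sketch of the opacity function. It uses the control points and weights listed above, but flattens the nested branches into a chain of range checks, which should give the same values; the helper names are mine.

```python
CONTROL_POINTS = (0.052579, 0.878136, 0.924339, 0.965679)
WEIGHTS = (0.084642, 0.889658, 0.974300)


def smoothstep01(t):
    """smoothstep(0, 1, t): clamp, then apply 3t^2 - 2t^3."""
    t = min(max(t, 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def segment_value(x, cp_left, cp_right, weight):
    return smoothstep01((x - cp_left) / (cp_right - cp_left)) * weight


def opacity(y):
    cp0, cp1, cp2, cp3 = CONTROL_POINTS
    w0, w1, w2 = WEIGHTS
    if y < cp0 or y > cp3:
        return 0.0                                       # outside the star
    if y < cp1:
        return segment_value(y, cp0, cp1, w0)            # long, faint tail
    if y < cp2:
        return w0 + segment_value(y, cp1, cp2, w1)       # sharp rise to the body
    return w2 - segment_value(y, cp2, cp3, w2)           # quick fade at the tip


for y in (0.0, 0.5, 0.9, 0.924339, 0.95, 1.0):
    print(f"{y:.6f} -> {opacity(y):.6f}")
```

The function peaks at controlPoint2 with a value of weight2 and returns to zero at controlPoint3, so most of the quad's length is taken up by the faint tail.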

After calculating the opacity, everything else is just the finishing touches. Then there are additional multiplications: the opacity of the stars, the color of the shooting star and the influence of fog. As usual in TW3 shaders, you can also find redundant multiplications by 1.0 here:

   // cb4_v1.x = 1.0  
   float starsOpacity = 1.0 - cb0_v9.w * cb4_v1.x;    
   opacity *= starsOpacity;  

   // Calculate color of a shooting star  
   // cb4_v0.x = 10.0
   // cb2_v2.rgb = (1.0, 1.0, 1.0)
   float3 color = cb2_v2.rgb * cb4_v0.x;
     
   // cb2_v2.w = 1  
   opacity *= cb2_v2.w;
     
   FogResult fr = { Input.FogParams, Input.AerialParams };  
   color = ApplyFog(fr, color);
     
   return float4( color*opacity, opacity);  
 }

4. Summary

The main difficulty lies in the part with the opacity function. After decoding it, everything else is quite simple to understand.

I said above that the pixel shader is a bit overcomplicated. In fact, we only care about the value of the opacity(x) function, which is stored in r2.x (starting at line 49). However, the opacity function in the assembly code creates three additional variables: minRange (r2.y), maxRange (r2.z) and value (r2.w). All of them are parameters used to calculate the opacity when opacity(x) is not used:

lerp( minRange, maxRange, smoothstep(0, 1, value) );

In fact, the final opacity value is selected by the conditional move on line 55: if the input value x is in the range [controlPoint0 - controlPoint3], the opacity function is used, so r2.x is selected. Otherwise, when x is outside the interval, the opacity is taken from r0.x, that is, calculated according to the equation above.

I debugged a few pixels outside the interval [controlPoint0 - controlPoint3], and the final opacity always turned out to be zero.
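This observation agrees with the constants in the listing. Outside the interval, line 46 sets minRange, maxRange and value to 0.0, 0.0 and 0.5, and a short Python sketch of the fallback equation (helper names are mine) shows what that produces:

```python
def smoothstep01(t):
    """smoothstep(0, 1, t): clamp, then apply 3t^2 - 2t^3."""
    t = min(max(t, 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def lerp(a, b, t):
    return a + (b - a) * t


# r2.y, r2.z and r2.w as written by line 46 of the pixel shader assembly.
min_range, max_range, value = 0.0, 0.0, 0.5
fallback_opacity = lerp(min_range, max_range, smoothstep01(value))
print(fallback_opacity)
```

Since both ends of the range are zero, the interpolation factor does not matter and the fallback opacity cannot be anything other than zero.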

That's all for today. And, as always, thanks for reading.
