Early Z-Cull with Clamped Depth


Check out these pool balls from the PSN game Hustle Kings. They use a neat trick: they render a quad in front of where the ball should be, then essentially do a ray-sphere intersection in the pixel shader. It looks pretty good. You can find a more detailed explanation at the VooFoo Studios website.

In the future, I’m pretty bullish on techniques that do some kind of mini raytracing inside a pixel shader. That’s a longer post. But the question becomes what to do with early z-cull for pixel shaders that change the depth. Here is a quick picture showing the problem.

Suppose that we render the green shape first. Now we are going to render the red sphere. We can do that by rendering a cube that bounds it (in blue) and then using a pixel shader that does a raytrace with the sphere. If we want to have spheres intersect other objects, we need that pixel shader to change the output depth.
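To make the raytrace step concrete, here is a minimal CPU-side sketch in Python of the per-pixel ray-sphere test. It is not VooFoo's shader; the function name and parameters are made up for illustration, and it assumes a unit-length ray direction and a camera outside the sphere.

```python
import math

def ray_sphere_depth(origin, direction, center, radius):
    """Distance along a unit-length ray to the nearest sphere hit, or None on a miss.

    Hypothetical helper for illustration only; assumes `direction` is
    normalized and the ray origin lies outside the sphere.
    """
    # Vector from the sphere center to the ray origin.
    oc = [o - c for o, c in zip(origin, center)]
    # Solve t^2 + 2*b*t + c = 0 (the leading coefficient is 1 for a unit direction).
    b = sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None  # the ray misses the sphere: this pixel would be discarded
    t = -b - math.sqrt(disc)  # nearer of the two roots
    return t if t >= 0.0 else None

# A ray looking down +z at a unit sphere centered 5 units away.
hit = ray_sphere_depth((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 5.0), 1.0)
```

In a real pixel shader the hit distance would be converted to a depth-buffer value and written as the pixel's output depth, which is what lets the sphere intersect other geometry correctly.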

This causes a problem for early z-cull. Since the GPU does not know if the new depth will be closer or farther than the original depth, the GPU can’t discard any pixels with early z-cull.

We could solve this with a hypothetical extension that I’m tentatively calling “Clamped Depth”. Basically, we could have a state which says that when a pixel shader writes depth, the output depth will be forced to be at least as far as the original sample. That way, we could still perform early z-cull on pixel shaders that write depth.
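Here is a rough single-pixel simulation of how that state could behave, written in Python. Everything in it is an assumption for illustration: the function name, the convention that larger depth means farther away, and a less-than depth test. It is not any real GPU or API behavior.

```python
def shade_with_clamped_depth(stored_depth, raster_depth, shader_depth):
    """Simulate one pixel under the hypothetical "Clamped Depth" state.

    Assumes larger depth = farther and a less-than depth test.
    Returns (new_stored_depth, shader_ran).
    """
    # Early z-cull: the rasterized bounding geometry is already behind what is
    # in the depth buffer. The shader can only push depth farther, so the
    # pixel can never pass and the shader does not need to run.
    if raster_depth >= stored_depth:
        return stored_depth, False

    # The shader runs, but its output is clamped to be at least as far as
    # the rasterized sample.
    out_depth = max(shader_depth, raster_depth)

    # Normal late depth test against the clamped value.
    if out_depth < stored_depth:
        return out_depth, True
    return stored_depth, True

# Bounding cube is behind the green shape: culled early, shader never runs.
culled = shade_with_clamped_depth(0.5, 0.6, 0.7)
# Bounding cube is in front and the sphere surface is too: depth is written.
visible = shade_with_clamped_depth(0.5, 0.2, 0.3)
```

The clamp is what makes the early rejection safe: without it, a shader could write a depth closer than the rasterized sample and a culled pixel might have been visible after all.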

For the record, I am definitely not the only person who has had this idea. And I haven’t paid enough attention to the DX11 changes, so for all I know that feature may already be in there. Also, I think I remember a poster at Siggraph 2007 with the exact same idea.

So far, it hasn’t been an issue since there aren’t that many cool techniques that require depth-changing shaders. But I could definitely see them becoming viable in the next 5-10 years, so I really hope that we have that option on the PS4/Xbox 720. And of course, the devs at VooFoo Studios have shown that for certain cases, it is viable today.

8 Responses to “Early Z-Cull with Clamped Depth”

  1. Yes, DX11 has this feature (“conservative depth”). See here: gamasutra article or GDC presentation, slide 7


  2. You should read the RSX programming guide more carefully. :-)


  3. To get early z-culling for the sphere you could just use depth bound testing on nVidia hardware.

    You probably already use this on your titles, but it is also supported on a lot of nVidia cards on PC using some creative use of some DX9 render states.

    Check out page 55:

    http://developer.download.nvidia.com/GPU_Programming_Guide/GPU_Programming_Guide_G80.pdf


  4. Yay!!! It’s good to be wrong. (-:


  5. The DX11 equivalent of conservative depth on OpenGL: GL_AMD_conservative_depth.

    Also, summary of various vendor specific D3D9 hacks, including the above mentioned “nvidia depth bounds”: D3D9 GPU Hacks. OpenGL equivalent: EXT_depth_bounds_test.


  6. [...] that came in D3D10.1). Not too long ago one of these neat tricks came to my attention by way of John Hable’s blog, which inspired me to dig around a bit and try out some of other neat tricks I was missing out on. [...]


  7. You can actually already do that on X360. We have better control of the low-level GPU state, and if we know that the depth written by the pixel shader is behind the one from the rasterized triangle, it is safe to keep Hierarchical Z.

    A typical example is a depth sprite. Instead of drawing the quad at the depth center and biasing the depth from -1 to 1, you shift the vertices in front of the sprite and bias the pixel depth from 0 to 1!


  8. Nicolas: That’s cool! Would have been nice to know when I used to work on the 360. (-:

