wgpu/WebGL2: possible to force early stencil/depth test?

In WebGPU / WebGL2, is it possible to force the early stencil/depth test to run before the fragment shader?

XY problem: I'm writing a Rust/wasm32 app that uses WebGPU / WebGL2. I have a fragment shader that is rather expensive and, if possible, I'd prefer to force the early stencil/depth test.

I'm not familiar with the WebGPU API, but in general you can use deferred rendering to reduce the number of fragment shader invocations when the early z test is not available.

I should have been clearer, my fragment shaders are of the form:

if (expensive ray trace) {
  out_color = simple_expr;
} else {
  discard;
}
At this point, I don't think deferred shading helps me.

in that case, I'm afraid I'm not qualified to give any useful advice; I have no experience at all with how ray tracing pipelines work.

just saying, for traditional rasterizer pipelines, deferred shading is a matter of splitting the rendering process into a geometry pass and a color pass. unless your per-fragment geometry data (which is the input to the color calculation) is unreasonably large and your g-buffer would explode, it should generally be possible to convert a forward rendering pipeline into a deferred pipeline, but the actual savings from the geometry pass and the amount of work required to modify the code depend heavily on the complexity of the original rendering pipeline.
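to make the split concrete, here's a CPU-side sketch of the two passes (all the names — `GFragment`, `shade`, the layout of the g-buffer — are made up for illustration; this is not a real GPU pipeline):

```rust
// Per-pixel geometry data recorded by the geometry pass.
// The fields here are an illustrative payload, not a prescribed layout.
#[derive(Clone, Copy, Default)]
struct GFragment {
    depth: f32,
    normal: [f32; 3],
    albedo: [f32; 3],
}

// Geometry pass: rasterize everything, but only *record* the nearest
// fragment's inputs per pixel -- no shading is computed here.
fn geometry_pass(
    fragments: &[(usize, GFragment)],
    width: usize,
    height: usize,
) -> Vec<Option<GFragment>> {
    let mut gbuffer: Vec<Option<GFragment>> = vec![None; width * height];
    for &(pixel, frag) in fragments {
        let nearer = match gbuffer[pixel] {
            Some(existing) => frag.depth < existing.depth, // z-test: keep nearer
            None => true,
        };
        if nearer {
            gbuffer[pixel] = Some(frag);
        }
    }
    gbuffer
}

// Color pass: the expensive shading runs at most once per pixel,
// no matter how many overlapping fragments the geometry pass saw.
fn color_pass(gbuffer: &[Option<GFragment>]) -> Vec<[f32; 3]> {
    gbuffer
        .iter()
        .map(|slot| match slot {
            Some(f) => shade(f),   // expensive work, once per covered pixel
            None => [0.0; 3],      // background
        })
        .collect()
}

// Stand-in for the expensive per-pixel calculation.
fn shade(f: &GFragment) -> [f32; 3] {
    let n_dot_l = f.normal[2].max(0.0);
    [f.albedo[0] * n_dot_l, f.albedo[1] * n_dot_l, f.albedo[2] * n_dot_l]
}
```

the point of the split is visible in `color_pass`: overdraw is resolved before any expensive shading runs, so each covered pixel pays for `shade` exactly once.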

I'm not 100% sure, but I think:

  1. if there were no 'discard', what you described above would work

  2. because of the 'discard', it is not easy to split this into two stages (more precisely: we can split into two stages, but do not save much, as everything up to the 'discard' has to run in the first stage)

that's not entirely true: the first pass outputs to g-buffers, which store the input data to the "(expensive ray trace)" calculation without doing the calculation. as I said, depending on the complexity of your actual lighting calculation, it might not be practical to store all the required data in g-buffers. but basically it's a space-vs-time trade-off.
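to put a rough number on the space side of that trade-off (the per-pixel payload and the resolution below are made-up examples, not anything from your pipeline):

```rust
// Back-of-envelope g-buffer size. As an illustrative payload, suppose the
// g-buffer stores a ray origin + direction per pixel: 6 x f32 = 24 bytes.
fn gbuffer_bytes(width: u64, height: u64, bytes_per_pixel: u64) -> u64 {
    width * height * bytes_per_pixel
}

// At 1920x1080 that's 1920 * 1080 * 24 = 49_766_400 bytes,
// i.e. about 47.5 MiB for a single g-buffer.
```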

not every forward rendering pipeline can be easily converted to deferred shading. as a starting point it seems you only need to change the fragment shader, but often you end up rewriting the entire rendering pipeline.

The problem is, due to the discard, we don't know which triangle's data to write to the g-buffers until we execute the fragment shader.

I'm not sure what you're trying to achieve specifically here, but you can't reach the discard without evaluating the if!

If you simply need to reduce overdraw, then yes, you're looking at deferred rendering, or perhaps its modern-engine fancy-pants version, clustered rendering.

This seems to be a pretty good overview of the concepts and implementation: A Primer On Efficient Rendering Algorithms & Clustered Shading.

You might be good with even the very first step of a z-prepass!
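to sketch what a z-prepass buys you (again a CPU-side illustration with made-up names, not a real GPU pipeline — the assumption is that the depth-only pass is much cheaper than the shading):

```rust
// Pass 1 (depth only): cheap, no shading -- record the nearest depth per pixel.
fn depth_prepass(frags: &[(usize, f32)], pixels: usize) -> Vec<f32> {
    let mut depth = vec![f32::INFINITY; pixels];
    for &(pixel, z) in frags {
        if z < depth[pixel] {
            depth[pixel] = z;
        }
    }
    depth
}

// Pass 2 (color): with the depth compare set to "equal", only the front-most
// fragment per pixel survives to run the expensive shader. Here we just count
// how many expensive invocations would remain.
fn shading_pass(frags: &[(usize, f32)], depth: &[f32]) -> usize {
    frags.iter().filter(|&&(pixel, z)| z == depth[pixel]).count()
}
```

with heavy overdraw, `shading_pass` runs the expensive work once per covered pixel instead of once per fragment.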


with the assumption that writing to g-buffers is much cheaper than the lighting calculation, I think you can write out the g-buffers unconditionally and rely on the z-buffer for the visibility test, with no discard involved.

different passes have different purposes.

this is false; the 'light calculation' is non-existent, and the expensive part is deciding whether to write to the g-buffer or to discard

How reliable is the claim of 1 texture lookup == 200 arith ops ?

I'm sure this is heavily hardware dependent, and especially on cache hit rate. But a back-of-the-envelope calculation would be that you have 1 arithmetic op per cycle, which means at 2 GHz, 200 cycles works out to 100 ns per lookup, which sounds reasonable.
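plugging the numbers in (the 1-op-per-cycle and 2 GHz figures are the illustrative assumptions above; real GPUs vary widely):

```rust
// Convert a cycle count at a given clock speed into nanoseconds.
fn cycles_to_nanos(cycles: f64, clock_hz: f64) -> f64 {
    cycles / clock_hz * 1e9
}

// 200 cycles at 2 GHz -> 100 ns per texture lookup.
```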


ah, I see, that makes sense now. I misunderstood the problem then. the cost you are trying to reduce in your fragment shader is not the lighting but some kind of screen-space, eh, computation. that, I have absolutely no idea how to do then.
