wgpu/WebGL2: possible to force early stencil/depth test?

In WebGPU / WebGL2, is it possible to force the early stencil/depth test to run before the fragment shader?

XY problem: I'm writing a Rust/wasm32 app that uses WebGPU / WebGL2. I have a fragment shader that is rather expensive and, if possible, I'd prefer to force the early stencil/depth test.

I'm not familiar with the WebGPU API, but in general you can use deferred rendering to reduce the number of fragment shader invocations when the early z test is not available.

I should have been clearer, my fragment shaders are of the form:

if (expensive ray trace) {
  out_color = simple_expr;
} else {
  discard;
}
At this point, I don't think deferred shading helps me.

in that case, I'm afraid I'm not qualified to give any useful advice; I have no experience at all with how ray tracing pipelines work.

just saying, for traditional rasterizer pipelines, deferred shading is a matter of splitting the rendering process into a geometry pass and a color pass. unless your per-fragment geometry data (which is the input to the color calculation) is unreasonably large and your g-buffer would explode, it should generally be possible to convert a forward rendering pipeline into a deferred pipeline, but the actual savings from the geometry pass and the amount of work required to modify the code depend heavily on the complexity of the original rendering pipeline.
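to make the split concrete, here's a CPU-side sketch of the two passes (all the names — `GFragment`, `shade`, the layout of the g-buffer — are made up for illustration; this is not a real GPU pipeline):

```rust
// Per-pixel geometry data recorded by the geometry pass.
// The fields here are an illustrative payload, not a prescribed layout.
#[derive(Clone, Copy, Default)]
struct GFragment {
    depth: f32,
    normal: [f32; 3],
    albedo: [f32; 3],
}

// Geometry pass: rasterize everything, but only *record* the nearest
// fragment's inputs per pixel -- no shading is computed here.
fn geometry_pass(
    fragments: &[(usize, GFragment)],
    width: usize,
    height: usize,
) -> Vec<Option<GFragment>> {
    let mut gbuffer: Vec<Option<GFragment>> = vec![None; width * height];
    for &(pixel, frag) in fragments {
        let nearer = match gbuffer[pixel] {
            Some(existing) => frag.depth < existing.depth, // z-test: keep nearer
            None => true,
        };
        if nearer {
            gbuffer[pixel] = Some(frag);
        }
    }
    gbuffer
}

// Color pass: the expensive shading runs at most once per pixel,
// no matter how many overlapping fragments the geometry pass saw.
fn color_pass(gbuffer: &[Option<GFragment>]) -> Vec<[f32; 3]> {
    gbuffer
        .iter()
        .map(|slot| match slot {
            Some(f) => shade(f),   // expensive work, once per covered pixel
            None => [0.0; 3],      // background
        })
        .collect()
}

// Stand-in for the expensive per-pixel calculation.
fn shade(f: &GFragment) -> [f32; 3] {
    let n_dot_l = f.normal[2].max(0.0);
    [f.albedo[0] * n_dot_l, f.albedo[1] * n_dot_l, f.albedo[2] * n_dot_l]
}
```

the point of the split is visible in `color_pass`: overdraw is resolved before any expensive shading runs, so each covered pixel pays for `shade` exactly once.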

I'm not 100% sure, but I think:

  1. if there were no 'discard', what you described above would work

  2. because of the 'discard', it is not easy to split this into two stages (more precisely: we can split into two stages, but do not save much, as everything up to the 'discard' has to run in the first stage)

that's not entirely true: the first pass outputs to g-buffers, which store the input data to the "(expensive ray trace)" calculation without doing the calculation. as I said, depending on the complexity of your actual lighting calculation, it might not be practical to store all the required data in g-buffers. but basically it's a space-vs-time trade-off.
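to put a rough number on the space side of that trade-off (the per-pixel payload and the resolution below are made-up examples, not anything from your pipeline):

```rust
// Back-of-envelope g-buffer size. As an illustrative payload, suppose the
// g-buffer stores a ray origin + direction per pixel: 6 x f32 = 24 bytes.
fn gbuffer_bytes(width: u64, height: u64, bytes_per_pixel: u64) -> u64 {
    width * height * bytes_per_pixel
}

// At 1920x1080 that's 1920 * 1080 * 24 = 49_766_400 bytes,
// i.e. about 47.5 MiB for a single g-buffer.
```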

not every forward rendering pipeline can be easily converted to deferred shading. as a starting point it seems you only need to change the fragment shader, but often you end up rewriting the entire rendering pipeline.

The problem is, due to the discard, we don't know which triangle's data to write to the g-buffers until we execute the fragment shader.

I'm not sure what you're trying to achieve specifically here, but you can't reach the discard without evaluating the if!

If you simply need to reduce overdraw, then yes, you're looking at deferred rendering, or perhaps its modern-engine fancy-pants version, clustered rendering.

This seems to be a pretty good overview of the concepts and implementation: A Primer On Efficient Rendering Algorithms & Clustered Shading.

You might be good with even the very first step of a z-prepass!
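to sketch what a z-prepass buys you (again a CPU-side illustration with made-up names, not a real GPU pipeline — the assumption is that the depth-only pass is much cheaper than the shading):

```rust
// Pass 1 (depth only): cheap, no shading -- record the nearest depth per pixel.
fn depth_prepass(frags: &[(usize, f32)], pixels: usize) -> Vec<f32> {
    let mut depth = vec![f32::INFINITY; pixels];
    for &(pixel, z) in frags {
        if z < depth[pixel] {
            depth[pixel] = z;
        }
    }
    depth
}

// Pass 2 (color): with the depth compare set to "equal", only the front-most
// fragment per pixel survives to run the expensive shader. Here we just count
// how many expensive invocations would remain.
fn shading_pass(frags: &[(usize, f32)], depth: &[f32]) -> usize {
    frags.iter().filter(|&&(pixel, z)| z == depth[pixel]).count()
}
```

with heavy overdraw, `shading_pass` runs the expensive work once per covered pixel instead of once per fragment.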


with the assumption that writing to g-buffers is much cheaper than the lighting calculation, I think you can write out the g-buffers unconditionally and rely on the z-buffer for the visibility test, with no discard involved.

different passes have different purposes.

this is false; the 'light calculation' is non-existent, and the expensive part is deciding whether to write to the g-buffer or to discard

How reliable is the claim of 1 texture lookup == 200 arith ops ?

I'm sure this is heavily hardware dependent, and especially on cache hit rate. But a back-of-the-envelope calculation would be that you have 1 arithmetic op per cycle, which means at 2 GHz, 200 cycles works out to 100 ns per lookup, which sounds reasonable.
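plugging the numbers in (the 1-op-per-cycle and 2 GHz figures are the illustrative assumptions above; real GPUs vary widely):

```rust
// Convert a cycle count at a given clock speed into nanoseconds.
fn cycles_to_nanos(cycles: f64, clock_hz: f64) -> f64 {
    cycles / clock_hz * 1e9
}

// 200 cycles at 2 GHz -> 100 ns per texture lookup.
```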


ah, I see, that makes sense now. I misunderstood the problem then. the cost you are trying to reduce in your fragment shader is not the lighting but some kind of screen-space, eh, computation. that, I have absolutely no idea how to do then.
