Various Solari improvements #21649
JMS55 commented on Oct 25, 2025
- Fix compile error when compiling with DLSS enabled after "Add BindGroupLayout caching via descriptors" (#21205)
- Use permutation sampling for ReSTIR DI temporal reuse to fix artifacts under DLSS-RR
- For both DI and GI, removed the raytrace from the spatial pass, and moved it to the final reservoir before shading
- Reduced DI initial samples 32 -> 8 for better performance at the cost of quality
- Various specular GI improvements and bugfixes (still kinda terrible overall, I need to do some research on how people usually do this kind of thing)
- Made the world cache adapt faster / be less stable
- Switched spatial hashing collision handling to linear probing
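For readers unfamiliar with the last item, here is a minimal CPU-side sketch of linear probing in a fixed-size hash table. It is not the Solari world cache code: `SpatialHash`, `find_or_insert`, `MAX_PROBES`, the `EMPTY` sentinel, and the integer mixer in `hash` are all hypothetical names and choices made for illustration, and a GPU version would need atomics to claim a slot.

```rust
/// Sentinel marking an unused slot (assumes real keys never equal this value).
const EMPTY: u32 = u32::MAX;
/// Hypothetical probe budget so lookups in a crowded table stay bounded.
const MAX_PROBES: usize = 8;

struct SpatialHash {
    keys: Vec<u32>,
    values: Vec<f32>,
}

impl SpatialHash {
    fn new(capacity: usize) -> Self {
        // A power-of-two capacity lets us wrap indices with a bit mask.
        assert!(capacity.is_power_of_two());
        Self {
            keys: vec![EMPTY; capacity],
            values: vec![0.0; capacity],
        }
    }

    /// Returns the slot holding `key`, claiming the first empty slot found
    /// while probing. Returns `None` once the probe budget is exhausted.
    fn find_or_insert(&mut self, key: u32) -> Option<usize> {
        let mask = self.keys.len() - 1;
        let mut slot = hash(key) as usize & mask;
        for _ in 0..MAX_PROBES {
            if self.keys[slot] == key {
                return Some(slot);
            }
            if self.keys[slot] == EMPTY {
                self.keys[slot] = key;
                return Some(slot);
            }
            // Collision: step to the next slot, wrapping at the end.
            slot = (slot + 1) & mask;
        }
        None
    }
}

/// Simple integer mixer, used here only to spread keys across the table.
fn hash(mut x: u32) -> u32 {
    x ^= x >> 16;
    x = x.wrapping_mul(0x7feb_352d);
    x ^= x >> 15;
    x = x.wrapping_mul(0x846c_a68b);
    x ^= x >> 16;
    x
}
```

On a collision the lookup walks to the adjacent slot instead of giving up or evicting, so neighbouring slots are touched in order and the worst case is capped by the probe budget.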
     var<push_constant> constants: PushConstants;

    -const INITIAL_SAMPLES = 32u;
    +const INITIAL_SAMPLES = 8u;
What's the perf difference?
Not sure, couldn't get Bistro set up properly to test :(
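As background for the question, here is a rough sketch of the initial candidate loop in generic resampled importance sampling. It is not the Solari shader: `Reservoir`, `XorShift32`, `initial_sampling`, and the uniform light selection are assumptions made for illustration. The point is only that each initial sample costs one candidate draw and one target function evaluation, so the work in this loop scales linearly with the sample count.

```rust
/// Weighted reservoir that keeps a single chosen candidate.
struct Reservoir {
    sample: u32,     // index of the chosen light (hypothetical payload)
    weight_sum: f32, // sum of resampling weights seen so far
    count: u32,      // number of candidates streamed in
}

impl Reservoir {
    fn new() -> Self {
        Self { sample: 0, weight_sum: 0.0, count: 0 }
    }

    /// Streams one candidate in; `u` must be uniform in [0, 1).
    /// The candidate replaces the current one with probability weight / weight_sum.
    fn update(&mut self, sample: u32, weight: f32, u: f32) {
        self.weight_sum += weight;
        self.count += 1;
        if u * self.weight_sum < weight {
            self.sample = sample;
        }
    }
}

/// Tiny xorshift generator so the sketch has no dependencies (seed must be nonzero).
struct XorShift32(u32);

impl XorShift32 {
    fn next_u32(&mut self) -> u32 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 17;
        self.0 ^= self.0 << 5;
        self.0
    }

    /// Uniform float in [0, 1) built from the top 24 bits.
    fn next_f32(&mut self) -> f32 {
        (self.next_u32() >> 8) as f32 / (1u32 << 24) as f32
    }
}

/// Draws `initial_samples` uniformly chosen lights and resamples them by `target`.
fn initial_sampling(
    initial_samples: u32,
    light_count: u32,
    rng: &mut XorShift32,
    target: impl Fn(u32) -> f32,
) -> Reservoir {
    let mut reservoir = Reservoir::new();
    for _ in 0..initial_samples {
        let light = rng.next_u32() % light_count;
        // Resampling weight = target / source pdf, with source pdf = 1 / light_count.
        let weight = target(light) * light_count as f32;
        let u = rng.next_f32();
        reservoir.update(light, weight, u);
    }
    reservoir
}
```

Under these assumptions, going from 32 to 8 candidates cuts the loop's work to a quarter, at the cost of a noisier pick. The actual frame time impact would still need to be measured.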
     var radiance: vec3<f32>;
     var wi: vec3<f32>;
    -if surface.material.roughness > 0.04 {
    +if surface.material.roughness > 0.1 {
Would it be worth subgroupAny-ing this (and flipping the condition)? One branch is a lot more expensive than the other.
Maybe? Would have to test. I feel like for latency reasons, you'd want to trace the minimum amount of rays possible, even if some threads end up idle.
    let diffuse_brdf = ray_hit.material.base_color / PI;
    radiance += throughput * diffuse_brdf * query_world_cache(ray_hit.world_position, ray_hit.geometric_world_normal, view.world_position);
Given that this is only sampling the glossy path, I'm not sure if it's worth treating the surface as diffuse and lighting it with the world cache, especially since the moment we hit a rough surface we would do the same (in the previous code).
Hmm idk. But I think we want to sample the world cache at every bounce, and not just the last. I don't have screenshots on me atm, but comparing the screws in the PICA PICA scene before/after this PR, they look a lot closer to the PT reference.
nice work