Commit graph

60 commits

Author SHA1 Message Date
gdkchan 524fe3bea4
Implement shader HelperThreadNV (#2163)
* Implement shader HelperThreadNV

* Bump shader cache version

* Use gl_HelperInvocation since its supported across all vendors

* Nit
2021-04-02 21:50:35 +11:00
gdkchan a0b4799f19
Fix ZN flags set for shader instructions using RZ.CC dest (#2147)
* Fix ZN flags set for shader instructions using RZ.CC dest

* Shader cache version bump and nits
2021-03-27 22:59:05 +01:00
mageven a5d5ca0635
Shader Cache: Move bindless checking from translation to decode (#2145) 2021-03-27 00:50:26 +01:00
Mary aef25980a7
Salieri: Detect and avoid caching shaders using bindless textures (#2097)
* Salieri: Add blacklist system and blacklist shaders using bindless

Currently the shader cache doesn't have the right format to support
bindless textures correctly and may cache shaders that it cannot rebuild
after host invalidation.

This PR address the issue by blacklisting shaders using bindless
textures.

THis also support detection of already cached broken shader and handle removal
of those.

* Move to a feature flags design to avoid intrusive changes in the translator

This remove the auto correct behaviour

* Reduce diff on TranslationFlags

* Reduce comma on last entry of TranslationFlags

* Fix inverted logic and remove leftovers

* remove debug edits oops
2021-03-19 20:07:37 +01:00
riperiperi ede26556f2
Traverse PhiNodes for Bindless Elimination (#2089)
This allows bindless handles to be found for image/texture instructions with predicates, when the assignment of the texture handle is within the same predicate.

This seems to cover the remaining bindless handles that compilers seem to be creating due to optimizations.

Will affect newer UE4 games, and games by NdCube (Super Mario Party, Clubhouse Games)
2021-03-09 17:27:44 -03:00
gdkchan 4047477866
Simplify handling of shader vertex A (#1999)
* Simplify handling of shader vertex A

* Theres no transformation feedback, its transform

* Merge TextureHandlesForCache
2021-02-08 10:42:17 +11:00
gdkchan 053dcfdb05
Use multiple dest operands for shader call instructions (#1975)
* Use multiple dest operands for shader call instructions

* Passing opNode is no longer needed
2021-02-01 11:13:38 +11:00
gdkchan f93089a64f
Implement geometry shader passthrough (#1961)
* Implement geometry shader passthrough

* Cache version change
2021-01-29 14:38:51 +11:00
gdkchan 4b7c7dab9e
Support multiple destination operands on shader IR and shuffle predicates (#1964)
* Support multiple destination operands on shader IR and shuffle predicates

* Cache version change
2021-01-28 10:59:47 +11:00
gdkchan 36c6e67df2
Implement shader CC mode for ISCADD, X mode for ISETP and fix STL/STS/STG with RZ (#1901)
* Implement shader CC mode for ISCADD, X mode for ISETP and fix STS/STG with RZ

* Fix STG too and bump shader cache version

* Fix wrong name

* Fix Carry being inverted on comparison
2021-01-13 08:52:13 +11:00
Mary 48f6570557
Salieri: shader cache (#1701)
Here come Salieri, my implementation of a disk shader cache!

"I'm sure you know why I named it that."
"It doesn't really mean anything."

This implementation collects shaders at runtime and cache them to be later compiled when starting a game.
2020-11-13 00:15:34 +01:00
gdkchan 934a78005e
Simplify logic for bindless texture handling (#1667)
* Simplify logic for bindless texture handling

* Nits
2020-11-09 19:35:04 -03:00
gdkchan 8d168574eb
Use explicit buffer and texture bindings on shaders (#1666)
* Use explicit buffer and texture bindings on shaders

* More XML docs and other nits
2020-11-08 12:10:00 +01:00
gdkchan 49f970d5bd
Implement CAL and RET shader instructions (#1618)
* Add support for CAL and RET shader instructions

* Remove unused stuff

* Fix a bug that could cause the wrong values to be passed to a function

* Avoid repopulating function id dictionary every time

* PR feedback

* Fix vertex shader A/B merge
2020-10-25 17:00:44 -03:00
gdkchan 5264d55b39
Fix gl_in being used with built-in variables that are not per-vertex (#1624) 2020-10-17 10:16:40 +02:00
gdkchan d36c4bfba5
Fix output component register on pixel shaders (#1613)
* Fix output component register on pixel shaders

* Clean up usings

* Do not advance if no component is enabled for the target, this keeps the previous behavior
2020-10-13 14:44:55 +11:00
gdkchan 90ab28d1c6
Align register index between output targets on pixel shaders (#1559) 2020-09-21 13:45:04 +10:00
gdkchan 1eea35554c
Better viewport flipping and depth mode detection method (#1556)
* Use a better viewport flipping approach

* New approach to detect depth mode

* nit: Sort method on the OpenGL backend

* Adjust spacing on comment

* Unswap near and far parameters based on ScaleZ
2020-09-19 19:46:49 -03:00
gdkchan 636542d817
Refactor shader translator ShaderConfig and reduce the number of out args (#1438) 2020-07-30 15:53:23 +10:00
gdkchan 991784868f
Fix shader regression on Intel iGPUs by reverting layout changes (#1425) 2020-07-29 08:01:11 +10:00
gdkchan 8dbcae1ff8
Implement BGRA texture support (#1418)
* Implement BGRA texture support

* Missing AppendLine

* Remove empty lines

* Address PR feedback
2020-07-26 00:03:40 -03:00
riperiperi 484eb645ae
Implement Zero-Configuration Resolution Scaling (#1365)
* Initial implementation of Render Target Scaling

Works with most games I have. No GUI option right now, it is hardcoded.

Missing handling for texelFetch operation.

* Realtime Configuration, refactoring.

* texelFetch scaling on fragment shader (WIP)

* Improve Shader-Side changes.

* Fix potential crash when no color/depth bound

* Workaround random uses of textures in compute.

This was blacklisting textures in a few games despite causing no bugs. Will eventually add full support so this doesn't break anything.

* Fix scales oscillating when changing between non-native scales.

* Scaled textures on compute, cleanup, lazier uniform update.

* Cleanup.

* Fix stupidity

* Address Thog Feedback.

* Cover most of GDK's feedback (two comments remain)

* Fix bad rename

* Move IsDepthStencil to FormatExtensions, add docs.

* Fix default config, square texture detection.

* Three final fixes:

- Nearest copy when texture is integer format.
- Texture2D -> Texture3D copy correctly blacklists the texture before trying an unscaled copy (caused driver error)
- Discount small textures.

* Remove scale threshold.

Not needed right now - we'll see if we run into problems.

* All CPU modification blacklists scale.

* Fix comment.
2020-07-07 04:41:07 +02:00
gdkchan e13154c83d
Implement shader LEA instruction and improve bindless image load/store (#1355) 2020-07-04 01:48:44 +02:00
gdkchan a15b951721
Fix wrong face culling once and for all (#1277)
* Viewport swizzle support on NV and clip origin

* Initialize default viewport swizzle state, emulate viewport swizzle on shaders when not supported

* Address PR feedback
2020-05-28 09:03:07 +10:00
gdkchan 5795bb1528
Support separate textures and samplers (#1216)
* Support separate textures and samplers

* Add missing bindless flag, fix SNORM format on buffer textures

* Add missing separation

* Add comments about the new handles
2020-05-27 16:07:10 +02:00
gdkchan b8eb6abecc
Refactor shader GPU state and memory access (#1203)
* Refactor shader GPU state and memory access

* Fix NVDEC project build

* Address PR feedback and add missing XML comments
2020-05-06 11:02:28 +10:00
gdkchan 03711dd7b5
Implement SULD shader instruction (#1117)
* Implement SULD shader instruction

* Some nits
2020-04-22 09:35:28 +10:00
gdkchan e93ca84b14
Better IPA shader instruction implementation (#1082)
* Fix varying interpolation on fragment shader

* Some nits

* Alignment
2020-04-03 11:20:47 +11:00
gdkchan 5b5239ab5b
Remove output interpolation qualifier (#1070) 2020-04-02 12:24:55 +11:00
gdkchan dc97457bf0
Initial support for double precision shader instructions. (#963)
* Implement DADD, DFMA and DMUL shader instructions

* Rename FP to FP32

* Correct double immediate

* Classic mistake
2020-03-03 15:02:08 +01:00
gdkchan 9e4f668f6c
Update bindless to indexed conversion code pattern match (#938)
* Update bindless to indexed conversion code pattern match

* Correct index shift
2020-02-14 11:29:58 +01:00
gdkchan 7e4d986a73
Support compute uniform buffers emulated with global memory (#924) 2020-02-11 01:10:05 +01:00
gdkchan 796e5d14b4
Use correct shader local memory size instead of a hardcoded size (#914)
* Use correct shader local size instead of a hardcoded size

* Remove unused uniform block

* Update XML doc

* Local memory size has 23 bits on maxwell

* Generate compute QMD struct from nv open doc header

* Remove dummy arrays when shared or local memory is not used, other improvements
2020-02-02 14:25:52 +11:00
gdkchan 81cca88bcd Fix shader output color buffer index when non-sequential render targets are used (#895) 2020-01-19 00:09:46 +01:00
gdkchan b8e3909d80 Add a GetSpan method to the memory manager and use it on GPU (#877) 2020-01-13 10:27:50 +11:00
gdkchan 29a825b43b Address PR feedback
Removes a useless null check

Aligns some values to improve readability
2020-01-09 02:13:00 +01:00
gdkchan 18814d44b2 Address PR feedback
Add TODO comment for GL_EXT_polygon_offset_clamp
2020-01-09 02:13:00 +01:00
gdkchan 92703af555 Address PR feedback 2020-01-09 02:13:00 +01:00
gdkchan 9bfb373bdf Remove more unused code 2020-01-09 02:13:00 +01:00
gdkchan 654e617fe7 Some code cleanup 2020-01-09 02:13:00 +01:00
gdkchan 9d7a142a48 Support texture rectangle targets (non-normalized coords) 2020-01-09 02:13:00 +01:00
gdkchan 2eccc7023a Partial support for shader memory barriers 2020-01-09 02:13:00 +01:00
gdkchan 171c3d54c6 Correct non-constant offset rewrite for texelFetch 2020-01-09 02:13:00 +01:00
gdkchan f2c85c5d58 Support non-constant texture offsets on non-NVIDIA gpus 2020-01-09 02:13:00 +01:00
gdkchan cb171f6ebf Support shared color mask, implement more shader instructions
Support shared color masks (used by Nouveau and maybe the NVIDIA
driver).
Support draw buffers (also required by OpenGL).
Support viewport transform disable (disabled for now as it breaks some
games).
Fix instanced rendering draw being ignored for multi draw.
Fix IADD and IADD3 immediate shader encodings, that was not matching
some ops.
Implement FFMA32I shader instruction.
Implement IMAD shader instruction.
2020-01-09 02:13:00 +01:00
gdk 6a98c643ca Add a pass to turn global memory access into storage access, and do all storage related transformations on IR 2020-01-09 02:13:00 +01:00
gdk 442485adb3 Partial support for branch with CC, and fix a edge case of branch out of loop on shaders 2020-01-09 02:13:00 +01:00
gdk 99f236fcf0 Simplified F2I shader instruction codegen 2020-01-09 02:13:00 +01:00
gdk b8528c6317 Implement HSET2 shader instruction and fix errors uncovered by Rodrigo tests 2020-01-09 02:13:00 +01:00
gdk 3ca675223a Remove TranslatorConfig struct 2020-01-09 02:13:00 +01:00