Commit graph

42 commits

Author SHA1 Message Date
Isaac Marovitz 8ac53c66b4
Remove Half Conversion (#4106)
* Remove HalfConversion

* Update `CodeGenVersion`
2022-12-14 21:13:23 -03:00
riperiperi f23b2878cc
Shader: Add fallback for LDG from "ube" buffer ranges. (#4027)
We have a conversion from LDG on the compute shader to a special constant buffer binding that's used to exceed hardware limits on compute, but it was only running if the byte offset could be identified. The fallback that checks all of the bindings at runtime only checks the storage buffers.

This PR adds checking ube ranges to the LoadGlobal fallback. This extends the changes in #4011 to only check ube entries which are accessed by the shader.

Fixes particles affected by the wind in The Legend of Zelda: Breath of the Wild. May fix other weird issues with compute shaders in some games.

Try a bunch of games and drivers to make sure they don't blow up loading constants willy-nilly from searchable buffers.
2022-12-06 23:15:44 +00:00
gdkchan bbb24d8c7e
Restrict shader storage buffer search when match fails (#4011)
* Restrict storage buffer search when match fails

* Shader cache version bump
2022-12-05 19:11:32 +00:00
riperiperi 476b4683cf
Fix CB0 alignment with addresses used for 8/16-bit LDG/STG (#3897)
This replacement is meant to be done with the original identified byteOffset, not the one assigned later on by the below conditionals (that already has the constant offset added, for instance).

This fixes videos being pixelated in Xenoblade 3, and other regressions that might have happened since #3847.
2022-11-25 14:39:03 +00:00
riperiperi 33a4d7d1ba
GPU: Eliminate CB0 accesses when storage buffer accesses are resolved (#3847)
* Eliminate CB0 accesses

Still some work to do, decouple from hle?

* Forgot the important part somehow

* Fix and improve alignment test

* Address Feedback

* Remove some complexity when checking storage buffer alignment

* Update Ryujinx.Graphics.Shader/Translation/Optimizations/GlobalToStorage.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2022-11-17 18:47:41 +01:00
gdkchan 66f16f4392
Fix bindless 1D textures having a buffer type on the shader (#3697)
* Fix bindless 1D textures having a buffer type on the shader

* Shader cache version bump
2022-09-13 08:53:55 +02:00
gdkchan 408bd63b08
Transform shader LDC into constant buffer access if offset is constant (#3672)
* Transform shader LDC into constant buffer access if offset is constant

* Shader cache version bump
2022-09-07 20:25:22 -03:00
gdkchan b34de74f81
Avoid adding shader buffer descriptors for constant buffers that are not used (#3478)
* Avoid adding shader buffer descriptors for constant buffers that are not used

* Shader cache version
2022-07-23 11:15:58 -03:00
gdkchan 4523a73f75
Propagate Shader phi nodes with the same source value from all blocks (#3457)
* Propagate Shader phi nodes with the same source value from all incoming blocks

* Shader cache version bump
2022-07-12 00:36:58 +02:00
gdkchan 5afd521c5a
Bindless elimination for constant sampler handle (#3424)
* Bindless elimination for constant sampler handle

* Shader cache version bump

* Update TextureHandle.ReadPackedId for new bindless elimination
2022-07-02 15:03:35 -03:00
gdkchan 3dee712164
Fix bindless/global memory elimination with inverted predicates (#2826)
* Fix bindless/global memory elimination with inverted predicates

* Shader cache version bump
2021-11-08 12:57:28 -03:00
gdkchan 04dfb86fde
Preserve image types for shader bindless surface instructions (.D variants) (#2779)
* Preserve image types for SULD/SUST .D variants

* Make format unknown for surface atomic if bindless and not sized
2021-10-24 19:40:20 -03:00
gdkchan 63f1663fa9
Fix shader 8-bit and 16-bit STS/STG (#2741)
* Fix 8 and 16-bit STG

* Fix 8 and 16-bit STS

* Shader cache version bump
2021-10-18 20:24:15 -03:00
gdkchan 25fd4ef10e
Extend bindless elimination to work with masked and shifted handles (#2727)
* Extend bindless elimination to work with masked handles

* Extend bindless elimination to catch shifted pattern, refactor handle packing/unpacking
2021-10-17 17:28:18 -03:00
riperiperi f0b00c1ae9
Fix TXQ for 3D textures. (#2613)
* Fix TXQ for 3D textures.

Assumes the texture is 3D if the component mask contains Z.

This fixes a bug in UE4 games where parts of the map had garbage pointers to lighting voxels, as the lookup 3D texture was not being initialized. Most notable game is THPS1+2.

May need another PR to keep image store data alive and properly flush it in order using the AutoDeleteCache.

* Get sampler type for TextureSize from bound textures.
2021-09-02 00:17:43 -03:00
riperiperi 142cededd4
Implement Shader Instructions SUATOM and SURED (#2090)
* Initial Implementation

* Further improvements (no support for float/64-bit types)

* Merge atomic and reduce instructions, add missing format switch

* Fix rebase issues.

* Not used.

* Whoops. Fixed.

* Partial implementation of inc/dec, cleanup and TODOs

* Remove testing path

* Address Feedback
2021-08-31 02:51:57 -03:00
gdkchan c3e2646f9e
Workaround for Intel FrontFacing built-in variable bug (#2540)
2021-08-11 23:01:06 +02:00
gdkchan 65fee49e8a
Fix separate bindless sampler at offset 0 (#2360)
2021-06-20 20:48:12 +02:00
gdkchan 02e2e561ac
Support bindless textures with separate constant buffers for texture and sampler (#2339)
2021-06-09 00:42:25 +02:00
gdkchan 3b90adcd1d
Fix shaders with mixed PBK and SSY addresses on the stack (#2329)
* Fix shaders with mixed PBK and SSY addresses on the stack

* Address PR feedback and nits
2021-06-03 01:41:53 +02:00
gdkchan 49745cfa37
Move shader resource descriptor creation out of the backend (#2290)
* Move shader resource descriptor creation out of the backend

* Remove now unused code, and other nits

* Shader cache version bump

* Nits

* Set format for bindless image load/store

* Fix buffer write flag
2021-05-19 23:15:26 +02:00
riperiperi 0129250c2e
Pass CbufSlot when getting info from the texture descriptor (#2291)
* Pass CbufSlot when getting info from the texture descriptor

Fixes some issues with bindless textures, when CbufSlot is not equal to the current TextureBufferIndex.

Specifically fixes a random chance of full screen colour flickering in Super Mario Party.

* Apply suggestions from code review

Oops

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2021-05-19 20:05:43 +02:00
gdkchan 40e276c9b5
Improve shader global memory to storage pass (#2200)
* Improve shader global memory to storage pass

* Formatting and more comments

* Shader cache version bump
2021-04-18 12:31:39 +02:00
riperiperi ede26556f2
Traverse PhiNodes for Bindless Elimination (#2089)
This allows bindless handles to be found for image/texture instructions with predicates, when the assignment of the texture handle is within the same predicate.

This seems to cover the remaining bindless handles that compilers seem to be creating due to optimizations.

Will affect newer UE4 games, and games by NdCube (Super Mario Party, Clubhouse Games)
2021-03-09 17:27:44 -03:00
gdkchan 053dcfdb05
Use multiple dest operands for shader call instructions (#1975)
* Use multiple dest operands for shader call instructions

* Passing opNode is no longer needed
2021-02-01 11:13:38 +11:00
gdkchan 4b7c7dab9e
Support multiple destination operands on shader IR and shuffle predicates (#1964)
* Support multiple destination operands on shader IR and shuffle predicates

* Cache version change
2021-01-28 10:59:47 +11:00
gdkchan 934a78005e
Simplify logic for bindless texture handling (#1667)
* Simplify logic for bindless texture handling

* Nits
2020-11-09 19:35:04 -03:00
gdkchan 49f970d5bd
Implement CAL and RET shader instructions (#1618)
* Add support for CAL and RET shader instructions

* Remove unused stuff

* Fix a bug that could cause the wrong values to be passed to a function

* Avoid repopulating function id dictionary every time

* PR feedback

* Fix vertex shader A/B merge
2020-10-25 17:00:44 -03:00
gdkchan e13154c83d
Implement shader LEA instruction and improve bindless image load/store (#1355)
2020-07-04 01:48:44 +02:00
gdkchan 5795bb1528
Support separate textures and samplers (#1216)
* Support separate textures and samplers

* Add missing bindless flag, fix SNORM format on buffer textures

* Add missing separation

* Add comments about the new handles
2020-05-27 16:07:10 +02:00
gdkchan b8eb6abecc
Refactor shader GPU state and memory access (#1203)
* Refactor shader GPU state and memory access

* Fix NVDEC project build

* Address PR feedback and add missing XML comments
2020-05-06 11:02:28 +10:00
gdkchan dc97457bf0
Initial support for double precision shader instructions. (#963)
* Implement DADD, DFMA and DMUL shader instructions

* Rename FP to FP32

* Correct double immediate

* Classic mistake
2020-03-03 15:02:08 +01:00
gdkchan 9e4f668f6c
Update bindless to indexed conversion code pattern match (#938)
* Update bindless to indexed conversion code pattern match

* Correct index shift
2020-02-14 11:29:58 +01:00
gdkchan 7e4d986a73
Support compute uniform buffers emulated with global memory (#924)
2020-02-11 01:10:05 +01:00
gdkchan 29a825b43b
Address PR feedback
Removes a useless null check

Aligns some values to improve readability
2020-01-09 02:13:00 +01:00
gdkchan 9d7a142a48
Support texture rectangle targets (non-normalized coords)
2020-01-09 02:13:00 +01:00
gdk 6a98c643ca
Add a pass to turn global memory access into storage access, and do all storage related transformations on IR
2020-01-09 02:13:00 +01:00
gdk 769c02235f
Add ATOMS, LDS, POPC, RED, STS and VOTE shader instructions, start changing the way how global memory is handled
2020-01-09 02:13:00 +01:00
gdk a31fced221
Remove some unused constants and other code
2020-01-09 02:13:00 +01:00
gdk 3ab5c23f49
Add partial support for array of samplers, and add pass to identify them from bindless texture accesses
2020-01-09 02:13:00 +01:00
gdk 278a4c317c
Implement BFI, BRK, FLO, FSWZADD, PBK, SHFL and TXD shader instructions, misc. fixes
2020-01-09 02:13:00 +01:00
gdk 1876b346fe
Initial work
2020-01-09 02:13:00 +01:00