Commit graph

34 commits

Author SHA1 Message Date
gdkchan 4523a73f75
Propagate Shader phi nodes with the same source value from all blocks (#3457)
* Propagate Shader phi nodes with the same source value from all incoming blocks

* Shader cache version bump
2022-07-12 00:36:58 +02:00
gdkchan 5afd521c5a
Bindless elimination for constant sampler handle (#3424)
* Bindless elimination for constant sampler handle

* Shader cache version bump

* Update TextureHandle.ReadPackedId for new bindless elimination
2022-07-02 15:03:35 -03:00
gdkchan 3dee712164
Fix bindless/global memory elimination with inverted predicates (#2826)
* Fix bindless/global memory elimination with inverted predicates

* Shader cache version bump
2021-11-08 12:57:28 -03:00
gdkchan 04dfb86fde
Preserve image types for shader bindless surface instructions (.D variants) (#2779)
* Preserve image types for SULD/SUST .D variants

* Make format unknown for surface atomic if bindless and not sized
2021-10-24 19:40:20 -03:00
gdkchan 63f1663fa9
Fix shader 8-bit and 16-bit STS/STG (#2741)
* Fix 8 and 16-bit STG

* Fix 8 and 16-bit STS

* Shader cache version bump
2021-10-18 20:24:15 -03:00
gdkchan 25fd4ef10e
Extend bindless elimination to work with masked and shifted handles (#2727)
* Extent bindless elimination to work with masked handles

* Extend bindless elimination to catch shifted pattern, refactor handle packing/unpacking
2021-10-17 17:28:18 -03:00
riperiperi f0b00c1ae9
Fix TXQ for 3D textures. (#2613)
* Fix TXQ for 3D textures.

Assumes the texture is 3D if the component mask contains Z.

This fixes a bug in UE4 games where parts of the map had garbage pointers to lighting voxels, as the lookup 3D texture was not being initialized. Most notable game is THPS1+2.

May need another PR to keep image store data alive and properly flush it in order using the AutoDeleteCache.

* Get sampler type for TextureSize from bound textures.
2021-09-02 00:17:43 -03:00
riperiperi 142cededd4
Implement Shader Instructions SUATOM and SURED (#2090)
* Initial Implementation

* Further improvements (no support for float/64-bit types)

* Merge atomic and reduce instructions, add missing format switch

* Fix rebase issues.

* Not used.

* Whoops. Fixed.

* Partial implementation of inc/dec, cleanup and TODOs

* Remove testing path

* Address Feedback
2021-08-31 02:51:57 -03:00
gdkchan c3e2646f9e
Workaround for Intel FrontFacing built-in variable bug (#2540) 2021-08-11 23:01:06 +02:00
gdkchan 65fee49e8a
Fix separate bindless sampler at offset 0 (#2360) 2021-06-20 20:48:12 +02:00
gdkchan 02e2e561ac
Support bindless textures with separate constant buffers for texture and sampler (#2339) 2021-06-09 00:42:25 +02:00
gdkchan 3b90adcd1d
Fix shaders with mixed PBK and SSY addresses on the stack (#2329)
* Fix shaders with mixed PBK and SSY addresses on the stack

* Address PR feedback and nits
2021-06-03 01:41:53 +02:00
gdkchan 49745cfa37
Move shader resource descriptor creation out of the backend (#2290)
* Move shader resource descriptor creation out of the backend

* Remove now unused code, and other nits

* Shader cache version bump

* Nits

* Set format for bindless image load/store

* Fix buffer write flag
2021-05-19 23:15:26 +02:00
riperiperi 0129250c2e
Pass CbufSlot when getting info from the texture descriptor (#2291)
* Pass CbufSlot when getting info from the texture descriptor

Fixes some issues with bindless textures, when CbufSlot is not equal to the current TextureBufferIndex.

Specifically fixes a random chance of full screen colour flickering in Super Mario Party.

* Apply suggestions from code review

Oops

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2021-05-19 20:05:43 +02:00
gdkchan 40e276c9b5
Improve shader global memory to storage pass (#2200)
* Improve shader global memory to storage pass

* Formatting and more comments

* Shader cache version bump
2021-04-18 12:31:39 +02:00
riperiperi ede26556f2
Traverse PhiNodes for Bindless Elimination (#2089)
This allows bindless handles to be found for image/texture instructions with predicates, when the assignment of the texture handle is within the same predicate.

This seems to cover the remaining bindless handles that compilers seem to be creating due to optimizations.

Will affect newer UE4 games, and games by NdCube (Super Mario Party, Clubhouse Games)
2021-03-09 17:27:44 -03:00
gdkchan 053dcfdb05
Use multiple dest operands for shader call instructions (#1975)
* Use multiple dest operands for shader call instructions

* Passing opNode is no longer needed
2021-02-01 11:13:38 +11:00
gdkchan 4b7c7dab9e
Support multiple destination operands on shader IR and shuffle predicates (#1964)
* Support multiple destination operands on shader IR and shuffle predicates

* Cache version change
2021-01-28 10:59:47 +11:00
gdkchan 934a78005e
Simplify logic for bindless texture handling (#1667)
* Simplify logic for bindless texture handling

* Nits
2020-11-09 19:35:04 -03:00
gdkchan 49f970d5bd
Implement CAL and RET shader instructions (#1618)
* Add support for CAL and RET shader instructions

* Remove unused stuff

* Fix a bug that could cause the wrong values to be passed to a function

* Avoid repopulating function id dictionary every time

* PR feedback

* Fix vertex shader A/B merge
2020-10-25 17:00:44 -03:00
gdkchan e13154c83d
Implement shader LEA instruction and improve bindless image load/store (#1355) 2020-07-04 01:48:44 +02:00
gdkchan 5795bb1528
Support separate textures and samplers (#1216)
* Support separate textures and samplers

* Add missing bindless flag, fix SNORM format on buffer textures

* Add missing separation

* Add comments about the new handles
2020-05-27 16:07:10 +02:00
gdkchan b8eb6abecc
Refactor shader GPU state and memory access (#1203)
* Refactor shader GPU state and memory access

* Fix NVDEC project build

* Address PR feedback and add missing XML comments
2020-05-06 11:02:28 +10:00
gdkchan dc97457bf0
Initial support for double precision shader instructions. (#963)
* Implement DADD, DFMA and DMUL shader instructions

* Rename FP to FP32

* Correct double immediate

* Classic mistake
2020-03-03 15:02:08 +01:00
gdkchan 9e4f668f6c
Update bindless to indexed conversion code pattern match (#938)
* Update bindless to indexed conversion code pattern match

* Correct index shift
2020-02-14 11:29:58 +01:00
gdkchan 7e4d986a73
Support compute uniform buffers emulated with global memory (#924) 2020-02-11 01:10:05 +01:00
gdkchan 29a825b43b Address PR feedback
Removes a useless null check

Aligns some values to improve readability
2020-01-09 02:13:00 +01:00
gdkchan 9d7a142a48 Support texture rectangle targets (non-normalized coords) 2020-01-09 02:13:00 +01:00
gdk 6a98c643ca Add a pass to turn global memory access into storage access, and do all storage related transformations on IR 2020-01-09 02:13:00 +01:00
gdk 769c02235f Add ATOMS, LDS, POPC, RED, STS and VOTE shader instructions, start changing the way how global memory is handled 2020-01-09 02:13:00 +01:00
gdk a31fced221 Remove some unused constants and other code 2020-01-09 02:13:00 +01:00
gdk 3ab5c23f49 Add partial support for array of samplers, and add pass to identify them from bindless texture accesses 2020-01-09 02:13:00 +01:00
gdk 278a4c317c Implement BFI, BRK, FLO, FSWZADD, PBK, SHFL and TXD shader instructions, misc. fixes 2020-01-09 02:13:00 +01:00
gdk 1876b346fe Initial work 2020-01-09 02:13:00 +01:00