Commit graph

541 commits

Author SHA1 Message Date
Hexagon12 3bd5f01240
Merge pull request #2469 from lioncash/copyable
video_core/engines/maxwell_3d: Add is_trivially_copyable_v check for Regs
2019-05-19 15:02:17 +01:00
Sebastian Valle a6ed792ac4
Merge pull request #2470 from lioncash/ranged-for
video_core/engines/maxwell_3d: Simplify for loops into ranged for loops within InitializeRegisterDefaults()
2019-05-19 09:01:19 -05:00
Fernando Sahmkow fc975e9021 maxwell_3d: reduce sevirity of different component formats assert.
This was reduced due to happening on most games and at such constant
rate that it affected performance heavily for the end user. In general,
we are well aware of the assert and an implementation is already
planned.
2019-05-14 17:12:54 -04:00
Lioncash b01cce716e video_core/engines/engine_upload: Amend constructor initializer list order
Silences a -Wreorder warning.
2019-05-14 13:43:28 -04:00
Lioncash 9b6d993e52 video_core/engines/engine_upload: Default destructor in the cpp file
Avoids inlining destruction logic where applicable, and also makes
forward declarations not cause unexpected compilation errors depending
on where the State class is used.
2019-05-14 13:41:41 -04:00
Lioncash ec1c69258a video_core/engines/engine_upload: Remove unnecessary const on parameters in function declarations
These only apply in the definition of the function. They can be omitted
from the declaration.
2019-05-14 13:40:09 -04:00
Lioncash 0f83c8dffa video_core/engines/engine_upload: Remove unnecessary includes 2019-05-14 13:39:04 -04:00
Lioncash 5db1b54b58 video_core/engines/maxwell3d: Get rid of three magic values in CallMethod()
We can use the named constant instead of using 32 directly.
2019-05-14 09:02:47 -04:00
Lioncash 48ce5880a0 video_core/engines/maxwell_3d: Simplify for loops into ranged for loops within InitializeRegisterDefaults()
Lessens the amount of code that needs to be read, and gets rid of the
need to introduce an indexing variable. Instead, we just operate on the
objects directly.
2019-05-14 08:53:19 -04:00
Lioncash c212fc9b2c video_core/engines/maxwell_3d: Add is_trivially_copyable_v check for Regs
std::memset is used to clear the entire register structure, which
requires that the Regs struct be trivially copyable (otherwise undefined
behavior is invoked). This prevents the case where a non-trivial type is
potentially added to the struct.
2019-05-14 08:47:56 -04:00
bunnei c27b81cb85
Merge pull request #2429 from FernandoS27/compute
Corrections and Implementation on GPU Engines
2019-05-09 13:19:22 -04:00
ReinUsesLisp d4df803b2b shader_ir/other: Implement IPA.IDX 2019-05-02 21:46:37 -03:00
ReinUsesLisp 71aa9d0877 shader_ir/memory: Implement physical input attributes 2019-05-02 21:46:25 -03:00
ReinUsesLisp bd81a03d9d gl_shader_decompiler: Declare all possible varyings on physical attribute usage 2019-05-02 21:46:25 -03:00
ReinUsesLisp 7632a7d6d2 shader_bytecode: Add AL2P decoding 2019-05-02 21:46:25 -03:00
Fernando Sahmkow e64c41efe8 Refactors and name corrections. 2019-05-01 15:31:39 -04:00
bunnei c52233ec8b
Merge pull request #2322 from ReinUsesLisp/wswitch
video_core: Silent -Wswitch warnings
2019-04-28 22:24:58 -04:00
Fernando Sahmkow b3118ee316 Fixes and Corrections to DMA Engine 2019-04-23 15:28:18 -04:00
Fernando Sahmkow f1e5314f1a Add Swizzle Parameters to the DMA engine 2019-04-23 11:21:00 -04:00
Fernando Sahmkow e140e2ebc6 Add Documentation Headers to all the GPU Engines 2019-04-23 08:44:52 -04:00
Fernando Sahmkow 021d28c9b8 Corrections and styling 2019-04-23 08:02:24 -04:00
Fernando Sahmkow 701ce1c9d0 Implement Maxwell3D Data Upload 2019-04-22 19:27:36 -04:00
Fernando Sahmkow e4ff140b99 Introduce skeleton of the GPU Compute Engine. 2019-04-22 19:05:43 -04:00
Fernando Sahmkow a91d3fc639 Revamp Kepler Memory to use a subegine to manage uploads 2019-04-22 18:50:56 -04:00
bunnei 68b707711a
Merge pull request #2411 from FernandoS27/unsafe-gpu
GPU Manager: Implement ReadBlockUnsafe and WriteBlockUnsafe
2019-04-22 17:09:00 -04:00
bunnei 01100f8afd
Merge pull request #2400 from FernandoS27/corret-kepler-mem
Implement Kepler Memory on both Linear and BlockLinear.
2019-04-22 16:47:05 -04:00
bunnei da0c3bc658
Merge pull request #2407 from FernandoS27/f2f
Do some corrections in conversion shader instructions.
2019-04-20 00:42:34 -04:00
ReinUsesLisp fbe8d1ceaa video_core: Silent -Wswitch warnings 2019-04-18 15:54:39 -03:00
bunnei 5bd5140bde
Merge pull request #2348 from FernandoS27/guest-bindless
Implement Bindless Textures on Shader Decompiler and GL backend
2019-04-17 20:59:49 -04:00
bunnei 0cfbd3325b
Merge pull request #2315 from ReinUsesLisp/severity-decompiler
shader_ir/decode: Reduce the severity of common assertions
2019-04-16 22:21:19 -04:00
Fernando Sahmkow ef381e6924 Use ReadBlockUnsafe on TIC and TSC reading
Use ReadBlockUnsafe on TIC and TSC reading as memory is never flushed
from host GPU there.
2019-04-15 23:10:24 -04:00
Fernando Sahmkow 3e96c367bd Use WriteBlock and ReadBlock. 2019-04-15 22:42:34 -04:00
Fernando Sahmkow bec28d692d Implement Block Linear copies in Kepler Memory. 2019-04-15 21:22:16 -04:00
Fernando Sahmkow aa471274d9 Do some corrections in conversion shader instructions.
Corrects encodings for I2F, F2F, I2I and F2I
Implements Immediate variants of all four conversion types.
Add assertions to unimplemented stuffs.
2019-04-15 19:16:27 -04:00
Fernando Sahmkow 8a099ac99f Correct Kepler Memory on Linear Pushes. 2019-04-15 14:51:36 -04:00
ReinUsesLisp 5c280e6ff0 shader_ir: Implement STG, keep track of global memory usage and flush 2019-04-14 00:25:32 -03:00
bunnei 353a099481
Merge pull request #2366 from FernandoS27/xmad-fix
Correct XMAD mode, psl and high_b on different encodings.
2019-04-09 19:15:01 -04:00
Fernando Sahmkow 5c55ae4e18 Correct LOP_IMN encoding 2019-04-08 13:39:12 -04:00
Fernando Sahmkow 16adc735a5 Correct XMAD mode, psl and high_b on different encodings. 2019-04-08 13:01:17 -04:00
Fernando Sahmkow 492040bd9c Move ConstBufferAccessor to Maxwell3d, correct mistakes and clang format. 2019-04-08 11:36:11 -04:00
Fernando Sahmkow 4841440382 Implement TXQ_B 2019-04-08 11:29:52 -04:00
Fernando Sahmkow ac3ba9a33e Corrections to TEX_B 2019-04-08 11:28:44 -04:00
Fernando Sahmkow 7af82ca022 Implement Bindless Handling on SetupTexture 2019-04-08 11:23:46 -04:00
Fernando Sahmkow e28fd3d0a5 Implement Bindless Samplers and TEX_B in the IR. 2019-04-08 11:23:42 -04:00
ReinUsesLisp ddcb711ee8 maxwell_3d: Reduce severity of ProcessSyncPoint 2019-04-06 02:18:20 -03:00
bunnei 864280fabc
Merge pull request #2317 from FernandoS27/sync
Implement SyncPoint Register in the GPU.
2019-04-05 23:50:54 -04:00
Fernando Sahmkow fc91e21206 Implement SyncPoint Register in the GPU. 2019-04-05 19:19:30 -04:00
Lioncash 22f02076c6 video_core/engines: Make memory manager members private
These aren't used externally by anything, so they can be made private
data members.
2019-04-05 18:26:43 -04:00
Lioncash 26223f8124 video_core/engines: Remove unnecessary inclusions where applicable
Replaces header inclusions with forward declarations where applicable
and also removes unused headers within the cpp file. This reduces a few
more dependencies on core/memory.h
2019-04-05 18:26:32 -04:00
ReinUsesLisp 04979560fb shader_ir/memory: Reduce severity of LD_L cache management and log it 2019-04-03 17:12:44 -03:00
ReinUsesLisp 24abeb9a67 shader_ir/memory: Reduce severity of ST_L cache management and log it 2019-04-03 17:12:44 -03:00
bunnei 19330f45d3 maxwell_dma: Check for valid source in destination before copy.
- Avoid a crash in Octopath Traveler.
2019-03-20 22:36:03 -04:00
bunnei 22d3dfbcd4 gpu: Rewrite virtual memory manager using PageTable. 2019-03-20 22:36:02 -04:00
bunnei 574e89d924 video_core: Refactor to use MemoryManager interface for all memory access.
# Conflicts:
#	src/video_core/engines/kepler_memory.cpp
#	src/video_core/engines/maxwell_3d.cpp
#	src/video_core/morton.cpp
#	src/video_core/morton.h
#	src/video_core/renderer_opengl/gl_global_cache.cpp
#	src/video_core/renderer_opengl/gl_global_cache.h
#	src/video_core/renderer_opengl/gl_rasterizer_cache.cpp
2019-03-16 00:38:48 -04:00
bunnei 2eaf6c41a4 gpu: Use host address for caching instead of guest address. 2019-03-14 22:34:42 -04:00
bunnei 633ce92908
Merge pull request #2147 from ReinUsesLisp/texture-clean
shader_ir: Remove "extras" from the MetaTexture
2019-03-10 17:28:36 -04:00
bunnei 7b574f406b gpu: Move command processing to another thread. 2019-03-06 21:48:57 -05:00
Lioncash f9ee0dc7ee video_core/engines: Remove unnecessary includes
Removes a few unnecessary dependencies on core-related machinery, such
as the core.h and memory.h, which reduces the amount of rebuilding
necessary if those files change.

This also uncovered some indirect dependencies within other source
files. This also fixes those.
2019-03-05 20:35:32 -05:00
bunnei f15e2dd881
Merge pull request #2163 from ReinUsesLisp/bitset-dirty
maxwell_3d: Use std::bitset to manage dirty flags
2019-02-27 20:50:08 -05:00
Lioncash b9238edd0d common/math_util: Move contents into the Common namespace
These types are within the common library, so they should be within the
Common namespace.
2019-02-27 03:38:39 -05:00
ReinUsesLisp 5219edd715 maxwell_3d: Use std::bitset to manage dirty flags 2019-02-26 03:01:48 -03:00
ReinUsesLisp 5ca63d0675 shader/decode: Remove extras from MetaTexture 2019-02-26 00:11:30 -03:00
ReinUsesLisp 48e6f77c03 shader/decode: Split memory and texture instructions decoding 2019-02-26 00:11:30 -03:00
bunnei c07987dfab
Merge pull request #2118 from FernandoS27/ipa-improve
shader_decompiler: Improve Accuracy of Attribute Interpolation.
2019-02-24 23:04:22 -05:00
Lioncash a8fa5019b5 video_core: Remove usages of System::GetInstance() within the engines
Avoids the use of the global accessor in favor of explicitly making the
system a dependency within the interface.
2019-02-15 22:06:23 -05:00
Lioncash bd983414f6 core_timing: Convert core timing into a class
Gets rid of the largest set of mutable global state within the core.
This also paves a way for eliminating usages of GetInstance() on the
System class as a follow-up.

Note that no behavioral changes have been made, and this simply extracts
the functionality into a class. This also has the benefit of making
dependencies on the core timing functionality explicit within the
relevant interfaces.
2019-02-15 21:50:25 -05:00
Fernando Sahmkow 10682ad7e0 shader_decompiler: Improve Accuracy of Attribute Interpolation. 2019-02-14 03:25:07 -04:00
bunnei 8135f4bfce
Merge pull request #2110 from lioncash/namespace
core_timing: Rename CoreTiming namespace to Core::Timing
2019-02-12 19:26:37 -05:00
bunnei c440ecfafe
Merge pull request #2104 from ReinUsesLisp/compute-assert
kepler_compute: Fixup assert and rename the engine
2019-02-12 19:24:34 -05:00
Lioncash 48d9d66dc5 core_timing: Rename CoreTiming namespace to Core::Timing
Places all of the timing-related functionality under the existing Core
namespace to keep things consistent, rather than having the timing
utilities sitting in its own completely separate namespace.
2019-02-12 12:42:17 -05:00
Fernando Sahmkow f5ec165e8c Corrected F2I None mode to RoundEven. 2019-02-11 18:46:45 -04:00
ReinUsesLisp 1ddcd0e6f0 kepler_compute: Fixup assert and rename engines
When I originally added the compute assert I used the wrong
documentation. This addresses that.

The dispatch register was tested with homebrew against hardware and is
triggered by some games (e.g. Super Mario Odyssey). What exactly is
missing to get a valid program bound by this engine requires more
investigation.
2019-02-10 19:29:33 -03:00
bunnei dd1aab5446 gl_rasterizer: Implement a more accurate fermi 2D copy.
- This is a blit, use the blit registers.
2019-02-06 21:54:21 -05:00
bunnei 10ab714fe0
Merge pull request #2042 from ReinUsesLisp/nouveau-tex
maxwell_3d: Allow texture handles with TIC id zero
2019-02-06 20:19:20 -05:00
bunnei 72c70d6808
Merge pull request #2081 from ReinUsesLisp/lmem-64
shader_ir/memory: Add LD_L 64 bits loads
2019-02-05 09:17:48 -05:00
bunnei bb4549a73d
Merge pull request #2082 from FernandoS27/txq-stl
Fix TXQ not using the component mask.
2019-02-04 20:22:32 -05:00
Mat M a568cd805b
Update src/video_core/engines/shader_bytecode.h
Co-Authored-By: FernandoS27 <fsahmkow27@gmail.com>
2019-02-03 21:27:26 -04:00
Fernando Sahmkow 0306c50339 Fix TXQ not using the component mask. 2019-02-03 18:17:18 -04:00
ReinUsesLisp 2bdbb90af7 video_core: Assert on invalid GPU to CPU address queries 2019-02-03 04:58:40 -03:00
ReinUsesLisp 04e68e9738 maxwell_3d: Allow sampler handles with TSC id zero 2019-02-03 04:58:40 -03:00
ReinUsesLisp 390721a561 maxwell_3d: Allow texture handles with TIC id zero
Also remove "enabled" field from Tegra::Texture::FullTextureInfo because
it would become unused.
2019-02-03 04:58:24 -03:00
ReinUsesLisp 9feb68085d shader_bytecode: Rename BytesN enums to BitsN 2019-02-03 00:25:40 -03:00
ReinUsesLisp 477d616f7d shader_ir: Unify constant buffer offset values
Constant buffer values on the shader IR were using different offsets if
the access direct or indirect. cbuf34 has a non-multiplied offset while
cbuf36 does. On shader decoding this commit multiplies it by four on
cbuf34 queries.
2019-01-30 02:45:50 -03:00
ReinUsesLisp 3b84e04af1 shader_decode: Implement LDG and basic cbuf tracking 2019-01-30 00:00:15 -03:00
bunnei 1f4ca1e841
Merge pull request #1927 from ReinUsesLisp/shader-ir
video_core: Replace gl_shader_decompiler with an IR based decompiler
2019-01-25 23:42:14 -05:00
ReinUsesLisp 9a82dec74a maxwell_3d: Set rt_separate_frag_data to 1 by default
Commercial games assume that this value is 1 but they never set it. On
the other hand nouveau manually sets this register. On
ConfigureFramebuffers we were asserting for what we are actually
implementing (according to envytools).
2019-01-22 04:14:29 -03:00
ReinUsesLisp a1b845b651 shader_decode: Implement VMAD and VSETP 2019-01-15 17:54:53 -03:00
ReinUsesLisp dd91650aaf shader_decode: Implement HFMA2 2019-01-15 17:54:52 -03:00
ReinUsesLisp 4316eaf75c shader_decode: Fixup clang-format 2019-01-15 17:54:52 -03:00
ReinUsesLisp 15a0e1481d shader_ir: Initial implementation 2019-01-15 17:54:49 -03:00
ReinUsesLisp 294df41b86 shader_bytecode: Fixup encoding 2019-01-15 17:54:49 -03:00
ReinUsesLisp a0c8c16d07 shader_header: Make local memory size getter constant 2019-01-15 17:54:49 -03:00
ReinUsesLisp b683e41fca gl_rasterizer_cache: Use dirty flags for the depth buffer 2019-01-07 16:22:28 -03:00
ReinUsesLisp 179ee963db gl_rasterizer_cache: Use dirty flags for color buffers 2019-01-07 16:20:39 -03:00
ReinUsesLisp 0ab17ab406 gl_shader_cache: Use dirty flags for shaders 2019-01-07 16:13:12 -03:00
ReinUsesLisp aaa0e6c346 shader_bytecode: Fixup TEXS.F16 encoding 2018-12-26 01:35:44 -03:00
David Marcec fdd649e2ef Fixed uninitialized memory due to missing returns in canary
Functions which are suppose to crash on non canary builds usually don't return anything which lead to uninitialized memory being used.
2018-12-19 12:52:32 +11:00
ReinUsesLisp ef061481c5 shader_bytecode: Fixup half float's operator B encoding 2018-12-18 04:28:50 -03:00
heapo 72599cc667 Implement postfactor multiplication/division for fmul instructions 2018-12-17 07:56:25 -08:00
ReinUsesLisp 59a8df1b14 gl_shader_decompiler: Implement TEXS.F16 2018-12-05 02:06:34 -03:00