Commit graph

1613 commits

Author SHA1 Message Date
Steveice10 aa6809e2a8
renderer_vulkan: Use no more than target supported version. (#7439) 2024-02-15 19:38:32 -08:00
Steveice10 5e02be75a3
renderer_vulkan: Use getToolPropertiesEXT instead of getToolProperties (#7434)
getToolProperties is not available until Vulkan 1.3; we need to use the EXT version.
2024-02-13 21:43:09 -08:00
GPUCode 106364e01e
video_core: Use source3 when GPU_PREVIOUS is used in first stage (#7411) 2024-02-05 09:53:54 -08:00
GPUCode d5a1bd07f3
glsl_shader_gen: Increase z=0 epsillon (#7408) 2024-02-05 09:53:41 -08:00
GPUCode 480604ec72
glsl_shader_fs_gen: Apply shadow before ambient light (#7404) 2024-01-31 23:29:39 +02:00
merry 63feac6bb3
externals: Update dynarmic to 6.6.1, Update oaknut to 2.0.1 (#7398) 2024-01-30 19:50:39 -08:00
SachinVin 7a4854c519
shader_setup.h: Initialise program_code (#7396) 2024-01-28 06:02:40 -08:00
Steveice10 41fe75acb7
renderer_vulkan: Pass physical device API version to VMA instead of instance version. (#7390) 2024-01-26 16:34:12 +02:00
GPUCode bea863efff
general: Fixes for Tales of the Abyss (#7381)
* geometry_pipeline: Remove unneeded assert

* Has been hw-tested that gs works correctly even when not in exclusive mode

* pica_core: Propagate output_mask to gs

* Has been hw-tested to occur under the same conditions that other uniforms are shared

* regs_shader: Intialize GPUREG_SH_INPUTBUFFER_CONFIG to default value

* Default value verified on hw. Tales of Abyss does not update the number of vertex attributes for the geometry unit and expects it to be 2

* texture_codec: Align buffer sizes to bpp

* Prevents out of bounds texture reads when launching TOA from the HOME menu

* pica_core: Make default value more clear
2024-01-24 19:22:10 +02:00
GPUCode 549fdd0736
pica_core: Propogate vertex uniforms to geometry setup when not in exclusive mode (#7367) 2024-01-24 04:47:08 +02:00
Steveice10 82294425e3
build: Add flags to toggle specific renderer backends. (#7375) 2024-01-21 23:29:46 -08:00
GPUCode 8d82adb3d3
glsl_shader_gen: Remove invariant qualifier (#7376)
* glsl_shader_gen: Remove invariant qualifier

* Causes visual regressions in Pokemon with RADV

* rasterizer_cache: Clear null surface to transparent
2024-01-21 13:39:35 +02:00
GPUCode ca3b2306d5
shader_unit: Intialize temporaries on shader invocation (#7366) 2024-01-20 22:13:31 +02:00
GPUCode 8e87bd606c
glsl_shader_gen: Use epsilon for both ends of NDC range (#7355) 2024-01-20 22:13:16 +02:00
Steveice10 37f0a7484f
renderer_vulkan: Revert vkGetInstanceProcAddr symbol change for MoltenVK. (#7341) 2024-01-12 09:16:04 -08:00
Steveice10 2ce0a9e899
renderer_vulkan: Update to support MoltenVK 1.2.7 (#7335) 2024-01-09 11:33:47 -08:00
Vitor K c8c2beaeff
misc: fix issues pointed out by msvc (#7316)
* do not move constant variables

* applet_manager: avoid possible use after move

* use constant references where pointed out by msvc

* extra_hid: initialize response

* ValidateSaveState: passing slot separately is not necessary

* common: mark HashCombine as nodiscard

* cityhash: remove use of using namespace std

* Prefix all size_t with std::

done automatically by executing regex replace `([^:0-9a-zA-Z_])size_t([^0-9a-zA-Z_])` -> `$1std::size_t$2`
based on 7d8f115

* shared_memory.cpp: fix log error format

* fix compiling with pch off
2024-01-07 12:37:42 -08:00
Steveice10 6069fac76d
video_core: Fix crash when no debug context is provided. (#7324) 2024-01-07 10:29:43 -08:00
Steveice10 f2ee9baec7
core: Eliminate more uses of Core::System::GetInstance(). (#7313) 2024-01-05 12:07:28 -08:00
GPUCode 2bb7f89c30
video_core: Refactor GPU interface (#7272)
* video_core: Refactor GPU interface

* citra_qt: Better debug widget lifetime
2023-12-28 11:46:57 +01:00
Wunk 4d9eedd0d8
video_core/vulkan: Add debug object names (#7233)
* vk_platform: Add `SetObjectName`

Creates a name-info struct and automatically deduces the object handle type using vulkan-hpp's handle trait data.
Supports `string_view` and `fmt` arguments.

* vk_texture_runtime: Use `SetObjectName` for surface handles

Names both the image handle and the image-view.

* vk_stream_buffer: Add debug object names

Names the buffer and its device memory based on its size and type.

* vk_swapchain: Set swapchain handle debug names

Identifies the swapchain images themselves as well as the semaphores

* vk_present_window: Set handle debug names

* vk_resource_pool: Set debug handle names

* vk_blit_helper: Set debug handle names

* vk_platform: Use `VulkanHandleType` concept

Use a new `concept`-type rather than `enable_if`-patterns to restrict
this function to Vulkan handle-types only.
2023-12-08 06:58:47 +02:00
Wunk ea9f522c0c
shader_jit_a64: Use LDP/STP for address registers (#7225)
Move `address_registers` to be earlier in the `UnitState` structure to allow LDP/STP's 7-bit offset to reach these members.

Follow-up of https://github.com/citra-emu/citra/pull/7002#discussion_r1367270804
2023-12-03 05:07:21 -08:00
Wunk 83b329f6e1
video_core/shader: Refactor JIT-Engines into JitEngine type (#7210) 2023-11-26 15:15:36 -08:00
Wunk 68e6a2185d
Fix missing u32 and LOG_TRACE includes (#7207)
This fixes a compile-error with gcc I was getting from
`LOG_TRACE`(`error: ‘LOG_TRACE’ was not declared in this scope`) and
`u32`(`error: ‘u32’ was not declared in this scope`) being used without
their header-files being included.

Not sure how `romfs_reader.cpp` is even compiling when nothing in its
include-tree is refers to those macros.
2023-11-23 15:39:17 -08:00
GPUCode 1dc0fa7bb5
vk_pipeline_cache: Make pipeline cache reads more robust (#7194) 2023-11-22 23:09:12 -08:00
GPUCode 85bd1be852
code: Add texture sampling option (#7118)
* This replaces the nearest neighbour filter that shouldn't have existed in the first place
2023-11-23 02:04:47 +02:00
GPUCode 5733c8681e
vk_pipeline_cache: Move SPIRV emittion to a worker thread (#7170)
* vk_scheduler: Remove RenderpassCache dependency

* vk_pipeline_cache: Move spirv emittion to worker thread
2023-11-20 20:05:35 -08:00
GPUCode 26d5727b19
video_core: Merge tex0 and tex_cube (#7173) 2023-11-17 03:14:10 -08:00
GPUCode d5b50a9fc0
spv_fs_shader_gen: Remove OpTypeSampledImage from texture buffers (#7153) 2023-11-12 22:40:30 -08:00
GPUCode 168f168c33
spv_fs_shader_gen: Implement quaternion correction with barycentric extension (#7152) 2023-11-12 22:40:21 -08:00
Wunk 312068eebf
renderer_vulkan: Optimize descriptor binding (#7142)
For each draw, Citra will rebind all descriptor set slots and may redundantly re-bind descriptor-sets that were already bound. Instead it should only bind the descriptor-sets that have either changed or have had their buffer-offsets changed. This also allows entire calls to `vkCmdBindDescriptorSets` to be removed in the case that nothing has changed between draw calls.
2023-11-12 14:17:38 -08:00
Wunk 831c9c4a38
renderer_vulkan: Import host memory for screenshots (#7132) 2023-11-12 13:02:55 -08:00
merry 271218b733
shader_jit_a64_compiler: Improve MAX, MIN (#7137) 2023-11-11 18:27:01 +05:30
merry 80213bf88f
shader_jit_a64_compiler: Improve Compile_SwizzleSrc (#7136) 2023-11-11 18:26:48 +05:30
Charles Lombardo fa08df21a5
Android UI Overhaul Part 1 (#7108)
* android: Android 14 support

* android: New home UI flow

Port of the yuzu-android home UI with a few Citra specific tweaks.

A few important things to note
- New and existing Citra users will be guided through the new setup flow
- Existing game directory location is discarded and will have to be reselected
- Protections around making sure the user has selected a user directory were reworked to fit this new UI. I removed async directory init and DirectoryStateReceivers and check during MainActivity's onResume callback.
- Removed Citra premium. The light/dark theme is now available for everyone.

* android: New blue app theme

* android: Extend UI into status/navigation bar area

* android: Remove yellow theme specific styles

* android: Disable status/navigation bar contrast enforcement

We handle it ourselves so there's no need to use a contrasty background on the system bars

* android: GPU Driver Manager

Includes a rewrite of FileUtil with some helper functions for the manager

* android: Rework NativeLibrary in Kotlin

Besides the rewrite this cleans up the alert dialogs that are used for system errors. Generally removes unused JNI code and makes things a little more consistent.

* android: Home menu support + downloader

* android: Enable minify and resource shrinking

* android: Remove premium page and expose texture filtering modes

* android: Update AGP to 8.1.2

* android: Don't display emulation in cutout area

We don't currently handle the notch properly in the emulation fragment so just don't render under it for now.

* android: native.cpp ClangFormat fixes

* core: SystemTitles: Include std::optional

Without it, the android build would fail

* vk: android: Properly override GetDriverLibrary

* vk_instance: Blacklist timeline semaphore ext on turnip

* vk_platform: Hardcode apiVersion to VK_API_VERSION_1_3

* android: native: Use const where applicable

* android: native: Array pointer access style fix

* android: Share relevant log

Shares the old log if it exists and you haven't booted a game yet and shares the current log if you have booted a game.

* android: Apply dark theme color for software keyboard text

---------

Co-authored-by: GPUCode <geoster3d@gmail.com>
2023-11-10 15:16:54 -08:00
Steveice10 d4f31bc617
video_core: Fix fragment shader interlock usage on OpenGL. (#7144) 2023-11-10 13:14:52 -08:00
Steveice10 84f9e9a10f
video_core: Perform quaternion correction and interpolation in fragment shader using barycentric extension. (#7126) 2023-11-09 15:23:56 -08:00
GPUCode 7930e1ea86
rasterizer_cache: Avoid dumping render targets (#7130) 2023-11-07 18:13:03 -08:00
Wunk 1d4d421097
shader_jit_a64: Optimize MOVA dest-enable (#7122)
Rather than branching the 3 cases of dest-enablement, just emit a single
move-and-sign-extend instruction for each case.

From this review:
https://github.com/citra-emu/citra/pull/7002#discussion_r1381560584
2023-11-07 11:46:40 -08:00
Wunk 8fe147b8f9
video_core: Use binary memory-literals for memory-sizes (#7127)
Replaces `... * 1024 * 1024` with `_MiB`/`_GiB` literals.
2023-11-06 23:38:54 +02:00
GPUCode 1f6393e7d5
video_core: Refactor GLSL fragment emitter (#7093)
* video_core: Refactor GLSL fragment emitter

* shader: Add back custom normal maps
2023-11-06 12:26:28 -08:00
Wunk e13735b624
video_core: Implement an arm64 shader-jit backend (#7002)
* externals: Add oaksim submodule

Used for emitting ARM64 assembly

* common: Implement aarch64 ABI

Utilize oaknut to implement a stack frame.

* tests: Allow shader-jit tests for x64 and a64

Run the shader-jit tests for both x86_64 and arm64 targets

* video_core: Initialize arm64 shader-jit backend

Passes all current unit tests!

* shader_jit_a64: protect/unprotect memory when jit-ing

Required on MacOS. Memory needs to be fully unprotected and then
re-protected when writing or there will be memory access errors on
MacOS.

* shader_jit_a64: Fix ARM64-Imm overflow

These conditionals were throwing exceptions since the immediate values
were overflowing the available space in the `EOR` instructions. Instead
they are generated from `MOV` and then `EOR`-ed after.

* shader_jit_a64: Fix Geometry shader conditional

* shader_jit_a64: Replace `ADRL` with `MOVP2R`

Fixes some immediate-generation exceptions.

* common/aarch64: Fix CallFarFunction

* shader_jit_a64: Optimize `SantitizedMul`

Co-authored-by: merryhime <merryhime@users.noreply.github.com>

* shader_jit_a64: Fix address register offset behavior

Based on https://github.com/citra-emu/citra/pull/6942
Passes unit tests.

* shader_jit_a64: Fix `RET` address offset

A64 stack is 16-byte aligned rather than 8. So a direct port of the x64
code won't work. Fixes weird branches into invalid memory for any
shaders with subroutines.

* shader_jit_a64: Increase max program size

Tuned for A64 program size.

* shader_jit_a64: Use `UBFX` for extracting loop-state

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Optimize `SUB+CMP` to `SUBS`

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Optimize `CMP+B` to `CBNZ`

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Use `FMOV` for `ONE` vector

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Remove x86-specific documentation

* shader_jit_a64: Use `UBFX` to extract exponent

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Remove redundant MIN/MAX `SRC2`-NaN check

Special handling only needs to check SRC1 for NaN, not SRC2.
It would work as follows in the four possible cases:

No NaN: No special handling needed.
Only SRC1 is NaN: The special handling is triggered because SRC1 is NaN, and SRC2 is picked.
Only SRC2 is NaN: FMAX automatically picks SRC2 because it always picks the NaN if there is one.
Both SRC1 and SRC2 are NaN: The special handling is triggered because SRC1 is NaN, and SRC2 is picked.

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit/tests:: Add catch-stringifier for vec2f/vec3f

* shader_jit/tests: Add Dest Mask unit test

* shader_jit_a64: Fix Dest-Mask `BSL` operand order

Passes the dest-mask unit tests now.

* shader_jit_a64: Use `MOVI` for DestEnable mask

Accelerate certain cases of masking with MOVI as well

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit/tests: Add source-swizzle unit test

This is not expansive. Generating all `4^4` cases seems to make Catch2
crash. So I've added some component-masking(non-reordering) tests based
on the Dest-Mask unit-test and some additional ones to test
broadcasts/splats and component re-ordering.

* shader_jit_a64: Fix swizzle index generation

This was still generating `SHUFPS` indices and not the ones that we wanted for the `TBL` instruction. Passes all unit tests now.

* shader_jit/tests: Add `ShaderSetup` constructor to `ShaderTest`

Rather than using the direct output of `CompileShaderSetup` allow a
`ShaderSetup` object to be passed in directly.  This enabled the ability
emit assembly that is not directly supported by nihstro.

* shader_jit/tests: Add `CALL` unit-test

Tests nested `CALL` instructions to eventually reach an `EX2`
instruction.

EX2 is picked in particular since it is implemented as an even deeper
dispatch and ensures subroutines are properly implemented between `CALL`
instructions and implementation-calls.

* shader_jit_a64: Fix nested `BL` subroutines

`lr` was getting writen over by nested calls to `BL`, causing undefined
behavior with mixtures of `CALL`, `EX2`, and `LG2` instructions.

Each usage of `BL` is now protected with a stach push/pop to preserve
and restore teh `lr` register to allow nested subroutines to work
properly.

* shader_jit/tests: Allocate generated tests on heap

Each of these generated shader-test objects were causing the stack to
overflow.  Allocate each of the generated tests on the heap and use
unique_ptr so they only exist within the life-time of the `REQUIRE`
statement.

* shader_jit_a64: Preserve `lr` register from external function calls

`EMIT` makes an external function call, and should be preserving `lr`

* shader_jit/tests: Add `MAD` unit-test

The Inline Asm version requires an upstream fix:
https://github.com/neobrain/nihstro/issues/68

Instead, the program code is manually configured and added.

* shader_jit/tests: Fix uninitialized instructions

These `union`-type instruction-types were uninitialized, causing tests
to indeterminantly fail at times.

* shader_jit_a64: Remove unneeded `MOV`

Residue from the direct-port of x64 code.

* shader_jit_a64: Use `std::array` for `instr_table`

Add some type-safety and const-correctness around this type as well.

* shader_jit_a64: Avoid c-style offset casting

Add some more const-correctness to this function as well.

* video_core: Add arch preprocessor comments

* common/aarch64: Use X16 as the veneer register

https://developer.arm.com/documentation/102374/0101/Procedure-Call-Standard

* shader_jit/tests: Add uniform reading unit-test

Particularly to ensure that addresses are being properly truncated

* common/aarch64: Use `X0` as `ABI_RETURN`

`X8` is used as the indirect return result value in the case that the
result is bigger than 128-bits. Principally `X0` is the general-case
return register though.

* common/aarch64: Add veneer register note

`LR` is generally overwritten by `BLR` anyways, and would also be a safe
veneer to utilize for far-calls.

* shader_jit_a64: Remove unneeded scratch register from `SanitizedMul`

* shader_jit_a64: Fix CALLU condition

Should be `EQ` not `NE`. Fixes the regression on Kid Icarus.
No known regressions anymore!

---------

Co-authored-by: merryhime <merryhime@users.noreply.github.com>
Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>
2023-11-05 21:40:31 +01:00
Wunk 3218af38d0
renderer_vulkan: Add scissor and viewport to dynamic pipeline state (#7114)
Adds the current viewport and scissor to the dynamic pipeline state to
reduce redundant viewport/scissor assignments in the command buffer.
This greatly reduces the amount of API calls to `vkCmdSetViewport` and
`vkCmdSetScissor` by only emitting the API call when the state actually
changes.
2023-11-05 12:26:09 -08:00
Wunk 1cf64ffaef
vk_stream_buf: Allow dedicated allocations (#7103)
* vk_stream_buf: Avoid protected memory heaps

* Add an "Exclude" argument when finding a memory-type that avoids
  `VK_MEMORY_PROPERTY_PROTECTED_BIT` by default

* vk_stream_buf: Utilize dedicated allocations when preferred by driver

`VK_KHR_dedicated_allocation` is part of the core Vulkan 1.1
specification and should be utilized when `prefersDedicatedAllocation`
is set.
2023-11-05 12:25:59 -08:00
Wunk b10f3d96f5
command_processor: Fix out-of-bounds float-uniform access (#7111)
Addresses:
https://github.com/citra-emu/citra/issues/6696
https://github.com/citra-emu/citra/issues/6871
2023-11-03 03:35:52 -07:00
Wunk ac9d72a95c
vk_texture_runtime: Fix debug scope label lambda-capture (#7102)
`DebugScope` was capturing a `string_view` in a lambda which is only
valid during the scope of this ctor. When the lambda gets invoked at a
later time, it will read undefined garbage. The lambda needs to make a
deep copy of this `string_view` into a `string` so that it is valid by
the time the scheduler invokes this lambda.
2023-11-01 21:30:54 +01:00
GPUCode 36146459f8
renderer_vulkan: Fix screenshots under NVIDIA vulkan (#7082) 2023-10-22 22:53:14 +03:00
GPUCode ef43776c7b
shader: Fix address register offset behavior in x64 Jit (#6942)
* shader: Fix address register offset behavior in x64 Jit

* shader: Remove redundant jump

* tests: Add address register tests

* shader: Remove additional pre-multiplications by 16

* tests: Add catch-stringifier for vec4f

* tests: Format
2023-10-18 19:41:36 +03:00
Steveice10 4c59443ed2
common: Add more robust ZSTD handling. (#7071) 2023-10-15 14:08:29 -07:00
GPUCode 40ba5226c6
vk_instance: Perform vulkan logging as early as possible (#7058) 2023-10-11 15:11:43 -07:00