* use ArrayPool, avoid 6000-7000 allocs/sec of runtime
* use ArrayPool, avoid ~7k allocs/second during game execution
* use ArrayPool, avoid ~3000 allocs/sec during game execution
* use MemoryPool, reduce 0.5 MB/sec of new allocations during game execution
* avoid over-allocation by setting List<> Capacity when known
* remove LINQ in KTimeManager.UnscheduleFutureInvocation
* KTimeManager - avoid spinning one more time when the time has arrived
* KTimeManager - let SpinWait decide when to Thread.Yield(), and don't SpinOnce() immediately after Thread.Yield()
* use MemoryPool, reduce ~175k bytes/sec allocation during game execution
* IpcService - call commands via dynamic methods instead of reflection .Invoke(). Faster to call and with fewer allocations because parameters can be passed directly instead of as an array
* Make ButtonMappingEntry a record struct to avoid allocations. Set the List<ButtonMappingEntry> capacity according to use.
* add MemoryBuffer type for working with MemoryPool<byte>
* update changes to use MemoryBuffer
* make parameter ReadOnlySpan instead of Span
* whitespace fix
* Revert "IpcService - call commands via dynamic methods instead of reflection .Invoke(). Faster to call and with fewer allocations because parameters can be passed directly instead of as an array"
This reverts commit f2c698bdf65f049e8481c9f2ec7138d9b9a8261d.
* tweak KTimeManager spin behavior
* replace MemoryBuffer with ByteMemoryPool modeled after System.Buffers.ArrayMemoryPool<T>
* make ByteMemoryPoolBuffer responsible for renting memory
* ava: Remove unused doWhileDeferred parameters
* ava: Minimally improve swkbd dialog
It's currently impossible to get the dialog to redirect focus to the InputBox.
* ava: Fix nca extraction dialog never closing
Also contains some minor cleanup
* Added HiddenFileTypes to config state, and check to file enumeration
* Added hiddenfiletypes checkboxes to the UI
* Added Ava version of HiddenFileTypes
* Inverted Hide to Show with file types, minor formatting
* all variables with a reference to 'hidden' is now 'shown'
* one more variable name changed
* review feedback
* added FileTypes extension methof to get the correlating config value
* moved extension method to new folder and file in Ryujinx.Ui.Common
* added default case for ToggleFileType
* changed exception type to OutOfRangeException
* Added check for eventual symlink when displaying game files.
* Moved symlink check logic
* Moved symlink check logic
* Fixed prev commit
---------
Co-authored-by: Daniel Shala <danielshala00@gmail.com>
* hle: Deal with empty titleNames in some languages
* gui: Fix displaying the wrong title name
* Remove unnecessary bounds check
* Fix a NRE when getting the version string
* Restore empty string logic
* Flush in the middle of long command buffers.
* Vulkan: add situational "Fast Flush" mode
The AutoFlushCounter class was added to periodically flush Vulkan command buffers throughout a frame, which reduces latency to the GPU as commands are submitted and processed much sooner. This was done by allowing command buffers to flush when framebuffer attachments changed.
However, some games have incredibly long render passes with a large number of draws, and really aggressive data access that forces GPU sync.
The Vulkan backend could potentially end up building a single command buffer for 4-5ms if a pass has enough draws, such as in BOTW. In the scenario where sync is waited on immediately after submission, this would have to wait for the completion of a much longer command buffer than usual.
The solution is to force command buffer submission periodically in a "fast flush" mode. This will end up splitting render passes, but it will only enable if sync is aggressive enough.
This should improve performance in GPU limited scenarios, or in games that aggressively wait on synchronization. In some games, it may only kick in when res scaling. It won't trigger in games like SMO where sync is not an issue.
Improves performance in Pokemon Scarlet/Violet (res scaled) and BOTW (in general).
* Add conversions in milliseconds next to flush timers.
* ARMeilleure: Move TPIDR_EL0 and TPIDRRO_EL0 to NativeContext
Some games access these system registers several tens of thousands of times in a second from many different threads. While this isn't really crippling, it is a lot of wasted time spent in a reverse pinvoke transition.
Example games are Pokemon Scarlet/Violet and BOTW. These games have a lot of different potential bottlenecks so it's unlikely you will see a consistent improvement, but it definitely disappears from the cpu profile.
* Remove unreachable code.
* Add ulong conversion for offsets
* Nit
This seems to have been removed by the Post-Processing PR, but it is required for the display in OBS to be the right way up and properly scaled.
I've tested this with AA and FSR on MK8D and it seems to behave properly. Testing is welcome.
* ARMeilleure: Respect Fz flag for all floating point operations.
This is a change in strategy for emulating the Fz FPCR flag. Before, it was set before instructions that "needed it" and reset after. However, this missed a few hot instructions like the multiplication instruction, and the entirety of A32.
The new strategy is to set the Fz flag only in the following circumstances:
- Set to match FPCR before translated functions/loop are executed.
- Reset when calling SoftFloat methods, set when returning.
- Reset when exiting execution.
This allows us to remove the code around the existing Fz aware instructions, and get the accuracy benefits on all floating point instructions executed while in translated code.
Single step executions now need to be called with a context wrapper - right now it just contains the Fz flag initialization, and won't actually do anything on ARM.
This fixes a bug in Breath of the Wild where some physics interactions could randomly crash the game due to subnormal values not flushing to zero.
This is draft right now because I need to answer the questions:
- Does dotnet avoid changing the value of Mxcsr?
- Is it a good idea to assume that? Or should the flag set/restore be done on every managed method call, not just softfloat?
- If we assume that, do we want a unit test to verify the behaviour?
I recommend testing a bunch of games, especially games affected when this was originally added, such as #1611.
* Remove unused method
* Use FMA for Fmadd, Fmsub, Fnmadd, Fnmsub, Fmla, Fmls
...when available.
Similar implementation to A32
* Use FMA for Frecps, Frsqrts
* Don't set DAZ.
* Add round mode to ARM FP mode
* Fix mistakes
* Add test for FP state when calling managed methods
* Add explanatory comment to test.
* Cleanup
* Add A64 FPCR flags
* Vrintx_S A32 fast path on A64 backend
* Address feedback 1, re-enable DAZ
* Fix FMA instructions By Elem
* Address feedback
* Redesign use of ISampledData for accessing the SamplingNumber value on input data structs.
* Always read SamplingNumber as little-endian
* Restored field order for SixAxisSensorState. Rework to allow possibility of non-zero offsets for the SamplingNumber field. Set StructLayout Pack=8 - the KeyboardState struct is 4 bytes shorter with any other value.
* fix spelling
Co-authored-by: riperiperi <rhy3756547@hotmail.com>
* set Pack = 1 for ISampledDataStruct types, added Unknown field to KeyboardState
* extend size of KeyboardModifier
---------
Co-authored-by: riperiperi <rhy3756547@hotmail.com>
* vulkan: Move most of the properties enumeration to VulkanPhysicalDevice
That clean up a bit of duplicate logic.
Also move to use an hashset for device extensions.
* vulkan: Move instance querying to VulkanInstance
Also cleanup code to use span when possible instead of unsafe pointers.
* Address gdkchan's comments
* Use index fragment shader output when dual source blend is enabled
* Shader cache version bump
* Actually set DualSourceBlendEnabled to true
* Fix XML doc
---------
Co-authored-by: Ac_K <Acoustik666@gmail.com>
* Fix missing string enum converters for the config
* Revert changing KeyboardHotkeys to struct
This needs to be done because
Avalonia's TwoWay Binding breaks otherwise.
* Use source generated json serializers in order to improve code trimming
* Use strongly typed github releases model to fetch updates instead of raw Newtonsoft.Json parsing
* Use separate model for LogEventArgs serialization
* Make dynamic object formatter static. Fix string builder pooling.
* Do not inherit json version of LogEventArgs from EventArgs
* Fix extra space in object formatting
* Write log json directly to stream instead of using buffer writer
* Rebase fixes
* Rebase fixes
* Rebase fixes
* Enforce block-scoped namespaces in the solution. Convert style for existing code
* Apply suggestions from code review
Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>
* Rebase indent fix
* Fix indent
* Delete unnecessary json properties
* Rebase fix
* Remove overridden json property names as they are handled in the options
* Apply suggestions from code review
Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>
* Use default json options in github api calls
* Indentation and spacing fixes
* Fix json serialization
* Fix missing JsonConverter for config enums
* Add double \n\n after the whole string, not inside join
---------
Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>
* vulkan: Separate debug utils logic from VulkanInitialization
Also checks for VK_EXT_debug_utils existence instead of force enabling it and allow possible error during messenger init
* Address gdkchan's comment
* Use CreateDebugUtilsMessenger Span variant
* HLE: Refactoring of ApplicationLoader
* Fix SDL2 Headless
* Addresses gdkchan feedback
* Fixes LoadUnpackedNca RomFS loading
* remove useless casting
* Cleanup and fixe empty application name
* Remove ProcessInfo
* Fixes typo
* ActiveProcess to ActiveApplication
* Update check
* Clean using.
* Use the correct filepath when loading Homebrew.npdm
* Fix NRE in ProcessResult if MetaLoader is null
* Add more checks for valid processId & return success
* Add missing logging statement for npdm error
* Return result for LoadKip()
* Move error logging out of PFS load extension method
This avoids logging "Could not find Main NCA"
followed by "Loading main..." when trying to start hbl.
* Fix GUIs not checking load results
* Fix style and formatting issues
* Fix formatting and wording
* gtk: Refactor LoadApplication()
---------
Co-authored-by: TSR Berry <20988865+TSRBerry@users.noreply.github.com>
* Rework StdErr-to-log redirection to use built-in FileStream, and do reads asynchronously to avoid hanging the process shutdown.
* set _disposable to false ASAP
* Simplify return statements by using ternary expressions
* Remove a redundant type conversion
* Reduce nesting by inverting "if" statements
* Try to improve code readability by using LINQ and inverting "if" statements
* Try to improve code readability by using LINQ, using ternary expressions, and inverting "if" statements
* Add line breaks to long LINQ
* Add line breaks to long LINQ
* Vulkan: Insert barriers before clears
Newer NVIDIA GPUs seem to be able to start clearing render targets before the last rasterization task is completed, which can cause it to clear a texture while it is being sampled.
This change adds a barrier from read to write when doing a clear, assuming it has been sampled in the past. It could be possible for this to be needed for sample into draw by some GPU, but it's not right now afaik.
This should fix visual artifacts on newer NVIDIA GPUs and driver combos. Contrary to popular belief, Tetris® Effect: Connected is not affected. Testing welcome, hopefully should fix most cases of this and not cost too much performance.
* Visual Studio Moment
* Address feedback
* Address Feedback 2