The following code is broken on AMD's proprietary GLSL compiler:
```glsl
uint idx = ...;
vec4 values = ...;
float some_value = values[idx & 3];
```
It index the wrong components, to fix this the following pessimized code
is emitted when that bug is present:
```glsl
uint idx = ...;
vec4 values = ...;
float some_value;
if ((idx & 3) == 0) some_value = values.x;
if ((idx & 3) == 1) some_value = values.y;
if ((idx & 3) == 2) some_value = values.z;
if ((idx & 3) == 3) some_value = values.w;
```
Component indexing on AMD's proprietary driver is broken. This commit adds
a test to detect when we are on a driver that can't successfully manage
component indexing.
It dispatches a dummy draw with just one vertex shader that writes to an
indexed SSBO from the GPU with data sent through uniforms, it then reads
that data from the CPU and compares the expected output.
nullptr was being returned in the error case, which, at a glance may
seem perfectly OK... until you realize that std::string has the
invariant that it may not be constructed from a null pointer. This
means that if this error case was ever hit, then the application would
most likely crash from a thrown exception in std::string's constructor.
Instead, we can change the function to return an optional value,
indicating if a failure occurred.
Makes the parameter ordering consistent, and also makes the filename
parameter a std::string. A std::string would be constructed anyways with
the previous code, as IOFile's only constructor with a filepath is one
taking a std::string.
We can also make WriteStringToFile's string parameter utilize a
std::string_view for the string, making use of our previous changes to
IOFile.
We don't need to force the usage of a std::string here, and can instead
use a std::string_view, which allows writing out other forms of strings
(e.g. C-style strings) without any unnecessary heap allocations.
This allows for forming comment nodes without making unnecessary copies
of the std::string instance.
e.g. previously:
Comment(fmt::format("Base address is c[0x{:x}][0x{:x}]",
cbuf->GetIndex(), cbuf_offset));
Would result in a copy of the string being created, as CommentNode()
takes a std::string by value (a const ref passed to a value parameter
results in a copy).
Now, only one instance of the string is ever moved around. (fmt::format
returns a std::string, and since it's returned from a function by value,
this is a prvalue (which can be treated like an rvalue), so it's moved
into Comment's string parameter), we then move it into the CommentNode
constructor, which then moves the string into its member variable).
Amends cases where we were using things that were indirectly being
satisfied through other headers. This way, if those headers change and
eliminate dependencies on other headers in the future, we don't have
cascading compilation errors.
Previously, the code was accumulating data into a std::vector and then
tossing all of it away if a setting was disabled.
Instead, we can just check if it's disabled and do no work at all if
possible. If it's enabled, then we can append to the vector and
allocate.
Unlikely to impact usage much, but it is slightly less sloppy with
resources.
A few of the aoc service stubs/implementations weren't fully popping all
of the parameters passed to them. This ensures that all parameters are
popped and, at minimum, logged out.
Given the array is a private static array, we can just make it
internally linked to hide it from external code. This also allows us to
remove an inclusion within the header.
SMDH is a metadata format used in some executable formats for the
Nintendo 3DS. Switch executables don't utilize this metadata format, so
this just a holdover from Citra and can be corrected.
Allows the loading screen code to compile with implicit string
conversions disabled.
While we're at it remove unnecessary const usages, and add it to nearby
variables where appropriate.
Gets rid of the need to special-case brace handling depending on the
overload used, and makes it consistent across the board with how fmt
handles them.
Strings with compile-time deducible strings are directly forwarded to
std::string's constructor, so we don't need to worry about the
performance difference here, as it'll be identical.
In a lot of places throughout the decompiler, string concatenation via
operator+ is used quite heavily. This is usually fine, when not heavily
used, but when used extensively, can be a problem. operator+ creates an
entirely new heap allocated temporary string and given we perform
expressions like:
std::string thing = a + b + c + d;
this ends up with a lot of unnecessary temporary strings being created
and discarded, which kind of thrashes the heap more than we need to.
Given we utilize fmt in some AddLine calls, we can make this a part of
the ShaderWriter's API. We can make an overload that simply acts as a
passthrough to fmt.
This way, whenever things need to be appended to a string, the operation
can be done via a single string formatting operation instead of
discarding numerous temporary strings. This also has the benefit of
making the strings themselves look nicer and makes it easier to spot
errors in them.
Many of these constructors don't even need to be templated. The only
ones that need to be templated are the ones that actually make use of
the parameter pack.
Even then, since std::vector accepts an initializer list, we can supply
the parameter pack directly to it instead of creating our own copy of
the list, then copying it again into the std::vector.
Given the class contains quite a lot of non-trivial types, place the
constructor and destructor within the cpp file to avoid inlining
construction and destruction code everywhere the class is used.
Avoids performing copies into the pair being returned. Instead, we can
just move the resources into the pair, avoiding the need to make copies
of both the std::string and ShaderEntries struct.
Given the offset is assigned a fixed value in the constructor, we can
just assign it directly and get rid of the need to write the name of the
variable again in the constructor initializer list.
Given the disk shader cache contains non-trivial types, we should
default it in the cpp file in order to prevent inlining of the
complex destruction logic.
The standard library expects hash specializations that don't throw
exceptions. Make this explicit in the type to allow selection of better
code paths if possible in implementations.
We don't need to load the code into a vector and then construct a string
over the data. We can just create a string with the necessary size ahead
of time, and read the data directly into it, getting rid of an
unnecessary heap allocation.
std::move does nothing when applied to a const variable. Resources can't
be moved if the object is immutable. With this change, we don't end up
making several unnecessary heap allocations and copies.
Booleans don't have a guaranteed size, but we still want to have them
integrate into the disk cache system without needing to actually use a
different type. We can do this by supplying non-template overloads for
the bool type.
Non-template overloads always have precedence during function
resolution, so this is safe to provide.
This gets rid of the need to smatter ternary conditionals, as well as
the need to use u8 types to store the value in.
These are only used from within this translation unit, so they don't
need to have external linkage. They were intended to be marked with this
anyways to be consistent with the other service functions.
Renames the members to more accurately indicate what they signify.
"OneShot" and "Sticky" are kind of ambiguous identifiers for the reset
types, and can be kind of misleading. Automatic and Manual communicate
the kind of reset type in a clearer manner. Either the event is
automatically reset, or it isn't and must be manually cleared.
The "OneShot" and "Sticky" terminology is just a hold-over from Citra
where the kernel had a third type of event reset type known as "Pulse".
Given the Switch kernel only has two forms of event reset types, we
don't need to keep the old terminology around anymore.
This reduces the boilerplate that services have to write out the current thread explicitly. Using current thread instead of client thread is also semantically incorrect, and will be a problem when we implement multicore (at which time there will be multiple current threads)
Nvidia's proprietary driver creates a real OpenGL compatibility profile
without this option, meanwhile Intel (and probably AMD, I haven't tested
it) require that QSurfaceFormat::FormatOption::DeprecatedFunctions is
explicitly enabled.
This was reduced due to happening on most games and at such constant
rate that it affected performance heavily for the end user. In general,
we are well aware of the assert and an implementation is already
planned.
Avoids inlining destruction logic where applicable, and also makes
forward declarations not cause unexpected compilation errors depending
on where the State class is used.
Lessens the amount of code that needs to be read, and gets rid of the
need to introduce an indexing variable. Instead, we just operate on the
objects directly.
std::memset is used to clear the entire register structure, which
requires that the Regs struct be trivially copyable (otherwise undefined
behavior is invoked). This prevents the case where a non-trivial type is
potentially added to the struct.
std::move within a copy constructor (on a data member that isn't
mutable) will always result in a copy. Because of that, the behavior of
this copy constructor is identical to the one that would be generated
automatically by the compiler, so we can remove it.
This corrects cases where it was possible to write more entries into the
write buffer than were requested. Now, we check the size of the buffer
before actually writing into them.
We were also returning the wrong value for
GetAvailableLanguageCodeCount2(). This was previously returning 64, but
only 17 should have been returned. 64 entries is the size of the static
array used in MakeLanguageCode() within the service binary itself, but
isn't the actual total number of language codes present.
Makes the class less surprising when it comes to forward declaring the
type, and also prevents inlining the destruction code of the class,
given it contains non-trivial types.
These are able to be omitted from the declaration of functions, since
they don't do anything at the type system level. The definitions of the
functions can retain the use of const though, since they make the
variables immutable in the implementation of the function where they're
used.