* Add optimizations related to caller/callee saved registers, thread synchronization and disable tier 0
* Refactoring
* Add a config entry to enable or disable the reg load/store opt.
* Remove unnecessary register state stores for calls when the callee is know
* Rename IoType to VarType
* Enable tier 0 while fixing some perf issues related to tier 0
* Small tweak -- Compile before adding to the cache, to avoid lags
* Add required config entry
* Implement faster address translation and write tracking on the MMU
* Rename MemoryAlloc to MemoryManagement, and other nits
* Support multi-level page tables
* Fix typo
* Reword comment a bit
* Support scalar vector loads/stores on the memory fast path, and minor fixes
* Add missing cast
* Alignment
* Fix VirtualFree function signature
* Change MemoryProtection enum to uint aswell for consistency
* Add fixed-point variant of the UCVTF instruction
* Change encoding of some fixed-point instructions to not allow invalid encodings
* Fix Fcvtzu_Gp_Fixed encoding
* Add SCVTF (fixed-point GP to Scalar) instruction
* Simplify *Fixed encodings
* Implement ARM exclusive load/store with compare exchange insts, and enable multicore by default
* Fix comment typo
* Support Linux and OSX on MemoryAlloc and CompareExchange128, some cleanup
* Use intel syntax on assembly code
* Adjust identation
* Add CPUID check and fix exclusive reservation granule size
* Update schema multicore scheduling default value
* Make the cpu id check code lower case aswell
* Implement speculative translation on the cpu, and change the way how branches to unknown or untranslated addresses works
* Port t0opt changes and other cleanups
* Change namespace from translation related classes to ChocolArm64.Translation, other minor tweaks
* Fix typo
* Translate higher quality code for indirect jumps aswell, and on some cases that were missed when lower quality (tier 0) code was available
* Remove debug print
* Remove direct argument passing optimization, and enable tail calls for BR instructions
* Call delegates directly with Callvirt rather than calling Execute, do not emit calls for tier 0 code
* Remove unused property
* Rename argument on ArmSubroutine delegate
* Implement ARM32 memory instructions: LDM, LDR, LDRB, LDRD, LDRH, LDRSB, LDRSH, STM, STR, STRB, STRD, STRH (immediate and register + immediate variants), implement CMP (immediate and register shifted by immediate variants)
* Rename some opcode classes and flag masks for consistency
* Fix a few suboptimal ARM32 codegen issues, only loads should be considered on decoder when checking if Rt == PC, and only NZCV flags should be considered for comparison optimizations
* Take into account Rt2 for LDRD instructions aswell when checking if the instruction changes PC
* Re-align arm32 instructions on the opcode table
* Remove ARM32 interpreter and add ARM32 support on the translator
* Nits.
* Rename Cond -> Condition
* Align code again
* Rename Data to Alu
* Enable ARM32 support and handle undefined instructions
* Use the IsThumb method to check if its a thumb opcode
* Remove another 32-bits check
* Optimized memory modified check
This was initially in some cases more expensive than plainly sending the data. Now it should have way better performance.
* Small refactoring
* renamed InvalidAccessEventArgs
* Renamed PtPageBits
* Removed ValueRange(set)
They are currently unused and won't be likely to be used in the near future
* Fix and simplify TranslatorCache
* Fix some assignment alignments, remove some unused usings
* Changes to ILEmitter, separate it from ILEmitterCtx
* Rename ILEmitter to ILMethodBuilder
* Rename LdrLit and *_Fix opcodes
* Revert TranslatorCache impl to the more performant one, fix a few issues with it
* Allow EmitOpCode to be called even after everything has been emitted
* Make Emit and AdvanceOpCode private, simplify it a bit now that it starts emiting from the entry point
* Remove unneeded temp use
* Add missing exit call on TestExclusive
* Use better hash
* Implement the == and != operators
* Initial implementation of KProcess
* Some improvements to the memory manager, implement back guest stack trace printing
* Better GetInfo implementation, improve checking in some places with information from process capabilities
* Allow the cpu to read/write from the correct memory locations for accesses crossing a page boundary
* Change long -> ulong for address/size on memory related methods to avoid unnecessary casts
* Attempt at implementing ldr:ro with new KProcess
* Allow BSS with size 0 on ldr:ro
* Add checking for memory block slab heap usage, return errors if full, exit gracefully
* Use KMemoryBlockSize const from KMemoryManager
* Allow all methods to read from non-contiguous locations
* Fix for TransactParcelAuto
* Address PR feedback, additionally fix some small issues related to the KIP loader and implement SVCs GetProcessId, GetProcessList, GetSystemInfo, CreatePort and ManageNamedPort
* Fix wrong check for source pages count from page list on MapPhysicalMemory
* Fix some issues with UnloadNro on ldr:ro
* Better implementation of the DMA pusher, misc fixes
* Remove some debug code
* Correct RGBX8 format
* Add support for linked Texture Sampler Control
* Attempt to fix upside down screen issue
* Change naming convention for Ryujinx project
* Change naming convention for ChocolArm64 project
* Fix NaN
* Remove unneeded this. from Ryujinx project
* Adjust naming from new PRs
* Name changes based on feedback
* How did this get removed?
* Rebasing fix
* Change FP enum case
* Remove prefix from ChocolArm64 classes - Part 1
* Remove prefix from ChocolArm64 classes - Part 2
* Fix alignment from last commit's renaming
* Rename namespaces
* Rename stragglers
* Fix alignment
* Rename OpCode class
* Missed a few
* Adjust alignment
* Timing: Optimize Timestamp Aquisition
Currently, we make use of Environment.TickCount in a number of places. This has some downsides, mainly being that the TickCount is a signed 32-bit integer, and has an effective limit of ~25 days before overflowing and wrapping around. Due to the signed-ness of the value, this also caused issues with negative numbers. This resolves these issues by using a 64-bit tick count obtained from Performance Counters (via the Stopwatch class). This has a beneficial side effect of being significantly more accurate than the TickCount.
* Timing: Rename ElapsedTicks to ElapsedMilliseconds and expose TicksPerX
* Timing: Some style changes
* Timing: Align static variable initialization
* Print stack trace on invalid memory accesses
* Rebased, change code region base address for 39-bits address space, print stack trace on break and undefined instructions too
* Remove unused tracing functionality from the CPU
* GetNsoExecutable -> GetExecutable
* Unsigned comparison
* Re-add cpu tracing
* Config change
* Remove cold methods from the translation cache on the cpu
* Replace lock with try lock, pass new ATranslatorCache instead of ATranslator
* Rebase fixups
* Started to rewrite the thread scheduler
* Add a single core-like scheduling mode, enabled by default
* Clear exclusive monitor on context switch
* Add SetThreadActivity, misc fixes
* Implement WaitForAddress and SignalToAddress svcs, misc fixes
* Misc fixes (on SetActivity and Arbiter), other tweaks
* Rebased
* Add missing null check
* Rename multicore key on config, fix UpdatePriorityInheritance
* Make scheduling data MLQs private
* nit: Ordering
* (Re)Implement format reinterpretation, other changes
* Implement writeback to guest memory, some refactoring
* More refactoring, implement reinterpretation the old way again
* Clean up
* Some fixes on M2MF (old Dma engine), added partial support for P2MF, fix conditional ssy, add Z24S8 zeta format, other fixes
* nit: Formatting
* Address PR feedback