* Implement GPU syncpoints
This adds support for GPU syncpoints on the GPU backend & nvservices.
Everything that was implemented here is based on my research,
hardware testing of the GM20B and reverse engineering of nvservices (8.1.0).
Thanks to @fincs for the information about some behaviours of the pusher
and for the initial information about syncpoints.
* syncpoint: address gdkchan's comments
* Add some missing logic to handle SubmitGpfifo correctly
* Handle the NV event API correctly
* evnt => hostEvent
* Finish addressing gdkchan's comments
* nvservices: write the output buffer even when an error is returned
* dma pusher: Implement prefetch barrier
Also fix when the commands should be prefetched.
* Partially fix prefetch barrier
* Add a missing syncpoint check in QueryEvent of NvHostSyncPt
* Address Ac_K's comments and fix GetSyncpoint for ChannelResourcePolicy == Channel
* Fix SyncptWait & SyncptWaitEx command logic
* Address ripinperi's comments
* Address gdkchan's comments
* Move user event management to the control channel
* Fix mm implementation, nvdec works again
* Address ripinperi's comments
* Address gdkchan's comments
* Implement nvhost-ctrl close accurately + make nvservices dispose channels when stopping the emulator
* Fix typo in MultiMediaOperationType
* Implement RasterizeEnable
* Match viewport count to hardware
* Simplify ScissorTest tracking around Blits
* Disable RasterizerDiscard around Blits and track its state
* Read RasterizeEnable reg as bool and add doc
* Only enumerate cached textures that are modified when flushing, rather than all of them.
* Remove locking.
* Add missing clear.
* Remove texture from modified list when data is disposed.
In case the game does not call either flush method at any point.
* Add ReferenceEqualityComparer from jD for the HashSet
* Use correct shader local size instead of a hardcoded size
* Remove unused uniform block
* Update XML doc
* Local memory size has 23 bits on Maxwell
* Generate compute QMD struct from nv open doc header
* Remove dummy arrays when shared or local memory is not used, other improvements
Support shared color masks (used by Nouveau and maybe the NVIDIA
driver).
Support draw buffers (also required by OpenGL).
Support viewport transform disable (disabled for now as it breaks some
games).
Fix instanced rendering draw being ignored for multi draw.
Fix IADD and IADD3 immediate shader encodings, which were not matching
some ops.
Implement FFMA32I shader instruction.
Implement IMAD shader instruction.