Commit graph

2262 commits

Author SHA1 Message Date
ReinUsesLisp
a075bbcf36 renderer_opengl: Add assembly program code paths
Add code required to use OpenGL assembly programs based on
NV_gpu_program5. Decompilation for ARB programs is intended to be added
in a follow up commit. This does **not** include ARB decompilation and
it's not in an usable state.

The intention behind assembly programs is to reduce shader stutter
significantly on drivers supporting NV_gpu_program5 (and other required
extensions). Currently only Nvidia's proprietary driver supports these
extensions.

Add a UI option hidden for now to avoid people enabling this option
accidentally.

This code path has some limitations that OpenGL compatibility doesn't
have:
- NV_shader_storage_buffer_object is limited to 16 entries for a single
OpenGL context state (I don't know if this is an intended limitation, an
specification issue or I am missing something). Currently causes issues
on The Legend of Zelda: Link's Awakening.
- NV_parameter_buffer_object can't bind buffers using an offset
different to zero. The used workaround is to copy to a temporary buffer
(this doesn't happen often so it's not an issue).

On the other hand, it has the following advantages:
- Shaders build a lot faster.
- We have control over how floating point rounding is done over
individual instructions (SPIR-V on Vulkan can't do this).
- Operations on shared memory can be unsigned and signed.
- Transform feedbacks are dynamic state (not yet implemented).
- Parameter buffers (uniform buffers) are per stage, matching NVN and
hardware's behavior.
- The API to bind and create assembly programs makes sense, unlike
ARB_separate_shader_objects.
2020-05-19 18:00:04 -03:00
bunnei
b0d4fdb600 Merge pull request #3899 from ReinUsesLisp/float-comparisons
shader_ir: Add separate instructions for ordered and unordered comparisons and fix NE on GLSL
2020-05-13 09:51:14 -04:00
ReinUsesLisp
f39591b20a gl_shader_decompiler: Properly emulate NaN behaviour on NE
"Not equal" operators on GLSL seem to behave as unordered when we expect
an ordered comparison.

Manually emulate this checking for LGE values (numbers, not-NaNs).
2020-05-10 02:59:33 -03:00
Rodrigo Locatti
bab843f4e9 Merge pull request #3839 from Morph1984/r8g8ui
texture: Implement R8G8UI
2020-05-09 05:28:55 -03:00
ReinUsesLisp
d8598a8717 shader_ir: Separate float-point comparisons in ordered and unordered
This allows us to use native SPIR-V instructions without having to
manually check for NAN.
2020-05-09 04:55:15 -03:00
ReinUsesLisp
9054f86f03 gl_rasterizer: Implement viewport swizzles with NV_viewport_swizzle 2020-05-04 17:51:30 -03:00
bunnei
99075400e3 Merge pull request #3808 from ReinUsesLisp/wait-for-idle
{maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers
2020-05-03 02:43:18 -04:00
bunnei
33a218bea2 Merge pull request #3693 from ReinUsesLisp/clean-samplers
shader/texture: Support multiple unknown sampler properties
2020-05-02 00:45:41 -04:00
Morph
6665cd04f1 texture: Implement R8G8UI
- Used by The Walking Dead: The Final Season
2020-04-30 13:19:36 -04:00
bunnei
6a57e5fc49 Merge pull request #3807 from ReinUsesLisp/fix-depth-clamp
maxwell_3d: Fix depth clamping register
2020-04-30 13:07:31 -04:00
bunnei
e043a955f1 Merge pull request #3799 from ReinUsesLisp/iadd-cc
shader: Implement P2R CC, IADD Rd.CC and IADD.X
2020-04-30 12:56:36 -04:00
bunnei
2765bb73a2 Merge pull request #3805 from ReinUsesLisp/preserve-contents
texture_cache: Reintroduce preserve_contents accurately
2020-04-30 12:56:19 -04:00
bunnei
510842d827 Merge pull request #3784 from ReinUsesLisp/shader-memory-util
shader/memory_util: Deduplicate code
2020-04-28 12:05:50 -04:00
ReinUsesLisp
8835d40024 {maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers
Drop MemoryBarrier from the buffer cache and use Maxwell3D's register
WaitForIdle.

To implement this on OpenGL we just call glMemoryBarrier with the
necessary bits.

Vulkan lacks this synchronization primitive, so we set an event and
immediately wait for it. This is not a pretty solution, but it's what
Vulkan can do without submitting the current command buffer to the queue
(which ends up being more expensive on the CPU).
2020-04-28 02:18:12 -03:00
ReinUsesLisp
b82e61dff0 maxwell_3d: Fix depth clamping register
Using deko3d as reference:
4e47ba0013/source/maxwell/gpu_3d_state.cpp (L42)

We were using bits 3 and 4 to determine depth clamping, but these are
the same both enabled and disabled:

state->depthClampEnable ? 0x101A : 0x181D

The same happens on Nvidia's OpenGL driver, where they do something like
this (default capabilities, GL 4.5 compatibility):

(state & DEPTH_CLAMP) != 0 ? 0x201a : 0x281c

There's always a difference between the first bits in this register, but
bit 11 is consistently disabled on both deko3d/NVN and OpenGL. This
commit changes yuzu's behaviour to use bit 11 to determine depth
clamping.

- Fixes depth issues on Super Mario Odyssey's intro.
2020-04-27 20:50:14 -03:00
ReinUsesLisp
8e3af5d3ca texture_cache: Reintroduce preserve_contents accurately
This reverts commit 1f1e80c67d.

preserve_contents proved to be a meaningful optimization. This commit
reintroduces it but properly implemented on OpenGL.

We have to make sure the clear removes all the previous contents of the
image.

It's not currently implemented on Vulkan because we can do smart things
there that's preferred to be introduced in a separate commit.
2020-04-26 19:53:02 -03:00
Rodrigo Locatti
0f7a89c2ef Merge pull request #3753 from ReinUsesLisp/ac-vulkan
{gl,vk}_rasterizer: Add lazy default buffer maker and use it for empty buffers
2020-04-26 01:55:43 -03:00
ReinUsesLisp
9b433b2467 shader/memory_util: Deduplicate code
Deduplicate code shared between vk_pipeline_cache and gl_shader_cache as
well as shader decoder code.

While we are at it, fix a bug in gl_shader_cache where compute shaders
had an start offset of a stage shader.
2020-04-26 01:38:51 -03:00
ReinUsesLisp
7e8f51273c shader/arithmetic_integer: Implement CC for IADD 2020-04-25 22:55:26 -03:00
ReinUsesLisp
c9b4c56d69 shader_ir: Turn classes into data structures 2020-04-23 18:00:06 -03:00
Fernando Sahmkow
e211e30093 GL_Fence_Manager: use GL_TIMEOUT_IGNORED instead of a loop, 2020-04-22 20:34:32 -04:00
Fernando Sahmkow
491aea4a91 Async GPU: Correct flushing behavior to be similar to old async GPU behavior. 2020-04-22 11:36:26 -04:00
Fernando Sahmkow
d9f1d5a4fd ShaderCache/PipelineCache: Cache null shaders. 2020-04-22 11:36:25 -04:00
Fernando Sahmkow
ea522da8b5 Address Feedback. 2020-04-22 11:36:24 -04:00
Fernando Sahmkow
ae2b3f2b64 Fix GCC error. 2020-04-22 11:36:23 -04:00
Fernando Sahmkow
3769318042 QueryCache: Implement Async Flushes. 2020-04-22 11:36:18 -04:00
Fernando Sahmkow
1966f1d948 OpenGL: Guarantee writes to Buffers. 2020-04-22 11:36:18 -04:00
Fernando Sahmkow
7986c97ed2 GPU: Implement Flush Requests for Async mode. 2020-04-22 11:36:17 -04:00
Fernando Sahmkow
af9f901764 FenceManager: Manage syncpoints and rename fences to semaphores. 2020-04-22 11:36:16 -04:00
Fernando Sahmkow
967f5cec17 FenceManager: Implement async buffer cache flushes on High settings 2020-04-22 11:36:15 -04:00
Fernando Sahmkow
2ee68ad8e4 GPU: Fix rebase errors. 2020-04-22 11:36:13 -04:00
Fernando Sahmkow
b2787048d1 Rasterizer: Disable fence managing in synchronous gpu. 2020-04-22 11:36:12 -04:00
Fernando Sahmkow
e7195b5f87 ThreadManager: Sync async reads on accurate gpu. 2020-04-22 11:36:12 -04:00
Fernando Sahmkow
be8742e286 GPU: Implement a Fence Manager. 2020-04-22 11:36:10 -04:00
Fernando Sahmkow
802fabe3ab OpenGL: Implement Fencing backend. 2020-04-22 11:36:10 -04:00
Fernando Sahmkow
de53bc96c0 BufferCache: Implement OnCPUWrite and SyncGuestHost 2020-04-22 11:36:07 -04:00
Fernando Sahmkow
c689dc6804 GPU: Refactor synchronization on Async GPU 2020-04-22 11:36:06 -04:00
Fernando Sahmkow
c213fd218b UI: Replasce accurate GPU option for GPU Accuracy Level 2020-04-22 11:36:04 -04:00
bunnei
e1fd985d73 Merge pull request #3714 from lioncash/copies
gl_shader_decompiler: Avoid copies where applicable
2020-04-21 20:16:02 -04:00
ReinUsesLisp
b33a0c0d5f gl_rasterizer: Fix buffers without size
On NVN buffers can be enabled but have no size. According to deko3d and
the behavior we see in Animal Crossing: New Horizons these buffers get
the special address of 0x1000 and limit themselves to 0xfff.

Implement buffers without a size by binding a null buffer to OpenGL
without a side.

1d1930beea/source/maxwell/gpu_3d_vbo.cpp (L62-L63)
2020-04-21 19:55:44 -03:00
Mat M
63fb1421a5 Merge pull request #3716 from bunnei/fix-another-impl-fallthrough
video_core: gl_shader_decompiler: Fix implicit fallthrough errors.
2020-04-18 15:17:52 -04:00
bunnei
6613cbfc35 video_core: gl_shader_decompiler: Fix implicit fallthrough errors. 2020-04-18 15:15:21 -04:00
Lioncash
e302cb8c36 gl_shader_decompiler: Avoid copies where applicable
Avoids unnecessary reference count increments where applicable and also
avoids reallocating a vector.

Unlikely to make a huge difference, but given how trivial of an
amendment it is, why not?
2020-04-17 20:48:52 -04:00
Markus Wick
d8d02fa184 video_code: Fix implicit switch fallthrough.
Since yesterday, this breaks the build on linux.
So let's fix it.
2020-04-17 23:43:35 +02:00
Rodrigo Locatti
aed8e57a1b Revert "gl_shader_cache: Use CompileDepth::FullDecompile on GLSL" 2020-04-17 17:41:48 -03:00
bunnei
d392c4e552 Merge pull request #3682 from lioncash/uam
gl_query_cache: Resolve use-after-move in CachedQuery move assignment operator
2020-04-17 01:24:08 -04:00
bunnei
7a4ed2581d Merge pull request #3673 from lioncash/extra
CMakeLists: Specify -Wextra on linux builds
2020-04-16 21:12:33 -04:00
Fernando Sahmkow
7a9b83b658 Merge pull request #3600 from ReinUsesLisp/no-pointer-buf-cache
buffer_cache: Return handles instead of pointer to handles
2020-04-16 19:58:13 -04:00
ReinUsesLisp
c1ad40a3cb buffer_cache: Return handles instead of pointer to handles
The original idea of returning pointers is that handles can be moved.
The problem is that the implementation didn't take that in mind and made
everything harder to work with. This commit drops pointer to handles and
returns the handles themselves. While it is still true that handles can
be invalidated, this way we get an old handle instead of a dangling
pointer.

This problem can be solved in the future with sparse buffers.
2020-04-16 02:33:34 -03:00
Lioncash
5e32ba4080 gl_query_cache: Resolve use-after-move in CachedQuery move assignment operator
Avoids potential invalid junk data from being read.
2020-04-15 22:20:06 -04:00