Commit graph

717 commits

Author SHA1 Message Date
ReinUsesLisp
9e87193725 video_core: Remove all Core::System references in renderer
Now that the GPU is initialized when video backends are initialized,
it's no longer needed to query components once the game is running: it
can be done when yuzu is booting.

This allows us to pass components between constructors and in the
process remove all Core::System references in the video backend.
2020-09-06 05:28:48 -03:00
ameerj
1b829fbd7a move thread 1/4 count computation into allocate workers method 2020-08-16 12:02:22 -04:00
Morph
e8f22730d1 renderer_opengl: Use 1/4 of all threads for async shader compilation 2020-07-28 05:08:27 -04:00
ReinUsesLisp
a8a2526128 gl_arb_decompiler: Use NV_shader_buffer_{load,store} on assembly shaders
NV_shader_buffer_{load,store} is a 2010 extension that allows GL applications
to use what in Vulkan is known as physical pointers, this is basically C
pointers. On GLASM these is exposed through the LOAD/STORE/ATOM
instructions.

Up until now, assembly shaders were using NV_shader_storage_buffer_object.
These work fine, but have a (probably unintended) limitation that forces
us to have the limit of a single stage for all shader stages. In contrast,
with NV_shader_buffer_{load,store} we can pass GPU addresses to the
shader through local parameters (GLASM equivalent uniform constants, or
push constants on Vulkan). Local parameters have the advantage of being
per stage, allowing us to generate code without worrying about binding
overlaps.
2020-07-18 01:59:57 -03:00
David Marcec
2ba195aa0d Drop max workers from 8->2 for testing 2020-07-17 14:26:15 +10:00
David Marcec
468bd9c1b0 async shaders 2020-07-17 14:24:57 +10:00
Morph
10eca7f651 maxwell_to_gl: Rename VertexType() to VertexFormat() 2020-06-29 11:48:38 -04:00
ReinUsesLisp
41a4090320 gl_rasterizer: Use NV_vertex_buffer_unified_memory for vertex buffer robustness
Switch games are allowed to bind less data than what they use in a
vertex buffer, the expected behavior here is that these values are read
as zero. At the moment of writing this only D3D12, OpenGL and NVN through
NV_vertex_buffer_unified_memory support vertex buffer with a size limit.

In theory this could be emulated on Vulkan creating a new VkBuffer for
each (handle, offset, length) tuple and binding the expected data to it.
This is likely going to be slow and memory expensive when used on the
vertex buffer and we have to do it on all draws because we can't know
without analyzing indices when a game is going to read vertex data out
of bounds.

This is not a problem on OpenGL's BufferAddressRangeNV because it takes
a length parameter, unlike Vulkan's CmdBindVertexBuffers that only takes
buffers and offsets (the length is implicit in VkBuffer). It isn't a
problem on D3D12 either, because D3D12_VERTEX_BUFFER_VIEW on
IASetVertexBuffers takes SizeInBytes as a parameter (although I am not
familiar with robustness on D3D12).

Currently this only implements buffer ranges for vertex buffers,
although indices can also be affected. A KHR_robustness profile is not
created, but Nvidia's driver reads out of bound vertex data as zero
anyway, this might have to be changed in the future.

- Fixes SMO random triangles when capturing an enemy, getting hit, or
looking at the environment on certain maps.
2020-06-24 02:36:14 -03:00
ReinUsesLisp
32485917ba gl_buffer_cache: Mark buffers as resident
Make stream buffer and cached buffers as resident and query their
address. This allows us to use GPU addresses for several proprietary
Nvidia extensions.
2020-06-24 02:36:14 -03:00
bunnei
92021a344c
Merge pull request #4064 from ReinUsesLisp/invalidate-buffers
gl_rasterizer: Mark vertex buffers as dirty after buffer cache invalidation
2020-06-14 00:29:16 -04:00
bunnei
c2ea1e1bcb
Merge pull request #4049 from ReinUsesLisp/separate-samplers
shader/texture: Join separate image and sampler pairs offline
2020-06-13 13:48:27 -04:00
bunnei
5633887569
Merge pull request #3986 from ReinUsesLisp/shader-cache
shader_cache: Implement a generic runtime shader cache
2020-06-12 23:14:48 -04:00
ReinUsesLisp
7646f2c21d gl_rasterizer: Mark vertex buffers as dirty after buffer cache invalidation
Vertex buffers bindings become invalid after the stream buffer is
invalidated. We were originally doing this, but it got lost at some
point.

- Fixes Animal Crossing: New Horizons, but it affects everything.
2020-06-08 20:24:16 -03:00
ReinUsesLisp
b96f65b62b gl_shader_cache: Use generic shader cache
Trivially port the generic shader cache to OpenGL.
2020-06-07 04:32:57 -03:00
ReinUsesLisp
5b2b6d594c shader/texture: Join separate image and sampler pairs offline
Games using D3D idioms can join images and samplers when a shader
executes, instead of baking them into a combined sampler image. This is
also possible on Vulkan.

One approach to this solution would be to use separate samplers on
Vulkan and leave this unimplemented on OpenGL, but we can't do this
because there's no consistent way of determining which constant buffer
holds a sampler and which one an image. We could in theory find the
first bit and if it's in the TIC area, it's an image; but this falls
apart when an image or sampler handle use an index of zero.

The used approach is to track for a LOP.OR operation (this is done at an
IR level, not at an ISA level), track again the constant buffers used as
source and store this pair. Then, outside of shader execution, join
the sample and image pair with a bitwise or operation.

This approach won't work on games that truly use separate samplers in a
meaningful way. For example, pooling textures in a 2D array and
determining at runtime what sampler to use.

This invalidates OpenGL's disk shader cache :)

- Used mostly by D3D ports to Switch
2020-06-05 00:24:51 -03:00
ReinUsesLisp
3d99b449d3 gl_rasterizer: Use NV_transform_feedback for XFB on assembly shaders
NV_transform_feedback, NV_transform_feedback2 and
ARB_transform_feedback3 with NV_transform_feedback interactions allows
implementing transform feedbacks as dynamic state.

Maxwell implements transform feedbacks as dynamic state, so using these
extensions with TransformFeedbackStreamAttribsNV allows us to properly
emulate transform feedbacks without having to recompile shaders when the
state changes.
2020-06-03 20:22:12 -03:00
bunnei
597d8b4bd4
Merge pull request #4006 from ReinUsesLisp/squash-ubos
glsl: Squash constant buffers into a single SSBO when we hit the limit
2020-06-02 14:58:50 -04:00
bunnei
6c0b1a9ee2
Merge pull request #3996 from ReinUsesLisp/front-faces
fixed_pipeline_state,gl_rasterizer: Swap negative viewport checks for front faces
2020-06-01 14:04:35 -04:00
ReinUsesLisp
ee21e4ecd3 glsl: Squash constant buffers into a single SSBO when we hit the limit
Avoids compilation errors at the cost of shader build times and runtime
performance when a game hits the limit of uniform buffers we can use.
2020-05-31 21:33:49 -03:00
Morph
bb8ef38152 gl_device: Enable compute shaders for Intel proprietary drivers
Previously we were disabling compute shaders on Intel's proprietary driver due to broken compute. This has been fixed in the latest Intel drivers. Re-enable compute for Intel proprietary drivers and remove the check for broken compute.
2020-05-31 03:21:07 -04:00
ReinUsesLisp
b17fe82973 gl_texture_cache: Implement small texture view cache for swizzles
This fixes cases where the texture swizzle was applied twice on the same
draw to a texture bound to two different slots.
2020-05-26 17:50:08 -03:00
ReinUsesLisp
606a62d4c7 gl_rasterizer: Port front face flip check from Vulkan
While Vulkan was assuming we had no negative viewports, OpenGL code
was assuming we had them. Port the old code from Vulkan to OpenGL,
checking if the first viewport is negative before flipping faces.

This is not a complete implementation since we only check for the first
viewport to be negative. That said, unless a game is using Vulkan,
OpenGL and NVN games should be fine here, and we can always compare with
our Vulkan backend to see if there's a difference.
2020-05-26 16:33:50 -03:00
bunnei
1adabdac7f
Merge pull request #3905 from FernandoS27/vulkan-fix
Correct a series of crashes and intructions on Async GPU and Vulkan Pipeline
2020-05-24 15:23:38 -04:00
ReinUsesLisp
420cc13248 renderer_opengl: Add assembly program code paths
Add code required to use OpenGL assembly programs based on
NV_gpu_program5. Decompilation for ARB programs is intended to be added
in a follow up commit. This does **not** include ARB decompilation and
it's not in an usable state.

The intention behind assembly programs is to reduce shader stutter
significantly on drivers supporting NV_gpu_program5 (and other required
extensions). Currently only Nvidia's proprietary driver supports these
extensions.

Add a UI option hidden for now to avoid people enabling this option
accidentally.

This code path has some limitations that OpenGL compatibility doesn't
have:
- NV_shader_storage_buffer_object is limited to 16 entries for a single
OpenGL context state (I don't know if this is an intended limitation, an
specification issue or I am missing something). Currently causes issues
on The Legend of Zelda: Link's Awakening.
- NV_parameter_buffer_object can't bind buffers using an offset
different to zero. The used workaround is to copy to a temporary buffer
(this doesn't happen often so it's not an issue).

On the other hand, it has the following advantages:
- Shaders build a lot faster.
- We have control over how floating point rounding is done over
individual instructions (SPIR-V on Vulkan can't do this).
- Operations on shared memory can be unsigned and signed.
- Transform feedbacks are dynamic state (not yet implemented).
- Parameter buffers (uniform buffers) are per stage, matching NVN and
hardware's behavior.
- The API to bind and create assembly programs makes sense, unlike
ARB_separate_shader_objects.
2020-05-19 18:00:04 -03:00
Fernando Sahmkow
0a4be73b9b VideoCore: Use SyncGuestMemory mechanism for Shader/Pipeline Cache invalidation. 2020-05-09 19:25:29 -04:00
ReinUsesLisp
f813cd3ff7 gl_rasterizer: Implement viewport swizzles with NV_viewport_swizzle 2020-05-04 17:51:30 -03:00
bunnei
2aff0b4733
Merge pull request #3808 from ReinUsesLisp/wait-for-idle
{maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers
2020-05-03 02:43:18 -04:00
bunnei
e6b4311178
Merge pull request #3693 from ReinUsesLisp/clean-samplers
shader/texture: Support multiple unknown sampler properties
2020-05-02 00:45:41 -04:00
bunnei
bf3f030a0d
Merge pull request #3807 from ReinUsesLisp/fix-depth-clamp
maxwell_3d: Fix depth clamping register
2020-04-30 13:07:31 -04:00
ReinUsesLisp
fe931ac976 {maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers
Drop MemoryBarrier from the buffer cache and use Maxwell3D's register
WaitForIdle.

To implement this on OpenGL we just call glMemoryBarrier with the
necessary bits.

Vulkan lacks this synchronization primitive, so we set an event and
immediately wait for it. This is not a pretty solution, but it's what
Vulkan can do without submitting the current command buffer to the queue
(which ends up being more expensive on the CPU).
2020-04-28 02:18:12 -03:00
ReinUsesLisp
bb1ed66d99 maxwell_3d: Fix depth clamping register
Using deko3d as reference:
4e47ba0013/source/maxwell/gpu_3d_state.cpp (L42)

We were using bits 3 and 4 to determine depth clamping, but these are
the same both enabled and disabled:

state->depthClampEnable ? 0x101A : 0x181D

The same happens on Nvidia's OpenGL driver, where they do something like
this (default capabilities, GL 4.5 compatibility):

(state & DEPTH_CLAMP) != 0 ? 0x201a : 0x281c

There's always a difference between the first bits in this register, but
bit 11 is consistently disabled on both deko3d/NVN and OpenGL. This
commit changes yuzu's behaviour to use bit 11 to determine depth
clamping.

- Fixes depth issues on Super Mario Odyssey's intro.
2020-04-27 20:50:14 -03:00
ReinUsesLisp
8da16cf9fb texture_cache: Reintroduce preserve_contents accurately
This reverts commit 94b0e2e5da.

preserve_contents proved to be a meaningful optimization. This commit
reintroduces it but properly implemented on OpenGL.

We have to make sure the clear removes all the previous contents of the
image.

It's not currently implemented on Vulkan because we can do smart things
there that's preferred to be introduced in a separate commit.
2020-04-26 19:53:02 -03:00
Rodrigo Locatti
7e38dd580f
Merge pull request #3753 from ReinUsesLisp/ac-vulkan
{gl,vk}_rasterizer: Add lazy default buffer maker and use it for empty buffers
2020-04-26 01:55:43 -03:00
ReinUsesLisp
72deb773fd shader_ir: Turn classes into data structures 2020-04-23 18:00:06 -03:00
Fernando Sahmkow
39e5b72948 Async GPU: Correct flushing behavior to be similar to old async GPU behavior. 2020-04-22 11:36:26 -04:00
Fernando Sahmkow
f616dc0b59 Address Feedback. 2020-04-22 11:36:24 -04:00
Fernando Sahmkow
ec2f3e48e1 Fix GCC error. 2020-04-22 11:36:23 -04:00
Fernando Sahmkow
0649f05900 QueryCache: Implement Async Flushes. 2020-04-22 11:36:18 -04:00
Fernando Sahmkow
131b342130 OpenGL: Guarantee writes to Buffers. 2020-04-22 11:36:18 -04:00
Fernando Sahmkow
1fb516cd97 GPU: Implement Flush Requests for Async mode. 2020-04-22 11:36:17 -04:00
Fernando Sahmkow
b7bc3c2549 FenceManager: Manage syncpoints and rename fences to semaphores. 2020-04-22 11:36:16 -04:00
Fernando Sahmkow
b10db7e4a5 FenceManager: Implement async buffer cache flushes on High settings 2020-04-22 11:36:15 -04:00
Fernando Sahmkow
a081a7c855 GPU: Fix rebase errors. 2020-04-22 11:36:13 -04:00
Fernando Sahmkow
e84eb64e51 Rasterizer: Disable fence managing in synchronous gpu. 2020-04-22 11:36:12 -04:00
Fernando Sahmkow
165ae823f5 ThreadManager: Sync async reads on accurate gpu. 2020-04-22 11:36:12 -04:00
Fernando Sahmkow
1f345ebe3a GPU: Implement a Fence Manager. 2020-04-22 11:36:10 -04:00
Fernando Sahmkow
487379c593 OpenGL: Implement Fencing backend. 2020-04-22 11:36:10 -04:00
Fernando Sahmkow
8b1eb44b3e BufferCache: Implement OnCPUWrite and SyncGuestHost 2020-04-22 11:36:07 -04:00
Fernando Sahmkow
da8f17715d GPU: Refactor synchronization on Async GPU 2020-04-22 11:36:06 -04:00
Fernando Sahmkow
084ceb925a UI: Replasce accurate GPU option for GPU Accuracy Level 2020-04-22 11:36:04 -04:00