Commit graph

178 commits

Author SHA1 Message Date
bunnei
f650cf8a9a
Merge pull request #4391 from lioncash/nrvo
video_core: Allow copy elision to take place where applicable
2020-07-24 06:33:09 -07:00
Lioncash
e17fb5ee97 video_core: Remove unused variables
Silences several compiler warnings about unused variables.
2020-07-21 00:57:25 -04:00
Lioncash
6adc824d9d video_core: Allow copy elision to take place where applicable
Removes const from some variables that are returned from functions, as
this allows the move assignment/constructors to execute for them.
2020-07-21 00:36:13 -04:00
David Marcec
468bd9c1b0 async shaders 2020-07-17 14:24:57 +10:00
ReinUsesLisp
9f54cd4dad gl_shader_cache: Avoid use after move for program size
All programs had a size of zero due to this bug, skipping invalidations.

While we are at it, remove some unused forward declarations.
2020-06-23 22:54:42 -03:00
bunnei
798ec003ce
Merge pull request #4041 from ReinUsesLisp/arb-decomp
gl_arb_decompiler: Implement an assembly shader decompiler
2020-06-16 14:56:23 -04:00
ReinUsesLisp
a63a0daa5e gl_arb_decompiler: Implement an assembly shader decompiler
Emit code compatible with NV_gpu_program5.
This should emit code compatible with Fermi, but it wasn't tested on
that architecture. Pascal has some issues not present on Turing GPUs.
2020-06-11 22:12:07 -03:00
ReinUsesLisp
678f95e4f8 vk_pipeline_cache: Use generic shader cache
Trivial port the generic shader cache to Vulkan.
2020-06-07 04:32:57 -03:00
ReinUsesLisp
b96f65b62b gl_shader_cache: Use generic shader cache
Trivially port the generic shader cache to OpenGL.
2020-06-07 04:32:57 -03:00
ReinUsesLisp
ee21e4ecd3 glsl: Squash constant buffers into a single SSBO when we hit the limit
Avoids compilation errors at the cost of shader build times and runtime
performance when a game hits the limit of uniform buffers we can use.
2020-05-31 21:33:49 -03:00
ReinUsesLisp
420cc13248 renderer_opengl: Add assembly program code paths
Add code required to use OpenGL assembly programs based on
NV_gpu_program5. Decompilation for ARB programs is intended to be added
in a follow up commit. This does **not** include ARB decompilation and
it's not in an usable state.

The intention behind assembly programs is to reduce shader stutter
significantly on drivers supporting NV_gpu_program5 (and other required
extensions). Currently only Nvidia's proprietary driver supports these
extensions.

Add a UI option hidden for now to avoid people enabling this option
accidentally.

This code path has some limitations that OpenGL compatibility doesn't
have:
- NV_shader_storage_buffer_object is limited to 16 entries for a single
OpenGL context state (I don't know if this is an intended limitation, an
specification issue or I am missing something). Currently causes issues
on The Legend of Zelda: Link's Awakening.
- NV_parameter_buffer_object can't bind buffers using an offset
different to zero. The used workaround is to copy to a temporary buffer
(this doesn't happen often so it's not an issue).

On the other hand, it has the following advantages:
- Shaders build a lot faster.
- We have control over how floating point rounding is done over
individual instructions (SPIR-V on Vulkan can't do this).
- Operations on shared memory can be unsigned and signed.
- Transform feedbacks are dynamic state (not yet implemented).
- Parameter buffers (uniform buffers) are per stage, matching NVN and
hardware's behavior.
- The API to bind and create assembly programs makes sense, unlike
ARB_separate_shader_objects.
2020-05-19 18:00:04 -03:00
ReinUsesLisp
ddd82ef42b shader/memory_util: Deduplicate code
Deduplicate code shared between vk_pipeline_cache and gl_shader_cache as
well as shader decoder code.

While we are at it, fix a bug in gl_shader_cache where compute shaders
had an start offset of a stage shader.
2020-04-26 01:38:51 -03:00
Fernando Sahmkow
644588fd88 ShaderCache/PipelineCache: Cache null shaders. 2020-04-22 11:36:25 -04:00
Rodrigo Locatti
990c0b184f
Revert "gl_shader_cache: Use CompileDepth::FullDecompile on GLSL" 2020-04-17 17:41:48 -03:00
ReinUsesLisp
453d7419d9 gl_shader_cache: Use CompileDepth::FullDecompile on GLSL
From my testing on a Splatoon 2 shader that takes 3800ms on average to
compile changing to FullDecompile reduces it to 900ms on average.

The shader decoder will automatically fallback to a more naive method if
it can't use full decompile.
2020-04-14 01:34:20 -03:00
Fernando Sahmkow
ea535d9470 Shader/Pipeline Cache: Use VAddr instead of physical memory for addressing. 2020-04-06 09:23:07 -04:00
James Rowe
cf9c94d401 Address review and fix broken yuzu-tester build 2020-03-25 23:32:42 -06:00
James Rowe
282adfc70b Frontend/GPU: Refactor context management
Changes the GraphicsContext to be managed by the GPU core. This
eliminates the need for the frontends to fool around with tricky
MakeCurrent/DoneCurrent calls that are dependent on the settings (such
as async gpu option).

This also refactors out the need to use QWidget::fromWindowContainer as
that caused issues with focus and input handling. Now we use a regular
QWidget and just access the native windowHandle() directly.

Another change is removing the debug tool setting in FrameMailbox.
Instead of trying to block the frontend until a new frame is ready, the
core will now take over presentation and draw directly to the window if
the renderer detects that its hooked by NSight or RenderDoc

Lastly, since it was in the way, I removed ScopeAcquireWindowContext and
replaced it with a simple subclass in GraphicsContext that achieves the
same result
2020-03-24 21:03:42 -06:00
ReinUsesLisp
b1061afed9 gl_shader_decompiler: Add identifier to decompiled code 2020-03-09 18:40:53 -03:00
ReinUsesLisp
e1932351a9 gl_shader_cache: Reduce registry consistency to debug assert
Registry consistency is something that practically can't happen and it
has a measurable runtime cost. Reduce it to a DEBUG_ASSERT.
2020-03-09 18:40:07 -03:00
ReinUsesLisp
0528be5c92 shader/registry: Store graphics and compute metadata
Store information GLSL forces us to provide but it's dynamic state in
hardware (workgroup sizes, primitive topology, shared memory size).
2020-03-09 18:40:07 -03:00
ReinUsesLisp
e8efd5a901 video_core: Rename "const buffer locker" to "registry" 2020-03-09 18:40:06 -03:00
ReinUsesLisp
bd8b9bbcee gl_shader_cache: Rework shader cache and remove post-specializations
Instead of pre-specializing shaders and then post-specializing them,
drop the later and only "specialize" the shader while decoding it.
2020-03-09 18:40:06 -03:00
ReinUsesLisp
f7ec078592 gl_state_tracker: Implement dirty flags for clip distances and shaders 2020-02-28 17:56:42 -03:00
ReinUsesLisp
96ac3d518a gl_rasterizer: Remove dirty flags 2020-02-28 16:39:27 -03:00
Fernando Sahmkow
1e4b6bef6f Shader_IR: Store Bound buffer on Shader Usage 2020-01-24 16:43:29 -04:00
ReinUsesLisp
3ce28342a2 gl_shader_cache: Disable fastmath on Nvidia 2020-01-21 19:08:08 -03:00
Lioncash
f10ea944e0 gl_shader_cache: Remove unused STAGE_RESERVED_UBOS constant
Given this isn't used, this can be removed entirely.
2020-01-14 13:16:52 -05:00
Lioncash
4cd5ad90f3 gl_shader_cache: std::move entries in CachedShader constructor
Avoids several reallocations of std::vector instances where applicable.
2020-01-14 13:14:16 -05:00
Lioncash
15a6840e7a gl_shader_cache: Remove unused entries variable in BuildShader()
Eliminates a few unnecessary constructions of std::vectors.
2020-01-14 13:11:49 -05:00
ReinUsesLisp
1e16023d60
gl_shader_cache: Update commentary for shared memory
Remove false commentary. Not dividing by 4 the size of shared memory is
not a hack; it describes the number of integers, not bytes.

While we are at it sort the generated code to put preprocessor lines on
the top.
2019-12-20 22:51:21 -03:00
ReinUsesLisp
486c6a5316
gl_shader_cache: Remove unused entry in GetPrimitiveDescription 2019-12-20 22:49:30 -03:00
ReinUsesLisp
48e16c4c49
gl_shader_cache: Add missing new-line on emitted GLSL
Add missing new-line. This caused shaders using local memory and shared
memory to inject a preprocessor GLSL line after an expression (resulting
in invalid code).

It looked like this:
shared uint smem[8];#define LOCAL_MEMORY_SIZE 16

It should look like this (addressed by this commit):
shared uint smem[8];
\#define LOCAL_MEMORY_SIZE 16
2019-12-10 23:52:51 -03:00
ReinUsesLisp
894ad74b87
gl_shader_cache: Hack shared memory size
The current shared memory size seems to be smaller than what the game
actually uses. This makes Nvidia's driver consistently blow up; in the
case of FE3H it made it explode on Qt's SwapBuffers while SDL2 worked
just fine. For now keep this hack since it's still progress over the
previous hardcoded shared memory size.
2019-11-22 21:28:49 -03:00
ReinUsesLisp
180417c514
gl_shader_cache: Remove dynamic BaseBinding specialization 2019-11-22 21:28:49 -03:00
ReinUsesLisp
c8a48aacc0
video_core: Unify ProgramType and ShaderStage into ShaderType 2019-11-22 21:28:48 -03:00
ReinUsesLisp
0f23359a44
gl_rasterizer: Bind graphics images to draw commands
Images were not being bound to draw invocations because these would
require a cache invalidation.
2019-11-22 21:28:48 -03:00
ReinUsesLisp
287ae2b9e8
gl_shader_cache: Specialize local memory size for compute shaders
Local memory size in compute shaders was stubbed with an arbitary size.
This commit specializes local memory size from guest GPU parameters.
2019-11-22 21:28:48 -03:00
ReinUsesLisp
dbeb523879
gl_shader_cache: Specialize shared memory size
Shared memory was being declared with an undefined size. Specialize from
guest GPU parameters the compute shader's shared memory size.
2019-11-22 21:28:47 -03:00
ReinUsesLisp
4f5d8e4342
gl_shader_cache: Specialize shader workgroup
Drop the usage of ARB_compute_variable_group_size and specialize compute
shaders instead. This permits compute to run on AMD and Intel
proprietary drivers.
2019-11-22 21:28:47 -03:00
ReinUsesLisp
32c1bc6a67
shader/texture: Deduce texture buffers from locker
Instead of specializing shaders to separate texture buffers from 1D
textures, use the locker to deduce them while they are being decoded.
2019-11-22 21:28:47 -03:00
Fernando Sahmkow
b6f6733131
Merge pull request #3081 from ReinUsesLisp/fswzadd-shuffles
shader: Implement FSWZADD and reimplement SHFL
2019-11-14 10:27:27 -04:00
ReinUsesLisp
bfa973a62b
gl_shader_cache: Fix locker constructors
Properly pass engine when a shader is being constructed from memory.
2019-11-07 20:43:31 -03:00
ReinUsesLisp
3ab0514698
gl_shader_cache: Enable extensions only when available
Silence GLSL compilation warnings.
2019-11-07 20:08:42 -03:00
ReinUsesLisp
08b2b1080a
gl_shader_decompiler: Reimplement shuffles with platform agnostic intrinsics 2019-11-07 20:08:41 -03:00
ReinUsesLisp
78f3e8a757 gl_shader_cache: Implement locker variants invalidation 2019-10-25 09:01:32 -04:00
ReinUsesLisp
ec85648af3 gl_shader_disk_cache: Store and load fast BRX 2019-10-25 09:01:31 -04:00
ReinUsesLisp
7b81ba4d8a gl_shader_decompiler: Move entries to a separate function 2019-10-25 09:01:31 -04:00
Fernando Sahmkow
acd6441134 Shader_Cache: setup connection of ConstBufferLocker 2019-10-25 09:01:29 -04:00
ReinUsesLisp
675f23aedc
shader/image: Implement SULD and remove irrelevant code
* Implement SULD as float.
* Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
2019-09-21 17:32:48 -03:00
ReinUsesLisp
0526bf1895 shader_ir/warp: Implement SHFL 2019-09-17 17:44:07 -03:00
ReinUsesLisp
42e1bb6d46 gl_shader_cache: Remove special casing for geometry shaders
Now that ProgramVariants holds the primitive topology we no longer need
to keep track of individual geometry shaders topologies.
2019-09-04 01:54:43 -03:00
Rodrigo Locatti
4d4f9cc104 video_core: Silent miscellaneous warnings (#2820)
* texture_cache/surface_params: Remove unused local variable

* rasterizer_interface: Add missing documentation commentary

* maxwell_dma: Remove unused rasterizer reference

* video_core/gpu: Sort member declaration order to silent -Wreorder warning

* fermi_2d: Remove unused MemoryManager reference

* video_core: Silent unused variable warnings

* buffer_cache: Silent -Wreorder warnings

* kepler_memory: Remove unused MemoryManager reference

* gl_texture_cache: Add missing override

* buffer_cache: Add missing include

* shader/decode: Remove unused variables
2019-08-30 14:08:00 -04:00
bunnei
a67c4e6e02
Merge pull request #2742 from ReinUsesLisp/fix-texture-buffers
gl_texture_cache: Miscellaneous texture buffer fixes
2019-08-29 15:59:17 -04:00
ReinUsesLisp
4e35177e23 shader_ir: Implement VOTE
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics

Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.

To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:

* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true

ballotARB, also known as "uint64_t(activeThreadsNV())", emits

VOTE.ANY Rd, PT, PT;

on nouveau's compiler. This doesn't match exactly to Nvidia's code

VOTE.ALL Rd, PT, PT;

Which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT on those cases) and not the registers.
2019-08-21 14:50:38 -03:00
bunnei
f601f25bcc
Merge pull request #2734 from ReinUsesLisp/compute-shaders
gl_rasterizer: Implement compute shaders
2019-07-22 11:12:55 -04:00
ReinUsesLisp
87909d327f gl_shader_cache: Fix newline on buffer preprocessor definitions 2019-07-18 01:16:15 -03:00
Fernando Sahmkow
f2e7b29c14 Maxwell3D: Rework the dirty system to be more consistant and scaleable 2019-07-17 17:29:49 -04:00
ReinUsesLisp
2a4044a858 gl_shader_cache: Fix clang-format issues 2019-07-15 20:33:51 -03:00
ReinUsesLisp
56bca83bde gl_shader_cache: Address review commentaries 2019-07-15 17:38:25 -03:00
ReinUsesLisp
bbecd13697 gl_shader_cache: Address CI issues 2019-07-15 17:38:25 -03:00
ReinUsesLisp
725ba6cf63 gl_rasterizer: Implement compute shaders 2019-07-15 17:38:25 -03:00
Fernando Sahmkow
1bdb59fc6e
Merge pull request #2695 from ReinUsesLisp/layer-viewport
gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
2019-07-15 16:28:07 -04:00
Fernando Sahmkow
459fce3a8f shader_ir: propagate shader size to the IR 2019-07-09 08:14:37 -04:00
ReinUsesLisp
c9d886c84e gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
This commit implements gl_ViewportIndex and gl_Layer in vertex and
geometry shaders. In the case it's used in a vertex shader, it requires
ARB_shader_viewport_layer_array. This extension is available on AMD and
Nvidia devices (mesa and proprietary drivers), but not available on
Intel on any platform. At the moment of writing this description I don't
know if this is a hardware limitation or a driver limitation.

In the case that ARB_shader_viewport_layer_array is not available,
writes to these registers on a vertex shader are ignored, with the
appropriate logging.
2019-07-07 20:42:55 -03:00
Zach Hilman
772c86a260
Merge pull request #2601 from FernandoS27/texture_cache
Implement a new Texture Cache
2019-07-05 13:39:13 -04:00
Fernando Sahmkow
3b9d89839d texture_cache: Address Feedback 2019-07-05 09:46:53 -04:00
Zach Hilman
ad50cd7df9 gl_shader_cache: Make CachedShader constructor private
Fixes missing review comments introduced.
2019-07-03 20:39:46 -04:00
Fernando Sahmkow
d1812316e1 texture_cache: Style and Corrections 2019-06-20 21:24:47 -04:00
Fernando Sahmkow
51ba60b27e shader_cache: Correct versioning and size calculation. 2019-06-20 21:38:34 -03:00
ReinUsesLisp
1bf4154e7d gl_shader_decompiler: Implement image binding settings 2019-06-20 21:38:33 -03:00
ReinUsesLisp
007ffbef1c gl_rasterizer: Track texture buffer usage 2019-06-20 21:38:33 -03:00
ReinUsesLisp
4ec8a3df08 gl_shader_cache: Use static constructors for CachedShader initialization 2019-06-07 20:20:22 -03:00
ReinUsesLisp
e72b9044a0 gl_shader_cache: Store a system class and drop global accessors 2019-05-30 14:01:40 -03:00
ReinUsesLisp
ad321564ed gl_shader_cache: Add commentaries explaining the intention in shaders creation 2019-05-30 13:58:38 -03:00
ReinUsesLisp
838b6d2ff8 gl_shader_cache: Flip if condition in GetStageProgram to reduce indentation 2019-05-30 13:56:03 -03:00
ReinUsesLisp
84928e6d67 gl_shader_gen: Always declare extensions after the version declaration
This addresses a bug on geometry shaders where code was being written
before all #extension declarations were done. Ref to #2523
2019-05-27 00:51:35 -03:00
ReinUsesLisp
69215b5a55 gl_shader_cache: Fix clang strict standard build issues 2019-05-20 22:46:05 -03:00
ReinUsesLisp
c03b8c4c19 gl_shader_cache: Use shared contexts to build shaders in parallel 2019-05-20 22:45:55 -03:00
Lioncash
b6408e9671 video_core/renderer_opengl/gl_shader_cache: Correct member initialization order
Silences a -Wreorder warning.
2019-05-09 18:55:47 -04:00
FreddyFunk
1a3ff252a4 Re added new lines at the end of files 2019-04-23 23:19:28 +02:00
unknown
9db2c734c9 gl_shader_disk_cache: Use VectorVfsFile for the virtual precompiled shader cache file 2019-04-23 22:24:23 +02:00
bunnei
4fad91ca45
Merge pull request #2383 from ReinUsesLisp/aoffi-test
gl_shader_decompiler: Disable variable AOFFI on unsupported devices
2019-04-22 22:14:02 -04:00
Fernando Sahmkow
06d1c5a991 Document unsafe versions and add BlockCopyUnsafe 2019-04-16 10:11:35 -04:00
Fernando Sahmkow
6fc562a9aa Use ReadBlockUnsafe for Shader Cache 2019-04-15 23:34:03 -04:00
ReinUsesLisp
f15c59a164 gl_shader_decompiler: Use variable AOFFI on supported hardware 2019-04-14 05:13:19 -03:00
bunnei
2598433f9c
Merge pull request #2354 from lioncash/header
video_core/texures/texture: Remove unnecessary includes
2019-04-09 19:19:41 -04:00
bunnei
f14328bf0a
Merge pull request #2300 from FernandoS27/null-shader
shader_cache: Permit a Null Shader in case of a bad host_ptr.
2019-04-07 17:58:27 -04:00
Fernando Sahmkow
021cd56bc9 Permit a Null Shader in case of a bad host_ptr. 2019-04-07 07:52:01 -04:00
Lioncash
fbf452ab0e video_core/texures/texture: Remove unnecessary includes
Nothing in this header relies on common_funcs or the memory manager.

This gets rid of reliance on indirect inclusions in the OpenGL caches.
2019-04-06 00:03:35 -04:00
bunnei
d7438d067f
Merge pull request #2299 from lioncash/maxwell
gl_shader_manager: Remove reliance on a global accessor within MaxwellUniformData::SetFromRegs()
2019-04-03 21:47:48 -04:00
Lioncash
c1ba3e3d4a gl_shader_manager: Remove unnecessary gl_shader_manager inclusion
This isn't used at all in the OpenGL shader cache, so we can remove it's
include here, meaning one less file needs to be recompiled if any
changes ever occur within that header.

core/memory.h is also not used within this file at all, so we can remove
it as well.
2019-03-28 11:16:25 -04:00
Lioncash
a5fa4b311e video_core: Amend constructor initializer list order where applicable
Specifies the members in the same order that initialization would take
place in.

This also silences -Wreorder warnings.
2019-03-27 12:37:53 -04:00
bunnei
241563d15c gpu: Move GPUVAddr definition to common_types. 2019-03-20 22:36:02 -04:00
bunnei
574e89d924 video_core: Refactor to use MemoryManager interface for all memory access.
# Conflicts:
#	src/video_core/engines/kepler_memory.cpp
#	src/video_core/engines/maxwell_3d.cpp
#	src/video_core/morton.cpp
#	src/video_core/morton.h
#	src/video_core/renderer_opengl/gl_global_cache.cpp
#	src/video_core/renderer_opengl/gl_global_cache.h
#	src/video_core/renderer_opengl/gl_rasterizer_cache.cpp
2019-03-16 00:38:48 -04:00
bunnei
2eaf6c41a4 gpu: Use host address for caching instead of guest address. 2019-03-14 22:34:42 -04:00
ReinUsesLisp
e6a2245304 gl_shader_disk_cache: Use unordered containers 2019-02-06 22:23:41 -03:00
ReinUsesLisp
e147ed4fc0 gl_shader_cache: Fixup GLSL unique identifiers 2019-02-06 22:23:40 -03:00
ReinUsesLisp
eb73247433 gl_shader_cache: Link loading screen with disk shader cache load 2019-02-06 22:23:40 -03:00
ReinUsesLisp
df0f31f44e gl_shader_cache: Set GL_PROGRAM_SEPARABLE to dumped shaders
i965 (and probably all mesa drivers) require GL_PROGRAM_SEPARABLE when using
glProgramBinary. This is probably required by the standard but it's ignored by
permisive proprietary drivers.
2019-02-06 22:23:40 -03:00