Commit graph

2573 commits

Author SHA1 Message Date
Lioncash
f6bb905182 common/telemetry: Migrate namespace into the Common namespace
Migrates the Telemetry namespace into the Common namespace to make the
code consistent with the rest of our common code.
2020-08-18 15:08:32 -04:00
bunnei
56c6a5def8
Merge pull request #4535 from lioncash/fileutil
common/fileutil: Convert namespace to Common::FS
2020-08-17 22:35:30 -04:00
ameerj
1b829fbd7a move thread 1/4 count computation into allocate workers method 2020-08-16 12:02:22 -04:00
Lioncash
c4ed791164 common/fileutil: Convert namespace to Common::FS
Migrates a remaining common file over to the Common namespace, making it
consistent with the rest of common files.

This also allows for high-traffic FS related code to alias the
filesystem function namespace as

namespace FS = Common::FS;

for more concise typing.
2020-08-16 06:52:40 -04:00
Lioncash
1ee060ca0d common/compression: Roll back std::span changes
Seems like all compilers don't support std::span yet.
2020-08-15 17:17:56 -04:00
bunnei
feb243b08d
Merge pull request #4416 from lioncash/span
lz4_compression/zstd_compression: Make use of std::span in interfaces
2020-08-15 00:53:11 -04:00
Lioncash
c8135b3c18 gl_shader_disk_cache: Make use of std::nullopt where applicable
Allows the compiler to avoid unnecessarily zeroing out the internal
buffer of std::optional on some implementations.
2020-08-14 08:20:44 -04:00
Morph
e0ff98dd34 gl_shader_cache: Use std::max() for determining num_workers
Does not allocate more threads than available in the host system for boot-time shader compilation and always allocates at least 1 thread if hardware_concurrency() returns 0.
2020-08-12 09:23:34 -04:00
Morph
e8f22730d1 renderer_opengl: Use 1/4 of all threads for async shader compilation 2020-07-28 05:08:27 -04:00
Lioncash
c5bdccfecb zstd_compression: Make use of std::span in interfaces
Allows condensing the data and size parameters into a single argument.
2020-07-25 03:11:56 -04:00
bunnei
f650cf8a9a
Merge pull request #4391 from lioncash/nrvo
video_core: Allow copy elision to take place where applicable
2020-07-24 06:33:09 -07:00
bunnei
1d7de0a8ee
Merge pull request #4394 from lioncash/unused6
video_core: Remove unused variables
2020-07-23 19:54:59 -07:00
Rodrigo Locatti
7278c59d70
Merge pull request #4359 from ReinUsesLisp/clamp-shared
renderer_{opengl,vulkan}: Clamp shared memory to host's limit
2020-07-21 04:51:05 -03:00
Rodrigo Locatti
721e6015a8
Merge pull request #4360 from ReinUsesLisp/glasm-bar
gl_arb_decompiler: Execute BAR even when inside control flow
2020-07-21 04:50:55 -03:00
Lioncash
e17fb5ee97 video_core: Remove unused variables
Silences several compiler warnings about unused variables.
2020-07-21 00:57:25 -04:00
Lioncash
6adc824d9d video_core: Allow copy elision to take place where applicable
Removes const from some variables that are returned from functions, as
this allows the move assignment/constructors to execute for them.
2020-07-21 00:36:13 -04:00
bunnei
3d13d7f48f
Merge pull request #4324 from ReinUsesLisp/formats
video_core: Fix, add and rename pixel formats
2020-07-21 00:13:04 -04:00
ReinUsesLisp
a8a2526128 gl_arb_decompiler: Use NV_shader_buffer_{load,store} on assembly shaders
NV_shader_buffer_{load,store} is a 2010 extension that allows GL applications
to use what in Vulkan is known as physical pointers, this is basically C
pointers. On GLASM these is exposed through the LOAD/STORE/ATOM
instructions.

Up until now, assembly shaders were using NV_shader_storage_buffer_object.
These work fine, but have a (probably unintended) limitation that forces
us to have the limit of a single stage for all shader stages. In contrast,
with NV_shader_buffer_{load,store} we can pass GPU addresses to the
shader through local parameters (GLASM equivalent uniform constants, or
push constants on Vulkan). Local parameters have the advantage of being
per stage, allowing us to generate code without worrying about binding
overlaps.
2020-07-18 01:59:57 -03:00
David Marcec
2ba195aa0d Drop max workers from 8->2 for testing 2020-07-17 14:26:15 +10:00
David Marcec
85d7a8f466 Rebase for per game settings 2020-07-17 14:26:14 +10:00
David Marcec
468bd9c1b0 async shaders 2020-07-17 14:24:57 +10:00
ReinUsesLisp
88e57b13e0 gl_arb_decompiler: Execute BAR even when inside control flow
Unlike GLSL, GLASM allows us to call BAR inside control flow.

- Fixes graphical artifacts in Paper Mario.
2020-07-16 16:05:52 -03:00
ReinUsesLisp
a5a72cbd20 renderer_{opengl,vulkan}: Clamp shared memory to host's limit
This stops shaders from failing to build when the exceed host's shared
memory size limit. An error is logged.
2020-07-16 16:02:46 -03:00
ReinUsesLisp
fbc232426d video_core: Rearrange pixel format names
Normalizes pixel format names to match Vulkan names. Previous to this
commit pixel formats had no convention, leading to confusion and
potential bugs.
2020-07-13 01:44:23 -03:00
ReinUsesLisp
eda37ff26b video_core: Fix DXT4 and RGB565 2020-07-13 01:01:09 -03:00
ReinUsesLisp
480850ffe7 video_core: Fix B5G6R5_UNORM render target format 2020-07-13 01:01:09 -03:00
ReinUsesLisp
990b14f181 video_core: Fix B5G6R5U 2020-07-13 01:01:09 -03:00
ReinUsesLisp
1d20aac795 video_core: Implement RGBA32_SINT render target 2020-07-13 01:01:09 -03:00
ReinUsesLisp
9338599d72 video_core: Implement RGBA32_SINT render target 2020-07-13 01:01:09 -03:00
ReinUsesLisp
95c0f5afe5 video_core: Implement RGBA16_SINT render target 2020-07-13 01:01:09 -03:00
ReinUsesLisp
977d6c46f3 video_core: Implement RGBA8_SINT render target 2020-07-13 01:01:09 -03:00
ReinUsesLisp
50c6030a8d video_core: Implement RG32_SINT render target 2020-07-13 01:01:09 -03:00
ReinUsesLisp
e849d68048 video_core: Implement RG8_SINT render target and fix RG8_UINT 2020-07-13 01:01:09 -03:00
ReinUsesLisp
f29fede49c video_core: Implement R8_SINT render target 2020-07-13 01:01:08 -03:00
ReinUsesLisp
fd33e996e0 video_core: Implement R8_SNORM render target 2020-07-13 01:01:08 -03:00
lat9nq
63d23835ef
configuration: implement per-game configurations (#4098)
* Switch game settings to use a pointer

In order to add full per-game settings, we need to be able to tell yuzu to switch
to using either the global or game configuration. Using a pointer makes it easier
to switch.

* configuration: add new UI without changing existing funcitonality

The new UI also adds General, System, Graphics, Advanced Graphics,
and Audio tabs, but as yet they do nothing. This commit keeps yuzu
to the same functionality as originally branched.

* configuration: Rename files

These weren't included in the last commit. Now they are.

* configuration: setup global configuration checkbox

Global config checkbox now enables/disables the appropriate tabs in the game
properties dialog. The use global configuration setting is now saved to the
config, defaulting to true. This also addresses some changes requested in the PR.

* configuration: swap to per-game config memory for properties dialog

Does not set memory going in-game. Swaps to game values when opening the
properties dialog, then swaps back when closing it. Uses a `memcpy` to swap.
Also implements saving config files, limited to certain groups of configurations
so as to not risk setting unsafe configurations.

* configuration: change config interfaces to use config-specific pointers

When a game is booted, we need to be able to open the configuration dialogs
without changing the settings pointer in the game's emualtion. A new pointer
specific to just the configuration dialogs can be used to separate changes
to just those config dialogs without affecting the emulation.

* configuration: boot a game using per-game settings

Swaps values where needed to boot a game.

* configuration: user correct config during emulation

Creates a new pointer specifically for modifying the configuration while
emulation is in progress. Both the regular configuration dialog and the game
properties dialog now use the pointer Settings::config_values to focus edits to
the correct struct.

* settings: split Settings::values into two different structs

By splitting the settings into two mutually exclusive structs, it becomes easier,
as a developer, to determine how to use the Settings structs after per-game
configurations is merged. Other benefits include only duplicating the required
settings in memory.

* settings: move use_docked_mode to Controls group

`use_docked_mode` is set in the input settings and cannot be accessed from the
system settings. Grouping it with system settings causes it to be saved with
per-game settings, which may make transferring configs more difficult later on,
especially since docked mode cannot be set from within the game properties
dialog.

* configuration: Fix the other yuzu executables and a regression

In main.cpp, we have to get the title ID before the ROM is loaded, else the
renderer will reflect only the global settings and now the user's game specific
settings.

* settings: use a template to duplicate memory for each setting

Replaces the type of each variable in the Settings::Values struct with a new
class that allows basic data reading and writing. The new struct
Settings::Setting duplicates the data in memory and can manage global overrides
per each setting.

* configuration: correct add-ons config and swap settings when apropriate

Any add-ons interaction happens directly through the global values struct.
Swapping bewteen structs now also includes copying the necessary global configs
that cannot be changed nor saved in per-game settings. General and System config
menus now update based on whether it is viewing the global or per-game settings.

* settings: restore old values struct

No longer needed with the Settings::Setting class template.

* configuration: implement hierarchical game properties dialog

This sets the apropriate global or local data in each setting.

* clang format

* clang format take 2

can the docker container save this?

* address comments and style issues

* config: read and write settings with global awareness

Adds new functions to read and write settings while keeping the global state in
focus. Files now generated per-game are much smaller since often they only need
address the global state.

* settings: restore global state when necessary

Upon closing a game or the game properties dialog, we need to restore all global
settings to the original global state so that we can properly open the
configuration dialog or boot a different game.

* configuration: guard setting values incorrectly

This disables setting values while a game is running if the setting is
overwritten by a per game setting.

* config: don't write local settings in the global config

Simple guards to prevent writing the wrong settings in the wrong files.

* configuration: add comments, assume less, and clang format

No longer assumes that a disabled UI element means the global state is turned
off, instead opting to directly answer that question. Still however assumes a
game is running if it is in that state.

* configuration: fix a logic error

Should not be negated

* restore settings' global state regardless of accept/cancel

Fixes loading a properties dialog and causing the global config dialog to show
local settings.

* fix more logic errors

Fixed the frame limit would set the global setting from the game properties
dialog. Also strengthened the Settings::Setting member variables and simplified
the logic in config reading (ReadSettingGlobal).

* fix another logic error

In my efforts to guard RestoreGlobalState, I accidentally negated the IsPowered
condition.

* configure_audio: set toggle_stretched_audio to tristate

* fixed custom rtc and rng seed overwriting the global value

* clang format

* rebased

* clang format take 4

* address my own review

Basically revert unintended changes

* settings: literal instead of casting

"No need to cast, use 1U instead"
Thanks, Morph!

Co-authored-by: Morph <39850852+Morph1984@users.noreply.github.com>

* Revert "settings: literal instead of casting
"

This reverts commit 95e992a87c898f3e882ffdb415bb0ef9f80f613f.

* main: fix status buttons reporting wrong settings after stop emulation

* settings: Log UseDockedMode in the Controls group

This should have happened when use_docked_mode was moved over to the controls group
internally. This just reflects this in the log.

* main: load settings if the file has a title id

In other words, don't exit if the loader has trouble getting a title id.

* use a zero

* settings: initalize resolution factor with constructor instead of casting

* Revert "settings: initalize resolution factor with constructor instead of casting"

This reverts commit 54c35ecb46a29953842614620f9b7de1aa9d5dc8.

* configure_graphics: guard device selector when Vulkan is global

Prevents the user from editing the device selector if Vulkan is the global
renderer backend. Also resets the vulkan_device variable when the users
switches back-and-forth between global and Vulkan.

* address reviewer concerns

Changes function variables to const wherever they don't need to be changed. Sets Settings::Setting to final as it should not be inherited from. Sets ConfigurationShared::use_global_text to static.

Co-Authored-By: VolcaEM <volcaem@users.noreply.github.com>

* main: load per-game settings after LoadROM

This prevents `Restart Emulation` from restoring the global settings *after* the per-game settings were applied. Thanks to BSoDGamingYT for finding this bug.

* Revert "main: load per-game settings after LoadROM"

This reverts commit 9d0d48c52d2dcf3bfb1806cc8fa7d5a271a8a804.

* main: only restore global settings when necessary

Loading the per-game settings cannot happen after the ROM is loaded, so we have to specify when to restore the global state. Again thanks to BSoD for finding the bug.

* configuration_shared: address reviewer concerns except operator overrides

Dropping operator override usage in next commit.

Co-Authored-By: LC <lioncash@users.noreply.github.com>

* settings: Drop operator overrides from Setting template

Requires using GetValue and SetValue explicitly. Also reverts a change that broke title ID formatting in the game properties dialog.

* complete rebase

* configuration_shared: translate "Use global configuration"

Uses ConfigurePerGame to do so, since its usage, at least as of now, corresponds with ConfigurationShared.

* configure_per_game: address reviewer concern

As far as I understand, it prevents the program from unnecessarily copying strings.

Co-Authored-By: LC <lioncash@users.noreply.github.com>

Co-authored-by: Morph <39850852+Morph1984@users.noreply.github.com>
Co-authored-by: VolcaEM <volcaem@users.noreply.github.com>
Co-authored-by: LC <lioncash@users.noreply.github.com>
2020-07-09 22:42:09 -04:00
bunnei
41a333321a
Merge pull request #4175 from ReinUsesLisp/read-buffer
gl_buffer_cache: Copy to buffers created as STREAM_READ before downloading
2020-07-02 23:30:08 -04:00
Rodrigo Locatti
c58e21cd76
Merge pull request #4082 from Morph1984/mirror-once-clamp
maxwell_to_gl: Implement MirrorOnceClampOGL wrap mode using GL_MIRROR_CLAMP_EXT
2020-07-02 04:57:40 -03:00
Fernando Sahmkow
977a3ab352
Merge pull request #4157 from ReinUsesLisp/unified-turing
gl_device: Enable NV_vertex_buffer_unified_memory on Turing devices
2020-06-30 14:36:51 -04:00
Morph
1b31755ba6 maxwell_to_gl: Implement MirrorOnceClampOGL using GL_MIRROR_CLAMP_EXT
Like MirrorOnceBorder, this requires the GL_EXT_texture_mirror_clamp extension. This extension is unfortunately not available on Intel's drivers (both Windows proprietary and Linux Mesa). Use GL_MIRROR_CLAMP_TO_EDGE as a fallback if the extension is unavailable.
2020-06-30 02:40:14 -04:00
Morph
10eca7f651 maxwell_to_gl: Rename VertexType() to VertexFormat() 2020-06-29 11:48:38 -04:00
Morph
78d80d99a0 maxwell_to_gl: Add 32 bit component sizes to (un)signed scaled formats
Add 32 bit component sizes to (un)signed scaled formats and group (un)signed normalized, scaled, and integer formats together.
2020-06-28 02:51:13 -04:00
ReinUsesLisp
6481d91e4a gl_buffer_cache: Copy to buffers created as STREAM_READ before downloading
After marking buffers as resident, Nvidia's driver seems to take a
slow path. To workaround this issue, copy to a STREAM_READ buffer and
then call GetNamedBufferSubData on it.

This is a temporary solution until we have asynchronous flushing.
2020-06-26 16:58:40 -03:00
Rodrigo Locatti
5872fc21fe
Merge pull request #4151 from ReinUsesLisp/gl-invalidations
gl_shader_cache: Avoid use after move for program size
2020-06-25 21:05:27 -03:00
David Marcec
a927d8be52 gl_device: Fix IsASTCSupported
Other targets were never actually checked
2020-06-25 19:12:56 +10:00
ReinUsesLisp
bc8d3b8f82 gl_device: Enable NV_vertex_buffer_unified_memory on Turing devices
Once we make sure not to corrupt Nvidia's driver, we can safely use
resident buffers on Turing devices.

See GitHub pull request #4156
2020-06-25 01:28:47 -03:00
ReinUsesLisp
32a2dcd415 buffer_cache: Use buffer methods instead of cache virtual methods 2020-06-24 02:36:14 -03:00
ReinUsesLisp
39c97f1b65 gl_stream_buffer: Use InvalidateBufferData instead unmap and map
Making the stream buffer resident increases GPU usage significantly on
some games. This seems to be addressed invalidating the stream buffer
with InvalidateBufferData instead of using a Unmap + Map (with
invalidation flags).
2020-06-24 02:36:14 -03:00
ReinUsesLisp
41a4090320 gl_rasterizer: Use NV_vertex_buffer_unified_memory for vertex buffer robustness
Switch games are allowed to bind less data than what they use in a
vertex buffer, the expected behavior here is that these values are read
as zero. At the moment of writing this only D3D12, OpenGL and NVN through
NV_vertex_buffer_unified_memory support vertex buffer with a size limit.

In theory this could be emulated on Vulkan creating a new VkBuffer for
each (handle, offset, length) tuple and binding the expected data to it.
This is likely going to be slow and memory expensive when used on the
vertex buffer and we have to do it on all draws because we can't know
without analyzing indices when a game is going to read vertex data out
of bounds.

This is not a problem on OpenGL's BufferAddressRangeNV because it takes
a length parameter, unlike Vulkan's CmdBindVertexBuffers that only takes
buffers and offsets (the length is implicit in VkBuffer). It isn't a
problem on D3D12 either, because D3D12_VERTEX_BUFFER_VIEW on
IASetVertexBuffers takes SizeInBytes as a parameter (although I am not
familiar with robustness on D3D12).

Currently this only implements buffer ranges for vertex buffers,
although indices can also be affected. A KHR_robustness profile is not
created, but Nvidia's driver reads out of bound vertex data as zero
anyway, this might have to be changed in the future.

- Fixes SMO random triangles when capturing an enemy, getting hit, or
looking at the environment on certain maps.
2020-06-24 02:36:14 -03:00
ReinUsesLisp
32485917ba gl_buffer_cache: Mark buffers as resident
Make stream buffer and cached buffers as resident and query their
address. This allows us to use GPU addresses for several proprietary
Nvidia extensions.
2020-06-24 02:36:14 -03:00
ReinUsesLisp
73fb3a304b gl_device: Expose NV_vertex_buffer_unified_memory except on Turing
Expose NV_vertex_buffer_unified_memory when the driver supports it.

This commit adds a function the determine if a GL_RENDERER is a Turing
GPU. This is required because on Turing GPUs Nvidia's driver crashes
when the buffer is marked as resident or on DeleteBuffers. Without a
synchronous debug output (single threaded driver), it's likely that
the driver will crash in the first blocking call.
2020-06-24 02:36:14 -03:00
ReinUsesLisp
00c66a7289 gl_stream_buffer: Always use a non-coherent buffer 2020-06-24 02:35:33 -03:00
ReinUsesLisp
da79ec9565 gl_stream_buffer: Always use persistent memory maps
yuzu no longer supports platforms without persistent maps.
2020-06-24 02:35:33 -03:00
Rodrigo Locatti
b66ccaa376
Merge pull request #4129 from Morph1984/texture-shadow-lod-workaround
gl_shader_decompiler: Workaround textureLod when GL_EXT_texture_shadow_lod is not available
2020-06-24 01:51:15 -03:00
ReinUsesLisp
9f54cd4dad gl_shader_cache: Avoid use after move for program size
All programs had a size of zero due to this bug, skipping invalidations.

While we are at it, remove some unused forward declarations.
2020-06-23 22:54:42 -03:00
Morph
f77c897b8d gl_shader_decompiler: Enable GL_EXT_texture_shadow_lod if available
Enable GL_EXT_texture_shadow_lod if available. If this extension is not available, such as on Intel/AMD proprietary drivers, use textureGrad as a workaround.
2020-06-20 23:02:29 -04:00
Morph
1e65da971b gl_device: Check for GL_EXT_texture_shadow_lod 2020-06-20 22:14:32 -04:00
Lioncash
5865a10885 gl_arb_decompiler: Avoid several string copies
Variables that are marked as const cannot have the move constructor
invoked when returning from a function (the move constructor requires a
non-const variable so it can "steal" the resources from it.
2020-06-19 23:09:16 -04:00
Morph
8868fb745f maxwell_to_gl: Miscellaneous changes
maxwell_to_gl: Log unimplemented features under UNIMPLEMENTED_MSG instead of LOG_ERROR to bring into parity with maxwell_to_vk
maxwell_to_gl: Deduplicate logging in VertexType(), merging them into one.

maxwell_to_gl: Return GL_NEAREST instead of GL_LINEAR if an unknown texture filter mode is encountered.
maxwell_to_gl: Log the mipmap filter mode if an unknown value is passed in.
maxwell_to_gl: Reorder filtering modes to start with None, then Nearest, then Linear.
2020-06-18 04:56:31 -04:00
Rodrigo Locatti
edb2114bac
Merge pull request #4092 from Morph1984/image-bindings
gl_device: Reserve 4 image bindings for fragment stage
2020-06-18 04:59:48 -03:00
bunnei
798ec003ce
Merge pull request #4041 from ReinUsesLisp/arb-decomp
gl_arb_decompiler: Implement an assembly shader decompiler
2020-06-16 14:56:23 -04:00
Morph
e2f5d16540 gl_device: Reserve at least 4 image bindings for fragment stage
Due to the limitation of GL_MAX_IMAGE_UNITS being low (8) on Intel's and Nvidia's proprietary drivers, we have to reserve an appropriate amount of image bindings for each of the stages. So far games have been observed to use 4 image bindings on the fragment stage (Kirby Star Allies) and 1 on the vertex stage (TWD series).
No games thus far in my limited testing used more than 4 images concurrently and across all currently active programs.
This fixes shader compilation errors on Kirby Star Allies on OpenGL (GLSL/GLASM)
2020-06-16 03:03:07 -04:00
Rodrigo Locatti
0bd9bc7201
Merge pull request #4066 from ReinUsesLisp/shared-ptr-buf
buffer_cache: Avoid passing references of shared pointers and misc style changes
2020-06-15 22:29:32 -03:00
bunnei
92021a344c
Merge pull request #4064 from ReinUsesLisp/invalidate-buffers
gl_rasterizer: Mark vertex buffers as dirty after buffer cache invalidation
2020-06-14 00:29:16 -04:00
bunnei
c2ea1e1bcb
Merge pull request #4049 from ReinUsesLisp/separate-samplers
shader/texture: Join separate image and sampler pairs offline
2020-06-13 13:48:27 -04:00
bunnei
5633887569
Merge pull request #3986 from ReinUsesLisp/shader-cache
shader_cache: Implement a generic runtime shader cache
2020-06-12 23:14:48 -04:00
ReinUsesLisp
87011a97f9 gl_arb_decompiler: Implement FSwizzleAdd 2020-06-11 22:12:07 -03:00
ReinUsesLisp
a63a0daa5e gl_arb_decompiler: Implement an assembly shader decompiler
Emit code compatible with NV_gpu_program5.
This should emit code compatible with Fermi, but it wasn't tested on
that architecture. Pascal has some issues not present on Turing GPUs.
2020-06-11 22:12:07 -03:00
bunnei
83e3b77ed7
Merge pull request #4027 from ReinUsesLisp/3d-slices
texture_cache: Implement rendering to 3D textures
2020-06-09 21:52:15 -04:00
ReinUsesLisp
6508cdd003 buffer_cache: Avoid passing references of shared pointers and misc style changes
Instead of using as template argument a shared pointer, use the
underlying type and manage shared pointers explicitly. This can make
removing shared pointers from the cache more easy.

While we are at it, make some misc style changes and general
improvements (like insert_or_assign instead of operator[] + operator=).
2020-06-09 18:30:49 -03:00
ReinUsesLisp
7646f2c21d gl_rasterizer: Mark vertex buffers as dirty after buffer cache invalidation
Vertex buffers bindings become invalid after the stream buffer is
invalidated. We were originally doing this, but it got lost at some
point.

- Fixes Animal Crossing: New Horizons, but it affects everything.
2020-06-08 20:24:16 -03:00
bunnei
3626254f48
Merge pull request #4040 from ReinUsesLisp/nv-transform-feedback
gl_rasterizer: Use NV_transform_feedback for XFB on assembly shaders
2020-06-08 16:18:33 -04:00
bunnei
98d2461529
Merge pull request #4052 from ReinUsesLisp/debug-output
renderer_opengl: Only enable DEBUG_OUTPUT when graphics debugging is enabled
2020-06-08 10:16:41 -04:00
ReinUsesLisp
3c2ae53b4c texture_cache: Handle 3D texture blits with one layer 2020-06-08 05:01:00 -03:00
ReinUsesLisp
c95c254f3e texture_cache: Implement rendering to 3D textures
This allows rendering to 3D textures with more than one slice.
Applications are allowed to render to more than one slice of a texture
using gl_Layer from a VTG shader.

This also requires reworking how 3D texture collisions are handled, for
now, this commit allows rendering to slices but not to miplevels. When a
render target attempts to write to a mipmap, we fallback to the previous
implementation (copying or flushing as needed).

- Fixes color correction 3D textures on UE4 games (rainbow effects).
- Allows Xenoblade games to render to 3D textures directly.
2020-06-08 05:01:00 -03:00
ReinUsesLisp
abcea1bb18 rasterizer_cache: Remove files and includes
The rasterizer cache is no longer used. Each cache has its own generic
implementation optimized for the cached data.
2020-06-07 04:32:57 -03:00
ReinUsesLisp
678f95e4f8 vk_pipeline_cache: Use generic shader cache
Trivial port the generic shader cache to Vulkan.
2020-06-07 04:32:57 -03:00
ReinUsesLisp
b96f65b62b gl_shader_cache: Use generic shader cache
Trivially port the generic shader cache to OpenGL.
2020-06-07 04:32:57 -03:00
ReinUsesLisp
e78d681a6c gl_device: Black list NVIDIA 443.24 for fast buffer uploads
Skip fast buffer uploads on Nvidia 443.24 Vulkan beta driver on OpenGL.
This driver throws the following error when calling BufferSubData or
BufferData on buffers that are candidates for fast constant buffer
uploads. This is the equivalens to push constants on Vulkan, except that
they can access the full buffer. The error:

Unknown internal debug message. The NVIDIA OpenGL driver has encountered
an out of memory error. This application might
behave inconsistently and fail.

If this error persists on future drivers, we might have to look deeper
into this issue. For now, we can black list it and log it as a temporary
solution.
2020-06-06 02:56:42 -03:00
ReinUsesLisp
354fbe701e renderer_opengl: Only enable DEBUG_OUTPUT when graphics debugging is enabled
Avoids logging when it's not relevant. This can potentially reduce
driver's internal thread overhead.
2020-06-05 21:21:12 -03:00
ReinUsesLisp
5b2b6d594c shader/texture: Join separate image and sampler pairs offline
Games using D3D idioms can join images and samplers when a shader
executes, instead of baking them into a combined sampler image. This is
also possible on Vulkan.

One approach to this solution would be to use separate samplers on
Vulkan and leave this unimplemented on OpenGL, but we can't do this
because there's no consistent way of determining which constant buffer
holds a sampler and which one an image. We could in theory find the
first bit and if it's in the TIC area, it's an image; but this falls
apart when an image or sampler handle use an index of zero.

The used approach is to track for a LOP.OR operation (this is done at an
IR level, not at an ISA level), track again the constant buffers used as
source and store this pair. Then, outside of shader execution, join
the sample and image pair with a bitwise or operation.

This approach won't work on games that truly use separate samplers in a
meaningful way. For example, pooling textures in a 2D array and
determining at runtime what sampler to use.

This invalidates OpenGL's disk shader cache :)

- Used mostly by D3D ports to Switch
2020-06-05 00:24:51 -03:00
bunnei
22369df357
Merge pull request #4031 from Morph1984/fix-gs-outputs
gl_shader_decompiler: Fix geometry shader outputs on Intel drivers
2020-06-04 15:18:51 -04:00
ReinUsesLisp
3d99b449d3 gl_rasterizer: Use NV_transform_feedback for XFB on assembly shaders
NV_transform_feedback, NV_transform_feedback2 and
ARB_transform_feedback3 with NV_transform_feedback interactions allows
implementing transform feedbacks as dynamic state.

Maxwell implements transform feedbacks as dynamic state, so using these
extensions with TransformFeedbackStreamAttribsNV allows us to properly
emulate transform feedbacks without having to recompile shaders when the
state changes.
2020-06-03 20:22:12 -03:00
bunnei
623b93a2b3
Merge pull request #4014 from ReinUsesLisp/astc-nvidia
gl_device: Avoid devices with CAVEAT_SUPPORT on ASTC
2020-06-02 17:43:33 -04:00
bunnei
597d8b4bd4
Merge pull request #4006 from ReinUsesLisp/squash-ubos
glsl: Squash constant buffers into a single SSBO when we hit the limit
2020-06-02 14:58:50 -04:00
Morph
74f2e5f1a4 gl_shader_decompiler: Declare gl_Layer and gl_ViewportIndex within gl_PerVertex for vertex and tessellation shaders 2020-06-01 15:35:44 -04:00
Morph
70188d69b0 gl_shader_decompiler: Fix geometry shader outputs for Intel drivers
On Intel's proprietary drivers, gl_Layer and gl_ViewportIndex are not allowed members of gl_PerVertex block, causing the shader to fail to compile. Fix this by declaring these variables outside of gl_PerVertex.
2020-06-01 15:34:05 -04:00
bunnei
6c0b1a9ee2
Merge pull request #3996 from ReinUsesLisp/front-faces
fixed_pipeline_state,gl_rasterizer: Swap negative viewport checks for front faces
2020-06-01 14:04:35 -04:00
ReinUsesLisp
0ee310ebdc gl_device: Avoid devices with CAVEAT_SUPPORT on ASTC
This avoids using Nvidia's ASTC decoder on OpenGL.
The last time it was profiled, it was slower than yuzu's decoder.

While we are at it, fix a bug in the texture cache when native ASTC is
not supported.
2020-05-31 21:34:34 -03:00
ReinUsesLisp
ee21e4ecd3 glsl: Squash constant buffers into a single SSBO when we hit the limit
Avoids compilation errors at the cost of shader build times and runtime
performance when a game hits the limit of uniform buffers we can use.
2020-05-31 21:33:49 -03:00
bunnei
edbf3144d2
Merge pull request #3958 from FernandoS27/gl-debug
OpenGL: Enable Debug Context and Synchronous debugging when graphics debugging is enabled
2020-05-31 17:04:27 -04:00
Morph
bb8ef38152 gl_device: Enable compute shaders for Intel proprietary drivers
Previously we were disabling compute shaders on Intel's proprietary driver due to broken compute. This has been fixed in the latest Intel drivers. Re-enable compute for Intel proprietary drivers and remove the check for broken compute.
2020-05-31 03:21:07 -04:00
bunnei
058ec22787
Merge pull request #3982 from ReinUsesLisp/membar-cts
shader/other: Implement MEMBAR.CTS
2020-05-30 11:51:42 -04:00
bunnei
1bb3122c1f
Merge pull request #3991 from ReinUsesLisp/depth-sampling
texture_cache: Implement depth stencil texture swizzles
2020-05-28 23:33:38 -04:00
bunnei
099ac9c2a8
Merge pull request #3993 from ReinUsesLisp/fix-zla
gl_shader_manager: Unbind GLSL program when binding a host pipeline
2020-05-28 12:15:22 -04:00
ReinUsesLisp
32e6727dae shader/other: Implement MEMBAR.CTS
This silences an assertion we were hitting and uses workgroup memory
barriers when the game requests it.
2020-05-27 00:19:45 -03:00
ReinUsesLisp
b17fe82973 gl_texture_cache: Implement small texture view cache for swizzles
This fixes cases where the texture swizzle was applied twice on the same
draw to a texture bound to two different slots.
2020-05-26 17:50:08 -03:00
ReinUsesLisp
8bba84a401 texture_cache: Implement depth stencil texture swizzles
Stop ignoring image swizzles on depth and stencil images.

This doesn't fix a known issue on Xenoblade Chronicles 2 where an OpenGL
texture changes swizzles twice before being used. A proper fix would be
having a small texture view cache for this like we do on Vulkan.
2020-05-26 17:44:50 -03:00
ReinUsesLisp
606a62d4c7 gl_rasterizer: Port front face flip check from Vulkan
While Vulkan was assuming we had no negative viewports, OpenGL code
was assuming we had them. Port the old code from Vulkan to OpenGL,
checking if the first viewport is negative before flipping faces.

This is not a complete implementation since we only check for the first
viewport to be negative. That said, unless a game is using Vulkan,
OpenGL and NVN games should be fine here, and we can always compare with
our Vulkan backend to see if there's a difference.
2020-05-26 16:33:50 -03:00
bunnei
508242c267
Merge pull request #3981 from ReinUsesLisp/bar
shader/other: Implement BAR.SYNC 0x0
2020-05-26 14:40:13 -04:00
ReinUsesLisp
c13e2f1b75 gl_shader_manager: Unbind GLSL program when binding a host pipeline
Fixes regression in Link's Awakening caused by 420cc13248
2020-05-26 04:20:39 -03:00
bunnei
86345c126a
Merge pull request #3978 from ReinUsesLisp/write-rz
shader_decompiler: Visit source nodes even when they assign to RZ
2020-05-25 21:31:33 -04:00
bunnei
1adabdac7f
Merge pull request #3905 from FernandoS27/vulkan-fix
Correct a series of crashes and intructions on Async GPU and Vulkan Pipeline
2020-05-24 15:23:38 -04:00
bunnei
325e7eed3c
Merge pull request #3964 from ReinUsesLisp/arb-integration
renderer_opengl: Add assembly program code paths
2020-05-24 00:34:12 -04:00
bunnei
487dd05170
Merge pull request #3979 from ReinUsesLisp/thread-group
shader/other: Implement thread comparisons (NV_shader_thread_group)
2020-05-24 00:33:06 -04:00
ReinUsesLisp
5d0986a53b shader/other: Implement BAR.SYNC 0x0
Trivially implement this particular case of BAR. Unless games use OpenCL
or CUDA barriers, we shouldn't hit any other case here.
2020-05-21 23:20:43 -03:00
ReinUsesLisp
e2b67a868b shader/other: Implement thread comparisons (NV_shader_thread_group)
Hardware S2R special registers match gl_Thread*MaskNV. We can trivially
implement these using Nvidia's extension on OpenGL or naively stubbing
them with the ARB instructions to match. This might cause issues if the
host device warp size doesn't match Nvidia's. That said, this is
unlikely on proper shaders.

Refer to the attached url for more documentation about these flags.
https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_group.txt
2020-05-21 23:18:37 -03:00
ReinUsesLisp
ed4e324991 shader_decompiler: Visit source nodes even when they assign to RZ
Some operations like atomicMin were ignored because they returned were
being stored to RZ. This operations have a side effect and it was being
ignored.
2020-05-21 23:16:03 -03:00
ReinUsesLisp
891236124c buffer_cache: Use boost::intrusive::set for caching
Instead of using boost::icl::interval_map for caching, use
boost::intrusive::set. interval_map is intended as a container where the
keys can overlap with one another; we don't need this for caching
buffers and a std::set-like data structure that allows us to search with
lower_bound is enough.
2020-05-21 16:44:00 -03:00
ReinUsesLisp
420cc13248 renderer_opengl: Add assembly program code paths
Add code required to use OpenGL assembly programs based on
NV_gpu_program5. Decompilation for ARB programs is intended to be added
in a follow up commit. This does **not** include ARB decompilation and
it's not in an usable state.

The intention behind assembly programs is to reduce shader stutter
significantly on drivers supporting NV_gpu_program5 (and other required
extensions). Currently only Nvidia's proprietary driver supports these
extensions.

Add a UI option hidden for now to avoid people enabling this option
accidentally.

This code path has some limitations that OpenGL compatibility doesn't
have:
- NV_shader_storage_buffer_object is limited to 16 entries for a single
OpenGL context state (I don't know if this is an intended limitation, an
specification issue or I am missing something). Currently causes issues
on The Legend of Zelda: Link's Awakening.
- NV_parameter_buffer_object can't bind buffers using an offset
different to zero. The used workaround is to copy to a temporary buffer
(this doesn't happen often so it's not an issue).

On the other hand, it has the following advantages:
- Shaders build a lot faster.
- We have control over how floating point rounding is done over
individual instructions (SPIR-V on Vulkan can't do this).
- Operations on shared memory can be unsigned and signed.
- Transform feedbacks are dynamic state (not yet implemented).
- Parameter buffers (uniform buffers) are per stage, matching NVN and
hardware's behavior.
- The API to bind and create assembly programs makes sense, unlike
ARB_separate_shader_objects.
2020-05-19 18:00:04 -03:00
Fernando Sahmkow
4cff5dd194 OpenGL: Enable Debug Context and Synchronous debugging when graphics debugging is enabled.
This commit aims to help easing debugging of driver crashes without
having to modify existing code.
2020-05-17 21:45:09 -04:00
bunnei
b1a1bd12ca
Merge pull request #3899 from ReinUsesLisp/float-comparisons
shader_ir: Add separate instructions for ordered and unordered comparisons and fix NE on GLSL
2020-05-13 09:51:14 -04:00
ReinUsesLisp
8b329ddcc9 gl_shader_decompiler: Properly emulate NaN behaviour on NE
"Not equal" operators on GLSL seem to behave as unordered when we expect
an ordered comparison.

Manually emulate this checking for LGE values (numbers, not-NaNs).
2020-05-10 02:59:33 -03:00
Fernando Sahmkow
0a4be73b9b VideoCore: Use SyncGuestMemory mechanism for Shader/Pipeline Cache invalidation. 2020-05-09 19:25:29 -04:00
Rodrigo Locatti
7e376af8fc
Merge pull request #3839 from Morph1984/r8g8ui
texture: Implement R8G8UI
2020-05-09 05:28:55 -03:00
ReinUsesLisp
4e57f9d5cf shader_ir: Separate float-point comparisons in ordered and unordered
This allows us to use native SPIR-V instructions without having to
manually check for NAN.
2020-05-09 04:55:15 -03:00
ReinUsesLisp
f813cd3ff7 gl_rasterizer: Implement viewport swizzles with NV_viewport_swizzle 2020-05-04 17:51:30 -03:00
bunnei
2aff0b4733
Merge pull request #3808 from ReinUsesLisp/wait-for-idle
{maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers
2020-05-03 02:43:18 -04:00
bunnei
e6b4311178
Merge pull request #3693 from ReinUsesLisp/clean-samplers
shader/texture: Support multiple unknown sampler properties
2020-05-02 00:45:41 -04:00
Morph
7909860d16 texture: Implement R8G8UI
- Used by The Walking Dead: The Final Season
2020-04-30 13:19:36 -04:00
bunnei
bf3f030a0d
Merge pull request #3807 from ReinUsesLisp/fix-depth-clamp
maxwell_3d: Fix depth clamping register
2020-04-30 13:07:31 -04:00
bunnei
c7b5a87c90
Merge pull request #3799 from ReinUsesLisp/iadd-cc
shader: Implement P2R CC, IADD Rd.CC and IADD.X
2020-04-30 12:56:36 -04:00
bunnei
da2b8295e1
Merge pull request #3805 from ReinUsesLisp/preserve-contents
texture_cache: Reintroduce preserve_contents accurately
2020-04-30 12:56:19 -04:00
bunnei
72b73d22ab
Merge pull request #3784 from ReinUsesLisp/shader-memory-util
shader/memory_util: Deduplicate code
2020-04-28 12:05:50 -04:00
ReinUsesLisp
fe931ac976 {maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers
Drop MemoryBarrier from the buffer cache and use Maxwell3D's register
WaitForIdle.

To implement this on OpenGL we just call glMemoryBarrier with the
necessary bits.

Vulkan lacks this synchronization primitive, so we set an event and
immediately wait for it. This is not a pretty solution, but it's what
Vulkan can do without submitting the current command buffer to the queue
(which ends up being more expensive on the CPU).
2020-04-28 02:18:12 -03:00
ReinUsesLisp
bb1ed66d99 maxwell_3d: Fix depth clamping register
Using deko3d as reference:
4e47ba0013/source/maxwell/gpu_3d_state.cpp (L42)

We were using bits 3 and 4 to determine depth clamping, but these are
the same both enabled and disabled:

state->depthClampEnable ? 0x101A : 0x181D

The same happens on Nvidia's OpenGL driver, where they do something like
this (default capabilities, GL 4.5 compatibility):

(state & DEPTH_CLAMP) != 0 ? 0x201a : 0x281c

There's always a difference between the first bits in this register, but
bit 11 is consistently disabled on both deko3d/NVN and OpenGL. This
commit changes yuzu's behaviour to use bit 11 to determine depth
clamping.

- Fixes depth issues on Super Mario Odyssey's intro.
2020-04-27 20:50:14 -03:00
ReinUsesLisp
8da16cf9fb texture_cache: Reintroduce preserve_contents accurately
This reverts commit 94b0e2e5da.

preserve_contents proved to be a meaningful optimization. This commit
reintroduces it but properly implemented on OpenGL.

We have to make sure the clear removes all the previous contents of the
image.

It's not currently implemented on Vulkan because we can do smart things
there that's preferred to be introduced in a separate commit.
2020-04-26 19:53:02 -03:00
Rodrigo Locatti
7e38dd580f
Merge pull request #3753 from ReinUsesLisp/ac-vulkan
{gl,vk}_rasterizer: Add lazy default buffer maker and use it for empty buffers
2020-04-26 01:55:43 -03:00
ReinUsesLisp
ddd82ef42b shader/memory_util: Deduplicate code
Deduplicate code shared between vk_pipeline_cache and gl_shader_cache as
well as shader decoder code.

While we are at it, fix a bug in gl_shader_cache where compute shaders
had an start offset of a stage shader.
2020-04-26 01:38:51 -03:00
ReinUsesLisp
255197e643 shader/arithmetic_integer: Implement CC for IADD 2020-04-25 22:55:26 -03:00
ReinUsesLisp
72deb773fd shader_ir: Turn classes into data structures 2020-04-23 18:00:06 -03:00
Fernando Sahmkow
c043ac4f13 GL_Fence_Manager: use GL_TIMEOUT_IGNORED instead of a loop, 2020-04-22 20:34:32 -04:00
Fernando Sahmkow
39e5b72948 Async GPU: Correct flushing behavior to be similar to old async GPU behavior. 2020-04-22 11:36:26 -04:00
Fernando Sahmkow
644588fd88 ShaderCache/PipelineCache: Cache null shaders. 2020-04-22 11:36:25 -04:00
Fernando Sahmkow
f616dc0b59 Address Feedback. 2020-04-22 11:36:24 -04:00
Fernando Sahmkow
ec2f3e48e1 Fix GCC error. 2020-04-22 11:36:23 -04:00
Fernando Sahmkow
0649f05900 QueryCache: Implement Async Flushes. 2020-04-22 11:36:18 -04:00
Fernando Sahmkow
131b342130 OpenGL: Guarantee writes to Buffers. 2020-04-22 11:36:18 -04:00
Fernando Sahmkow
1fb516cd97 GPU: Implement Flush Requests for Async mode. 2020-04-22 11:36:17 -04:00
Fernando Sahmkow
b7bc3c2549 FenceManager: Manage syncpoints and rename fences to semaphores. 2020-04-22 11:36:16 -04:00
Fernando Sahmkow
b10db7e4a5 FenceManager: Implement async buffer cache flushes on High settings 2020-04-22 11:36:15 -04:00
Fernando Sahmkow
a081a7c855 GPU: Fix rebase errors. 2020-04-22 11:36:13 -04:00
Fernando Sahmkow
e84eb64e51 Rasterizer: Disable fence managing in synchronous gpu. 2020-04-22 11:36:12 -04:00
Fernando Sahmkow
165ae823f5 ThreadManager: Sync async reads on accurate gpu. 2020-04-22 11:36:12 -04:00
Fernando Sahmkow
1f345ebe3a GPU: Implement a Fence Manager. 2020-04-22 11:36:10 -04:00
Fernando Sahmkow
487379c593 OpenGL: Implement Fencing backend. 2020-04-22 11:36:10 -04:00
Fernando Sahmkow
8b1eb44b3e BufferCache: Implement OnCPUWrite and SyncGuestHost 2020-04-22 11:36:07 -04:00
Fernando Sahmkow
da8f17715d GPU: Refactor synchronization on Async GPU 2020-04-22 11:36:06 -04:00
Fernando Sahmkow
084ceb925a UI: Replasce accurate GPU option for GPU Accuracy Level 2020-04-22 11:36:04 -04:00
bunnei
d64290884a
Merge pull request #3714 from lioncash/copies
gl_shader_decompiler: Avoid copies where applicable
2020-04-21 20:16:02 -04:00
ReinUsesLisp
0bbae63300 gl_rasterizer: Fix buffers without size
On NVN buffers can be enabled but have no size. According to deko3d and
the behavior we see in Animal Crossing: New Horizons these buffers get
the special address of 0x1000 and limit themselves to 0xfff.

Implement buffers without a size by binding a null buffer to OpenGL
without a side.

1d1930beea/source/maxwell/gpu_3d_vbo.cpp (L62-L63)
2020-04-21 19:55:44 -03:00
Mat M
5305806071
Merge pull request #3716 from bunnei/fix-another-impl-fallthrough
video_core: gl_shader_decompiler: Fix implicit fallthrough errors.
2020-04-18 15:17:52 -04:00
bunnei
03726fb7f5 video_core: gl_shader_decompiler: Fix implicit fallthrough errors. 2020-04-18 15:15:21 -04:00
Lioncash
bf328ed35a gl_shader_decompiler: Avoid copies where applicable
Avoids unnecessary reference count increments where applicable and also
avoids reallocating a vector.

Unlikely to make a huge difference, but given how trivial of an
amendment it is, why not?
2020-04-17 20:48:52 -04:00
Markus Wick
07fbef1776 video_code: Fix implicit switch fallthrough.
Since yesterday, this breaks the build on linux.
So let's fix it.
2020-04-17 23:43:35 +02:00
Rodrigo Locatti
990c0b184f
Revert "gl_shader_cache: Use CompileDepth::FullDecompile on GLSL" 2020-04-17 17:41:48 -03:00
bunnei
ca3af2961c
Merge pull request #3682 from lioncash/uam
gl_query_cache: Resolve use-after-move in CachedQuery move assignment operator
2020-04-17 01:24:08 -04:00
bunnei
79c1269f0f
Merge pull request #3673 from lioncash/extra
CMakeLists: Specify -Wextra on linux builds
2020-04-16 21:12:33 -04:00
Fernando Sahmkow
c81f256111
Merge pull request #3600 from ReinUsesLisp/no-pointer-buf-cache
buffer_cache: Return handles instead of pointer to handles
2020-04-16 19:58:13 -04:00
ReinUsesLisp
090fd3fefa buffer_cache: Return handles instead of pointer to handles
The original idea of returning pointers is that handles can be moved.
The problem is that the implementation didn't take that in mind and made
everything harder to work with. This commit drops pointer to handles and
returns the handles themselves. While it is still true that handles can
be invalidated, this way we get an old handle instead of a dangling
pointer.

This problem can be solved in the future with sparse buffers.
2020-04-16 02:33:34 -03:00
Lioncash
3a60f19eaf gl_query_cache: Resolve use-after-move in CachedQuery move assignment operator
Avoids potential invalid junk data from being read.
2020-04-15 22:20:06 -04:00
Lioncash
71fb156611 gl_device: Mark stage_swizzle as constexpr
Previously this was mutable even though it shouldn't be.
2020-04-15 21:59:13 -04:00
Lioncash
1c340c6efa CMakeLists: Specify -Wextra on linux builds
Allows reporting more cases where logic errors may exist, such as
implicit fallthrough cases, etc.

We currently ignore unused parameters, since we currently have many
cases where this is intentional (virtual interfaces).

While we're at it, we can also tidy up any existing code that causes
warnings. This also uncovered a few bugs as well.
2020-04-15 21:33:46 -04:00
Fernando Sahmkow
e33196d4e7
Merge pull request #3612 from ReinUsesLisp/red
shader/memory: Implement RED.E.ADD and minor changes to ATOM
2020-04-15 15:03:49 -04:00
Lioncash
213fff67bc CMakeLists: Make -Wreorder a compile-time error
This can result in silent logic bugs within code, and given the amount
of times these kind of warnings are caused, they should be flagged at
compile-time so no new code is submitted with them.
2020-04-15 14:14:41 -04:00
Mat M
64b5985f0a
Merge pull request #3662 from ReinUsesLisp/constant-attrs
gl_rasterizer: Implement constant vertex attributes
2020-04-15 11:54:50 -04:00
Mat M
ab72696beb
Merge pull request #3656 from ReinUsesLisp/glsl-full-decompile
gl_shader_cache: Use CompileDepth::FullDecompile on GLSL
2020-04-15 03:17:46 -04:00
Mat M
4878d6bb49
Merge pull request #3654 from ReinUsesLisp/fix-fb-attach
gl_texture_cache: Fix layered texture attachment base level
2020-04-15 03:17:18 -04:00
ReinUsesLisp
fd6371eba7 Revert "gl_shader_decompiler: Implement merges with bitfieldInsert"
This reverts commit 05cf270836.

Apparently the first approach using floats instead of bitfieldInert
worked better for Fire Emblem: Three Houses. Reverting to get that
behavior back.
2020-04-14 21:24:33 -03:00
ReinUsesLisp
6dfcabc800 gl_rasterizer: Implement constant vertex attributes
Credits go to gdkchan from Ryujinx for finding constant attributes are
used in retail games.
2020-04-14 17:58:53 -03:00
ReinUsesLisp
453d7419d9 gl_shader_cache: Use CompileDepth::FullDecompile on GLSL
From my testing on a Splatoon 2 shader that takes 3800ms on average to
compile changing to FullDecompile reduces it to 900ms on average.

The shader decoder will automatically fallback to a more naive method if
it can't use full decompile.
2020-04-14 01:34:20 -03:00
ReinUsesLisp
21dc842171 gl_texture_cache: Fix layered texture attachment base level
The base level is already included in the texture view. If we specify
the base level in the texture again, this will end up in the incorrect
level and potentially out of bounds.
2020-04-13 18:24:56 -03:00
Mat M
fbf13d3f48
Merge pull request #3651 from ReinUsesLisp/line-widths
gl_rasterizer: Implement line widths and smooth lines
2020-04-13 10:19:59 -04:00
Mat M
08266d70ba
Merge pull request #3638 from ReinUsesLisp/remove-preserve-contents
texture_cache: Remove preserve_contents
2020-04-13 10:19:01 -04:00
Mat M
3351e1e94f
Merge pull request #3627 from ReinUsesLisp/layered-view
gl_texture_cache: Attach view instead of base texture for layered attchments
2020-04-13 10:16:18 -04:00
ReinUsesLisp
76615b9f34 gl_rasterizer: Implement line widths and smooth lines
Implements "legacy" features from OpenGL present on hardware such as
smooth lines and line width.
2020-04-13 01:30:34 -03:00
ReinUsesLisp
05cf270836 gl_shader_decompiler: Implement merges with bitfieldInsert
This also fixes Turing issues but it avoids doing more bitcasts. This
should improve the generated code while also avoiding more points where
compilers can flush floats.
2020-04-12 22:39:59 -03:00
ReinUsesLisp
75eb953575 gl_shader_decompiler: Improve generated code in HMergeH*
Avoiding bitwise expressions, this fixes Turing issues in shaders using
half float merges that affected several games.
2020-04-12 05:06:55 -03:00
ReinUsesLisp
94b0e2e5da texture_cache: Remove preserve_contents
preserve_contents was always true. We can't assume we don't have to
preserve clears because scissored and color masked clears exist.

This removes preserve_contents and assumes it as true at all times.
2020-04-11 01:51:02 -03:00
ReinUsesLisp
6c8f9f40d7 gl_texture_cache: Attach view instead of base texture for layered attachments
This way we are not ignoring the base layer of the current texture.
2020-04-08 22:20:25 -03:00
Fernando Sahmkow
913f42a3a7 Memory: Address Feedback. 2020-04-08 13:40:46 -04:00
Fernando Sahmkow
ea535d9470 Shader/Pipeline Cache: Use VAddr instead of physical memory for addressing. 2020-04-06 09:23:07 -04:00
Fernando Sahmkow
3dd5c07454 Query Cache: Use VAddr instead of physical memory for adressing. 2020-04-06 09:23:07 -04:00
Fernando Sahmkow
7fcd0fee6d Buffer Cache: Use vAddr instead of physical memory. 2020-04-06 09:23:06 -04:00
Fernando Sahmkow
6ee316cb8f Texture Cache: Use vAddr instead of physical memory for caching. 2020-04-06 09:23:05 -04:00
Fernando Sahmkow
9c0f40a1f5 GPU: Setup Flush/Invalidate to use VAddr instead of CacheAddr 2020-04-06 09:21:46 -04:00
Fernando Sahmkow
588a20be3f
Merge pull request #3513 from ReinUsesLisp/native-astc
video_core: Use native ASTC when available
2020-04-06 09:21:11 -04:00
ReinUsesLisp
3185245845 shader/memory: Implement RED.E.ADD
Implements a reduction operation. It's an atomic operation that doesn't
return a value.

This commit introduces another primitive because some shading languages
might have a primitive for reduction operations.
2020-04-06 02:24:47 -03:00
Fernando Sahmkow
69277de29d
Merge pull request #3592 from ReinUsesLisp/ipa
shader_decompiler: Remove FragCoord.w hack and change IPA implementation
2020-04-05 19:29:40 -04:00
Fernando Sahmkow
1633fbf99a
Merge pull request #3589 from ReinUsesLisp/fix-clears
gl_rasterizer: Mark cleared textures as dirty
2020-04-05 19:29:26 -04:00
Rodrigo Locatti
825a6e2615
Merge pull request #3552 from jroweboy/single-context
Refactor Context management (Fixes renderdoc on opengl issues)
2020-04-02 01:38:25 -03:00
ReinUsesLisp
2339fe199f shader_decompiler: Remove FragCoord.w hack and change IPA implementation
Credits go to gdkchan and Ryujinx. The pull request used for this can
be found here: https://github.com/Ryujinx/Ryujinx/pull/1082

yuzu was already using the header for interpolation, but it was missing
the FragCoord.w multiplication described in the linked pull request.
This commit finally removes the FragCoord.w == 1.0f hack from the shader
decompiler.

While we are at it, this commit renames some enumerations to match
Nvidia's documentation (linked below) and fixes component declaration
order in the shader program header (z and w were swapped).

https://github.com/NVIDIA/open-gpu-doc/blob/master/Shader-Program-Header/Shader-Program-Header.html
2020-04-01 21:48:55 -03:00
ReinUsesLisp
dd1232755b gl_texture_cache: Fix software ASTC fallback 2020-04-01 01:44:15 -03:00
ReinUsesLisp
b6571ca9f0 video_core: Use native ASTC when available 2020-04-01 01:14:04 -03:00
ReinUsesLisp
16270dcfe4 gl_device: Detect if ASTC is reported and expose it 2020-04-01 01:14:04 -03:00
ReinUsesLisp
1c5e2b60a7 gl_rasterizer: Mark cleared textures as dirty
Fixes a potential edge case where cleared textures read from the CPU
were not flushed.
2020-03-31 05:51:56 -03:00
Rodrigo Locatti
c19425ed69
Merge pull request #3506 from namkazt/patch-9
shader_decode: Implement partial ATOM/ATOMS instr
2020-03-31 00:56:28 -03:00
namkazy
c2665ec9c2 gl_decompiler: min/max op not implement yet 2020-03-30 18:48:22 +07:00
Nguyen Dac Nam
552f0ff267 gl_decompiler: add atomic op 2020-03-30 17:44:45 +07:00
James Rowe
cf9c94d401 Address review and fix broken yuzu-tester build 2020-03-25 23:32:42 -06:00
ReinUsesLisp
7617e88fb2 gl_rasterizer: Update stencil test regardless of it being disabled 2020-03-26 01:08:14 -03:00
ReinUsesLisp
c310cef615 gl_rasterizer: Synchronize stencil testing on clears 2020-03-26 00:51:47 -03:00
bunnei
e6aff11057
Merge pull request #3520 from ReinUsesLisp/legacy-varyings
gl_shader_decompiler: Implement legacy varyings
2020-03-25 19:27:51 -04:00
James Rowe
282adfc70b Frontend/GPU: Refactor context management
Changes the GraphicsContext to be managed by the GPU core. This
eliminates the need for the frontends to fool around with tricky
MakeCurrent/DoneCurrent calls that are dependent on the settings (such
as async gpu option).

This also refactors out the need to use QWidget::fromWindowContainer as
that caused issues with focus and input handling. Now we use a regular
QWidget and just access the native windowHandle() directly.

Another change is removing the debug tool setting in FrameMailbox.
Instead of trying to block the frontend until a new frame is ready, the
core will now take over presentation and draw directly to the window if
the renderer detects that its hooked by NSight or RenderDoc

Lastly, since it was in the way, I removed ScopeAcquireWindowContext and
replaced it with a simple subclass in GraphicsContext that achieves the
same result
2020-03-24 21:03:42 -06:00
ReinUsesLisp
bdcedc8506 gl_rasterizer: Use transformed viewport for depth ranges
Implement depth ranges using the transformed viewport instead of the
generic one. This matches the current Vulkan implementation but doesn't
support negative depth ranges. An update to glad is required for this.
2020-03-22 03:26:07 -03:00
ReinUsesLisp
351816ac38 gl_shader_decompiler: Remove deprecated function and its usages 2020-03-18 20:03:19 -03:00
ReinUsesLisp
acf328a71f gl_rasterizer: Silence misc warnings 2020-03-18 20:03:19 -03:00
ReinUsesLisp
f5658a9fda gl_shader_decompiler: Don't redeclare gl_VertexID and gl_InstanceID 2020-03-18 01:28:41 -03:00
Mat M
edb9cccb36
Merge pull request #3510 from FernandoS27/dirty-write
DirtyFlags: relax need to set render_targets as dirty
2020-03-17 17:29:22 -04:00
bunnei
1c45c8086e
Merge pull request #3498 from ReinUsesLisp/texel-fetch-glsl
gl_shader_decompiler: Add layer component to texelFetch
2020-03-17 10:53:38 -04:00
ReinUsesLisp
53d673a7d3 renderer_opengl: Move some logic to an anonymous namespace 2020-03-16 04:03:34 -03:00
ReinUsesLisp
311d2fc768 renderer_opengl: Detect Nvidia Nsight as a debugging tool
Use getenv to detect Nsight.
2020-03-16 03:59:08 -03:00
Rodrigo Locatti
7cc46a6faa
Merge pull request #3501 from ReinUsesLisp/rgba16-snorm
video_core: Implement RGBA16_SNORM
2020-03-15 21:24:53 -03:00
ReinUsesLisp
5afc397d52 gl_shader_decompiler: Implement legacy varyings
Legacy varyings are special attributes carried over in hardware from
the OpenGL 1 and OpenGL 2 days. These were generally used instead of the
generic attributes we use today. They are deprecated or removed from
most APIs, but Nvidia still ships them in hardware.

To implement these, this commit maps them 1:1 to OpenGL compatibility.
2020-03-15 21:03:59 -03:00
bunnei
c5afe93dcc renderer_opengl: Keep presentation frames in lock-step when GPU debugging.
- Fixes renderdoc with OpenGL renderer.
2020-03-14 17:45:01 -04:00
bunnei
4373fa8042 gl_device: Add option to check GL_EXT_debug_tool. 2020-03-14 17:39:29 -04:00
Fernando Sahmkow
380fc8d2e1 DirtyFlags: relax need to set render_targets as dirty
The texture cache already takes care of setting a render target to dirty 
when invalidated.
2020-03-14 11:47:33 -04:00
ReinUsesLisp
69c7a01f88 vk/gl_shader_decompiler: Silence assertion on compute 2020-03-13 18:33:05 -03:00
ReinUsesLisp
4bc4851d45 gl_shader_decompiler: Fix implicit conversion errors 2020-03-13 18:33:05 -03:00
ReinUsesLisp
ae6189d7c2 shader/transform_feedback: Expose buffer stride 2020-03-13 18:33:05 -03:00
ReinUsesLisp
8e9f23f393 gl_rasterizer: Implement transform feedback bindings 2020-03-13 18:33:04 -03:00
ReinUsesLisp
4d711dface gl_shader_decompiler: Decorate output attributes with XFB layout
We sometimes have to slice attributes in different parts. This is needed
for example in instances where the game feedbacks 3 components but
writes 4 from the shader (something that is possible with
GL_NV_transform_feedback).
2020-03-13 18:33:04 -03:00
Rodrigo Locatti
244fe13219
Merge branch 'master' into shader-purge 2020-03-13 16:44:06 -03:00
bunnei
b30b1f741d
Merge pull request #3491 from ReinUsesLisp/polygon-modes
gl_rasterizer: Implement polygon modes and fill rectangles
2020-03-13 10:08:57 -04:00
ReinUsesLisp
e24197bb3f gl_shader_decompiler: Initialize gl_Position on vertex shaders 2020-03-12 23:31:06 -03:00
ReinUsesLisp
3a10016e38 gl_shader_decompiler: Add missing {} on smem GLSL emission 2020-03-12 21:50:37 -03:00
ReinUsesLisp
4dcca90ef4 video_core: Implement RGBA16_SNORM
Implement RGBA16_SNORM with the current API. Nothing special here.
2020-03-12 21:42:33 -03:00
ReinUsesLisp
38fe070d78 gl_shader_decompiler: Add layer component to texelFetch
TexelFetch was not emitting the array component generating invalid GLSL.
2020-03-12 18:10:29 -03:00
ReinUsesLisp
825d629565 gl_shader_decompiler: Fix regression in render target declarations
A previous commit introduced a way to declare as few render targets as
possible. Turns out this introduced a regression in some games.
2020-03-12 05:01:20 -03:00
ReinUsesLisp
8357908099 gl_shader_manager: Fix interaction between graphics and compute
After a compute shader was set to the pipeline, no graphics shader was
invoked again. To address this use glUseProgram to bind compute shaders
(without state tracking) and call glUseProgram(0) when transitioning out
of it back to the graphics pipeline.
2020-03-11 01:04:52 -03:00
ReinUsesLisp
e4bc3c3342 gl_rasterizer: Implement polygon modes and fill rectangles 2020-03-09 20:39:58 -03:00
ReinUsesLisp
eb5861e0a2 engines/maxwell_3d: Add TFB registers and store them in shader registry 2020-03-09 18:40:53 -03:00
ReinUsesLisp
b1acb4f73f shader/registry: Address feedback 2020-03-09 18:40:53 -03:00
ReinUsesLisp
b1061afed9 gl_shader_decompiler: Add identifier to decompiled code 2020-03-09 18:40:53 -03:00
ReinUsesLisp
e612242977 gl_shader_decompiler: Roll back to GLSL core 430
RenderDoc won't build shaders if we use GLSL compatibility.
2020-03-09 18:40:53 -03:00
ReinUsesLisp
978172530e const_buffer_engine_interface: Store component types
This is required for Vulkan. Sampling integer textures with float
handles is illegal.
2020-03-09 18:40:53 -03:00
ReinUsesLisp
e1932351a9 gl_shader_cache: Reduce registry consistency to debug assert
Registry consistency is something that practically can't happen and it
has a measurable runtime cost. Reduce it to a DEBUG_ASSERT.
2020-03-09 18:40:07 -03:00
ReinUsesLisp
66a8a3e887 shader/registry: Cache tessellation state 2020-03-09 18:40:07 -03:00
ReinUsesLisp
0528be5c92 shader/registry: Store graphics and compute metadata
Store information GLSL forces us to provide but it's dynamic state in
hardware (workgroup sizes, primitive topology, shared memory size).
2020-03-09 18:40:07 -03:00
ReinUsesLisp
e8efd5a901 video_core: Rename "const buffer locker" to "registry" 2020-03-09 18:40:06 -03:00
ReinUsesLisp
bd8b9bbcee gl_shader_cache: Rework shader cache and remove post-specializations
Instead of pre-specializing shaders and then post-specializing them,
drop the later and only "specialize" the shader while decoding it.
2020-03-09 18:40:06 -03:00
Rodrigo Locatti
22e825a3bc
Merge pull request #3301 from ReinUsesLisp/state-tracker
video_core: Remove gl_state and use a state tracker based on dirty flags
2020-03-09 18:34:37 -03:00
bunnei
84e9f9f395
Merge pull request #3452 from Morph1984/anisotropic-filtering
frontend/Graphics: Add "Advanced" graphics tab and experimental Anisotropic Filtering support
2020-03-07 22:28:35 -05:00
bunnei
0361aa1915
Merge pull request #3451 from ReinUsesLisp/indexed-textures
vk_shader_decompiler: Implement indexed textures
2020-03-05 11:42:46 -05:00
bunnei
67e7186d79
Merge pull request #3455 from ReinUsesLisp/attr-scaled
video_core: Implement more scaled attribute formats
2020-03-03 22:46:20 -05:00
ReinUsesLisp
ef7f6eb67d renderer_opengl: Fix edge-case where alpha testing might cull presentation 2020-02-28 17:56:43 -03:00
ReinUsesLisp
a6a350ddc3 gl_texture_cache: Remove blending disable on blits
Blending doesn't affect blits. Rasterizer discard does, update the
commentaries.
2020-02-28 17:56:43 -03:00
ReinUsesLisp
887d5288ef gl_rasterizer: Don't disable blending on clears
Blending doesn't affect clears.
2020-02-28 17:56:43 -03:00
ReinUsesLisp
ac204754d4 dirty_flags: Deduplicate code between OpenGL and Vulkan 2020-02-28 17:56:43 -03:00
ReinUsesLisp
042256c6bb state_tracker: Remove type traits with named structures 2020-02-28 17:56:43 -03:00