Linuxydable/suyu

Author	SHA1	Message	Date
ReinUsesLisp	85fc7e584e	HACK: Bind stages before and after bindings Works around a bug where program parameters are only applied to the current stage, and this one wasn't bound at the moment. Affects all SSBO usages on GLASM.	2021-07-22 21:51:32 -04:00
ReinUsesLisp	8b7d5912d6	glasm: Support textures used in more than one stage	2021-07-22 21:51:32 -04:00
ReinUsesLisp	258f2dec1b	opengl: Initial (broken) support to GLASM shaders	2021-07-22 21:51:31 -04:00
ReinUsesLisp	dc02cb92e4	gl_rasterizer: Flush L2 caches before glFlush on GLASM	2021-07-22 21:51:30 -04:00
ReinUsesLisp	2c81ad8311	glasm: Initial GLASM compute implementation for testing	2021-07-22 21:51:30 -04:00
ReinUsesLisp	bfa47539f6	gl_shader_cache: Remove code unintentionally committed	2021-07-22 21:51:30 -04:00
ReinUsesLisp	bed090807a	Move SPIR-V emission functions to their own header	2021-07-22 21:51:30 -04:00
ReinUsesLisp	d621e96d0d	shader: Initial OpenGL implementation	2021-07-22 21:51:30 -04:00
ReinUsesLisp	025b20f96a	shader: Move pipeline cache logic to separate files Move code to separate files to be able to reuse it from OpenGL. This greatly simplifies the pipeline cache logic on Vulkan. Transform feedback state is not yet abstracted and it's still intrusively stored inside vk_pipeline_cache. It will be moved when needed on OpenGL.	2021-07-22 21:51:29 -04:00
ReinUsesLisp	f4ace63957	shader: Accelerate pipeline transitions and use dirty flags for shaders	2021-07-22 21:51:29 -04:00
ReinUsesLisp	e9a91bc5cc	shader: Interact texture buffers with buffer cache	2021-07-22 21:51:26 -04:00
ReinUsesLisp	c67d64365a	shader: Remove old shader management	2021-07-22 21:51:22 -04:00
ReinUsesLisp	a0c4557557	gl_buffer_cache: Use glClearNamedBufferSubData:GL_RED instead of GL_RGBA Avoids reading out of bounds from the stack.	2021-07-20 18:51:45 -03:00
bunnei	c53b688411	Merge pull request #6629 from FernandoS27/accel-dma-2 DMAEngine: Accelerate BufferClear [accelerateDMA Part 2]	2021-07-20 17:35:05 -04:00
ReinUsesLisp	2e2d6cf5e5	gl_texture_cache: Workaround slow PBO downloads on radeonsi There's an optimization bug on non-git mesa versions where not specifying GL_CLIENT_STORAGE_BIT causes very slow reads on the CPU side. Add this bit for all vendors.	2021-07-20 14:02:11 -03:00
bunnei	3cd3230295	Merge pull request #6579 from ameerj/float-settings settings: Eliminate usage of float-point setting values	2021-07-15 18:03:11 -04:00
Fernando Sahmkow	b780d5b5c5	DMAEngine: Accelerate BufferClear	2021-07-13 03:49:47 +02:00
Fernando Sahmkow	be1a3f7a0f	accelerateDMA: Accelerate Buffer Copies.	2021-07-11 01:33:17 +02:00
Fernando Sahmkow	4a09517336	Fence Manager: remove reference fencing.	2021-07-09 22:20:36 +02:00
Fernando Sahmkow	cf38faee9b	Fence Manager: Force ordering on WFI.	2021-07-09 22:20:36 +02:00
Fernando Sahmkow	63915bf2de	Fence Manager: Add fences on Reference Count.	2021-07-09 22:20:36 +02:00
ameerj	8284658bac	configure_graphics: Use u8 for bg_color values	2021-07-08 21:45:01 -04:00
lat9nq	2f0e1f5d02	util_shaders: Fix BindImageTexture According to https://gitlab.freedesktop.org/mesa/mesa/-/issues/3820#note_753371 we need to set these to true for use with 3D textures. Fixes BOTW teleporting on RadeonSI and iris.	2021-07-07 14:09:55 -04:00
bunnei	eb3cb3af35	Merge pull request #6497 from FernandoS27/scotty-doesnt-know GPU Memory Manager - Correct handling of non continuous backing memory.	2021-07-06 17:26:21 -07:00
bunnei	bf50345d4c	Merge pull request #6537 from Morph1984/warnings general: Enforce multiple warnings in MSVC	2021-07-05 17:09:23 -07:00
Fernando Sahmkow	38165fb7e3	Texture Cache: Initial Implementation of Sparse Textures.	2021-07-04 22:32:03 +02:00
Fernando Sahmkow	0aab55d26a	TextureCacheOGL: Implement Image Copies for 1D and 1D Array.	2021-07-03 14:40:29 +02:00
Morph	ec68cba440	Merge pull request #6502 from ameerj/vendor-title main: Add GPU Vendor name to running title bar	2021-06-28 14:51:49 -04:00
Morph	d3d6613d33	video_core: Silence signed/unsigned mismatch warnings	2021-06-28 09:21:42 -04:00
bunnei	c805c0b395	Merge pull request #6496 from ameerj/astc-fixes astc: Various robustness enhancements for the gpu decoder	2021-06-24 21:47:05 -07:00
bunnei	b9c2732121	Merge pull request #6519 from Wunkolo/mem-size-literal common: Replace common_sizes into user-literals	2021-06-24 19:09:12 -07:00
Wunkolo	4569f39c7c	common: Replace common_sizes into user-literals Removes common_sizes.h in favor of having `_KiB`, `_MiB`, `_GiB`, etc user-literals within literals.h. To keep the global namespace clean, users will have to use: ``` using namespace Common::Literals; ``` to access these literals.	2021-06-24 09:27:40 -07:00
bunnei	1b09d6628b	Merge pull request #6517 from lioncash/fmtlib externals: Update fmt to 8.0.0	2021-06-23 15:31:04 -07:00
Lioncash	d0b1f2bd05	General: Resolve fmt specifiers to adhere to 8.0.0 API where applicable Also removes some deprecated API usages.	2021-06-23 13:48:21 -04:00
Mai M	17fff10e06	Merge pull request #6465 from FernandoS27/sex-on-the-beach GPU: Implement a garbage collector for GPU Caches (project Reaper+)	2021-06-23 08:03:01 -04:00
ReinUsesLisp	4009ae1da2	bootmanager: Use std::stop_source for stopping emulation Use its std::stop_token to abort shader cache loading. Using std::stop_token instead of std::atomic_bool allows the usage of other utilities like std::stop_callback.	2021-06-22 00:04:57 -03:00
lat9nq	a01459df3d	gl_device: Expand on Mesa driver names Makes this list a bit more capable at identifying Mesa drivers. Tries to deal with two of the overloaded vendor strings in a more generic fashion.	2021-06-20 23:04:07 -04:00
ameerj	fb16cbb17e	video_core: Add GPU vendor name to window title bar	2021-06-20 23:04:07 -04:00
Fernando Sahmkow	569a1962c0	Reaper: Guarantee correct deletion.	2021-06-20 19:11:41 +02:00
ameerj	851c76233d	util_shaders: Specify ASTC decoder memory barrier bits	2021-06-19 11:16:25 -04:00
ameerj	ace20ba4a4	astc_decoder.comp: Remove unnecessary LUT SSBOs We can move them to instead be compile time constants within the shader.	2021-06-19 10:56:13 -04:00
ameerj	31b125ef57	astc: Various robustness enhancements for the gpu decoder These changes should help in reducing crashes/drivers panics that may occur due to synchronization issues between the shader completion and later access of the decoded texture.	2021-06-19 09:00:33 -04:00
Fernando Sahmkow	ca6f47c686	Reaper: Change memory restrictions on TC depending on host memory on VK.	2021-06-17 00:29:48 +02:00
ameerj	b2955479e5	configure_graphics: Add Accelerate ASTC decoding setting	2021-06-15 20:19:00 -04:00
ameerj	859ba21f6d	buffer_cache: Simplify uniform disabling logic	2021-06-01 13:26:58 -04:00
Morph	065867e2c2	common: fs: Rework the Common Filesystem interface to make use of std::filesystem (#6270) * common: fs: fs_types: Create filesystem types Contains various filesystem types used by the Common::FS library * common: fs: fs_util: Add std::string to std::u8string conversion utility * common: fs: path_util: Add utility functions for paths Contains various utility functions for getting or manipulating filesystem paths used by the Common::FS library * common: fs: file: Rewrite the IOFile implementation * common: fs: Reimplement Common::FS library using std::filesystem * common: fs: fs_paths: Add fs_paths to replace common_paths * common: fs: path_util: Add the rest of the path functions * common: Remove the previous Common::FS implementation * general: Remove unused fs includes * string_util: Remove unused function and include * nvidia_flags: Migrate to the new Common::FS library * settings: Migrate to the new Common::FS library * logging: backend: Migrate to the new Common::FS library * core: Migrate to the new Common::FS library * perf_stats: Migrate to the new Common::FS library * reporter: Migrate to the new Common::FS library * telemetry_session: Migrate to the new Common::FS library * key_manager: Migrate to the new Common::FS library * bis_factory: Migrate to the new Common::FS library * registered_cache: Migrate to the new Common::FS library * xts_archive: Migrate to the new Common::FS library * service: acc: Migrate to the new Common::FS library * applets/profile: Migrate to the new Common::FS library * applets/web: Migrate to the new Common::FS library * service: filesystem: Migrate to the new Common::FS library * loader: Migrate to the new Common::FS library * gl_shader_disk_cache: Migrate to the new Common::FS library * nsight_aftermath_tracker: Migrate to the new Common::FS library * vulkan_library: Migrate to the new Common::FS library * configure_debug: Migrate to the new Common::FS library * game_list_worker: Migrate to the new Common::FS library * config: Migrate to the new Common::FS library * configure_filesystem: Migrate to the new Common::FS library * configure_per_game_addons: Migrate to the new Common::FS library * configure_profile_manager: Migrate to the new Common::FS library * configure_ui: Migrate to the new Common::FS library * input_profiles: Migrate to the new Common::FS library * yuzu_cmd: config: Migrate to the new Common::FS library * yuzu_cmd: Migrate to the new Common::FS library * vfs_real: Migrate to the new Common::FS library * vfs: Migrate to the new Common::FS library * vfs_libzip: Migrate to the new Common::FS library * service: bcat: Migrate to the new Common::FS library * yuzu: main: Migrate to the new Common::FS library * vfs_real: Delete the contents of an existing file in CreateFile Current usages of CreateFile expect to delete the contents of an existing file, retain this behavior for now. * input_profiles: Don't iterate the input profile dir if it does not exist Silences an error produced in the log if the directory does not exist. * game_list_worker: Skip parsing file if the returned VfsFile is nullptr Prevents crashes in GetLoader when the virtual file is nullptr * common: fs: Validate paths for path length * service: filesystem: Open the mod load directory as read only	2021-05-25 19:32:56 -04:00
bunnei	5068279f23	Merge pull request #6248 from A-w-x/intelmesa gl_device: Intel: Disable texture view formats workaround on mesa	2021-05-20 23:47:14 -07:00
bunnei	7d86a6ff02	Merge pull request #6317 from ameerj/fps-fix perf_stats: Rework FPS counter to be more accurate	2021-05-18 19:56:29 -07:00
bunnei	a1138028a8	Merge pull request #6289 from ameerj/oob-blit texture_cache: Handle out of bound texture blits	2021-05-15 21:32:37 -07:00
ameerj	5bef54618a	perf_stats: Rework FPS counter to be more accurate The FPS counter was based on metrics in the nvdisp swapbuffers call. This metric would be accurate if the gpu thread/renderer were synchronous with the nvdisp service, but that's no longer the case. This commit moves the frame counting responsibility onto the concrete renderers after their frame draw calls. Resulting in more meaningful metrics. The displayed FPS is now made up of the average framerate between the previous and most recent update, in order to avoid distracting FPS counter updates when framerate is oscillating between close values. The status bar update frequency was also changed from 2 seconds to 500ms.	2021-05-15 20:34:20 -04:00
ameerj	3671fd0a97	texture_cache: Handle out of bound texture blits Some games interleave a texture blit using regions which are out-of-bounds. This addresses the interleaving to avoid oob reads from the src texture.	2021-05-07 22:14:21 -04:00
bunnei	2a7eff57a8	hle: kernel: Rename Process to KProcess.	2021-05-05 16:40:52 -07:00
A-w-x	6a2084a204	gl_device: Intel: Disable texture view formats workaround on mesa	2021-04-26 18:14:10 +02:00
bunnei	a4c6712a4b	common: Move settings to common from core. - Removes a dependency on core and input_common from common.	2021-04-14 16:24:03 -07:00
Rodrigo Locatti	5ee669466f	Merge pull request #5927 from ameerj/astc-compute video_core: Accelerate ASTC texture decoding using compute shaders	2021-03-30 19:31:52 -03:00
ameerj	2f83d9a61b	astc_decoder: Refactor for style and more efficient memory use	2021-03-25 16:53:51 -04:00
Jan Beich	8c016b02e7	gl_device: unblock async shaders on other Unix systems Mesa is the primary OpenGL provider on all FreeDesktop systems. For example, iris is used on Intel GPU + FreeBSD by default.	2021-03-24 19:59:20 +00:00
lat9nq	538f097f97	gl_device: Block async shaders on AMD and Intel Currently, the Windows versions of the Intel OpenGL driver and the AMD proprietary OpenGL driver do not properly support (or in fact degrade) when asynchronous shader compilation is enabled. This blocks specifically those drivers from using this feature. This affects AMDGPU-PRO on Linux, and AMD's and Intel's OpenGL drivers on Windows.	2021-03-21 01:25:45 -04:00
Rodrigo Locatti	2f30c10584	astc_decoder: Reimplement Layers Reimplements the approach to decoding layers in the compute shader. Fixes multilayer astc decoding when using Vulkan.	2021-03-13 12:16:03 -05:00
ameerj	f6566338eb	host_shaders: Modify shader cmake integration to allow for larger shaders using a raw string to encapsulate the entire shader code limits us to shaders of size less than 2KB. This change overcomes this limitation.	2021-03-13 12:16:03 -05:00
ameerj	2985e5e94c	renderer_opengl: Accelerate ASTC texture decoding with a compute shader ASTC texture decoding is currently handled by a CPU decoder for GPU's without native ASTC decoding support (most desktop GPUs). This is the cause for noticeable performance degradation in titles which use the format extensively. This commit adds support to accelerate ASTC decoding using a compute shader on OpenGL for GPUs without native support.	2021-03-13 12:16:03 -05:00
Rodrigo Locatti	daf5c5060b	Merge pull request #5891 from ameerj/bgra-ogl renderer_opengl: Use compute shaders to swizzle BGR textures on copy	2021-03-09 02:47:51 -03:00
ameerj	5213f70230	texture_cache: Blacklist BGRA8 copies and views on OpenGL In order to force the BGRA8 conversion on Nvidia using OpenGL, we need to forbid texture copies and views with other formats. This commit also adds a boolean relating to this, as this needs to be done only for the OpenGL api, Vulkan must remain unchanged.	2021-03-04 14:14:49 -05:00
ameerj	0639244d85	renderer_opengl: Swizzle BGR textures on copy OpenGL does not natively support BGR internal formats, which causes many BGR textures to render incorrectly, with Red and Blue channels swapped. This commit aims to address this by swizzling the blue and red channels on texture copies when a BGR format is encountered.	2021-03-04 14:14:19 -05:00
ReinUsesLisp	5ad62e7bfc	buffer_cache: Heuristically decide to skip cache on uniform buffers Some games benefit from skipping caches (Pokémon Sword), and others don't (Animal Crossing: New Horizons). Add a heuristic to decide this at runtime. The cache hit ratio has to be ~98% or better to not skip the cache. There are 16 frames of buffer.	2021-03-02 02:44:19 -03:00
Kelebek1	d31dbb1bc1	Implement glDepthRangeIndexeddNV	2021-02-24 22:26:53 +00:00
Morph	1a5d4d7840	gl_disk_shader_cache: Log total shader entries count on game load	2021-02-20 11:08:19 -05:00
ameerj	c7325c6a4c	gl_texture_cache: Lazily create non-sRGB texture views for sRGB formats This creates non-sRGB texture views for sRGB texture formats to allow for interfacing with these views in compute shaders using imageLoad and imageStore. Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>	2021-02-13 13:27:50 -05:00
Morph	83227ad981	Merge pull request #5919 from ReinUsesLisp/stream-buffer-tragic gl_stream_buffer/vk_staging_buffer_pool: Fix size check	2021-02-13 21:25:45 +08:00
ReinUsesLisp	682d82faf3	gl_stream_buffer/vk_staging_buffer_pool: Fix size check Fix a tragic off-by-one condition that causes Vulkan's stream buffer to think it's always full, using fallback memory. The OpenGL was also affected by this bug to a lesser extent.	2021-02-13 05:11:48 -03:00
LC	6f1ad6aa9f	Merge pull request #5916 from ameerj/maxwell-gl-unused maxwell_to_gl: Remove unused code	2021-02-13 02:55:59 -05:00
ReinUsesLisp	5b35b01070	video_core: Fix clang build issues	2021-02-13 02:26:47 -03:00
ReinUsesLisp	0b631f22fc	renderer_opengl: Remove interop Remove unused interop code from the OpenGL backend.	2021-02-13 02:18:04 -03:00
ReinUsesLisp	3da87d3f12	gl_buffer_cache: Drop interop based parameter buffer workarounds Sacrifice runtime performance to avoid generating kernel exceptions on Windows due to our abusive aliasing of interop buffer objects.	2021-02-13 02:17:24 -03:00
ReinUsesLisp	35df1d1864	vk_staging_buffer_pool: Add stream buffer for small uploads This uses a ring buffer similar to OpenGL's stream buffer for small uploads. This stops us from allocating several small buffers, reducing memory fragmentation and improving cache locality. It uses dedicated allocations when possible.	2021-02-13 02:17:24 -03:00
ReinUsesLisp	82c2601555	video_core: Reimplement the buffer cache Reimplement the buffer cache using cached bindings and page level granularity for modification tracking. This also drops the usage of shared pointers and virtual functions from the cache. - Bindings are cached, allowing to skip work when the game changes few bits between draws. - OpenGL Assembly shaders no longer copy when a region has been modified from the GPU to emulate constant buffers, instead GL_EXT_memory_object is used to alias sub-buffers within the same allocation. - OpenGL Assembly shaders stream constant buffer data using glProgramBufferParametersIuivNV, from NV_parameter_buffer_object. In theory this should save one hash table resolve inside the driver compared to glBufferSubData. - A new OpenGL stream buffer is implemented based on fences for drivers that are not Nvidia's proprietary, due to their low performance on partial glBufferSubData calls synchronized with 3D rendering (that some games use a lot). - Most optimizations are shared between APIs now, allowing Vulkan to cache more bindings than before, skipping unnecessary work. This commit adds the necessary infrastructure to use Vulkan objects from OpenGL. Overall, it improves performance and fixes some bugs present on the old cache. There are still some edge cases hit by some games that harm performance on some vendors; these are planned to be fixed in later commits.	2021-02-13 02:17:22 -03:00
ReinUsesLisp	75ccd9959c	gpu: Report renderer errors with exceptions Instead of using a two step initialization to report errors, initialize the GPU renderer and rasterizer on the constructor and report errors through std::runtime_error.	2021-02-13 02:16:19 -03:00
ameerj	069afcc633	maxwell_to_gl: Remove unused code Removes unused declarations in maxwell_to_gl.h	2021-02-12 23:01:09 -05:00
Lioncash	10636d2494	gl_rasterizer: Remove unused variables Resolves warnings on clang 12	2021-02-09 17:31:37 -05:00
Morph	6e5cc977ad	renderer_opengl: Update OpenGL backend version requirement to 4.6	2021-02-07 16:32:35 -05:00
bunnei	45b13c3037	Merge pull request #5786 from ReinUsesLisp/glsl-cbuf gl_shader_decompiler: Fix constant buffer size calculation	2021-01-27 15:27:53 -08:00
ReinUsesLisp	436457b6e7	gl_shader_decompiler: Fix constant buffer size calculation The divide logic was wrong and can cause an uniform buffer size overflow.	2021-01-21 19:47:41 -03:00
ReinUsesLisp	51512d01d8	renderer_opengl: Avoid precompiled cache and force NV GL cache directory Setting __GL_SHADER_DISK_CACHE_PATH we can force the cache directory to be in yuzu's user directory to stop commonly distributed malware from deleting our driver shader cache. And by setting __GL_SHADER_DISK_CACHE_SKIP_CLEANUP we can have an unbounded shader cache size. This has only been implemented on Windows, mostly because previous tests didn't seem to work on Linux. Disable the precompiled cache on Nvidia's driver. There's no need to hide information the driver already has in its own cache.	2021-01-21 00:41:03 -03:00
ReinUsesLisp	7d904fef2e	gl_texture_cache: Avoid format views on Intel and AMD Intel and AMD proprietary drivers are incapable of rendering to texture views of different formats than the original texture. Avoid creating these at a cache level. This will consume more memory, emulating them with copies.	2021-01-04 02:06:40 -03:00
ReinUsesLisp	3a49c1a691	gl_texture_cache: Create base images with sRGB This breaks accelerated decoders trying to imageStore into images with sRGB. The decoders are currently disabled so this won't cause issues at runtime.	2021-01-04 01:54:54 -03:00
ReinUsesLisp	9764c13d6d	video_core: Rewrite the texture cache The current texture cache has several points that hurt maintainability and performance. It's easy to break unrelated parts of the cache when doing minor changes. The cache can easily forget valuable information about the cached textures by CPU writes or simply by its normal usage. This commit aims to address those issues.	2020-12-30 03:38:50 -03:00
bunnei	d1a2b3fb18	Merge pull request #5162 from lioncash/copy-shader gl_shader_decompiler: Elide unnecessary copies within DeclareConstantBuffers()	2020-12-10 00:11:11 -08:00
Lioncash	09fa1d6a73	video_core: Make use of ordered container contains() where applicable With C++20, we can use the more concise contains() member function instead of comparing the result of the find() call with the end iterator.	2020-12-07 16:30:39 -05:00
Lioncash	edcbd47800	gl_shader_decompiler: Elide unnecessary copies within DeclareConstantBuffers() Resolves a -Wrange-loop-analysis warning.	2020-12-07 14:01:52 -05:00
Lioncash	4c5f5c9bf3	video_core: Remove unnecessary enum class casting in logging messages fmt now automatically prints the numeric value of an enum class member by default, so we don't need to use casts any more. Reduces the line noise a bit.	2020-12-07 00:41:50 -05:00
LC	69af6ada2f	Merge pull request #5136 from lioncash/video-shadow3 video_core: Resolve more variable shadowing scenarios pt.3	2020-12-07 00:06:53 -05:00
comex	d637114c17	video_core: Adjust `NUM` macro to avoid Clang warning The previous definition was: #define NUM(field_name) (sizeof(Maxwell3D::Regs::field_name) / sizeof(u32)) In cases where `field_name` happens to refer to an array, Clang thinks `sizeof(an array value) / sizeof(a type)` is an instance of the idiom where `sizeof` is used to compute an array length. So it thinks the type in the denominator ought to be the array element type, and warns if it isn't, assuming this is a mistake. In reality, `NUM` is not used to get array lengths at all, so there is no mistake. Silence the warning by applying Clang's suggested workaround of parenthesizing the denominator.	2020-12-06 18:24:16 -05:00
Lioncash	f95602f152	video_core: Resolve more variable shadowing scenarios pt.3 Cleans out the rest of the occurrences of variable shadowing and makes any further occurrences of shadowing compiler errors.	2020-12-05 16:02:23 -05:00
bunnei	e6a896c4bd	Merge pull request #5124 from lioncash/video-shadow video_core: Resolve more variable shadowing scenarios	2020-12-05 00:48:08 -08:00
FearlessTobi	37d672bf08	Fix telemetry-related exit crash from use-after-free Co-Authored-By: xperia64 <xperia64@users.noreply.github.com>	2020-12-05 02:42:50 +01:00
Lioncash	677a8b208d	video_core: Resolve more variable shadowing scenarios Resolves variable shadowing scenarios up to the end of the OpenGL code to make it nicer to review. The rest will be resolved in a following commit.	2020-12-04 16:19:09 -05:00
comex	994f497781	Overhaul EmuWindow::PollEvents to fix yuzu-cmd calling SDL_PollEvents off main thread EmuWindow::PollEvents was called from the GPU thread (or the CPU thread in sync-GPU mode) when swapping buffers. It had three implementations: - In GRenderWindow, it didn't actually poll events, just set a flag and emit a signal to indicate that a frame was displayed. - In EmuWindow_SDL2_Hide, it did nothing. - In EmuWindow_SDL2, it did call SDL_PollEvents, but this is wrong because SDL_PollEvents is supposed to be called on the thread that set up video - in this case, the main thread, which was sleeping in a busyloop (regardless of whether sync-GPU was enabled). On macOS this causes a crash. To fix this: - Rename EmuWindow::PollEvents to OnFrameDisplayed, and give it a default implementation that does nothing. - In EmuWindow_SDL2, do not override OnFrameDisplayed, but instead have the main thread call SDL_WaitEvent in a loop.	2020-11-23 17:58:49 -05:00
Morph	e13a91fa9b	Merge pull request #4954 from lioncash/compare gl_rasterizer: Make floating-point literal a float	2020-11-22 09:55:23 +08:00
ReinUsesLisp	acc14d233f	gl_rasterizer: Remove warning of untested alpha test Alpha test has been proven to only affect the first render target.	2020-11-20 23:17:40 -03:00
Lioncash	8469b76630	gl_rasterizer: Make floating-point literal a float Gets rid of an unnecessary expansion from float to double.	2020-11-20 04:24:33 -05:00
ReinUsesLisp	657771bdcb	shader: Partially implement texture cube array shadow This implements texture cube arrays with shadow comparisons but doesn't fix the asserts related to it. Fixes out of bounds reads on swizzle constructors and makes them use bounds checked ::at instead of the unsafe operator[].	2020-10-28 17:12:40 -03:00
ReinUsesLisp	79da90cea8	video_core: Enforce -Wredundant-move and -Wpessimizing-move Silence three warnings and make them errors to avoid introducing more in the future.	2020-10-28 02:44:50 -03:00
ReinUsesLisp	f21a189148	gl_arb_decompiler: Implement robust buffer operations This emulates the behavior we get on GLSL with regular SSBOs with a pointer + length pair. It aims to be consistent with the crashes we might get. Out of bounds stores are ignored. Atomics are ignored and return zero. Reads return zero.	2020-10-20 03:34:32 -03:00
ReinUsesLisp	2a24b1c973	video_core: Enforce -Wunused-variable and -Wunused-but-set-variable	2020-10-02 21:19:35 -03:00
bunnei	d66b897a6d	Merge pull request #4674 from ReinUsesLisp/timeline-semaphores renderer_vulkan: Make unconditional use of VK_KHR_timeline_semaphore	2020-09-23 18:24:27 -07:00
Lioncash	ff45c39578	General: Make use of std::nullopt where applicable Allows some implementations to avoid completely zeroing out the internal buffer of the optional, and instead only set the validity byte within the structure. This also makes it consistent how we return empty optionals.	2020-09-22 17:32:33 -04:00
ReinUsesLisp	7003090187	renderer_opengl: Remove emulated mailbox presentation Emulated mailbox presentation was causing performance issues on Nvidia's OpenGL driver. Remove it.	2020-09-20 16:29:41 -03:00
ReinUsesLisp	58b0ae84b5	renderer_vulkan: Make unconditional use of VK_KHR_timeline_semaphore This reworks how host<->device synchronization works on the Vulkan backend. Instead of "protecting" resources with a fence and signalling these as free when the fence is known to be signalled by the host GPU, use timeline semaphores. Vulkan timeline semaphores allow us to work on a subset of D3D12 fences. As far as we are concerned, timeline semaphores are a value set by the host or the device that can be waited by either of them. Taking advantage of this, we can have a monotonically increasing atomic value for each submission to the graphics queue. Instead of protecting resources with a fence, we simply store the current logical tick (the atomic value stored in CPU memory). When we want to know if a resource is free, it can be compared to the current GPU tick. This greatly simplifies resource management code and the free status of resources should have fewer false negatives. To work around bugs in validation layers, when these are attached there's a thread waiting for timeline semaphores.	2020-09-19 01:46:37 -03:00
ReinUsesLisp	eb914b6c50	video_core: Enforce -Werror=switch This forces us to fix all -Wswitch warnings in video_core.	2020-09-16 17:48:01 -03:00
ReinUsesLisp	9e87193725	video_core: Remove all Core::System references in renderer Now that the GPU is initialized when video backends are initialized, it's no longer needed to query components once the game is running: it can be done when yuzu is booting. This allows us to pass components between constructors and in the process remove all Core::System references in the video backend.	2020-09-06 05:28:48 -03:00
bunnei	1bb8c27a70	Merge pull request #4569 from ReinUsesLisp/glsl-cmake video_core/host_shaders: Add CMake integration for string shaders	2020-08-26 22:57:39 -04:00
bunnei	bb752df736	Merge pull request #4542 from ReinUsesLisp/gpu-init-base video_core: Initialize renderer with a GPU	2020-08-24 22:56:11 -04:00
Lioncash	bae4e6c2f5	gl_texture_cache: Take std::string by reference in DecorateViewName() LabelGLObject takes a string_view, so we don't need to make copies of the std::string.	2020-08-23 23:36:33 -04:00
Lioncash	f3bb52c0a9	video_core/fence_manager: Remove unnecessary includes Avoids pulling in unnecessary things that can cause rebuilds when they aren't required.	2020-08-23 21:44:50 -04:00
ReinUsesLisp	91df2beee3	video_core/host_shaders: Add CMake integration for string shaders Add the necessary CMake code to copy the contents in a string source shader (GLSL or GLASM) to a header file then consumed by video_core files. This allows editing GLSL in its own files without having to maintain them in source files. For now, only OpenGL presentation shaders are moved, but we can add GLASM presentation shaders and static SPIR-V generation through glslangValidator in the future.	2020-08-23 21:37:20 -03:00
ReinUsesLisp	0eaf7e1daa	gl_shader_util: Use std::string_view instead of star pointer This allows us to pass any type of string and hint the length of the string to the OpenGL driver.	2020-08-23 21:23:54 -03:00
ReinUsesLisp	da53bcee60	video_core: Initialize renderer with a GPU Add an extra step in GPU initialization to be able to initialize render backends with a valid GPU instance.	2020-08-22 01:51:45 -03:00
bunnei	baff9ffcac	Merge pull request #4521 from lioncash/optionalcache gl_shader_disk_cache: Make use of std::nullopt where applicable	2020-08-21 23:56:55 -04:00
Lioncash	f6bb905182	common/telemetry: Migrate namespace into the Common namespace Migrates the Telemetry namespace into the Common namespace to make the code consistent with the rest of our common code.	2020-08-18 15:08:32 -04:00
bunnei	56c6a5def8	Merge pull request #4535 from lioncash/fileutil common/fileutil: Convert namespace to Common::FS	2020-08-17 22:35:30 -04:00
ameerj	1b829fbd7a	move thread 1/4 count computation into allocate workers method	2020-08-16 12:02:22 -04:00
Lioncash	c4ed791164	common/fileutil: Convert namespace to Common::FS Migrates a remaining common file over to the Common namespace, making it consistent with the rest of common files. This also allows for high-traffic FS related code to alias the filesystem function namespace as namespace FS = Common::FS; for more concise typing.	2020-08-16 06:52:40 -04:00
Lioncash	1ee060ca0d	common/compression: Roll back std::span changes Seems like all compilers don't support std::span yet.	2020-08-15 17:17:56 -04:00
bunnei	feb243b08d	Merge pull request #4416 from lioncash/span lz4_compression/zstd_compression: Make use of std::span in interfaces	2020-08-15 00:53:11 -04:00
Lioncash	c8135b3c18	gl_shader_disk_cache: Make use of std::nullopt where applicable Allows the compiler to avoid unnecessarily zeroing out the internal buffer of std::optional on some implementations.	2020-08-14 08:20:44 -04:00
Morph	e0ff98dd34	gl_shader_cache: Use std::max() for determining num_workers Does not allocate more threads than available in the host system for boot-time shader compilation and always allocates at least 1 thread if hardware_concurrency() returns 0.	2020-08-12 09:23:34 -04:00
Morph	e8f22730d1	renderer_opengl: Use 1/4 of all threads for async shader compilation	2020-07-28 05:08:27 -04:00
Lioncash	c5bdccfecb	zstd_compression: Make use of std::span in interfaces Allows condensing the data and size parameters into a single argument.	2020-07-25 03:11:56 -04:00
bunnei	f650cf8a9a	Merge pull request #4391 from lioncash/nrvo video_core: Allow copy elision to take place where applicable	2020-07-24 06:33:09 -07:00
bunnei	1d7de0a8ee	Merge pull request #4394 from lioncash/unused6 video_core: Remove unused variables	2020-07-23 19:54:59 -07:00
Rodrigo Locatti	7278c59d70	Merge pull request #4359 from ReinUsesLisp/clamp-shared renderer_{opengl,vulkan}: Clamp shared memory to host's limit	2020-07-21 04:51:05 -03:00
Rodrigo Locatti	721e6015a8	Merge pull request #4360 from ReinUsesLisp/glasm-bar gl_arb_decompiler: Execute BAR even when inside control flow	2020-07-21 04:50:55 -03:00
Lioncash	e17fb5ee97	video_core: Remove unused variables Silences several compiler warnings about unused variables.	2020-07-21 00:57:25 -04:00
Lioncash	6adc824d9d	video_core: Allow copy elision to take place where applicable Removes const from some variables that are returned from functions, as this allows the move assignment/constructors to execute for them.	2020-07-21 00:36:13 -04:00
bunnei	3d13d7f48f	Merge pull request #4324 from ReinUsesLisp/formats video_core: Fix, add and rename pixel formats	2020-07-21 00:13:04 -04:00
ReinUsesLisp	a8a2526128	gl_arb_decompiler: Use NV_shader_buffer_{load,store} on assembly shaders NV_shader_buffer_{load,store} is a 2010 extension that allows GL applications to use what in Vulkan is known as physical pointers; these are basically C pointers. On GLASM these are exposed through the LOAD/STORE/ATOM instructions. Up until now, assembly shaders were using NV_shader_storage_buffer_object. These work fine, but have a (probably unintended) limitation that forces us to have the limit of a single stage for all shader stages. In contrast, with NV_shader_buffer_{load,store} we can pass GPU addresses to the shader through local parameters (the GLASM equivalent of uniform constants, or push constants on Vulkan). Local parameters have the advantage of being per stage, allowing us to generate code without worrying about binding overlaps.	2020-07-18 01:59:57 -03:00
David Marcec	2ba195aa0d	Drop max workers from 8->2 for testing	2020-07-17 14:26:15 +10:00
David Marcec	85d7a8f466	Rebase for per game settings	2020-07-17 14:26:14 +10:00
David Marcec	468bd9c1b0	async shaders	2020-07-17 14:24:57 +10:00
ReinUsesLisp	88e57b13e0	gl_arb_decompiler: Execute BAR even when inside control flow Unlike GLSL, GLASM allows us to call BAR inside control flow. - Fixes graphical artifacts in Paper Mario.	2020-07-16 16:05:52 -03:00
ReinUsesLisp	a5a72cbd20	renderer_{opengl,vulkan}: Clamp shared memory to host's limit This stops shaders from failing to build when they exceed the host's shared memory size limit. An error is logged.	2020-07-16 16:02:46 -03:00
ReinUsesLisp	fbc232426d	video_core: Rearrange pixel format names Normalizes pixel format names to match Vulkan names. Previous to this commit pixel formats had no convention, leading to confusion and potential bugs.	2020-07-13 01:44:23 -03:00
ReinUsesLisp	eda37ff26b	video_core: Fix DXT4 and RGB565	2020-07-13 01:01:09 -03:00
ReinUsesLisp	480850ffe7	video_core: Fix B5G6R5_UNORM render target format	2020-07-13 01:01:09 -03:00
ReinUsesLisp	990b14f181	video_core: Fix B5G6R5U	2020-07-13 01:01:09 -03:00
ReinUsesLisp	1d20aac795	video_core: Implement RGBA32_SINT render target	2020-07-13 01:01:09 -03:00
ReinUsesLisp	9338599d72	video_core: Implement RGBA32_SINT render target	2020-07-13 01:01:09 -03:00
ReinUsesLisp	95c0f5afe5	video_core: Implement RGBA16_SINT render target	2020-07-13 01:01:09 -03:00
ReinUsesLisp	977d6c46f3	video_core: Implement RGBA8_SINT render target	2020-07-13 01:01:09 -03:00
ReinUsesLisp	50c6030a8d	video_core: Implement RG32_SINT render target	2020-07-13 01:01:09 -03:00
ReinUsesLisp	e849d68048	video_core: Implement RG8_SINT render target and fix RG8_UINT	2020-07-13 01:01:09 -03:00
ReinUsesLisp	f29fede49c	video_core: Implement R8_SINT render target	2020-07-13 01:01:08 -03:00
ReinUsesLisp	fd33e996e0	video_core: Implement R8_SNORM render target	2020-07-13 01:01:08 -03:00
lat9nq	63d23835ef	configuration: implement per-game configurations (#4098) * Switch game settings to use a pointer In order to add full per-game settings, we need to be able to tell yuzu to switch to using either the global or game configuration. Using a pointer makes it easier to switch. * configuration: add new UI without changing existing functionality The new UI also adds General, System, Graphics, Advanced Graphics, and Audio tabs, but as yet they do nothing. This commit keeps yuzu to the same functionality as originally branched. * configuration: Rename files These weren't included in the last commit. Now they are. * configuration: setup global configuration checkbox Global config checkbox now enables/disables the appropriate tabs in the game properties dialog. The use global configuration setting is now saved to the config, defaulting to true. This also addresses some changes requested in the PR. * configuration: swap to per-game config memory for properties dialog Does not set memory going in-game. Swaps to game values when opening the properties dialog, then swaps back when closing it. Uses a `memcpy` to swap. Also implements saving config files, limited to certain groups of configurations so as to not risk setting unsafe configurations. * configuration: change config interfaces to use config-specific pointers When a game is booted, we need to be able to open the configuration dialogs without changing the settings pointer in the game's emulation. A new pointer specific to just the configuration dialogs can be used to separate changes to just those config dialogs without affecting the emulation. * configuration: boot a game using per-game settings Swaps values where needed to boot a game. * configuration: use correct config during emulation Creates a new pointer specifically for modifying the configuration while emulation is in progress.
Both the regular configuration dialog and the game properties dialog now use the pointer Settings::config_values to focus edits to the correct struct. * settings: split Settings::values into two different structs By splitting the settings into two mutually exclusive structs, it becomes easier, as a developer, to determine how to use the Settings structs after per-game configurations are merged. Other benefits include only duplicating the required settings in memory. * settings: move use_docked_mode to Controls group `use_docked_mode` is set in the input settings and cannot be accessed from the system settings. Grouping it with system settings causes it to be saved with per-game settings, which may make transferring configs more difficult later on, especially since docked mode cannot be set from within the game properties dialog. * configuration: Fix the other yuzu executables and a regression In main.cpp, we have to get the title ID before the ROM is loaded, else the renderer will reflect only the global settings and not the user's game specific settings. * settings: use a template to duplicate memory for each setting Replaces the type of each variable in the Settings::Values struct with a new class that allows basic data reading and writing. The new struct Settings::Setting duplicates the data in memory and can manage global overrides per each setting. * configuration: correct add-ons config and swap settings when appropriate Any add-ons interaction happens directly through the global values struct. Swapping between structs now also includes copying the necessary global configs that cannot be changed nor saved in per-game settings. General and System config menus now update based on whether it is viewing the global or per-game settings. * settings: restore old values struct No longer needed with the Settings::Setting class template. * configuration: implement hierarchical game properties dialog This sets the appropriate global or local data in each setting.
* clang format * clang format take 2 can the docker container save this? * address comments and style issues * config: read and write settings with global awareness Adds new functions to read and write settings while keeping the global state in focus. Files now generated per-game are much smaller since often they only need address the global state. * settings: restore global state when necessary Upon closing a game or the game properties dialog, we need to restore all global settings to the original global state so that we can properly open the configuration dialog or boot a different game. * configuration: guard setting values incorrectly This disables setting values while a game is running if the setting is overwritten by a per game setting. * config: don't write local settings in the global config Simple guards to prevent writing the wrong settings in the wrong files. * configuration: add comments, assume less, and clang format No longer assumes that a disabled UI element means the global state is turned off, instead opting to directly answer that question. Still however assumes a game is running if it is in that state. * configuration: fix a logic error Should not be negated * restore settings' global state regardless of accept/cancel Fixes loading a properties dialog and causing the global config dialog to show local settings. * fix more logic errors Fixed the frame limit would set the global setting from the game properties dialog. Also strengthened the Settings::Setting member variables and simplified the logic in config reading (ReadSettingGlobal). * fix another logic error In my efforts to guard RestoreGlobalState, I accidentally negated the IsPowered condition. 
* configure_audio: set toggle_stretched_audio to tristate * fixed custom rtc and rng seed overwriting the global value * clang format * rebased * clang format take 4 * address my own review Basically revert unintended changes * settings: literal instead of casting "No need to cast, use 1U instead" Thanks, Morph! Co-authored-by: Morph <39850852+Morph1984@users.noreply.github.com> * Revert "settings: literal instead of casting " This reverts commit 95e992a87c898f3e882ffdb415bb0ef9f80f613f. * main: fix status buttons reporting wrong settings after stop emulation * settings: Log UseDockedMode in the Controls group This should have happened when use_docked_mode was moved over to the controls group internally. This just reflects this in the log. * main: load settings if the file has a title id In other words, don't exit if the loader has trouble getting a title id. * use a zero * settings: initalize resolution factor with constructor instead of casting * Revert "settings: initalize resolution factor with constructor instead of casting" This reverts commit 54c35ecb46a29953842614620f9b7de1aa9d5dc8. * configure_graphics: guard device selector when Vulkan is global Prevents the user from editing the device selector if Vulkan is the global renderer backend. Also resets the vulkan_device variable when the users switches back-and-forth between global and Vulkan. * address reviewer concerns Changes function variables to const wherever they don't need to be changed. Sets Settings::Setting to final as it should not be inherited from. Sets ConfigurationShared::use_global_text to static. Co-Authored-By: VolcaEM <volcaem@users.noreply.github.com> * main: load per-game settings after LoadROM This prevents `Restart Emulation` from restoring the global settings after the per-game settings were applied. Thanks to BSoDGamingYT for finding this bug. * Revert "main: load per-game settings after LoadROM" This reverts commit 9d0d48c52d2dcf3bfb1806cc8fa7d5a271a8a804. 
* main: only restore global settings when necessary Loading the per-game settings cannot happen after the ROM is loaded, so we have to specify when to restore the global state. Again thanks to BSoD for finding the bug. * configuration_shared: address reviewer concerns except operator overrides Dropping operator override usage in next commit. Co-Authored-By: LC <lioncash@users.noreply.github.com> * settings: Drop operator overrides from Setting template Requires using GetValue and SetValue explicitly. Also reverts a change that broke title ID formatting in the game properties dialog. * complete rebase * configuration_shared: translate "Use global configuration" Uses ConfigurePerGame to do so, since its usage, at least as of now, corresponds with ConfigurationShared. * configure_per_game: address reviewer concern As far as I understand, it prevents the program from unnecessarily copying strings. Co-Authored-By: LC <lioncash@users.noreply.github.com> Co-authored-by: Morph <39850852+Morph1984@users.noreply.github.com> Co-authored-by: VolcaEM <volcaem@users.noreply.github.com> Co-authored-by: LC <lioncash@users.noreply.github.com>	2020-07-09 22:42:09 -04:00
bunnei	41a333321a	Merge pull request #4175 from ReinUsesLisp/read-buffer gl_buffer_cache: Copy to buffers created as STREAM_READ before downloading	2020-07-02 23:30:08 -04:00
Rodrigo Locatti	c58e21cd76	Merge pull request #4082 from Morph1984/mirror-once-clamp maxwell_to_gl: Implement MirrorOnceClampOGL wrap mode using GL_MIRROR_CLAMP_EXT	2020-07-02 04:57:40 -03:00
Fernando Sahmkow	977a3ab352	Merge pull request #4157 from ReinUsesLisp/unified-turing gl_device: Enable NV_vertex_buffer_unified_memory on Turing devices	2020-06-30 14:36:51 -04:00
Morph	1b31755ba6	maxwell_to_gl: Implement MirrorOnceClampOGL using GL_MIRROR_CLAMP_EXT Like MirrorOnceBorder, this requires the GL_EXT_texture_mirror_clamp extension. This extension is unfortunately not available on Intel's drivers (both Windows proprietary and Linux Mesa). Use GL_MIRROR_CLAMP_TO_EDGE as a fallback if the extension is unavailable.	2020-06-30 02:40:14 -04:00
Morph	10eca7f651	maxwell_to_gl: Rename VertexType() to VertexFormat()	2020-06-29 11:48:38 -04:00
Morph	78d80d99a0	maxwell_to_gl: Add 32 bit component sizes to (un)signed scaled formats Add 32 bit component sizes to (un)signed scaled formats and group (un)signed normalized, scaled, and integer formats together.	2020-06-28 02:51:13 -04:00
ReinUsesLisp	6481d91e4a	gl_buffer_cache: Copy to buffers created as STREAM_READ before downloading After marking buffers as resident, Nvidia's driver seems to take a slow path. To work around this issue, copy to a STREAM_READ buffer and then call GetNamedBufferSubData on it. This is a temporary solution until we have asynchronous flushing.	2020-06-26 16:58:40 -03:00
Rodrigo Locatti	5872fc21fe	Merge pull request #4151 from ReinUsesLisp/gl-invalidations gl_shader_cache: Avoid use after move for program size	2020-06-25 21:05:27 -03:00
David Marcec	a927d8be52	gl_device: Fix IsASTCSupported Other targets were never actually checked	2020-06-25 19:12:56 +10:00
ReinUsesLisp	bc8d3b8f82	gl_device: Enable NV_vertex_buffer_unified_memory on Turing devices Once we make sure not to corrupt Nvidia's driver, we can safely use resident buffers on Turing devices. See GitHub pull request #4156	2020-06-25 01:28:47 -03:00
ReinUsesLisp	32a2dcd415	buffer_cache: Use buffer methods instead of cache virtual methods	2020-06-24 02:36:14 -03:00
ReinUsesLisp	39c97f1b65	gl_stream_buffer: Use InvalidateBufferData instead of unmap and map Making the stream buffer resident increases GPU usage significantly on some games. This seems to be addressed by invalidating the stream buffer with InvalidateBufferData instead of using an Unmap + Map (with invalidation flags).	2020-06-24 02:36:14 -03:00
ReinUsesLisp	41a4090320	gl_rasterizer: Use NV_vertex_buffer_unified_memory for vertex buffer robustness Switch games are allowed to bind less data than what they use in a vertex buffer, the expected behavior here is that these values are read as zero. At the moment of writing this only D3D12, OpenGL and NVN through NV_vertex_buffer_unified_memory support vertex buffer with a size limit. In theory this could be emulated on Vulkan creating a new VkBuffer for each (handle, offset, length) tuple and binding the expected data to it. This is likely going to be slow and memory expensive when used on the vertex buffer and we have to do it on all draws because we can't know without analyzing indices when a game is going to read vertex data out of bounds. This is not a problem on OpenGL's BufferAddressRangeNV because it takes a length parameter, unlike Vulkan's CmdBindVertexBuffers that only takes buffers and offsets (the length is implicit in VkBuffer). It isn't a problem on D3D12 either, because D3D12_VERTEX_BUFFER_VIEW on IASetVertexBuffers takes SizeInBytes as a parameter (although I am not familiar with robustness on D3D12). Currently this only implements buffer ranges for vertex buffers, although indices can also be affected. A KHR_robustness profile is not created, but Nvidia's driver reads out of bound vertex data as zero anyway, this might have to be changed in the future. - Fixes SMO random triangles when capturing an enemy, getting hit, or looking at the environment on certain maps.	2020-06-24 02:36:14 -03:00
ReinUsesLisp	32485917ba	gl_buffer_cache: Mark buffers as resident Make stream buffer and cached buffers as resident and query their address. This allows us to use GPU addresses for several proprietary Nvidia extensions.	2020-06-24 02:36:14 -03:00
ReinUsesLisp	73fb3a304b	gl_device: Expose NV_vertex_buffer_unified_memory except on Turing Expose NV_vertex_buffer_unified_memory when the driver supports it. This commit adds a function to determine if a GL_RENDERER is a Turing GPU. This is required because on Turing GPUs Nvidia's driver crashes when the buffer is marked as resident or on DeleteBuffers. Without a synchronous debug output (single threaded driver), it's likely that the driver will crash in the first blocking call.	2020-06-24 02:36:14 -03:00
ReinUsesLisp	00c66a7289	gl_stream_buffer: Always use a non-coherent buffer	2020-06-24 02:35:33 -03:00
ReinUsesLisp	da79ec9565	gl_stream_buffer: Always use persistent memory maps yuzu no longer supports platforms without persistent maps.	2020-06-24 02:35:33 -03:00
Rodrigo Locatti	b66ccaa376	Merge pull request #4129 from Morph1984/texture-shadow-lod-workaround gl_shader_decompiler: Workaround textureLod when GL_EXT_texture_shadow_lod is not available	2020-06-24 01:51:15 -03:00
ReinUsesLisp	9f54cd4dad	gl_shader_cache: Avoid use after move for program size All programs had a size of zero due to this bug, skipping invalidations. While we are at it, remove some unused forward declarations.	2020-06-23 22:54:42 -03:00
Morph	f77c897b8d	gl_shader_decompiler: Enable GL_EXT_texture_shadow_lod if available Enable GL_EXT_texture_shadow_lod if available. If this extension is not available, such as on Intel/AMD proprietary drivers, use textureGrad as a workaround.	2020-06-20 23:02:29 -04:00
Morph	1e65da971b	gl_device: Check for GL_EXT_texture_shadow_lod	2020-06-20 22:14:32 -04:00
Lioncash	5865a10885	gl_arb_decompiler: Avoid several string copies Variables that are marked as const cannot have the move constructor invoked when returning from a function (the move constructor requires a non-const variable so it can "steal" the resources from it).	2020-06-19 23:09:16 -04:00
Morph	8868fb745f	maxwell_to_gl: Miscellaneous changes maxwell_to_gl: Log unimplemented features under UNIMPLEMENTED_MSG instead of LOG_ERROR to bring into parity with maxwell_to_vk maxwell_to_gl: Deduplicate logging in VertexType(), merging them into one. maxwell_to_gl: Return GL_NEAREST instead of GL_LINEAR if an unknown texture filter mode is encountered. maxwell_to_gl: Log the mipmap filter mode if an unknown value is passed in. maxwell_to_gl: Reorder filtering modes to start with None, then Nearest, then Linear.	2020-06-18 04:56:31 -04:00
Rodrigo Locatti	edb2114bac	Merge pull request #4092 from Morph1984/image-bindings gl_device: Reserve 4 image bindings for fragment stage	2020-06-18 04:59:48 -03:00
bunnei	798ec003ce	Merge pull request #4041 from ReinUsesLisp/arb-decomp gl_arb_decompiler: Implement an assembly shader decompiler	2020-06-16 14:56:23 -04:00
Morph	e2f5d16540	gl_device: Reserve at least 4 image bindings for fragment stage Due to the limitation of GL_MAX_IMAGE_UNITS being low (8) on Intel's and Nvidia's proprietary drivers, we have to reserve an appropriate amount of image bindings for each of the stages. So far games have been observed to use 4 image bindings on the fragment stage (Kirby Star Allies) and 1 on the vertex stage (TWD series). No games thus far in my limited testing used more than 4 images concurrently and across all currently active programs. This fixes shader compilation errors on Kirby Star Allies on OpenGL (GLSL/GLASM)	2020-06-16 03:03:07 -04:00
Rodrigo Locatti	0bd9bc7201	Merge pull request #4066 from ReinUsesLisp/shared-ptr-buf buffer_cache: Avoid passing references of shared pointers and misc style changes	2020-06-15 22:29:32 -03:00
bunnei	92021a344c	Merge pull request #4064 from ReinUsesLisp/invalidate-buffers gl_rasterizer: Mark vertex buffers as dirty after buffer cache invalidation	2020-06-14 00:29:16 -04:00
bunnei	c2ea1e1bcb	Merge pull request #4049 from ReinUsesLisp/separate-samplers shader/texture: Join separate image and sampler pairs offline	2020-06-13 13:48:27 -04:00
bunnei	5633887569	Merge pull request #3986 from ReinUsesLisp/shader-cache shader_cache: Implement a generic runtime shader cache	2020-06-12 23:14:48 -04:00
ReinUsesLisp	87011a97f9	gl_arb_decompiler: Implement FSwizzleAdd	2020-06-11 22:12:07 -03:00
ReinUsesLisp	a63a0daa5e	gl_arb_decompiler: Implement an assembly shader decompiler Emit code compatible with NV_gpu_program5. This should emit code compatible with Fermi, but it wasn't tested on that architecture. Pascal has some issues not present on Turing GPUs.	2020-06-11 22:12:07 -03:00
bunnei	83e3b77ed7	Merge pull request #4027 from ReinUsesLisp/3d-slices texture_cache: Implement rendering to 3D textures	2020-06-09 21:52:15 -04:00
ReinUsesLisp	6508cdd003	buffer_cache: Avoid passing references of shared pointers and misc style changes Instead of using a shared pointer as the template argument, use the underlying type and manage shared pointers explicitly. This can make removing shared pointers from the cache easier. While we are at it, make some misc style changes and general improvements (like insert_or_assign instead of operator[] + operator=).	2020-06-09 18:30:49 -03:00
ReinUsesLisp	7646f2c21d	gl_rasterizer: Mark vertex buffers as dirty after buffer cache invalidation Vertex buffers bindings become invalid after the stream buffer is invalidated. We were originally doing this, but it got lost at some point. - Fixes Animal Crossing: New Horizons, but it affects everything.	2020-06-08 20:24:16 -03:00
bunnei	3626254f48	Merge pull request #4040 from ReinUsesLisp/nv-transform-feedback gl_rasterizer: Use NV_transform_feedback for XFB on assembly shaders	2020-06-08 16:18:33 -04:00
bunnei	98d2461529	Merge pull request #4052 from ReinUsesLisp/debug-output renderer_opengl: Only enable DEBUG_OUTPUT when graphics debugging is enabled	2020-06-08 10:16:41 -04:00
ReinUsesLisp	3c2ae53b4c	texture_cache: Handle 3D texture blits with one layer	2020-06-08 05:01:00 -03:00
ReinUsesLisp	c95c254f3e	texture_cache: Implement rendering to 3D textures This allows rendering to 3D textures with more than one slice. Applications are allowed to render to more than one slice of a texture using gl_Layer from a VTG shader. This also requires reworking how 3D texture collisions are handled; for now, this commit allows rendering to slices but not to miplevels. When a render target attempts to write to a mipmap, we fall back to the previous implementation (copying or flushing as needed). - Fixes color correction 3D textures on UE4 games (rainbow effects). - Allows Xenoblade games to render to 3D textures directly.	2020-06-08 05:01:00 -03:00
ReinUsesLisp	abcea1bb18	rasterizer_cache: Remove files and includes The rasterizer cache is no longer used. Each cache has its own generic implementation optimized for the cached data.	2020-06-07 04:32:57 -03:00
ReinUsesLisp	678f95e4f8	vk_pipeline_cache: Use generic shader cache Trivially port the generic shader cache to Vulkan.	2020-06-07 04:32:57 -03:00
ReinUsesLisp	b96f65b62b	gl_shader_cache: Use generic shader cache Trivially port the generic shader cache to OpenGL.	2020-06-07 04:32:57 -03:00
ReinUsesLisp	e78d681a6c	gl_device: Black list NVIDIA 443.24 for fast buffer uploads Skip fast buffer uploads on Nvidia 443.24 Vulkan beta driver on OpenGL. This driver throws the following error when calling BufferSubData or BufferData on buffers that are candidates for fast constant buffer uploads. This is the equivalent to push constants on Vulkan, except that they can access the full buffer. The error: Unknown internal debug message. The NVIDIA OpenGL driver has encountered an out of memory error. This application might behave inconsistently and fail. If this error persists on future drivers, we might have to look deeper into this issue. For now, we can black list it and log it as a temporary solution.	2020-06-06 02:56:42 -03:00
ReinUsesLisp	354fbe701e	renderer_opengl: Only enable DEBUG_OUTPUT when graphics debugging is enabled Avoids logging when it's not relevant. This can potentially reduce driver's internal thread overhead.	2020-06-05 21:21:12 -03:00
ReinUsesLisp	5b2b6d594c	shader/texture: Join separate image and sampler pairs offline Games using D3D idioms can join images and samplers when a shader executes, instead of baking them into a combined sampler image. This is also possible on Vulkan. One approach to this solution would be to use separate samplers on Vulkan and leave this unimplemented on OpenGL, but we can't do this because there's no consistent way of determining which constant buffer holds a sampler and which one an image. We could in theory find the first bit and if it's in the TIC area, it's an image; but this falls apart when an image or sampler handle uses an index of zero. The approach used is to track a LOP.OR operation (this is done at an IR level, not at an ISA level), track again the constant buffers used as source and store this pair. Then, outside of shader execution, join the sampler and image pair with a bitwise or operation. This approach won't work on games that truly use separate samplers in a meaningful way. For example, pooling textures in a 2D array and determining at runtime what sampler to use. This invalidates OpenGL's disk shader cache :) - Used mostly by D3D ports to Switch	2020-06-05 00:24:51 -03:00
bunnei	22369df357	Merge pull request #4031 from Morph1984/fix-gs-outputs gl_shader_decompiler: Fix geometry shader outputs on Intel drivers	2020-06-04 15:18:51 -04:00
ReinUsesLisp	3d99b449d3	gl_rasterizer: Use NV_transform_feedback for XFB on assembly shaders NV_transform_feedback, NV_transform_feedback2 and ARB_transform_feedback3 with NV_transform_feedback interactions allows implementing transform feedbacks as dynamic state. Maxwell implements transform feedbacks as dynamic state, so using these extensions with TransformFeedbackStreamAttribsNV allows us to properly emulate transform feedbacks without having to recompile shaders when the state changes.	2020-06-03 20:22:12 -03:00
bunnei	623b93a2b3	Merge pull request #4014 from ReinUsesLisp/astc-nvidia gl_device: Avoid devices with CAVEAT_SUPPORT on ASTC	2020-06-02 17:43:33 -04:00
bunnei	597d8b4bd4	Merge pull request #4006 from ReinUsesLisp/squash-ubos glsl: Squash constant buffers into a single SSBO when we hit the limit	2020-06-02 14:58:50 -04:00
Morph	74f2e5f1a4	gl_shader_decompiler: Declare gl_Layer and gl_ViewportIndex within gl_PerVertex for vertex and tessellation shaders	2020-06-01 15:35:44 -04:00
Morph	70188d69b0	gl_shader_decompiler: Fix geometry shader outputs for Intel drivers On Intel's proprietary drivers, gl_Layer and gl_ViewportIndex are not allowed members of gl_PerVertex block, causing the shader to fail to compile. Fix this by declaring these variables outside of gl_PerVertex.	2020-06-01 15:34:05 -04:00
bunnei	6c0b1a9ee2	Merge pull request #3996 from ReinUsesLisp/front-faces fixed_pipeline_state,gl_rasterizer: Swap negative viewport checks for front faces	2020-06-01 14:04:35 -04:00
ReinUsesLisp	0ee310ebdc	gl_device: Avoid devices with CAVEAT_SUPPORT on ASTC This avoids using Nvidia's ASTC decoder on OpenGL. The last time it was profiled, it was slower than yuzu's decoder. While we are at it, fix a bug in the texture cache when native ASTC is not supported.	2020-05-31 21:34:34 -03:00
ReinUsesLisp	ee21e4ecd3	glsl: Squash constant buffers into a single SSBO when we hit the limit Avoids compilation errors at the cost of shader build times and runtime performance when a game hits the limit of uniform buffers we can use.	2020-05-31 21:33:49 -03:00
bunnei	edbf3144d2	Merge pull request #3958 from FernandoS27/gl-debug OpenGL: Enable Debug Context and Synchronous debugging when graphics debugging is enabled	2020-05-31 17:04:27 -04:00
Morph	bb8ef38152	gl_device: Enable compute shaders for Intel proprietary drivers Previously we were disabling compute shaders on Intel's proprietary driver due to broken compute. This has been fixed in the latest Intel drivers. Re-enable compute for Intel proprietary drivers and remove the check for broken compute.	2020-05-31 03:21:07 -04:00
bunnei	058ec22787	Merge pull request #3982 from ReinUsesLisp/membar-cts shader/other: Implement MEMBAR.CTS	2020-05-30 11:51:42 -04:00
bunnei	1bb3122c1f	Merge pull request #3991 from ReinUsesLisp/depth-sampling texture_cache: Implement depth stencil texture swizzles	2020-05-28 23:33:38 -04:00
bunnei	099ac9c2a8	Merge pull request #3993 from ReinUsesLisp/fix-zla gl_shader_manager: Unbind GLSL program when binding a host pipeline	2020-05-28 12:15:22 -04:00
ReinUsesLisp	32e6727dae	shader/other: Implement MEMBAR.CTS This silences an assertion we were hitting and uses workgroup memory barriers when the game requests it.	2020-05-27 00:19:45 -03:00
ReinUsesLisp	b17fe82973	gl_texture_cache: Implement small texture view cache for swizzles This fixes cases where the texture swizzle was applied twice on the same draw to a texture bound to two different slots.	2020-05-26 17:50:08 -03:00
ReinUsesLisp	8bba84a401	texture_cache: Implement depth stencil texture swizzles Stop ignoring image swizzles on depth and stencil images. This doesn't fix a known issue on Xenoblade Chronicles 2 where an OpenGL texture changes swizzles twice before being used. A proper fix would be having a small texture view cache for this like we do on Vulkan.	2020-05-26 17:44:50 -03:00
ReinUsesLisp	606a62d4c7	gl_rasterizer: Port front face flip check from Vulkan While Vulkan was assuming we had no negative viewports, OpenGL code was assuming we had them. Port the old code from Vulkan to OpenGL, checking if the first viewport is negative before flipping faces. This is not a complete implementation since we only check for the first viewport to be negative. That said, unless a game is using Vulkan, OpenGL and NVN games should be fine here, and we can always compare with our Vulkan backend to see if there's a difference.	2020-05-26 16:33:50 -03:00
bunnei	508242c267	Merge pull request #3981 from ReinUsesLisp/bar shader/other: Implement BAR.SYNC 0x0	2020-05-26 14:40:13 -04:00
ReinUsesLisp	c13e2f1b75	gl_shader_manager: Unbind GLSL program when binding a host pipeline Fixes regression in Link's Awakening caused by `420cc13248`	2020-05-26 04:20:39 -03:00
bunnei	86345c126a	Merge pull request #3978 from ReinUsesLisp/write-rz shader_decompiler: Visit source nodes even when they assign to RZ	2020-05-25 21:31:33 -04:00
bunnei	1adabdac7f	Merge pull request #3905 from FernandoS27/vulkan-fix Correct a series of crashes and instructions on Async GPU and Vulkan Pipeline	2020-05-24 15:23:38 -04:00
bunnei	325e7eed3c	Merge pull request #3964 from ReinUsesLisp/arb-integration renderer_opengl: Add assembly program code paths	2020-05-24 00:34:12 -04:00
bunnei	487dd05170	Merge pull request #3979 from ReinUsesLisp/thread-group shader/other: Implement thread comparisons (NV_shader_thread_group)	2020-05-24 00:33:06 -04:00
ReinUsesLisp	5d0986a53b	shader/other: Implement BAR.SYNC 0x0 Trivially implement this particular case of BAR. Unless games use OpenCL or CUDA barriers, we shouldn't hit any other case here.	2020-05-21 23:20:43 -03:00
ReinUsesLisp	e2b67a868b	shader/other: Implement thread comparisons (NV_shader_thread_group) Hardware S2R special registers match gl_Thread*MaskNV. We can trivially implement these using Nvidia's extension on OpenGL or naively stubbing them with the ARB instructions to match. This might cause issues if the host device warp size doesn't match Nvidia's. That said, this is unlikely on proper shaders. Refer to the attached url for more documentation about these flags. https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_group.txt	2020-05-21 23:18:37 -03:00
ReinUsesLisp	ed4e324991	shader_decompiler: Visit source nodes even when they assign to RZ Some operations like atomicMin were ignored because their return values were being stored to RZ. These operations have a side effect and it was being ignored.	2020-05-21 23:16:03 -03:00
ReinUsesLisp	891236124c	buffer_cache: Use boost::intrusive::set for caching Instead of using boost::icl::interval_map for caching, use boost::intrusive::set. interval_map is intended as a container where the keys can overlap with one another; we don't need this for caching buffers and a std::set-like data structure that allows us to search with lower_bound is enough.	2020-05-21 16:44:00 -03:00
ReinUsesLisp	420cc13248	renderer_opengl: Add assembly program code paths Add code required to use OpenGL assembly programs based on NV_gpu_program5. Decompilation for ARB programs is intended to be added in a follow up commit. This does not include ARB decompilation and it's not in a usable state. The intention behind assembly programs is to reduce shader stutter significantly on drivers supporting NV_gpu_program5 (and other required extensions). Currently only Nvidia's proprietary driver supports these extensions. Add a UI option hidden for now to avoid people enabling this option accidentally. This code path has some limitations that OpenGL compatibility doesn't have: - NV_shader_storage_buffer_object is limited to 16 entries for a single OpenGL context state (I don't know if this is an intended limitation, a specification issue or I am missing something). Currently causes issues on The Legend of Zelda: Link's Awakening. - NV_parameter_buffer_object can't bind buffers using an offset different to zero. The used workaround is to copy to a temporary buffer (this doesn't happen often so it's not an issue). On the other hand, it has the following advantages: - Shaders build a lot faster. - We have control over how floating point rounding is done over individual instructions (SPIR-V on Vulkan can't do this). - Operations on shared memory can be unsigned and signed. - Transform feedbacks are dynamic state (not yet implemented). - Parameter buffers (uniform buffers) are per stage, matching NVN and hardware's behavior. - The API to bind and create assembly programs makes sense, unlike ARB_separate_shader_objects.	2020-05-19 18:00:04 -03:00
Fernando Sahmkow	4cff5dd194	OpenGL: Enable Debug Context and Synchronous debugging when graphics debugging is enabled. This commit aims to help easing debugging of driver crashes without having to modify existing code.	2020-05-17 21:45:09 -04:00
bunnei	b1a1bd12ca	Merge pull request #3899 from ReinUsesLisp/float-comparisons shader_ir: Add separate instructions for ordered and unordered comparisons and fix NE on GLSL	2020-05-13 09:51:14 -04:00
ReinUsesLisp	8b329ddcc9	gl_shader_decompiler: Properly emulate NaN behaviour on NE "Not equal" operators on GLSL seem to behave as unordered when we expect an ordered comparison. Manually emulate this checking for LGE values (numbers, not-NaNs).	2020-05-10 02:59:33 -03:00
Fernando Sahmkow	0a4be73b9b	VideoCore: Use SyncGuestMemory mechanism for Shader/Pipeline Cache invalidation.	2020-05-09 19:25:29 -04:00
Rodrigo Locatti	7e376af8fc	Merge pull request #3839 from Morph1984/r8g8ui texture: Implement R8G8UI	2020-05-09 05:28:55 -03:00
ReinUsesLisp	4e57f9d5cf	shader_ir: Separate floating-point comparisons into ordered and unordered This allows us to use native SPIR-V instructions without having to manually check for NaN.	2020-05-09 04:55:15 -03:00
ReinUsesLisp	f813cd3ff7	gl_rasterizer: Implement viewport swizzles with NV_viewport_swizzle	2020-05-04 17:51:30 -03:00
bunnei	2aff0b4733	Merge pull request #3808 from ReinUsesLisp/wait-for-idle {maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers	2020-05-03 02:43:18 -04:00
bunnei	e6b4311178	Merge pull request #3693 from ReinUsesLisp/clean-samplers shader/texture: Support multiple unknown sampler properties	2020-05-02 00:45:41 -04:00
Morph	7909860d16	texture: Implement R8G8UI - Used by The Walking Dead: The Final Season	2020-04-30 13:19:36 -04:00
bunnei	bf3f030a0d	Merge pull request #3807 from ReinUsesLisp/fix-depth-clamp maxwell_3d: Fix depth clamping register	2020-04-30 13:07:31 -04:00
bunnei	c7b5a87c90	Merge pull request #3799 from ReinUsesLisp/iadd-cc shader: Implement P2R CC, IADD Rd.CC and IADD.X	2020-04-30 12:56:36 -04:00
bunnei	da2b8295e1	Merge pull request #3805 from ReinUsesLisp/preserve-contents texture_cache: Reintroduce preserve_contents accurately	2020-04-30 12:56:19 -04:00
bunnei	72b73d22ab	Merge pull request #3784 from ReinUsesLisp/shader-memory-util shader/memory_util: Deduplicate code	2020-04-28 12:05:50 -04:00
ReinUsesLisp	fe931ac976	{maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers Drop MemoryBarrier from the buffer cache and use Maxwell3D's register WaitForIdle. To implement this on OpenGL we just call glMemoryBarrier with the necessary bits. Vulkan lacks this synchronization primitive, so we set an event and immediately wait for it. This is not a pretty solution, but it's what Vulkan can do without submitting the current command buffer to the queue (which ends up being more expensive on the CPU).	2020-04-28 02:18:12 -03:00
ReinUsesLisp	bb1ed66d99	maxwell_3d: Fix depth clamping register Using deko3d as reference: `4e47ba0013/source/maxwell/gpu_3d_state.cpp (L42)` We were using bits 3 and 4 to determine depth clamping, but these are the same both enabled and disabled: state->depthClampEnable ? 0x101A : 0x181D The same happens on Nvidia's OpenGL driver, where they do something like this (default capabilities, GL 4.5 compatibility): (state & DEPTH_CLAMP) != 0 ? 0x201a : 0x281c There's always a difference between the first bits in this register, but bit 11 is consistently disabled on both deko3d/NVN and OpenGL. This commit changes yuzu's behaviour to use bit 11 to determine depth clamping. - Fixes depth issues on Super Mario Odyssey's intro.	2020-04-27 20:50:14 -03:00
ReinUsesLisp	8da16cf9fb	texture_cache: Reintroduce preserve_contents accurately This reverts commit `94b0e2e5da`. preserve_contents proved to be a meaningful optimization. This commit reintroduces it but properly implemented on OpenGL. We have to make sure the clear removes all the previous contents of the image. It's not currently implemented on Vulkan because we can do smart things there that are better introduced in a separate commit.	2020-04-26 19:53:02 -03:00
Rodrigo Locatti	7e38dd580f	Merge pull request #3753 from ReinUsesLisp/ac-vulkan {gl,vk}_rasterizer: Add lazy default buffer maker and use it for empty buffers	2020-04-26 01:55:43 -03:00
ReinUsesLisp	ddd82ef42b	shader/memory_util: Deduplicate code Deduplicate code shared between vk_pipeline_cache and gl_shader_cache as well as shader decoder code. While we are at it, fix a bug in gl_shader_cache where compute shaders had a start offset of a stage shader.	2020-04-26 01:38:51 -03:00
ReinUsesLisp	255197e643	shader/arithmetic_integer: Implement CC for IADD	2020-04-25 22:55:26 -03:00
ReinUsesLisp	72deb773fd	shader_ir: Turn classes into data structures	2020-04-23 18:00:06 -03:00
Fernando Sahmkow	c043ac4f13	GL_Fence_Manager: use GL_TIMEOUT_IGNORED instead of a loop	2020-04-22 20:34:32 -04:00

2691 commits