420cc13248
Add the code required to use OpenGL assembly programs based on NV_gpu_program5. Decompilation for ARB programs is intended to be added in a follow-up commit.

This does **not** include ARB decompilation and it's not in a usable state.

The intention behind assembly programs is to significantly reduce shader stutter on drivers supporting NV_gpu_program5 (and other required extensions). Currently only Nvidia's proprietary driver supports these extensions.

Add a UI option, hidden for now, to avoid people enabling it accidentally.

This code path has some limitations that OpenGL compatibility doesn't have:
- NV_shader_storage_buffer_object is limited to 16 entries for a single OpenGL context state (I don't know if this is an intended limitation, a specification issue, or I am missing something). This currently causes issues on The Legend of Zelda: Link's Awakening.
- NV_parameter_buffer_object can't bind buffers using an offset other than zero. The workaround used is to copy to a temporary buffer (this doesn't happen often, so it's not an issue).

On the other hand, it has the following advantages:
- Shaders build a lot faster.
- We have control over how floating-point rounding is done on individual instructions (SPIR-V on Vulkan can't do this).
- Operations on shared memory can be unsigned and signed.
- Transform feedbacks are dynamic state (not yet implemented).
- Parameter buffers (uniform buffers) are per stage, matching NVN and the hardware's behavior.
- The API to bind and create assembly programs makes sense, unlike ARB_separate_shader_objects.
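
The temporary-buffer workaround for NV_parameter_buffer_object mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `Buffer`, `ParameterBinding` and `ParameterBufferBinder` are hypothetical names, and plain host memory stands in for OpenGL buffer objects so the sketch runs without a GL context. It assumes the only constraint is that a binding must start at offset zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a GPU buffer object (not a yuzu or OpenGL type).
struct Buffer {
    std::vector<std::uint8_t> bytes;
};

// Hypothetical record of what a parameter-buffer slot sees.
// It has no offset field: a binding always starts at byte zero.
struct ParameterBinding {
    const Buffer* buffer;
    std::size_t size;
};

class ParameterBufferBinder {
public:
    // Binds `size` bytes of `source` starting at `offset`.
    // Offset zero binds the source directly. Any other offset first copies
    // the requested range into a temporary buffer and binds that instead.
    ParameterBinding Bind(const Buffer& source, std::size_t offset, std::size_t size) {
        assert(offset + size <= source.bytes.size());
        if (offset == 0) {
            return ParameterBinding{&source, size};
        }
        const auto first = source.bytes.begin() + static_cast<std::ptrdiff_t>(offset);
        staging.bytes.assign(first, first + static_cast<std::ptrdiff_t>(size));
        ++copies;
        return ParameterBinding{&staging, size};
    }

    std::size_t copies = 0; // number of times the copy path was taken

private:
    Buffer staging; // temporary buffer, reused across binds
};
```

Calling `Bind(src, 0, n)` returns a binding that points at `src` itself, while `Bind(src, 2, n)` returns one that points at the temporary copy. Because this sketch reuses a single temporary buffer, a later copy overwrites an earlier one; a real renderer would presumably need one temporary buffer per binding slot or some other lifetime scheme.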
||
---|---|---|
.. | ||
buffer_cache | ||
engines | ||
renderer_opengl | ||
renderer_vulkan | ||
shader | ||
texture_cache | ||
textures | ||
CMakeLists.txt | ||
dirty_flags.cpp | ||
dirty_flags.h | ||
dma_pusher.cpp | ||
dma_pusher.h | ||
fence_manager.h | ||
gpu.cpp | ||
gpu.h | ||
gpu_asynch.cpp | ||
gpu_asynch.h | ||
gpu_synch.cpp | ||
gpu_synch.h | ||
gpu_thread.cpp | ||
gpu_thread.h | ||
guest_driver.cpp | ||
guest_driver.h | ||
macro_interpreter.cpp | ||
macro_interpreter.h | ||
memory_manager.cpp | ||
memory_manager.h | ||
morton.cpp | ||
morton.h | ||
query_cache.h | ||
rasterizer_accelerated.cpp | ||
rasterizer_accelerated.h | ||
rasterizer_cache.cpp | ||
rasterizer_cache.h | ||
rasterizer_interface.h | ||
renderer_base.cpp | ||
renderer_base.h | ||
sampler_cache.cpp | ||
sampler_cache.h | ||
surface.cpp | ||
surface.h | ||
video_core.cpp | ||
video_core.h |