Commit graph

577 commits

Author SHA1 Message Date
ReinUsesLisp
aae8c180cb gl_query_cache: Implement host queries using a deferred cache
Instead of waiting immediately for executed commands, defer the query
until the guest CPU reads it. This way we get closer to what the guest
program is doing.

To achieve this we have to build a dependency queue, because host APIs
(like OpenGL and Vulkan) use ranged queries instead of counters like
NVN.

Waiting for queries implicitly uses fences and this requires a command
being queued, otherwise the driver will lock waiting until a timeout. To
fix this when there are no commands queued, we explicitly call glFlush.
2020-02-14 17:33:13 -03:00
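
The deferral described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the class name DeferredQueryCache, its methods, and the use of a callback to stand in for a host query are all assumptions, and the dependency queue needed for ranged queries is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <unordered_map>
#include <utility>

// Hypothetical sketch: none of these names come from the actual code base.
class DeferredQueryCache {
public:
    // Stand-in for "ask the host API for the query result", which may block.
    using Resolver = std::function<std::uint64_t()>;

    // Record a query for a guest address without waiting for its result.
    void Defer(std::uint64_t guest_addr, Resolver resolver) {
        assert(static_cast<bool>(resolver));
        pending[guest_addr] = std::move(resolver);
    }

    // Called when the guest CPU reads the address: only now is the result fetched.
    std::optional<std::uint64_t> Read(std::uint64_t guest_addr) {
        const auto it = pending.find(guest_addr);
        if (it == pending.end()) {
            return std::nullopt;
        }
        const std::uint64_t value = it->second();
        pending.erase(it);
        return value;
    }

    std::size_t PendingCount() const {
        return pending.size();
    }

private:
    std::unordered_map<std::uint64_t, Resolver> pending;
};
```

The point of the sketch is only that the potentially blocking call runs inside Read, not inside Defer.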
ReinUsesLisp
fe1238be7a gl_rasterizer: Add queued commands counter
Keep track of the queued OpenGL commands that can signal a fence if
waited on. As a side effect, we avoid calls to glFlush when no commands
are queued.
2020-02-14 17:27:17 -03:00
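
A queued commands counter of the kind this commit describes might look like the sketch below. The class and method names are hypothetical, and the flush callback is an assumption standing in for a call such as glFlush so that the example compiles without an OpenGL context.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical sketch: tracks how many host commands have been queued since
// the last flush, so a flush is skipped when nothing is pending.
class CommandQueueTracker {
public:
    explicit CommandQueueTracker(std::function<void()> flush_callback)
        : flush{std::move(flush_callback)} {}

    // Call after every host command that could signal a fence.
    void OnCommandQueued() {
        ++num_queued_commands;
    }

    // Flushes only when something is queued; returns whether a flush happened.
    bool FlushIfNeeded() {
        if (num_queued_commands == 0) {
            return false;
        }
        flush();
        num_queued_commands = 0;
        return true;
    }

private:
    std::function<void()> flush;
    std::uint32_t num_queued_commands = 0;
};
```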
ReinUsesLisp
2b58652f08 maxwell_3d: Slow implementation of passed samples (query 21)
Implements GL_SAMPLES_PASSED by waiting immediately for queries.
2020-02-14 17:27:17 -03:00
bunnei
37f1cf8cbd
Merge pull request #3376 from ReinUsesLisp/point-sprite
gl_rasterizer: Implement GL_POINT_SPRITE
2020-02-11 08:26:07 -05:00
bunnei
09d766d357
Merge pull request #3362 from ReinUsesLisp/fix-instanced
gl_rasterizer: Fix instanced draw arrays
2020-02-06 21:39:59 -05:00
ReinUsesLisp
7da52673d0 gl_rasterizer: Implement GL_POINT_SPRITE
OpenGL core defaults to GL_POINT_SPRITE, whereas on OpenGL
compatibility we have to enable it explicitly. This fixes
gl_PointCoord's behaviour.
2020-02-04 15:19:45 -03:00
ReinUsesLisp
b69321650e gl_rasterizer: Fix instanced draw arrays
glDrawArrays was being used when the draw had a base instance specified.
This commit removes the draw parameters abstraction and fixes the
mentioned issue.
2020-01-30 02:22:00 -03:00
Fernando Sahmkow
2b02f29a2d GL Backend: Introduce indexed samplers into the GL backend 2020-01-24 16:43:31 -04:00
ReinUsesLisp
d110a371bb gl_state: Use bool instead of GLboolean
This fixes template resolution considering GLboolean an integer instead
of a bool.
2020-01-18 19:10:34 -03:00
ReinUsesLisp
c375d735e6 gl_state: Implement PROGRAM_POINT_SIZE
For gl_PointSize to have effect we have to activate
GL_PROGRAM_POINT_SIZE.
2020-01-15 16:14:17 -03:00
ReinUsesLisp
5b989f189f
gl_rasterizer: Allow rendering without fragment shader
Rendering without a fragment shader is usually used in depth-only
passes.
2019-12-26 16:38:49 -03:00
ReinUsesLisp
da0aa4da6b
gl_rasterizer: Implement RASTERIZE_ENABLE
RASTERIZE_ENABLE is the opposite of GL_RASTERIZER_DISCARD. Implement it
naturally using this.

NVN games expect rasterize to be enabled by default, reflect that in our
initial GPU state.
2019-12-18 19:28:23 -03:00
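
The inversion described in this commit can be illustrated with a small sketch. Both struct names and the translation function are hypothetical; the real code would toggle GL_RASTERIZER_DISCARD on the host instead of filling in a plain struct.

```cpp
// Hypothetical guest register block; the default mirrors the commit message,
// which says NVN games expect rasterization to be enabled initially.
struct GuestRasterizerRegs {
    bool rasterize_enable = true;
};

// Hypothetical host-side state; in OpenGL this flag would correspond to
// enabling or disabling GL_RASTERIZER_DISCARD.
struct HostRasterizerState {
    bool rasterizer_discard = false;
};

// RASTERIZE_ENABLE is the logical opposite of rasterizer discard.
inline HostRasterizerState TranslateRasterizeEnable(const GuestRasterizerRegs& regs) {
    HostRasterizerState state;
    state.rasterizer_discard = !regs.rasterize_enable;
    return state;
}
```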
Fernando Sahmkow
1d2ba3cc97 Gl_Rasterizer: Skip Tessellation Control and Eval stages as they are unimplemented.
This commit ensures the OGL backend does not execute tessellation shader
stages as they are currently unimplemented.
2019-12-11 15:41:26 -04:00
Fernando Sahmkow
7ffb672f61 Maxwell3D: Implement Depth Mode.
This commit finishes adding depth mode that was reverted before due to
other unresolved issues.
2019-12-10 19:51:46 -04:00
ReinUsesLisp
fb6cf12a17
gl_framebuffer_cache: Optimize framebuffer key
Pack color attachment enumerations into a single u32. To determine the
number of buffers, the highest color attachment with a shared pointer
that doesn't point to null is used.
2019-11-28 23:02:20 -03:00
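
One way to picture the packed key from this commit is the sketch below. The layout is an assumption made for illustration (8 render targets, 4 bits per attachment), as are all the names; std::shared_ptr<int> stands in for whatever view type the real cache stores.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Assumed layout for this sketch: 8 render targets, 4 bits each, 32 bits total.
constexpr std::size_t NumRenderTargets = 8;
constexpr std::uint32_t BitsPerAttachment = 4;
constexpr std::uint32_t AttachmentMask = (1u << BitsPerAttachment) - 1;

struct FramebufferKey {
    // Stand-in for the cached color views; a null pointer means "unused".
    std::array<std::shared_ptr<int>, NumRenderTargets> colors;
    // All color attachment enumerations packed into a single 32-bit value.
    std::uint32_t packed_attachments = 0;

    void SetAttachment(std::size_t index, std::uint32_t attachment) {
        assert(index < NumRenderTargets && attachment <= AttachmentMask);
        const std::uint32_t shift = static_cast<std::uint32_t>(index) * BitsPerAttachment;
        packed_attachments =
            (packed_attachments & ~(AttachmentMask << shift)) | (attachment << shift);
    }

    std::uint32_t GetAttachment(std::size_t index) const {
        assert(index < NumRenderTargets);
        const std::uint32_t shift = static_cast<std::uint32_t>(index) * BitsPerAttachment;
        return (packed_attachments >> shift) & AttachmentMask;
    }

    // The buffer count is one past the highest slot holding a non-null pointer.
    std::size_t NumBuffers() const {
        for (std::size_t i = NumRenderTargets; i > 0; --i) {
            if (colors[i - 1]) {
                return i;
            }
        }
        return 0;
    }
};
```

Packing the enumerations into one integer keeps comparison and hashing of the key cheap, which is presumably the optimization the commit title refers to.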
ReinUsesLisp
c34da106ed
gl_rasterizer: Re-enable framebuffer cache for clear buffers 2019-11-28 23:02:20 -03:00
Lioncash
3f08e8d8d4 core/memory: Migrate over GetPointer()
With all of the interfaces ready for migration, it's trivial to migrate
over GetPointer().
2019-11-26 21:55:38 -05:00
Lioncash
536fc7f0ea core: Prepare various classes for memory read/write migration
Amends a few interfaces to be able to handle the migration over to the
new Memory class by passing the class by reference as a function
parameter where necessary.

Notably, within the filesystem services, this eliminates two ReadBlock()
calls by using the helper functions of HLERequestContext to do that for
us.
2019-11-26 21:55:37 -05:00
ReinUsesLisp
919ac2c4d3
gl_rasterizer: Disable compute shaders on Intel
Intel's proprietary driver enters a corrupt state when compute
shaders are executed. For now, disable these.
2019-11-22 21:28:50 -03:00
ReinUsesLisp
e35b9597ef
gl_shader_decompiler: Normalize image bindings 2019-11-22 21:28:49 -03:00
ReinUsesLisp
36d9b409fc
gl_shader_decompiler: Normalize cbuf bindings
Stage and compute shaders were using a different binding counter.
Normalize these.
2019-11-22 21:28:49 -03:00
ReinUsesLisp
f936b86c7c
gl_rasterizer: Add missing cbuf counter reset on compute 2019-11-22 21:28:49 -03:00
ReinUsesLisp
180417c514
gl_shader_cache: Remove dynamic BaseBinding specialization 2019-11-22 21:28:49 -03:00
ReinUsesLisp
c8a48aacc0
video_core: Unify ProgramType and ShaderStage into ShaderType 2019-11-22 21:28:48 -03:00
ReinUsesLisp
0f23359a44
gl_rasterizer: Bind graphics images to draw commands
Images were not being bound to draw invocations because these would
require a cache invalidation.
2019-11-22 21:28:48 -03:00
ReinUsesLisp
287ae2b9e8
gl_shader_cache: Specialize local memory size for compute shaders
Local memory size in compute shaders was stubbed with an arbitrary size.
This commit specializes local memory size from guest GPU parameters.
2019-11-22 21:28:48 -03:00
ReinUsesLisp
dbeb523879
gl_shader_cache: Specialize shared memory size
Shared memory was being declared with an undefined size. Specialize the
compute shader's shared memory size from guest GPU parameters.
2019-11-22 21:28:47 -03:00
ReinUsesLisp
4f5d8e4342
gl_shader_cache: Specialize shader workgroup
Drop the usage of ARB_compute_variable_group_size and specialize compute
shaders instead. This permits compute to run on AMD and Intel
proprietary drivers.
2019-11-22 21:28:47 -03:00
ReinUsesLisp
32c1bc6a67
shader/texture: Deduce texture buffers from locker
Instead of specializing shaders to separate texture buffers from 1D
textures, use the locker to deduce them while they are being decoded.
2019-11-22 21:28:47 -03:00
bunnei
a8295d2c53
Merge pull request #3047 from ReinUsesLisp/clip-control
gl_rasterizer: Emulate viewport flipping with ARB_clip_control
2019-11-15 12:09:19 -05:00
ReinUsesLisp
096f339a2a video_core: Silence implicit conversion warnings 2019-11-08 22:48:50 +00:00
ReinUsesLisp
e9d2fad984
gl_rasterizer: Remove front facing hack 2019-11-07 01:52:18 -03:00
ReinUsesLisp
f019817f8f
gl_rasterizer: Emulate viewport flipping with ARB_clip_control
Emulates negative y viewports with ARB_clip_control. This allows us to
more easily emulate pipelines with tessellation and/or geometry shader
stages. It also avoids corrupting games with transform feedbacks and
negative viewports (gl_Position.y was being modified).
2019-11-07 01:52:18 -03:00
bunnei
468576284d
Merge pull request #3057 from ReinUsesLisp/buffer-sub-data
gl_rasterizer: Upload constant buffers with glNamedBufferSubData
2019-11-06 10:08:55 -05:00
Rodrigo Locatti
654b77d2ec
Merge pull request #3039 from ReinUsesLisp/cleanup-samplers
shader/node: Unpack bindless texture encoding
2019-11-06 04:54:11 +00:00
ReinUsesLisp
442a1cc021
gl_rasterizer: Re-enable stream buffer memory due to global memory
Global memory is still using the stream buffer when it shouldn't. As a
temporary fix re-enable the stream buffer on compute.
2019-11-02 13:19:19 -03:00
ReinUsesLisp
76ca2a5f82
gl_rasterizer: Upload constant buffers with glNamedBufferSubData
Nvidia's OpenGL driver maps gl(Named)BufferSubData with some requirements
to a fast path. This path has an extra memcpy but updates the buffer without
orphaning or waiting for previous calls. It can be seen as a better
model for "push constants" that can upload a whole UBO instead of 256
bytes.

This path has some requirements established here:
http://on-demand.gputechconf.com/gtc/2014/presentations/S4379-opengl-44-scene-rendering-techniques.pdf#page=24

Instead of using the stream buffer, this commit moves constant buffer
uploads to calls of glNamedBufferSubData, and in my testing it brings a
performance improvement. This is disabled when the vendor is not Nvidia
since it brings performance regressions.
2019-11-02 05:05:34 -03:00
bunnei
2382bbe3ac
Merge pull request #3046 from ReinUsesLisp/clean-gl-state
gl_state: Miscellaneous clean up
2019-10-29 22:50:04 -04:00
bunnei
b5138f3c35
Merge pull request #3035 from ReinUsesLisp/rasterizer-accelerated
rasterizer_accelerated: Add intermediary for GPU rasterizers
2019-10-29 22:06:41 -04:00
ReinUsesLisp
3c6557c235
gl_state: Remove ApplyDefaultState
OpenGL has default values we can trust. Remove these.
2019-10-29 21:27:25 -03:00
ReinUsesLisp
a993df1ee2
shader/node: Unpack bindless texture encoding
Bindless textures were using u64 to pack the buffer and offset from
where they come from. Drop this in favor of separated entries in the
struct.

Remove the usage of std::set in favor of std::list (it's not std::vector
to avoid reference invalidations) for samplers and images.
2019-10-29 20:53:48 -03:00
ReinUsesLisp
fa31e5b868
maxwell_3d/kepler_compute: Remove unused arguments in GetTexture 2019-10-28 00:23:42 -03:00
ReinUsesLisp
bd2aff3e26
rasterizer_accelerated: Add intermediary for GPU rasterizers
Add an intermediary class that implements common functions across GPU
accelerated rasterizers. This avoids code repetition on different
backends.
2019-10-27 03:40:08 -03:00
Fernando Sahmkow
acd6441134 Shader_Cache: setup connection of ConstBufferLocker 2019-10-25 09:01:29 -04:00
Fernando Sahmkow
1a58f45d76 VideoCore: Unify const buffer accessing along engines and provide ConstBufferLocker class to shaders. 2019-10-25 09:01:29 -04:00
bunnei
ef9b31783d
Merge pull request #2912 from FernandoS27/async-fixes
General fixes to Async GPU
2019-10-16 10:34:48 -04:00
Fernando Sahmkow
9f2719d1a4 Gl_Rasterizer: Protect CPU Memory mapping from multiple threads. 2019-10-04 19:59:53 -04:00
ReinUsesLisp
69c806feb6
gl_rasterizer: Fix polygon offset units
For some reason hardware divides polygon offset units by two. This is
visible since drivers multiply the application requested polygon offset
by two.
2019-10-01 02:00:23 -03:00
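
Read literally, the message above says the value seen by the hardware is twice what the application asked for, so a host translation would halve it before handing it to the host API. A minimal sketch under that reading, with a hypothetical function name:

```cpp
// Hypothetical helper: converts the polygon offset units read from guest
// registers into the value a host API call such as glPolygonOffset expects.
// Assumes, per the commit message, that guest drivers doubled the
// application's value.
constexpr float HostPolygonOffsetUnits(float guest_units) {
    return guest_units / 2.0f;
}
```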
David
9d69206cd0
Merge pull request #2870 from FernandoS27/multi-draw
Implement a MME Draw commands Inliner and correct host instance drawing
2019-09-22 23:13:02 +10:00
Fernando Sahmkow
68f5aff64f Maxwell3D: Corrections and refactors to MME instance refactor 2019-09-22 07:23:13 -04:00