Commit graph

540 commits

Author SHA1 Message Date
ReinUsesLisp
f019817f8f
gl_rasterizer: Emulate viewport flipping with ARB_clip_control
Emulates negative y viewports with ARB_clip_control. This allows us to
more easily emulated pipelines with tessellation and/or geometry shader
stages. It also avoids corrupting games with transform feedbacks and
negative viewports (gl_Position.y was being modified).
2019-11-07 01:52:18 -03:00
ReinUsesLisp
a993df1ee2
shader/node: Unpack bindless texture encoding
Bindless textures were using u64 to pack the buffer and offset from
where they come from. Drop this in favor of separated entries in the
struct.

Remove the usage of std::set in favor of std::list (it's not std::vector
to avoid reference invalidations) for samplers and images.
2019-10-29 20:53:48 -03:00
ReinUsesLisp
7b81ba4d8a gl_shader_decompiler: Move entries to a separate function 2019-10-25 09:01:31 -04:00
Fernando Sahmkow
8909f52166 Shader_IR: Implement Fast BRX and allow multi-branches in the CFG. 2019-10-25 09:01:30 -04:00
Fernando Sahmkow
7ecf9f7228
Merge pull request #2983 from lioncash/fallthrough
gl_shader_decompiler/vk_shader_decompiler: Resolve implicit fallthrough cases
2019-10-22 13:16:46 -04:00
Lioncash
b42a74ff2c gl_shader_decompiler: Resolve fallthrough within ExprDecompiler's ExprCondCode operator()
This would previously result in NeverExecute and UnusedIndex being
treated as regular predicates.
2019-10-15 19:38:55 -04:00
Lioncash
4f16ce9294 gl_shader_decompiler: Make ExprDecompiler's GetResult() a const member function
This is only ever used to read, but not write, the resulting string, so
we can enforce this by making it a const member function.
2019-10-15 19:02:59 -04:00
Lioncash
67df3f7742 gl_shader_decompiler: Use a std::string_view with GetDeclarationWithSuffix()
This allows the function to be completely non-allocating for inputs of
all sizes (i.e. there's no heap cost for an input to convert to a
std::string_view).
2019-10-15 19:00:48 -04:00
Lioncash
04a1161354 gl_shader_decompiler: Fold flow_var constant into GetFlowVariable()
This is only ever used within this function, so we can narrow it's scope
down.
2019-10-15 18:58:36 -04:00
Lioncash
2f2ab9b5bc gl_shader_decompiler: Mark ASTDecompiler/ExprDecompiler parameters as const references where applicable
These member functions don't actually modify the input parameter, so we
can make this explicit with the use of const.
2019-10-15 18:57:02 -04:00
Lioncash
b8a62adcf1 gl_shader_decompiler: Pass by reference to GenerateTextureArgument()
Avoids an unnecessary atomic reference count increment and decrement.
2019-10-15 18:29:37 -04:00
Lioncash
d1d7ce74d2 gl_shader_decompiler: Use std::holds_alternative within GenerateTexture()
This only ever queries if the type exists within the variant, but
doesn't actually do anything with the return value. We can just use
std::holds_alternative for this use case.
2019-10-15 18:25:48 -04:00
Lioncash
9760795bfb gl_shader_decompiler: Avoid unnecessary copies of MetaImage
MetaImage contains a std::vector, so copying here could result in
unnecessary reallocations. Given the operation lives throughout the
entire scope, this is safe to do.
2019-10-15 18:14:55 -04:00
Fernando Sahmkow
e6eae4b815 Shader_ir: Address feedback 2019-10-04 18:52:57 -04:00
Fernando Sahmkow
000ad558dd vk_shader_decompiler: Clean code and be const correct. 2019-10-04 18:52:55 -04:00
Fernando Sahmkow
189a50bc2a gl_shader_decompiler: Refactor and address feedback. 2019-10-04 18:52:53 -04:00
Fernando Sahmkow
47e4f6a52c Shader_Ir: Refactor Decompilation process and allow multiple decompilation modes. 2019-10-04 18:52:50 -04:00
Fernando Sahmkow
38fc995f6c gl_shader_decompiler: Implement AST decompiling 2019-10-04 18:52:50 -04:00
ReinUsesLisp
f926230ab1
gl_shader_decompiler: Add tailing return for HUnpack2 2019-09-24 01:03:59 -03:00
ReinUsesLisp
25bfaffdff
gl_shader_decompiler: Fix clang build issues 2019-09-24 01:03:27 -03:00
bunnei
376f1a4432
Merge pull request #2869 from ReinUsesLisp/suld
shader/image: Implement SULD and fix SUATOM
2019-09-23 21:47:03 -04:00
David
9d69206cd0
Merge pull request #2870 from FernandoS27/multi-draw
Implement a MME Draw commands Inliner and correct host instance drawing
2019-09-22 23:13:02 +10:00
ReinUsesLisp
44000971e2
gl_shader_decompiler: Use uint for images and fix SUATOM
In the process remove implementation of SUATOM.MIN and SUATOM.MAX as
these require a distinction between U32 and S32. These have to be
implemented with imageCompSwap loop.
2019-09-21 17:33:52 -03:00
ReinUsesLisp
675f23aedc
shader/image: Implement SULD and remove irrelevant code
* Implement SULD as float.
* Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
2019-09-21 17:32:48 -03:00
bunnei
bbe82d62b0
Merge pull request #2846 from ReinUsesLisp/fixup-viewport-index
gl_shader_decompiler: Avoid writing output attribute when unimplemented
2019-09-20 17:11:20 -04:00
bunnei
88d857499b
Merge pull request #2855 from ReinUsesLisp/shfl
shader_ir/warp: Implement SHFL for Nvidia devices
2019-09-20 17:10:42 -04:00
Fernando Sahmkow
7606da5611 VideoCore: Corrections to the MME Inliner and removal of hacky instance management. 2019-09-19 11:41:29 -04:00
Fernando Sahmkow
ba02d564f8 Video Core: initial Implementation of InstanceDraw Packaging 2019-09-19 11:41:27 -04:00
bunnei
b31880dc5e
Merge pull request #2784 from ReinUsesLisp/smem
shader_ir: Implement shared memory
2019-09-18 16:26:05 -04:00
ReinUsesLisp
0526bf1895 shader_ir/warp: Implement SHFL 2019-09-17 17:44:07 -03:00
ReinUsesLisp
36abf67e79 shader/image: Implement SUATOM and fix SUST 2019-09-10 20:22:31 -03:00
ReinUsesLisp
17a9b0178d gl_shader_decompiler: Avoid writing output attribute when unimplemented 2019-09-06 15:02:12 -03:00
ReinUsesLisp
1f43e5296f gl_shader_decompiler: Keep track of written images and mark them as modified 2019-09-05 23:26:05 -03:00
ReinUsesLisp
0f7b813d65 gl_shader_decompiler: Implement shared memory 2019-09-05 01:40:24 -03:00
ReinUsesLisp
6177cbdbe1 gl_shader_decompiler: Fixup slow path 2019-09-04 15:03:51 -03:00
ReinUsesLisp
9cf52d027d gl_device: Disable precise in fragment shaders on bugged drivers 2019-09-04 01:54:00 -03:00
ReinUsesLisp
03276e7490 gl_shader_decompiler: Fixup AMD's slow path type 2019-09-04 01:54:00 -03:00
ReinUsesLisp
6c449793b8 gl_shader_decompiler: Rework GLSL decompiler type system
GLSL decompiler type system was broken. We converted all return values
to float except for some cases where returning we couldn't and
implicitly broke the rule of returning floats (e.g. for bools or bool
pairs).

Instead of doing this introduce class Expression that knows what type a
return value has and when a consumer wants to use the string it asks for
it with a required type, emitting a runtime error if types are
incompatible.

This has the disadvantage that there's more C++ code, but we can emit
better GLSL code that's easier to read.
2019-09-04 01:54:00 -03:00
bunnei
a67c4e6e02
Merge pull request #2742 from ReinUsesLisp/fix-texture-buffers
gl_texture_cache: Miscellaneous texture buffer fixes
2019-08-29 15:59:17 -04:00
ReinUsesLisp
4e35177e23 shader_ir: Implement VOTE
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics

Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.

To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:

* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true

ballotARB, also known as "uint64_t(activeThreadsNV())", emits

VOTE.ANY Rd, PT, PT;

on nouveau's compiler. This doesn't match exactly to Nvidia's code

VOTE.ALL Rd, PT, PT;

Which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT on those cases) and not the registers.
2019-08-21 14:50:38 -03:00
bunnei
cedc1aab4a
Merge pull request #2753 from FernandoS27/float-convert
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
2019-08-21 10:27:57 -04:00
bunnei
f601f25bcc
Merge pull request #2734 from ReinUsesLisp/compute-shaders
gl_rasterizer: Implement compute shaders
2019-07-22 11:12:55 -04:00
Fernando Sahmkow
11f4e739bd Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
This commit takes care of implementing the F16 Variants of the 
conversion instructions and makes sure conversions are done.
2019-07-20 17:38:25 -04:00
ReinUsesLisp
45c162444d shader/half_set_predicate: Fix HSETP2 implementation 2019-07-19 22:21:22 -03:00
ReinUsesLisp
74632c76ce gl_shader_decompiler: Rename bufferImage to imageBuffer
The online OpenGL documentation is wrong. The type definition is
imageBuffer.
2019-07-18 01:16:44 -03:00
ReinUsesLisp
6b0d017675 gl_shader_decompiler: Stub local memory size 2019-07-15 17:38:25 -03:00
ReinUsesLisp
725ba6cf63 gl_rasterizer: Implement compute shaders 2019-07-15 17:38:25 -03:00
Fernando Sahmkow
1bdb59fc6e
Merge pull request #2695 from ReinUsesLisp/layer-viewport
gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
2019-07-15 16:28:07 -04:00
bunnei
3477b92289
Merge pull request #2675 from ReinUsesLisp/opengl-buffer-cache
buffer_cache: Implement a generic buffer cache and its OpenGL backend
2019-07-14 19:03:43 -04:00
ReinUsesLisp
0eb0c24269 gl_shader_decompiler: Fix gl_PointSize redeclaration 2019-07-11 16:10:59 -03:00
ReinUsesLisp
aca40de224 gl_shader_decompiler: Fix conditional usage of GL_ARB_shader_viewport_layer_array 2019-07-11 04:27:00 -03:00
Fernando Sahmkow
d5533b440c shader_ir: Unify blocks in decompiled shaders. 2019-07-09 08:14:39 -04:00
Fernando Sahmkow
8a6fc529a9 shader_ir: Implement BRX & BRA.CC 2019-07-09 08:14:37 -04:00
ReinUsesLisp
c9d886c84e gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
This commit implements gl_ViewportIndex and gl_Layer in vertex and
geometry shaders. In the case it's used in a vertex shader, it requires
ARB_shader_viewport_layer_array. This extension is available on AMD and
Nvidia devices (mesa and proprietary drivers), but not available on
Intel on any platform. At the moment of writing this description I don't
know if this is a hardware limitation or a driver limitation.

In the case that ARB_shader_viewport_layer_array is not available,
writes to these registers on a vertex shader are ignored, with the
appropriate logging.
2019-07-07 20:42:55 -03:00
ReinUsesLisp
7ecf64257a gl_rasterizer: Minor style changes 2019-07-06 00:37:55 -03:00
ReinUsesLisp
b8b05a484a gl_shader_decompiler: Address feedback 2019-06-24 01:56:38 -03:00
ReinUsesLisp
1bf4154e7d gl_shader_decompiler: Implement image binding settings 2019-06-20 21:38:33 -03:00
ReinUsesLisp
06c4ce8645 shader: Decode SUST and implement backing image functionality 2019-06-20 21:38:33 -03:00
ReinUsesLisp
6c81c8f5b7 gl_shader_decompiler: Allow 1D textures to be texture buffers 2019-06-20 21:36:12 -03:00
Zach Hilman
c0e7b91145
Merge pull request #2538 from ReinUsesLisp/ssy-pbk
shader: Split SSY and PBK stack
2019-06-15 20:30:13 -04:00
Zach Hilman
de33ad25f5
Merge pull request #2514 from ReinUsesLisp/opengl-compat
video_core: Drop OpenGL core in favor of OpenGL compatibility
2019-06-07 17:23:25 -04:00
ReinUsesLisp
fe8e6618f2 shader: Split SSY and PBK stack
Hardware testing revealed that SSY and PBK push to a different stack,
allowing code like this:

        SSY label1;
        PBK label2;
        SYNC;
label1: PBK;
label2: EXIT;
2019-06-07 02:18:27 -03:00
ReinUsesLisp
bf4dfb3ad4 shader: Use shared_ptr to store nodes and move initialization to file
Instead of having a vector of unique_ptr stored in a vector and
returning star pointers to this, use shared_ptr. While changing
initialization code, move it to a separate file when possible.

This is a first step to allow code analysis and node generation beyond
the ShaderIR class.
2019-06-05 20:41:52 -03:00
bunnei
55c5029171
Merge pull request #2540 from ReinUsesLisp/remove-guest-position
gl_shader_decompiler: Remove guest "position" varying
2019-06-05 18:07:23 -04:00
bunnei
0bcc305797
Merge pull request #2512 from ReinUsesLisp/comp-indexing
gl_shader_decompiler: Pessimize uniform buffer access on AMD's prorpietary driver
2019-06-05 18:02:30 -04:00
ReinUsesLisp
0935c2d97b gl_shader_decompiler: Remove guest "position" varying
"position" was being written but not read anywhere besides geometry
shaders, where it had the same value as gl_Position.

This commit replaces "position" with gl_Position, reducing the
complexity of our code and the emitted GLSL code.
2019-06-03 01:01:34 -03:00
ReinUsesLisp
b76df62c00 gl_rasterizer: Move alpha testing to the OpenGL pipeline
Removes the alpha testing code from each fragment shader invocation.
2019-05-30 13:21:01 -03:00
bunnei
e3608578e4
Merge pull request #2446 from ReinUsesLisp/tid
shader: Implement S2R Tid{XYZ} and CtaId{XYZ}
2019-05-29 12:21:17 -04:00
ReinUsesLisp
d8827b07b5 gl_shader_decompiler: Use an if based cbuf indexing for broken drivers
The following code is broken on AMD's proprietary GLSL compiler:
```glsl
uint idx = ...;
vec4 values = ...;
float some_value = values[idx & 3];
```

It index the wrong components, to fix this the following pessimized code
is emitted when that bug is present:
```glsl
uint idx = ...;
vec4 values = ...;
float some_value;
if ((idx & 3) == 0) some_value = values.x;
if ((idx & 3) == 1) some_value = values.y;
if ((idx & 3) == 2) some_value = values.z;
if ((idx & 3) == 3) some_value = values.w;
```
2019-05-24 02:47:56 -03:00
Lioncash
de23847184 renderer_opengl/gl_shader_decompiler: Remove redundant name specification in format string
This accidentally slipped through a rebase.
2019-05-21 09:47:21 -04:00
ReinUsesLisp
9c3461604c shader: Implement S2R Tid{XYZ} and CtaId{XYZ} 2019-05-20 16:36:49 -03:00
ReinUsesLisp
ada79fa8ad gl_shader_decompiler: Make GetSwizzle constexpr 2019-05-20 16:36:48 -03:00
Lioncash
58a0c13e34 gl_shader_decompiler: Tidy up minor remaining cases of unnecessary std::string concatenation 2019-05-20 14:14:48 -04:00
Lioncash
6fb29764d6 gl_shader_decompiler: Replace individual overloads with the fmt-based one
Gets rid of the need to special-case brace handling depending on the
overload used, and makes it consistent across the board with how fmt
handles them.

Strings with compile-time deducible strings are directly forwarded to
std::string's constructor, so we don't need to worry about the
performance difference here, as it'll be identical.
2019-05-20 14:14:48 -04:00
Lioncash
784d2b6c3d gl_shader_decompiler: Utilize fmt overload of AddLine() where applicable 2019-05-20 14:14:44 -04:00
Lioncash
91ec251c4a gl_shader_decompiler: Add AddLine() overload that forwards to fmt
In a lot of places throughout the decompiler, string concatenation via
operator+ is used quite heavily. This is usually fine, when not heavily
used, but when used extensively, can be a problem. operator+ creates an
entirely new heap allocated temporary string and given we perform
expressions like:

std::string thing = a + b + c + d;

this ends up with a lot of unnecessary temporary strings being created
and discarded, which kind of thrashes the heap more than we need to.
Given we utilize fmt in some AddLine calls, we can make this a part of
the ShaderWriter's API. We can make an overload that simply acts as a
passthrough to fmt.

This way, whenever things need to be appended to a string, the operation
can be done via a single string formatting operation instead of
discarding numerous temporary strings. This also has the benefit of
making the strings themselves look nicer and makes it easier to spot
errors in them.
2019-05-19 14:12:20 -04:00
bunnei
d49efbfb4a
Merge pull request #2441 from ReinUsesLisp/al2p
shader: Implement AL2P and ALD.PHYS
2019-05-19 14:02:58 -04:00
Lioncash
175fe8aaeb video_core/renderer_opengl/gl_shader_decompiler: Remove unused Composite() function
This isn't used at all, so it can be removed.
2019-05-09 18:45:26 -04:00
ReinUsesLisp
5321cdd276 gl_shader_decompiler: Skip physical unused attributes 2019-05-02 21:46:37 -03:00
ReinUsesLisp
fe700e1856 shader: Add physical attributes commentaries 2019-05-02 21:46:25 -03:00
ReinUsesLisp
c6f9e651b2 gl_shader_decompiler: Implement GLSL physical attributes 2019-05-02 21:46:25 -03:00
ReinUsesLisp
b7d412c99b gl_shader_decompiler: Abstract generic attribute operations 2019-05-02 21:46:25 -03:00
ReinUsesLisp
bd81a03d9d gl_shader_decompiler: Declare all possible varyings on physical attribute usage 2019-05-02 21:46:25 -03:00
ReinUsesLisp
06b363c9b5 shader: Remove unused AbufNode Ipa mode 2019-05-02 21:46:25 -03:00
bunnei
4fad91ca45
Merge pull request #2383 from ReinUsesLisp/aoffi-test
gl_shader_decompiler: Disable variable AOFFI on unsupported devices
2019-04-22 22:14:02 -04:00
bunnei
650d9b1044
Merge pull request #2409 from ReinUsesLisp/half-floats
shader_ir/decode: Miscellaneous fixes to half-float decompilation
2019-04-19 21:31:52 -04:00
ReinUsesLisp
f43995ec53 shader_ir/decode: Fix half float pre-operations and remove MetaHalfArithmetic
Operations done before the main half float operation (like HAdd) were
managing a packed value instead of the unpacked one. Adding an unpacked
operation allows us to drop the per-operand MetaHalfArithmetic entry,
simplifying the code overall.
2019-04-15 21:16:10 -03:00
ReinUsesLisp
abcbcb1b2a gl_shader_decompiler: Fix MrgH0 decompilation
GLSL decompilation for HMergeH0 was wrong. This addresses that issue.
2019-04-15 21:16:10 -03:00
ReinUsesLisp
64613db605 shader_ir/decode: Implement half float saturation 2019-04-15 21:16:10 -03:00
ReinUsesLisp
acf618afbc renderer_opengl: Implement half float NaN comparisons 2019-04-15 21:13:26 -03:00
ReinUsesLisp
f15c59a164 gl_shader_decompiler: Use variable AOFFI on supported hardware 2019-04-14 05:13:19 -03:00
ReinUsesLisp
5c280e6ff0 shader_ir: Implement STG, keep track of global memory usage and flush 2019-04-14 00:25:32 -03:00
Fernando Sahmkow
c9f35d96be Remove bounding in LD_C 2019-04-09 20:02:11 -04:00
bunnei
8aaf418bd6
Merge pull request #2306 from ReinUsesLisp/aoffi
shader_ir: Implement AOFFI for TEX and TLD4
2019-04-07 17:52:30 -04:00
bunnei
520e4e5d4b
Merge pull request #2327 from ReinUsesLisp/crash-safe-visit
gl_shader_decompiler: Return early when an operation is invalid
2019-04-05 23:36:18 -04:00
bunnei
cb2209d06a
Merge pull request #2337 from lioncash/temporary
gl_shader_decompiler: Rename GenerateTemporal() to GenerateTemporary()
2019-04-05 23:35:31 -04:00
Lioncash
52746ed8dc gl_shader_decompiler: Rename GenerateTemporal() to GenerateTemporary()
Temporal generally indicates a relation to time, but this is just
creating a temporary, so this isn't really an accurate name for what the
function is actually doing.
2019-04-04 19:35:04 -04:00
ReinUsesLisp
88a3c05b7b gl_shader_decompiler: Fix TXQ types
TXQ returns integer types. Shaders usually do:

R0 = TXQ(); // => int
R0 = static_cast<float>(R0);

If we don't treat it as an integer, it will cast a binary float value as
float - resulting in a corrupted number.
2019-04-04 20:07:11 -03:00
ReinUsesLisp
06c1f75f21 gl_shader_decompiler: Return early when an operation is invalid 2019-04-03 16:02:09 -03:00
ReinUsesLisp
38658b38b4 gl_shader_decompiler: Hide local definitions inside an anonymous namespace 2019-03-31 00:26:34 -03:00