Commit graph

304 commits

Author SHA1 Message Date
Lioncash
7fdf991097 shader_bytecode: Make Matcher constexpr capable
Greatly shrinks the amount of generated code for GetDecodeTable().

Collapses an assembly output of 9000+ lines down to ~3621 with Clang,
and 6513 down to ~2616 with GCC, given it's now allowed to construct all
the entries as a sequence of constant data.
2019-10-24 01:10:10 -04:00
bunnei
376f1a4432
Merge pull request #2869 from ReinUsesLisp/suld
shader/image: Implement SULD and fix SUATOM
2019-09-23 21:47:03 -04:00
Rodrigo Locatti
9286976948
Merge pull request #2878 from FernandoS27/icmp
shader_ir: Implement ICMP
2019-09-21 18:06:07 -03:00
ReinUsesLisp
44000971e2
gl_shader_decompiler: Use uint for images and fix SUATOM
In the process remove implementation of SUATOM.MIN and SUATOM.MAX as
these require a distinction between U32 and S32. These have to be
implemented with imageCompSwap loop.
2019-09-21 17:33:52 -03:00
ReinUsesLisp
675f23aedc
shader/image: Implement SULD and remove irrelevant code
* Implement SULD as float.
* Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
2019-09-21 17:32:48 -03:00
ReinUsesLisp
4de0f1e1c8
shader_bytecode: Add SULD encoding 2019-09-21 17:31:46 -03:00
Fernando Sahmkow
527b841c15 Shader_IR: ICMP corrections and fixes 2019-09-21 14:28:03 -04:00
Fernando Sahmkow
4b81d19a1a Shader_IR: Implement ICMP. 2019-09-19 20:56:29 -04:00
ReinUsesLisp
0526bf1895 shader_ir/warp: Implement SHFL 2019-09-17 17:44:07 -03:00
ReinUsesLisp
36abf67e79 shader/image: Implement SUATOM and fix SUST 2019-09-10 20:22:31 -03:00
ReinUsesLisp
77ef4fa907 shader/shift: Implement SHR wrapped and clamped variants
Nvidia defaults to wrapped shifts, but this is undefined behaviour on
OpenGL's spec. Explicitly mask/clamp according to what the guest shader
requires.
2019-09-04 01:55:24 -03:00
bunnei
81fbc5370d
Merge pull request #2812 from ReinUsesLisp/f2i-selector
shader_ir/conversion: Implement F2I and F2F F16 selector
2019-09-03 22:35:33 -04:00
bunnei
d4f33b822b
Merge pull request #2811 from ReinUsesLisp/fsetp-fix
float_set_predicate: Add missing negation bit for the second operand
2019-09-03 22:34:34 -04:00
ReinUsesLisp
e3534700d7 shader_ir/conversion: Split int and float selector and implement F2F H1 2019-08-28 16:09:33 -03:00
ReinUsesLisp
b13fbc25b8 shader_ir/conversion: Implement F2I F16 Ra.H1 2019-08-27 23:40:40 -03:00
ReinUsesLisp
6207751b00 float_set_predicate: Add missing negation bit for the second operand 2019-08-27 21:57:43 -03:00
ReinUsesLisp
4e35177e23 shader_ir: Implement VOTE
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics

Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.

To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:

* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true

ballotARB, also known as "uint64_t(activeThreadsNV())", emits

VOTE.ANY Rd, PT, PT;

on nouveau's compiler. This doesn't match exactly to Nvidia's code

VOTE.ALL Rd, PT, PT;

Which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT on those cases) and not the registers.
2019-08-21 14:50:38 -03:00
bunnei
cedc1aab4a
Merge pull request #2753 from FernandoS27/float-convert
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
2019-08-21 10:27:57 -04:00
ReinUsesLisp
2ff8044806 shader_ir: Implement NOP 2019-08-04 03:02:55 -03:00
Fernando Sahmkow
11f4e739bd Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
This commit takes care of implementing the F16 Variants of the 
conversion instructions and makes sure conversions are done.
2019-07-20 17:38:25 -04:00
ReinUsesLisp
6c4985edc9 shader/half_set_predicate: Implement missing HSETP2 variants 2019-07-19 22:20:47 -03:00
Fernando Sahmkow
1bdb59fc6e
Merge pull request #2695 from ReinUsesLisp/layer-viewport
gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
2019-07-15 16:28:07 -04:00
Fernando Sahmkow
0ec9da2f9f
Merge pull request #2692 from ReinUsesLisp/tlds-f16
shader/texture: Add F16 support for TLDS
2019-07-14 08:44:38 -04:00
Fernando Sahmkow
8a6fc529a9 shader_ir: Implement BRX & BRA.CC 2019-07-09 08:14:37 -04:00
ReinUsesLisp
c9d886c84e gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
This commit implements gl_ViewportIndex and gl_Layer in vertex and
geometry shaders. In the case it's used in a vertex shader, it requires
ARB_shader_viewport_layer_array. This extension is available on AMD and
Nvidia devices (mesa and proprietary drivers), but not available on
Intel on any platform. At the moment of writing this description I don't
know if this is a hardware limitation or a driver limitation.

In the case that ARB_shader_viewport_layer_array is not available,
writes to these registers on a vertex shader are ignored, with the
appropriate logging.
2019-07-07 20:42:55 -03:00
ReinUsesLisp
d0966b9f7c shader/texture: Add F16 support for TLDS 2019-07-07 16:05:56 -03:00
ReinUsesLisp
4d63f97945 shader_bytecode: Include missing <array> 2019-06-24 01:51:02 -03:00
ReinUsesLisp
06c4ce8645 shader: Decode SUST and implement backing image functionality 2019-06-20 21:38:33 -03:00
ReinUsesLisp
4e81fc8296 shader: Implement texture buffers 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
a32c52b1d8 shader_bytecode: Mark EXIT as flow instruction 2019-06-04 12:18:35 -04:00
ReinUsesLisp
75e7b45d69 shader/memory: Implement ST (generic memory) 2019-05-20 22:41:53 -03:00
ReinUsesLisp
f78ef617b6 shader/memory: Implement LD (generic memory) 2019-05-20 22:38:59 -03:00
ReinUsesLisp
d4df803b2b shader_ir/other: Implement IPA.IDX 2019-05-02 21:46:37 -03:00
ReinUsesLisp
71aa9d0877 shader_ir/memory: Implement physical input attributes 2019-05-02 21:46:25 -03:00
ReinUsesLisp
7632a7d6d2 shader_bytecode: Add AL2P decoding 2019-05-02 21:46:25 -03:00
bunnei
da0c3bc658
Merge pull request #2407 from FernandoS27/f2f
Do some corrections in conversion shader instructions.
2019-04-20 00:42:34 -04:00
bunnei
5bd5140bde
Merge pull request #2348 from FernandoS27/guest-bindless
Implement Bindless Textures on Shader Decompiler and GL backend
2019-04-17 20:59:49 -04:00
bunnei
0cfbd3325b
Merge pull request #2315 from ReinUsesLisp/severity-decompiler
shader_ir/decode: Reduce the severity of common assertions
2019-04-16 22:21:19 -04:00
Fernando Sahmkow
aa471274d9 Do some corrections in conversion shader instructions.
Corrects encodings for I2F, F2F, I2I and F2I
Implements Immediate variants of all four conversion types.
Add assertions to unimplemented stuffs.
2019-04-15 19:16:27 -04:00
ReinUsesLisp
5c280e6ff0 shader_ir: Implement STG, keep track of global memory usage and flush 2019-04-14 00:25:32 -03:00
bunnei
353a099481
Merge pull request #2366 from FernandoS27/xmad-fix
Correct XMAD mode, psl and high_b on different encodings.
2019-04-09 19:15:01 -04:00
Fernando Sahmkow
5c55ae4e18 Correct LOP_IMN encoding 2019-04-08 13:39:12 -04:00
Fernando Sahmkow
16adc735a5 Correct XMAD mode, psl and high_b on different encodings. 2019-04-08 13:01:17 -04:00
Fernando Sahmkow
492040bd9c Move ConstBufferAccessor to Maxwell3d, correct mistakes and clang format. 2019-04-08 11:36:11 -04:00
Fernando Sahmkow
4841440382 Implement TXQ_B 2019-04-08 11:29:52 -04:00
Fernando Sahmkow
ac3ba9a33e Corrections to TEX_B 2019-04-08 11:28:44 -04:00
Fernando Sahmkow
e28fd3d0a5 Implement Bindless Samplers and TEX_B in the IR. 2019-04-08 11:23:42 -04:00
ReinUsesLisp
04979560fb shader_ir/memory: Reduce severity of LD_L cache management and log it 2019-04-03 17:12:44 -03:00
ReinUsesLisp
24abeb9a67 shader_ir/memory: Reduce severity of ST_L cache management and log it 2019-04-03 17:12:44 -03:00
bunnei
633ce92908
Merge pull request #2147 from ReinUsesLisp/texture-clean
shader_ir: Remove "extras" from the MetaTexture
2019-03-10 17:28:36 -04:00
Lioncash
f9ee0dc7ee video_core/engines: Remove unnecessary includes
Removes a few unnecessary dependencies on core-related machinery, such
as the core.h and memory.h, which reduces the amount of rebuilding
necessary if those files change.

This also uncovered some indirect dependencies within other source
files. This also fixes those.
2019-03-05 20:35:32 -05:00
ReinUsesLisp
5ca63d0675 shader/decode: Remove extras from MetaTexture 2019-02-26 00:11:30 -03:00
ReinUsesLisp
48e6f77c03 shader/decode: Split memory and texture instructions decoding 2019-02-26 00:11:30 -03:00
Fernando Sahmkow
10682ad7e0 shader_decompiler: Improve Accuracy of Attribute Interpolation. 2019-02-14 03:25:07 -04:00
Fernando Sahmkow
f5ec165e8c Corrected F2I None mode to RoundEven. 2019-02-11 18:46:45 -04:00
bunnei
72c70d6808
Merge pull request #2081 from ReinUsesLisp/lmem-64
shader_ir/memory: Add LD_L 64 bits loads
2019-02-05 09:17:48 -05:00
bunnei
bb4549a73d
Merge pull request #2082 from FernandoS27/txq-stl
Fix TXQ not using the component mask.
2019-02-04 20:22:32 -05:00
Mat M
a568cd805b
Update src/video_core/engines/shader_bytecode.h
Co-Authored-By: FernandoS27 <fsahmkow27@gmail.com>
2019-02-03 21:27:26 -04:00
Fernando Sahmkow
0306c50339 Fix TXQ not using the component mask. 2019-02-03 18:17:18 -04:00
ReinUsesLisp
9feb68085d shader_bytecode: Rename BytesN enums to BitsN 2019-02-03 00:25:40 -03:00
ReinUsesLisp
477d616f7d shader_ir: Unify constant buffer offset values
Constant buffer values on the shader IR were using different offsets if
the access direct or indirect. cbuf34 has a non-multiplied offset while
cbuf36 does. On shader decoding this commit multiplies it by four on
cbuf34 queries.
2019-01-30 02:45:50 -03:00
ReinUsesLisp
3b84e04af1 shader_decode: Implement LDG and basic cbuf tracking 2019-01-30 00:00:15 -03:00
ReinUsesLisp
a1b845b651 shader_decode: Implement VMAD and VSETP 2019-01-15 17:54:53 -03:00
ReinUsesLisp
dd91650aaf shader_decode: Implement HFMA2 2019-01-15 17:54:52 -03:00
ReinUsesLisp
4316eaf75c shader_decode: Fixup clang-format 2019-01-15 17:54:52 -03:00
ReinUsesLisp
15a0e1481d shader_ir: Initial implementation 2019-01-15 17:54:49 -03:00
ReinUsesLisp
294df41b86 shader_bytecode: Fixup encoding 2019-01-15 17:54:49 -03:00
ReinUsesLisp
aaa0e6c346 shader_bytecode: Fixup TEXS.F16 encoding 2018-12-26 01:35:44 -03:00
David Marcec
fdd649e2ef Fixed uninitialized memory due to missing returns in canary
Functions which are suppose to crash on non canary builds usually don't return anything which lead to uninitialized memory being used.
2018-12-19 12:52:32 +11:00
ReinUsesLisp
ef061481c5 shader_bytecode: Fixup half float's operator B encoding 2018-12-18 04:28:50 -03:00
heapo
72599cc667 Implement postfactor multiplication/division for fmul instructions 2018-12-17 07:56:25 -08:00
ReinUsesLisp
59a8df1b14 gl_shader_decompiler: Implement TEXS.F16 2018-12-05 02:06:34 -03:00
bunnei
f9a211220c
Merge pull request #1763 from ReinUsesLisp/bfi
gl_shader_decompiler: Implement BFI_IMM_R
2018-11-25 23:04:57 -05:00
bunnei
d7d1ab15b6
Merge pull request #1760 from ReinUsesLisp/r2p
gl_shader_decompiler: Implement R2P_IMM
2018-11-25 22:38:42 -05:00
bunnei
8ce90a4f0b
Merge pull request #1783 from ReinUsesLisp/clip-distances
gl_shader_decompiler: Implement clip distances
2018-11-25 22:35:30 -05:00
bunnei
b6b78203cc
Merge pull request #1769 from ReinUsesLisp/cc
gl_shader_decompiler: Rename cc to condition code and name internal flags
2018-11-23 23:31:04 -05:00
Hexagon12
3135dbc29c Added predicate comparison LessEqualWithNan (#1736)
* Added predicate comparison LessEqualWithNan

* oops

* Clang fix
2018-11-23 08:51:32 -08:00
ReinUsesLisp
b3853403b7 gl_shader_decompiler: Implement clip distances 2018-11-23 02:14:43 -03:00
ReinUsesLisp
8a5e6fce07 gl_shader_decompiler: Rename control codes to condition codes 2018-11-21 22:31:16 -03:00
ReinUsesLisp
642dfeda2a gl_shader_decompiler: Implement BFI_IMM_R 2018-11-21 16:12:30 -03:00
ReinUsesLisp
d92afc7493 gl_shader_decompiler: Implement R2P_IMM 2018-11-21 04:56:00 -03:00
bunnei
9afcbba8e4
Merge pull request #1527 from FernandoS27/assert-flow
Assert Control Flow Instructions using Control Codes
2018-11-01 00:34:56 -04:00
bunnei
86e70cf302
Merge pull request #1528 from FernandoS27/assert-control-codes
Assert Control Codes Generation on Shader Instructions
2018-10-31 22:34:18 -04:00
FernandoS27
5bb80ab009 Assert Control Codes Generation 2018-10-30 13:37:55 -04:00
Frederic L
7a5eda5914 global: Use std::optional instead of boost::optional (#1578)
* get rid of boost::optional

* Remove optional references

* Use std::reference_wrapper for optional references

* Fix clang format

* Fix clang format part 2

* Adressed feedback

* Fix clang format and MacOS build
2018-10-30 00:03:25 -04:00
FernandoS27
3aa8b644a9 Assert Control Flow Instructions using Control Codes 2018-10-28 19:16:41 -04:00
FernandoS27
ca142f35c0 Implemented LD_L and ST_L 2018-10-24 17:51:53 -04:00
FernandoS27
ed8ca608a0 Implement PointSize 2018-10-23 15:08:00 -04:00
bunnei
5716496239
Merge pull request #1519 from ReinUsesLisp/vsetp
gl_shader_decompiler: Implement VSETP
2018-10-23 10:22:37 -04:00
ReinUsesLisp
7d6dca0d0a gl_shader_decompiler: Implement VSETP 2018-10-23 01:07:20 -03:00
ReinUsesLisp
5dfb43531c gl_shader_decompiler: Abstract VMAD into a video subset 2018-10-23 01:07:20 -03:00
bunnei
848a49112a
Merge pull request #1512 from ReinUsesLisp/brk
gl_shader_decompiler: Implement PBK and BRK
2018-10-23 00:01:38 -04:00
FernandoS27
259da93567 Added Saturation to FMUL32I 2018-10-22 20:22:15 -04:00
FernandoS27
5c5b4e8e7d Fixed FSETP and FSET 2018-10-22 11:31:17 -04:00
bunnei
b1f8bff7db
Merge pull request #1501 from ReinUsesLisp/half-float
gl_shader_decompiler: Implement H* instructions
2018-10-19 23:47:19 -04:00
ReinUsesLisp
41fb25349a gl_shader_decompiler: Implement PBK and BRK 2018-10-17 21:30:45 -03:00
ReinUsesLisp
936c36a514 shader_bytecode: Add Control Code enum 0xf
Control Code 0xf means to unconditionally execute the instruction. This
value is passed to most BRA, EXIT and SYNC instructions (among others)
but this may not always be the case.
2018-10-15 15:36:47 -03:00
ReinUsesLisp
6312eec5ef gl_shader_decompiler: Implement HSET2_R 2018-10-15 02:55:51 -03:00
ReinUsesLisp
4fc8ad67bf gl_shader_decompiler: Implement HSETP2_R 2018-10-15 02:55:51 -03:00
ReinUsesLisp
3d65aa4caf gl_shader_decompiler: Implement HFMA2 instructions 2018-10-15 02:55:51 -03:00