Commit graph

749 commits

Author SHA1 Message Date
ameerj
1b829fbd7a move thread 1/4 count computation into allocate workers method 2020-08-16 12:02:22 -04:00
ameerj
31a76410e8 Address feedback, add shader compile notifier, update setting text 2020-08-16 12:02:22 -04:00
ameerj
c02464f64e Vk Async Worker directly emplace in cache 2020-08-16 12:02:22 -04:00
ameerj
4539073ce1 Address feedback. Bruteforce delete duplicates 2020-08-16 12:02:22 -04:00
ameerj
6ac97405df Vk Async pipeline compilation 2020-08-16 12:02:22 -04:00
Lioncash
b724a4d90c General: Tidy up clang-format warnings part 2 2020-08-13 14:19:08 -04:00
bunnei
f650cf8a9a
Merge pull request #4391 from lioncash/nrvo
video_core: Allow copy elision to take place where applicable
2020-07-24 06:33:09 -07:00
Rodrigo Locatti
9ea9a60e17
Merge pull request #4361 from ReinUsesLisp/lane-id
decode/other: Implement S2R.LaneId
2020-07-21 04:50:45 -03:00
Lioncash
6adc824d9d video_core: Allow copy elision to take place where applicable
Removes const from some variables that are returned from functions, as
this allows the move assignment/constructors to execute for them.
2020-07-21 00:36:13 -04:00
bunnei
3d13d7f48f
Merge pull request #4324 from ReinUsesLisp/formats
video_core: Fix, add and rename pixel formats
2020-07-21 00:13:04 -04:00
David Marcec
967307d3be Fix style issues 2020-07-18 14:24:32 +10:00
David Marcec
85b591f6f0 Remove duplicate config 2020-07-17 14:26:18 +10:00
David Marcec
f48187449e Use conditional var 2020-07-17 14:26:17 +10:00
David Marcec
468bd9c1b0 async shaders 2020-07-17 14:24:57 +10:00
ReinUsesLisp
210cc0204d decode/other: Implement S2R.LaneId
This maps to host's thread id.

- Fixes graphical issues on Paper Mario.
2020-07-16 16:09:39 -03:00
ReinUsesLisp
fbc232426d video_core: Rearrange pixel format names
Normalizes pixel format names to match Vulkan names. Previous to this
commit pixel formats had no convention, leading to confusion and
potential bugs.
2020-07-13 01:44:23 -03:00
bunnei
efef7b1517
Merge pull request #4147 from ReinUsesLisp/hset2-imm
shader/half_set: Implement HSET2_IMM
2020-06-26 23:14:56 -04:00
bunnei
2f2df9a4a7
Merge pull request #4083 from Morph1984/B10G11R11F
decode/image: Implement B10G11R11F
2020-06-24 11:02:38 -04:00
ReinUsesLisp
39ab33ee1c shader/half_set: Implement HSET2_IMM
Add HSET2_IMM. Due to the complexity of the encoding avoid using
BitField unions and read the relevant bits from the code itself.
This is less error prone.
2020-06-22 20:51:18 -03:00
Morph
480e1fa987 decode/image: Implement B10G11R11F
- Used by Kirby Star Allies
2020-06-20 00:28:30 -04:00
MerryMage
442e48ef4c memory_util: boost hashes are size_t
* boost::hash_value returns a size_t
* boost::hash_combine takes a size_t& argument
2020-06-18 15:47:43 +01:00
ReinUsesLisp
5b2b6d594c shader/texture: Join separate image and sampler pairs offline
Games using D3D idioms can join images and samplers when a shader
executes, instead of baking them into a combined sampler image. This is
also possible on Vulkan.

One approach to this solution would be to use separate samplers on
Vulkan and leave this unimplemented on OpenGL, but we can't do this
because there's no consistent way of determining which constant buffer
holds a sampler and which one an image. We could in theory find the
first bit and if it's in the TIC area, it's an image; but this falls
apart when an image or sampler handle use an index of zero.

The used approach is to track for a LOP.OR operation (this is done at an
IR level, not at an ISA level), track again the constant buffers used as
source and store this pair. Then, outside of shader execution, join
the sample and image pair with a bitwise or operation.

This approach won't work on games that truly use separate samplers in a
meaningful way. For example, pooling textures in a 2D array and
determining at runtime what sampler to use.

This invalidates OpenGL's disk shader cache :)

- Used mostly by D3D ports to Switch
2020-06-05 00:24:51 -03:00
ReinUsesLisp
e1438f8e91 shader/track: Move bindless tracking to a separate function 2020-06-04 23:02:55 -03:00
LC
9a0c1456e3
Merge pull request #4016 from ReinUsesLisp/invocation-info
shader/other: Fix hardcoded value in S2R INVOCATION_INFO
2020-06-02 09:47:53 -04:00
ReinUsesLisp
f2d1aa97ad shader/other: Fix hardcoded value in S2R INVOCATION_INFO
Geometry shaders built from Nvidia's compiler check for bits[16:23] to
be less than or equal to 0 with VSETP to default to a "safe" value of
0x8000'0000 (safe from hardware's perspective). To avoid hitting this
path in the shader, return 0x00ff'0000 from S2R INVOCATION_INFO.

This seems to be the maximum number of vertices a geometry shader can
emit in a primitive.
2020-05-30 01:49:14 -03:00
ReinUsesLisp
32e6727dae shader/other: Implement MEMBAR.CTS
This silences an assertion we were hitting and uses workgroup memory
barriers when the game requests it.
2020-05-27 00:19:45 -03:00
bunnei
508242c267
Merge pull request #3981 from ReinUsesLisp/bar
shader/other: Implement BAR.SYNC 0x0
2020-05-26 14:40:13 -04:00
bunnei
623d9c47a2
Merge pull request #3980 from ReinUsesLisp/red-op
shader/memory: Implement non-addition operations in RED
2020-05-26 12:50:41 -04:00
ReinUsesLisp
5d0986a53b shader/other: Implement BAR.SYNC 0x0
Trivially implement this particular case of BAR. Unless games use OpenCL
or CUDA barriers, we shouldn't hit any other case here.
2020-05-21 23:20:43 -03:00
ReinUsesLisp
103809a0ca shader/memory: Implement non-addition operations in RED
Trivially implement these instructions. They are used in Astral Chain.
2020-05-21 23:19:46 -03:00
ReinUsesLisp
e2b67a868b shader/other: Implement thread comparisons (NV_shader_thread_group)
Hardware S2R special registers match gl_Thread*MaskNV. We can trivially
implement these using Nvidia's extension on OpenGL or naively stubbing
them with the ARB instructions to match. This might cause issues if the
host device warp size doesn't match Nvidia's. That said, this is
unlikely on proper shaders.

Refer to the attached url for more documentation about these flags.
https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_group.txt
2020-05-21 23:18:37 -03:00
ReinUsesLisp
4e57f9d5cf shader_ir: Separate float-point comparisons in ordered and unordered
This allows us to use native SPIR-V instructions without having to
manually check for NAN.
2020-05-09 04:55:15 -03:00
bunnei
e6b4311178
Merge pull request #3693 from ReinUsesLisp/clean-samplers
shader/texture: Support multiple unknown sampler properties
2020-05-02 00:45:41 -04:00
bunnei
c7b5a87c90
Merge pull request #3799 from ReinUsesLisp/iadd-cc
shader: Implement P2R CC, IADD Rd.CC and IADD.X
2020-04-30 12:56:36 -04:00
bunnei
6572660fde
Merge pull request #3788 from FernandoS27/revert
Revert: shader_decode: Fix LD, LDG when track constant buffer.
2020-04-30 12:55:39 -04:00
ReinUsesLisp
871aadbe36 shader/arithmetic_integer: Fix tracking issue in temporary
This temporary is not needed as we mark Rd.CC + IADD.X as unimplemented.
It caused issues when tracking global buffers.
2020-04-28 17:14:53 -03:00
bunnei
72b73d22ab
Merge pull request #3784 from ReinUsesLisp/shader-memory-util
shader/memory_util: Deduplicate code
2020-04-28 12:05:50 -04:00
ReinUsesLisp
ddd82ef42b shader/memory_util: Deduplicate code
Deduplicate code shared between vk_pipeline_cache and gl_shader_cache as
well as shader decoder code.

While we are at it, fix a bug in gl_shader_cache where compute shaders
had an start offset of a stage shader.
2020-04-26 01:38:51 -03:00
ReinUsesLisp
e895a4e2d7 shader/arithmetic_integer: Fix edge case and mark IADD.X Rd.CC as unimplemented
IADD.X Rd.CC requires some extra logic that is not currently
implemented. Abort when this is hit.
2020-04-25 22:58:33 -03:00
ReinUsesLisp
2a96bea6a7 shader/arithmetic_integer: Change IAdd to UAdd to avoid signed overflow
Signed integer addition overflow might be undefined behavior. It's free
to change operations to UAdd and use unsigned integers to avoid
potential bugs.
2020-04-25 22:57:54 -03:00
ReinUsesLisp
c788f9c0bd shader/arithmetic_integer: Implement IADD.X
IADD.X takes the carry flag and adds it to the result. This is generally
used to emulate 64-bit operations with 32-bit registers.
2020-04-25 22:56:11 -03:00
ReinUsesLisp
255197e643 shader/arithmetic_integer: Implement CC for IADD 2020-04-25 22:55:26 -03:00
ReinUsesLisp
ffc5ec6fa8 decode/register_set_predicate: Implement CC
P2R CC takes the state of condition codes and puts them into a register.
We already have this implemented for PR (predicates). This commit
implements CC over that.
2020-04-25 22:54:42 -03:00
ReinUsesLisp
d523734266 decode/register_set_predicate: Use move for shared pointers
Avoid atomic counters used by shared pointers.
2020-04-25 22:54:14 -03:00
bunnei
4e37825dab
Merge pull request #3734 from ReinUsesLisp/half-float-mods
decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits
2020-04-25 00:41:43 -04:00
bunnei
7c8acb0025
Merge pull request #3749 from ReinUsesLisp/lea-imm
shader/arithmetic_integer: Fix LEA_IMM encoding
2020-04-24 14:30:13 -04:00
Fernando Sahmkow
d8a961cd6c Revert: shader_decode: Fix LD, LDG when track constant buffer. 2020-04-24 11:00:54 -04:00
ReinUsesLisp
dbaebd8582 decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits
The encoding for negation and absolute value was wrong.
Extracting is now done manually. Similar instructions having different
encodings is the rule, not the exception. To keep sanity and readability
I preferred to extract the desired bit manually.

This is implemented against nxas:
8dbc389957/table.h (L68)

That is itself tested against nvdisasm (Nvidia's official disassembler).
2020-04-23 18:29:38 -03:00
ReinUsesLisp
4fb921ff6b shader/texture: Support multiple unknown sampler properties
This allows deducing some properties from the texture instruction before
asking the runtime. By doing this we can handle type mismatches in some
instructions from the renderer instead of the shader decoder.

Fixes texelFetch issues with games using 2D texture instructions on a 1D
sampler.
2020-04-23 18:04:13 -03:00
ReinUsesLisp
72deb773fd shader_ir: Turn classes into data structures 2020-04-23 18:00:06 -03:00