Commit graph

105 commits

Author SHA1 Message Date
MerryMage
9cc00f900c emit_x64_vector: Release registers when possible in EmitVectorTableLookup 2020-04-22 20:53:45 +01:00
MerryMage
e68bd3c6c1 emit_x64_vector: Special-case table_size == 1 in EmitVectorTableLookup 2020-04-22 20:53:45 +01:00
MerryMage
a4e1f8a63a emit_x64_vector: SSE4.1 implementation of EmitVectorTableLookup 2020-04-22 20:53:45 +01:00
MerryMage
89d08c7d61 IR: Add VectorTable and VectorTableLookup IR instructions 2020-04-22 20:53:45 +01:00
Lioncash
29f8b30634 A64: Implement SRSHL and URSHL
Implements both scalar and vector variants.
2020-04-22 20:53:45 +01:00
Lioncash
0efa2ce3b0 ir: Add opcodes for performing rounding left shifts 2020-04-22 20:53:45 +01:00
MerryMage
a7e6f2a235 emit_x64_vector: EmitVectorNarrow16: AVX512 implementation 2020-04-22 20:46:23 +01:00
MerryMage
b6350e3947 emit_x64_vector: EmitVectorNarrow32: prefer pblendw to loading constant 2020-04-22 20:46:23 +01:00
MerryMage
8fdba189cb emit_x64_vector: packusdw is SSE4.1 2020-04-22 20:46:23 +01:00
Lioncash
391e16be64 emit_x64_vector: Vectorize 32-bit variants of paired min/max
Gets rid of the fallbacks for these cases.
2020-04-22 20:46:23 +01:00
MerryMage
5ae045d67e emit_x64_vector: Improve code emission of VectorGetElement* for index == 0 2020-04-22 20:46:23 +01:00
MerryMage
476c0f15da backend_x64: Remove all use of xmm0 2020-04-22 20:46:23 +01:00
Lioncash
463b9a3d02 ir: Add opcodes for vector paired maximum and minimums
For the time being, we can just do a naive implementation which avoids
falling back to the interpreter a bit. Horizontal operations aren't
necessarily x86 SIMD's forte anyways.
2020-04-22 20:46:23 +01:00
Lioncash
7fdd8b0197 A64: Implement PMULL{2} 2020-04-22 20:46:23 +01:00
Lioncash
affa312d1d ir: Add opcode for performing polynomial multiplication 2020-04-22 20:46:22 +01:00
MerryMage
1edd0125b2 mp: rename mp.h to mp/function_info.h 2020-04-22 20:46:22 +01:00
MerryMage
0921678edb emit_x64_vector: Slightly improve ArithmeticShiftRightByte 2020-04-22 20:46:22 +01:00
MerryMage
43407c4bb4 emit_x64_vector: Simplify VectorShuffleImpl 2020-04-22 20:46:22 +01:00
MerryMage
8f4c1a8558 emit_x64_vector: -0x80000000 isn't -0x80000000 2020-04-22 20:46:22 +01:00
MerryMage
b455b566e7 A64: Implement UQXTN (vector) 2020-04-22 20:46:22 +01:00
MerryMage
e686a81612 emit_x64_vector: Fix non-SSE4.1 saturated narrowing reconstruction comparison
Allows non-SSE4.1 to produce the correct FPSR.QC flag
2020-04-22 20:46:22 +01:00
MerryMage
3874cb37e3 A64: Implement SQXTN (vector) 2020-04-22 20:46:22 +01:00
MerryMage
8ef114d48f emit_x64_vector: packusdw requires SSE4.1
In EmitVectorSignedSaturatedNarrowToUnsigned32.
2020-04-22 20:46:22 +01:00
MerryMage
f020dbe4ed A64: Implement SQXTUN 2020-04-22 20:46:22 +01:00
Lioncash
d65b056eba Simplify fallback case for EmitVectorSetElement64() 2020-04-22 20:46:21 +01:00
Lioncash
46cb0d813b emit_x64_vector: Append 'v' prefix onto movq in AVX path
This is something I missed when adding in the AVX broadcast code.
2020-04-22 20:46:21 +01:00
Lioncash
f939bd0228 emit_x64_vector{_floating_point}: Add helper alias for sizing arrays relative to vector width
Avoids needing to remember to specify the proper size of the arrays, all
that's needed is to specify the type of the array and the size will
automatically be deduced from it. This helps prevent potential oversized
or undersized arrays from being specified.
2020-04-22 20:46:21 +01:00
Lioncash
7797bc2fb2 emit_x64_vector: Use non-scratch Use* variants of registers within EmitVectorUnsignedAbsoluteDifference()
In some cases, a register isn't modified, depending on the branch taken,
so we can signify this by using the non-scratch variants in certain
cases.
2020-04-22 20:46:20 +01:00
MerryMage
9dba273a8c A64: Implement SADDLP 2020-04-22 20:46:19 +01:00
MerryMage
70ff2d73b5 A64: Implement UADDLP 2020-04-22 20:46:19 +01:00
MerryMage
5563bbbd79 A64: Implement EXT 2020-04-22 20:46:19 +01:00
Lioncash
35026a6ce3 emit_x64_vector: Vectorize fallback path for EmitVectorMaxU32() 2020-04-22 20:46:19 +01:00
Lioncash
0bee648b4f emit_x64_vector: Deduplicate a bit of code in EmitVectorSetElement{8, 32, 64} functions
Given both branches are the same, we can hoist out the common code.
2020-04-22 20:46:18 +01:00
Lioncash
b6e223fc58 emit_x64_vector: Deduplicate a bit of code within EmitVectorGetElement8()
Given both branches use the same destination register size, we can hoist
the common code out.
2020-04-22 20:46:18 +01:00
Lioncash
cf188448d4 emit_x64_vector: Vectorize fallback case in EmitVectorMultiply64()
Gets rid of the need to perform a fallback.
2020-04-22 20:46:18 +01:00
Lioncash
954deff2d4 emit_x64_vector: Add break to final case in EmitVectorRoundingHalvingAddUnsigned()
This doesn't alter behavior but does make the code better if anything
else is ever added to this function in the future.
2020-04-22 20:46:18 +01:00
Lioncash
bc718c5b28 ir: Add opcodes for performing rounding halving adds 2020-04-22 20:46:18 +01:00
Lioncash
054549da35 emit_x64_vector: Simplify AVX-512 codepath in EmitVectorMultiply64
I realized I introduced a helper for simple AVX operation emitting, so
use that instead of writing it all out long-form.
2020-04-22 20:46:18 +01:00
Lioncash
6de5ed96e5 emit_x64_vector: Emit VPMULLQ in EmitVectorMultiply64 on AVX-512{DQ, VL} capable CPUs
Shortens code-gen down to a single instruction in the 64-bit path.
2020-04-22 20:46:18 +01:00
Lioncash
1e10017f4b ir: Add opcodes for signed absolute differences 2020-04-22 20:46:17 +01:00
Lioncash
44a5f8095a ir: Add opcodes for performing vector halving subtracts 2020-04-22 20:46:17 +01:00
Lioncash
27a6d5f6ce emit_x64_vector: Use VPOPCNTB in EmitVectorPopulationCount() if AVX-512 BITALG is available 2020-04-22 20:46:17 +01:00
Lioncash
089096948a ir: Add opcodes for performing halving adds 2020-04-22 20:46:17 +01:00
Lioncash
3d00dd63b4 emit_x64_vector: Emit VPMINSQ and VPMINUQ for 64-bit vector min operations if AVX-512VL is available 2020-04-22 20:46:17 +01:00
Lioncash
b97b71b8aa emit_x64_vector: Emit VPMAXSQ and VPMAXUQ for 64-bit vector max operations if AVX-512VL is available 2020-04-22 20:46:17 +01:00
Lioncash
0f067b7330 emit_x64_vector: Emit VPABSQ in EmitVectorAbs() for the 64-bit case if AVX-512VL is available 2020-04-22 20:46:17 +01:00
Lioncash
d4ee878cbd emit_x64_vector: Use VPSRAQ in EmitVectorArithmeticShiftRight64() if AVX-512VL is available 2020-04-22 20:46:17 +01:00
Lioncash
51e4f1d9db emit_x64_vector: Vectorize fallback path of EmitVectorMaxS32() 2020-04-22 20:46:17 +01:00
Lioncash
c692ccdd6d emit_x64_vector: Vectorize fallback path of EmitVectorMaxS8() 2020-04-22 20:46:17 +01:00
Lioncash
b194313d8c emit_x64_vector: Vectorize fallback path in EmitVectorMinU32() 2020-04-22 20:46:17 +01:00