Lioncash
e71612d394
A64: Implement SSHL (scalar)
2020-04-22 20:46:17 +01:00
Lioncash
ef1e69a1e3
A64: Implement SSHL (vector)
2020-04-22 20:46:17 +01:00
Lioncash
21974ee57e
backend_x64/ir: Amend generic LogicalVShift() template to also handle signed variants
...
Also adds IR opcodes to dispatch said variants
2020-04-22 20:46:17 +01:00
Lioncash
9fc89f0a0e
emit_x64_vector_floating_point: Use arrays for retrieving size instead of hardcoding the size
...
Similar changes were done in emit_x64_vector, but these were missed.
2020-04-22 20:46:17 +01:00
Lioncash
af28e89a13
emit_x64_vector: Vectorize fallback path in EmitVectorMaxU16()
2020-04-22 20:46:17 +01:00
Lioncash
cda75e2079
A64: Implement CMTST's scalar variant
2020-04-22 20:46:17 +01:00
Lioncash
0d20423ad5
emit_x64_vector: Vectorize non-SSE4.1 fallback path for VectorMultiply32()
2020-04-22 20:46:17 +01:00
Lioncash
d70ee7c0d1
emit_x64_vector: Use VBPROADCAST where applicable and available
...
Uses the instruction that does what it says in its name if available. Allows avoiding the use
of a scratch register in EmitVectorBroadcast8() and EmitVectorBroadcastLower8()'s SSSE3 path.
2020-04-22 20:46:17 +01:00
Lioncash
bebe7235ae
A64: Implement UZP1 and UZP2
2020-04-22 20:46:17 +01:00
Lioncash
26d77c6f09
ir: Add opcodes for performing vector deinterleaving
2020-04-22 20:46:17 +01:00
Lioncash
d6f9ed47d9
A64: Implement FNEG (half-precision)
2020-04-22 20:46:17 +01:00
MerryMage
2b8bc1d4e1
README: Add usage example
2020-04-22 20:46:17 +01:00
Lioncash
7efbd73bac
A64: Implement USHL (scalar)
2020-04-22 20:46:17 +01:00
Lioncash
41f4717f2b
A64: Implement FNEG (vector)
2020-04-22 20:46:17 +01:00
Lioncash
ba1cc6366d
A64: Implement RSUBHN/RSUBHN2
2020-04-22 20:46:17 +01:00
Lioncash
e41640fe33
A64: Implement RADDHN/RADDHN2
2020-04-22 20:46:17 +01:00
Lioncash
b719a6b3f7
A64: Implement XAR
2020-04-22 20:46:17 +01:00
Lioncash
0b1b131ec2
simd_two_register_misc: Factor out common comparison code
...
Gets rid of a tiny bit of duplicated code.
2020-04-22 20:46:17 +01:00
Lioncash
ed0b84da70
A64: Implement CMLE (zero)'s vector variant
2020-04-22 20:46:17 +01:00
Lioncash
b595a68ffa
A64: Implement CMTST (vector)
2020-04-22 20:46:17 +01:00
Lioncash
48c7f8630c
A64: Implement ADDHN{2} and SUBHN{2}
2020-04-22 20:46:17 +01:00
Lioncash
3acd9c9200
translate: zero extend result in Vpart when storing to lower part of vector
2020-04-22 20:46:17 +01:00
Lioncash
87ca63699f
emit_x64_vector: Emit PMAXUD in EmitVectorMaxU32 on SSE4.1-capable CPUs
2020-04-22 20:46:17 +01:00
Lioncash
f17702f608
emit_x64_vector: Emit PMINUD in EmitVectorMinU32 on SSE4.1-capable CPUs
2020-04-22 20:46:17 +01:00
Lioncash
596a8dd1dd
emit_x64_vector: Emit PMINSD in EmitVectorMinS32 on SSE4.1-capable CPUs
...
Provides a better alternative to a fallback operation.
2020-04-22 20:46:17 +01:00
Lioncash
75fd4eaaaa
emit_x64_vector: Get rid of some magic numbers in loop bounds
2020-04-22 20:46:17 +01:00
Lioncash
7b80ac25eb
emit_x64_vector: Generify variable shift functions
2020-04-22 20:46:17 +01:00
Lioncash
4ec735f707
A64: Implement CMLE (zero)'s scalar variant
2020-04-22 20:46:17 +01:00
Lioncash
6534184df2
A64: Implement CMLT (zero)'s scalar single/double-precision variant
2020-04-22 20:46:17 +01:00
Lioncash
8863c9bb4b
A64: Implement SHA512H2
2020-04-22 20:46:17 +01:00
Lioncash
033b890e25
A64: Implement SHA512H
2020-04-22 20:46:17 +01:00
Lioncash
d1f5b084b4
A64: Handle S32->F32 case for SCVTF (vector)
2020-04-22 20:46:17 +01:00
Lioncash
38fa984b53
IR: Add opcode for packed word->f32 conversions
2020-04-22 20:46:16 +01:00
Lioncash
b8587d8e34
A64: Implement SHA512SU1
2020-04-22 20:46:16 +01:00
Lioncash
44d846045a
A64: Implement SHA512SU0
2020-04-22 20:46:16 +01:00
Lioncash
ca903c1585
A64: Implement SHA256H and SHA256H2
2020-04-22 20:46:16 +01:00
MerryMage
e4237c44eb
A64: Implement SCVTF (vector, integer), scalar varaint
2020-04-22 20:46:16 +01:00
MerryMage
bfba38d0b6
impl: Reorganize scalar two-register misc instructions
2020-04-22 20:46:16 +01:00
Lioncash
ea582b17cc
A64: Implement SHA256SU1
2020-04-22 20:46:16 +01:00
Lioncash
06c5dcaf5e
simd_two_register_misc: Add missing zeroing of the vector for CMGT and CMLT
2020-04-22 20:46:16 +01:00
Lioncash
0d50d7314b
A64: Implement CMGE (zero)'s vector variant
2020-04-22 20:46:16 +01:00
Lioncash
ab35dc0e78
A64: Implement MLS (by element)
2020-04-22 20:46:16 +01:00
Lioncash
1651e60462
A64: Implement MUL (by element)
2020-04-22 20:46:16 +01:00
MerryMage
a86d4093cd
A64: Implement MLA (by element)
2020-04-22 20:46:16 +01:00
Lioncash
7f47402609
A64: Implement ABS (scalar)
2020-04-22 20:46:16 +01:00
Lioncash
c8eb4528be
A64: Implement SHA256SU0
2020-04-22 20:46:16 +01:00
Lioncash
cd0b71159a
CMake: Make FindUnicorn introduce a unicorn target
...
Makes the find module do all the work of properly setting up the target instead of needing to do it in the main CMakeLists file.
2020-04-22 20:46:16 +01:00
Lioncash
181c3b0790
A64: Implement SHA1M
2020-04-22 20:46:16 +01:00
Lioncash
47bc97a71b
A64: Implement SHA1P
2020-04-22 20:46:16 +01:00
Lioncash
718f3e9bb4
A64: Implement scalar variants of CMEQ, CMGT, and CMGE zero comparison instructions
...
These can trivially use the ScalarCompare helper function.
2020-04-22 20:46:16 +01:00