Commit graph

1885 commits

Author SHA1 Message Date
Lioncash
fac9224d5e A64: Handle half-precision floating point in FCVTN
Now that we have IR instructions for performing conversions with
half-precision floating point, we can also handle half-precision values
within FCVTN.
2020-04-22 20:58:12 +01:00
Lioncash
8309ec7a9f frontend/ir_emitter: Add half-precision variant of FPAbs 2020-04-22 20:58:12 +01:00
Lioncash
16de99d3e3 A64: Enable FCVT floating-point conversions for half-precision
With this, we no longer have to fall back to the interpreter in any of
the FCVT floating-point conversion instructions.
2020-04-22 20:58:12 +01:00
Lioncash
10abc77fad A64: Handle half-precision floating point in scalar FNEG
With the half-precision variant of the FPNeg opcode added, we can
utilize it here to emulate the half-precision variant of FNEG.
2020-04-22 20:58:12 +01:00
Lioncash
e4c259d69f frontend/ir_emitter: Add half->{single, double} and {double, single}->half conversion opcodes 2020-04-22 20:58:12 +01:00
Lioncash
c97efcb978 frontend/ir_emitter: Add half-precision variant of FPNeg 2020-04-22 20:58:12 +01:00
Lioncash
dff5da1063 common/fp/unpacked: Amend behavior of FPUnpackCV
This is supposed to call FPUnpackBase instead of FPUnpack. This would
result in alternate half-precision representations being misinterpreted
when it comes to dealing with NaNs.
2020-04-22 20:58:12 +01:00
Lioncash
03bc2334fe common/fp/op/FPConvert: Amend off-by one in double NaN case in FPConvertNaN
Avoids potentially clobbering the intended sign bit value during
conversions to double-precision values. The other conversion types are
already properly handled, so those don't need to be addressed.
2020-04-22 20:58:12 +01:00
Lioncash
c57b146fb2 common/fp/op/FPConvert: Add half-precision instantiations to FPConvert 2020-04-22 20:58:12 +01:00
Merry
c1ce94872d Merge pull request #455 from lioncash/sqrdmulh-scalar
A64: Implement SQRDMULH and SQDMULL's scalar indexed variants
2020-04-22 20:58:11 +01:00
Lioncash
25a7256ee1 A64: Enable FMOV (general) for half-precision floating point
This just transfers values between vector registers and general-purpose
registers with no conversions performed, so this is trivial to add
support for half-precision to.
2020-04-22 20:58:11 +01:00
Merry
98d8f81d7c Merge pull request #454 from lioncash/sqrdmulh
A64: Implement SQRDMULH and SQDMULL{2}'s vector indexed element variants
2020-04-22 20:58:11 +01:00
Lioncash
97dd3d0596 A64: Implement SQRDMULH's scalar indexed element variant 2020-04-22 20:58:11 +01:00
Merry
42b090d234 Merge pull request #452 from lioncash/frecpx
A64: Implement FRECPX's half-precision floating-point variant
2020-04-22 20:58:11 +01:00
Lioncash
49b51e34f1 simd_vector_x_indexed_element: Deduplicate index and Vm operand construction 2020-04-22 20:58:11 +01:00
Lioncash
692aba91b6 A64: Implement SQDMULL{2}'s scalar indexed element variant 2020-04-22 20:58:11 +01:00
Merry
32364fb62c Merge pull request #451 from lioncash/unpck
common/fp: Minor adjustments for half-precision floating point support
2020-04-22 20:58:11 +01:00
Lioncash
3a3542414b A64: Implement FRECPX's half-precision floating point variant 2020-04-22 20:58:11 +01:00
Lioncash
c043b831d5 A64: Implement SQDMULL{2}'s by-element variant 2020-04-22 20:58:11 +01:00
Lioncash
72af5a3dff simd_scalar_x_indexed_element: Factor out index and Vm argument construction
This will be useful in the implementations of SQRDMULH and SQDMULL{2} as
well.
2020-04-22 20:58:11 +01:00
Merry
37c4c39d62 Merge pull request #448 from lioncash/saturate
A64: Implement SQSHRN, SQSHRUN, and UQSHRN's scalar variants
2020-04-22 20:58:11 +01:00
Lioncash
7030b9af95 common/fp/process_nan: Add half-precision instantiations for NaN processing functions 2020-04-22 20:58:11 +01:00
Lioncash
bd892ec4ef frontend/ir/ir_emitter: Amend FPRecipExponent to handle half-precision floating point 2020-04-22 20:58:11 +01:00
Lioncash
224ff0afaa A64: Implement SQRDMULH's by-index vector variant 2020-04-22 20:58:11 +01:00
Merry
f5d774bdbd Merge pull request #449 from lioncash/hp
common/fp/info: Add specialization of FPInfo for half-precision floating point
2020-04-22 20:58:11 +01:00
Lioncash
126c29a9e9 A64: Implement SQSHRN, SQSHRUN, and UQSHRN's scalar variants
These can just be implemented in terms of the vector variants for the
time being.
2020-04-22 20:58:11 +01:00
Lioncash
14f55d7476 common/fp/unpacked: Add half-precision instantiation of FPRoundBase 2020-04-22 20:58:11 +01:00
Lioncash
974fbf0677 frontend/ir/value: Add U16U32U64 type to represent floating point types 2020-04-22 20:58:11 +01:00
Merry
4b86151a0c Merge pull request #450 from lioncash/cv
common/fp/unpacked: Add FPRoundCV and FPUnpackCV
2020-04-22 20:58:11 +01:00
Lioncash
0b67b94b6c common/fp/info: Add specialization of FPInfo for half-precision floating point
Puts the necessary info struct in place for further use.
2020-04-22 20:58:11 +01:00
Lioncash
dd7433f9d3 A64: Amend prototypes of some SIMD scalar shift by immediate opcodes
These take a vector for a destination.
2020-04-22 20:58:11 +01:00
Lioncash
7e814de445 common/fp/unpacked: Handle half-precision unpacking in FPUnpackBase 2020-04-22 20:58:11 +01:00
Lioncash
eb3e0d5908 common/fp/op/FPRecipExponent: Add half-precision floating point specialization 2020-04-22 20:58:11 +01:00
Merry
bbd5330ad2 Merge pull request #447 from lioncash/flag
A64: Implement CFINV, RMIF, AXFlag and XAFlag
2020-04-22 20:58:11 +01:00
Lioncash
99c494bae9 common/fp/unpacked: Add FPRoundCV
Corresponds to the equivalent pseudocode within the ARMv8 reference
manual. This will be necessary for supporting half-precision
floating-point.

This also makes use of it within FPConvert
2020-04-22 20:58:11 +01:00
Lioncash
8f9fe8690a common/fp/unpacked: Adjust FPUnpack to operate like ARM pseudocode
This function is defined as always disabling the AHP bit in the fpcr
before performing any operations.

At the same time, rename the original FPUnpack function to FPUnpackBase
to match the pseudocode in the ARM reference manual.
2020-04-22 20:58:11 +01:00
Lioncash
a829c93406 common/fp/unpacked: Correct edge-cases within FPUnpack for half-precision floating point
This corrects one case where floating-point exceptions could be set when
they're not supposed to be.

This also corrects a case where values were being treated as NaNs when
they weren't supposed to be.
2020-04-22 20:58:11 +01:00
Merry
fb039e232c Merge pull request #442 from lioncash/fcvtxn
A64: Implement scalar and vector variants of FCVTXN
2020-04-22 20:58:11 +01:00
Lioncash
6aed4036ef ir_opt/a64_get_set_elimination_pass: Add handling for NZCV raw get and set operations 2020-04-22 20:58:11 +01:00
Lioncash
490bebbd9a common/fp/unpacked: Add FPUnpackCV
Adds a template function that performs the same behavior as in the ARM
pseudocode, and utilizes it in FPConvert, which will be necessary for
half-float support.
2020-04-22 20:58:11 +01:00
Merry
4f937c1ee1 Merge pull request #446 from lioncash/sqshl
A64: Implement scalar variants of SQSHL (register) and UQSHL (register)
2020-04-22 20:58:11 +01:00
Lioncash
aa22db534b A64: Implement AXFlag and XAFlag 2020-04-22 20:58:11 +01:00
Merry
d74cccbc84 Merge pull request #445 from lioncash/sqrt
A64: Implement single and double-precision vector variant of FSQRT
2020-04-22 20:58:11 +01:00
Lioncash
20ffe568d0 A64: Implement RMIF 2020-04-22 20:58:11 +01:00
Merry
6d7e7c3269 Merge pull request #443 from lioncash/flag
A64: Rearrange flag format/manipulation instructions
2020-04-22 20:58:11 +01:00
Lioncash
51b526e453 A64: Implement CFINV 2020-04-22 20:58:11 +01:00
Merry
5d01f1b462 Merge pull request #441 from lioncash/constexpr
common/bit_util: Mark a few functions as constexpr
2020-04-22 20:58:11 +01:00
Lioncash
597a8be5d5 ir: Add A64-specific opcodes for getting and setting raw NZCV values
This will be necessary to implement the flag manipulation and flag
format instructions.
2020-04-22 20:58:11 +01:00
Merry
743c52fdc5 Merge pull request #440 from lioncash/include
common/fp: Remove unnecessary includes
2020-04-22 20:58:11 +01:00
Merry
46922b9138 Merge pull request #444 from lioncash/interpret
A64: Fall back to interpreting for FCADD and FCMLA half-precision variants
2020-04-22 20:58:11 +01:00