Commit graph

44 commits

Author SHA1 Message Date
MerryMage
bb93353f94 emit_x64_vector_floating_point: Correct FMA in FTZ mode
x64 rounds before flushing to zero
AArch64 rounds after flushing to zero

This difference of behaviour is noticable if something would round to a smallest normalized number
2020-04-22 20:46:23 +01:00
MerryMage
8ef195db3c emit_x64_floating_point: DenormalsAreZero is redundant as hardware already does DAZ
Exceptions: F{MIN,MAX}{,NM}
2020-04-22 20:46:23 +01:00
MerryMage
de9d8c461c emit_x64_floating_point: FlushToZero is redundant as hardware already does FTZ 2020-04-22 20:46:23 +01:00
MerryMage
822fd4a875 backend_x64: Fix FPVectorMulAdd and FPMulAdd NaN handling with denormals
Denormals should be treated as zero in NaN handler
2020-04-22 20:46:23 +01:00
MerryMage
b393e15ab6 backend_x64: Fix bugs when FPCR.FZ=1
Bugs:
* DenormalsAreZero flushed to positive zero instead of preserving sign.
* FMAXNM/FMINNM (scalar) should perform DAZ *before* special zero handling.
* FMAX/FMIN/FMAXNM/FMINNM (vector) did not DAZ.
2020-04-22 20:46:23 +01:00
MerryMage
2019d32743 emit_x64_floating_point: Deduplicate EmitFPMulAdd implementation 2020-04-22 20:46:23 +01:00
MerryMage
e038fe72df emit_x64_floating_point: Deduplicate code 2020-04-22 20:46:23 +01:00
MerryMage
7f27945411 emit_x64_floating_point: Fix FP{Max,Min} when FPCR.DN = 1 2020-04-22 20:46:23 +01:00
MerryMage
901bd9b4e2 IR: Implement FPRecipStepFused, FPVectorRecipStepFused 2020-04-22 20:46:22 +01:00
MerryMage
c1dcfe29f7 IR: Implement FPRecipEstimate 2020-04-22 20:46:22 +01:00
MerryMage
506e544bfe IR: Implement FPRSqrtStepFused 2020-04-22 20:46:22 +01:00
MerryMage
66bb05fc0a emit_x64_floating_point: Fixup special NaN case in FMA FPMulAdd implementation 2020-04-22 20:46:21 +01:00
MerryMage
0ce11b7b15 emit_x64_floating_point: Implement accurate fallback for FPMulAdd{32,64} 2020-04-22 20:46:21 +01:00
MerryMage
6087c2af6f emit_x64_floating_point: s/Esimate/Estimate/ 2020-04-22 20:46:21 +01:00
MerryMage
bde58b04d4 IR: Implement FPRSqrtEstimate 2020-04-22 20:46:21 +01:00
Lioncash
1dc1e3dcd8 fp: Use forward declarations where applicable
Minimizes the amount of files that need to be rebuilt if the headers
ever change.
2020-04-22 20:46:21 +01:00
MerryMage
b53127600b fp: A64::FPCR -> FP::FPCR 2020-04-22 20:46:21 +01:00
MerryMage
83be491875 emit_x64_floating_point: SSE4.1 implementation of EmitFPRound 2020-04-22 20:46:20 +01:00
MerryMage
b228694012 IR: Implement FPRoundInt 2020-04-22 20:46:20 +01:00
MerryMage
304cc7f61e emit_x64_floating_point: SSE4.1 implementation for FP{Double,Single}ToFixed{S,U}{32,64} 2020-04-22 20:46:19 +01:00
MerryMage
caaf36dfd6 IR: Initial implementation of FP{Double,Single}ToFixed{S,U}{32,64}
This implementation just falls-back to the software floating point implementation.
2020-04-22 20:46:19 +01:00
MerryMage
3cb98e1560 fp: Move fp_util to fp/util 2020-04-22 20:46:19 +01:00
Lioncash
7a84b6e8d8 ir: Add opcodes for converting S64 and U64 to single-precision floating-point values 2020-04-22 20:46:19 +01:00
MerryMage
0b97e9bd8d emit_x64_floating_point: Fix EmitFPU64ToDouble for TowardsMinusInfinity rounding mode 2020-04-22 20:46:18 +01:00
Lioncash
7252293184 emit_x64_floating_point: Correct use of UseGpr() in EmitFPU32ToDouble() and EmitFPU32ToSingle()
In the non-AVX512 path, the following code is present:

code.mov(from.cvt32(), from.cvt32());

since this potentially modifies 'from', we should be using
UseScratchGpr() instead.
2020-04-22 20:46:18 +01:00
Lioncash
fbd7623fe5 emit_x64_floating_point: Add AVX512F conversion operations to EmitFPU32ToSingle() and EmitFPU32ToDouble()
AVX-512F provides convenient instructions for these kinds of conversions
directly
2020-04-22 20:46:18 +01:00
Lioncash
3a41465eaf ir: Add opcodes for converting S64 and U64 to double-precision values 2020-04-22 20:46:18 +01:00
MerryMage
8c90fcf58e IR: Implement FPMulAdd 2020-04-22 20:46:18 +01:00
MerryMage
cdc5c3ad95 emit_x64_floating_point: Near jump instead of short jump in FPMinNumberic{32,64} 2020-04-22 20:46:15 +01:00
MerryMage
df4ee0f51e emit_X64_floating_point: Near jmp to end instead of short jmp
Jump destination can be further than what can be reached in a short
jump under some FPCR options.
2020-04-22 20:46:15 +01:00
MerryMage
2721bb5ace emit_x64_floating_point: Add maybe_unused to preprocess parameter 2020-04-22 20:46:15 +01:00
MerryMage
0575e7421b A64: Implement FMINNM (scalar) 2020-04-22 20:46:15 +01:00
MerryMage
1c9804ea07 A64: Implement FMAXNM (scalar) 2020-04-22 20:46:15 +01:00
MerryMage
1dfce0894d constant_pool: Add frame parameter 2020-04-22 20:46:14 +01:00
MerryMage
6541ec064d emit_x64_floating_point: Correct FP{Max,Min}{32,64} implementations for -0/+0 2020-04-22 20:46:14 +01:00
MerryMage
07520f32c3 backend_x64: Accurately handle NaNs 2020-04-22 20:46:14 +01:00
MerryMage
2cb0a699ba IR: Implement FPMax, FPMin 2020-04-22 20:46:14 +01:00
MerryMage
aac5af50e2 IR: FPCompare{32,64} now return NZCV flags instead of implicitly setting them 2020-04-22 20:46:13 +01:00
MerryMage
b173fcf34e backend_x64: Simplify FPDoubleToU32 and FPSingleToU32
They're inaccurate in terms of FPSR at the moment anyway.
2020-04-22 20:46:13 +01:00
MerryMage
68f46c8334 backend_x64: Use a reference to BlockOfCode instead of a pointer 2020-04-22 20:46:13 +01:00
Lioncash
67443efb62 General: Convert multiple namespace specifiers to nested namespace specifiers where applicable
Makes namespacing a little less noisy
2020-04-22 20:44:38 +01:00
MerryMage
db30e02ac8 emit_x64: Extract BlockRangeInformation, remove template parameter 2020-04-22 20:44:36 +01:00
MerryMage
58c4a25527 emit_x64: Use JitStateInfo 2020-04-22 20:42:46 +01:00
MerryMage
a554e4a329 backend_x64: Split emit_x64 2020-04-22 20:42:46 +01:00