dynarmic

Author	SHA1	Message	Date
Lioncash	c82fa5ec5a	data_processing_addsub: Move datasize declarations after early-exit conditionals. While we're at it, also make relevant variables const where applicable.	2020-04-22 20:55:06 +01:00
Lioncash	f4a66d2477	data_processing_bitfield: Move datasize variables after early-exit conditionals. Moves the declaration of datasize to the scope that it's used within. This also takes the opportunity to apply const where applicable, and make early-exits all vertically consistent with one another.	2020-04-22 20:55:06 +01:00
Lioncash	2e0fcd6161	A64: Implement CLS's vector variant. Leverages CLZ like the integral variant does.	2020-04-22 20:55:06 +01:00
Lioncash	a2cd643525	emit_x64_vector: Make EmitVectorUnsignedSaturatedAccumulateSigned() internally linked. Given this is just an internal helper function, it can be marked static.	2020-04-22 20:55:06 +01:00
Lioncash	c39ea2e3c9	perf_map: Use std::string_view instead of std::string for PerfMapRegister(). We can just use a non-owning view into a string in this case instead of potentially allocating a std::string instance.	2020-04-22 20:55:06 +01:00
MerryMage	12243692f5	A64: Implement SQRDMULH (vector), vector variant	2020-04-22 20:55:06 +01:00
MerryMage	a9ffcf08b1	A64: Implement SQDMULL (vector), vector variant	2020-04-22 20:55:06 +01:00
MerryMage	3e447614c6	IR: Add VectorSignedSaturatedDoublingMultiplyLong	2020-04-22 20:55:06 +01:00
MerryMage	06b31448aa	emit_x64_vector: Changes to VectorSignedSaturatedDoublingMultiply. Return both the upper and lower parts of the multiply if required. SSE2 does not support the pmuldq instruction, so do sign correction to an unsigned result instead. Improve port utilisation where possible (punpck instructions were a bottleneck).	2020-04-22 20:55:06 +01:00
MerryMage	08c0e017a5	IR: Implement Vector{Signed,Unsigned}Multiply{16,32}	2020-04-22 20:55:06 +01:00
Lioncash	b6df34cdde	backend_x64/a64_interface: Re-enable the constant folding pass. This was disabled for debugging, but never re-enabled. Just to be sure, testing was done downstream in yuzu to make sure this didn't break anything, and it appears not to have.	2020-04-22 20:55:06 +01:00
MerryMage	06ba397af2	emit_x64_vector_floating_point: Hardware FMA implementation for RSqrtStepFused	2020-04-22 20:55:06 +01:00
MerryMage	e553c4fe8d	emit_x64_vector_floating_point: Hardware FMA implementation of FPVectorRecipStepFused	2020-04-22 20:55:06 +01:00
MerryMage	3caeb62ef1	emit_x64_floating_point: Hardware FMA implementation of FPRSqrtStepFused	2020-04-22 20:55:06 +01:00
MerryMage	344ee76aba	emit_x64_floating_point: Hardware FMA implementation of FPRecipStepFused{32,64}	2020-04-22 20:55:06 +01:00
MerryMage	1492573267	emit_x64_vector: SSE implementation of VectorSignedSaturatedAccumulateUnsigned{8,16,32}	2020-04-22 20:55:06 +01:00
Lioncash	26df6e5e7b	emit_x64_vector: Correct static asserts for < 64-bit type checks in saturated accumulate fallbacks. I had initially meant to use BitSize() here, not sizeof().	2020-04-22 20:55:06 +01:00
MerryMage	a4a26ac226	emit_x64_vector: EmitVectorSignedSaturatedAccumulateUnsigned64: SSE implementation	2020-04-22 20:55:06 +01:00
MerryMage	a7c66d2d28	emit_x64_vector: Simplify fpsr_qc related code. Move the bool conversion into A64JitState::GetFpsr so we don't have to continuously pay the cost of conversion for every saturation instruction.	2020-04-22 20:55:06 +01:00
Lioncash	112cff9ab9	A64: Implement CLZ's vector variant	2020-04-22 20:55:06 +01:00
Lioncash	e739624296	ir: Add opcodes for vector CLZ operations. We could optimize these cases further with a fair bit of shuffling via pshufb and the use of masks, but given the uncommon use of this instruction, I wouldn't consider the extra code to be worth it over a simple, manageable naive solution like this. If we ever do hit a case where vectorized CLZ happens to be a bottleneck, then we can revisit this. At least with AVX-512CD, this can be done with a single instruction for the 32-bit word case.	2020-04-22 20:55:05 +01:00
MerryMage	d4c37a68a8	A64/translate: VectorZeroUpper for V(64) stores. Ensures correctness.	2020-04-22 20:55:05 +01:00
MerryMage	b8daa4feac	simd_two_register_misc: FNEG (vector) with Q == 0 had dirty upper	2020-04-22 20:55:05 +01:00
Lioncash	5653e7637e	emit_x64_vector: Remove unnecessary [[maybe_unused]] attributes. These were unintentionally left in when introducing SUQADD and USQADD.	2020-04-22 20:55:05 +01:00
Lioncash	14e026a7f0	A64: Implement USQADD's scalar and vector variants	2020-04-22 20:55:05 +01:00
Lioncash	d4a76aaa04	ir: Add opcodes for unsigned saturated accumulations of signed values	2020-04-22 20:55:05 +01:00
Lioncash	18ad7f237d	A64: Implement SUQADD's scalar and vector variants	2020-04-22 20:55:05 +01:00
Lioncash	6f911a26da	ir: Add opcodes for signed saturated accumulations of unsigned values	2020-04-22 20:55:05 +01:00
Lioncash	9a3d38d2ee	A64: Implement SMLAL{2}, SMLSL{2}, UMLAL{2}, and UMLSL{2}'s vector by-element variants. We can simply modify the general function made for SMULL{2} and UMULL{2}'s by-element variants to also handle the other multiply-based by-element variants.	2020-04-22 20:55:05 +01:00
Lioncash	6ccfbc9b39	A64: Implement UMULL{2}'s vector by-element variant	2020-04-22 20:55:05 +01:00
Lioncash	58e21f175c	A64: Implement SMULL{2}'s vector by-element variant	2020-04-22 20:55:05 +01:00
Lioncash	134bb02e19	ir/value: Replace includes with forward declarations. enum classes are still considered complete types when forward declared (as the compiler knows the exact size of the type from the declaration alone). The only difference in this case is that the members of the enum class aren't visible. Given we don't use the members within this header in any way, we can simply forward declare them here and remove the inclusions.	2020-04-22 20:55:05 +01:00
Lioncash	2c8e07e7d0	ir/cond: Migrate to C++17 nested namespace specifiers	2020-04-22 20:55:05 +01:00
Lioncash	c3b7819a55	CMakeLists: Add missing cond.h header to file listing. Allows the file to show up within IDEs more easily.	2020-04-22 20:55:05 +01:00
Lioncash	0a3976059f	A64: Implement URSQRTE	2020-04-22 20:55:05 +01:00
Lioncash	b6e74fd17d	ir: Add opcodes for performing unsigned reciprocal square root estimates	2020-04-22 20:55:05 +01:00
Lioncash	bd3582e811	A64: Implement URECPE	2020-04-22 20:55:05 +01:00
Lioncash	af83360f89	ir: Add opcodes for unsigned reciprocal estimate	2020-04-22 20:55:05 +01:00
Lioncash	740ffa52ae	A64: Implement SQNEG's scalar and vector variant	2020-04-22 20:53:46 +01:00
Lioncash	fca7eddb9e	A64: Add opcodes for signed saturating negations	2020-04-22 20:53:46 +01:00
Lioncash	f1ebbcd7bc	emit_x64_vector: Simplify "position == 0" case for EmitVectorExtract(). In the event position is zero, we can just treat it as a NOP, given there's no need to move the data.	2020-04-22 20:53:46 +01:00
Lioncash	87372917f9	emit_x64_vector: Simplify "position == 0" case for EmitVectorExtractLower(). In the event position == 0, we can just treat it as a simple movq, clearing the upper half of the XMM register. This also makes that case use only one register.	2020-04-22 20:53:46 +01:00
Lioncash	f5fb496e7e	A64: Implement SQDMULH's by-element scalar variant	2020-04-22 20:53:46 +01:00
Lioncash	40f0576995	A64: Implement SQDMULH's by-element vector variant	2020-04-22 20:53:46 +01:00
MerryMage	8f9206901d	backend/x64: Do not clear fast_dispatch_table if not enabled. There is no need to pay for the cost of setting a large block of memory if we're not using it.	2020-04-22 20:53:46 +01:00
MerryMage	9b65100660	A64: Implement FastDispatchHint	2020-04-22 20:53:46 +01:00
MerryMage	f96c43d422	A32: Implement FastDispatchHint	2020-04-22 20:53:46 +01:00
MerryMage	aa8d826c13	ir/terminal: Add FastDispatchHint	2020-04-22 20:53:46 +01:00
Lioncash	1a69a61cb4	A64: Implement SQDMULH's scalar variant	2020-04-22 20:53:46 +01:00
Lioncash	7ebfd0f31c	ir: Add opcodes for scalar signed saturated doubling multiplies	2020-04-22 20:53:46 +01:00
Lioncash	9c03311fed	A64: Implement SQDMULH's vector variant	2020-04-22 20:53:46 +01:00
Lioncash	a0231e5546	ir: Add opcodes for signed saturated doubling multiplies	2020-04-22 20:53:46 +01:00
Lioncash	db24e1f09b	A64: Implement SQABS' scalar variant	2020-04-22 20:53:46 +01:00
Lioncash	bda5d14c7f	A64: Implement SQABS' vector variant.	2020-04-22 20:53:46 +01:00
Lioncash	0507e47420	ir: Add opcodes for signed saturated absolute values	2020-04-22 20:53:46 +01:00
MerryMage	27427595b7	emit_x64_floating_point: EmitFPToFixed: maxsd optimization. maxsd is not required when doing a signed conversion, because x64 produces a 0x80...00 value for out of range values.	2020-04-22 20:53:46 +01:00
MerryMage	1abf82ac4a	emit_x64_floating_point: ZeroIfNaN: pxor -> xorps. xorps is shorter and more appropriate here.	2020-04-22 20:53:46 +01:00
MerryMage	3415828fb4	IR: Simplify FP{Single,Double}ToFixed{U,S}{32,64}	2020-04-22 20:53:46 +01:00
Lioncash	e30f9816ec	A32/decoder: Add missing <algorithm> includes. These includes should be present, as we use std::find_if() within these headers.	2020-04-22 20:53:46 +01:00
Lioncash	4507627905	emit_x64_vector: Provide AVX path for EmitVectorMinU64()	2020-04-22 20:53:46 +01:00
Lioncash	fd49a62b06	emit_x64_vector: Provide AVX path for EmitVectorMinS64()	2020-04-22 20:53:46 +01:00
Lioncash	770723f449	emit_x64_vector: Provide AVX path for EmitVectorMaxU64()	2020-04-22 20:53:46 +01:00
Lioncash	8fb90c0cf1	emit_x64_vector: Provide AVX path for EmitVectorMaxS64()	2020-04-22 20:53:46 +01:00
Lioncash	2cac6ad129	emit_x64_vector: Simplify EmitVectorLogicalLeftShift8(). Similar to EmitVectorLogicalRightShift8(), we can determine a mask ahead of time and just AND it against the result of a halfword left shift.	2020-04-22 20:53:46 +01:00
Lioncash	135107279d	emit_x64_vector: Simplify EmitVectorLogicalShiftRight8(). We can generate the mask and AND it against the result of a halfword shift instead of looping.	2020-04-22 20:53:46 +01:00
Lioncash	2952b46b16	emit_x64_vector: Amend value definition in SSE 4.1 path for EmitVectorSignExtend16(). We should be defining the value after the results have been calculated to be consistent with the rest of the code.	2020-04-22 20:53:46 +01:00
Lioncash	fda19095ea	emit_x64_vector: Remove fallback in EmitVectorSignExtend64(). This is fairly trivial to do manually.	2020-04-22 20:53:46 +01:00
Lioncash	39593fcd26	emit_x64_vector: Remove fallback for EmitVectorSignExtend32(). We can just do the extension manually, which gets rid of the need to fall back here.	2020-04-22 20:53:46 +01:00
Lioncash	053175f69b	ir_emitter: Rename fpscr_controlled parameters to fpcr_controlled. Part of addressing #333.	2020-04-22 20:53:46 +01:00
MerryMage	f0184c4b8d	a32/exception_generating: BKPT: Define unpredictable behaviour. Define the unpredictable behaviour to be that BKPT executes conditionally.	2020-04-22 20:53:46 +01:00
MerryMage	a12854857b	A32: Add define_unpredictable_behaviour option	2020-04-22 20:53:46 +01:00
MerryMage	b0abaa8312	A32/location_descriptor: Change formatting to use hex	2020-04-22 20:53:46 +01:00
MerryMage	ccbf6c7f63	microinstruction: A32ExceptionRaised causes CPU exception	2020-04-22 20:53:46 +01:00
MerryMage	6595e49a31	A32/types: CondToString: Add nv	2020-04-22 20:53:46 +01:00
MerryMage	d5b9c4a4bb	block_of_code: Hide NX support behind compiler flag. Systems that require W^X can use the DYNARMIC_ENABLE_NO_EXECUTE_SUPPORT cmake option.	2020-04-22 20:53:46 +01:00
MerryMage	de4494ffa5	Implement perfmap	2020-04-22 20:53:46 +01:00
MerryMage	f73104633b	a32_emit_x64: Fix incorrect BMI2 implementation for SetCpsr. The MSB of each byte in cpsr_ge was not being set appropriately. We also expand test coverage to test this case. We fix the disassembly of the MSR (imm) and MSR (reg) instructions as well.	2020-04-22 20:53:46 +01:00
MerryMage	3432a08e0a	backend/x64: Support W^X systems. Closes #176.	2020-04-22 20:53:46 +01:00
BreadFish64	2a65442933	Backend: Create "backend" folder similar to the "frontend" folder	2020-04-22 20:53:46 +01:00
MerryMage	3b13f1eb12	A64/translate: Standardize arguments of helper functions. Don't pass in IREmitter when TranslatorVisitor is already available.	2020-04-22 20:53:45 +01:00
MerryMage	a4e556d59c	A64/translate: Standardize TranslatorVisitor abbreviation. Prefer v to tv.	2020-04-22 20:53:45 +01:00
MerryMage	9a0dc61efd	emit_x64_vector: Avoid recalculating addresses in EmitVectorTableLookup	2020-04-22 20:53:45 +01:00
Lioncash	3d465e2c36	A64: Implement SQXTN, SQXTUN, and UQXTN's scalar variants. We can implement these in terms of the vector variants.	2020-04-22 20:53:45 +01:00
Lioncash	4ff39c6ea8	A64: Implement SDOT and UDOT's (by element) variants. Gets all of the dot product instructions out of the way.	2020-04-22 20:53:45 +01:00
MerryMage	21df1fb539	emit_x64_vector: Don't load zero constant from memory in EmitVectorTableLookup	2020-04-22 20:53:45 +01:00
MerryMage	3bbcca8757	emit_x64_vector: Special-case is_defaults_zero && table_size == 2 in EmitVectorTableLookup	2020-04-22 20:53:45 +01:00
MerryMage	9cc00f900c	emit_x64_vector: Release registers when possible in EmitVectorTableLookup	2020-04-22 20:53:45 +01:00
MerryMage	a12afd1065	reg_alloc: Add the ability to Release an allocation early	2020-04-22 20:53:45 +01:00
MerryMage	e68bd3c6c1	emit_x64_vector: Special-case table_size == 1 in EmitVectorTableLookup	2020-04-22 20:53:45 +01:00
MerryMage	a4e1f8a63a	emit_x64_vector: SSE4.1 implementation of EmitVectorTableLookup	2020-04-22 20:53:45 +01:00
MerryMage	0c18b85c27	A64: Implement TBL and TBX	2020-04-22 20:53:45 +01:00
MerryMage	89d08c7d61	IR: Add VectorTable and VectorTableLookup IR instructions	2020-04-22 20:53:45 +01:00
MerryMage	0288974512	opcodes: Cleanup opcodes table. Remove T:: prefix from types. Add another column for a 4th argument.	2020-04-22 20:53:45 +01:00
Lioncash	d9fc6cf31f	A64: Implement SDOT and UDOT's vector variant	2020-04-22 20:53:45 +01:00
Lioncash	cb5e5c5d49	A64: Implement SADALP and UADALP. While we're at it we can join the code for SADDLP and UADDLP with these instructions, since the only difference is we do an accumulate at the end of the operation.	2020-04-22 20:53:45 +01:00
Lioncash	29f8b30634	A64: Implement SRSHL and URSHL. Implements both scalar and vector variants.	2020-04-22 20:53:45 +01:00
Lioncash	0efa2ce3b0	ir: Add opcodes for performing rounding left shifts	2020-04-22 20:53:45 +01:00
MerryMage	656ceff225	emit_x64_floating_point: Fix smallest normal check in EmitFPMulAdd	2020-04-22 20:53:45 +01:00
Lioncash	f3f60cd179	A64: Implement ISB. Given we want to ensure that all instructions are fetched again, we can treat an ISB instruction as a code cache flush.	2020-04-22 20:53:45 +01:00
Lioncash	be53e356a2	A64: Implement FCVTN{2}	2020-04-22 20:53:45 +01:00


1427 commits
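
The two 8-bit logical shift commits in the log (2cac6ad129 and 135107279d) describe replacing a per-element loop with a halfword shift followed by an AND against a precomputed mask. The following is a minimal scalar sketch of that idea, assuming a single 16-bit lane that holds two packed 8-bit elements. It is plain C++ rather than dynarmic's x64 emitter code, and every function name in it is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model: one 16-bit lane holding two packed 8-bit elements.
// This is NOT dynarmic's emitter code; it only illustrates "shift, then mask".

// Copies an 8-bit mask into both bytes of a halfword.
constexpr std::uint16_t ReplicateByte(std::uint8_t byte) {
    return static_cast<std::uint16_t>(byte * 0x0101u);
}

// Logical right shift of both packed bytes by `shift` (0..7).
// The halfword shift leaks bits of the high byte into the low byte;
// the mask clears exactly those leaked bits.
constexpr std::uint16_t LogicalShiftRight8(std::uint16_t lanes, unsigned shift) {
    const std::uint16_t mask = ReplicateByte(static_cast<std::uint8_t>(0xFFu >> shift));
    return static_cast<std::uint16_t>((lanes >> shift) & mask);
}

// Logical left shift of both packed bytes by `shift` (0..7).
constexpr std::uint16_t LogicalShiftLeft8(std::uint16_t lanes, unsigned shift) {
    const std::uint16_t mask = ReplicateByte(static_cast<std::uint8_t>(0xFFu << shift));
    return static_cast<std::uint16_t>((lanes << shift) & mask);
}

// Reference results: shift each byte independently, as a per-element loop would.
constexpr std::uint16_t ReferenceShiftRight8(std::uint16_t lanes, unsigned shift) {
    const std::uint8_t lo = static_cast<std::uint8_t>(lanes & 0xFF);
    const std::uint8_t hi = static_cast<std::uint8_t>(lanes >> 8);
    return static_cast<std::uint16_t>(((hi >> shift) << 8) | (lo >> shift));
}

constexpr std::uint16_t ReferenceShiftLeft8(std::uint16_t lanes, unsigned shift) {
    const std::uint8_t lo = static_cast<std::uint8_t>((lanes & 0xFF) << shift);
    const std::uint8_t hi = static_cast<std::uint8_t>((lanes >> 8) << shift);
    return static_cast<std::uint16_t>((hi << 8) | lo);
}

// Exhaustively compares the masked halfword shifts with the per-byte reference.
inline void SelfCheck() {
    for (unsigned value = 0; value <= 0xFFFFu; ++value) {
        const std::uint16_t lanes = static_cast<std::uint16_t>(value);
        for (unsigned shift = 0; shift < 8; ++shift) {
            assert(LogicalShiftRight8(lanes, shift) == ReferenceShiftRight8(lanes, shift));
            assert(LogicalShiftLeft8(lanes, shift) == ReferenceShiftLeft8(lanes, shift));
        }
    }
}
```

For example, LogicalShiftRight8(0xABCD, 4) gives 0x0A0C, the same as shifting 0xAB and 0xCD separately. In a vector implementation the same mask would presumably be broadcast across the whole register, so the operation needs no loop over elements.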