Commit graph

3403 commits

Author SHA1 Message Date
zmt00
4f08226e0e emit_x64_vector: Refactor pre-SSE4.1 min/max instruction replacements 2024-02-17 13:17:01 +00:00
zmt00
60a6092b65 tests/A64: Add non-paired min/max integer tests 2024-02-17 13:17:01 +00:00
zmt00
0adc972cd9 emit_x64_vector: Optimize VectorSignedSaturatedAbs 2024-02-13 18:46:42 +00:00
zmt00
cc9f00645d tests/A64: Add SQABS tests 2024-02-13 18:46:42 +00:00
Merry
69dc836977 backend/arm64: A64: Implement DumpDisassembly 2024-02-13 02:21:22 +00:00
Merry
4ae4750b5a emit_arm64_a64: Take into account currently loaded FPSR
Previously we just retrieved the last stored FPSR and used that when the guest asks for the current FPSR.
This is incorrect behaviour. We failed to take into account the current state of the host FPSR.

Here we take this into account. This bug was discovered via #795.
2024-02-13 02:19:55 +00:00
Merry
ba8192d890 dynarmic: 6.6.3 2024-02-10 19:34:16 +00:00
Merry
ee84ec0bb9 backend/x64: Reduce races on invalidation requests in interface
This situation occurs when RequestCacheInvalidation is called from
multiple threads. This results in unusual issues around memory
allocation which arise from concurrent access to invalid_cache_ranges.

There are several reasons for this:
1. No locking around the invalidation queue.
2. is_executing is not multithread safe.

So here we reduce any cache clear or any invalidation to raise a
CacheInvalidation halt, which we execute immediately before or
immediately after Run() instead.
2024-02-10 19:31:07 +00:00
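A minimal sketch of the idea in the commit above, under stated assumptions: every name here (`ToyJit`, `kCacheInvalidationHalt`, `InvalidatedRangeCount`) is hypothetical and is not dynarmic's actual interface. Requests from any thread only record a range and atomically raise a halt bit; the invalidation work itself happens on the executing thread, immediately before or immediately after the body of `Run()`. The mutex around the range list is an assumption of this sketch, not something the commit message states.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical halt bit; the real halt reason values are not shown in this log.
constexpr std::uint32_t kCacheInvalidationHalt = 1u << 0;

class ToyJit {
public:
    // May be called from any thread: record the range, then raise the halt bit.
    void RequestCacheInvalidation(std::uint64_t start, std::uint64_t length) {
        {
            std::lock_guard<std::mutex> lock{ranges_mutex_};
            pending_ranges_.emplace_back(start, length);
        }
        halt_reason_.fetch_or(kCacheInvalidationHalt, std::memory_order_release);
    }

    // Called only by the executing thread.
    void Run() {
        PerformRequestedInvalidation();  // immediately before execution
        // Guest code would execute here until a halt is observed.
        PerformRequestedInvalidation();  // immediately after execution
    }

    std::size_t InvalidatedRangeCount() const { return invalidated_range_count_; }

private:
    void PerformRequestedInvalidation() {
        // Atomically clear the halt bit and check whether it had been set.
        const std::uint32_t previous =
            halt_reason_.fetch_and(~kCacheInvalidationHalt, std::memory_order_acquire);
        if ((previous & kCacheInvalidationHalt) == 0) {
            return;
        }
        std::vector<std::pair<std::uint64_t, std::uint64_t>> ranges;
        {
            std::lock_guard<std::mutex> lock{ranges_mutex_};
            ranges.swap(pending_ranges_);
        }
        // A real JIT would discard the compiled blocks overlapping each range;
        // this sketch only counts the ranges it processed.
        invalidated_range_count_ += ranges.size();
    }

    std::atomic<std::uint32_t> halt_reason_{0};
    std::mutex ranges_mutex_;
    std::vector<std::pair<std::uint64_t, std::uint64_t>> pending_ranges_;
    std::size_t invalidated_range_count_ = 0;
};
```

In this sketch no requesting thread ever touches compiled code or needs to know whether the JIT is currently executing, which is the property the commit message is after.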
Wunkolo
6d0995c948 tests/A64: Add negative-shift elements to USHL 2024-02-10 11:38:17 +00:00
Wunkolo
18717d216c emit_x64_vector: AVX512+GFNI implementation of EmitVectorLogicalVShift8 2024-02-10 11:38:17 +00:00
zmt00
f5df599e9d tests/A64: Convert recent tests to oaknut 2024-02-10 11:32:07 +00:00
zmt00
0785a6d027 ir: Implement FPMulSub 2024-02-10 11:31:54 +00:00
Wunkolo
a32e6f52ef tests/A64: Use oaknut for CLZ assembly 2024-02-06 18:15:34 +00:00
Wunkolo
eb5eb9cdf7 emit_x64_vector: GFNI implementation of EmitVectorCountLeadingZeros8 2024-02-06 18:15:34 +00:00
Wunkolo
1e5e7a7ae6 tests/A64: Add CLZ vector unit-tests 2024-02-06 18:15:34 +00:00
Merry
75235ffedb emit_x64_data_processing: Exclude edge case from lea path in EmitSub
-0xffff'ffff'8000'0000 = 0x0000'0000'8000'0000 which is not a representable displacement
2024-01-31 01:41:25 +00:00
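The arithmetic in the commit above can be checked with a short sketch. The helper name `LeaDisplacementForSub` is hypothetical and is not part of dynarmic. It assumes that `dst = src - imm` is emitted as `lea dst, [src + disp]` with `disp = -imm`, and that an x64 displacement is a signed 32-bit value.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: returns the displacement for `lea dst, [src + disp]`
// that emulates `dst = src - imm`, or std::nullopt when the negated immediate
// does not fit in a signed 32-bit displacement.
std::optional<std::int32_t> LeaDisplacementForSub(std::uint64_t imm) {
    // Unsigned subtraction wraps, so this is two's complement negation.
    const std::uint64_t negated = std::uint64_t{0} - imm;
    const std::int64_t as_signed = static_cast<std::int64_t>(negated);
    if (as_signed < std::numeric_limits<std::int32_t>::min() ||
        as_signed > std::numeric_limits<std::int32_t>::max()) {
        return std::nullopt;
    }
    return static_cast<std::int32_t>(as_signed);
}
```

For `imm = 0xffff'ffff'8000'0000` the negated value is `0x0000'0000'8000'0000`, which is one more than the largest signed 32-bit value, so the helper returns `std::nullopt` and a caller would have to fall back to a plain `sub`.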
Merry
24bf921ff9 constant_propagation_pass: x + 0 == x 2024-01-30 23:10:23 +00:00
Merry
ca2cc2c4ba emit_x64_data_processing: Emit lea where possible in EmitAdd and EmitSub 2024-01-30 22:59:41 +00:00
Merry
30f1a3c628 Avoid emplace. 2024-01-30 17:32:50 +00:00
Merry
3131d6c2db externals: Update oaknut to 2.0.2
Merge commit '48dcc318c977ce17373f3710d04d96f9b05860a0'
2024-01-30 12:28:40 +00:00
Merry
48dcc318c9 Squashed 'externals/oaknut/' changes from 9d091109d..6b1d57ea7
6b1d57ea7 oaknut: 2.0.2
143a3dcbe oaknut: github: Build on x86-64
496ff1b54 oaknut: tests: Only run arm64-specific tests on arm64
8395b79cf cmake: make tests optional

git-subtree-dir: externals/oaknut
git-subtree-split: 6b1d57ea7ed4882d32a91eeaa6557b0ecb4da152
2024-01-30 12:28:40 +00:00
Merry
5e8892c5b7 dynarmic: 6.6.2 2024-01-30 00:40:49 +00:00
Merry
213fe7a452 externals: Update xbyak to v7.05
Merge commit 'fdf626b74f35deedce0e6196c36b8c9f846c038a'
2024-01-30 00:39:19 +00:00
Merry
fdf626b74f Squashed 'externals/xbyak/' changes from a1ac3750f..2ce465bbc
2ce465bbc Merge branch 'dev'
0b3f360eb v7.05
66f22b7a4 update doc
13ee4e19f use opSetCC for setCC
383866b42 use opMR with APX
d6e6e6f85 tweak
a7b02ac80 RAO_INT supports APX
26840492c use Address.immSize
e2b40a33e refactor Address class
e1b6896c2 Merge branch 'dev'
c0888cc45 v7.04
7d9c82835 refactor rex
b3e27734b apx supports 0x0f opcode with rex2
2e7b62d78 bswap supports apx
2e93baa6a Merge branch 'dev'
e1864642c unify getMap and getMMM
0750873b7 T_MAP3 is not necessary
ee4984222 T_MAP1 is not necessary
5c95842be tweak
8c44467af add no_flags sample
523cf1ed0 fix comment of sample/ccmp.cpp
5438fc69d Merge branch 'dev'
ee26c094e v7.03
691ce361a [doc] update dfv
8d0e78146 set 0 for the default value of dfv
2255aea0d [doc] add ccmpSCC and ctestSCC
b5e115284 add sample/ccmp.cpp
bacd8d34b add sample/zero_upper.cpp
f17cb9d6b Merge branch 'dev'
c9ce3f8f6 v7.02
3427be298 unify opAESKL and opSHA
bfd14244a update doc
e690a2a47 sha* supports apx
c9765588f Merge branch 'dev'
903f7c02e v7.01
54a1f07f9 update cpuid by sde
223ddfaf8 add detection of sse4a/clwb
ba943b5b6 reorder cpu detection
30c362df5 Merge branch 'Sonicadvance1-missing_checks' into dev
02bc84ad8 renumber of tSSE4a, tCLWB
84fe3ab9d update doc
90fc0151c add encodekey{128,256}
440972b88 add detection of KEYLOCKER, KEYLOCKER_WIDE
68a30b91f add detection of AESKLE, WIDE_KL
e2d36c662 fix detection of AVX10
48551f5cc add aesenc{128,256}kl, aesencwide{128,256}kl
d9c7c992f add aesdecwide{128,256}kl
cd5231de0 add aesdec256kl
fcb3d0dbb add aesdec128kl
85709ace7 move opKmov in private
406199e7a Support cpuid CLWB
1214aad95 Adds back missing SSE4a check
5315658ad add detection of avx10/apx_f
835f6d2e6 Merge pull request #180 from Tachi107/fix-32bit-tests
650b241e3 test: only run apx test when BIT=64
016ce86b6 [doc] add a blank line
df0ebc740 v7.00
1ec2adbbb Merge branch 'apx'
da1818592 update doc
bec145ba9 amx supports apx
944438195 add tests of kmov*
bd85d108c kmov* supports apx
93bd6a0b7 rename T_VEX to T_APX
b063d276f add misc tests
6d21c7389 add evex tests
05a66d2c0 support V4 in evex
33017d4fb support V4 in evex
e228e737d prepare evex extension of evex
45eca7987 update doc
98ce73bb2 add cfcmov tests
e2d9685af add cfcmov
a4ec97ca9 add tests of ctestscc
45711c502 add ctestscc
a1f6c14cc add alias of dfv
facb052a1 avoid r15 on 32-bit mode
c1c15848c remove warnings
be319626b add ccmpscc with imm
c4d05037e add ccmpscc
17f7d279c testing ccmpb
ff01b1e20 setcc supports apx
25ceea2ef add 3-op cmovcc
2f8cfb9a8 CMPccXADD supports APX
a9310deac add tests of push/pop
ec2881bfd push/pop support rex2
114152fed add push2/pop2
1aefdb649 support jmpabs
77eca6d0d add tests of 3-op shift
5e54ffdfa add 3-op shift
426814c50 check v instead of r
3f3d6095c disable rol/ror to support NF
ee572b7eb add tests of ror/rol
186d63ad9 add tests of shr/sar
26be71a12 2-op shl supports apx
83f5bd25e remove some warnings
e43d99762 add crc32 tests
92153b6f8 crc32 supports apx
d7ca6a2dd split T_F2 from T_66|T_F3
fb1fc738f tweak
389d73347 movbe supports apx and append test
3636cde22 tests of 1-byte opcode with rex2
1dd020126 check whether or not it is a 1-byte opcode
083822b52 movdiri supports apx
6703d4344 movdir64b supports apx
ed5dc3516 add tests of shld/shrd
b01c0ed40 shld/shrd support apx
c51c4a6f7 add tests of lzcnt and tzcnt
2cc22ea1b lzcnt and tzcnt support apx
baddec288 tweak
1d3a19a50 update doc of apx
273d8d5b6 add 3-op imul with T_zu
50875294c add tests of 2-op imul
d20142d01 add T_zu
eb9de1392 2-op imul supports apx
dba2c174f add 2op neg/not_
95ad5927f add tests of imul/mul/neg/not_ with 1-op
790afb745 add tests of idiv
045ef31a3 add tests of div
1d7e2a6bb div supports apx
e5fe58231 remove warning on 32-bit
66b3a3042 check all regs of NF
c7dba88df add dec test
f55f596ad add inc test
6f6423899 2-op inc/dec
95c0c4e6f tweak inc/dec
f5fda7ace change detection of pp with type
a18e5aeb5 rorx supports apx
5bb8461b4 blsmsk, blsr support apx
a493dc7b4 blsi supports apx
7c1accedc sarx/shlx/shrx support apx and add tests
125d8e740 test bzhi with apx
78be5afd1 add tests of bextr with apx
e9603b79d bextr supports apx
3a85aadc6 pdep, pext support apx
16f1a5d8a mulx supports apx
82529af93 andn supports APX
637ad7a4a add test of NF
e23f5ad75 fix type for adc
1bcc83303 3-op add supports T_nf
5d46b950b the type of all type is uint64_t
0a8ea9edf fix type
b1f0fef4d add test of 3op apx
9b21727ba remove space
6fa1b4a90 reorder of opRO
2d1f229a0 simplify condR
b220be972 simplify opRO
24b71a1ce use Reg instead of Operand if possible
de1353448 rename opGen with opSSE
4cd8e8eac refactor opGpr as opRRO
01d756917 rename
5037120f7 replace old rex with rexA
45fe94fdd rename opLoadSeg2 with opLoadSeg
253f800bc tweak
4f3939d92 rename opModM2 with opModM
fa731a27c rename opModR2 with opModR
e5db7d0e4 rename opModRM2 to opModRM
dc20fd09b use opModRM2
d4da1561b rename opR_ModM2 with opR_ModM
ef3665274 use opR_ModM2
e5b20e5a5 use opModM2
104941db2 use opModM2
6ae769f21 rename opROO2 with opROO
1521cb7ce rename opGen2 to opGen
f9c6cb5dc all opGen are replaced with opGen2
249d6978a use opGen2
81ae48922 use opGen2
b9e4bb2fc always put prefix as byte code
3374a158f use opGen2
719f81f45 use opGen2
8d037ebd6 use opGen2
6f8bc28e2 use opGen2
303876cac use opGen2
f0b49752a rewrite opMovXMM
5d4c48ffd rewrite opMMX
189c3488b use opMMX2
1361d0946 use opMMX2
32cafcc61 tweak
cf1cfd6c4 add temporary converting code
433bf29e3 replacing opModR with opModR2
ba1d07ed1 senduipi uses opModR2
646da9750 use opModR2 for rdrand, rdseed, movq
ccad6cecd use opModR2 for movdq2q, movq2dq
3c21754b9 use opModR2 for movd, movmskps
4718643ef use opModR2 for bswap, maskmovq, pmovmskb
e1a148707 try to use opModR2
220a5def7 split avx_type_def.h in gen/
87b8c8ed2 adox passes the test
bd8477292 fix detection of adox without apx
6b19515eb add adcx, adox with APX
77d6acea6 increase the room of type
710e39bfe add test of r, r/m
ea9cd9ade tweak
057f09c5b rename T_NF to T_nf
57a0c1935 support NF=1
8f49739da remove cmp of 3-op
e3310344c [doc] about APX
cdc2533c1 add test of adc/3op
9c6b81c4d return value on nothrow mode
8d524b4a4 add op(r, r/m, imm) and op(r, r/m, r/m)
4c62d1fdc test adc2(r, op, mem) and adc2(r, mem, op)
6f593a1cb test of adc2 (3op APX)
61addb9d9 simplify opMIB
575c447f1 remove rex2p
a95bd9cc5 add test of adc/add/and_/cmp/or_/sbb/sub/xor_
f7d3c17e8 tweak
d7a7ea912 refactoring rex
acd797139 use opModM instead of opMIB
ad3334ba6 add modRM with rex2
059d115b5 add test of apx.cpp
873c93a51 add test of regs of apx
e25b1cd62 [not tested] add(r1, r2) with rex2
eb118504d remove warning of VC
6c580b1f7 fix cvt test for extended r16-r31
981fa6f05 add r16 - r31
244623812 Merge branch 'dev'
aafe3cb62 build(cmake): bump minimum required to version 3.5
76d7477d7 Merge branch 'dev'
151c8ab04 v6.73
dd66cfb76 add tests of avx-vnni-int{8,16}
4a6132d66 update cpuid list
bea25541a add detection of AVX_VNNI_INT16
d9e76b1c6 add tests of SM4
e1c4c360b add SM4
d79717dbe add tests of SM3
48f8dbeb6 add SM3
5473d3933 vsha512* check regs
9b3687a68 add detection of SHA512, SM3, SM4
ecdd01ee5 mov crypt test in 64-bit mode
c4550b6a9 sde 9.24.0
5762819de add vsha512{msg1, msg2, rnds2}
3255d606a Merge branch 'dev'
322665e72 v6.72
ad178a219 add xabort/xbegin/xend
0924ff4aa Merge branch 'dev'
8980934c1 v6.71
76292b310 add SystemInfo class for win
3e42709ab ignore space and cr
66b2768a6 disable wrong detection of gcc
1855985e1 remove / for mingw64
5bdccc0b8 64bit only for mingw64
33882d0a0 use sysconf(_SC_PAGESIZE) instead of const value 4096 on linux
33075c2bd add link to other projects
60e71402e reorder
79854aa08 add new cpus
5921e270c update cpuid
ce083a0dc Merge branch 'dev'
b538485f3 v6.70
461dd34ee update doc
2149c79e3 add test of alias of vpclmulqdq
2c59c5c91 add alias of vpclmulqdq
729ae4aa3 fix alias of pclmulqdq
3c248d68a define XBYAK_CONSTEXPR if XBYAK_ONLY_CLASS_CPU is defined
c0a932d7b Merge remote-tracking branch 'origin/dev'
ef502b5b4 update doc
ba3db4730 update version
c0d7a704f v6.69.2
c535f4737 update cpuid test list
683249232 change the order of args of diff
e81b95583 Merge branch 'Wunkolo-constexpr-typet' into dev
ab3f40587 Allow constexpr TypeT `operator|`
ad5276fa4 Merge pull request #172 from orz--/patch-1
b4d54f6e1 Update changelog.md
58642e0cd Merge branch 'dev'
3b13d068b v6.69.1
d700f6c35 add detection of xsave
740dff2e8 Merge branch 'dev'
dc048a04c v6.69
ad0dfffd2 add senduipi/stui/testui/uiret
e78f1121b add clui
23b40331a add detection of uintr
98a0f1924 remove warning of sign/unsigned
0afd71a27 add detection of SERIALIZE
363bbaa57 sample shows cpu cache info for AMD
edce72709 Cpu supports AMD

git-subtree-dir: externals/xbyak
git-subtree-split: 2ce465bbca46e92dde9c44bbe7940fd7f70e3b97
2024-01-30 00:36:49 +00:00
Merry
85177518d7 emit_x64_vector: Improve AVX512 implementation of EmitVectorTableLookup128 2024-01-30 00:29:12 +00:00
Merry
0f20181a45 emit_x64_vector: Fix AVX-512 implementation of EmitVectorTableLookup64 2024-01-30 00:29:12 +00:00
Alexandre Bouvier
1e1ba4e0c2 cmake: prefer system oaknut 2024-01-29 23:43:13 +00:00
Merry
2ee3eacd01 emit_x64_crc32: Correct use of x64 crc32 instruction
The CRC32 r32, r/m64 variant does not exist, but CRC32 r64, r/m64 does what we want.
2024-01-29 22:42:17 +00:00
zmt00
314ab7a462 emit_x64_vector: Implement PairedMinMax{Lower}8 2024-01-28 18:56:42 +00:00
zmt00
46a99991e2 tests/A64: Add {U,S}MINP.B, {U,S}MAXP.B tests 2024-01-28 18:56:42 +00:00
Merry
ca0e264f4f dynarmic: 6.6.1 2024-01-28 17:03:17 +00:00
Merry
ac9003fb78 externals: Update oaknut to 2.0.1
Merge commit 'a37f3673f8ca59a0c7046616247db1c6bc00e131'
2024-01-28 17:02:58 +00:00
Merry
a37f3673f8 Squashed 'externals/oaknut/' changes from d0488d932..9d091109d
9d091109d oaknut: 2.0.1
7f3e9f600 oaknut: Support single argument constructor for CodeGenerator again

git-subtree-dir: externals/oaknut
git-subtree-split: 9d091109deb445bc6e9289c6195a282b7c993d49
2024-01-28 17:02:37 +00:00
Merry
70984e0c80 dynarmic: 6.6.0 2024-01-28 16:29:31 +00:00
Merry
bbc058c76b backend/arm64: Update for oaknut 2.0.0.
Also respect DYNARMIC_ENABLE_NO_EXECUTE_SUPPORT.
2024-01-28 16:19:33 +00:00
Merry
99c0a73f91 Squashed 'externals/oaknut/' changes from c24f918e5..d0488d932
d0488d932 oaknut: 2.0.0
40ad78bbf oaknut: Implement DualCodeBlock and related support
9f131cfb5 oaknut: add configuration for standalone installation
69799b43c oaknut: Test building for Android on CI
1d51f5512 oaknut: 1.2.2
918bd94f0 oaknut: Eliminate -Wconversion warnings
316d8869e oaknut: Fix edgecases in MOVP2R on +/-4GiB boundary
d8634eaa1 oaknut: Fix page boundary error in ADP
d0ca9a24e oaknut: Update README examples for CPU feature detection
dbeec268b oaknut: feature_detection_freebsd: Warn about incompatibility with earlier FreeBSD versions
86e5386e2 oaknut: feature_detect: Support NetBSD
df4cf2d48 oaknut: feature_detect: Support OpenBSD
99dfff25a oaknut: feature_detection: Read ID registers
319b3d2c9 oaknut: Add basic CPU feature detection
23e9ddb4c oaknut: CI: Don't run slow tests on OpenBSD
734f1bdb4 oaknut: CI: Use up-to-date qemu
f462c9774 oaknut: CI: Build on OpenBSD
19cd42204 oaknut: code_block: Add NetBSD and OpenBSD support
18b86a3ec oaknut: SystemReg: Add more EL0 accessible registers
53c43bf0c oaknut/tests: Reduce iterations for MOVP2R
cc37df19e oaknut: Test on FreeBSD
a66b32d26 oaknut: Fix crossing sign boundary in PageOffset
206468d72 oaknut: CI: Add macos-arm64 build
e6eecc3f9 oaknut: 1.2.1
4252d8f4a oaknut: CMakeLists: Warnings are errors on MSVC
408eed65f oaknut: arm64_encode_helpers: remove unreachable code
bfc8eedfb oaknut: arm64_encode_helpers: p maybe unused
ff4456eca oaknut: Avoid negation of unsigned values
b4ac8fd6c oaknut: Fix MOV for applications of MOVN
0575cadc4 oaknut: Disable certain functionality where absolute addressing is not available
394a3c8f0 oaknut: Appease MSVC
011183670 oaknut: 1.2.0
e83c9f327 oaknut: Add VectorCodeGenerator
5eb122cc5 oaknut: Tidy up public header
45c5a7b25 oaknut: Fix clang-format errors
36243256f oaknut: Add `const` qualifier to `AddrOffset` ctor
4af500cb5 oaknut: Add `ptr` accessor to `Label`
bccb06669 oaknut: CodeGenerator const correctness
da0590a86 oaknut: github: Update package repositories

git-subtree-dir: externals/oaknut
git-subtree-split: d0488d9320ae673167dd9117223e3453d5ff102f
2024-01-28 14:56:59 +00:00
Merry
6f3b6d35f0 externals: Update oaknut to 2.0.0
Merge commit '99c0a73f91e7a5e66db686f29e158e99193a043d' into dev/dual_code_block
2024-01-28 14:56:59 +00:00
Merry
05f38d1989 A32: Implement VCVT{A,N,P,M} (ASIMD) 2024-01-28 11:21:08 +00:00
Merry
c9fcb695a4 A32: Correct function naming convention for VRINT{N,X,A,Z,M,P} (ASIMD) 2024-01-28 11:10:58 +00:00
Merry
c67f38b57e backend/arm64: FPVectorRoundInt{32,64}: FPCR comparisons should be made with fpcr_controlled when under scope of MaybeStandardFPSCRValue 2024-01-28 10:55:59 +00:00
Merry
f8e38809e9 A32: Implement VRINT{N,X,A,Z,M,P} (ASIMD) 2024-01-28 10:19:15 +00:00
Steveice10
8398d7ef7e arm64: Fix compiling under MSYS2 CLANGARM64. 2024-01-27 08:54:07 +00:00
Wunkolo
00c6c00e86 Refactor Xmm{B}Const to {,B}Const 2024-01-23 19:24:56 +00:00
Wunkolo
917335ae8a block_of_code: Add XmmBConst
This is a redo of https://github.com/merryhime/dynarmic/pull/690 with a
much smaller footprint, introducing a new pattern while avoiding the
initial bugs
(5d9b720189)

**B**roadcasts a value as an **Xmm**-sized **Const**ant. Intended to
eventually encourage more hits within the constant-pool between vector
and non-vector code.
2024-01-23 19:24:56 +00:00
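To illustrate what "broadcast" means in the commit above, here is a hedged sketch that replicates an element across a 128-bit constant, represented as two 64-bit halves. `Broadcast128` is a hypothetical name and does not reflect the actual signature of `XmmBConst`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: replicate the low `esize_bits` bits of `value` across
// all 128 bits and return the result as {low half, high half}.
std::array<std::uint64_t, 2> Broadcast128(std::size_t esize_bits, std::uint64_t value) {
    assert(esize_bits == 8 || esize_bits == 16 || esize_bits == 32 || esize_bits == 64);
    std::uint64_t lane = value;
    if (esize_bits < 64) {
        // Drop any bits above the element size.
        lane &= (std::uint64_t{1} << esize_bits) - 1;
    }
    // Double the filled width until one 64-bit lane is full.
    for (std::size_t width = esize_bits; width < 64; width *= 2) {
        lane |= lane << width;
    }
    return {lane, lane};
}
```

For example, `Broadcast128(8, 0x80)` yields `0x8080808080808080` in both halves. Two call sites that broadcast the same element produce the same 16 bytes, which is presumably what allows a constant pool to serve both from one entry.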
Wunkolo
b02292bec7 block_of_code: Rename MConst to XmmConst
`MConst` is refactored into `XmmConst` to clearly communicate the
addressable space of the newly allocated 16-byte memory constant.
2024-01-23 19:24:56 +00:00
zmt00
ba9009abd8 emit_x64_vector: Optimize VectorSignedAbsoluteDifference 2024-01-23 18:28:19 +00:00
zmt00
7e66e082fd tests/A64: Add SABD tests 2024-01-23 18:28:19 +00:00
Merry
331b41bc93 decoder/arm: Improve performance of arm decoding by adding LUT 2024-01-13 15:04:33 +00:00
zmt00
1c97fd5ec5 emit_x64_vector: Implement PairedMinMax{Lower}16 2024-01-10 12:23:28 +00:00
zmt00
77f1f0376f tests/A64: Add {U,S}MINP.H, {U,S}MAXP.H tests 2024-01-10 12:23:28 +00:00