Commit graph

2651 commits

Author SHA1 Message Date
MerryMage
f2345c1590 A64/system: Implement MSR/MRS for NZCV 2021-02-01 19:52:49 +00:00
LC
8c09da666a
Merge pull request #568 from bunnei/hook-isb-a32
A32: Add hook_isb option.
2021-01-29 06:52:09 -05:00
bunnei
de389968eb A32: Add hook_isb option. 2021-01-28 20:47:39 -08:00
MerryMage
0f27368fda A64: Add hook_isb option 2021-01-26 23:41:21 +00:00
MerryMage
3806284cbe emit_x64{,_vector}_floating_point: Fix non-FMA execution
Avoid repeated calls to GetArgumentInfo
2021-01-02 20:40:32 +00:00
MerryMage
6023bcd8ad emit_x64_data_processing: Fix signed/unsigned warning 2021-01-02 20:12:48 +00:00
MerryMage
c15917b350 backend/x64: Add further Unsafe_InaccurateNaN locations 2021-01-02 20:12:48 +00:00
MerryMage
f9ccf91b94 Add Unsafe_InaccurateNaN optimization to all fma instructions 2021-01-02 17:22:50 +00:00
MerryMage
8c4463a0c1 emit_x64_data_processing: EmitSub: Use cmp where possible 2021-01-01 19:37:47 +00:00
MerryMage
e926f0b393 emit_x64_data_processing: Minor optimization for immediates in EmitSub 2021-01-01 13:35:01 +00:00
MerryMage
eeeafaf5fb Introduce Unsafe_InaccurateNaN 2021-01-01 07:18:05 +00:00
ReinUsesLisp
4a9a0d07f7 backend/{a32,a64}_emit_x64: Add config entry to mask page table pointers
Add config entry to mask out the lower bits in page table pointers.
This is intended to allow users of Dynarmic to pack small integers
inside pointers and update the pair atomically without locks.
These lower bits can be masked out due to the expected alignment in
pointers inside the page table.

For the given usage, using AND on the pointer acts the same way as a
TEST instruction. That said, when the mask value is zero, TEST is still
emitted to keep the same behavior.
2020-12-29 19:16:46 +00:00
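A minimal sketch of the pointer-packing idea this commit describes, assuming 8-byte-aligned page pointers. The names `kTagMask`, `PackEntry`, `EntryPointer`, and `EntryTag` are hypothetical and are not part of Dynarmic's API; the real config entry and mask width may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mask: with 8-byte-aligned pages the low 3 bits are always zero,
// so they can carry a small integer.
constexpr std::uintptr_t kTagMask = 0x7;

// Pack a small integer into the unused low bits of an aligned pointer.
inline std::uintptr_t PackEntry(void* page, unsigned tag) {
    const auto raw = reinterpret_cast<std::uintptr_t>(page);
    assert((raw & kTagMask) == 0);  // the pointer must be aligned
    assert(tag <= kTagMask);        // the tag must fit in the masked bits
    return raw | tag;
}

// Recover the pointer by masking out the low bits (the AND the commit refers to).
inline void* EntryPointer(std::uintptr_t entry) {
    return reinterpret_cast<void*>(entry & ~kTagMask);
}

// Recover the packed integer.
inline unsigned EntryTag(std::uintptr_t entry) {
    return static_cast<unsigned>(entry & kTagMask);
}
```

Because the AND already yields zero for an entry whose pointer part is null, the masked result can be checked for "no page" directly, which is the sense in which AND stands in for TEST here. Storing the packed word in a `std::atomic<std::uintptr_t>` is one way a user could update pointer and integer together without a lock.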
MerryMage
42059edca4 decoder_detail: Fix bit_position and one unused warnings in GetArgInfo 2020-12-28 23:34:23 +00:00
MerryMage
b47e5ea1e1 emit_x64_data_processing: Use BMI2 shifts where possible 2020-12-28 22:42:51 +00:00
ReinUsesLisp
ba6654b0e7 location_descriptor: Fix compare operator for single stepping
Compare `single_stepping` with the other's value instead of comparing it
with the local value.
2020-12-01 09:11:40 +00:00
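The class of bug fixed here can be illustrated with a simplified, hypothetical descriptor (the real `LocationDescriptor` carries more state, and its fields may be named differently). Comparing a member with itself is always true, so two descriptors that differ only in `single_stepping` would wrongly compare equal.

```cpp
#include <cstdint>
#include <tuple>

// Hypothetical, simplified stand-in for a location descriptor.
struct Descriptor {
    std::uint64_t pc;
    bool single_stepping;

    bool operator==(const Descriptor& o) const {
        // The buggy form would read `single_stepping == single_stepping`,
        // which never inspects `o`. Comparing tuples of both sides avoids that.
        return std::tie(pc, single_stepping) == std::tie(o.pc, o.single_stepping);
    }

    bool operator!=(const Descriptor& o) const {
        return !(*this == o);
    }
};
```

Building both sides with `std::tie` keeps the left and right operands visibly symmetric, which makes this kind of slip harder to introduce.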
LC
96e9075804
Merge pull request #563 from Wunkolo/patch-1
emit_x64_vector: Fix ArithmeticShiftRightByte zero_extend constant
2020-11-09 20:31:21 -05:00
Wunk
3e932ca55d
emit_x64_vector: Fix ArithmeticShiftRightByte zero_extend constant
Should be shifting in _bytes_ of `0x80`, not bits.
2020-11-09 09:47:51 -08:00
Wunkolo
ec52922dae emit_x64_vector: Use explicit 64-bit mask constant
Exchange `~0ull` with `0xFFFFFFFFFFFFFFFF` when generating
the `zero_extend` constant.
2020-11-07 15:29:12 +00:00
Wunkolo
490160ef43 emit_x64_vector: GFNI implementation of ArithmeticShiftRightByte
The bit-matrix is generated up-front and added to the constant-pool.
I'm using an embedded 64-bit broadcast here (m64bcst), which is the particular
EVEX-encoded version of the instruction with AVX512VL+GFNI.

If it ever really matters, then we would ideally detect specific host
features like bare-GFNI and specific subsets of AVX512 and emit
the assembly based on that rather than by the entire Icelake uarch.
2020-11-07 15:29:12 +00:00
Wunkolo
7df235aefb emit_x64_vector: GFNI implementation of EmitVectorLogicalShiftLeft8
Same principle as EmitVectorLogicalShiftRight8. An 8x8 Galois identity
matrix is bit-shifted to allow for arbitrary 8-bit-lane shifts.
2020-11-07 15:29:12 +00:00
Wunkolo
5cc646ffed emit_x64_vector: GFNI implementation of EmitVectorLogicalShiftRight8
Bitshifts of the GFNI identity matrix generate a new matrix that
applies lane-wise bitshifts as well. This allows for a fast
single-instruction implementation of a byte-lane bitshift.
2020-11-07 15:29:12 +00:00
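The matrix trick behind the two byte-lane shift commits above can be checked with a scalar model. This is a sketch, not the emitter code: `Gf2p8AffineByte` is a hypothetical helper that models what `gf2p8affineqb` computes for one byte with an immediate of zero (result bit `i` is the parity of matrix byte `7 - i` ANDed with the input), and `ShiftRightMatrix` / `ShiftLeftMatrix` are illustrative names.

```cpp
#include <cstdint>

// 8x8 identity matrix in the layout gf2p8affineqb expects.
constexpr std::uint64_t kIdentity = 0x0102040810204080ULL;

// Scalar model of gf2p8affineqb on a single byte (imm8 assumed to be 0).
inline std::uint8_t Gf2p8AffineByte(std::uint64_t matrix, std::uint8_t x) {
    std::uint8_t result = 0;
    for (int i = 0; i < 8; ++i) {
        // Row that produces result bit i.
        const auto row = static_cast<std::uint8_t>(matrix >> (8 * (7 - i)));
        // Parity (XOR of all bits) of row & x.
        std::uint8_t v = row & x;
        v ^= v >> 4;
        v ^= v >> 2;
        v ^= v >> 1;
        result |= static_cast<std::uint8_t>((v & 1) << i);
    }
    return result;
}

// Shifting the identity by whole bytes moves its rows, which shifts the
// bits of every input byte in the opposite direction.
inline std::uint64_t ShiftRightMatrix(unsigned n) {
    return n < 8 ? kIdentity << (8 * n) : 0;
}

inline std::uint64_t ShiftLeftMatrix(unsigned n) {
    return n < 8 ? kIdentity >> (8 * n) : 0;
}
```

With the unshifted identity the model returns its input unchanged; rows shifted out of the 64-bit word become zero, which is what clears the bits vacated by the lane shift.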
MerryMage
46f96904db decoder_detail: Add check for N==0 to GetArgInfo 2020-10-11 22:12:21 +01:00
Wunkolo
6bb49726f4 emit_x64_vector: GFNI+SSSE3 implementation of EmitVectorReverseBits
Performs a full 128-bit bit-reversal using only two instructions.

First by reversing all the bits of each byte using a Galois matrix
multiplication (vgf2p8affineqb, Icelake), and then by reversing the bytes
themselves (pshufb, SSSE3).
2020-10-02 05:56:59 +01:00
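The two-step decomposition in this message (reverse the bits inside each byte, then reverse the byte order) can be sketched in scalar code. `Vec128`, `ReverseByteBits`, and `ReverseBits128` are hypothetical names; the sketch assumes index 0 holds the least-significant byte and uses plain bit masks where the commit uses `vgf2p8affineqb` and `pshufb`.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// 128-bit value as 16 bytes; index 0 is the least-significant byte.
using Vec128 = std::array<std::uint8_t, 16>;

// Step 1 (per byte): swap nibbles, then bit pairs, then adjacent bits.
inline std::uint8_t ReverseByteBits(std::uint8_t b) {
    b = static_cast<std::uint8_t>(((b & 0xF0) >> 4) | ((b & 0x0F) << 4));
    b = static_cast<std::uint8_t>(((b & 0xCC) >> 2) | ((b & 0x33) << 2));
    b = static_cast<std::uint8_t>(((b & 0xAA) >> 1) | ((b & 0x55) << 1));
    return b;
}

// Step 2: reverse the order of the bytes. Together the two steps reverse
// all 128 bits.
inline Vec128 ReverseBits128(Vec128 v) {
    for (auto& b : v) {
        b = ReverseByteBits(b);
    }
    std::reverse(v.begin(), v.end());
    return v;
}
```

Bit 0 of the input ends up as bit 127 of the output, and applying the function twice restores the original value.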
ReinUsesLisp
eb00bea1ff backend/x64/exception_handler_posix: Fix signal stack memory leak in SigHandler
std::malloc was being called inside SigHandler's constructor without a
std::free. This doesn't really matter as SigHandler is used as a
singleton and the OS will reclaim that memory. That said, properly
freeing memory keeps -fsanitize=address quiet.
2020-10-02 05:56:07 +01:00
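One common way to pair the allocation with its release is to tie both to an object's lifetime. The following is a sketch under that assumption, not the code from this commit: `SignalStack` is a hypothetical class, and the actual fix may simply call `std::free` in the `SigHandler` destructor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical owner of a malloc'd alternate signal stack.
class SignalStack {
public:
    explicit SignalStack(std::size_t size)
        : size_(size), memory_(std::malloc(size)) {
        assert(memory_ != nullptr);  // a real handler would report the failure
    }

    // Releasing the buffer here is what keeps -fsanitize=address quiet.
    ~SignalStack() { std::free(memory_); }

    // Non-copyable: exactly one owner frees the buffer.
    SignalStack(const SignalStack&) = delete;
    SignalStack& operator=(const SignalStack&) = delete;

    void* data() const { return memory_; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    void* memory_;
};
```

A handler that holds such an object as a member frees the stack when the handler itself is destroyed, including at normal process exit for a function-local static singleton.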
MerryMage
80adb289d0 print_info: Use std::nullopt instead of {} 2020-09-22 18:40:00 +01:00
Lioncash
7afcdabf11 externals: Update catch to v2.13.1
Keeps the library up to date.
2020-09-19 15:02:22 -04:00
Lioncash
63802395c7 externals: Update fmt to v7.0.3
Keeps the library up to date.
2020-09-19 14:25:31 -04:00
Lioncash
d4c6fa3122 Squashed 'externals/fmt/' changes from 9bdd1596c..cd4af11ef
cd4af11ef Update version
1ebc2f7cc Bump version
f4c997062 Fix changelog
72920ba30 Update changelog
0907c08ae Fix handling of default alignmment with locale (#1801)
37c8f4eaf Don't use 128 bit integers with clang-cl (#1800)
eaaaec999 Workaround a bug in msvc
ccf8561cb Workaround broken numeric_limites, part 2 (#1787)
0cc73ebf7 Report error on missing named argument (#1796)
33efc3c94 Fix handling of iterators in locale-specific formatting (#1782)
b9d749095 Update version
86b63bb71 Bump version
cbf6be960 Update changelog
229ee9b46 Workaround broken numeric_limits (#1725)
2b7a146fa Fix a regression in handling digit separators (#1782)
89d0c7124 Fix compatibility with CMake 3.4 (#1779)
f19b1a521 Update version
5c67fefb2 Fix a changelog entry
1d2a556e1 Fix undefined reference error
04c9b62fb Merge release branch
6be6762e5 Fix date
f1dd2eb3c Bump version
fbf3b943c Workaround a bug in gcc
a29a01d30 Fix docs
9f0b3afb7 Bump version in namespace
86b2f99f8 Fix the docs
c472ff12d Update version
5173a76ba Update version
1614af352 Minor corrections in the changelog
569a9b3a7 Bump version
4e7e3c65a Update docs
0f7a6bfa1 Add a section on std::format compatibility
4faec5a5e Update README.rst
7dbc8ac71 Update changelog
c87dd746f Update changelog
372175caf Revert changelog changes
904754876 Add ClickHouse to the list of projects (#1751)
d30bca64e Revert changelog conversion since GFM is not supported there
d6047cdc4 Update changelog
810241b36 Convert changlog to markdown
661c47473 Rename changelog
7c33059fa Update ChangeLog.rst
9e20883ab Update README.rst
41899d522 Update changelog
f42f45908 Update changelog
2381df654 Update readme
7ae816563 Update README.rst
c56cf3d07 Update changelog and readme
01309a34a Deprecate arg_formatter
a62d06055 Update changelog
23e3a2eee Update changelog
d8e0554b9 Disable numeric formatting by default
1e8eea4f4 Update changelog
44bd5384a Fix formatting
20e19387a Update changelog
56fed7814 FMT_NUMERIC_ALIGN -> FMT_DEPRECATED_NUMERIC_ALIGN
56e63078f Make the n specifier an opt-in
31ce6bc70 Fix a conversion warning with Clang10 on Windows (#1750)
c9c5b90da Fix a typo. Thanks Tracy Chapman from TripleChecker
1f3f84631 Fix a typo
5de62af60 Fix possible infinite recursion in FMT_ASSERT (#1744)
cbddab2fe Use consistent include style
f69b6eaab Add a simple buffered stream with no sync
ba363b3a2 Use digit pairs as in unrolledlut
a6f8e7d86 Update changelog
e753244ab Update changelog
98a7a8b40 Update changelog and disable internal
3135d95fd Don't use non-portable attribute
8630a8f5f Tweak the docs
cc3a88e6b Extract docs from compile.h
79c4b6bd7 Apply clang-format
d130ee070 Document format string compilation
d0f90b5be Spelling fixes
6e080660d Update README.rst
31c3a2426 Spelling fixes
613b3b459 Spelling fixes
978521bb8 Fix a compile error introduced in #1738
4e94c649f Deprecate compile
1a83443e6 Add user-defined type support to compilation
8bef1c3b3 Tweaks for EDG based compilers (Intel, nVidia, MCST/Elbrus, etc).
b287c37c6 Do not use -Wl,--as-needed with emscripten.
2cac8a9d2 Reintroduce UDT support to fmt::to_string and test ADL
9a4cc8842 Add FMT_COMPILE support to format_to
5ddf9ee1b Streamline default FP formatting
0b3a83f7f Update README.rst
5aa5c9873 Added  #define WIN32_LEAN_AND_MEAN before including windows.h (#1729)
397ad1bec Optimize common case
7431165f3 Make to_string bypass format
ee4d4c7fd Inline compiled format
ab2f8484e Finish text::format
e900d735b Re-enable assert in format_decimal
f4de7b684 Fix ambiguity
1f8f5450b Reuse format_decimal
d702a68df Fix formatting of bool with FMT_COMPILE and add more tests
e956a14e9 Use write instead of format_int in to_string
98dcc251e Undo branching reduction
5b8641ddd Undo branching reduction
8c88abde6 Fix sign handling in 'L'
23b976a61 Reduce branching
9edee0e72 Optimize small string parsing
a909d42b7 Fix a warning
16637341b Enable compilation for all types
2d71d7e03 Add a simple format string compilation API
d259fcfb0 Tweak comments
704ed557a Move project in order to solve a CMake warning
8603bd20d Update README.rst
547f12ae6 Fix a warning (#1722)
f904e8a1b c++11 use formatting user-defined types (#1721)
100e8af08 Update README.rst
c11d0f056 Update README.rst
2453ee576 Improve default formatting
47ae52155 MINGW cross compiler fixes
936a1833c Add default_arg_formatter
f2c9cb624 Fix a UB
d3107f855 Cleanup arg_formatter_base
5e7c70e20 Simplify arg_formatter_base
38cc68b3e Inline visitor
6732ea500 Make symbols readable
57ddc77ce Make advance_to a noop for back_insert_iterator
50bad7d62 Optimize format string parsing
8f7a824e4 Inline visit
f11e96870 Optimize format string parsing
09737dd83 Optimize format handler
d9e3d6e6e Move format_handler to detail
795b47a7b Fix a warning (#1712)
95c6ac0cc fix typo which caused the loss of the counting information when using a printf context with a truncating_iterator
21409cfdd Fix warnings
88c8d534e Move digits10 to where they belong and add comments
0f3eaeac0 Fix a warning
344218510 Ignore /doc/node_modules directory
16aec0617 Cleanup arg_formatter_base
1e1193590 Fix format_decimal overloads
0893c9c2e Inline parse_format_string
3245145a4 Remove undocumented buffer_range and output_range
57fc44907 Increase VM disk size
7d22bebb6 Remove uses of buffer_range
8f2b5fe74 Don't install sphinx cache files
f095c67b6 Remove uses of buffer_range
5aabf1f71 Simplify copy_str
19c5b5d15 Simplify arg_formatter
519571ede Simplify arg_formatter_base
ac8dfd841 Improve handling of separators
2c6165a22 Reduce the number of comparisons
28639969e Use memcpy for copying digits
f5fa1dee5 Support custom FMT_INC_DIR in pkgconfig and cmake configs (#1702)
51bf9cfac Fix Mingw support
1a716caf5 Optimize common case
98d4bbf81 Update README.rst
8c8f74a87 fix zero flag for char types and make zero flag ignored if a precision is specified
bc1b89da2 Temporarily revert parsing changes
a7fb321ac Remove a redundant branch
8cadb9650 fix max/min macro (#1697)
297c3b2ed Fix an example (thanks Alexey Kuzmenko)
943532fec Make ostream formatter work with compile-time format strings (#1692)
bd8804019 Update README.rst
f230300ac Knuth is using fmt library (#1691)
a265e25b7 Optimize small string parsing
2aa2526f6 Optimize small string concatenation
8d78045e7 Move void_t to where it's used
7aafa6bc6 Update analytics
c66aae165 Adding sentinel support to fmt::join(). (#1689)
6d66de380 Add c specifier support to integral types (#1652)
6b219a58d fix interaction of space flag and '+' flag, as well as '-' flag and '0' flag (#1687)
eee2023c2 Update signatures
c5ed73aab Add fmt::detail::buffer to the docs (#704)
ea1cd9638 Fix apidoc
d3964d7b1 Merge branch 'master' of github.com:fmtlib/fmt
d18c6723a Update docs
96c18b26c make plus flag for printf not be ignored for char argument (#1683)
ba25baeb9 Apply doc patch to 6.2.1
981b517cc nested replacement fields may omit arg_id (#1681)
922ea924b Make dynamic_format_arg_store reusable and add reserve() (#1677)
e0d98923c Update version
806926537 internal -> detail (#1538)
963ee0831 Simplify named arguments
02a6fe59f Named arguments go brrr
de290f5c4 Ditch internal::arg_map
d0623de51 Bump version
73e335ed3 Make implicit capture explicit for C++20 (#1669)
b4d46e398 Update changelog
a182f7341 Update changelog
68201831a Support named args in dynamic_format_arg_store (#1655). (#1663)
7f723fbcb Consistently namespace qualify size_t
c06851456 Purge basic_writer
2f05054dd Purge basic_writer
f0ce21164 Revert enum change
44639b11f Fix some warnings (#1667)
1c86a99e8 Purge basic_writer
8f511fc12 Make copyfmt not throw (#1666)
59fe455f3 Remove compatibility stubs
b0f47a13e Separate nonfinite formatting
d6cea50d0 Remove deprecated APIs
40bc7163f Move FMT_MAYBE_UNUSED to where it's actually used
080e44d0b Fix inconsistent type detection (#1662)
7e57cace5 Exclude std::abort from compilation when compiling CUDA with Clang (#1661)
7b66e2f21 Inherit arg_formatter_base from basic_writer
bab3f5800 Refactor pointer formatting
9cc7edfdd Move int_writer to the namespace scope
8d9d528bf Improve handling of alignment
8efd1a8ef Improve handling of alignment
a71bc9c82 Use '0' fill with numeric align for consistency with std::format
60d85d598 Suppress ubsan warning
c3099beb6 Cleanup
cbb4cb899 Remove undocumented deprecated APIs
b85e9ac38 Simplify vformat_to
e3710ab97 FMT_CONSTEXPR -> constexpr
d59751f0f Update date formatting example to use threadsafe localtime
d6abb2fa0 Reduce library size
e9fdea90b Update README.rst
44b6584f2 Update README.rst
78f041ab5 build: Fix installation paths
7ca89bf87 Reduce template bloat in write_int
3c114d091 Fix a shadowing warning (#1658)
e2ef12a8c Allow to avoid inclusion of os.cc in fmt target
bca82719a Pass iterator by value
99da38962 Make write_padded non-members
f19d66794 Bump fuzzer allocation limit
3e6984761 Reduce branching in write_padded
9ac1eebd4 Reduce library size
e2ff91067 Replace FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION  with fmt-specific macro (#1650)
f2ed03b91 Fix a warning (#1649)
9dde9f013 Reduce library size
b1af642d1 Reduce library size
4a617f25c Clarify encoding conversion in chrono
6f435f55c Improve compile time by using extern template (#1452)
cb475cb88 Clarify why we don't check argument id
1e1ac6e96 Check dynamic width/precision id at compile time (#1614)
e51c449fe Revert "Check dynamic widht/precision id at compile time (#1614)"
0463665ef Don't access a C string past precision in printf (#1595)
7d748a6f8 Check dynamic widht/precision id at compile time (#1614)
2b75bd7ce Get rid of do_check_format_string
4a1d5931c Simplify udl_formatter with FMT_STRING
811b0f905 Enable compile-time error tests
450e8eed9 Fix markup
b8fbcec1b Clarify formatter reuse
56bc86ffa Suppress bogus MSVC analysis warnings
3f79357ef Fix a recent regression in handling max packed arguments
8a11148f9 Add Facebook Folly to the list of projects
e371e8b68 Tweak readme
813732fed Improve readme formatting
3670d5b3f README: add vectorized.io/redpanda in the list of users
9e2ad7cf6 Add windows terminal to the projects using {fmt}
63479c851 Use a delegating ctor and add inlines
5944fcad3 Remove remaining wchar_t instantiation
e253b371b Don't generate RTTI for allocator
0c86f467b Fix build on ancient gcc
1929df4bc Simplify format_args
a13822181 Always inline arg_data functions
04e0dfd4b Always inline value ctors
04cde756b Simplify checks
c9a57b9a8 Fix incorrect assumptions about nul termination
f46f5ecaf Reenable constexpr _compile on GCC 9
6e8d7e277 Don't use constexpr on Intel compiler (#1628)
567ed03f8 Merge arg overloads and cleanup
c3fa33314 Remove warning in core.h with when compiling with gcc and -Wshadow
84898b462 Remove warning in format.h when compiling with gcc and -Wshadow
538d83fd0 Cleanup named arguments
8a4630686 Improve handling of named arguments
a9d62d3f3 Add check for CompiledFormat to avoid ambiguous call
fdcf7870a Add stack-based named argument storage
5899267c4 Fix a clang-tidy warning
07b4c246e Fix a typo
e99809f29 Fix ostream support in sprintf (#1631)
3cd5179f3 Fixed clang tidy warning -multiple declarations in a single statement reduces readability
7404e33a7 Fix clang warning about explicit ctor
3aab2171e Clean up basic_format_args
7645ca072 Clean up printf
e30d8391e Suppress an MSVC warning (#1622)
8cd8ef03e Simplify warning suppression
bbb6b357c Add floating-point L specifier (#1624)
36ea32640 Suppress a bogus MSVC warning
141a00d64 Define FMT_EXTERN_TEMPLATE_API on export
3860edc5d Bump version
7d01859ef Fix handling of unsigned char strings in printf
63b23e786 Merge branch 'master' of github.com:fmtlib/fmt
4999796c1 Fix the docs
34b3f7b7a Avoid windows issue with min() max() macros
27e3c0fe9 Update signature in the docs

git-subtree-dir: externals/fmt
git-subtree-split: cd4af11efc9c622896a3e4cb599fa28668ca3d05
2020-09-19 14:25:26 -04:00
Lioncash
8a7a4cb672 Update xbyak to 5.97
Keeps the library up to date.
2020-09-19 11:28:09 -04:00
Lioncash
8042dc93e8 Squashed 'externals/xbyak/' changes from 73ac5866..0140eeff
0140eeff Merge branch 'dev'
1efe14b2 change the original behavior of SetError
83c89c7a rename and fix indent
8be7ca93 Merge branch 'sbogusev-master' into dev
070b4c09 make l_err() inline with block scope static TLS l_error
9a4e6579 v5.97
d0ced1bc XBYAK_ONLY_CLASS_CPU is for only util::Cpu
bb967ae7 replace uint32 with uint32_t etc.
c306b8e5 update to v5.95
605e4224 use noexcept if C++11 or later
7a17c2c8 remove warning
5dfa4462 use constexpr if c++14 or later
18c9caaa Merge branch 'densamoilov-fix-mov-interface' into dev
3966ba9d fix mov interface
be492be1 change the behavior of push((byte|word), imm) to cast imm to int8_t/int16_t
d9696b54 Merge pull request #102 from igorsafo/master
ea73267f Cpu: make getNumCores constant
ff0b10e9 Merge pull request #101 from densamoilov/use-thread_local-when-supported
0c4eafc3 use thread_local for XBYAK_TLS when supported
c1aea35e CodeGenerator::reset() calls ClearError()
b4df97b1 Merge branch 'cursey-no-winsock2-header'
6a47bb0e v5.94
9a1749e6 define WIN32_LEAN_AND_MEAN for including winsock2.h after xbyak.h
42dddb74 Remove #include <winsock2.h>
615b85fa update doc
9cd796a9 rename XBYAK_NOEXCEPTION to XBYAK_NO_EXCEPTION
7cdf227f use static to avoid multiple instance
38a28dec test_nm.bat supports noexcept
0fdffc6b XBYAK_NOEXCEPTION for -fno-exceptions
eda6e2a3 v5.92
5c26c8bb mov(rax, imm64) on 32-bit env with XBYAK64
6208e3ae throw exception if not supported amx sibmem 2
c6737d14 mov amx insts from avx512
34ea5c16 throw exception if not supported amx sibmem
6f93fe35 fix test of sizeof(Operand)
5b89c3b2 remove T_TMM
5ce32858 gen_amx.cpp is merged into gen_avx512.cpp
fe4f965f remove my alias for tmm registers
92f904d8 bit_ contains 8192
98b51da9 extend mnemonics with Intel(R) AMX ISA
8d1b4c9e add generation of Intel(R) AMX ISA mnemonics
8ded45d1 add support of Intel(R) AMX ISA
b23c4b02 v5.912
ffe32a60 Merge branch 'rsdubtso-master'
e7b7fd2f use MAP_JIT on macOS regardless of Xcode version
82b70e66 v5.911 ; XBYAK_USE_MMAP_ALLOCATOR is defined
2f6d9e34 fix test for mac
a7d10a1e add link to GitHub Sponsor
96076265 accept k0 mask register (it means no mask)
7e3167e4 kmov{b,w,d,q} throws for unsupported reg
f487d7b7 Merge pull request #91 from marcelotrevisani/patch-1
dc9e6a79 Possibility to specify a different PREFIX
5fc69fc8 remove warning of test
e69e0b42 fix typo of type of Zmi
34f797e8 perf does not recognize too short function name
6cc0f4df Consider max defined as a macro on Windows
5722393d fix for zeroed-out 0xb leaf
6a4459a8 Merge branch 'tyfkda-feature/fix-segfault-in-calc'
47922ed9 Fix segmentation fault in calc sample
8f696e93 add test_avx512 to bat
00114d79 add .travis.yml
a29fa27b refactor test
508b543c fix error of vfpclasspd
0d54f1b1 fix for windows
4da8fd4e add setDefaultJmpNEAR
da7f7317 revert to the behavior before v5.84 if -fno-operator-names is defined
7dac9f61 update to v5.85
fe639332 enable MAP_JIT only if mojave or later
4443d791 specify MAP_JIT mmap flag on macOS
20ee4c2d update doc
ca0e8395 [changed] XBYAK_NO_OP_NAMES is defined
f32836da remove exit(1)
a1e9adf2 v5.82
08b8b1ba Support AMD Zen New Instructions.
2501ba9a remove *.user and *.vcproj
5c2ea988 Merge branch 'jrmwng-feature/upgrade-to-vs2017/jrmwng'
35847f7a Merge branch 'feature/upgrade-to-vs2017/jrmwng' of https://github.com/jrmwng/xbyak into jrmwng-feature/upgrade-to-vs2017/jrmwng
ef267775 address "warning LNK4075: ignoring '/EDITANDCONTINUE' due to '/SAFESEH' specification"
4a6c59bb address a conflict of sharing intermediate directory by different projects
9577cbf3 inherit "some output locations" from parent or project defaults
6c5f7186 upgrade projects from VS2018 to VS2017
4ca0434b v5.81
72b4e95d add lds/lss/les/lfs/lgs
cc8f037c fix ; move ERR_INTERNAL to the end
9e9ec1c3 add repe, repne, repne, prez
eea0edc3 add some fpu mnemonics
06235fa6 add loop/loope/loopne
7fc0c2bb add enter/leave
9fa2ef3c add in_, out_
df208648 add lods{b,w,d,q}, outs{b,w,d}
4672d2cb add int3, int_, into
431977cb add pushfq, popfq
81c4749f syscall, sysenter, sysexit, sysret
1f1b53c4 add clflushopt, fldenv, fnstw
b765db33 Profiler uses append mode
44dc3546 add Profiler class
42949334 update version to v5.802
91cb919b Merge branch 'vpirogov-master'
a6452f82 fixed avx512_bf16 detection
f41da5aa tweak ; vcvtneps2bf16 calls opCvt2
b12460ba [sample] fix typo of quantize.cpp
b22f5881 add set_opt.bat for test on Windows
f402faad add vp2intersectd/vp2intersectq
4cfd5208 add avx512_bf16
4033564c fix vcmppd/vcmpps for ptr_b

git-subtree-dir: externals/xbyak
git-subtree-split: 0140eeff1fffcf5069dea3abb57095695320971c
2020-09-19 11:27:42 -04:00
Wunkolo
c2d5f6da90 block_of_code: Add HasAVX512_Icelake
Detect AVX512 feature support up to the [Icelake-level featureset](https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512)
2020-09-19 15:20:40 +01:00
Lioncash
0e1112b7df Revert "basic_block: Mark move constructor and assignment as noexcept"
This reverts commit 4f12e86ebb.

Big fan of MSVC preventing standard behavior.
2020-08-14 16:49:40 -04:00
Lioncash
889635d17d general: Resolve -Wmissing-prototypes warnings 2020-08-14 14:50:09 -04:00
Lioncash
68fea20020 common/assert: Resolve several -Wextra-semi warnings
Resolves 200+ warnings.
2020-08-14 14:45:53 -04:00
Lioncash
4f12e86ebb basic_block: Mark move constructor and assignment as noexcept
Allows the type to play more nicely with standard library facilities
(also we shouldn't be throwing in move operations to begin with).
2020-08-14 14:38:28 -04:00
Lioncash
34f4d99454 block_of_code: Remove unused variables in GenRunCode()
These aren't used, so they can be removed.
2020-08-14 14:35:17 -04:00
Lioncash
29d1758923 ir_matcher: Add missing header guard 2020-08-14 14:32:34 -04:00
MerryMage
6bbc53839f Unsafe Optimization: Extend Unsafe_UnfuseFMA to all FMA-related instructions 2020-07-12 12:45:12 +01:00
MerryMage
d05d95c132 Improve documentation of unsafe optimizations 2020-07-12 12:41:11 +01:00
MerryMage
82417da780 emit_x64{_vector}_floating_point: Add unsafe optimizations for RSqrtEstimate and RecipEstimate 2020-07-11 14:05:57 +01:00
MerryMage
761e95eec0 A64: Add unsafe_optimizations option
* Strength reduce FMA unsafely
2020-07-06 21:02:30 +01:00
MerryMage
82868034d3 A32/ASIMD: Ensure decoder table is correct
* Raise a DecoderError instead of ASSERT-ing on a decode error
* Correct ASIMD decode table
* Write a test which verifies every possible ASIMD instruction
2020-07-05 18:45:42 +01:00
MerryMage
3c742960a9 simd_three_same: Ensure zero in upper for PairedMinMaxOperation 2020-07-04 11:25:36 +01:00
MerryMage
735738c7b6 A32: Implement ASIMD VPMAX, VPMIN (floating-point) 2020-07-04 11:04:10 +01:00
MerryMage
88e74cb2ba A32: Implement ASIMD VPMAX, VPMIN (integer) 2020-07-04 11:04:10 +01:00
MerryMage
d9914b1d51 simd_permute: Implement VectorUnzip with deinterleave lower 2020-07-04 11:04:10 +01:00
MerryMage
f35aaa017c IR: Add VectorDeinterleave{Even,Odd}Lower 2020-07-04 11:04:10 +01:00
MerryMage
df477c46c2 asimd_load_store_structures: VST1 undef correction 2020-07-04 11:04:10 +01:00
MerryMage
4ba1f8b9e7 Add optimization flags to disable specific optimizations 2020-07-04 11:04:10 +01:00
MerryMage
3eed024caf asimd_three_same: Ignore Q=1 for VPADD (floating-point) 2020-07-04 11:04:10 +01:00