Update documentation (2016-08-12)

MerryMage 2016-08-12 18:17:31 +01:00
parent 3808938c98
commit 1029fd27ce
17 changed files with 205 additions and 138 deletions

View file

@ -1,9 +1,12 @@
Dynarmic
========
An (eventual) dynamic recompiler for ARMv6K. The code is a mess.
A dynamic recompiler for the ARMv6K architecture.
A lot of optimization work can be done (it currently produces bad code, worse than that non-IR JIT).
Documentation
-------------
Design documentation can be found at [docs/Design.md](docs/Design.md).
Plans
-----
@ -12,8 +15,8 @@ Plans
* Actually finish the translators off
* Get everything working
* Redundant Get/Set elimination
* Handle immediates properly
* ~~Redundant Get/Set elimination~~
* ~~Handle immediates properly~~
* Allow ARM flags to be stored in host flags
### Medium-term

View file

@ -23,14 +23,14 @@ through several stages:
Using the x64 backend as an example:
* Decoding is done by [double dispatch](https://en.wikipedia.org/wiki/Visitor_pattern) in
`src/frontend/decoder/{arm.h,thumb16.h,thumb32.h}`.
* Translation is done by the visitors in `src/frontend/translate/{translate_arm.cpp,translate_thumb.cpp}`.
The function `IR::Block Translate(LocationDescriptor descriptor, MemoryRead32FuncType memory_read_32)` takes a
`src/frontend/decoder/{[arm.h](../src/frontend/decoder/arm.h),[thumb16.h](../src/frontend/decoder/thumb16.h),[thumb32.h](../src/frontend/decoder/thumb32.h)}`.
* Translation is done by the visitors in `src/frontend/translate/translate_{arm,thumb}.cpp`.
The function [`IR::Block Translate(LocationDescriptor descriptor, MemoryRead32FuncType memory_read_32)`](../src/frontend/translate/translate.h) takes a
memory location and memory reader callback and returns a basic block of IR.
* The IR can be found under `src/frontend/ir/`.
* Optimization is not implemented yet.
* The IR can be found under [`src/frontend/ir/`](../src/frontend/ir/).
* Optimizations can be found under [`src/ir_opt/`](../src/ir_opt/).
* Emission is done by `EmitX64` which can be found in `src/backend_x64/emit_x64.{h,cpp}`.
* Execution is performed by calling `BlockOfCode::RunCode` in `src/backend_x64/routines.{h,cpp}`.
* Execution is performed by calling `BlockOfCode::RunCode` in `src/backend_x64/block_of_code.{h,cpp}`.
## Decoder
@ -102,7 +102,9 @@ differences in the way edges are handled are a quirk of the current implementati
function analyser in the medium-term future.
Dynarmic's intermediate representation is typed. Each microinstruction may take zero or more arguments and may
return zero or more arguments. Each microinstruction is documented below:
return zero or more arguments. A subset of the microinstructions available is documented below.
A complete list of microinstructions can be found in [src/frontend/ir/opcodes.inc](../src/frontend/ir/opcodes.inc).
### Immediate: Imm{U1,U8,U32,RegRef}
@ -119,11 +121,11 @@ by the IR.
<u32> GetRegister(<RegRef> reg)
<void> SetRegister(<RegRef> reg, <u32> value)
Gets and sets `JitState::Reg[reg]`. Note that `SetRegister(ImmRegRef(Arm::R15), _)` is disallowed by IRBuilder.
Gets and sets `JitState::Reg[reg]`. Note that `SetRegister(Arm::Reg::R15, _)` is disallowed by IRBuilder.
Use `{ALU,BX}WritePC` instead.
Note that sequences like `SetRegister(ImmRegRef(Arm::R4), _)` followed by `GetRegister(ImmRegRef(Arm::R4))` ~~are~~
*will be* optimized away.
Note that sequences like `SetRegister(R4, _)` followed by `GetRegister(R4)` are
optimized away.
### Context: {Get,Set}{N,Z,C,V}Flag
@ -136,7 +138,7 @@ Note that sequences like `SetRegister(ImmRegRef(Arm::R4), _)` followed by `GetRe
<u1> GetVFlag()
<void> SetVFlag(<u1> value)
Gets and sets bits in `JitState::Cpsr`. Similarly to registers redundant get/sets will be optimized away.
Gets and sets bits in `JitState::Cpsr`. As with registers, redundant get/sets are optimized away.
### Context: {ALU,BX}WritePC
@ -283,8 +285,7 @@ Memory access.
SetTerm(IR::Term::Interpret{next})
This terminal instruction calls the interpreter, starting at `next`.
The interpreter must interpret ~~at least 1 instruction but may choose to interpret more.~~
**exactly one instruction (in the current implementation).**
The interpreter must interpret exactly one instruction.
### Terminal: ReturnToDispatch
@ -301,19 +302,6 @@ This terminal instruction jumps to the basic block described by `next` if we hav
cycles remaining. If we do not have enough cycles remaining, we return to the
dispatcher, which will return control to the host.
### Terminal: LinkBlockFast
SetTerm(IR::Term::LinkBlockFast{next})
This terminal instruction jumps to the basic block described by `next` unconditionally.
This is an optimization and MUST only be emitted when this is guaranteed not to result
in hanging, even in the face of other optimizations. (In practice, this means that only
forward jumps to short-ish blocks would use this instruction.)
A backend that doesn't support this optimization may choose to implement this exactly
as LinkBlock.
**(degasus says this is probably a pretty useless optimization)**
### Terminal: PopRSBHint
SetTerm(IR::Term::PopRSBHint{})
@ -324,13 +312,9 @@ This is an optimization for faster function calls. A backend that doesn't suppor
this optimization or doesn't have a RSB may choose to implement this exactly as
ReturnToDispatch.
**(This would be quite profitable once implemented. degasus agrees.)**
### Terminal: If
SetTerm(IR::Term::If{cond, term_then, term_else})
~~This terminal instruction conditionally executes one terminal or another depending
on the run-time state of the ARM flags.~~
**(Unimplemented.)**
This terminal instruction conditionally executes one terminal or another depending
on the run-time state of the ARM flags.
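
The register section of the design document above states that a `SetRegister(R4, _)` followed by a `GetRegister(R4)` is optimized away. The following is a toy sketch of that idea, not the pass dynarmic actually ships: `Demo::Inst`, its fields, and this `GetSetElimination` signature are hypothetical, and the sketch assumes nothing else in the block can observe the register file (no interpreter fallback or callback between the two instructions).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

namespace Demo {

// Hypothetical two-opcode IR, just enough to show the forwarding idea.
enum class Op { SetRegister, GetRegister };

struct Inst {
    Op op;
    int reg;       // register index, 0..15
    int value;     // SetRegister: id of the value written; GetRegister: filled in when forwarded
    bool removed;  // marked true when the instruction is redundant
};

// Forward pass over one basic block.
// A GetRegister that follows a SetRegister of the same register reuses the written value.
// A SetRegister that is overwritten later in the same block is dead, because every
// intervening GetRegister has already been forwarded.
void GetSetElimination(std::vector<Inst>& block) {
    std::array<int, 16> last_set;
    last_set.fill(-1);  // -1: no SetRegister seen yet for this register

    for (std::size_t i = 0; i < block.size(); ++i) {
        Inst& inst = block[i];
        int& prev = last_set[static_cast<std::size_t>(inst.reg)];
        if (inst.op == Op::SetRegister) {
            if (prev != -1)
                block[static_cast<std::size_t>(prev)].removed = true;
            prev = static_cast<int>(i);
        } else if (prev != -1) {
            inst.value = block[static_cast<std::size_t>(prev)].value;
            inst.removed = true;
        }
    }
}

} // namespace Demo
```

The last `SetRegister` for each register is always kept, since the value must be visible in `JitState` once the block ends.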

View file

@ -17,6 +17,7 @@ set(SRCS
frontend/disassembler/disassembler_thumb.cpp
frontend/ir/ir.cpp
frontend/ir/ir_emitter.cpp
frontend/ir/opcodes.cpp
frontend/translate/translate.cpp
frontend/translate/translate_arm.cpp
frontend/translate/translate_arm/branch.cpp

View file

@ -17,12 +17,16 @@ class BlockOfCode final : public Gen::XCodeBlock {
public:
BlockOfCode();
/// Clears this block of code and resets code pointer to beginning.
void ClearCache(bool poison_memory);
/// Runs emulated code for approximately `cycles_to_run` cycles.
size_t RunCode(JitState* jit_state, CodePtr basic_block, size_t cycles_to_run) const;
/// Code emitter: Returns to host
void ReturnFromRunCode(bool MXCSR_switch = true);
/// Code emitter: Makes guest MXCSR the current MXCSR
void SwitchMxcsrOnEntry();
/// Code emitter: Makes saved host MXCSR the current MXCSR
void SwitchMxcsrOnExit();
Gen::OpArg MFloatNegativeZero32() const {

View file

@ -60,7 +60,7 @@ static void EraseInstruction(IR::Block& block, IR::Inst* inst) {
block.instructions.erase(block.instructions.iterator_to(*inst));
}
EmitX64::BlockDescriptor* EmitX64::Emit(const Arm::LocationDescriptor descriptor, Dynarmic::IR::Block& block) {
EmitX64::BlockDescriptor EmitX64::Emit(const Arm::LocationDescriptor descriptor, Dynarmic::IR::Block& block) {
inhibit_emission.clear();
reg_alloc.Reset();
@ -98,7 +98,7 @@ EmitX64::BlockDescriptor* EmitX64::Emit(const Arm::LocationDescriptor descriptor
Patch(descriptor, code_ptr);
basic_blocks[descriptor].size = code->GetCodePtr() - code_ptr;
return &basic_blocks[descriptor];
return basic_blocks[descriptor];
}
void EmitX64::EmitBreakpoint(IR::Block&, IR::Inst*) {
@ -1630,7 +1630,7 @@ void EmitX64::EmitTerminalLinkBlock(IR::Term::LinkBlock terminal, Arm::LocationD
code->CMP(64, MDisp(R15, offsetof(JitState, cycles_remaining)), Imm32(0));
BlockDescriptor* next_bb = GetBasicBlock(terminal.next);
auto next_bb = GetBasicBlock(terminal.next);
patch_jg_locations[terminal.next].emplace_back(code->GetWritableCodePtr());
if (next_bb) {
code->J_CC(CC_G, next_bb->code_ptr, true);

View file

@ -9,6 +9,8 @@
#include <set>
#include <unordered_map>
#include <boost/optional.hpp>
#include "backend_x64/block_of_code.h"
#include "backend_x64/reg_alloc.h"
#include "common/x64/emitter.h"
@ -24,16 +26,22 @@ public:
: reg_alloc(code), code(code), cb(cb), jit_interface(jit_interface) {}
struct BlockDescriptor {
CodePtr code_ptr;
size_t size;
CodePtr code_ptr; ///< Entrypoint of emitted code
size_t size; ///< Length in bytes of emitted code
};
BlockDescriptor* Emit(const Arm::LocationDescriptor descriptor, IR::Block& ir);
BlockDescriptor* GetBasicBlock(Arm::LocationDescriptor descriptor) {
/// Emit host machine code for a basic block starting at `descriptor` with intermediate representation `ir`.
BlockDescriptor Emit(const Arm::LocationDescriptor descriptor, IR::Block& ir);
/// Looks up an emitted host block in the cache.
boost::optional<BlockDescriptor> GetBasicBlock(Arm::LocationDescriptor descriptor) {
auto iter = basic_blocks.find(descriptor);
return iter != basic_blocks.end() ? &iter->second : nullptr;
if (iter == basic_blocks.end())
return boost::none;
return boost::make_optional<BlockDescriptor>(iter->second);
}
/// Empties the cache.
void ClearCache();
private:
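
The hunk above changes `GetBasicBlock` from returning a raw pointer (with `nullptr` meaning "not found") to returning an optional value. A minimal sketch of the same lookup shape follows. It substitutes C++17 `std::optional` for `boost::optional`, and `Demo::BlockCache`, `Insert`, `Lookup`, and the `std::uint32_t` key are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_map>

namespace Demo {

struct BlockDescriptor {
    const std::uint8_t* code_ptr;  ///< Entrypoint of emitted code
    std::size_t size;              ///< Length in bytes of emitted code
};

// Hypothetical cache keyed by a plain integer instead of a LocationDescriptor.
class BlockCache {
public:
    void Insert(std::uint32_t location, BlockDescriptor desc) {
        blocks[location] = desc;
    }

    /// Looks up an emitted block; an empty optional means "not emitted yet".
    std::optional<BlockDescriptor> Lookup(std::uint32_t location) const {
        auto iter = blocks.find(location);
        if (iter == blocks.end())
            return std::nullopt;
        return iter->second;  // returns a copy, not a pointer into the map
    }

    /// Empties the cache.
    void Clear() { blocks.clear(); }

private:
    std::unordered_map<std::uint32_t, BlockDescriptor> blocks;
};

} // namespace Demo
```

Because the caller receives a copy of the descriptor, it holds no pointer into the map, so a later `Clear()` cannot leave the returned descriptor object itself dangling. The `code_ptr` inside it is a separate matter and is only valid while the emitted code exists.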

View file

@ -43,17 +43,17 @@ struct Jit::Impl {
Arm::LocationDescriptor descriptor{pc, TFlag, EFlag, jit_state.guest_FPSCR_flags};
CodePtr code_ptr = GetBasicBlock(descriptor)->code_ptr;
CodePtr code_ptr = GetBasicBlock(descriptor).code_ptr;
return block_of_code.RunCode(&jit_state, code_ptr, cycle_count);
}
std::string Disassemble(const Arm::LocationDescriptor& descriptor) {
auto block = GetBasicBlock(descriptor);
std::string result = Common::StringFromFormat("address: %p\nsize: %zu bytes\n", block->code_ptr, block->size);
std::string result = Common::StringFromFormat("address: %p\nsize: %zu bytes\n", block.code_ptr, block.size);
#ifdef DYNARMIC_USE_LLVM
CodePtr end = block->code_ptr + block->size;
size_t remaining = block->size;
CodePtr end = block.code_ptr + block.size;
size_t remaining = block.size;
LLVMInitializeX86TargetInfo();
LLVMInitializeX86TargetMC();
@ -61,7 +61,7 @@ struct Jit::Impl {
LLVMDisasmContextRef llvm_ctx = LLVMCreateDisasm("x86_64", nullptr, 0, nullptr, nullptr);
LLVMSetDisasmOptions(llvm_ctx, LLVMDisassembler_Option_AsmPrinterVariant);
for (CodePtr pos = block->code_ptr; pos < end;) {
for (CodePtr pos = block.code_ptr; pos < end;) {
char buffer[80];
size_t inst_size = LLVMDisasmInstruction(llvm_ctx, const_cast<u8*>(pos), remaining, (u64)pos, buffer, sizeof(buffer));
assert(inst_size);
@ -85,10 +85,10 @@ struct Jit::Impl {
}
private:
EmitX64::BlockDescriptor* GetBasicBlock(Arm::LocationDescriptor descriptor) {
EmitX64::BlockDescriptor GetBasicBlock(Arm::LocationDescriptor descriptor) {
auto block = emitter.GetBasicBlock(descriptor);
if (block)
return block;
return *block;
IR::Block ir_block = Arm::Translate(descriptor, callbacks.MemoryRead32);
Optimization::GetSetElimination(ir_block);

View file

@ -15,12 +15,17 @@ namespace Common {
class Pool {
public:
/**
* @param object_size Byte-size of objects to construct
* @param initial_pool_size Number of objects to have per slab
*/
Pool(size_t object_size, size_t initial_pool_size);
~Pool();
Pool(Pool&) = delete;
Pool(Pool&&) = delete;
/// Returns a pointer to an `object_size`-bytes block of memory.
void* Alloc();
private:

View file

@ -64,6 +64,12 @@ enum class SignExtendRotation {
ROR_24 ///< ROR #24
};
/**
* LocationDescriptor describes the location of a basic block.
* The location is not solely based on the PC because other flags influence the way
* instructions should be translated. The CPSR.T flag is most notable since it
* tells us if the processor is in Thumb or Arm mode.
*/
struct LocationDescriptor {
static constexpr u32 FPSCR_MASK = 0x3F79F9F;

View file

@ -101,7 +101,7 @@ private:
#ifdef _MSC_VER
#pragma warning(push)
#pragma warning(disable:4800)
#pragma warning(disable:4800) // forcing value to bool 'true' or 'false' (performance warning)
#endif
template<typename Visitor, typename ...Args, typename CallRetT>
struct VisitorCaller<CallRetT(Visitor::*)(Args...)> {

View file

@ -10,48 +10,37 @@
#include "common/assert.h"
#include "common/string_util.h"
#include "frontend/ir/ir.h"
#include "frontend/ir/opcodes.h"
namespace Dynarmic {
namespace IR {
// Opcode information
namespace OpcodeInfo {
using T = Dynarmic::IR::Type;
struct Meta {
const char* name;
Type type;
std::vector<Type> arg_types;
};
static const std::map<Opcode, Meta> opcode_info {{
#define OPCODE(name, type, ...) { Opcode::name, { #name, type, { __VA_ARGS__ } } },
#include "opcodes.inc"
#undef OPCODE
}};
} // namespace OpcodeInfo
Type GetTypeOf(Opcode op) {
return OpcodeInfo::opcode_info.at(op).type;
}
size_t GetNumArgsOf(Opcode op) {
return OpcodeInfo::opcode_info.at(op).arg_types.size();
}
Type GetArgTypeOf(Opcode op, size_t arg_index) {
return OpcodeInfo::opcode_info.at(op).arg_types.at(arg_index);
}
const char* GetNameOf(Opcode op) {
return OpcodeInfo::opcode_info.at(op).name;
}
// Value class member definitions
Value::Value(Inst* value) : type(Type::Opaque) {
inner.inst = value;
}
Value::Value(Arm::Reg value) : type(Type::RegRef) {
inner.imm_regref = value;
}
Value::Value(Arm::ExtReg value) : type(Type::ExtRegRef) {
inner.imm_extregref = value;
}
Value::Value(bool value) : type(Type::U1) {
inner.imm_u1 = value;
}
Value::Value(u8 value) : type(Type::U8) {
inner.imm_u8 = value;
}
Value::Value(u32 value) : type(Type::U32) {
inner.imm_u32 = value;
}
bool Value::IsImmediate() const {
if (type == Type::Opaque)
return inner.inst->GetOpcode() == Opcode::Identity ? inner.inst->GetArg(0).IsImmediate() : false;

View file

@ -32,62 +32,24 @@ namespace IR {
//
// A basic block is represented as an IR::Block.
enum class Type {
Void = 1 << 0,
RegRef = 1 << 1,
ExtRegRef = 1 << 2,
Opaque = 1 << 3,
U1 = 1 << 4,
U8 = 1 << 5,
U16 = 1 << 6,
U32 = 1 << 7,
U64 = 1 << 8,
F32 = 1 << 9,
F64 = 1 << 10,
};
Type GetTypeOf(Opcode op);
size_t GetNumArgsOf(Opcode op);
Type GetArgTypeOf(Opcode op, size_t arg_index);
const char* GetNameOf(Opcode op);
// Type declarations
/**
* A representation of a microinstruction. A single ARM/Thumb instruction may be
* converted into zero or more microinstructions.
*/
struct Value;
class Inst;
/**
* A representation of a value in the IR.
* A value may either be an immediate or the result of a microinstruction.
*/
struct Value final {
public:
Value() : type(Type::Void) {}
explicit Value(Inst* value) : type(Type::Opaque) {
inner.inst = value;
}
explicit Value(Arm::Reg value) : type(Type::RegRef) {
inner.imm_regref = value;
}
explicit Value(Arm::ExtReg value) : type(Type::ExtRegRef) {
inner.imm_extregref = value;
}
explicit Value(bool value) : type(Type::U1) {
inner.imm_u1 = value;
}
explicit Value(u8 value) : type(Type::U8) {
inner.imm_u8 = value;
}
explicit Value(u32 value) : type(Type::U32) {
inner.imm_u32 = value;
}
explicit Value(Inst* value);
explicit Value(Arm::Reg value);
explicit Value(Arm::ExtReg value);
explicit Value(bool value);
explicit Value(u8 value);
explicit Value(u32 value);
bool IsEmpty() const;
bool IsImmediate() const;
@ -113,6 +75,10 @@ private:
} inner;
};
/**
* A representation of a microinstruction. A single ARM/Thumb instruction may be
* converted into zero or more microinstructions.
*/
class Inst final : public Common::IntrusiveListNode<Inst> {
public:
Inst(Opcode op) : op(op) {}
@ -151,7 +117,7 @@ struct Invalid {};
/**
* This terminal instruction calls the interpreter, starting at `next`.
* The interpreter must interpret at least 1 instruction but may choose to interpret more.
* The interpreter must interpret exactly one instruction.
*/
struct Interpret {
explicit Interpret(const Arm::LocationDescriptor& next_) : next(next_) {}

View file

@ -13,6 +13,11 @@
namespace Dynarmic {
namespace Arm {
/**
* Convenience class to construct a basic block of the intermediate representation.
* `block` is the resulting block.
* The user of this class updates `current_location` as appropriate.
*/
class IREmitter {
public:
explicit IREmitter(LocationDescriptor descriptor) : block(descriptor), current_location(descriptor) {}

View file

@ -0,0 +1,52 @@
/* This file is part of the dynarmic project.
* Copyright (c) 2016 MerryMage
* This software may be used and distributed according to the terms of the GNU
* General Public License version 2 or any later version.
*/
#include <map>
#include <vector>
#include "frontend/ir/opcodes.h"
namespace Dynarmic {
namespace IR {
// Opcode information
namespace OpcodeInfo {
using T = Dynarmic::IR::Type;
struct Meta {
const char* name;
Type type;
std::vector<Type> arg_types;
};
static const std::map<Opcode, Meta> opcode_info {{
#define OPCODE(name, type, ...) { Opcode::name, { #name, type, { __VA_ARGS__ } } },
#include "opcodes.inc"
#undef OPCODE
}};
} // namespace OpcodeInfo
Type GetTypeOf(Opcode op) {
return OpcodeInfo::opcode_info.at(op).type;
}
size_t GetNumArgsOf(Opcode op) {
return OpcodeInfo::opcode_info.at(op).arg_types.size();
}
Type GetArgTypeOf(Opcode op, size_t arg_index) {
return OpcodeInfo::opcode_info.at(op).arg_types.at(arg_index);
}
const char* GetNameOf(Opcode op) {
return OpcodeInfo::opcode_info.at(op).name;
}
} // namespace IR
} // namespace Dynarmic
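
The new `opcodes.cpp` above builds its metadata table by redefining `OPCODE` and including `opcodes.inc`, a technique commonly called an X-macro. Since the contents of `opcodes.inc` are not shown in this diff, the sketch below uses a hypothetical inline list, `DEMO_OPCODES`, with three made-up entries. It shows how one list can generate both the enum and the lookup table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <vector>

namespace Demo {

enum class Type { Void, U1, U32, RegRef };

// Hypothetical opcode list standing in for opcodes.inc: name, return type, argument types.
#define DEMO_OPCODES(X)                                  \
    X(GetRegister, Type::U32, Type::RegRef)              \
    X(SetRegister, Type::Void, Type::RegRef, Type::U32)  \
    X(SetCFlag, Type::Void, Type::U1)

// First expansion: one enumerator per opcode.
enum class Opcode {
#define X(name, type, ...) name,
    DEMO_OPCODES(X)
#undef X
    NUM_OPCODE
};

struct Meta {
    const char* name;
    Type type;
    std::vector<Type> arg_types;
};

// Second expansion: one table row per opcode, generated from the same list.
static const std::map<Opcode, Meta> opcode_info{
#define X(name, type, ...) {Opcode::name, {#name, type, {__VA_ARGS__}}},
    DEMO_OPCODES(X)
#undef X
};

Type GetTypeOf(Opcode op) { return opcode_info.at(op).type; }
std::size_t GetNumArgsOf(Opcode op) { return opcode_info.at(op).arg_types.size(); }
const char* GetNameOf(Opcode op) { return opcode_info.at(op).name; }

} // namespace Demo
```

Adding an opcode then means touching only the list; the enum and the table cannot drift apart.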

View file

@ -11,6 +11,10 @@
namespace Dynarmic {
namespace IR {
/**
* The Opcodes of our intermediate representation.
* Type signatures for each opcode can be found in opcodes.inc
*/
enum class Opcode {
#define OPCODE(name, type, ...) name,
#include "opcodes.inc"
@ -20,5 +24,34 @@ enum class Opcode {
constexpr size_t OpcodeCount = static_cast<size_t>(Opcode::NUM_OPCODE);
/**
* The intermediate representation is typed. These are the types used by our IR.
*/
enum class Type {
Void = 1 << 0,
RegRef = 1 << 1,
ExtRegRef = 1 << 2,
Opaque = 1 << 3,
U1 = 1 << 4,
U8 = 1 << 5,
U16 = 1 << 6,
U32 = 1 << 7,
U64 = 1 << 8,
F32 = 1 << 9,
F64 = 1 << 10,
};
/// Get return type of an opcode
Type GetTypeOf(Opcode op);
/// Get the number of arguments an opcode accepts
size_t GetNumArgsOf(Opcode op);
/// Get the required type of an argument of an opcode
Type GetArgTypeOf(Opcode op, size_t arg_index);
/// Get the name of an opcode.
const char* GetNameOf(Opcode op);
} // namespace IR
} // namespace Dynarmic

View file

@ -13,6 +13,12 @@ namespace Arm {
using MemoryRead32FuncType = u32 (*)(u32 vaddr);
/**
* This function translates instructions in memory into our intermediate representation.
* @param descriptor The starting location of the basic block. Includes information like PC, Thumb state, &c.
* @param memory_read_32 The function we should use to read emulated memory.
* @return A translated basic block in the intermediate representation.
*/
IR::Block Translate(LocationDescriptor descriptor, MemoryRead32FuncType memory_read_32);
} // namespace Arm

View file

@ -32,6 +32,7 @@ struct UserCallbacks {
bool (*IsReadOnlyMemory)(u32 vaddr);
/// The interpreter must execute only one instruction at PC.
void (*InterpreterFallback)(u32 pc, Jit* jit);
bool (*CallSVC)(u32 swi);
@ -91,6 +92,10 @@ public:
return is_executing;
}
/**
* @param descriptor Basic block descriptor.
* @return A string containing disassembly of the host machine code produced for the basic block.
*/
std::string Disassemble(const Arm::LocationDescriptor& descriptor);
private: